Bayes' Rule

Hello. In today's post we will go through one of the most important topics in machine learning and mathematics: Bayes' rule. We will walk through its derivation using some basic math, so before reading make sure you have some background in probability theory.

Let's assume that we have two kinds of data available (we will learn Bayes' rule using a basic robot model):

  1. z: sensor information (observations)
  2. u: actions.

First, let's consider only the first kind of input, the sensor observation z. We want the state estimate given the sensor observation, P(x|z). We cannot measure this directly, so let's rewrite it in a form we can calculate. From conditional probability we know that P(A and B) = P(A|B) * P(B) = P(B|A) * P(A). Bayes' rule follows directly from this:

$$ P(A|B)= \frac{P(B|A)P(A)}{P(B)} $$
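As a quick sanity check, here is a minimal Python sketch that verifies this identity on a made-up joint distribution (all numbers below are hypothetical, chosen only for illustration):

```python
# Sanity check of Bayes' rule on a made-up joint distribution P(A, B).
P_joint = {
    (True, True): 0.3, (True, False): 0.2,
    (False, True): 0.1, (False, False): 0.4,
}

P_A = sum(p for (a, b), p in P_joint.items() if a)   # P(A) = 0.5
P_B = sum(p for (a, b), p in P_joint.items() if b)   # P(B) = 0.4
P_A_given_B = P_joint[(True, True)] / P_B            # P(A|B) = 0.75
P_B_given_A = P_joint[(True, True)] / P_A            # P(B|A) = 0.6

# Bayes' rule: P(A|B) = P(B|A) P(A) / P(B)
assert abs(P_A_given_B - P_B_given_A * P_A / P_B) < 1e-12
```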

Simple, right?

If we apply this rule to our case, where we have to find the likelihood of state x given sensor observation z, we get:

$$ P(x|z)= \frac{P(z|x)P(x)}{P(z)} $$

Now imagine that we have to find P(z), the probability of our sensor producing some output z, directly. That is almost impossible to measure, so we usually work around it using the law of total probability:

$$ P(A)=\sum_{n} P(A \land B_n) $$

$$ P(A)=\sum_{n} P(A|B_n) P(B_n) $$

Now, let's apply this to our equation:

$$ P(x|z)= \frac{P(z|x)P(x)}{\sum_{n} P(z|x_n) P(x_n)} $$
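To make this concrete, here is a minimal Python sketch, assuming a hypothetical two-state robot problem (a door that is either open or closed) and a made-up sensor model P(z|x):

```python
# Hypothetical sensor model and prior; the numbers are illustrative only.
states = ["open", "closed"]
prior = {"open": 0.5, "closed": 0.5}           # P(x)
likelihood = {"open": 0.6, "closed": 0.3}      # P(z = "detects open" | x)

# Denominator from the law of total probability:
# P(z) = sum_n P(z | x_n) P(x_n)
P_z = sum(likelihood[x] * prior[x] for x in states)

# Posterior P(x | z) for each state
posterior = {x: likelihood[x] * prior[x] / P_z for x in states}
print(posterior)  # {'open': ~0.667, 'closed': ~0.333}
```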

If we have some background knowledge y, we can incorporate it into the Bayes formula (assuming x and y are not independent of each other). Applying the product rule to P(x,y|z) in both orders gives:

$$P(x,y|z) = P(x|y,z)P(y|z)$$

$$P(x,y|z) = P(y|x,z)P(x|z)$$

$$ P(x|y,z)= \frac{P(y|x,z)P(x|z)}{P(y|z)} $$
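The same identity can be checked numerically. Here is a small sketch, assuming nothing more than a random joint distribution over three binary variables:

```python
import numpy as np

# Verify the conditioned Bayes rule on a random joint distribution P(x, y, z).
rng = np.random.default_rng(0)
joint = rng.random((2, 2, 2))       # indices: [x, y, z]
joint /= joint.sum()                # normalize into a valid distribution

P_z = joint.sum(axis=(0, 1))                 # P(z)
P_xz = joint.sum(axis=1)                     # P(x, z)
P_yz = joint.sum(axis=0)                     # P(y, z)

P_x_given_yz = joint / P_yz[None, :, :]      # P(x | y, z)
P_y_given_xz = joint / P_xz[:, None, :]      # P(y | x, z)
P_x_given_z = P_xz / P_z[None, :]            # P(x | z)
P_y_given_z = P_yz / P_z[None, :]            # P(y | z)

# P(x | y, z) = P(y | x, z) P(x | z) / P(y | z)
rhs = P_y_given_xz * P_x_given_z[:, None, :] / P_y_given_z[None, :, :]
assert np.allclose(P_x_given_yz, rhs)
```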

Bayes' rule for multiple prior observations

If we need to estimate the likelihood of the current state given multiple prior observations, the Bayes formula takes this form (it is just the conditioned Bayes rule from the previous section, with \(y = z_n\) and \(z = z_1, \ldots, z_{n-1}\)):

$$ P(x|z_1, z_2, \ldots, z_n)= \frac{P(z_n|x, z_1, z_2, \ldots, z_{n-1})\,P(x|z_1, z_2, \ldots, z_{n-1})}{P(z_n|z_1, z_2, \ldots, z_{n-1})} $$

Now let's simplify this equation, since it has become quite unwieldy. To do that we adopt the Markov assumption, which states:

  1. The current state depends only on the previous state and the current action.
  2. The current observation depends only on the current state.

Based on the second part of the Markov assumption (the current observation \(z_n\) depends only on the current state x), we can simplify the above equation to this:

$$ P(x|z_1, z_2, \ldots, z_n)= \frac{P(z_n|x)\,P(x|z_1, z_2, \ldots, z_{n-1})}{P(z_n|z_1, z_2, \ldots, z_{n-1})} $$

Now let's replace the denominator with a normalizer \(\eta = 1/P(z_n|z_1, \ldots, z_{n-1})\). We do not need to compute it explicitly; it is fixed by the fact that the probabilities of all states x given the sensor observations \(z_1, \ldots, z_n\) must sum up to 1.

$$ P(x|z_1, z_2, \ldots, z_n)= \eta\, P(z_n|x)\,P(x|z_1, z_2, \ldots, z_{n-1}) $$
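In code, \(\eta\) is simply whatever constant makes the posterior sum to 1, so a single measurement update becomes a multiply-then-normalize step. A minimal sketch, with a hypothetical two-state belief and sensor model:

```python
# One recursive measurement update: multiply the current belief by the
# likelihood of the new observation, then normalize (that division is eta).
def measurement_update(belief, likelihood):
    """belief: P(x | z_1..z_{n-1}) per state; likelihood: P(z_n | x) per state."""
    unnormalized = [l * b for l, b in zip(likelihood, belief)]
    eta = 1.0 / sum(unnormalized)   # eta = 1 / P(z_n | z_1..z_{n-1})
    return [eta * u for u in unnormalized]

belief = [0.5, 0.5]                            # uniform prior over two states
belief = measurement_update(belief, [0.6, 0.3])
print(belief)  # [0.666..., 0.333...]
```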

Looking back at the update equation, pay attention to its last term: it is the same posterior we are after, just without the last sensor observation. Let's apply Bayes' rule to it again:

$$ P(x|z_1, z_2, \ldots, z_n)= \eta_{n-1:n}\, P(z_n|x)\,P(z_{n-1}|x)\,P(x|z_1, z_2, \ldots, z_{n-2})$$

As you can see, the structure is recursive (here \(\eta_{n-1:n}\) collects the two normalization constants), which is why this is called recursive Bayesian estimation.

We can now write it in a more general form using a product:

$$ P(x|z_1, z_2, \ldots, z_n)= \eta_{1:n} \left(\prod_{i=1}^{n} P(z_i|x)\right) P(x)$$
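Putting it all together, here is a sketch that runs the recursive update over a sequence of observations and checks that it matches the batch product form above (the states, sensor model, and observations are all made up):

```python
import numpy as np

# Hypothetical two-state problem: likelihoods[z] holds P(z | x) per state.
likelihoods = {"hit": np.array([0.6, 0.2]), "miss": np.array([0.4, 0.8])}
observations = ["hit", "hit", "miss", "hit"]
prior = np.array([0.5, 0.5])                 # P(x)

# Recursive estimation: one normalized update per observation.
belief = prior.copy()
for z in observations:
    belief = likelihoods[z] * belief
    belief /= belief.sum()                   # this division plays the role of eta

# Batch form: P(x | z_1..z_n) proportional to (prod_i P(z_i | x)) P(x)
batch = prior.copy()
for z in observations:
    batch = likelihoods[z] * batch           # accumulate the product of likelihoods
batch /= batch.sum()

assert np.allclose(belief, batch)
print(belief)
```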

Majidov Ikhtiyor

My name is Majidov Ikhtiyor, and I am an AI/math enthusiast.
