MAP ESTIMATES¶

Priors, Bayes rule and MAP¶

Last class we dealt with maximising $ \theta $ wrt $ L(x_1, x_2, ... x_n| \theta) $ to obtain a point estimate for $ \theta $ called as the MLE estimate.

Lets use the shorthand $ P(D|\theta) $ for the likelihood term

Now let's assume that we have some beliefs on the value of $ \theta $. For example, if you are tossing a coin and you believe that the coin is extremely biased to either heads or tails. If you represent your beliefs in terms of a probability distribution, you get the prior distribution on $ \theta$. Let's call it $ P(\theta)$

By Bayes' rule we have,

$$ P( \theta | D ) = \frac{ P ( D | \theta) P ( \theta )}{ P( D )} $$

This is called the posterior distribution of $ \theta $

Maximum a posteriori estimate:

$$ \theta_{MAP} = arg max_{\theta} P(\theta | D) $$

Compare it to MLE:

$$ \theta_{MLE} = arg max_{\theta} P( D | \theta) $$

Hypothesis and Data¶

Discrete hypothesis and Discrete data

Continuous hypothesis and Discrete data

Homework (Slides)

Discrete hypothesis and Continuous data

Continuous hypothesis and Continuous data

Discrete hypothesis and Discrete data¶

<img src="./files/dis_dis.png" width = 80%/>

What is the MAP here?

What is the MLE here?

Continuous hypothesis and Discrete data¶

Find the MAP estimate of $\theta$

Beta distribution¶

Let $ X \sim Beta(\alpha, \beta). $

Then $$ P[X=x] = \frac{x^{\alpha - 1} (1 - x)^{\beta - 1} \Gamma(\alpha + \beta)}{\Gamma(\alpha) \Gamma(\beta)} $$

where $$ \Gamma(x) = \int_{0}^{∞} t^{x-1} e^{-t} dt $$

Remember $ \Gamma(n) = (n - 1) ! $ for natural number n

Problems:¶

What is the value of X for which P[X=x] is maximized? What if $ \alpha = \beta $?

What is the expected value of X?

Check distribution shape

$ Beta(1, 1) $

What happens when alpha increases keeping beta fixed?

$ Beta (2, 1) $

$ Beta (5, 1) $

$ Beta(10, 1) $

When would you use this as a prior?

$ Beta(0.5, 0.5) $

BETA DISTRIBUTION VISUALIZER

Example:¶

Prior: $ Beta(\alpha, \beta) $

Observed data: s heads, t tails

Find the posterior.
Find the MAP estimate of $ \theta $

Why Beta distribution? Conjugate priors