Suppose we have observations from a known probability distribution whose parameters are unknown. How should we estimate the parameters from our observations?

Throughout we’ll focus on a concrete example. Suppose we observe a random variable drawn from the uniform distribution on $[0,\theta]$, but we don’t know what $\theta$ is. Our one observation is the number $a$. How can we estimate $\theta$?

One method is the ubiquitous maximum likelihood estimator. With this method, we put our observation into the density function, and maximize it with respect to the unknown parameter. The uniform distribution has density $f(x) = 1/\theta$ on the interval $[0,\theta]$ and zero elsewhere. This function is maximized when $\theta = a$. For if $\theta$ were any smaller, then $f(a)$ would be zero.

Also, it’s easy to see that if we draw $n$ samples $a_1,\dots,a_n$ from this distribution, the maximum likelihood estimator for $\theta$, which is the value of $\theta$ that maximizes the joint probability density function, is $\max_i \{a_i\}$.
More »