# Maximum likelihood, moments, and the mean of a Poisson

Posted by Jason Polak on 13. February 2018 · Categories: statistics

If we observe data coming from a distribution with known form but unknown parameters, estimating those parameters is our primary aim. If the distribution is uniform on $[0,\theta]$ with $\theta$ unknown, we already looked at two methods to estimate $\theta$ given $n$ i.i.d. observations $x_1,\dots,x_n$:

1. Maximum likelihood, which maximizes the likelihood function and gives $\max\{ x_i\}$
2. The method of moments, which gives $2\sum x_i/n$, or twice the sample mean
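To make the two estimators concrete, here is a minimal Python sketch. The function names `mle_uniform` and `moment_uniform`, the true value `theta = 5.0`, and the simulated sample are all illustrative assumptions, not anything taken from the earlier post.

```python
import random


def mle_uniform(xs):
    """Maximum likelihood estimate of theta for Uniform[0, theta]: the largest observation."""
    return max(xs)


def moment_uniform(xs):
    """Method-of-moments estimate: twice the sample mean, since E[X] = theta / 2."""
    return 2 * sum(xs) / len(xs)


# Simulate n i.i.d. observations with an assumed true theta (hypothetical value).
rng = random.Random(0)
theta = 5.0
sample = [rng.uniform(0, theta) for _ in range(1000)]

print("maximum likelihood:", mle_uniform(sample))
print("method of moments: ", moment_uniform(sample))
```

The two estimates generally differ. The moment estimate can even fall below the largest observation: for the sample $1, 1, 10$ it is $8$, which is impossible as a value of $\theta$ once we have seen a $10$.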

The uniform distribution was an interesting example because maximum likelihood and moments gave two different estimates. But what about the Poisson distribution? It is supported on the nonnegative integers, depends on a single parameter $\mu > 0$, and has probability mass function
$$f(n) = \frac{e^{-\mu}\mu^n}{n!}$$
What about the two methods of parameter estimation here? Let's start with the method of moments. It is easy to compute the moments of the Poisson distribution directly, but let's write down the moment generating function of the Poisson distribution.
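Before any derivation, a rough numerical comparison can hint at what to expect. The sketch below is an assumption-laden illustration, not the post's method: the counts are made up, the names `poisson_log_likelihood`, `moment_poisson`, and `grid_mle_poisson` are hypothetical, and the maximum likelihood estimate is found by a crude grid search rather than by calculus.

```python
import math


def poisson_log_likelihood(mu, xs):
    """Log of prod_i e^{-mu} mu^{x_i} / x_i! for observations xs, with mu > 0."""
    return sum(-mu + x * math.log(mu) - math.lgamma(x + 1) for x in xs)


def moment_poisson(xs):
    """Method of moments: match the first moment E[X] = mu to the sample mean."""
    return sum(xs) / len(xs)


def grid_mle_poisson(xs, step=0.01, upper=20.0):
    """Crude maximum likelihood estimate: the best mu on the grid step, 2*step, ..., upper."""
    grid = [step * k for k in range(1, int(upper / step) + 1)]
    return max(grid, key=lambda mu: poisson_log_likelihood(mu, xs))


counts = [2, 0, 3, 1, 4, 2, 2]  # made-up observations, for illustration only

print("method of moments:        ", moment_poisson(counts))
print("maximum likelihood (grid):", grid_mle_poisson(counts))
```

The grid search is only accurate to within `step`, so it suggests an answer but proves nothing; it is a sanity check to run alongside the exact computation.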

# Maximum likelihood, moments, and the uniform distribution

Suppose we have observations from a known probability distribution whose parameters are unknown. How should we estimate the parameters from our observations?

Throughout we'll focus on a concrete example. Suppose we observe a random variable drawn from the uniform distribution on $[0,\theta]$, but we don't know what $\theta$ is. Our one observation is the number $a$. How can we estimate $\theta$?

One method is the ubiquitous maximum likelihood estimator. With this method, we put our observation into the density function and maximize the result with respect to the unknown parameter. The uniform distribution has density $f(x) = 1/\theta$ on the interval $[0,\theta]$ and zero elsewhere. Viewed as a function of $\theta$, the value $f(a)$ is maximized when $\theta = a$: if $\theta$ were any smaller, then $f(a)$ would be zero, and if $\theta$ were any larger, then $f(a) = 1/\theta$ would be smaller.

Also, it's easy to see that if we draw $n$ samples $a_1,\dots,a_n$ from this distribution, the maximum likelihood estimator for $\theta$, which is the value of $\theta$ that maximizes the joint probability density function, is $\max_i \{a_i\}$.
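To spell out why, here is a short worked version of the joint density viewed as a function of $\theta$, assuming the samples are independent and nonnegative (the symbol $L$ for the likelihood is notation introduced here for illustration):
$$L(\theta) = \prod_{i=1}^n f(a_i) = \begin{cases} \theta^{-n} & \text{if } \theta \geq \max_i\{a_i\}, \\ 0 & \text{otherwise.} \end{cases}$$
Since $\theta^{-n}$ is decreasing in $\theta$, the largest value of $L$ is attained at the smallest $\theta$ for which it is nonzero, namely $\theta = \max_i\{a_i\}$.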