Maximum likelihood, moments, and the mean of a Poisson

Posted by Jason Polak on 13. February 2018 · Categories: statistics

If we observe data coming from a distribution with known form but unknown parameters, estimating those parameters is our primary aim. If the distribution is uniform on $[0,\theta]$ with $\theta$ unknown, we already looked at two methods to estimate $\theta$ given $n$ i.i.d. observations $x_1,\dots,x_n$:

1. Maximum likelihood, which maximizes the likelihood function and gives $\max\{ x_i\}$
2. Moment estimator $2\sum x_i/n$, or twice the mean
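To see the two estimators side by side, here is a minimal Python sketch. The function names, the seed, and the true value `theta = 5.0` are all assumptions made up for illustration; they do not come from the earlier posts.

```python
import random


def mle_estimate(xs):
    """Maximum likelihood estimate of theta for Uniform[0, theta]: the sample maximum."""
    return max(xs)


def moment_estimate(xs):
    """Method-of-moments estimate: twice the sample mean, since E[X] = theta / 2."""
    return 2 * sum(xs) / len(xs)


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
theta = 5.0             # hypothetical true parameter, for illustration only
sample = [rng.uniform(0, theta) for _ in range(20)]

print("MLE:    ", mle_estimate(sample))
print("Moments:", moment_estimate(sample))
```

The maximum likelihood estimate can never exceed the true $\theta$, while the moment estimate can land on either side of it.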

The uniform distribution was an interesting example because maximum likelihood and moments gave two different estimates. But what about the Poisson distribution? It is supported on the nonnegative integers, depends on a single parameter $\mu$, and has probability mass function
$$f(n) = \frac{e^{-\mu}\mu^n}{n!}$$
What about the two methods of parameter estimation here? Let’s start with the method of moments. It is easy to compute the moments of the Poisson distribution directly, but let’s write down the moment generating function of the Poisson distribution.
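As a sketch of where this leads (a standard computation, written with a random variable $X$ that is notation assumed here rather than taken from the post), the moment generating function is
$$M(t) = E[e^{tX}] = \sum_{n=0}^\infty e^{tn}\frac{e^{-\mu}\mu^n}{n!} = e^{-\mu}\sum_{n=0}^\infty \frac{(\mu e^t)^n}{n!} = e^{\mu(e^t-1)}$$
Differentiating once gives
$$M'(t) = \mu e^t e^{\mu(e^t-1)}, \qquad M'(0) = \mu$$
so the first moment is $\mu$, and matching it to the first sample moment suggests estimating $\mu$ by the sample mean $\bar{x}$ of the observations.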

A very quick tour of R

Posted by Jason Polak on 13. February 2018 · Categories: statistics

This post is a quick introduction to R. I learnt R when I was an undergrad and I still use it from time to time. It was one of the first major programs I compiled from source as well.

What is R? It is simply the best statistical computing environment in use today. Better yet, it’s open source. If you’re working with data, you need to learn R. This post won’t teach you how to use R. Instead, it will give a whirlwind tour so you can appreciate R’s flavour. The only thing I don’t like much about R is searching for material on it. Its one-letter name makes that hard.

I will assume that you have R installed, and have opened the interactive console. It should look something like this:

P-values and goodness-of-fit normality testing

Posted by Jason Polak on 07. February 2018 · Categories: statistics

In statistical hypothesis testing, the computed p-value is the probability, under the null hypothesis, of getting a result at least “as extreme” as the given data. Here, “as extreme” is measured by a test statistic, which has some distribution under the null hypothesis.

In the natural sciences, an experiment is performed. A statistical test is used whose null hypothesis is supposed to be related to the scientific hypothesis. If the p-value is less than 0.05, the null hypothesis is rejected. This in turn can guide our beliefs about the corresponding scientific hypothesis.

For a concrete example, I took an actual coin and flipped it 32 times. I got 19 heads and 13 tails.

A $\chi^2$-test with the null hypothesis being “equal probabilities of heads and tails” gives a p-value of 0.2888. Based on this data and a rejection level of 0.05, we do not reject the null hypothesis. So, it seems like we don’t have much evidence to say that the coin isn’t fair.
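The quoted p-value can be checked by hand. The sketch below is in Python and uses only the standard library; the function name is made up for illustration, and it assumes the plain Pearson statistic with no continuity correction. It relies on the fact that a $\chi^2$ variable with one degree of freedom is the square of a standard normal, so its upper tail can be written with the complementary error function.

```python
import math


def chi_square_fair_coin(heads, tails):
    """Pearson chi-squared test of 'heads and tails equally likely'.

    Returns (statistic, p_value). With two outcomes there is one degree
    of freedom, and P(chi2_1 > x) = erfc(sqrt(x / 2)).
    No continuity correction is applied (an assumption of this sketch).
    """
    n = heads + tails
    expected = n / 2  # expected count for each side under the null
    statistic = ((heads - expected) ** 2 + (tails - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


statistic, p_value = chi_square_fair_coin(19, 13)
print(f"chi-squared = {statistic:.4f}, p-value = {p_value:.4f}")
# prints: chi-squared = 1.1250, p-value = 0.2888
```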

Often there is more than one well-known test that can be used, simply because you can compute any sort of test statistic that you want. In such cases, results under the null may be strongly dependent on the particular test statistic used. I’d like to illustrate this with goodness-of-fit testing for normality. There are quite a few ways to test for normality. One method is the Kolmogorov-Smirnov test, and another is the Shapiro-Wilk test. Here is a little R script that looks at these two different methods of testing for normality: