# Where does the Poisson distribution come from?

Posted by Jason Polak on 08. February 2018 · Categories: statistics

The Poisson distribution is a discrete probability distribution on the natural numbers $0,1,2,\dots$. Its density function depends on one parameter $\mu$ and is given by
$$d(n) = \frac{e^{-\mu}\mu^n}{n!}$$
Not surprisingly, the parameter $\mu$ is the mean, which follows from the exponential series
$$e^x = \sum_{n=0}^\infty \frac{x^n}{n!}$$
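For readers who want that step spelled out, here is a short derivation of the mean. It uses only the density $d(n)$ and the series above, after dropping the $n=0$ term and shifting the summation index to $m = n-1$:
$$\sum_{n=0}^\infty n\,d(n) = e^{-\mu}\sum_{n=1}^\infty \frac{\mu^n}{(n-1)!} = \mu e^{-\mu}\sum_{m=0}^\infty \frac{\mu^m}{m!} = \mu e^{-\mu}e^{\mu} = \mu$$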
Here is what the density function looks like when $\mu=5$:
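The same values can be tabulated directly. The following is a minimal sketch using only the Python standard library; the function name `poisson_pmf` and the cutoff at $n=15$ are illustrative choices, not part of the original post.

```python
import math


def poisson_pmf(n: int, mu: float) -> float:
    """Poisson density d(n) = exp(-mu) * mu**n / n!."""
    return math.exp(-mu) * mu**n / math.factorial(n)


mu = 5.0
values = [poisson_pmf(n, mu) for n in range(16)]

# Crude text histogram: one '#' per percentage point of probability.
for n, d in enumerate(values):
    print(f"{n:2d}  {d:.4f}  {'#' * round(100 * d)}")
```

The density peaks at $n=4$ and $n=5$, which have equal probability (about $0.175$) because $\mu$ is an integer.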

How does the Poisson distribution actually arise?

It comes from the following process: suppose you have a fixed interval of time, and you observe the number of occurrences of some phenomenon. In practice, it might be 'the number of buses to arrive at a given bus stop'. Whatever it is, you're counting something.

Moreover, this process has to satisfy the important "Poisson axiom": if you take two disjoint intervals of time that are small, then the number of occurrences in the first is independent of the number of occurrences in the second. Here, "small" means that as the size of the intervals approaches zero, the results should approach independence.

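It follows from this axiom that the probability distribution of occurrences is the Poisson distribution. Why is this? Well, first let $p$ be the probability that we observe at least one occurrence, so that $1-p$ is the probability of observing none. Divide the observation interval into $D$ subintervals of equal length. Assuming the subintervals are independent and equally likely to be empty, each one is empty with probability $(1-p)^{1/D}$, because all $D$ of them must be empty for the whole interval to be empty. If $D$ is large, then the Poisson axiom says that we can approximate the probability of $n$ occurrences by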
$$f(n) = \binom{D}{n}(1-(1-p)^{1/D})^n[(1-p)^{1/D}]^{D-n}$$

Technically, this approximate formula makes another assumption: that $D$ is large enough that observing two or more occurrences in a single subinterval is very unlikely. However, it turns out that the limit of this approximate formula gives the correct answer.
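To see the approximation settle down numerically, here is a small sketch that evaluates $f(n)$ for increasing $D$. The function name `binomial_approx` and the values $p = 0.5$, $n = 2$ are illustrative assumptions, and `math.comb` needs Python 3.8 or later.

```python
import math


def binomial_approx(n: int, p: float, D: int) -> float:
    """f(n) for D subintervals, where p = P(at least one occurrence overall)."""
    q = (1.0 - p) ** (1.0 / D)  # probability that one subinterval is empty
    return math.comb(D, n) * (1.0 - q) ** n * q ** (D - n)


p, n = 0.5, 2  # illustrative values, not taken from the post
for D in (10, 100, 1000, 10000):
    print(D, binomial_approx(n, p, D))
```

The printed values change less and less as $D$ grows, which is the behaviour the limit computation makes precise.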

Now, to get the true distribution we need to compute $\lim_{D\to\infty} f(n)$. Since $[(1-p)^{1/D}]^{D-n} = (1-p)\,[(1-p)^{-1/D}]^{n}$, it helps to rewrite $f(n)$ as
$$f(n) = \frac{(1-p)}{n!}D(D-1)\cdots (D-n+1)( (1-p)^{-1/D} -1)^n$$
Computing this limit comes down to computing the limit
$$\lim_{D\to\infty} (D-k)(\alpha^{1/D} - 1).$$
L'Hôpital's rule shows that this limit is $\log(\alpha)$. Applying it with $\alpha = (1-p)^{-1}$ to each of the $n$ factors $k = 0, 1, \dots, n-1$ gives:
$$\lim_{D\to\infty} f(n) = \frac{(1-p)}{n!}[-\log(1-p)]^n$$
This is just the Poisson distribution with $\mu = -\log(1-p)$, since then $e^{-\mu} = 1-p$. It also gives the relation between the mean of the Poisson distribution and $p$, the probability that at least one event occurs. To get an idea of how the mean depends on $p$, here is a graph:
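The same dependence can be tabulated numerically. This is a minimal sketch; the helper names `mean_from_p` and `p_from_mean` and the sample values of $p$ are illustrative, not from the original post.

```python
import math


def mean_from_p(p: float) -> float:
    """mu = -log(1 - p), valid for 0 <= p < 1."""
    return -math.log1p(-p)


def p_from_mean(mu: float) -> float:
    """Inverse relation: p = 1 - exp(-mu)."""
    return -math.expm1(-mu)


for p in (0.1, 0.25, 0.5, 0.75, 0.9, 0.99):
    print(f"p = {p:<5}  mu = {mean_from_p(p):.4f}")
```

For small $p$ the mean is close to $p$ itself, and it grows without bound as $p$ approaches $1$.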