# Binomial distribution: mean and variance

A Bernoulli random variable with parameter $p$ is a random variable that takes the value $1$ with probability $p$ and the value $0$ with probability $1-p$. If $X_1,\dots,X_n$ are $n$ independent Bernoulli random variables, each with parameter $p$, we define
$$X = X_1 + \cdots + X_n.$$ The random variable $X$ is said to have a binomial distribution with parameters $n$ and $p$, or a $B(n,p)$ distribution. The probability mass function of a $B(n,p)$ random variable $X$ is
$$f(k) = P(X = k) = \binom{n}{k}p^k(1-p)^{n-k}, \qquad k = 0, 1, \dots, n.$$
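As a quick sanity check, we can verify in Python that this mass function sums to $1$; the parameters `n = 10` and `p = 1/3` below are arbitrary example choices, and `Fraction` keeps the arithmetic exact:

```python
from fractions import Fraction
from math import comb

def binom_pmf(n, k, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, Fraction(1, 3)  # arbitrary example parameters
total = sum(binom_pmf(n, k, p) for k in range(n + 1))
assert total == 1  # the probabilities sum to one, exactly
```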

What is the expected value of the $B(n,p)$ variable $X$? Expectation is linear, so we can use the definition of $X$ as a sum of $n$ Bernoulli random variables:
$$E(X) = E(X_1) + \cdots + E(X_n).$$ Each term is $E(X_i) = 0\cdot(1-p) + 1\cdot p = p$. Therefore:

Theorem. The expected value of a binomial $B(n,p)$ random variable is $np$.
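We can confirm the theorem numerically by computing $\sum_k k\,P(X=k)$ directly from the mass function; the parameters below are arbitrary, and exact rationals rule out floating-point error:

```python
from fractions import Fraction
from math import comb

n, p = 12, Fraction(2, 5)  # arbitrary example parameters
mean = sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
assert mean == n * p  # E(X) = np, exactly
```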

What about the variance of $X$? Recall that the variance of a random variable $X$ is defined to be
$$E((X-E(X))^2) = E(X^2) - E(X)^2.$$ Therefore, we need to calculate $E(X^2)$, since we already know from the previous calculation that $E(X)^2 = n^2p^2$. In fact, for our $B(n,p)$ random variable $X$,
$$E(X^2) = np(1-p) + n^2p^2.$$ How do we prove this? I thought it might be fun to prove it by induction on $n$; the base case $n=0$ I'll leave to the reader. By definition of expected value,
$$E(X^2) = \sum_{k=0}^n\binom{n}{k} k^2p^k(1-p)^{n-k}.$$ The obvious way to get induction into play is to use the inductive formula for the binomial coefficients
$$\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}.$$ By putting this into the definition for $E(X^2)$, we get
$$E(X^2) = (1-p)\sum_{k=0}^{n-1}\binom{n-1}{k}k^2p^k(1-p)^{n-1-k} + \sum_{k=0}^n\binom{n-1}{k-1}k^2p^k(1-p)^{n-k}.$$ (In the first sum the $k=n$ term vanishes because $\binom{n-1}{n}=0$, so it stops at $n-1$.) The first summation term, which we will call (A), is just $(1-p)$ times the expectation of the square of a $B(n-1,p)$ random variable, and we know this via the induction hypothesis. Therefore, (A) is
$$(1-p)[(n-1)p(1-p) + (n-1)^2p^2]$$ by the induction hypothesis. The second term (B) may be simplified by using the substitution
$$k^2 = (k-1+1)^2 = (k-1)^2 + 2(k-1) + 1.$$ Therefore, $(B)$ may be written as
\begin{align*}\sum_{k=0}^n\binom{n-1}{k-1}k^2p^k(1-p)^{n-k} &= p\sum_{k=1}^n\binom{n-1}{k-1}(k-1)^2p^{k-1}(1-p)^{n-k}\\&+2p\sum_{k=1}^n\binom{n-1}{k-1}(k-1)p^{k-1}(1-p)^{n-k}\\&+ p\sum_{k=1}^n\binom{n-1}{k-1}p^{k-1}(1-p)^{n-k}.\end{align*} Note that the $k=0$ term vanishes, and I have taken out a factor of $p$ from each sum to make the power of $p$ nice for simplification. Substituting $j = k-1$, we see that these three sums are, in order: $p$ times the expected value of the square of a $B(n-1,p)$ random variable, which we again know by the induction hypothesis; $2p$ times the expected value of a $B(n-1,p)$ random variable, i.e. $2p\cdot(n-1)p$; and just $p$, since the mass function sums to $1$. Therefore, putting everything together (i.e. adding together (A) and (B) for $E(X^2)$), we get
\begin{align*}E(X^2) &= (n-1)p(1-p)^2 + (n-1)^2p^2(1-p) + (n-1)p^2(1-p) + (n-1)^2p^3 + 2p^2(n-1) + p\\ &= (n-1)[p + p^2 + (n-1)p^2] + p\\ &= np - np^2 + n^2p^2\\ &= np(1-p) + n^2p^2.\end{align*} And there we have it. I skipped a few straightforward algebraic steps, but I encourage the reader to work through every one if they want to understand the true way of the binomial distribution. Anyway, subtracting $E(X)^2 = n^2p^2$, we get the variance:
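Before moving on, here is a direct check of the closed form for $E(X^2)$, computed from the mass function with arbitrary example parameters and exact arithmetic:

```python
from fractions import Fraction
from math import comb

n, p = 7, Fraction(1, 6)  # arbitrary example parameters
m2 = sum(k**2 * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
assert m2 == n * p * (1 - p) + n**2 * p**2  # matches the closed form exactly
```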

Theorem. The variance of a binomial $B(n,p)$ random variable is $np(1-p)$.
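Finally, a check that the variance really comes out to $np(1-p)$ when computed straight from the definition $E((X-E(X))^2)$; the parameters are again arbitrary:

```python
from fractions import Fraction
from math import comb

n, p = 9, Fraction(1, 4)  # arbitrary example parameters
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
mean = sum(k * pmf[k] for k in range(n + 1))
var = sum((k - mean)**2 * pmf[k] for k in range(n + 1))
assert mean == n * p
assert var == n * p * (1 - p)  # Var(X) = np(1-p), exactly
```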