Binomial Distribution 

Definition

The binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent Bernoulli trials that all share the same success probability p, such as repeated coin tosses.

Parameters
Notation \text{B}(n, p)
Support k \in \{0, 1, ..., n \}
Mean n \cdot p
Variance np(1-p)
PMF f(k, n, p) = \binom{n}{k} p^k (1-p)^{n-k}

Probability Mass Function

f(k, n, p) = \mathop{P}( X = k ) = \binom{n}{k} p^k (1-p)^{n-k}

where n is the number of trials, k is the number of successes, and p is the success probability of a single trial.
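The formula above can be sketched directly in Python. This is a minimal illustration using only the standard library; the function name binomial_pmf is an assumption made for this example and does not come from any particular package.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    if not 0 <= k <= n:
        return 0.0  # k lies outside the support {0, ..., n}
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Sanity check: the probabilities over the whole support should sum to 1
# (up to floating-point rounding). n = 10 and p = 0.3 are arbitrary choices.
total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))
print(total)
```

Returning 0 outside the support is a design choice for this sketch; it keeps sums over arbitrary ranges of k well defined.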

Explanation of the Terms

The term p^k is the probability that k given trials are all successes. Since we have n trials rather than k, the term (1-p)^{n-k} is the probability that the remaining n-k trials are all misses (or failures). Together, p^k (1-p)^{n-k} is the probability of one specific sequence with k successes and n-k failures. Since the successes can appear anywhere among the n trials, we multiply by the binomial coefficient \binom{n}{k}, which is the number of ways to choose which k of the n trials are the successes (combinations, not permutations, because the order in which the successes are picked does not matter).
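The counting argument can be checked by brute force for small n. The sketch below lists every possible outcome sequence and counts those with exactly k successes; the values n = 5, k = 2 and p = 0.3 are arbitrary choices made for this illustration.

```python
from itertools import product
from math import comb

n, k, p = 5, 2, 0.3  # small illustrative values, chosen for this example

# Every outcome of n trials as a tuple of 1 (success) and 0 (failure).
sequences = list(product([0, 1], repeat=n))

# Keep only the sequences that contain exactly k successes.
with_k_successes = [s for s in sequences if sum(s) == k]
count = len(with_k_successes)

# Each such sequence has the same probability p^k (1-p)^(n-k),
# so the total probability is count * p^k (1-p)^(n-k).
prob_each = p**k * (1 - p) ** (n - k)
total_prob = count * prob_each

print(count, comb(n, k))
print(total_prob)
```

The brute-force count agrees with comb(n, k), which is why the single-sequence probability is multiplied by the binomial coefficient.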

Example Coin Tosses

Imagine we toss a fair coin 10 times. The outcome of each toss is either heads or tails. The binomial distribution gives the probability of getting a given number of heads (or tails). For example, the probability of getting exactly 7 heads is:

P(k=7, n=10, p=0.5) = \binom{10}{7} 0.5^7 (1-0.5)^{10-7} \approx 0.117
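The coin-toss value can be reproduced exactly with rational arithmetic. The following sketch uses Python's fractions module so that no rounding occurs; the variable names are chosen for this example only.

```python
from fractions import Fraction
from math import comb

n = 10
p = Fraction(1, 2)  # fair coin

# Exact probability of each possible number of heads, 0 through 10.
probs = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

p_seven = probs[7]
print(p_seven, float(p_seven))
```

The exact value is 120/1024 = 15/128, which rounds to 0.117.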

Cumulative Distribution Function

The cumulative distribution function gives the probability of getting at most k successes.

F(k, n, p) = \mathop{P}( X \leq k ) = \sum_{i=0}^{\lfloor k \rfloor} \binom{n}{i} p^i (1-p)^{n-i}

where \lfloor k\rfloor is the greatest integer less than or equal to k.
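As a minimal sketch, the sum can be evaluated term by term. The function name binomial_cdf is an assumption for this example; the floor is capped at n so that arguments above the support return 1, and arguments below 0 produce an empty sum.

```python
from math import comb, floor


def binomial_cdf(k: float, n: int, p: float) -> float:
    """P(X <= k) for X ~ B(n, p); k may be any real number."""
    upper = min(floor(k), n)  # floor(k), capped at the largest possible value n
    # range(upper + 1) is empty when upper < 0, so the sum is 0 below the support.
    return sum((comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(upper + 1)), 0.0)


# Probability of at most 7 heads in 10 tosses of a fair coin.
p_at_most_7 = binomial_cdf(7, 10, 0.5)

# The complement gives "at least k": P(X >= 7) = 1 - P(X <= 6).
p_at_least_7 = 1 - binomial_cdf(6, 10, 0.5)

print(p_at_most_7, p_at_least_7)
```

Summing the terms directly is fine for small n; for large n a library routine such as scipy.stats.binom.cdf is likely the more robust choice.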

Urn Model

Consider an urn with N balls, of which pN are red and (1-p)N are black. The binomial distribution gives the probability of drawing exactly k red balls in n draws when each ball is put back into the urn before the next draw (sampling with replacement).

Example Urn

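Imagine an urn with 20 balls, of which 8 are red and 12 are black. We draw 15 times from the urn, putting the ball back after each draw. What is the probability of getting exactly 9 black balls? Drawing 9 black balls in 15 draws is the same event as drawing 6 red balls, so:

P(k=6, n=15, p=0.4) = P(k=9, n=15, p=0.6) = \binom{15}{6} 0.4^6 \cdot 0.6^9 \approx 0.207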

Note that in the urn model, n is not the number of balls in the urn but the number of draws.
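The urn example can also be checked by simulation. The sketch below draws with replacement from a list that stands in for the urn and compares the observed frequency of "exactly 9 black balls" with the exact value; the seed and the number of simulated rounds are arbitrary assumptions made for this illustration.

```python
import random
from math import comb

urn = ["red"] * 8 + ["black"] * 12  # 20 balls in total
n_draws = 15
rng = random.Random(0)  # fixed seed so the run is repeatable


def count_black(rng: random.Random) -> int:
    # rng.choice does not remove the ball, which models putting it back.
    return sum(rng.choice(urn) == "black" for _ in range(n_draws))


rounds = 20_000
hits = sum(count_black(rng) == 9 for _ in range(rounds))
estimate = hits / rounds

# Exact binomial probability with n = 15 draws and p = 12/20 = 0.6.
exact = comb(15, 9) * 0.6**9 * 0.4**6

print(estimate, exact)
```

The simulated frequency should land close to the exact probability, and the agreement typically tightens as the number of rounds grows.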

Properties

• Sum of Binomials: The sum of two independent binomial random variables with the same success probability is again binomially distributed. If X ~ B(n, p) and Y ~ B(m, p) are independent, then X + Y ~ B(n+m, p).

• Normal Approximation: If n is large enough, B(n, p) can be approximated by the normal distribution \mathcal{N}(np,\; np(1-p)), which has the same mean np and variance np(1-p). As a rule of thumb, n is large enough if n \gt 9\left(\frac{1-p}{p}\right)\ \text{and}\ n \gt 9\left(\frac{p}{1-p}\right)

• Relation to the Bernoulli Distribution: The binomial distribution generalizes the Bernoulli distribution, which is the special case of a single trial, B(1, p).
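The first two properties can be checked numerically. The sketch below uses only the standard library; the helper names pmf and normal_cdf and all parameter values are assumptions chosen for this illustration. The normal CDF is evaluated at 55.5 rather than 55, a continuity correction that is commonly used with this approximation but is not part of the text above.

```python
from math import comb, erf, sqrt


def pmf(k: int, n: int, p: float) -> float:
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_cdf(x: float, mean: float, var: float) -> float:
    """CDF of the normal distribution N(mean, var)."""
    return 0.5 * (1 + erf((x - mean) / sqrt(2 * var)))


# 1) Sum of binomials: convolving B(n, p) with B(m, p) should give B(n+m, p).
n, m, p = 6, 9, 0.3
conv = [
    sum(pmf(i, n, p) * pmf(k - i, m, p) for i in range(max(0, k - m), min(n, k) + 1))
    for k in range(n + m + 1)
]
direct = [pmf(k, n + m, p) for k in range(n + m + 1)]
max_diff_sum = max(abs(a - b) for a, b in zip(conv, direct))

# 2) Normal approximation for n2 = 100, p2 = 0.5 (rule of thumb: 100 > 9).
n2, p2 = 100, 0.5
exact_cdf = sum(pmf(i, n2, p2) for i in range(56))  # P(X <= 55)
approx_cdf = normal_cdf(55.5, n2 * p2, n2 * p2 * (1 - p2))  # continuity-corrected

print(max_diff_sum)
print(exact_cdf, approx_cdf)
```

The convolution is how the distribution of a sum of independent discrete variables is computed, so a difference near zero supports the first property for these parameters.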

Implementations

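Python

import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as st

# Number of trials and two success probabilities to compare
n = 30
p1 = 1 / 6
p2 = 0.5

# All possible numbers of successes: 0, 1, ..., n
lx = np.arange(0, n + 1)
plt.plot(lx, st.binom.pmf(lx, n, p1), label='n = 30, p = 1/6')
plt.plot(lx, st.binom.pmf(lx, n, p2), label='n = 30, p = 1/2')
plt.legend()  # needed for the labels to be displayed
plt.show()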
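
MATLAB

% Number of trials and success probability
N = 10;
p = 0.5;

% PMF at every possible number of successes
x = 0:N;
y = binopdf(x, N, p);

% Bar chart of the PMF; a width of 1 makes adjacent bars touch
figure
bar(x, y, 1)
xlabel('Number of successes')
ylabel('Probability')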
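
R

# Number of trials and success probability
N <- 20
p <- 0.5

# PMF at every possible number of successes
x <- 0:N
y <- dbinom(x, size = N, prob = p)

plot(x, y, xlab = "Number of successes", ylab = "Probability")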