4. Statistics and Probability

The Binomial Distribution

Introduction: why this model matters

Many real-life situations involve repeated yes/no outcomes: a coin lands heads or tails, a student answers a multiple-choice question correctly or incorrectly, a product passes quality control or fails, or a medicine works or does not work. When we count how many successes happen in a fixed number of repeated trials, the binomial distribution gives a powerful way to model the results.

In this lesson, you will learn how to recognize when a situation follows a binomial model, how to use the binomial probability formula, and how this distribution connects to the wider study of probability and statistics. By the end, you should be able to describe the assumptions clearly, calculate probabilities, and interpret results in context. ✅

What makes a binomial situation?

A random variable $X$ follows a binomial distribution when it counts the number of successes in $n$ independent trials, where each trial has only two possible outcomes and the probability of success stays the same each time.

The four key conditions are:

  • There is a fixed number of trials, $n$.
  • Each trial has two outcomes, often called success and failure.
  • The trials are independent.
  • The probability of success is constant, $p$, on every trial.

When these conditions are met, we write $X \sim \mathrm{Bin}(n,p)$.

A common mistake is to think “two outcomes” means the model is too simple to be useful. In fact, many practical situations fit it. For example, if a factory checks 20 light bulbs and records how many are defective, that is binomial if the probability of a defective bulb remains the same and whether one bulb is defective does not affect the others. If a basketball player takes 10 free throws and you count the number made, that can also be binomial if the chance of scoring stays constant and the shots are independent. 🏀

The binomial probability formula

If $X \sim \mathrm{Bin}(n,p)$, then the probability of exactly $r$ successes is

$$P(X=r)=\binom{n}{r}p^r(1-p)^{n-r}, \qquad r=0,1,2,\dots,n.$$

This formula has three important parts:

  • $\binom{n}{r}$ counts how many ways $r$ successes can be placed among $n$ trials.
  • $p^r$ is the probability that those $r$ chosen trials are all successes.
  • $(1-p)^{n-r}$ is the probability that the remaining $n-r$ trials are all failures.

The combination term is written as

$$\binom{n}{r}=\frac{n!}{r!(n-r)!}.$$

This is often read as “$n$ choose $r$.” It counts arrangements, not probabilities.
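The formula translates directly into a few lines of code. The following is a minimal Python sketch, assuming the standard library function `math.comb` for the combination term. The function name `binomial_pmf` and the test values $n=10$, $p=0.3$ are illustrative choices, not part of the lesson.

```python
from math import comb

def binomial_pmf(r: int, n: int, p: float) -> float:
    """Return P(X = r) for X ~ Bin(n, p)."""
    if not 0 <= r <= n:
        return 0.0  # outside the possible values r = 0, 1, ..., n
    # (ways to place r successes) * (prob. of r successes) * (prob. of n - r failures)
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Sanity check: the probabilities over r = 0..n should add up to 1
# (up to floating-point rounding).
total = sum(binomial_pmf(r, 10, 0.3) for r in range(11))
print(round(total, 10))
```

Checking that the probabilities sum to 1 is a quick way to catch a mistyped exponent or a missing combination term.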

Example 1: tossing a fair coin

Suppose a fair coin is tossed $5$ times, and let $X$ be the number of heads. Then $X \sim \mathrm{Bin}(5,0.5)$.

To find the probability of exactly $2$ heads:

$$P(X=2)=\binom{5}{2}(0.5)^2(0.5)^3.$$

Since $(0.5)^2(0.5)^3=(0.5)^5$, this becomes

$$P(X=2)=\binom{5}{2}(0.5)^5=10\cdot \frac{1}{32}=\frac{10}{32}=0.3125.$$

So the probability of exactly $2$ heads is $0.3125$. 🎲
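As a quick check of this example, the same calculation can be sketched in Python. This assumes `math.comb` from the standard library. The variable names are illustrative only.

```python
from math import comb

n, p, r = 5, 0.5, 2          # 5 tosses, fair coin, exactly 2 heads

ways = comb(n, r)            # number of ways to place 2 heads among 5 tosses
prob = ways * p**r * (1 - p)**(n - r)

print(ways, prob)            # prints: 10 0.3125
```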

Understanding cumulative probability

Often, the question is not “exactly how many?” but “at least how many?” or “at most how many?” These are cumulative probabilities.

For a binomial random variable $X$:

  • $P(X\le r)$ means at most $r$ successes.
  • $P(X\ge r)$ means at least $r$ successes.
  • $P(a\le X\le b)$ means between $a$ and $b$ successes inclusive.

These are found by adding individual binomial probabilities. For example,

$$P(X\le 2)=P(X=0)+P(X=1)+P(X=2).$$
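Adding individual probabilities is easy to sketch in code. The helper names below (`binomial_pmf`, `binomial_cdf`, `binomial_between`) are hypothetical, chosen for this illustration, and the check uses the fair coin tossed $5$ times from Example 1.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def binomial_cdf(r, n, p):
    """P(X <= r): add the terms for 0, 1, ..., r."""
    return sum(binomial_pmf(k, n, p) for k in range(r + 1))

def binomial_between(a, b, n, p):
    """P(a <= X <= b), inclusive at both ends."""
    return sum(binomial_pmf(k, n, p) for k in range(a, b + 1))

# Fair coin, 5 tosses: P(X <= 2) = (1 + 5 + 10) / 32
at_most_2 = binomial_cdf(2, 5, 0.5)
# "At least 3" is everything that is not "at most 2"
at_least_3 = 1 - at_most_2
print(at_most_2, at_least_3)
```

Note that `range(r + 1)` includes $r$ itself. Leaving out the endpoint is one of the most common slips with cumulative probabilities.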

Example 2: quality control

A company knows that $8\%$ of its screws are faulty. A batch of $12$ screws is selected at random. Let $X$ be the number of faulty screws. Then $X \sim \mathrm{Bin}(12,0.08)$.

To find the probability of at most one faulty screw, calculate

$$P(X\le 1)=P(X=0)+P(X=1).$$

Using the formula:

$$P(X=0)=\binom{12}{0}(0.08)^0(0.92)^{12},$$

$$P(X=1)=\binom{12}{1}(0.08)^1(0.92)^{11}.$$
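To finish the calculation numerically, a short Python sketch can evaluate both terms and add them. It assumes `math.comb` from the standard library. Rounding to four decimal places is a presentation choice made here, not a requirement of the lesson.

```python
from math import comb

n, p = 12, 0.08              # 12 screws, 8% chance each is faulty

p0 = comb(n, 0) * p**0 * (1 - p)**12   # no faulty screws
p1 = comb(n, 1) * p**1 * (1 - p)**11   # exactly one faulty screw
at_most_one = p0 + p1

print(round(p0, 4), round(p1, 4), round(at_most_one, 4))
```

The result is about $0.751$, so roughly three batches in four would contain at most one faulty screw under this model.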

This type of question is common in IB Mathematics because it tests both your understanding of the model and your ability to interpret the result in a real situation.

Mean and variance of a binomial distribution

The binomial distribution has useful summary measures:

$$\mu=np$$

and

$$\sigma^2=np(1-p).$$

The standard deviation is

$$\sigma=\sqrt{np(1-p)}.$$

These formulas help describe the center and spread of the distribution.

What they mean in context

If $X \sim \mathrm{Bin}(20,0.3)$, then the mean is

$$\mu=20(0.3)=6.$$

This does not mean the number of successes must be $6$ every time. It means that over many repeated sets of $20$ trials, the average number of successes would be about $6$.

The variance is

$$\sigma^2=20(0.3)(0.7)=4.2,$$

so the standard deviation is

$$\sigma=\sqrt{4.2}\approx 2.05.$$

This tells us that the number of successes typically varies by about $2$ from the mean. 📈
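These shortcut formulas can be checked against the general definitions for a discrete random variable, $\mathrm{E}(X)=\sum r\,P(X=r)$ and $\mathrm{Var}(X)=\sum (r-\mu)^2 P(X=r)$. The sketch below does this for $X \sim \mathrm{Bin}(20,0.3)$. It is an illustrative check, not a derivation.

```python
from math import comb, sqrt

n, p = 20, 0.3

# Full probability distribution: P(X = r) for r = 0, 1, ..., 20
pmf = [comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)]

# Mean and variance computed from the definitions
mean = sum(r * pr for r, pr in enumerate(pmf))
variance = sum((r - mean)**2 * pr for r, pr in enumerate(pmf))

print(round(mean, 6), round(variance, 6), round(sqrt(variance), 2))
# Compare with the shortcut formulas np and np(1 - p)
print(n * p, round(n * p * (1 - p), 6))
```

Both lines should agree, up to floating-point rounding, with $\mu=6$ and $\sigma^2=4.2$.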

Recognizing binomial situations in exam questions

In IB Mathematics, a big skill is deciding whether the binomial model is appropriate before calculating anything.

Ask yourself:

  1. Is the number of trials fixed?
  2. Are there only two outcomes?
  3. Is the probability of success constant?
  4. Are the trials independent?

If the answer to all four is yes, a binomial model is likely appropriate.

Example 3: exam answers

A student answers $15$ multiple-choice questions, each with four options and one correct answer. Suppose the student guesses randomly on each question. Let $X$ be the number correct.

Then $X \sim \mathrm{Bin}(15,0.25)$ because all four conditions hold: the number of questions is fixed at $15$; each answer is either correct or incorrect; the chance of a correct guess is $\frac{1}{4}=0.25$ on every question; and the guesses are treated as independent.

If the question asks for the probability of exactly $4$ correct answers, use

$$P(X=4)=\binom{15}{4}(0.25)^4(0.75)^{11}.$$

If it asks for the probability of at least $4$, then use

$$P(X\ge 4)=1-P(X\le 3).$$

The complement rule often makes calculations shorter. ✨
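The sketch below evaluates both parts of this example and confirms that the complement gives the same answer as adding the terms for $r=4,5,\dots,15$ directly. The helper name `binomial_pmf` is hypothetical, introduced only for this illustration.

```python
from math import comb

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 15, 0.25

exactly_4 = binomial_pmf(4, n, p)

# Complement rule: only 4 terms (r = 0, 1, 2, 3) are needed
at_most_3 = sum(binomial_pmf(r, n, p) for r in range(4))
at_least_4 = 1 - at_most_3

# Direct sum: 12 terms (r = 4, ..., 15), same answer
direct = sum(binomial_pmf(r, n, p) for r in range(4, n + 1))

print(round(exactly_4, 4), round(at_least_4, 4), round(direct, 4))
```

The complement route needs four terms instead of twelve, which is why it is usually preferred when working by hand.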

Linking binomial distributions to the wider topic

The binomial distribution sits inside the larger topic of Statistics and Probability because it connects random experiments, counting outcomes, and interpreting data.

Here are some important links:

  • Probability rules: Binomial calculations use addition, multiplication, and complements.
  • Discrete random variables: The binomial distribution is a discrete distribution because $X$ can only take whole-number values.
  • Statistical modeling: It helps model real situations where success/failure outcomes are counted.
  • Data interpretation: The mean and standard deviation help describe expected patterns in repeated trials.

The binomial distribution is also a foundation for more advanced topics. For example, it helps build intuition for other discrete distributions and for approximations used in statistics when $n$ is large. It also shows how mathematics can move from a simple experiment, like tossing a coin, to a practical problem, like estimating defect rates in manufacturing.

Common errors to avoid

These are the mistakes that most often reduce accuracy:

  • Forgetting to check whether the trials are independent.
  • Using the binomial formula when $p$ changes from trial to trial.
  • Confusing “exactly” with “at least” or “at most.”
  • Forgetting to include all terms in a cumulative probability.
  • Using the wrong value of $p$ for success.

For example, if a student is given questions drawn without replacement from a small set, the probability of success may change after each question, so the situation may not be binomial. Always read the context carefully.
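To see how much this can matter, here is a small sketch with made-up numbers: a pool of $10$ questions, of which the student knows $4$, and $3$ questions drawn without replacement. These values are hypothetical and chosen only to illustrate the point. Exact fractions are used so the comparison is not blurred by rounding.

```python
from fractions import Fraction

pool, known, draws = 10, 4, 3    # hypothetical numbers for illustration

# Without replacement: the success probability changes after each draw
# (4/10, then 3/9, then 2/8 if every drawn question was a known one).
p_all_known = Fraction(1)
for i in range(draws):
    p_all_known *= Fraction(known - i, pool - i)

# What a binomial model would give if p stayed fixed at 4/10
p_binomial = Fraction(known, pool) ** draws

print(p_all_known, p_binomial)   # prints: 1/30 8/125
```

The true probability that all three questions are known is $\frac{1}{30}\approx 0.033$, while the binomial model would give $\frac{8}{125}=0.064$, almost twice as large.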

Conclusion

The binomial distribution is one of the most important discrete probability models in IB Mathematics: Analysis and Approaches HL. It applies when there is a fixed number of independent trials, only two outcomes, and the same probability of success each time. Its probability formula,

$$P(X=r)=\binom{n}{r}p^r(1-p)^{n-r},$$

lets you calculate exact and cumulative probabilities in practical situations. Its mean and variance,

$$\mu=np$$

and

$$\sigma^2=np(1-p),$$

help summarize the typical behavior of the random variable. By understanding when and how to use the binomial distribution, you gain a powerful tool for solving problems involving chance, prediction, and decision-making. ✅

Study Notes

  • A binomial random variable counts the number of successes in a fixed number $n$ of trials.
  • The conditions are: fixed $n$, two outcomes, independent trials, and constant success probability $p$.
  • If $X \sim \mathrm{Bin}(n,p)$, then

$$P(X=r)=\binom{n}{r}p^r(1-p)^{n-r}.$$

  • The mean is

$$\mu=np.$$

  • The variance is

$$\sigma^2=np(1-p),$$

and the standard deviation is

$$\sigma=\sqrt{np(1-p)}.$$

  • Use complements when they shorten the work, for example $P(X\ge r)=1-P(X\le r-1)$ for “at least” questions.
  • Check the context carefully to confirm that a binomial model is valid.
  • Binomial distributions are discrete and are a key part of probability modeling in statistics.
