Normal Distribution 📈
Welcome, students! In this lesson, you will learn one of the most important models in statistics: the normal distribution. It appears again and again in real life, from test scores and heights to measurement errors and quality control. By the end of this lesson, you should be able to explain what makes a normal distribution special, use its key terminology, and apply its ideas to solve problems in IB Mathematics: Applications and Interpretation HL.
What you will learn
By studying this lesson, you will be able to:
- Explain the main ideas and terminology behind the normal distribution.
- Use the mean and standard deviation to describe a normal distribution.
- Interpret probabilities and percentages from normal curves.
- Apply standardisation with $z$-scores.
- Connect the normal distribution to data analysis, probability models, and decision-making.
The normal distribution is a central tool in statistics because it helps us understand how data are spread around an average value. Many natural and human-made measurements are close to normal, which makes this model very useful in real-world analysis. 🌍
The shape and meaning of a normal distribution
A normal distribution is a continuous probability distribution that is symmetric and bell-shaped. Its highest point is at the mean, and values get less common as they move farther from the centre. If a data set follows a normal distribution, most values are near the mean, and fewer values are far away from it.
The graph of a normal distribution has these important features:
- It is symmetric about the mean.
- The mean, median, and mode are all equal.
- The total area under the curve is $1$.
- The curve never touches the horizontal axis, though it gets closer and closer to it.
A normal distribution is often written as $X \sim N(\mu, \sigma^2)$, where $\mu$ is the mean and $\sigma$ is the standard deviation. Note that the second parameter in this notation is the variance $\sigma^2$, not the standard deviation. For example, if exam scores are modelled by $X \sim N(70, 9)$, then the mean is $70$ and the standard deviation is $3$, because $\sigma=\sqrt{9}=3$.
Think of $\mu$ as the centre of the bell curve and $\sigma$ as a measure of spread. A larger $\sigma$ gives a wider, flatter curve, while a smaller $\sigma$ gives a narrower, taller curve. 🎯
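To see the effect of $\sigma$ numerically, the short Python sketch below evaluates the height of the curve at its peak for three spreads. It relies on the standard normal density formula $f(x)=\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, which this lesson does not state. The function name `normal_pdf` and the chosen values of $\sigma$ are illustrative assumptions.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the N(mu, sigma^2) density curve at x (hypothetical helper)."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

# Peak height (at x = mu) for three illustrative spreads around the same mean.
mu = 70
peak_heights = {sigma: normal_pdf(mu, mu, sigma) for sigma in (1, 3, 6)}

for sigma, height in peak_heights.items():
    print(f"sigma = {sigma}: peak height = {height:.4f}")
```

The peak gets lower as $\sigma$ grows. Because the total area must stay equal to $1$, a lower peak means the curve is spread more widely.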
Why the normal distribution matters in statistics
In statistics, we often want to describe data and make conclusions from it. The normal distribution is useful because it gives a model for many variables seen in practice. Examples include:
- Heights of people of the same age group
- Measurement errors in experiments
- IQ scores in some populations
- Standardised test scores
- Production measurements in factories
The reason this model is so important is that it helps us make probability statements. For example, if the heights of students are approximately normal, we can estimate the proportion who are taller than a certain height or shorter than another height.
Normal distributions also connect to inferential reasoning. In real studies, we may collect a sample and use it to draw conclusions about a population. If the data are normal, or close to normal, then probability tools based on the normal model can help us make predictions and decisions with evidence.
The empirical rule and what it tells us
For a normal distribution, a useful approximation is the empirical rule:
- About $68\%$ of values lie within $1\sigma$ of the mean.
- About $95\%$ of values lie within $2\sigma$ of the mean.
- About $99.7\%$ of values lie within $3\sigma$ of the mean.
This rule gives a quick way to understand spread without needing exact calculations. For example, suppose the marks on a quiz are normally distributed with mean $60$ and standard deviation $5$. Then:
- About $68\%$ of marks are between $55$ and $65$.
- About $95\%$ of marks are between $50$ and $70$.
- About $99.7\%$ of marks are between $45$ and $75$.
This is helpful when you need a rough estimate or a quick check on whether a result is unusual. Only about $5\%$ of values lie more than $2\sigma$ away from the mean, so such a score is relatively uncommon. Only about $0.3\%$ lie more than $3\sigma$ away, so such a score is very unusual. 📊
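The three percentages in the empirical rule are rounded. As a rough check, the sketch below computes them from the standard normal distribution using Python's built-in `statistics.NormalDist`. The helper name `within_k_sigma` is an assumption made for this illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def within_k_sigma(k):
    """P(-k < Z < k): the proportion within k standard deviations of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

coverage = {k: within_k_sigma(k) for k in (1, 2, 3)}

for k, proportion in coverage.items():
    print(f"within {k} sigma: {proportion:.4f}")
```

The results are close to $0.6827$, $0.9545$, and $0.9973$, which is why the rule is quoted as "about" $68\%$, $95\%$, and $99.7\%$.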
Standardisation and $z$-scores
To compare values from different normal distributions, we use standardisation. This means converting a value $x$ into a $z$-score using:
$$z=\frac{x-\mu}{\sigma}$$
A $z$-score tells us how many standard deviations a value is from the mean.
- If $z=0$, the value is exactly at the mean.
- If $z=1$, the value is $1\sigma$ above the mean.
- If $z=-2$, the value is $2\sigma$ below the mean.
Standardisation is important because it lets us use one common scale for all normal distributions. That means once a value is converted to a $z$-score, we can use standard normal probability tables or technology to find probabilities.
Example 1: Comparing test scores
Suppose School A has scores that are normally distributed with $\mu=72$ and $\sigma=8$. School B has scores with $\mu=68$ and $\sigma=4$. One student scores $80$ in School A, and another student scores $76$ in School B.
For School A:
$$z=\frac{80-72}{8}=1$$
For School B:
$$z=\frac{76-68}{4}=2$$
Even though $76$ is lower than $80$, the student in School B performed better relative to their group because $z=2$ is farther above the mean than $z=1$. This is a powerful example of why standardisation matters. ✅
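The comparison in Example 1 can be reproduced in a few lines of Python. This is a minimal sketch; the function name `z_score` is a hypothetical helper, not part of any library.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

z_a = z_score(80, mu=72, sigma=8)  # student in School A
z_b = z_score(76, mu=68, sigma=4)  # student in School B

print(f"School A: z = {z_a}")
print(f"School B: z = {z_b}")
print("Better relative performance:", "School B" if z_b > z_a else "School A")
```

Comparing $z$-scores instead of raw marks puts both students on the same scale, measured in standard deviations from their own group's mean.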
Probabilities under the normal curve
In a normal distribution, probability is represented by area under the curve. This is a key idea in continuous probability models. Since the total area is $1$, any probability must be between $0$ and $1$.
Common probability questions include:
- Find $P(X<a)$, the probability that $X$ is less than $a$.
- Find $P(X>a)$, the probability that $X$ is greater than $a$.
- Find $P(a<X<b)$, the probability that $X$ lies between $a$ and $b$.
Because the curve is symmetric, the mean splits the area into two equal halves. So for a normal distribution, $P(X<\mu)=0.5$ and $P(X>\mu)=0.5$.
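In practice these areas are found with technology. The sketch below shows one possible way to compute each type of probability in Python, reusing the quiz-mark model from the empirical rule section (mean $60$, standard deviation $5$). The boundary values $55$, $65$, and $70$ are chosen only for illustration.

```python
from statistics import NormalDist

# Quiz-mark model: mean 60, standard deviation 5.
# NormalDist takes the standard deviation, not the variance.
X = NormalDist(mu=60, sigma=5)

p_below_55 = X.cdf(55)             # P(X < 55): area to the left of 55
p_above_70 = 1 - X.cdf(70)         # P(X > 70): area to the right of 70
p_between = X.cdf(65) - X.cdf(55)  # P(55 < X < 65): area between the two values
p_below_mean = X.cdf(60)           # P(X < mu): half of the total area

print(f"P(X < 55)      = {p_below_55:.4f}")
print(f"P(X > 70)      = {p_above_70:.4f}")
print(f"P(55 < X < 65) = {p_between:.4f}")
print(f"P(X < 60)      = {p_below_mean:.4f}")
```

The `cdf` method always gives the area to the left of a value, so a right-tail probability is found by subtracting from $1$, and a "between" probability is found by subtracting two left-hand areas.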
Example 2: A real-world waiting time model
Suppose the time, in minutes, that students wait for the school bus is modelled by $X \sim N(10, 4)$, so $\mu=10$ and $\sigma=2$.
What is the probability that a student waits less than $12$ minutes?
First standardise:
$$z=\frac{12-10}{2}=1$$
Then we use the standard normal distribution. The probability $P(Z<1)$ is about $0.8413$.
So:
$$P(X<12)\approx 0.8413$$
This means about $84.13\%$ of students wait less than $12$ minutes. The remaining $15.87\%$ wait longer. Such results help schools and transport planners make practical decisions. 🚌
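As a check on Example 2, the sketch below finds the probability in two ways: by standardising first and using the standard normal distribution, and by working directly with the model for $X$. Both routes should agree. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 10, 2  # N(10, 4) has variance 4, so the standard deviation is 2

# Route 1: standardise, then use the standard normal distribution.
z = (12 - mu) / sigma
p_via_z = NormalDist().cdf(z)

# Route 2: use the waiting-time model directly.
p_direct = NormalDist(mu=mu, sigma=sigma).cdf(12)

print(f"z = {z}")
print(f"P(Z < z)  = {p_via_z:.4f}")
print(f"P(X < 12) = {p_direct:.4f}")
print(f"P(X > 12) = {1 - p_direct:.4f}")
```

Both routes give about $0.8413$, and the complement is about $0.1587$, matching the percentages above.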
Finding values from probabilities
Sometimes the question gives a probability and asks for a value. For example, you may need to find the score above which only the top $10\%$ of students fall. This is called finding a percentile.
If $P(X<a)=0.90$, then $a$ is the $90$th percentile. To find $a$, you can:
- Convert the probability into a $z$-score using a table or technology.
- Use the formula $x=\mu+z\sigma$.
Example 3: Top performers
Suppose mathematics test scores are modelled by $X \sim N(65, 100)$. Then $\mu=65$ and $\sigma=10$.
Find the score needed to be in the top $10\%$.
Being in the top $10\%$ means scoring above the $90$th percentile, so we need the value of $z$ with $P(Z<z)=0.90$. From a table or technology, this $z$-value is about $1.28$.
Now convert back:
$$x=\mu+z\sigma=65+(1.28)(10)=77.8$$
So a score of about $78$ is needed to be in the top $10\%$. This kind of calculation is useful in grade boundaries, selection tests, and benchmarking. 🏅
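Technology can carry out this inverse calculation directly. The sketch below uses the `inv_cdf` method of Python's `statistics.NormalDist` to find the $90$th percentile of Example 3, both through a $z$-value and directly from the model. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 65, 10  # N(65, 100) has variance 100, so the standard deviation is 10

# Step 1: z-value with 90% of the standard normal area to its left.
z_90 = NormalDist().inv_cdf(0.90)

# Step 2: convert back to the original scale with x = mu + z * sigma.
cutoff_via_z = mu + z_90 * sigma

# Alternative: ask the model for its 90th percentile directly.
cutoff_direct = NormalDist(mu=mu, sigma=sigma).inv_cdf(0.90)

print(f"z for the 90th percentile: {z_90:.4f}")
print(f"cut-off score via z:       {cutoff_via_z:.2f}")
print(f"cut-off score directly:    {cutoff_direct:.2f}")
```

The unrounded $z$-value is slightly larger than $1.28$, so the computed cut-off is about $77.82$ rather than $77.8$. Both round to the same whole-number score of $78$.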
How the normal distribution fits into IB Statistics and Probability
In IB Mathematics: Applications and Interpretation HL, the normal distribution is part of a larger system of ideas in statistics and probability. It connects to:
- Data analysis and interpretation, because we use it to describe the spread of data.
- Statistical processes and distributions, because it is one of the main continuous distributions.
- Probability models, because it helps us find chances of outcomes.
- Inferential reasoning, because we use sample data and models to make informed decisions.
- Real-world decision-making, because many practical situations depend on estimating risk, quality, and performance.
A strong IB answer should show interpretation, not just calculation. For example, if you compute $P(X>80)=0.05$, you should explain that there is a $5\%$ chance of a value above $80$, so values above $80$ are unusual in this model.
It is also important to check whether the normal model is appropriate. Real data may be skewed, have outliers, or be discrete rather than continuous. The normal distribution is a model, not a rule for every situation. Good statistical reasoning means judging whether the model fits the context. 🧠
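One rough way to screen data before using a normal model is to compare the sample with two properties from this lesson: the mean and median should be close, and roughly $68\%$ of values should lie within one standard deviation of the mean. The sketch below applies this to a small data set. The waiting times are hypothetical values invented for illustration, and this screening is an informal check, not a formal test of normality.

```python
from statistics import mean, median, stdev

# Hypothetical waiting times in minutes, invented for illustration only.
sample = [2, 3, 3, 4, 4, 4, 5, 5, 6, 7, 9, 14, 25]

sample_mean = mean(sample)
sample_median = median(sample)
sample_sd = stdev(sample)  # sample standard deviation (divides by n - 1)

# Proportion of values within one standard deviation of the mean.
lower = sample_mean - sample_sd
upper = sample_mean + sample_sd
share_within_1sd = sum(lower <= value <= upper for value in sample) / len(sample)

print(f"mean = {sample_mean}, median = {sample_median}")
print(f"share within 1 standard deviation = {share_within_1sd:.2f} (normal model: about 0.68)")
```

Here the mean is noticeably larger than the median and the share within one standard deviation is well above $0.68$, which suggests right-skewed data with outliers. With so few values this is only an indication, but it is a reason to question the normal model in this context.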
Conclusion
The normal distribution is one of the most important ideas in statistics because it gives a simple and powerful way to model many real-world measurements. You have seen that it is symmetric, bell-shaped, and described by the mean $\mu$ and standard deviation $\sigma$. You have also learned how to use $z$-scores, interpret areas as probabilities, and estimate percentiles.
For IB Mathematics: Applications and Interpretation HL, the normal distribution is more than a formula. It is a tool for reasoning about data, comparing results, and making decisions based on evidence. When used carefully, it connects probability, statistics, and real-life contexts in a meaningful way. ✨
Study Notes
- A normal distribution is a continuous, symmetric, bell-shaped distribution.
- It is written as $X \sim N(\mu, \sigma^2)$.
- The mean $\mu$, median, and mode are equal in a normal distribution.
- The total area under the curve is $1$.
- Probability is represented by area under the curve.
- About $68\%$, $95\%$, and $99.7\%$ of values lie within $1\sigma$, $2\sigma$, and $3\sigma$ of the mean.
- Standardisation uses $z=\frac{x-\mu}{\sigma}$.
- A $z$-score measures how many standard deviations a value is from the mean.
- To find a percentile, use a $z$-score and then convert back with $x=\mu+z\sigma$.
- The normal distribution is widely used in data analysis, probability models, and inferential reasoning.
- Always check whether the normal model is suitable for the context before using it.
