Topic 3: Probability Distributions

Lesson 3.5: The Normal Distribution and the Normal Approximation to the Binomial

Official syllabus section covering Lesson 3.5 (The normal distribution and the normal approximation to the binomial) within Topic 3 (Probability Distributions): the properties of the normal distribution, including its bell-shaped curve and the approximate proportions of observations within one, two and three standard deviations of the mean; finding probabilities and unknown parameters (mean, standard deviation or a value of the variable) for a normal distribution using a calculator, and modelling real situations.

Introduction

In this lesson, we examine the properties of the normal distribution and its role as a model for continuous random variables. We also show how the normal distribution can approximate the binomial distribution under certain conditions. By the end of this lesson, students will understand the key characteristics of the normal distribution, how to find probabilities and parameters using it, and how it relates to the binomial distribution.

Learning Objectives

  • Understand the properties of the normal distribution, including its bell-shaped curve and the approximate proportions of observations within one, two, and three standard deviations of the mean.
  • Learn how to find probabilities and unknown parameters (mean, standard deviation, or a value of the variable) for a normal distribution using a calculator.
  • Use the normal distribution to approximate a binomial distribution under suitable conditions, applying a continuity correction.
  • Recognize that approximately 95% of observations lie within two standard deviations and around 99.7% within three standard deviations of the mean.
  • Determine normal probabilities and find an unknown mean, standard deviation, or value directly from a calculator while modeling real situations.

Understanding the Normal Distribution

The normal distribution, often referred to as the Gaussian distribution, is a continuous probability distribution characterized by its symmetrical, bell-shaped curve. It is defined by two parameters: the mean ($\mu$) and the standard deviation ($\sigma$). The mean represents the center of the distribution, while the standard deviation indicates the spread or dispersion of the data around the mean.

Properties of the Normal Distribution

  1. Bell Shape: The graph of a normal distribution is bell-shaped and symmetric about the mean. This property indicates that most of the observations cluster around the central peak and the probabilities for values further away from the mean taper off symmetrically.
  2. Mean, Median, Mode: In a normal distribution, the mean, median, and mode all coincide, meaning they occur at the same point, which is also the highest point of the curve.
  3. Empirical Rule: The empirical rule (or the 68-95-99.7 rule) outlines the proportion of observations within one, two, and three standard deviations of the mean:
  • Approximately 68% of observations lie within one standard deviation: $ \mu - \sigma < X < \mu + \sigma $
  • Approximately 95% lie within two standard deviations: $ \mu - 2\sigma < X < \mu + 2\sigma $
  • Approximately 99.7% lie within three standard deviations: $ \mu - 3\sigma < X < \mu + 3\sigma $

Example 1: Empirical Rule Application

Suppose we have a normally distributed dataset of students' heights with a mean of $160 \text{ cm}$ and a standard deviation of $10 \text{ cm}$. According to the empirical rule:

  • About 68% of students will have heights between $160 - 10 = 150 \text{ cm}$ and $160 + 10 = 170 \text{ cm}$.
  • About 95% will have heights between $160 - 20 = 140 \text{ cm}$ and $160 + 20 = 180 \text{ cm}$.
  • About 99.7% will have heights between $160 - 30 = 130 \text{ cm}$ and $160 + 30 = 190 \text{ cm}$.
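
The percentages in the empirical rule are rounded values. As a rough check, the sketch below recomputes them for the heights in Example 1 using `NormalDist` from Python's standard `statistics` module. The helper name `proportion_within` is an illustrative choice, not part of the lesson.

```python
from statistics import NormalDist

# Heights from Example 1: mean 160 cm, standard deviation 10 cm.
heights = NormalDist(mu=160, sigma=10)

def proportion_within(dist, k):
    """Proportion of the distribution within k standard deviations of the mean."""
    lower = dist.mean - k * dist.stdev
    upper = dist.mean + k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)

for k in (1, 2, 3):
    # Prints 0.6827, 0.9545 and 0.9973 for k = 1, 2, 3.
    print(f"within {k} standard deviation(s): {proportion_within(heights, k):.4f}")
```

The same proportions come out for any mean and standard deviation, because the rule depends only on how many standard deviations the interval spans.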

Finding Probabilities in the Normal Distribution

When working with the normal distribution, it is often necessary to calculate probabilities associated with specific values or ranges. This is typically done using z-scores, which express a value as the number of standard deviations it lies above or below the mean. The formula for calculating a z-score is:

$$z = \frac{X - \mu}{\sigma}$$

where:

  • $X$ is the value to be standardized,
  • $\mu$ is the mean,
  • $\sigma$ is the standard deviation.

Once a z-score is calculated, probabilities can be found using standard normal distribution tables or calculators.

Example 2: Calculating a Probability

Consider a normally distributed variable with a mean of $50$ and a standard deviation of $5$. What is the probability that a randomly selected value is less than $60$?

  1. Calculate the z-score:

$$z = \frac{X - \mu}{\sigma} = \frac{60 - 50}{5} = 2$$

  2. Find the probability: Using a standard normal table (or a calculator), the cumulative probability for a z-score of $2$ is $P(Z < 2) \approx 0.9772$.

Thus, the probability that a randomly selected value is less than $60$ is about $97.72\%$.
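
Example 2 can be reproduced in software as a stand-in for the calculator step. The sketch below uses `NormalDist` from Python's standard `statistics` module, with illustrative variable names. The final line goes the other way, finding a value from a given probability. The 95% figure there is an assumed illustration, not part of the original example.

```python
from statistics import NormalDist

# Distribution from Example 2: mean 50, standard deviation 5.
dist = NormalDist(mu=50, sigma=5)
x = 60

# Route 1: standardize, then use the standard normal distribution.
z = (x - dist.mean) / dist.stdev
p_table = NormalDist().cdf(z)      # P(Z < 2)

# Route 2: ask the distribution directly, as a calculator would.
p_direct = dist.cdf(x)             # P(X < 60)

print(f"z = {z:.2f}")                    # z = 2.00
print(f"P(X < 60) = {p_direct:.4f}")     # P(X < 60) = 0.9772

# Inverse problem (illustrative): the value below which 95% of observations fall.
x_95 = dist.inv_cdf(0.95)
print(f"95th percentile = {x_95:.2f}")   # 95th percentile = 58.22
```

Both routes give the same probability. Standardizing only matters when reading a printed table, which is tabulated for the standard normal distribution.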

The Normal Approximation to the Binomial Distribution

The binomial distribution models a discrete random variable: the number of successes in $n$ independent Bernoulli trials, each with the same probability of success $p$. For large $n$, the binomial distribution can be approximated using the normal distribution. This approximation is useful because it replaces a sum of many binomial terms with a single normal probability.

Conditions for Normal Approximation

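The normal approximation to the binomial distribution is generally considered appropriate when the expected numbers of successes and failures are both large enough. A common rule of thumb (some textbooks use $10$ instead of $5$) is: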

  • Both $np \geq 5$ and $n(1 - p) \geq 5$, where $n$ is the number of trials and $p$ is the probability of success.

When using the normal approximation to the binomial distribution, a continuity correction is applied to adjust for the discrete nature of the binomial distribution. This involves adding $0.5$ to, or subtracting $0.5$ from, the discrete value when calculating probabilities, depending on the direction of the inequality.
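
As a reference, the usual continuity corrections can be written out case by case. Here $X$ is the binomial variable, $k$ is a whole number, and $Y$ denotes the approximating normal variable with mean $np$ and standard deviation $\sqrt{np(1-p)}$. The symbol $Y$ is introduced here only to keep the two variables apart.

$$\begin{aligned}
P(X \leq k) &\approx P(Y < k + 0.5) \\
P(X < k) &\approx P(Y < k - 0.5) \\
P(X \geq k) &\approx P(Y > k - 0.5) \\
P(X > k) &\approx P(Y > k + 0.5) \\
P(X = k) &\approx P(k - 0.5 < Y < k + 0.5)
\end{aligned}$$

In each case the boundary is moved by $0.5$ so that the normal interval covers exactly the whole numbers included in the binomial event.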

Example 3: Normal Approximation

Suppose we have a binomial distribution with parameters $n = 100$ and $p = 0.3$. We want to find the probability of getting more than $40$ successes.

  1. Check conditions:
  • $np = 100 \times 0.3 = 30$ (satisfies $np \geq 5$)
  • $n(1-p) = 100 \times 0.7 = 70$ (satisfies $n(1-p) \geq 5$)
  2. Calculate mean and standard deviation:

$$\mu = np = 30, \quad \sigma = \sqrt{np(1-p)} = \sqrt{100 \times 0.3 \times 0.7} \approx 4.58$$

  3. Apply continuity correction for $P(X > 40)$. Since $X$ is discrete, $P(X > 40) = P(X \geq 41)$, so the boundary for the approximating normal variable $Y$ (mean $30$, standard deviation $4.58$) sits halfway between $40$ and $41$:

$$P(X > 40) \approx P(Y > 40.5)$$

  4. Calculate z-score:

$$z = \frac{40.5 - 30}{4.58} \approx 2.29$$

  5. Find the probability: Looking up $z = 2.29$ in the standard normal distribution table gives a cumulative probability of $P(Z < 2.29) \approx 0.9890$. Thus, the probability of getting more than $40$ successes is approximately $1 - 0.9890 = 0.0110$ (or $1.10\%$).
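
To see how close the approximation in Example 3 is, the sketch below compares it with the exact binomial tail probability. It uses only Python's standard library (`math.comb` and `statistics.NormalDist`), and the variable names are illustrative. The exact value is computed rather than quoted, so run the code to see both figures.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.3

# Exact binomial tail: P(X > 40) = P(X >= 41), summed term by term.
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(41, n + 1))

# Normal approximation with continuity correction: P(Y > 40.5).
approx_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
approx = 1 - approx_dist.cdf(40.5)

print(f"exact binomial:       {exact:.4f}")
print(f"normal approximation: {approx:.4f}")
```

The two values should be close but not identical. The binomial distribution with $p = 0.3$ is slightly skewed, so a symmetric normal curve cannot match its upper tail exactly.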

Conclusion

In this lesson, students have learned about the normal distribution, its properties, and how it can be used to model random variables. We examined the empirical rule, found probabilities using z-scores, and applied the normal approximation to the binomial distribution. By understanding these concepts, students can better analyze data and practical situations involving random variables. As statistics becomes an essential tool in various fields, mastering these concepts will enhance students' ability to interpret data meaningfully.

Study Notes

  • The normal distribution is bell-shaped and symmetric around the mean.
  • Mean, median, and mode are equal in a normal distribution.
  • Use the empirical rule: 68% within 1 $\sigma$, 95% within 2 $\sigma$, 99.7% within 3 $\sigma$.
  • Calculate z-scores to find probabilities.
  • Use a continuity correction when approximating a binomial distribution with a normal distribution.
  • Conditions for approximation: $np \geq 5$ and $n(1 - p) \geq 5$.
