6. Inference for Categorical Data: Proportions

Carrying Out A Test For A Population Proportion

Students, imagine a school wants to know whether most students prefer online homework over paper homework 📱✏️. A survey gives some evidence, but a single sample does not automatically prove what the whole school thinks. In AP Statistics, carrying out a test for a population proportion helps you use sample data to make a decision about a population claim.

In this lesson, you will learn how to:

  • identify the parts of a hypothesis test for a population proportion,
  • check the conditions needed to use the test,
  • calculate the test statistic and $p$-value,
  • make a conclusion in context,
  • connect this process to the larger unit on inference for categorical data.

A population proportion is written as $p$. It represents the true proportion of a population with a certain characteristic, such as the proportion of voters who support a candidate or the proportion of students who arrive at school before $8{:}00$ a.m.

What a One-Proportion Test Is Asking

A significance test for a population proportion is used when you want to decide whether sample data give convincing evidence for a claim about $p$. The claim is usually written as a null hypothesis, $H_0$, and an alternative hypothesis, $H_a$.

For example, suppose a principal claims that $60\%$ of students at a school prefer flexible lunch periods. If you survey a random sample of students, you can test whether the data give convincing evidence against that claim.

The hypotheses are written using $p$:

$$H_0: p = 0.60$$

$$H_a: p \ne 0.60$$

The null hypothesis usually contains the value being tested, called $p_0$. The alternative hypothesis depends on the research question:

  • $H_a: p \ne p_0$ for a two-sided test,
  • $H_a: p > p_0$ for a right-tailed test,
  • $H_a: p < p_0$ for a left-tailed test.

These forms matter because they determine how the $p$-value is found and how the conclusion is stated.

Step 1: State the Hypotheses Clearly

The first step in AP Statistics is to write the hypotheses in context. Do not just use symbols; say what the symbols mean.

Example: A company says that $25\%$ of its batteries last at least five years, so the null value is $p_0 = 0.25$. A consumer group thinks the true proportion is smaller.

The hypotheses are:

$$H_0: p = 0.25$$

$$H_a: p < 0.25$$

Here, $p$ is the true proportion of all batteries from the company that last at least five years.

Why do we start with $H_0$? Because it represents the status quo or the claim we test against. The alternative is the competing idea that the sample may support.

A common AP tip is to use the exact wording of the context when explaining the hypotheses. If the question is about voters, say “the proportion of voters.” If it is about students, say “the proportion of students.” This keeps your answer specific and accurate ✅.

Step 2: Check the Conditions

Before using a one-proportion $z$ test, you must check conditions. These make sure the test is reasonable.

1. Random Condition

The data should come from a random sample or randomized experiment. For a sample survey, this means the sample was chosen randomly from the population.

2. Independence Condition

Responses should be independent. If sampling without replacement, a common rule is the $10\%$ condition: the sample size $n$ should be less than $10\%$ of the population size $N$, or $n < 0.1N$.

3. Large Counts Condition

The expected number of successes and failures under the null hypothesis should both be at least $10$:

$$np_0 \ge 10$$

$$n(1-p_0) \ge 10$$

These are called the success-failure conditions. They help justify using the normal approximation for the sampling distribution of $\hat{p}$.

For example, if $n = 100$ and $p_0 = 0.60$, then

$$np_0 = 100(0.60) = 60$$

$$n(1-p_0) = 100(0.40) = 40$$

Since both are at least $10$, the normal model is appropriate.

If conditions are not met, the usual one-proportion $z$ test is not valid, so you should not continue with that procedure.
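The three checks above can be sketched in Python. This is a minimal illustration, not a standard routine: the function name `check_conditions`, its parameters, and the dictionary keys it returns are all assumptions made for this example. Randomness cannot be verified from the numbers alone, so it is passed in as a yes/no judgment about how the data were collected.

```python
def check_conditions(n, p0, population_size=None, is_random=True):
    """Check the conditions for a one-proportion z test (illustrative helper)."""
    # 10% condition: relevant when sampling without replacement from a finite
    # population. If no population size is supplied, it is left unevaluated.
    if population_size is None:
        ten_percent = None
    else:
        ten_percent = n < 0.1 * population_size

    # Large counts: expected successes and failures under H0 are both >= 10.
    expected_successes = n * p0
    expected_failures = n * (1 - p0)
    large_counts = expected_successes >= 10 and expected_failures >= 10

    return {
        "random": is_random,
        "ten_percent": ten_percent,
        "large_counts": large_counts,
    }


# Lesson example: n = 100 and p0 = 0.60.
# The population size of 2000 is a hypothetical value for illustration.
print(check_conditions(n=100, p0=0.60, population_size=2000))
```

If any entry comes back `False`, the one-proportion $z$ test should not be used as is.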

Step 3: Find the Test Statistic

The test statistic measures how far the sample proportion is from the null value, in standard error units.

The sample proportion is

$$\hat{p} = \frac{x}{n}$$

where $x$ is the number of successes in the sample.

The one-proportion $z$ statistic is

$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$$

Notice that the standard deviation in the denominator uses $p_0$, not $\hat{p}$. That is because the null hypothesis is assumed true when calculating the test statistic.

Example: Suppose $n = 200$ students are surveyed, and $118$ say they prefer school starting later. Test the claim that $p = 0.50$.

First find

$$\hat{p} = \frac{118}{200} = 0.59$$

Then compute the standard error under the null:

$$\sqrt{\frac{0.50(1-0.50)}{200}} = \sqrt{\frac{0.25}{200}} \approx 0.0354$$

So

$$z = \frac{0.59-0.50}{\sqrt{\frac{0.25}{200}}} \approx 2.55$$

This $z$ value tells you how unusual the sample result would be if $p = 0.50$ were true.
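The same arithmetic can be checked with a few lines of Python. This is only a sketch of the formula above; the function name `one_proportion_z` is an assumption chosen for this example.

```python
from math import sqrt


def one_proportion_z(x, n, p0):
    """Return the one-proportion z statistic for x successes in n trials."""
    p_hat = x / n                      # sample proportion
    se0 = sqrt(p0 * (1 - p0) / n)      # standard deviation of p-hat under H0 (uses p0)
    return (p_hat - p0) / se0


# Lesson example: 118 of 200 students, null value p0 = 0.50.
z = one_proportion_z(x=118, n=200, p0=0.50)
print(round(z, 2))  # 2.55
```

A sample proportion about $2.55$ standard deviations above $p_0$ would be fairly unusual if the null hypothesis were true.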

Step 4: Find the $p$-Value

The $p$-value is the probability, assuming $H_0$ is true, of getting a sample result at least as extreme as the one observed.

  • For a right-tailed test, the $p$-value is the area to the right of the test statistic.
  • For a left-tailed test, it is the area to the left.
  • For a two-sided test, it is the area in both tails beyond $\pm |z|$.

A small $p$-value means the sample result would be unlikely if $H_0$ were true. A large $p$-value means the sample result is not surprising under $H_0$.
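As a rough sketch, the three tail rules can be written in Python using the standard normal distribution from the built-in `statistics` module. The function name `p_value_from_z` and the labels `"greater"`, `"less"`, and `"two-sided"` are assumptions made for this example.

```python
from statistics import NormalDist


def p_value_from_z(z, alternative):
    """Return the p-value for a z statistic under the chosen alternative."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":      # H_a: p > p0, area to the right of z
        return 1 - std_normal.cdf(z)
    if alternative == "less":         # H_a: p < p0, area to the left of z
        return std_normal.cdf(z)
    if alternative == "two-sided":    # H_a: p != p0, area in both tails
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


# z = 2.0 is an arbitrary illustrative value, not taken from a lesson example.
for alt in ("greater", "less", "two-sided"):
    print(alt, round(p_value_from_z(2.0, alt), 4))
```

The same $z$ value gives three different $p$-values, which is why the alternative hypothesis must be chosen before looking at the data.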

Common decision rule:

  • If $p\text{-value} \le \alpha$, reject $H_0$.
  • If $p\text{-value} > \alpha$, fail to reject $H_0$.

Here, $\alpha$ is the significance level, often $0.05$ in AP Statistics.

Important language: say reject or fail to reject the null hypothesis. Do not say “accept $H_0$,” because failing to reject does not prove the null is true.

Step 5: Make a Conclusion in Context

The last step is to translate the statistical decision into a real-world conclusion.

Suppose a test gives $p\text{-value} = 0.018$ and the significance level is $\alpha = 0.05$. Since $0.018 < 0.05$, reject $H_0$.

A correct conclusion might be:

“Because the $p$-value is less than $\alpha = 0.05$, there is convincing evidence that the true proportion of students who prefer later school start times is different from $0.50$.”

That conclusion has three important parts:

  1. It mentions the decision.
  2. It includes context.
  3. It refers to the population parameter $p$.

Avoid saying the sample shows the claim is “proven.” Statistics gives evidence, not absolute certainty 🔍.

Understanding Type I and Type II Errors

Every hypothesis test can make errors.

  • A Type I error happens if you reject $H_0$ when $H_0$ is actually true.
  • A Type II error happens if you fail to reject $H_0$ when $H_0$ is actually false.

For a proportion test, imagine testing whether $p = 0.25$ for a company’s batteries.

  • A Type I error would mean concluding the battery lifetime proportion is not $0.25$ when it actually is $0.25$.
  • A Type II error would mean missing a real difference and concluding there is not enough evidence against $p = 0.25$ when the true proportion is different.

The significance level $\alpha$ is the probability of a Type I error, assuming $H_0$ is true. A smaller $\alpha$ lowers the chance of Type I error, but it can make Type II error more likely.

This tradeoff matters in real life. For example, in medical testing, a Type I error might falsely suggest a treatment works, while a Type II error might miss a treatment that actually helps patients.
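The tradeoff can be illustrated numerically. The sketch below computes, using exact binomial probabilities, how often a two-sided one-proportion $z$ test of $H_0: p = 0.25$ would reject. The sample size $n = 200$ and the alternative true proportion $0.35$ are hypothetical values chosen only for illustration, and the function name `rejection_probability` is an assumption. Because counts are discrete, the actual Type I error rate is only approximately $\alpha$.

```python
from math import comb, sqrt
from statistics import NormalDist


def rejection_probability(n, p0, p_true, alpha):
    """Chance that a two-sided one-proportion z test rejects H0: p = p0
    when the true proportion is p_true (exact binomial calculation)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se0 = sqrt(p0 * (1 - p0) / n)
    total = 0.0
    for x in range(n + 1):
        z = (x / n - p0) / se0
        if abs(z) >= z_crit:  # equivalent to p-value <= alpha for a two-sided test
            total += comb(n, x) * p_true**x * (1 - p_true) ** (n - x)
    return total


# Hypothetical setup: n = 200 batteries, H0: p = 0.25, true p = 0.35 under the alternative.
n, p0, p_alt = 200, 0.25, 0.35
for alpha in (0.05, 0.01):
    type_1 = rejection_probability(n, p0, p_true=p0, alpha=alpha)
    type_2 = 1 - rejection_probability(n, p0, p_true=p_alt, alpha=alpha)
    print(f"alpha={alpha}: P(Type I) = {type_1:.3f}, P(Type II) = {type_2:.3f}")
```

Moving from $\alpha = 0.05$ to $\alpha = 0.01$ shrinks the rejection region, so the Type I error probability goes down while the Type II error probability goes up.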

How This Fits the Bigger Picture of Inference for Proportions

Carrying out a one-proportion test is part of the larger AP Statistics topic of inference for categorical data. In this unit, you study how to use sample data to estimate or test population proportions.

The main ideas connect like this:

  • A confidence interval estimates a plausible range for $p$.
  • A significance test checks whether data give convincing evidence against a specific claim about $p$.
  • A two-proportion test compares two population proportions, using a similar logic but a different formula.
  • Errors help explain what can go wrong when making conclusions from sample data.

The same general framework appears again and again in AP Statistics:

  1. State the hypotheses.
  2. Check conditions.
  3. Compute the test statistic.
  4. Find the $p$-value.
  5. Draw a conclusion in context.

Once you learn this flow for a population proportion, you can reuse the same reasoning for many other inference problems.
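To show how the steps chain together, here is a minimal end-to-end sketch in Python. The names `ZTestResult` and `one_proportion_z_test` are assumptions for this example, and the defaults (a two-sided alternative and $\alpha = 0.05$) are choices made here, not requirements. The code only checks the large counts condition; the random and $10\%$ conditions depend on how the data were collected, and the hypotheses and final conclusion still have to be written in context by you.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist


@dataclass
class ZTestResult:
    p_hat: float
    z: float
    p_value: float
    reject_h0: bool


def one_proportion_z_test(x, n, p0, alternative="two-sided", alpha=0.05):
    """Run the computational steps of a one-proportion z test."""
    # Check conditions: large counts under H0 (others depend on study design).
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("Large counts condition not met; z test not appropriate.")

    # Compute the test statistic, using p0 in the standard deviation.
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

    # Find the p-value from the standard normal distribution.
    cdf = NormalDist().cdf
    if alternative == "two-sided":
        p_value = 2 * (1 - cdf(abs(z)))
    elif alternative == "greater":
        p_value = 1 - cdf(z)
    elif alternative == "less":
        p_value = cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")

    # Decision: reject H0 when the p-value is at most alpha.
    return ZTestResult(p_hat, z, p_value, p_value <= alpha)


# Lesson data: 118 of 200 students, H0: p = 0.50 (two-sided alternative assumed).
print(one_proportion_z_test(x=118, n=200, p0=0.50))
```

The printed result gives the numbers; turning `reject_h0` into a sentence about the true proportion of students is the part that earns credit on the AP exam.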

Conclusion

Students, carrying out a test for a population proportion is a structured way to use sample data to make a decision about a population claim. You begin by writing hypotheses about $p$, check the random, independence, and large counts conditions, calculate the one-proportion $z$ statistic, and use the $p$-value to decide whether to reject $H_0$.

The most important skill is not just doing the math, but explaining what it means in context. A good AP Statistics conclusion clearly states the evidence, the decision, and the real-world meaning. This lesson is a foundation for the rest of inference for categorical data, including confidence intervals, two-proportion methods, and understanding statistical error 📊.

Study Notes

  • The population proportion is written as $p$.
  • State hypotheses in context, using $H_0: p = p_0$ and an appropriate alternative $H_a$.
  • Check three conditions before using a one-proportion $z$ test:
      • Random condition
      • Independence condition, often using $n < 0.1N$
      • Large counts condition: $np_0 \ge 10$ and $n(1-p_0) \ge 10$
  • Compute the sample proportion with $\hat{p} = \frac{x}{n}$.
  • The test statistic is

$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$$

  • The $p$-value is the probability of results at least as extreme as the sample result, assuming $H_0$ is true.
  • If $p\text{-value} \le \alpha$, reject $H_0$; otherwise, fail to reject $H_0$.
  • Always write conclusions in context and refer to the population parameter $p$.
  • A Type I error is rejecting a true $H_0$.
  • A Type II error is failing to reject a false $H_0$.
  • Significance tests for proportions are a key part of inference for categorical data and connect directly to confidence intervals and two-proportion procedures.
