Lesson 5.5: First Hypothesis Tests, Errors and Power

Introduction

In this lesson, we introduce hypothesis testing, a core tool of statistical inference. We cover hypothesis tests for binomial proportions, tests for the mean of a normal distribution with known variance, and the roles that errors and power play in hypothesis testing. By the end of this lesson, you should be able to:

Conduct hypothesis tests for binomial proportions and means with known variance.
Explain Type I and Type II errors, relate the Type I error to the significance level, and calculate the probability of a Type II error.
Differentiate between critical regions and p-values.
Interpret the results of hypothesis tests in contextual settings.

Hypothesis Testing for a Binomial Proportion

Concept Overview

Hypothesis testing is a statistical method that uses sample data to evaluate a hypothesis about a population parameter. In the case of a binomial proportion, we often want to know whether the proportion of successes in a population is equal to, greater than, or less than a claimed value.

The Hypotheses

In hypothesis testing, we formulate two competing hypotheses:

Null Hypothesis ($H_0$): This states that there is no effect or difference. For a binomial proportion, it usually takes the form $H_0: p = p_0$, where $p_0$ is the hypothesized population proportion.
Alternative Hypothesis ($H_a$): This represents what we want to test, indicating a significant difference or effect. It can be one-sided ($H_a: p > p_0$ or $H_a: p < p_0$) or two-sided ($H_a: p \neq p_0$).

Conducting the Test

Collect Sample Data: Assume we have a sample of size $n$ with $x$ successes.
Calculate the Sample Proportion: The sample proportion is given by $\hat{p} = \frac{x}{n}$.
Determine the Test Statistic: When $n$ is large enough for the normal approximation to the binomial to be reasonable (a common rule of thumb is $np_0 \geq 10$ and $n(1 - p_0) \geq 10$), the test statistic is:

$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}$$

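Find the P-value: Using the standard normal distribution (z-table), compute $P(Z \geq z)$ for $H_a: p > p_0$, $P(Z \leq z)$ for $H_a: p < p_0$, or $2P(Z \geq |z|)$ for $H_a: p \neq p_0$.
Make a Decision: Compare the p-value against the significance level $\alpha$ (commonly 0.05). Reject $H_0$ if the p-value is less than $\alpha$; otherwise, fail to reject $H_0$.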

Example

Suppose we want to test whether the proportion of people who prefer tea over coffee is greater than 0.5. In a sample of 100 people, 60 prefer tea.

Set the Hypotheses: $H_0: p = 0.5$, $H_a: p > 0.5$.
Sample Data: $n = 100$, $x = 60$, $\hat{p} = \frac{60}{100} = 0.6$.
Calculate Test Statistic:

$$z = \frac{0.6 - 0.5}{\sqrt{\frac{0.5 \times 0.5}{100}}} = \frac{0.1}{0.05} = 2$$

Find the P-value: Because the alternative is $p > 0.5$, the p-value is the upper-tail probability $P(Z \geq 2)$, which a z-table gives as approximately 0.0228.
Decision: If $\alpha = 0.05$, since $0.0228 < 0.05$, we reject the null hypothesis. We conclude there is sufficient evidence to suggest a majority prefers tea.
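
The calculation above can be checked with a short script. This is a minimal sketch using only Python's standard library; the function name `proportion_z_test` and its `alternative` argument are illustrative choices, not part of the lesson.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, p0, alternative="greater"):
    """One-sample z-test for a proportion (normal approximation).

    Returns (z, p_value). `alternative` is "greater", "less", or "two-sided".
    """
    p_hat = x / n
    se = sqrt(p0 * (1 - p0) / n)  # standard error computed under H0
    z = (p_hat - p0) / se
    cdf = NormalDist().cdf(z)     # P(Z <= z) for a standard normal Z
    if alternative == "greater":
        p_value = 1 - cdf
    elif alternative == "less":
        p_value = cdf
    elif alternative == "two-sided":
        p_value = 2 * min(cdf, 1 - cdf)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Tea example: 60 successes out of 100, H0: p = 0.5, Ha: p > 0.5.
z, p_value = proportion_z_test(x=60, n=100, p0=0.5, alternative="greater")
print(f"z = {z:.4f}, p-value = {p_value:.4f}")
```

The printed values should agree with the hand calculation ($z = 2$, p-value about 0.0228) up to rounding.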

Hypothesis Testing for the Mean of a Normal Distribution

Concept Overview

In many practical situations, we need to test hypotheses about the mean of a population. If the population is approximately normally distributed and the variance is known, we can use an approach similar to the one used for binomial proportions.

The Hypotheses

Just as with binomial proportions, we have the null and alternative hypotheses:

Null Hypothesis ($H_0$): $H_0: \mu = \mu_0$, where $\mu_0$ is the claimed population mean.
Alternative Hypothesis ($H_a$): This can be one-sided or two-sided depending on what we wish to test.

Conducting the Test

Collect Sample Data: Assume we have a sample of size $n$ and the sample mean $\bar{x}$.
Calculate the Test Statistic:

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

where $\sigma$ is the known population standard deviation.

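Find the P-value: Use the standard normal distribution to find the tail probability for the calculated z-score: one tail for a one-sided alternative, or $2P(Z \geq |z|)$ for a two-sided alternative.
Make a Decision: Compare the p-value to the significance level $\alpha$. Reject $H_0$ if the p-value is less than $\alpha$; otherwise, fail to reject $H_0$.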

Example

Suppose we want to test whether the average height of male students in a certain school is significantly different from 170 cm. We take a sample of 50 students with an average height of 172 cm and a known population standard deviation of 10 cm.

Set the Hypotheses: $H_0: \mu = 170$, $H_a: \mu \neq 170$.
Sample Data: $n = 50$, $\bar{x} = 172$.
Calculate Test Statistic:

$$z = \frac{172 - 170}{10/\sqrt{50}} = \frac{2}{1.4142} \approx 1.4142$$

Find the P-value: Because the test is two-sided, the p-value is $2P(Z \geq 1.4142)$, which is approximately 0.157.
Decision: If $\alpha = 0.05$, since $0.157 > 0.05$, we fail to reject the null hypothesis. We do not have sufficient evidence to claim the average height is significantly different from 170 cm.
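
The same example can be reproduced in code. The sketch below assumes a two-sided alternative and a known $\sigma$, as in the example; `mean_z_test` is a hypothetical helper name introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def mean_z_test(x_bar, mu0, sigma, n):
    """Two-sided z-test for a mean with known population standard deviation."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    # Two-sided p-value: probability of a |Z| at least as large as |z| under H0.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Height example: n = 50, sample mean 172 cm, sigma = 10 cm, H0: mu = 170.
z, p_value = mean_z_test(x_bar=172, mu0=170, sigma=10, n=50)
alpha = 0.05
reject = p_value < alpha
print(f"z = {z:.4f}, p-value = {p_value:.4f}, reject H0: {reject}")
```

Here `reject` comes out false, in line with the decision to fail to reject $H_0$.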

Type I and Type II Errors

Understanding Errors

In hypothesis testing, we can make two types of errors:

Type I Error ($\alpha$): This occurs when we reject the null hypothesis when it is actually true. The probability of making a Type I error is denoted by $\alpha$, the significance level of the test.
Type II Error ($\beta$): This occurs when we fail to reject the null hypothesis when it is false. The probability of making a Type II error is denoted by $\beta$.

Relationship to Significance Level

The significance level $\alpha$ is a critical component of hypothesis testing. It defines the cutoff for deciding whether to reject the null hypothesis. For a fixed sample size and a fixed true effect, lowering $\alpha$ reduces the risk of a Type I error but increases the risk of a Type II error. There is therefore a trade-off between the two errors that statisticians must consider when designing tests.
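
The trade-off can be illustrated numerically. The sketch below assumes an upper-tailed z-test with known $\sigma$ and uses hypothetical values ($\mu_0 = 170$, a true mean of 172, $\sigma = 10$, $n = 50$) that were chosen only for illustration. It computes the Type II error probability $\beta$ at several significance levels.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def type_ii_error(alpha, mu0, mu1, sigma, n):
    """Beta for an upper-tailed z-test of H0: mu = mu0 when the true mean is mu1."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)   # reject H0 when z > z_crit
    shift = (mu1 - mu0) / (sigma / sqrt(n))  # true effect in standard-error units
    # Under mu = mu1 the statistic z is normal with mean `shift` and variance 1,
    # so P(fail to reject) = P(z <= z_crit) = Phi(z_crit - shift).
    return STD_NORMAL.cdf(z_crit - shift)


# Hypothetical setting, for illustration only.
betas = {a: type_ii_error(a, mu0=170, mu1=172, sigma=10, n=50) for a in (0.10, 0.05, 0.01)}
for a, b in betas.items():
    print(f"alpha = {a:.2f} -> beta = {b:.3f}")
```

As $\alpha$ decreases from 0.10 to 0.01, the computed $\beta$ increases, which is the trade-off described above.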

Calculating Type II Error Probability

To compute the probability of a Type II error, we must specify a particular value of the parameter under the alternative hypothesis, because $\beta$ depends on how far the true value lies from the value stated in $H_0$. The calculation is usually expressed through the power of the test, which is defined as:

$$\text{Power} = 1 - P(\text{Type II error}) = 1 - \beta$$

This power measures the likelihood of correctly rejecting the null hypothesis when it is false. High power is desirable for a test to be effective.
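
As a sketch of how $\beta$ can be computed in one concrete case (an illustrative setting, not the only one), consider the upper-tailed z-test of $H_0: \mu = \mu_0$ against $H_a: \mu > \mu_0$ with known $\sigma$, and suppose the true mean is a specific value $\mu_1 > \mu_0$. The test rejects $H_0$ when $z > z_\alpha$, and under $\mu = \mu_1$ the statistic $z$ is normal with mean $\frac{\mu_1 - \mu_0}{\sigma/\sqrt{n}}$ and variance 1, so

$$\beta = P(z \leq z_\alpha \mid \mu = \mu_1) = \Phi\left(z_\alpha - \frac{\mu_1 - \mu_0}{\sigma/\sqrt{n}}\right)$$

where $\Phi$ is the standard normal cumulative distribution function. With the hypothetical values $\mu_0 = 170$, $\mu_1 = 172$, $\sigma = 10$, $n = 50$, and $\alpha = 0.05$:

$$\beta \approx \Phi(1.645 - 1.414) = \Phi(0.231) \approx 0.59, \qquad \text{Power} = 1 - \beta \approx 0.41$$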

Example

Imagine we are testing a new drug's effectiveness against a control where $H_0: \mu = 0$ (no effect) and $H_a: \mu \neq 0$ (the drug has some effect). Let's say the significance level $\alpha = 0.05$.

If the drug is actually effective but the test fails to reject $H_0$, we have made a Type II error. The power of the test tells us how likely we are to avoid this error and correctly detect the drug's effectiveness.

In this instance, if we find that the power of our test is 0.80, this implies:

There is an 80% probability of correctly rejecting the null hypothesis if the drug has an effect of the size assumed in the power calculation, and a 20% probability of failing to detect that effect (Type II error).
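
One way to build intuition for these probabilities is to simulate many repetitions of the study. The sketch below is a Monte Carlo illustration under assumed values that are not part of the example: $\sigma = 1$, $n = 49$, and a true effect of 0.4, chosen so that the theoretical power of the two-sided z-test at $\alpha = 0.05$ is close to 0.80. The seed and number of trials are arbitrary.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejection_rate(true_mu, sigma=1.0, n=49, alpha=0.05, trials=2000, seed=0):
    """Fraction of simulated two-sided z-tests of H0: mu = 0 that reject H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        z = fmean(sample) / (sigma / sqrt(n))     # test statistic with mu0 = 0
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


type_i_rate = rejection_rate(true_mu=0.0)  # H0 true: rate should be near alpha
power = rejection_rate(true_mu=0.4)        # H0 false: rate should be near 0.80
print(f"estimated Type I error rate: {type_i_rate:.3f}")
print(f"estimated power: {power:.3f}, estimated beta: {1 - power:.3f}")
```

When $H_0$ is true, the rejection rate estimates the Type I error probability; when the assumed effect is present, it estimates the power, and one minus that rate estimates $\beta$.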

Critical Regions vs. P-Values

Definitions

Critical Region: This is the range of values for the test statistic that will lead to rejecting the null hypothesis. For a significance level $\alpha$, the critical region is determined based on the distribution of the test statistic.
P-Value: The p-value is the probability of observing a test statistic as extreme as, or more extreme than, the one calculated, under the assumption that the null hypothesis is true.

Advantages and Disadvantages

Critical Regions: Clear cut-off points for rejecting the null hypothesis are beneficial for simple decision-making but may not convey nuances in the data.
P-Values: P-values provide more information about the strength of the evidence against the null hypothesis, but they are often misinterpreted, for example as the probability that the null hypothesis is true.

Example

For the tea example above, a one-tailed test with $\alpha = 0.05$ has critical value $z_{0.05} \approx 1.645$, so we reject $H_0$ whenever the calculated $z > 1.645$. The observed $z = 2$ falls in this critical region, so we reject $H_0$. The p-value of 0.0228 leads to the same decision, since $0.0228 < 0.05$.
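
The two decision rules can be compared side by side in code. This is a minimal sketch for the upper-tailed tea example with $z = 2$; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()
alpha = 0.05
z_observed = 2.0  # from the tea example (x = 60, n = 100, p0 = 0.5)

# Critical-region approach: reject when z exceeds the upper alpha quantile.
z_critical = std_normal.inv_cdf(1 - alpha)
reject_by_region = z_observed > z_critical

# P-value approach: reject when the upper-tail probability is below alpha.
p_value = 1 - std_normal.cdf(z_observed)
reject_by_p_value = p_value < alpha

print(f"critical value = {z_critical:.3f}, reject: {reject_by_region}")
print(f"p-value = {p_value:.4f}, reject: {reject_by_p_value}")
```

For the same $\alpha$ and the same alternative, the two approaches always give the same decision: $z$ exceeds the critical value exactly when the p-value is below $\alpha$.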

Conclusion

Hypothesis testing is a powerful framework in statistics for making inferences about population parameters. The main takeaways from this lesson are:

Hypothesis tests can be carried out for binomial proportions and means of normal distributions using z-scores.
Understanding Type I and Type II errors helps gauge the reliability of test outcomes and their practical consequences.
Both critical regions and p-values are important tools, each with its own advantages and appropriate contexts for use.

Study Notes

Hypothesis Testing: Involves $H_0$ and $H_a$, determining significance through test statistics and p-values.
Type I Error ($\alpha$): Rejecting $H_0$ when it is true; chance of this error is $\alpha$.
Type II Error ($\beta$): Not rejecting $H_0$ when it is false; power of the test is $1 - \beta$.
Critical Regions vs. P-Values: Know their definitions and implications in hypothesis testing context.
Practical Applications: Always state the context when interpreting hypothesis test results.