Topic 5: Statistical Inference and Hypothesis Testing

Lesson 5.2: The Language and Logic of Hypothesis Testing

Introduction

In this lesson, we examine the fundamental concepts of hypothesis testing, a central part of statistical inference. Drawing conclusions about a population from a sample is a key skill in statistics. By the end of this lesson, you will understand the terminology and logic that underpin hypothesis testing, which you will need for analyzing data and making inferences in later statistical applications.

Objectives

Understand and define null and alternative hypotheses, significance levels, test statistics, critical values, and p-values.
Distinguish between one-tailed and two-tailed tests, and set up hypotheses accordingly.
Recognize the convention of using a 5% significance level when performing hypothesis tests.
State conclusions tentatively based on statistical evidence rather than as definitive statements.

1. Hypotheses in Statistics

Hypothesis testing begins with formulating two opposing statements about a population parameter. These are known as hypotheses:

Null Hypothesis ($H_0$): This is a statement that the parameter is equal to a specific value. It represents the status quo or a baseline condition. For example, if we are testing whether a coin is fair, our null hypothesis might be $H_0: p = 0.5$, where $p$ is the probability of landing heads.

Alternative Hypothesis ($H_1$ or $H_a$): This is the statement that the parameter differs from the value stated in the null hypothesis, either in any direction (two-tailed test) or in one specified direction (one-tailed test). For the coin example, that could be $H_1: p \neq 0.5$ for a two-tailed test, or $H_1: p > 0.5$ for a one-tailed test of whether the coin favors heads.

2. Significance Level

The significance level, denoted as $\alpha$, is the probability of rejecting the null hypothesis when it is actually true (Type I error). A commonly used significance level is

$$\alpha = 0.05$$

This means that, if the null hypothesis is true, there is a 5% chance of wrongly concluding that a difference exists. It is essential to choose the $\alpha$ level before conducting a hypothesis test, not after seeing the data.
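The meaning of $\alpha$ as a long-run error rate can be checked by simulation. The sketch below is illustrative only: it assumes a two-tailed z-test on normally distributed data with known standard deviation, and the function name `type_i_error_rate`, the sample size, the number of trials, and the seed are all arbitrary choices rather than part of the lesson. Because the samples are generated with $H_0$ true, every rejection is a Type I error, and the observed rejection rate should land near 0.05.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=30, trials=10_000, seed=0):
    """Estimate how often a two-tailed z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)
    mu0, sigma = 0.0, 1.0                          # H0 is true by construction
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        z = (x_bar - mu0) / (sigma / n ** 0.5)
        if abs(z) > z_crit:                        # a false rejection
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimate varies slightly with the seed and the number of trials, but it should stay close to the chosen $\alpha$.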

3. Test Statistics

The test statistic is a standardized value calculated from sample data during a hypothesis test. It measures how far the sample statistic lies from the value stated in the null hypothesis, in units of standard error. Common test statistics include:

The z-statistic for large sample sizes or known population variance, calculated as

$$z = \frac{\overline{x} - \mu_0}{\sigma / \sqrt{n}}$$

where $\overline{x}$ is the sample mean, $\mu_0$ is the population mean under the null hypothesis, $\sigma$ is the population standard deviation, and $n$ is the sample size.

The t-statistic, used when the population variance is unknown and must be estimated from the sample (this matters most for small samples), calculated as

$$t = \frac{\overline{x} - \mu_0}{s / \sqrt{n}}$$

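where $s$ is the sample standard deviation. When the population is approximately normal, this statistic follows a t-distribution with $n - 1$ degrees of freedom under $H_0$.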
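The two formulas translate directly into code. The following is a minimal sketch using only the Python standard library; the function names `z_statistic` and `t_statistic` and the `scores` data are hypothetical, chosen for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def z_statistic(sample_mean, mu0, sigma, n):
    """z = (x_bar - mu0) / (sigma / sqrt(n)); sigma is the KNOWN population SD."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

def t_statistic(sample, mu0):
    """t = (x_bar - mu0) / (s / sqrt(n)); s is estimated from the sample."""
    n = len(sample)
    s = stdev(sample)          # sample standard deviation (n - 1 divisor)
    return (mean(sample) - mu0) / (s / sqrt(n))

# Hypothetical scores, used only to show the call.
scores = [72, 81, 77, 69, 85, 78, 74, 80]
print(f"t = {t_statistic(scores, mu0=75):.3f}")
```

Note that `statistics.stdev` uses the $n - 1$ divisor, which matches the definition of $s$ in the t-statistic; `statistics.pstdev` would not.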

Example 1: Calculating a Test Statistic

Suppose we want to test whether a new teaching method improves student test scores. A sample of 30 students taught with the new method has an average score of 78, while the population average score under the null hypothesis is 75. The population standard deviation is known to be 10, so we conduct a one-sample z-test.

Set Hypotheses:

$H_0: \mu = 75$
$H_1: \mu > 75$

Calculate Test Statistic:

$$z = \frac{78 - 75}{10 / \sqrt{30}} = \frac{3}{1.8257} \approx 1.64$$
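The arithmetic above can be reproduced in a few lines. This is only a check of the calculation using the numbers from Example 1; the variable names are illustrative.

```python
from math import sqrt

# Values taken from Example 1.
x_bar, mu0, sigma, n = 78, 75, 10, 30

standard_error = sigma / sqrt(n)
z = (x_bar - mu0) / standard_error

print(f"standard error = {standard_error:.4f}")  # 1.8257
print(f"z = {z:.4f}")                            # 1.6432
```

Keeping the extra decimal places shows that the unrounded statistic is about 1.6432, which rounds to the 1.64 used in the text.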

4. Critical Values and Regions

The critical value is a threshold that determines the cutoff for rejecting the null hypothesis. Depending on whether the test is one-tailed or two-tailed, we find critical values from statistical z or t tables.

In a one-tailed test with $\alpha = 0.05$, the critical value for a z-test is approximately 1.645.
For a two-tailed test, the critical values are approximately ±1.96 for the same significance level.
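These table values can also be obtained from the inverse cumulative distribution function of the standard normal distribution. A small sketch, assuming Python 3.8 or later (where `statistics.NormalDist` is available); the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # mean 0, standard deviation 1

# One-tailed (upper): all of alpha sits in the right tail.
one_tailed = std_normal.inv_cdf(1 - alpha)
# Two-tailed: alpha is split equally between both tails.
two_tailed = std_normal.inv_cdf(1 - alpha / 2)

print(f"one-tailed critical value: {one_tailed:.3f}")    # 1.645
print(f"two-tailed critical values: ±{two_tailed:.3f}")  # ±1.960
```

For a lower-tailed test the critical value is the negative of the one-tailed value, by the symmetry of the normal distribution.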

Critical Regions

Critical Region: The region in the tails of the probability distribution where, if the test statistic lands, we reject the null hypothesis.
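Acceptance Region: The complementary region, where we do not reject the null hypothesis. Despite the name, a test statistic landing here does not prove $H_0$; it only means the evidence against it is insufficient.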
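The decision rule based on critical regions can be written as a short function. This is a sketch for z-tests only; the function name `in_critical_region`, the `tail` labels, and the test statistic 1.80 are hypothetical and chosen to show that the same statistic can fall inside a one-tailed critical region yet outside the two-tailed one.

```python
from statistics import NormalDist

def in_critical_region(z, alpha=0.05, tail="two"):
    """Return True if z falls in the critical region of a z-test.

    tail: "upper" (H1: >), "lower" (H1: <), or "two" (H1: not equal).
    """
    std_normal = NormalDist()
    if tail == "upper":
        return z > std_normal.inv_cdf(1 - alpha)
    if tail == "lower":
        return z < std_normal.inv_cdf(alpha)
    if tail == "two":
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'upper', 'lower', or 'two'")

print(in_critical_region(1.80, tail="upper"))  # True:  1.80 > 1.645
print(in_critical_region(1.80, tail="two"))    # False: 1.80 < 1.96
```

This is why the choice between a one-tailed and a two-tailed test has to be made when the hypotheses are set up, before the data are examined.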

5. p-Value

The p-value is the probability, assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one actually obtained. A smaller p-value indicates stronger evidence against the null hypothesis.

If the p-value is less than or equal to the significance level $\alpha$, we reject the null hypothesis.
For example, if our calculated p-value is 0.03, we will reject $H_0$ at $\alpha = 0.05$ because $0.03 < 0.05$.
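For a z-test, the p-value is a tail area under the standard normal curve, so it can be computed from the cumulative distribution function instead of read from a table. The sketch below is illustrative; the function name `p_value`, the `tail` labels, and the statistic $z = 2.0$ are assumptions made for the example.

```python
from statistics import NormalDist

def p_value(z, tail="two"):
    """p-value of a z-test for the given test statistic and alternative."""
    std_normal = NormalDist()
    if tail == "upper":                  # H1: parameter > null value
        return 1 - std_normal.cdf(z)
    if tail == "lower":                  # H1: parameter < null value
        return std_normal.cdf(z)
    if tail == "two":                    # H1: parameter != null value
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'upper', 'lower', or 'two'")

print(f"{p_value(2.0, 'upper'):.4f}")  # 0.0228
print(f"{p_value(2.0, 'two'):.4f}")    # 0.0455
```

The two-tailed p-value is twice the one-tailed value because extreme results in either direction count as evidence against $H_0$.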

Example 2: Using p-Value to Make a Decision

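Continuing with Example 1, we use the test statistic $z \approx 1.64$ to find the p-value for the one-tailed test.

Set Hypotheses:

$H_0: \mu = 75$
$H_1: \mu > 75$

Calculate p-Value:

From z-tables, $P(Z \le 1.64) \approx 0.9495$, so for $z = 1.64$ the upper-tail p-value is approximately $1 - 0.9495 = 0.0505$.

Decision:

Since $0.0505 > 0.05$, we do not reject $H_0$ at the 5% significance level. This agrees with the critical-value approach: $z \approx 1.64$ falls just short of the one-tailed critical value of 1.645. The sample does not provide sufficient evidence to conclude that the new teaching method results in higher average test scores. The result is borderline, however, and it does not show that the method has no effect.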

6. Conclusion

Hypothesis testing is an essential technique in statistics that allows us to make inferences about population parameters based on sample data. With an understanding of null and alternative hypotheses, significance levels, test statistics, critical values, and p-values, you are now equipped to carry out hypothesis tests. Remember to always state conclusions tentatively, reflecting the inherent uncertainty of statistical analysis.

Study Notes

Null Hypothesis ($H_0$): Assumes no effect or no difference.
Alternative Hypothesis ($H_1$): Reflects the effect or difference we are testing for.
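Significance Level ($\alpha$): The probability of a Type I error we are willing to tolerate, and so the threshold the p-value must not exceed for us to reject $H_0$; often set at 0.05.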
Test Statistic: A value calculated from sample data used in testing hypotheses.
One-Tailed Test: Tests for a direction of an effect.
Two-Tailed Test: Tests for any difference, regardless of direction.
Critical Value: The cutoff point that defines the critical and acceptance regions.
p-Value: The probability of obtaining a test statistic at least as extreme as the one observed, under the null hypothesis.