Lesson 6.3: t-tests for One and Two Means

Introduction

In this lesson, we cover t-tests, which are statistical tests used to assess the mean of one group or to compare the means of two groups. These tests are particularly useful when samples are small (typically fewer than 30 observations) and the population variance is unknown. By the end of this lesson, students will be able to:

Conduct a one-sample t-test to assess the mean of a normal distribution with unknown variance.
Conduct a two-sample t-test to evaluate the difference between means of two independent samples, employing a pooled variance estimate when appropriate.
Understand the validity conditions required for t-tests, such as normality and equal variances, and learn how to interpret results in context.

Understanding the t-Test

The t-test helps us determine whether a sample mean differs significantly from a known or hypothesized value, or whether the means of two groups differ significantly from each other. The t-distribution, which provides the basis for t-tests, is similar to the normal distribution but has heavier tails, reflecting the extra uncertainty that comes from estimating the standard deviation from a small sample. As the sample size grows, the t-distribution approaches the normal distribution.
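To make the "heavier tails" remark concrete, the following minimal sketch evaluates the t-distribution density three units from the centre for a few degrees of freedom and compares it with the standard normal density. The function names `t_pdf` and `normal_pdf` are illustrative choices, not part of the lesson, and the degrees of freedom shown are arbitrary.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Compare how much density each curve places three units from the centre.
for df in (4, 8, 30):
    print(f"df = {df:>2}: t density at 3 = {t_pdf(3.0, df):.5f}")
print(f"normal density at 3 = {normal_pdf(3.0):.5f}")
```

The t densities should shrink toward the normal value as the degrees of freedom increase, which is why critical values from the t-table are larger for small samples.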

One-Sample t-Test

The one-sample t-test is used to test whether the mean of a single sample differs from a known or hypothesized population mean. The formula for the one-sample t-test is given by:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

Where:

$t$ = t-statistic
$\bar{x}$ = sample mean
$\mu_0$ = hypothesized population mean
$s$ = sample standard deviation
$n$ = sample size
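The formula can be sketched as a small Python function using only the standard library. The name `one_sample_t` and the three-point data set are hypothetical and serve only to exercise the function.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t-test of H0: population mean == mu0."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (x_bar - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical data, chosen only to show the call.
t_stat, df = one_sample_t([10.0, 12.0, 14.0], mu0=10.0)
print(f"t = {t_stat:.3f}, df = {df}")
```

Note that `statistics.stdev` divides by $n - 1$, which matches the sample standard deviation $s$ used in the formula.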

Example 1: One-Sample t-Test

Imagine a group of students wishing to determine if their class's average score on a math exam differs from the statewide average score of 75. Suppose the sample data is as follows:

Student Score
78
85
72
90
76

First, we need to calculate the sample mean and standard deviation:

Calculate the sample mean ($\bar{x}$):

$$\bar{x} = \frac{78 + 85 + 72 + 90 + 76}{5} = \frac{401}{5} = 80.2$$

Calculate the sample standard deviation ($s$):

$$s = \sqrt{\frac{\sum{(x_i - \bar{x})^2}}{n-1}} = \sqrt{\frac{(78-80.2)^2 + (85-80.2)^2 + (72-80.2)^2 + (90-80.2)^2 + (76-80.2)^2}{4}}$$

Calculating the squared deviations:

For $78$: $(78 - 80.2)^2 = 4.84$
For $85$: $(85 - 80.2)^2 = 23.04$
For $72$: $(72 - 80.2)^2 = 67.24$
For $90$: $(90 - 80.2)^2 = 96.04$
For $76$: $(76 - 80.2)^2 = 17.64$

Summing these:

$$\sum{(x_i - \bar{x})^2} = 4.84 + 23.04 + 67.24 + 96.04 + 17.64 = 208.80$$

Thus,

$$s = \sqrt{\frac{208.80}{4}} = \sqrt{52.20} \approx 7.22$$

Calculate the t-statistic:

Using $\mu_0 = 75$ and $n = 5$:

$$t = \frac{80.2 - 75}{7.22 / \sqrt{5}} = \frac{5.2}{3.23} \approx 1.61$$

Determine the degrees of freedom:

$$df = n - 1 = 5 - 1 = 4$$

Find the critical t-value from the t-table for $\alpha = 0.05$ (two-tailed test) with 4 degrees of freedom, which is approximately $2.776$.

Conclusion: Since the absolute value of the calculated t-statistic ($1.61$) is less than the critical value ($2.776$), we fail to reject the null hypothesis. Hence, we conclude that there is not enough evidence to say that the class's average score is different from the statewide average of 75.
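As a way to check the hand arithmetic, the steps of Example 1 can be reproduced from the raw scores. This is a sketch rather than a full testing routine: the critical value is taken from the t-table as in the text, and the variable names are illustrative.

```python
import math

scores = [78, 85, 72, 90, 76]
mu0 = 75            # statewide average from the example
t_critical = 2.776  # two-tailed, alpha = 0.05, df = 4 (from the t-table)

n = len(scores)
x_bar = sum(scores) / n
squared_devs = [(x - x_bar) ** 2 for x in scores]
s = math.sqrt(sum(squared_devs) / (n - 1))
t_stat = (x_bar - mu0) / (s / math.sqrt(n))

# Print each intermediate quantity so it can be compared with the worked steps.
for x, d in zip(scores, squared_devs):
    print(f"({x} - {x_bar:.1f})^2 = {d:.2f}")
print(f"sum = {sum(squared_devs):.2f}, s = {s:.2f}, t = {t_stat:.2f}")
print("reject H0" if abs(t_stat) > t_critical else "fail to reject H0")
```

Printing each squared deviation makes it easy to spot a slip in any single step of the manual calculation.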

Two-Sample t-Test

The two-sample t-test is used when comparing the means of two independent samples to see if they are significantly different from each other. The formula for the two-sample t-test with equal variances is:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

Where:

$s_p$ is the pooled standard deviation calculated as:

$$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
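The two formulas above can be combined into one short function. The sketch below assumes equal population variances, as the pooled formula requires; the name `pooled_two_sample_t` and the small data sets are hypothetical.

```python
import math
import statistics

def pooled_two_sample_t(a, b):
    """Return (t, s_p, df) for the equal-variance two-sample t-test."""
    n1, n2 = len(a), len(b)
    var1 = statistics.variance(a)  # sample variance s_1^2 (divides by n1 - 1)
    var2 = statistics.variance(b)  # sample variance s_2^2 (divides by n2 - 1)
    s_p = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    t = (statistics.mean(a) - statistics.mean(b)) / (s_p * math.sqrt(1 / n1 + 1 / n2))
    return t, s_p, n1 + n2 - 2

# Hypothetical data, chosen only to show the call.
t_stat, s_p, df = pooled_two_sample_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(f"t = {t_stat:.3f}, s_p = {s_p:.3f}, df = {df}")
```

Working with the variances directly, rather than squaring rounded standard deviations, avoids accumulating rounding error in $s_p$.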

Example 2: Two-Sample t-Test

Consider two different classes completing the same exam. The scores for Class A and Class B are as follows:

Class A Scores	Class B Scores
82	78
77	85
90	81
75	88
84	74

Calculate the sample means and standard deviations:

For Class A:

$$\bar{x}_1 = \frac{82 + 77 + 90 + 75 + 84}{5} = \frac{408}{5} = 81.6$$

Sample standard deviation:

$$s_1 = \sqrt{\frac{(82-81.6)^2 + (77-81.6)^2 + (90-81.6)^2 + (75-81.6)^2 + (84-81.6)^2}{4}}$$

Calculating:

$(82-81.6)^2 = 0.16$
$(77-81.6)^2 = 21.16$
$(90-81.6)^2 = 70.56$
$(75-81.6)^2 = 43.56$
$(84-81.6)^2 = 5.76$

Thus,

$$s_1 = \sqrt{\frac{0.16 + 21.16 + 70.56 + 43.56 + 5.76}{4}} = \sqrt{\frac{141.2}{4}} = \sqrt{35.3} \approx 5.94$$

For Class B:

$$\bar{x}_2 = \frac{78 + 85 + 81 + 88 + 74}{5} = \frac{406}{5} = 81.2$$

Sample standard deviation:

$$s_2 = \sqrt{\frac{(78-81.2)^2 + (85-81.2)^2 + (81-81.2)^2 + (88-81.2)^2 + (74-81.2)^2}{4}}$$

Calculating:

$(78-81.2)^2 = 10.24$
$(85-81.2)^2 = 14.44$
$(81-81.2)^2 = 0.04$
$(88-81.2)^2 = 46.24$
$(74-81.2)^2 = 51.84$

Thus,

$$s_2 = \sqrt{\frac{10.24 + 14.44 + 0.04 + 46.24 + 51.84}{4}} = \sqrt{\frac{122.8}{4}} = \sqrt{30.7} \approx 5.54$$

Calculate the pooled standard deviation ($s_p$):

$$s_p = \sqrt{\frac{(5 - 1)s_1^2 + (5 - 1)s_2^2}{5 + 5 - 2}}$$

Calculating:

Using the unrounded variances $s_1^2 = 35.3$ and $s_2^2 = 30.7$,

$$s_p = \sqrt{\frac{4 \cdot 35.3 + 4 \cdot 30.7}{8}} = \sqrt{\frac{141.2 + 122.8}{8}} = \sqrt{33} \approx 5.74$$

Calculate the t-statistic:

Now we can calculate the t-statistic using the means and pooled standard deviation:

$$t = \frac{81.6 - 81.2}{5.74 \sqrt{\frac{1}{5} + \frac{1}{5}}} = \frac{0.4}{5.74 \cdot \sqrt{0.4}} = \frac{0.4}{5.74 \cdot 0.6325} \approx 0.11$$

Determine the degrees of freedom:

$$df = n_1 + n_2 - 2 = 5 + 5 - 2 = 8$$

Find the critical t-value for $\alpha = 0.05$ (two-tailed) with 8 degrees of freedom, which is approximately $2.306$.

Conclusion: Since the absolute value of the calculated t-statistic ($0.11$) is less than the critical value ($2.306$), we fail to reject the null hypothesis. Therefore, we conclude that there is not enough evidence to say there is a significant difference between the average scores of Class A and Class B.
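The arithmetic in Example 2 can be checked from the raw scores in the same way. This sketch uses the critical value from the t-table as given in the text; the variable names are illustrative.

```python
import math
import statistics

class_a = [82, 77, 90, 75, 84]
class_b = [78, 85, 81, 88, 74]
t_critical = 2.306  # two-tailed, alpha = 0.05, df = 8 (from the t-table)

n1, n2 = len(class_a), len(class_b)
var_a = statistics.variance(class_a)  # s_1^2
var_b = statistics.variance(class_b)  # s_2^2

# Pooled variance weights each sample variance by its degrees of freedom.
pooled_var = ((n1 - 1) * var_a + (n2 - 1) * var_b) / (n1 + n2 - 2)
s_p = math.sqrt(pooled_var)
mean_diff = statistics.mean(class_a) - statistics.mean(class_b)
t_stat = mean_diff / (s_p * math.sqrt(1 / n1 + 1 / n2))

print(f"s_1 = {math.sqrt(var_a):.2f}, s_2 = {math.sqrt(var_b):.2f}, s_p = {s_p:.2f}")
print(f"t = {t_stat:.2f}, df = {n1 + n2 - 2}")
print("reject H0" if abs(t_stat) > t_critical else "fail to reject H0")
```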

Validity Conditions

It is crucial for the validity of t-tests that certain conditions are met:

Normality: The samples should be drawn from a normally distributed population. For small sample sizes, this is particularly important.
Equal Variances: The pooled two-sample t-test assumes that the variances of the two populations are equal. This can be checked using tests such as Levene's test.
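For readers curious about what Levene's test computes, here is a hand-rolled sketch of its $W$ statistic, using absolute deviations from each group mean and applied to the Class A and Class B scores from Example 2. The function name `levene_w` is an illustrative choice, and in practice a tested routine from a statistics library would normally be preferred. $W$ is compared against an F distribution with $k - 1$ and $N - k$ degrees of freedom, where $k$ is the number of groups and $N$ the total number of observations; a large $W$ suggests unequal variances.

```python
import statistics

def levene_w(*groups):
    """Levene's W statistic using absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of each observation from its own group mean.
    z = []
    for g in groups:
        g_mean = statistics.mean(g)
        z.append([abs(x - g_mean) for x in g])
    z_means = [statistics.mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    # Between-group and within-group variation of the absolute deviations.
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within

w = levene_w([82, 77, 90, 75, 84], [78, 85, 81, 88, 74])
print(f"W = {w:.4f}")
```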

Conclusion

In this lesson, we explored the one-sample and two-sample t-tests. We learned how to calculate the t-statistic, understand the underlying assumptions, and interpret results in a real-world context. t-tests are powerful tools in statistical analysis and are widely used in various fields including psychology, business, and healthcare to make inferences about population means.

Study Notes

The t-test compares means when population variances are unknown.
One-sample t-test formula: $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$.
Two-sample t-test formula: $t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$.
Pooled standard deviation for two samples: $s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$.
Validity conditions include normality and equal variances for two-sample tests.
Use a t-table to find critical values for hypothesis testing.