Lesson 5.1: Parameters, Statistics, and Standard Error
Introduction
In this section, we introduce four foundational concepts of statistical inference: parameter, statistic, unbiased estimator, and standard error. By the end of this lesson, you will be able to distinguish population parameters from sample statistics, calculate unbiased estimates of the population mean and variance, and apply the standard error in your analyses.
Learning Objectives
- Define and differentiate the terms parameter, statistic, unbiased estimator, and standard error with appropriate notation.
- Understand the roles that parameters and statistics play in statistical inference.
- Calculate unbiased estimates for population mean and variance using sample data.
- Recognize and apply the (n - 1) divisor when calculating sample variance.
1. Understanding Parameters and Statistics
1.1 What is a Parameter?
A parameter is a numerical characteristic of a population. It is a fixed value, but it is usually unknown because measuring it directly would require data on the entire population. For instance, if we want to know the average height of all the students in a school, that average height is the parameter we seek.
- Notation for Population Parameters: The population mean is often denoted as $\mu$, while the population variance is denoted as $\sigma^2$.
1.2 What is a Statistic?
A statistic, on the other hand, is a numerical characteristic calculated from a sample. It helps estimate the corresponding population parameter. Using the previous example, if we take a sample of 30 students' heights and calculate the average height from that sample, this average is referred to as a statistic, denoted as $\bar{x}$.
Example 1: Population Parameters vs. Sample Statistics
- Population Mean: $\mu = 170 \text{ cm}$ (the true average height of all students).
- Sample Mean: $\bar{x} = 168 \text{ cm}$ (the average height of a sample of 30 students).
In this situation, $\bar{x}$ serves as an estimate of the true parameter $\mu$.
2. Unbiased Estimation
2.1 Definition of Unbiased Estimator
An estimator is a rule or formula that tells us how to use sample data to estimate the underlying parameter. An estimator is deemed unbiased if its expected value equals the true parameter it estimates. In other words, an unbiased estimator will neither systematically overestimate nor underestimate the parameter.
- If $\bar{x}$ is used to estimate $\mu$, it is an unbiased estimator because $E[\bar{x}] = \mu$.
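A short derivation shows why this holds. It assumes the observations $x_1, \dots, x_n$ are drawn at random from the population, so that $E[x_i] = \mu$ for every $i$, and it uses only the linearity of expectation:

$$E[\bar{x}] = E\left[\frac{1}{n}\sum_{i=1}^{n} x_i\right] = \frac{1}{n}\sum_{i=1}^{n} E[x_i] = \frac{1}{n} \cdot n\mu = \mu$$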
2.2 Standard Error
The standard error of a statistic is the standard deviation of its sampling distribution. It provides insight into how much variability we can expect from our sample statistic when estimating the population parameter. In essence, a smaller standard error indicates that the sample statistic is likely to be closer to the true parameter.
- Formula for Standard Error: The standard error of the mean is given by:
$$SE = \frac{\sigma}{\sqrt{n}}$$
where $\sigma$ is the population standard deviation and $n$ is the sample size. When $\sigma$ is unknown, which is the usual case, we replace it with the sample standard deviation $s$ to obtain the estimated standard error:
$$SE = \frac{s}{\sqrt{n}}$$
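To connect the formula with the definition of standard error as the standard deviation of a sampling distribution, the sketch below draws many samples from a simulated population and compares the spread of their means with $\sigma / \sqrt{n}$. The normal population, the values $\mu = 170$, $\sigma = 10$, $n = 36$, the number of repetitions, and the function name `simulate_sd_of_sample_means` are all assumptions chosen for illustration, not part of the lesson's data.

```python
import math
import random
import statistics


def simulate_sd_of_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size `n` and return the SD of their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sample_means = []
    for _ in range(reps):
        # Assumption: heights follow a normal distribution in the population
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_means.append(statistics.fmean(sample))
    # Spread of the simulated sampling distribution of the mean
    return statistics.stdev(sample_means)


# Hypothetical population values, chosen only for illustration
mu, sigma, n = 170.0, 10.0, 36

empirical_se = simulate_sd_of_sample_means(mu, sigma, n, reps=10_000)
theoretical_se = sigma / math.sqrt(n)

print(f"SD of simulated sample means: {empirical_se:.3f}")
print(f"sigma / sqrt(n):              {theoretical_se:.3f}")
```

The two printed values should agree closely, and the agreement generally tightens as the number of repetitions grows.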
Example 2: Calculating the Standard Error
Assume we have a sample of 36 students with a sample mean height of $\bar{x} = 168 \text{ cm}$ and a sample standard deviation of $s = 10 \text{ cm}$. Then, the standard error can be calculated as:
$$SE = \frac{s}{\sqrt{n}} = \frac{10}{\sqrt{36}} = \frac{10}{6} \approx 1.67 \text{ cm}$$
This means that, across repeated samples of 36 students, the sample mean would typically differ from the true population mean by about 1.67 cm.
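The calculation in Example 2 can be reproduced with a few lines of Python. This is a minimal sketch; the helper name `estimated_standard_error` is hypothetical and introduced only for this illustration.

```python
import math


def estimated_standard_error(s, n):
    """Estimated standard error of the mean: s / sqrt(n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return s / math.sqrt(n)


# Values from Example 2: s = 10 cm, n = 36 students
se = estimated_standard_error(s=10.0, n=36)
print(f"SE = {se:.2f} cm")  # prints "SE = 1.67 cm"
```

Because $n$ sits under a square root, quadrupling the sample size only halves the standard error: with the same $s$, a sample of 144 students would give $10 / 12 \approx 0.83$ cm.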
3. Unbiased Estimates of Population Mean and Variance
3.1 Sample Mean and Variance
When analyzing a sample, we can calculate the sample mean and sample variance, which serve as estimates for the population mean $\mu$ and population variance $\sigma^2$.
- Sample Mean: The sample mean is computed as:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
where $x_i$ represents the individual sample values and $n$ is the sample size (the number of observations in the sample).
- Sample Variance: The sample variance is calculated using:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
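The $(n - 1)$ divisor, known as Bessel's correction, produces an unbiased estimate of the population variance. Dividing by $n$ instead would underestimate $\sigma^2$ on average, because the deviations are measured from the sample mean $\bar{x}$ rather than from the unknown population mean $\mu$, and the data always lie at least as close to $\bar{x}$ as to $\mu$ in the sum-of-squares sense.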
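The effect of the divisor can be checked with a small simulation. The sketch below assumes a normal population with $\sigma = 10$ (so $\sigma^2 = 100$) and samples of size $n = 5$; these values and the function name `average_variance_estimates` are hypothetical choices for illustration. It averages both versions of the variance estimate over many samples, using `statistics.pvariance` for the $n$ divisor and `statistics.variance` for the $(n - 1)$ divisor.

```python
import random
import statistics


def average_variance_estimates(sigma, n, reps, seed=0):
    """Average the n-divisor and (n - 1)-divisor variance estimates over many samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    with_n = []
    with_n_minus_1 = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        with_n.append(statistics.pvariance(sample))         # divides by n
        with_n_minus_1.append(statistics.variance(sample))  # divides by n - 1
    return statistics.fmean(with_n), statistics.fmean(with_n_minus_1)


# Hypothetical population: sigma = 10, so the true variance is 100
avg_n, avg_n_minus_1 = average_variance_estimates(sigma=10.0, n=5, reps=20_000)

print(f"Average estimate with divisor n:     {avg_n:.1f}")
print(f"Average estimate with divisor n - 1: {avg_n_minus_1:.1f}")
```

Under these assumptions the $(n - 1)$ version should average close to 100, while the $n$ version should average close to $\frac{n - 1}{n}\sigma^2 = 80$, which is the systematic underestimate that Bessel's correction removes.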
Example 3: Calculating Sample Mean and Variance
Consider the following heights of a sample of 5 students (in cm): [160, 165, 170, 175, 180]. Let’s calculate the sample mean and variance:
- Calculating the Sample Mean:
$$\bar{x} = \frac{160 + 165 + 170 + 175 + 180}{5} = \frac{850}{5} = 170 \text{ cm}$$
- Calculating the Sample Variance:
  - First, compute deviations from the mean:
    - $160 - 170 = -10$,
    - $165 - 170 = -5$,
    - $170 - 170 = 0$,
    - $175 - 170 = 5$,
    - $180 - 170 = 10$.
  - Squaring these deviations, we have:
    - $100$, $25$, $0$, $25$, $100$.
  - Summing the squared deviations:
    - $100 + 25 + 0 + 25 + 100 = 250$.
  - Finally, compute the sample variance:
$$s^2 = \frac{250}{5 - 1} = \frac{250}{4} = 62.5 \text{ cm}^2$$
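The steps of Example 3 translate directly into code. The sketch below follows the hand calculation step by step and then cross-checks the result against Python's standard `statistics` module, whose `variance` function uses the $(n - 1)$ divisor.

```python
import statistics

# Heights from Example 3, in cm
heights = [160.0, 165.0, 170.0, 175.0, 180.0]
n = len(heights)

# Sample mean: sum of the values divided by n
sample_mean = sum(heights) / n

# Sample variance: sum of squared deviations divided by (n - 1)
squared_deviations = [(x - sample_mean) ** 2 for x in heights]
sample_variance = sum(squared_deviations) / (n - 1)

print(sample_mean)      # 170.0
print(sample_variance)  # 62.5

# Cross-check with the standard library (also divides by n - 1)
print(statistics.variance(heights))  # 62.5
```

Note that `statistics.pvariance(heights)` would divide by $n$ instead and return 50.0, which is the appropriate formula only when the five values are the entire population.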
Conclusion
In this lesson, we have examined the critical concepts of parameters and statistics, emphasizing the importance of unbiased estimators and the calculation of standard error. Understanding these concepts is fundamental for statistical inference, as they form the basis for hypothesis testing and confidence intervals.
Key Takeaways
- A parameter is a characteristic of a population, such as the mean $\mu$ or the variance $\sigma^2$.
- A statistic is a characteristic calculated from a sample, such as the sample mean $\bar{x}$ or the sample variance $s^2$.
- An estimator is unbiased when its expected value equals the parameter it estimates.
- The standard error quantifies the precision of the sample statistic as an estimator of the population parameter.
- Using the $(n - 1)$ divisor in the sample variance produces an unbiased estimate of the population variance.
Study Notes
- Parameter: Fixed numeric characteristic of a population, e.g. $\mu$, $\sigma^2$.
- Statistic: Numeric characteristic calculated from a sample, e.g. $\bar{x}$, $s^2$.
- Unbiased Estimator: An estimator where $E[\hat{\theta}] = \theta$.
- Standard Error: Standard deviation of a statistic's sampling distribution; for the mean, $SE = \frac{\sigma}{\sqrt{n}}$, estimated by $\frac{s}{\sqrt{n}}$ when $\sigma$ is unknown.
- Sample Mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$.
- Sample Variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$.
