Lesson 7.2: How Statistics Mislead
Introduction
In our increasingly data-driven world, statistics play a fundamental role in shaping our understanding of events, trends, and phenomena. However, statistics can be deceptive. This lesson aims to make students aware of how data can be manipulated and misrepresented, leading to misconceptions and erroneous conclusions. By the end of this lesson, students will be able to critically evaluate statistical information, identify common pitfalls in statistical presentation, and effectively communicate their findings.
Learning Objectives
- Understand how misleading charts can distort data interpretation.
- Recognize the limitations of averages, particularly in skewed data.
- Make sense of percentages, percentage points, and absolute changes.
- Identify misleading comparisons created by cherry-picking time periods.
- Evaluate how charts, averages, or percentages can mislead the audience.
Understanding Misleading Charts
Charts and graphs are powerful tools for visualizing data, but they can also be used to mislead viewers. Here, we will explore common tactics used to distort information.
Chopped Axes
A chopped axis occurs when a graph does not start at zero, making changes in data appear more dramatic than they are. This tactic can exaggerate the appearance of trends and differences in rates.
Example
Consider a hypothetical bar chart comparing two products, in which the y-axis starts at 10 instead of zero. Product A's sales increase from 10 to 20, and Product B's sales increase from 15 to 25.
To calculate the absolute increase:
- Product A's increase: $20 - 10 = 10$
- Product B's increase: $25 - 15 = 10$
Both products gain the same $10$ units, but the chopped y-axis distorts how the bars look. Bar heights are measured from 10, so Product A's bar appears to grow from nothing and Product B's bar appears to triple (from a height of 5 to a height of 15), when sales actually doubled for A and rose by about 67% for B. To avoid being misled by chopped axes, always check the scale of the axes in graphs.
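The effect of a chopped axis can be checked with a little arithmetic. The following is a minimal sketch, not a standard routine: the function name `apparent_ratio` and its `axis_min` parameter are made up for illustration. It compares how much taller a bar looks after a change, depending on where the y-axis starts.

```python
def apparent_ratio(start, end, axis_min=0.0):
    """Ratio of drawn bar heights when the y-axis begins at axis_min."""
    start_height = start - axis_min
    end_height = end - axis_min
    if start_height <= 0:
        # The first bar has no visible height, so any growth looks unbounded.
        return float("inf")
    return end_height / start_height

# Product A: sales rise from 10 to 20
true_a = apparent_ratio(10, 20)         # y-axis starts at 0
chopped_a = apparent_ratio(10, 20, 10)  # y-axis starts at 10

# Product B: sales rise from 15 to 25
true_b = apparent_ratio(15, 25)         # y-axis starts at 0
chopped_b = apparent_ratio(15, 25, 10)  # y-axis starts at 10

print(true_a, chopped_a)
print(round(true_b, 2), chopped_b)
```

With the axis starting at zero the bars grow by their true ratios. With the axis starting at 10, the same data looks far more dramatic.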
Distorted Scales
Distorted scales can also misrepresent the relationship between variables. When scales are unevenly spaced, they can give an inaccurate picture of the data.
Example
Consider a time series graph displaying the growth of two investments over time, in which the x-axis intervals are not evenly distributed. A stretch of the axis that covers several years may be drawn no wider than one that covers several months, leading viewers to emphasize short-term gains instead of long-term trends.
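One quick check for this problem is to look at the gaps between consecutive axis labels. The sketch below assumes the tick values have been read off the chart by hand. The function name `is_evenly_spaced` and the example tick lists are hypothetical.

```python
def is_evenly_spaced(ticks, tolerance=1e-9):
    """Return True if consecutive tick values are all the same distance apart."""
    gaps = [later - earlier for earlier, later in zip(ticks, ticks[1:])]
    return all(abs(gap - gaps[0]) <= tolerance for gap in gaps)

# Hypothetical x-axis labels (years) read from two charts
even_axis = [2018, 2019, 2020, 2021, 2022]
uneven_axis = [2010, 2015, 2020, 2021, 2022]

print(is_evenly_spaced(even_axis))
print(is_evenly_spaced(uneven_axis))
```

If the labels are unevenly spaced in value but drawn at equal distances on the page, the slope of the line no longer reflects the real rate of change.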
Selective Display
Selective display involves presenting only part of the data, which can lead to an incomplete story.
Example
If a company claims to report its profits over the last five years but shows only the two most recent, profitable years, it can create a false impression of consistency and growth. Always ask what data might be missing when evaluating statistical claims.
Misleading Averages
Averages summarize a whole dataset in a single number, but they can be misleading, especially in skewed distributions.
Quoting a Mean for Skewed Data
When discussing data, people often refer to the mean. However, the mean can be greatly affected by outliers, skewing the representation of the data.
Example
Consider a dataset of the salaries of a group of workers:
- Worker 1: $30,000
- Worker 2: $32,000
- Worker 3: $31,000
- Worker 4: $28,000
- Worker 5: $1,000,000
The mean salary is:
$$\text{Mean} = \frac{30,000 + 32,000 + 31,000 + 28,000 + 1,000,000}{5} = \frac{1,121,000}{5} = 224,200$$
This figure misleadingly suggests that a typical worker earns over $224,000, even though four of the five workers earn around $30,000. The median, which is less influenced by extreme values, represents this group better:
$$\text{Median} = 31,000$$
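The same comparison can be reproduced with Python's standard `statistics` module. This is a small sketch using the five salaries from the example. The variable names are chosen for illustration, and the step that drops the top earner is an added check to show how much the outlier moves each measure.

```python
from statistics import mean, median

# Salaries from the example above
salaries = [30_000, 32_000, 31_000, 28_000, 1_000_000]

mean_salary = mean(salaries)
median_salary = median(salaries)

# Drop the single very high salary to see how each measure reacts
without_outlier = [s for s in salaries if s < 1_000_000]
mean_without = mean(without_outlier)
median_without = median(without_outlier)

print(mean_salary, median_salary)
print(mean_without, median_without)
```

Removing one worker moves the mean by almost 200,000 but moves the median by only 500, which is why the median is the safer summary for skewed data.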
Hiding a Wide Spread
An average on its own says nothing about how widely the values vary, so it is essential to report the spread of the data alongside the mean.
Example
Two different companies may report an average time to complete a task:
- Company A: Average time = 10 hours; Standard deviation = 1 hour
- Company B: Average time = 10 hours; Standard deviation = 5 hours
While their means are equivalent, Company B has a much wider spread of times. This is significant, as it indicates inconsistency and variability in performance. Without this context, the reader could misinterpret the information.
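To make this concrete, here is a sketch with two made-up sets of task times, chosen so that they reproduce the means and standard deviations quoted above. The individual times are hypothetical and not from any real company. `pstdev` from the standard library computes the population standard deviation.

```python
from statistics import mean, pstdev

# Hypothetical task times in hours, chosen to match the figures in the example
company_a_times = [9, 11, 9, 11]
company_b_times = [5, 15, 5, 15]

mean_a, spread_a = mean(company_a_times), pstdev(company_a_times)
mean_b, spread_b = mean(company_b_times), pstdev(company_b_times)

print(f"Company A: mean={mean_a} h, std dev={spread_a:.1f} h")
print(f"Company B: mean={mean_b} h, std dev={spread_b:.1f} h")
```

Both companies average 10 hours, but a customer of Company B might wait anywhere from 5 to 15 hours, which the mean alone would never reveal.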
Confusing Percentages, Percentage Points, and Absolute Change
Understanding the differences between percentages, percentage points, and absolute change is critical to interpreting statistical information accurately.
Absolute Change
Absolute change refers to the simple difference between two values.
Example
If a population of a town increases from 1,000 to 1,200, the absolute change is:
$$1200 - 1000 = 200$$
Percentage Change
Percentage change expresses the degree of change in relation to the original value. You can calculate percent change using the formula:
$$\text{Percentage Change} = \left( \frac{\text{New Value} - \text{Old Value}}{\text{Old Value}} \right) \times 100$$
For the previous example:
$$\text{Percentage Change} = \left( \frac{1200 - 1000}{1000} \right) \times 100 = 20\%$$
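The two calculations for the town example can be written as short functions. This is a minimal sketch. The function names are illustrative, and the guard against a zero starting value is an added assumption, since the formula is undefined in that case.

```python
def absolute_change(old_value, new_value):
    """Simple difference between the new and old values."""
    return new_value - old_value

def percentage_change(old_value, new_value):
    """Change relative to the old value, expressed as a percentage."""
    if old_value == 0:
        raise ValueError("percentage change is undefined when the old value is 0")
    return (new_value - old_value) / old_value * 100

# Town population grows from 1,000 to 1,200
town_absolute = absolute_change(1000, 1200)
town_percent = percentage_change(1000, 1200)

print(town_absolute, town_percent)
```

The same absolute change of 200 would be only a 2% change for a town of 10,000, which is why a claim should state both the change and its baseline.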
Percentage Points
A percentage point is the unit for the arithmetic difference between two percentages. For example, if a poll shows a candidate's approval rating shifted from 30% to 50%, the change is 20 percentage points, not 20%. Measured as a percentage change, the same shift is an increase of about 67% ($20 / 30 \approx 0.67$).
Understanding these differences helps students evaluate claims, especially when a headline does not say which kind of change it is reporting.
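The approval-rating example can be worked through in code to show how far apart the two descriptions are. This is a sketch with illustrative function names, and both inputs are assumed to already be percentages.

```python
def percentage_point_change(old_pct, new_pct):
    """Arithmetic difference between two percentages, in percentage points."""
    return new_pct - old_pct

def relative_change_pct(old_pct, new_pct):
    """Change relative to the starting percentage, as a percentage."""
    return (new_pct - old_pct) / old_pct * 100

# Approval rating moves from 30% to 50%
points = percentage_point_change(30, 50)
relative = relative_change_pct(30, 50)

print(f"{points} percentage points, or a {relative:.1f}% relative increase")
```

A report could truthfully describe this shift as "up 20 points" or as "up 67%", so it is worth checking which one is meant.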
Cherry-Picked Time Periods and Unfair Comparisons
Statistical data can be manipulated by choosing specific time periods or comparisons that paint a favorable picture while overlooking less favorable data.
Cherry-Picked Time Periods
When statistics are presented only for specific time frames, they can misinform the audience about the true nature of trends.
Example
Suppose an advertisement for an investment fund showcases exceptional growth over the most recent five-year period but neglects to mention that the five years before that included losses. Investors who see only the favorable window may make poorly informed decisions. Understanding the context of time intervals is crucial for drawing accurate conclusions.
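The sketch below shows how much the choice of window can matter. The ten yearly returns are invented for illustration and do not describe any real fund. The helper `cumulative_return_pct` is a hypothetical name for ordinary compounding of yearly returns.

```python
def cumulative_return_pct(yearly_returns_pct):
    """Compound a list of yearly percentage returns into one total return."""
    growth = 1.0
    for yearly in yearly_returns_pct:
        growth *= 1 + yearly / 100
    return (growth - 1) * 100

# Hypothetical yearly returns (%): five weak years followed by five strong ones
yearly_returns = [-10, -5, -8, 2, -4, 12, 9, 15, 8, 11]

last_five_years = cumulative_return_pct(yearly_returns[5:])
all_ten_years = cumulative_return_pct(yearly_returns)

print(f"Last five years: {last_five_years:.1f}%")
print(f"All ten years:   {all_ten_years:.1f}%")
```

Both figures are accurate, but the five-year window reports more than twice the growth of the full ten-year record.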
Unfair Comparisons
Unfair comparisons involve contrasting dissimilar data sets or categories that cannot be legitimately compared. For instance, comparing the average incomes of two very different regions without accounting for cost of living can lead to misleading interpretations.
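A simple way to see the problem is to adjust each income by a cost-of-living index before comparing. In this sketch, the regions, incomes and index values are all hypothetical. Dividing by an index is one common simplification, not the only way to make such an adjustment.

```python
def adjusted_income(nominal_income, cost_of_living_index):
    """Scale an income by a cost-of-living index (1.0 = baseline region)."""
    return nominal_income / cost_of_living_index

# Hypothetical regions: (average income, cost-of-living index)
region_x = (60_000, 1.5)
region_y = (45_000, 1.0)

adjusted_x = adjusted_income(*region_x)
adjusted_y = adjusted_income(*region_y)

print(f"Region X: nominal {region_x[0]}, adjusted {adjusted_x:.0f}")
print(f"Region Y: nominal {region_y[0]}, adjusted {adjusted_y:.0f}")
```

Region X looks richer on paper, but once living costs are taken into account the ranking reverses.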
Conclusion
In this lesson, students have learned the importance of critically analyzing statistical information. By being aware of misleading charts, understanding the limitations and nuances of averages, and recognizing how data can be cherry-picked or misrepresented, students can make more informed judgments about statistical claims. This knowledge is essential in navigating a world where statistics are prevalent, ensuring that students can discern fact from manipulation.
Study Notes
- Chopped Axes: Can exaggerate trends; always check the scale of axes.
- Distorted Scales: Be cautious of unevenly spaced intervals in graphs.
- Selective Display: Look for missing data to get the complete picture.
- Mean vs. Median: Know the impact of outliers on the average and use the median for skewed data.
- Absolute Change vs. Percentage Change vs. Percentage Points: The same change can be reported in three different ways; check which one a claim is using.
- Cherry-Picked Time Periods: Look for comprehensive data that offers context.
- Unfair Comparisons: Ensure the entities being compared are similar to avoid misleading conclusions.
