Lesson 3.1: Tabulating Data: Frequency and Contingency Tables
Welcome to Lesson 3.1 of Foundation Statistics! 🎉 In this lesson, we will explore how to effectively organize and present data using tables. Our focus will be on frequency tables, relative-frequency tables, cumulative-frequency tables, and contingency tables. By the end of this lesson, you will understand how to create these tables and interpret their information.
Learning Objectives
- Understand frequency, relative-frequency, and cumulative-frequency tables.
- Learn how to group continuous data into classes and choose class widths.
- Explore two-way (contingency) tables for two categorical variables.
- Read conditional and marginal frequencies from a table.
- Explain the main ideas and terminology behind these concepts.
Hook: Why Tables are Important
Imagine you have a large dataset of student scores from different classes. How would you make sense of this data? 📊 Analyzing raw data can be overwhelming, but by organizing it into tables, we can quickly identify trends, patterns, and important statistics. Let’s learn how!
Frequency Tables
A frequency table displays how often each value occurs in a dataset. This can be very helpful for understanding the distribution of data.
Constructing a Frequency Table
- Collect data: For example, let’s say we surveyed 20 students about the number of books they read in a month:
$4, 2, 5, 3, 3, 1, 0, 0, 2, 1, 5, 4, 3, 6, 2, 1, 0, 5, 3, 4$
- Count occurrences: Next, count how many times each number appears:
- 0 occurs 3 times
- 1 occurs 3 times
- 2 occurs 3 times
- 3 occurs 4 times
- 4 occurs 3 times
- 5 occurs 3 times
- 6 occurs 1 time
- Create the table: Now, let’s summarize this data in a frequency table:
| Number of Books Read | Frequency |
|---------------------|-----------|
| 0 | 3 |
| 1 | 3 |
| 2 | 3 |
| 3 | 4 |
| 4 | 3 |
| 5 | 3 |
| 6 | 1 |
As a check, the frequencies add up to 20, the number of students surveyed.
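The counting step can also be done programmatically. Here is a minimal sketch in Python, assuming the 20 survey responses are stored in a list; the names `books` and `frequency` are illustrative and not part of the lesson.

```python
from collections import Counter

# Survey responses from the lesson: books read in a month by 20 students.
books = [4, 2, 5, 3, 3, 1, 0, 0, 2, 1, 5, 4, 3, 6, 2, 1, 0, 5, 3, 4]

# Counter maps each distinct value to the number of times it appears.
frequency = Counter(books)

# Print the table in ascending order of the value.
print("Books | Frequency")
for value in sorted(frequency):
    print(f"{value:>5} | {frequency[value]}")

# The frequencies should add up to the number of students surveyed.
total = sum(frequency.values())
print("Total:", total)
```

Comparing the printed total with the number of responses is a quick way to catch a miscount when tallying by hand.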
Relative Frequency Table
Instead of just showing how often each value occurs, a relative frequency table shows the proportion of the total for each value.
To calculate relative frequency, use the formula:
$$ \text{Relative Frequency} = \frac{\text{Frequency}}{\text{Total Number of Observations}} $$
For our example:
- Total Number of Observations = 20
- For 1 book: relative frequency = $ \frac{3}{20} = 0.15 $ (15% of students read 1 book).
Now, let’s create the relative frequency table:
| Number of Books Read | Frequency | Relative Frequency |
|---------------------|-----------|-------------------|
| 0 | 3 | 0.15 |
| 1 | 3 | 0.15 |
| 2 | 3 | 0.15 |
| 3 | 4 | 0.20 |
| 4 | 3 | 0.15 |
| 5 | 3 | 0.15 |
| 6 | 1 | 0.05 |
The relative frequencies always add up to 1 (or 100%).
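The relative-frequency formula can be applied to every value at once. The following is a sketch under the same assumption as before, that the raw responses sit in a list called `books` (an illustrative name):

```python
from collections import Counter

# Survey responses from the lesson: books read in a month by 20 students.
books = [4, 2, 5, 3, 3, 1, 0, 0, 2, 1, 5, 4, 3, 6, 2, 1, 0, 5, 3, 4]

frequency = Counter(books)
total = len(books)  # total number of observations

# Relative frequency = frequency / total number of observations.
relative = {value: count / total for value, count in sorted(frequency.items())}

for value, share in relative.items():
    print(f"{value} books: {share:.2f} ({share:.0%})")

# The shares should sum to 1, apart from floating-point rounding.
print("Sum of relative frequencies:", round(sum(relative.values()), 10))
```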
Cumulative Frequency Table
A cumulative frequency table shows the total number of observations that fall at or below each value. To create this table, we keep a running total of the frequencies:
| Number of Books Read | Frequency | Cumulative Frequency |
|---------------------|-----------|---------------------|
| 0 | 3 | 3 |
| 1 | 3 | 6 |
| 2 | 3 | 9 |
| 3 | 4 | 13 |
| 4 | 3 | 16 |
| 5 | 3 | 19 |
| 6 | 1 | 20 |
This table tells us how many students read at most a certain number of books. For example, 13 of the 20 students read 3 or fewer books. The final cumulative frequency always equals the total number of observations, here 20.
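A running total like this is easy to compute with `itertools.accumulate` from the Python standard library. This is a minimal sketch; the variable names are illustrative.

```python
from collections import Counter
from itertools import accumulate

# Survey responses from the lesson: books read in a month by 20 students.
books = [4, 2, 5, 3, 3, 1, 0, 0, 2, 1, 5, 4, 3, 6, 2, 1, 0, 5, 3, 4]

frequency = Counter(books)
values = sorted(frequency)                  # 0, 1, ..., 6
counts = [frequency[v] for v in values]     # frequency of each value, in order

# accumulate yields the running total: how many observations are
# at or below each value.
cumulative = list(accumulate(counts))

for value, count, running in zip(values, counts, cumulative):
    print(f"{value} | {count} | {running}")
```

The last entry of `cumulative` should equal `len(books)`; if it does not, one of the frequencies was miscounted.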
Grouping Continuous Data
Sometimes we deal with continuous data, such as heights or weights, which can take on any value within a range. In this case, we can group the data into classes.
Choosing Class Widths
When making a frequency table for continuous data, we determine class widths to group our data effectively. A common method involves:
- Finding the range: The range is calculated as:
$$ \text{Range} = \text{Maximum Value} - \text{Minimum Value} $$
- Deciding on the number of classes: Generally, aim for 5 to 10 classes.
- Calculating the class width: Divide the range by the number of classes:
$$ \text{Class Width} = \frac{\text{Range}}{\text{Number of Classes}} $$
For example, if the weights of students were:
$50, 55, 60, 62, 61, 48, 70, 65, 64, 59$
- Maximum = 70, Minimum = 48, Range = $70 - 48 = 22$.
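- Suppose we decide to use 4 classes (with only 10 observations, fewer than 5 classes is reasonable): Class Width = $ \frac{22}{4} = 5.5 $. Round this up to a convenient whole number such as 6 so that the classes are easy to read and still cover every value.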
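The steps above can be sketched in Python to produce a grouped frequency table. Two choices here are assumptions, not part of the lesson: the first class starts at the minimum value, and each class includes its lower boundary but excludes its upper boundary.

```python
import math

# Student weights from the lesson.
weights = [50, 55, 60, 62, 61, 48, 70, 65, 64, 59]
num_classes = 4  # the number of classes chosen in the example

# Range = maximum value - minimum value.
data_range = max(weights) - min(weights)

# Class width = range / number of classes, rounded up to a whole number.
class_width = math.ceil(data_range / num_classes)

# Assumption: classes start at the minimum and are half-open, [lower, upper).
start = min(weights)
classes = [
    (start + i * class_width, start + (i + 1) * class_width)
    for i in range(num_classes)
]

# Count how many weights fall in each class.
counts = [sum(lower <= w < upper for w in weights) for lower, upper in classes]

for (lower, upper), count in zip(classes, counts):
    print(f"{lower} to under {upper}: {count}")
```

With a width of 6 the classes run from 48 to under 72, so the maximum value of 70 is included. Using the unrounded width of 5.5 with half-open classes would leave 70 on the excluded upper boundary of the last class, which is one reason to round up.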
Contingency Tables
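A contingency table (also called a two-way table) displays the joint frequencies of two categorical variables: each cell counts the observations that fall into one category of the first variable and one category of the second. Let's say we surveyed 40 students about their favorite subject and whether they play sports:
- Favorite subject: Math, Science, Art
- Plays sports: Yes, No
The contingency table will look like: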
| Subject | Plays Sports | Does Not Play Sports | Total |
|----------|--------------|----------------------|-------|
| Math | 10 | 5 | 15 |
| Science | 7 | 8 | 15 |
| Art | 3 | 7 | 10 |
| Total | 20 | 20 | 40 |
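From this table, we can read marginal frequencies (the totals for each row and column, such as the 15 students who favor Math or the 20 who play sports) and conditional frequencies (frequencies within a single row or column). For example, among the 15 students who favor Math, 10 play sports, so the conditional relative frequency of playing sports given a preference for Math is $ \frac{10}{15} \approx 0.67 $. Note that this differs from the joint relative frequency of favoring Math and playing sports, which is $ \frac{10}{40} = 0.25 $ of all students.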
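Marginal and conditional frequencies can be computed directly from the cell counts. This is a minimal sketch, assuming the table is stored as a nested dictionary; the structure and variable names are illustrative.

```python
# Cell counts from the lesson's contingency table (totals are computed below).
table = {
    "Math":    {"Plays Sports": 10, "Does Not Play Sports": 5},
    "Science": {"Plays Sports": 7,  "Does Not Play Sports": 8},
    "Art":     {"Plays Sports": 3,  "Does Not Play Sports": 7},
}

# Marginal frequencies: row totals (per subject) and column totals (per sports answer).
row_totals = {subject: sum(cells.values()) for subject, cells in table.items()}
column_totals = {}
for cells in table.values():
    for label, count in cells.items():
        column_totals[label] = column_totals.get(label, 0) + count
grand_total = sum(row_totals.values())

# Conditional relative frequency: among students who favor Math,
# the share who play sports (divide by the row total).
p_sports_given_math = table["Math"]["Plays Sports"] / row_totals["Math"]

# Conditioning the other way: among students who play sports,
# the share who favor Math (divide by the column total).
p_math_given_sports = table["Math"]["Plays Sports"] / column_totals["Plays Sports"]

print(row_totals, column_totals, grand_total)
print(round(p_sports_given_math, 2), round(p_math_given_sports, 2))
```

The two conditional values differ because they use different denominators: the first divides by the Math row total, the second by the Plays Sports column total.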
Conclusion
In this lesson, we learned about frequency tables, relative-frequency tables, cumulative-frequency tables, and contingency tables. These tools help us organize and interpret data effectively. By using these tables, you can quickly analyze information and convey your findings clearly!
Study Notes
- Frequency tables show how many times each value occurs in a dataset.
- Relative frequency represents how often an event occurs relative to the total.
- Cumulative frequency is a running total: the number of observations at or below a given value.
- When grouping continuous data, choose appropriate class widths based on the range and desired number of classes.
- Contingency tables summarize relationships between two categorical variables.
