4. Statistics and Probability

Histograms

Students, in this lesson you will learn how histograms help us turn raw data into a visual summary that makes patterns easier to see. Histograms are a key part of statistics because they show how data is spread across intervals, helping you understand shape, center, spread, and unusual values. By the end of this lesson, you should be able to explain what a histogram is, interpret one correctly, and connect it to the wider ideas in Statistics and Probability.

Learning objectives:

  • Explain the main ideas and terminology behind histograms.
  • Apply IB Mathematics Analysis and Approaches SL reasoning and procedures related to histograms.
  • Connect histograms to the broader topic of Statistics and Probability.
  • Summarize how histograms fit within Statistics and Probability.
  • Use evidence and examples related to histograms in IB Mathematics Analysis and Approaches SL.

A histogram is especially useful when there are too many data values to read one by one. Instead of listing every result, we group values into class intervals and draw bars that show frequency. This helps answer real questions such as: How are test scores spread out? Are most values low, high, or near the middle? Is the data symmetric, skewed, or uniform? 🌟

What a histogram shows

A histogram displays numerical data that has been grouped into intervals called classes or bins. The horizontal axis shows the class intervals, and the vertical axis shows either frequency or frequency density, depending on whether the class widths are equal.

For example, suppose a class recorded the times, in minutes, it took students to run a short race. The data might be grouped like this:

  • $10 \le t < 12$
  • $12 \le t < 14$
  • $14 \le t < 16$
  • $16 \le t < 18$

Each bar in the histogram represents one class interval. If the class width is the same for all intervals, then the bar height can be the frequency. If the intervals have different widths, then the bar height must be the frequency density so that the area of each bar represents the frequency.

The key idea is this: in a histogram, area matters. The frequency of a class is proportional to the area of its bar. If all class widths are equal, then height and area give the same ranking, so frequency alone is enough. If class widths differ, using frequency density is necessary to avoid misleading comparisons.

Core terminology and how to read one

When studying histograms, students, it is important to know the following terms:

  • Class interval: a range of values, such as $20 \le x < 30$.
  • Class width: the size of the interval, found by subtracting the lower boundary from the upper boundary.
  • Frequency: the number of data values in a class.
  • Frequency density: a measure used when class widths are unequal, given by $$\text{frequency density} = \frac{\text{frequency}}{\text{class width}}.$$
  • Cumulative frequency: the running total of frequencies up to a point, which is used more often with cumulative frequency graphs than with histograms.

A common mistake is to think that the tallest bar always represents the largest frequency. That is only true when all class widths are equal. When class widths differ, a bar's height is its frequency density, so a wide class with a large frequency can still have a shorter bar than a narrow class with a smaller frequency.

Let’s look at a quick example. Suppose the following grouped data represent the number of pages students read in a week:

  • $0 \le x < 10$, frequency $5$
  • $10 \le x < 20$, frequency $9$
  • $20 \le x < 40$, frequency $12$

The class widths are $10$, $10$, and $20$. For the third class, the frequency density is $$\frac{12}{20} = 0.6.$$

For the first two classes, the frequency densities are $\frac{5}{10} = 0.5$ and $\frac{9}{10} = 0.9$.

So the bar heights should be $0.5$, $0.9$, and $0.6$ if the vertical axis is frequency density.
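The same calculation can be sketched in a few lines of Python. This is only an illustration: the helper name `frequency_density` and the tuple layout are choices made here, not part of the lesson. It also checks the common mistake described earlier, that the tallest bar need not be the class with the largest frequency.

```python
# Grouped data from the pages-read example: (lower, upper, frequency).
classes = [(0, 10, 5), (10, 20, 9), (20, 40, 12)]

def frequency_density(lower, upper, frequency):
    """Return frequency divided by class width."""
    return frequency / (upper - lower)

densities = [frequency_density(lo, hi, f) for lo, hi, f in classes]
print(densities)  # [0.5, 0.9, 0.6]

# Index of the class with the largest frequency vs. the tallest bar.
largest_frequency = max(range(len(classes)), key=lambda i: classes[i][2])
tallest_bar = max(range(len(classes)), key=lambda i: densities[i])
print(largest_frequency, tallest_bar)  # 2 1
```

The third class has the largest frequency ($12$), but the second class has the tallest bar ($0.9$), because the third class is twice as wide.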

This example shows why histograms are not just bar charts. In a bar chart, categories are separate, like types of fruit or favorite sports. In a histogram, the data are numerical and continuous or treated as continuous, so the intervals touch each other without gaps. 🍎⚽

Constructing a histogram step by step

To draw a histogram correctly, follow these steps:

  1. Organize the data into class intervals.
  2. Find the frequency for each class.
  3. Calculate class width using $$\text{class width} = \text{upper boundary} - \text{lower boundary}.$$
  4. Find frequency density if class widths are not all equal using $$\text{frequency density} = \frac{\text{frequency}}{\text{class width}}.$$
  5. Draw axes with class intervals on the horizontal axis.
  6. Plot bars so that each bar covers its class interval.
  7. Make the bar area proportional to frequency.
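Steps 1 to 4 above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the raw data are invented for illustration, the function name `grouped_table` is hypothetical, and each class is taken as lower boundary $\le x <$ upper boundary, matching the intervals used in this lesson.

```python
from bisect import bisect_right

def grouped_table(data, boundaries):
    """Count values in classes boundaries[i] <= x < boundaries[i + 1]
    and return rows of (lower, upper, frequency, frequency density)."""
    counts = [0] * (len(boundaries) - 1)
    for x in data:
        # Ignore values outside the overall range of the classes.
        if boundaries[0] <= x < boundaries[-1]:
            counts[bisect_right(boundaries, x) - 1] += 1
    rows = []
    for lower, upper, freq in zip(boundaries, boundaries[1:], counts):
        rows.append((lower, upper, freq, freq / (upper - lower)))
    return rows

# Hypothetical raw data, invented for illustration only.
data = [1, 3, 4, 6, 7, 7, 9, 11, 12, 15, 18, 19]
boundaries = [0, 5, 10, 20]  # classes 0<=x<5, 5<=x<10, 10<=x<20

for row in grouped_table(data, boundaries):
    print(row)
```

Each printed row gives the class boundaries, the frequency, and the frequency density that would be used as the bar height.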

Here is a realistic example. Imagine the weights, in kilograms, of a group of packaged items are recorded as follows:

  • $0 \le w < 5$: frequency $4$
  • $5 \le w < 10$: frequency $10$
  • $10 \le w < 20$: frequency $16$

Because the last class has width $10$, while the first two have width $5$, the frequencies alone cannot be used as bar heights. We calculate frequency density:

  • $\frac{4}{5} = 0.8$
  • $\frac{10}{5} = 2.0$
  • $\frac{16}{10} = 1.6$

Now the histogram can be drawn so the bar heights are $0.8$, $2.0$, and $1.6$. The areas of the bars will represent the frequencies $4$, $10$, and $16$.
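As a quick check, multiplying each bar's height by its class width recovers the original frequencies:

$$0.8 \times 5 = 4, \qquad 2.0 \times 5 = 10, \qquad 1.6 \times 10 = 16.$$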

In IB Mathematics Analysis and Approaches SL, you may also be asked to use a histogram to estimate missing information. For instance, if one bar's height (its frequency density) and class width are known, then the frequency can be found by rearranging the formula:

$$\text{frequency} = \text{frequency density} \times \text{class width}.$$

This is an important skill because it connects graphical representation with algebraic reasoning.
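Here is a small Python sketch of that rearrangement, using the bar heights from the packaged-items example. The function name `frequency_from_bar` is hypothetical, and rounding to a whole number assumes the heights were read exactly from the graph.

```python
def frequency_from_bar(height, lower, upper):
    """Frequency = frequency density (bar height) x class width."""
    return height * (upper - lower)

# Bars from the packaged-items example: (frequency density, lower, upper).
bars = [(0.8, 0, 5), (2.0, 5, 10), (1.6, 10, 20)]

# Frequencies are counts, so round away any floating-point noise.
frequencies = [round(frequency_from_bar(h, lo, hi)) for h, lo, hi in bars]
print(frequencies)       # [4, 10, 16]
print(sum(frequencies))  # 30
```

Adding the recovered frequencies gives the total number of items, which is another quantity a question might ask for.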

Interpreting shape, spread, and unusual features

Histograms are useful because they reveal the overall pattern of data. When interpreting one, look for:

  • Shape: symmetric, skewed left, skewed right, uniform, or bimodal.
  • Center: where most data are concentrated.
  • Spread: how wide the distribution is.
  • Outliers or gaps: values far from the main cluster or empty intervals.

A symmetric histogram has a left side and right side that are roughly mirror images. A skewed right histogram has a long tail to the right, meaning a few large values pull the data in that direction. A skewed left histogram has a long tail to the left. A bimodal histogram has two clear peaks, which may suggest two groups within the data.

For example, if a histogram of exam scores has most bars between $60$ and $80$ with a small tail toward $20$, it is likely skewed left. This could happen if most students did well, but a few scores were much lower.

Students, interpreting shape matters because it helps you make sense of real situations. A histogram of delivery times might show whether most deliveries are quick and whether a few delays are unusually long. A histogram of daily screen time might show whether a group mostly uses devices a moderate amount or whether some values are much larger than the rest. 📱

Histograms also connect to measures of center and spread. If the data are roughly symmetric, the mean is often a useful measure of center. If the data are skewed, the median may describe the center better because it is less affected by extreme values.
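To see why the median resists extreme values, consider this short Python sketch. The scores are hypothetical, invented only to mimic a left-skewed set like the exam example above; `mean` and `median` come from Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical exam scores, invented for illustration: most lie
# between 60 and 80, with two much lower scores forming a left tail.
scores = [20, 35, 62, 65, 68, 70, 72, 75, 78, 80]
main_cluster = scores[2:]  # the same scores without the two low values

print(mean(scores), median(scores))              # 62.5 69.0
print(mean(main_cluster), median(main_cluster))  # 71.25 71.0
```

Including the two low scores moves the mean by $8.75$ but the median by only $2$, which is why the median is often preferred for skewed data.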

Why histograms matter in statistics and probability

Histograms belong to statistics because they summarize data and support informed conclusions. They are often used before more advanced analysis such as correlation and regression, because understanding the distribution of one variable is an important first step.

In the broader topic of Statistics and Probability, histograms help you:

  • summarize large sets of data,
  • compare distributions,
  • identify trends or unusual values,
  • and make decisions based on evidence.

They also prepare you for probability distributions. A histogram of observed data can be compared with the shape of a theoretical discrete or continuous distribution. For instance, if repeated measurements create a roughly bell-shaped histogram, you may later connect that idea with the normal distribution.

Histograms are not probability graphs themselves, but they support probability thinking by showing how frequently values occur in intervals. This makes them a bridge between raw data and probabilistic models.
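One way to picture this bridge is to turn frequencies into relative frequencies, the proportion of the data in each class. The sketch below uses the packaged-items frequencies from earlier. Treating these proportions as estimates of probability is an assumption that is only reasonable when the sample represents the population well.

```python
# Frequencies from the packaged-items example, keyed by class interval.
frequencies = {(0, 5): 4, (5, 10): 10, (10, 20): 16}

total = sum(frequencies.values())
relative = {interval: f / total for interval, f in frequencies.items()}

for (lower, upper), share in relative.items():
    print(f"{lower} <= w < {upper}: {share:.3f}")
```

The relative frequencies add up to $1$, just as the probabilities in a probability distribution do.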

Conclusion

Histograms are one of the most important tools in introductory statistics because they transform lists of numbers into patterns you can see and interpret. Students, you should now understand that class intervals, frequency, and frequency density are the key ideas behind histograms. You should also know that the area of each bar represents frequency, which matters most when class widths are unequal.

In IB Mathematics Analysis and Approaches SL, histograms help you describe data accurately, avoid common mistakes, and make connections to later topics such as distributions and data analysis. When you read a histogram carefully, you are not just looking at bars; you are reading a story about data 📈.

Study Notes

  • A histogram displays grouped numerical data using touching bars.
  • The horizontal axis shows class intervals.
  • The vertical axis shows frequency if class widths are equal, or frequency density if class widths are unequal.
  • Use $$\text{frequency density} = \frac{\text{frequency}}{\text{class width}}.$$
  • The area of each bar represents frequency.
  • Histograms are different from bar charts because histograms show continuous or grouped numerical data.
  • Important features to describe are shape, center, spread, gaps, and outliers.
  • For roughly symmetric data, the mean is often a useful measure of center; for skewed data, the median often describes the center better.
  • Histograms connect to data analysis, regression, and probability distributions in Statistics and Probability.
  • Always check class widths before interpreting bar height as frequency.
