4. Statistics and Probability

Histograms

Welcome, students. In this lesson, you will learn how histograms help us understand data that is organized into groups, especially when values are continuous, such as height, time, mass, or test scores. Histograms are one of the most important graphs in statistics because they show the shape of a data set at a glance. By the end of this lesson, you should be able to explain what a histogram shows, identify its key features, and use it to make sensible conclusions in IB Mathematics: Applications and Interpretation SL.

Objectives for this lesson:

  • Explain the main ideas and terminology behind histograms.
  • Construct and interpret histograms using grouped data.
  • Use histograms to describe distribution, center, spread, and unusual features.
  • Connect histograms to statistical reasoning and real-world decisions.

A histogram is not just a bar chart with bars touching. It is a graph for continuous data that shows how data values are distributed across class intervals. This makes it useful in real life for things like exam results, reaction times, rainfall, and delivery times 🚚.

What a Histogram Shows

A histogram displays data grouped into intervals, called class intervals or bins. The horizontal axis shows the intervals, and the vertical axis shows frequency or frequency density. The bars touch because the data are continuous, meaning there are no gaps between possible values.

If the data are measured rather than counted, a histogram is often the best way to display them. Test scores are strictly counted values, but when they are grouped into intervals they are usually treated as continuous. For example, suppose a teacher records the test scores of a class in intervals like $0 \le x < 10$, $10 \le x < 20$, and so on. A histogram can quickly show whether most students scored high, low, or somewhere in the middle.

The key vocabulary includes:

  • Class interval: a range of values such as $20 \le x < 30$.
  • Frequency: the number of values in a class.
  • Class width: the size of the interval, found by $\text{upper boundary} - \text{lower boundary}$.
  • Frequency density: used when class widths are not all the same.
  • Area: in a histogram, the area of each bar represents frequency.

That last point is very important. When all class widths are equal, the bar heights can be the frequencies themselves, because the areas are then proportional to the frequencies. But when class widths are unequal, the heights must be adjusted so that the area still matches the frequency. This is why frequency density is needed.

If the frequency is $f$ and the class width is $w$, then

$$\text{frequency density} = \frac{f}{w}$$

This formula is central to IB analysis of histograms.
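The formula can be sketched as a small Python function. The function name `frequency_density` and the sample class ($20 \le x < 30$ with frequency $15$) are illustrative assumptions, not values from the lesson.

```python
def frequency_density(frequency, lower, upper):
    """Return frequency density f / w for the class lower <= x < upper."""
    width = upper - lower  # class width = upper boundary - lower boundary
    if width <= 0:
        raise ValueError("upper boundary must exceed lower boundary")
    return frequency / width


# Hypothetical class 20 <= x < 30 containing 15 values
print(frequency_density(15, 20, 30))  # 1.5
```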

Frequency Density and Unequal Class Widths

Many students first meet histograms using equal-width classes, because those are easier to draw and read. However, real data often come in intervals of different widths. In that case, using frequency as the height would be misleading because a wider class would naturally have a bigger area even if the data were not more concentrated there.

To compare classes fairly, the height of each bar must be the frequency density. Since the area of a rectangle is

$$\text{area} = \text{height} \times \text{width}$$

the frequency in each class is represented by

$$f = (\text{frequency density}) \times (\text{class width})$$

Example

Suppose a data table shows:

  • $0 \le x < 10$: frequency $8$
  • $10 \le x < 30$: frequency $20$
  • $30 \le x < 40$: frequency $12$

The class widths are $10$, $20$, and $10$. The frequency densities are:

  • $\frac{8}{10} = 0.8$
  • $\frac{20}{20} = 1$
  • $\frac{12}{10} = 1.2$

So the histogram bars should have heights $0.8$, $1$, and $1.2$. Notice that the second class has the largest frequency, but its bar is not the tallest. That is because it is twice as wide as the other two classes. The frequency is shown by the area, not just the height.

This helps you avoid a common mistake: judging frequency only by the height of a bar when class widths are unequal.
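As a quick check of the example above, the following Python sketch recomputes each density and confirms that height times width gives back the frequency. The variable names are arbitrary choices for illustration.

```python
# Classes from the example above: (lower boundary, upper boundary, frequency)
classes = [(0, 10, 8), (10, 30, 20), (30, 40, 12)]

rows = []
for lower, upper, f in classes:
    width = upper - lower
    density = f / width
    area = density * width  # height x width should reproduce the frequency
    rows.append((lower, upper, width, density, area))

for lower, upper, width, density, area in rows:
    print(f"{lower} <= x < {upper}: width {width}, density {density}, area {area}")

densities = [r[3] for r in rows]
tallest = max(rows, key=lambda r: r[3])               # class with the highest bar
largest_frequency = max(classes, key=lambda c: c[2])  # class with the most values
print("tallest bar:", tallest[:2], "largest frequency:", largest_frequency[:2])
```

The last line shows that the tallest bar and the class with the most values are different classes here, which is exactly the trap described above.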

Reading Shape, Center, and Spread

Histograms are powerful because they help describe the distribution of data. Distribution means how the data values are spread out across intervals. When you look at a histogram, you should ask:

  • Is the distribution symmetric or skewed?
  • Where is the center?
  • How wide is the spread?
  • Are there any gaps or unusual values?

Shape

A histogram may be:

  • Symmetric: roughly balanced on both sides.
  • Positively skewed: a long tail to the right, meaning some large values stretch the graph.
  • Negatively skewed: a long tail to the left, meaning some small values stretch the graph.
  • Bimodal: two clear peaks, which may suggest two different groups mixed together.

For example, the distribution of exam marks is often negatively skewed if the test is easy, because many students score high and only a few score low. In contrast, delivery times may be positively skewed because most deliveries arrive near the usual time, but a few are delayed a lot.

Center

The center of a histogram is the region around which the data are concentrated. With grouped data you cannot read off one exact value, but you can estimate it: the tallest bar marks the modal class, and the median lies where the total area of the bars is split into two equal halves.
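One way to put numbers on the center is sketched below in Python. It assumes that values sit at class midpoints for the mean, and that values are spread evenly inside a class for the median. Both are standard grouped-data approximations rather than exact results, and the data set is hypothetical.

```python
def grouped_center(classes):
    """Estimate centre measures from sorted (lower, upper, frequency) classes."""
    total = sum(f for _, _, f in classes)

    # Mean estimate: treat every value as if it sat at its class midpoint.
    mean = sum((lo + hi) / 2 * f for lo, hi, f in classes) / total

    # Modal class: the class with the greatest frequency density (tallest bar).
    modal = max(classes, key=lambda c: c[2] / (c[1] - c[0]))

    # Median estimate: find the class holding the (total / 2)-th value and
    # interpolate linearly inside it, assuming values are spread evenly.
    target = total / 2
    cumulative = 0
    for lo, hi, f in classes:
        if f > 0 and cumulative + f >= target:
            median = lo + (target - cumulative) / f * (hi - lo)
            break
        cumulative += f

    return mean, median, (modal[0], modal[1])


# Hypothetical grouped data, chosen only for illustration
classes = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]
mean, median, modal_class = grouped_center(classes)
print(mean, median, modal_class)
```

For this data set the estimated mean and median both come out at about $16$, inside the modal class $10 \le x < 20$.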

Spread

Spread tells you how wide the data are. A histogram with bars stretching across many intervals has a larger spread than one where the bars cluster tightly together. Spread matters in real life because consistency is often as important as average performance. For instance, two classes may have similar average scores, but one class may have much more variation.
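The comparison of two classes with similar averages can be sketched in Python. The two data sets below are hypothetical, and the standard deviation is only an estimate because it uses class midpoints in place of the unknown raw values.

```python
import math


def grouped_spread(classes):
    """Estimate range and standard deviation from sorted (lower, upper, frequency) classes."""
    total = sum(f for _, _, f in classes)
    midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
    freqs = [f for _, _, f in classes]

    mean = sum(m * f for m, f in zip(midpoints, freqs)) / total
    variance = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, freqs)) / total

    # Largest possible range: lowest boundary to highest boundary among
    # the classes that actually contain data.
    occupied = [(lo, hi) for lo, hi, f in classes if f > 0]
    max_range = occupied[-1][1] - occupied[0][0]

    return max_range, math.sqrt(variance)


# Two hypothetical classes of 20 students, both with estimated mean 25
tight = [(10, 20, 2), (20, 30, 16), (30, 40, 2)]
wide = [(0, 10, 4), (10, 20, 3), (20, 30, 6), (30, 40, 3), (40, 50, 4)]

print(grouped_spread(tight))
print(grouped_spread(wide))
```

Both groups have the same estimated mean, yet the second has a much larger range and standard deviation, so its histogram would stretch across more intervals.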

Constructing a Histogram Step by Step

To construct a histogram from grouped data, follow these steps:

  1. Check the class intervals and make sure they cover the data without gaps.
  2. Find the class widths using $\text{width} = \text{upper boundary} - \text{lower boundary}$.
  3. Calculate frequency density with $\text{frequency density} = \frac{f}{w}$ if needed.
  4. Draw axes with the intervals on the horizontal axis.
  5. Label the vertical axis as frequency or frequency density.
  6. Draw touching bars with correct heights.

Example

A shop records the waiting times of customers, in minutes, with these frequencies:

  • $0 \le x < 5$: $6$
  • $5 \le x < 10$: $10$
  • $10 \le x < 20$: $14$

Since the widths are $5$, $5$, and $10$, the frequency densities are:

  • $\frac{6}{5} = 1.2$
  • $\frac{10}{5} = 2$
  • $\frac{14}{10} = 1.4$

A correct histogram would have bar heights $1.2$, $2$, and $1.4$. The tallest bar is the middle interval, showing that waiting times from $5$ to $10$ minutes were the most concentrated.

This kind of graph helps businesses spot problems. If many customers wait longer than expected, the shop can improve staffing or service flow.
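To see how the waiting-time example turns into bars, the Python sketch below computes the left edge, width, and height of each bar and prints a rough text preview. The scale of one `#` per $0.2$ units of frequency density is an arbitrary choice for the preview.

```python
# Waiting-time classes from the example: (lower, upper, frequency)
classes = [(0, 5, 6), (5, 10, 10), (10, 20, 14)]

left_edges = [lower for lower, _, _ in classes]
widths = [upper - lower for lower, upper, _ in classes]
heights = [f / w for (_, _, f), w in zip(classes, widths)]

# Text preview: one '#' per 0.2 units of frequency density.
for (lower, upper, _), h in zip(classes, heights):
    print(f"{lower:>2} <= x < {upper:<2} | {'#' * round(h / 0.2)} {h}")
```

If matplotlib is available, one option is to pass these lists to `plt.bar(left_edges, heights, width=widths, align="edge")`, which draws bars that start at each lower boundary and touch their neighbours.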

Interpreting Histograms in Real Life

Histograms are useful because they turn raw data into a visual story. Instead of looking at a long list of numbers, you can quickly see patterns.

Example: Sports training

A coach measures sprint times. If the histogram is positively skewed, it may show that most athletes are fairly fast, but a few have much slower times. The coach may decide to give extra support to those athletes.

Example: Environmental data

A scientist records rainfall amounts. If a histogram has a wide spread, that means rainfall varies a lot from day to day. This can affect water supply planning and flood prevention.

Example: School data

If test results are bimodal, it may indicate two distinct groups in the class, such as students who studied a lot and students who did not. The histogram suggests a deeper question rather than giving the final answer on its own.
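A simple way to flag possible peaks is sketched below in Python. It marks any class whose frequency density is higher than both of its neighbours. This is only a rough heuristic with hypothetical data: it ignores flat-topped peaks and cannot tell a real second group from random variation.

```python
def peak_classes(classes):
    """Return classes whose frequency density exceeds both neighbours'."""
    densities = [f / (hi - lo) for lo, hi, f in classes]
    padded = [0] + densities + [0]  # treat the outside of the range as empty
    return [
        (lo, hi)
        for i, (lo, hi, _) in enumerate(classes)
        if padded[i + 1] > padded[i] and padded[i + 1] > padded[i + 2]
    ]


# Hypothetical test scores with two clusters
scores = [(0, 20, 3), (20, 40, 12), (40, 60, 4), (60, 80, 11), (80, 100, 2)]
print(peak_classes(scores))  # [(20, 40), (60, 80)]
```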

A histogram helps with inferential reasoning too. It can suggest trends or patterns that motivate further investigation. However, it does not prove causes. A histogram tells you what the data look like, not why they look that way.

Common Mistakes to Avoid

Here are some mistakes students often make:

  • Treating a histogram like a bar chart.
  • Leaving spaces between bars for continuous data.
  • Using frequency as height when class widths are unequal.
  • Forgetting that the area, not just the height, represents frequency.
  • Misreading skewness or assuming a visible pattern has only one explanation.

Always check the class intervals carefully. If the intervals are different widths, the vertical axis should usually be frequency density. If the question asks for frequency from a histogram, use

$$f = (\text{frequency density}) \times (\text{class width})$$

This is especially important in IB exam questions, where a small error in interpreting the axis can lead to a wrong conclusion.
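Reading frequencies back from a histogram can be sketched in Python as follows. The bars are the heights from the waiting-time example earlier in the lesson, and the function name is an illustrative assumption.

```python
def frequencies_from_bars(bars):
    """Recover class frequencies from (lower, upper, frequency_density) bars."""
    return [density * (upper - lower) for lower, upper, density in bars]


# Bar heights from the waiting-time example, which has unequal class widths
bars = [(0, 5, 1.2), (5, 10, 2.0), (10, 20, 1.4)]
frequencies = frequencies_from_bars(bars)
total = sum(frequencies)

# Rounding removes tiny floating-point errors; frequencies are whole numbers.
print([round(f) for f in frequencies], round(total))  # [6, 10, 14] 30
```

Adding the areas of all the bars gives the total number of data values, which is a useful check on any histogram reading.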

Conclusion

Histograms are a core tool in statistics because they show how continuous data are distributed across intervals. They help you identify shape, center, spread, and unusual features. In IB Mathematics: Applications and Interpretation SL, you must know how to construct histograms, read them correctly, and use frequency density when class widths are unequal. Histograms are not just graphs for classwork; they are practical tools for making decisions in business, science, health, and everyday life 📈.

If you can explain what the bars mean, calculate frequency density, and describe the distribution clearly, you are using statistical reasoning in a strong IB way, students.

Study Notes

  • A histogram is used for continuous data grouped into class intervals.
  • The bars touch because the data values are not separated into categories.
  • For equal class widths, the bar height may represent frequency.
  • For unequal class widths, use $\text{frequency density} = \frac{f}{w}$.
  • In a histogram, area represents frequency.
  • The class width is found by $\text{width} = \text{upper boundary} - \text{lower boundary}$.
  • The histogram can show shape, center, spread, and unusual features.
  • Common shapes include symmetric, positively skewed, negatively skewed, and bimodal.
  • Histograms help with real-world decisions in areas like sport, business, health, and science.
  • A histogram describes the data visually, but it does not explain the cause of the pattern.
