4. Statistics and Probability

Sources of Bias in Statistics and Probability

Imagine trying to find out whether your school should start a new lunch menu by asking only the students who already love the cafeteria food 🍎. The results would probably look better than the truth. That is the basic idea behind bias in statistics: something in the way data are collected, chosen, measured, or interpreted makes the result systematically inaccurate or misleading. In this lesson, you will learn how bias affects data analysis, probability models, and real-world decisions. You will also see how to spot it and reduce it so that conclusions are more trustworthy.

What bias means and why it matters

In statistics, bias is a systematic error that pushes results in one direction. This is different from random variation. Random variation is natural noise in data; bias is a consistent problem in the method. If a survey method is biased, it tends to miss the truth in the same direction each time it is used, either overestimating or underestimating. That makes the conclusion unreliable, and collecting more data in the same way does not fix it.
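To see the difference between random variation and bias, here is a minimal Python sketch. The true value of 50, the noise level, and the offset of 3 are all invented for illustration; they do not come from any real survey.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

TRUE_MEAN = 50.0  # hypothetical true value we are trying to estimate
NOISE_SD = 5.0    # assumed size of the random variation
OFFSET = 3.0      # assumed systematic error in the biased method


def run_survey(n, offset):
    """Return the mean of n noisy readings, each shifted by a fixed offset."""
    readings = [rng.gauss(TRUE_MEAN, NOISE_SD) + offset for _ in range(n)]
    return statistics.mean(readings)


# Repeat each method 200 times and average the estimates.
unbiased_estimates = [run_survey(100, 0.0) for _ in range(200)]
biased_estimates = [run_survey(100, OFFSET) for _ in range(200)]

unbiased_avg = statistics.mean(unbiased_estimates)
biased_avg = statistics.mean(biased_estimates)

print(f"unbiased method averages {unbiased_avg:.2f}")
print(f"biased method averages   {biased_avg:.2f}")
```

Averaging many repetitions smooths out the random noise, so the first method should land close to 50. The second method should stay about 3 units too high no matter how many times it is repeated.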

A key idea in IB Mathematics: Applications and Interpretation SL is that statistics are not just about calculating numbers. They are about making decisions based on data. If the data are biased, then the decision may be wrong even if the calculations are correct.

Common sources of bias include:

  • Selection bias: the sample is not representative of the population.
  • Voluntary response bias: people choose whether to respond, often leading to extreme opinions being overrepresented.
  • Non-response bias: selected people do not respond, and those who do may differ from those who do not.
  • Measurement bias: the way information is collected changes the result.
  • Question wording bias: the wording of a question influences answers.
  • Observer bias: the person collecting data influences the observations.
  • Confounding variables: another variable affects both the explanatory and response variables, making the relationship misleading.

A useful question to ask is: “Could this method consistently pull the result away from the truth?” If the answer is yes, bias may be present.

Sources of bias in sampling and surveys

Sampling is one of the most common places where bias appears. In statistics, we often want to estimate a population characteristic using a sample. For example, a school may want to estimate the average study time of all students by surveying a smaller group. If the sample is biased, the estimate may not reflect the whole school.

Selection bias

Selection bias happens when the sampling method makes some members of the population more likely to be chosen than others in a way that distorts the sample. For example, if a student surveys only the chess club to estimate average homework time, the sample may not represent the entire school. Chess club members may have different schedules from athletes or drama students.

A better method would be a random sample, where every member of the population has an equal chance of being selected. Random sampling helps reduce bias, though it does not eliminate random error.
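The chess club example can be sketched in Python. The school size, the club size, and the homework times below are made-up numbers chosen only to make the effect visible; the assumption that club members do more homework is part of the illustration, not a real finding.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Invented population of 1000 students: (in_chess_club, homework_hours)
population = []
for i in range(1000):
    in_club = i < 50                      # assume 50 chess club members
    mean_hours = 3.0 if in_club else 1.5  # assumed averages, for illustration
    hours = max(0.0, rng.gauss(mean_hours, 0.5))
    population.append((in_club, hours))

population_mean = statistics.mean(h for _, h in population)

# Biased method: survey only the chess club.
club_mean = statistics.mean(h for in_club, h in population if in_club)

# Simple random sample of the same size (50 students).
random_sample = rng.sample(population, 50)
random_mean = statistics.mean(h for _, h in random_sample)

print(f"whole school:  {population_mean:.2f} hours")
print(f"chess club:    {club_mean:.2f} hours")
print(f"random sample: {random_mean:.2f} hours")
```

Both samples contain 50 students, so sample size is not the issue. Under these assumptions, the club-only estimate is far from the school-wide mean, while the random sample misses only by random error.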

Voluntary response bias

Voluntary response bias happens when people choose whether to participate. This often attracts people with strong opinions, especially those who feel very happy or very unhappy. For example, an online poll asking, “Do you think school lunch is terrible?” may get many responses from unhappy students and few from satisfied students. The result can exaggerate negativity.
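Here is a small simulation of the lunch poll, written as a sketch under stated assumptions: 30% of students are truly unhappy, unhappy students answer the poll half the time, and satisfied students answer one time in ten. None of these rates come from real data.

```python
import random

rng = random.Random(2)  # fixed seed so the sketch is repeatable

# Assumed for illustration: 30% of 2000 students are truly unhappy with lunch.
students = [rng.random() < 0.30 for _ in range(2000)]  # True means unhappy


def chooses_to_respond(unhappy):
    """Assume unhappy students are much more likely to answer the poll."""
    return rng.random() < (0.50 if unhappy else 0.10)


# Only the students who choose to respond appear in the poll results.
responses = [unhappy for unhappy in students if chooses_to_respond(unhappy)]

true_share = sum(students) / len(students)
poll_share = sum(responses) / len(responses)

print(f"truly unhappy:     {true_share:.0%}")
print(f"unhappy, per poll: {poll_share:.0%}")
```

With these response rates, the poll should report a much larger share of unhappy students than actually exists in the school, because the unhappy students are the ones who bothered to answer.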

Non-response bias

Non-response bias occurs when selected people do not respond. If the non-responders are different from responders, the sample becomes less representative. For instance, if a school sends a survey about after-school sports to students and many busy students do not answer, the sample may overrepresent students who have more free time.
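The after-school sports example can be sketched the same way. In this hypothetical setup, every selected student has between 0 and 5 free hours per day, and the chance of answering the survey rises with free time. Both assumptions are invented for illustration.

```python
import random
import statistics

rng = random.Random(4)  # fixed seed so the sketch is repeatable

# Assumed: each of 2000 selected students has 0 to 5 free hours per day.
selected = [rng.uniform(0, 5) for _ in range(2000)]

# Assumed: the chance of answering grows with free time (free_hours / 5).
responders = [h for h in selected if rng.random() < h / 5]

selected_mean = statistics.mean(selected)
responder_mean = statistics.mean(responders)

print(f"everyone selected: {selected_mean:.2f} free hours")
print(f"responders only:   {responder_mean:.2f} free hours")
```

Even though the original selection was fair, the students who actually answer should have noticeably more free time on average than the full selected group.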

How to reduce sampling bias

Here are common ways to reduce sampling bias:

  • Use random sampling.
  • Use stratified sampling when the population has important subgroups.
  • Increase sample size when appropriate to reduce random variation, while still keeping the sampling method fair. A larger sample does not correct a biased method.
  • Avoid relying only on volunteers or convenience samples.

A stratified sample divides the population into groups, or strata, based on a feature like grade level, then takes a random sample from each group. This helps ensure that important groups are represented fairly.
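A stratified sample by grade level could be drawn along these lines. This is a minimal sketch: the roster, the grade sizes, and the helper name `stratified_sample` are all hypothetical, and it takes the same fraction from every grade (proportional allocation), which is only one possible choice.

```python
import random
from collections import defaultdict

rng = random.Random(3)  # fixed seed so the sketch is repeatable

# Hypothetical school roster: (student_id, grade_level)
grade_sizes = {9: 300, 10: 250, 11: 250, 12: 200}
roster = []
for grade, size in grade_sizes.items():
    for k in range(size):
        roster.append((f"{grade}-{k}", grade))


def stratified_sample(roster, fraction, rng):
    """Randomly sample the same fraction of students from each grade."""
    strata = defaultdict(list)
    for student in roster:
        strata[student[1]].append(student)  # group students by grade
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))  # random sample within stratum
    return sample


sample = stratified_sample(roster, 0.10, rng)

counts = defaultdict(int)
for _, grade in sample:
    counts[grade] += 1
print(dict(counts))  # {9: 30, 10: 25, 11: 25, 12: 20}
```

Each grade appears in the sample in the same proportion as in the school, which a single simple random sample of 100 students would only achieve approximately.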

Bias in measurement and survey design

Even if a sample is chosen well, bias can still appear in how data are measured or asked for. This is especially important in questionnaires, experiments, and observations.

Question wording bias

The wording of a question can shape the answer. Compare these two questions:

  • “Should the school waste money on extra sports equipment?”
  • “Should the school invest in better sports equipment for student health and participation?”

Both ask about the same issue, but they use different language. The first suggests waste, while the second suggests benefit. That wording can influence responses.

To reduce wording bias, questions should be neutral, clear, and specific. Avoid emotional language and double meanings.

Measurement bias

Measurement bias happens when the method or instrument consistently gives results that are too high, too low, or otherwise distorted. For example, a bathroom scale that is not calibrated may always add $2$ kg to the true weight. In a science lab, a thermometer that is poorly placed near sunlight may record a temperature that is higher than the true air temperature.
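The bathroom scale example can be written as a short worked equation. The symbols here are introduced only for illustration: $x_i$ is the true weight of person $i$, $y_i$ is the reading on the scale, and $n$ is the number of people weighed. If every reading is $y_i = x_i + 2$, then the mean of the readings is

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} (x_i + 2) = \frac{1}{n}\sum_{i=1}^{n} x_i + 2 = \bar{x} + 2.$$

The mean reading is $2$ kg too high for any value of $n$, so weighing more people on the same scale does not remove the error. Under this assumption of a constant offset, only recalibrating the scale or subtracting the known $2$ kg fixes it.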
