Topic 1: Data and Variables

Lesson 1.3: Categorical and Numerical Variables

Official syllabus section for Lesson 1.3 (Categorical and numerical variables) in Topic 1 (Data and Variables): the distinction between categorical (qualitative) and numerical (quantitative) variables; discrete numerical variables (counts) versus continuous numerical variables (measurements).

Introduction

Welcome to Lesson 1.3 of Foundation Preparatory Statistics! This lesson covers the distinction between two primary types of variables: categorical (qualitative) and numerical (quantitative). The type of a variable determines which tables, charts, and summaries make sense for it, so classifying variables correctly is a first step in any analysis. By the end of this lesson, you will be able to classify variables, explain your classification, and choose displays that suit each type.

Learning Objectives

  • Understand the distinction between categorical (qualitative) and numerical (quantitative) variables.
  • Differentiate between discrete numerical variables (counts) and continuous numerical variables (measurements).
  • Recognize why the type of variable influences the choice of tables, charts, and summaries.
  • Confidently classify everyday variables as categorical or numerical and, within numerical, as discrete or continuous, using worked examples.

Categorical Variables

Categorical variables, also known as qualitative variables, place each individual into one of several groups based on a characteristic or quality. Their values are labels, not quantities. Even when the labels are written as numbers (for example, a postal code), arithmetic such as adding or averaging them has no meaning.

Examples of Categorical Variables

  1. Colors of M&Ms: The different colors represent distinct categories: Red, Yellow, Blue, Green, etc.
  2. Types of Vehicles: Categories such as Sedans, SUVs, Trucks, and Vans.
  3. Gender: Categories like Male, Female, and Non-binary.

Categorical variables can be divided into two types:

  • Nominal Variables: These do not have a natural order. For example, colors or types of animals (e.g., Dog, Cat, Bird).
  • Ordinal Variables: These have a defined order. For instance, satisfaction levels (e.g., Unsatisfied, Neutral, Satisfied).

Example: Classifying Categorical Variables

Consider a survey asking students to name their favorite fruit. The responses fell into the following categories:

  • Apple
  • Banana
  • Cherry
  • Date

In this case, the variable 'Favorite Fruit' is a nominal categorical variable because the fruits have no inherent order. If we instead asked each student to rank the four fruits from most to least favorite, the rank given to a fruit (1st, 2nd, 3rd, 4th) would be an ordinal categorical variable, because the ranks have a defined order.
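The difference between nominal and ordinal data shows up in what you can compute. The sketch below uses made-up survey responses (the lists are assumptions for illustration, not data from this lesson): for a nominal variable it only counts categories, while for an ordinal variable it also sorts the responses and picks the middle one, which is meaningful only because the order is stated.

```python
from collections import Counter

# Hypothetical responses, invented for illustration.
favorite_fruit = ["Apple", "Cherry", "Banana", "Apple", "Date", "Banana", "Apple"]

# Nominal: counting how often each category occurs is the meaningful summary.
fruit_counts = Counter(favorite_fruit)
most_common_fruit = fruit_counts.most_common(1)[0][0]
print(most_common_fruit, fruit_counts[most_common_fruit])

# Ordinal: the categories have an order, which we state explicitly.
satisfaction_order = ["Unsatisfied", "Neutral", "Satisfied"]
responses = ["Satisfied", "Unsatisfied", "Neutral", "Satisfied", "Neutral"]

# Sorting by position in the stated order makes sense for ordinal data.
ranked = sorted(responses, key=satisfaction_order.index)

# With an odd number of responses, the middle one is the median response.
median_response = ranked[len(ranked) // 2]
print(ranked)
print(median_response)
```

Sorting the fruit list alphabetically would also run without error, but the resulting order would say nothing about the fruits themselves, which is why a median is not used for nominal data.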

Numerical Variables

Numerical variables, or quantitative variables, record quantities that are counted or measured, so arithmetic such as adding or averaging their values is meaningful. They can be further classified into two types:

  1. Discrete Numerical Variables: These represent countable values. For example, the number of students in a classroom, the number of cars in a parking lot, etc.
  2. Continuous Numerical Variables: These can take any value within a given range and are usually measurements. For example, weight, height, temperature, etc.

Example: Classifying Numerical Variables

Let's consider the heights of students in a class:

  • 150 cm
  • 155 cm
  • 160 cm
  • 167 cm

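Here, height is a continuous variable. The values above are recorded to the nearest centimeter, but a student's actual height can be any value within a range, such as 155.3 cm or 155.37 cm, limited only by the precision of the measuring instrument.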

In contrast, consider the number of students present in a class on a given day:

  • 25, 26, 27, 28

This is a discrete variable since it can only take specific values (you cannot have half a student).
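As a rough illustration of the contrast, the sketch below reuses the attendance counts and heights from the examples above. The variable names are assumptions made for this example. It shows that a value halfway between two heights is still a possible height, while a value halfway between two attendance counts is not a possible count.

```python
# Discrete: attendance counts are whole numbers with gaps between them.
attendance = [25, 26, 27, 28]
all_whole = all(isinstance(n, int) for n in attendance)

# Halfway between two neighbouring counts is not a possible count.
halfway_count = (attendance[0] + attendance[1]) / 2
is_possible_count = halfway_count.is_integer()

# Continuous: heights are measurements, here recorded in centimeters.
heights_cm = [150.0, 155.0, 160.0, 167.0]

# Halfway between two heights is itself a perfectly possible height.
midpoint_height = (heights_cm[0] + heights_cm[1]) / 2

print(all_whole)          # True
print(halfway_count, is_possible_count)  # 25.5 False
print(midpoint_height)    # 152.5
```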

Why Do Variable Types Matter?

Understanding the type of variable is essential because it determines which displays and summaries are meaningful. Different tables, charts, and summaries suit different types of data. Using one that does not fit the variable type, such as averaging category codes, can lead to incorrect conclusions.

Choosing Tables and Charts

  • Categorical Variables:
      • Bar charts are commonly used to show the frequency of each category.
      • Pie charts show each category's share of the whole.
  • Numerical Variables:
      • Histograms are effective for displaying the distribution of continuous data by grouping values into intervals.
      • Box plots summarize the center and spread of numerical data using the median, quartiles, and extremes.
      • Discrete data with only a few distinct values can also be shown with a bar chart of the counts.

Worked Example: Creating a Graph

Imagine we surveyed students about their favorite subject in school. The responses were:

  • Math: 20 students
  • Science: 15 students
  • English: 10 students
  • History: 5 students

To present this data, you can create a bar chart where the x-axis represents the subjects and the y-axis represents the number of students. This visualization is appropriate because we are dealing with categorical data.
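A minimal sketch of this bar chart in text form, using the counts from the survey above. The dictionary name and the use of `#` characters as bars are choices made for this illustration; a plotting library would normally be used instead.

```python
# Counts from the favorite-subject survey in the worked example.
subject_counts = {"Math": 20, "Science": 15, "English": 10, "History": 5}

total = sum(subject_counts.values())  # 50 students in all

# One row per category: a bar whose length is the count, then the share.
for subject, count in subject_counts.items():
    share = count / total
    bar = "#" * count
    print(f"{subject:<8} {bar} {count} ({share:.0%})")
```

The bars could be listed in any order without changing the meaning, because the subjects are nominal categories.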

On the other hand, if we had numerical data, such as the scores of the same students in a test out of 100, we could create a histogram to show the distribution of scores, illustrating how many students scored within certain ranges.
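To make the idea of "ranges" concrete, the sketch below groups test scores into bins of width 10 and counts how many scores fall in each bin, which is the calculation behind a histogram. The scores are invented for illustration, and the bin width and the helper name `bin_start` are assumptions.

```python
from collections import Counter

# Hypothetical test scores out of 100, invented for illustration.
scores = [42, 55, 58, 61, 67, 68, 70, 73, 75, 78, 81, 84, 88, 92, 97]

BIN_WIDTH = 10

def bin_start(score: int) -> int:
    """Return the lower edge of the bin containing the score, e.g. 67 -> 60.

    A score of exactly 100 is placed in the top bin, which starts at 90.
    """
    return min(score // BIN_WIDTH * BIN_WIDTH, 90)

bin_counts = Counter(bin_start(s) for s in scores)

# Print the bins in numerical order; the order matters for a histogram.
for start in sorted(bin_counts):
    upper = 100 if start == 90 else start + BIN_WIDTH - 1
    print(f"{start:>3}-{upper:<3} {'#' * bin_counts[start]} {bin_counts[start]}")
```

Unlike the categories in a bar chart, the bins lie along a number line, so they must stay in numerical order.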

Classifying Everyday Variables

To strengthen your understanding, let’s classify some everyday variables:

  1. Temperature on a given day: Continuous numerical variable.
  2. Number of pets owned: Discrete numerical variable.
  3. Marital Status (Single, Married, Divorced): Categorical nominal variable.
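  4. Movie Ratings (1 to 5 stars): Categorical ordinal variable. The stars are ordered labels; the gap between 1 and 2 stars need not equal the gap between 4 and 5.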
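These classifications follow from a short chain of questions: is arithmetic on the values meaningful, and if so, are the values counts? If not, do the categories have an order? The sketch below encodes that reasoning. The function `classify` and its parameter names are hypothetical and exist only for this illustration; the answers to the questions still have to come from thinking about what the variable records.

```python
def classify(arithmetic_meaningful: bool,
             is_count: bool = False,
             is_ordered: bool = False) -> str:
    """Classify a variable from yes/no answers about how it behaves."""
    if arithmetic_meaningful:
        # Numerical: counted values are discrete, measured ones continuous.
        return "discrete numerical" if is_count else "continuous numerical"
    # Categorical: ordered labels are ordinal, unordered ones nominal.
    return "ordinal categorical" if is_ordered else "nominal categorical"

# The four everyday variables from the list above.
examples = {
    "Temperature on a given day": classify(True),
    "Number of pets owned": classify(True, is_count=True),
    "Marital status": classify(False),
    "Movie ratings (1 to 5 stars)": classify(False, is_ordered=True),
}

for variable, variable_type in examples.items():
    print(f"{variable}: {variable_type}")
```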

Exercise: Classifying Variables

As a practice, classify the following variables:

  • A basketball player's jersey number.
  • Number of books in a library.
  • Types of cuisine (Italian, Mexican, Chinese).
  • Deviation from the mean score in a statistics exam.

Conclusion

In summary, the distinction between categorical and numerical variables underlies every later choice in data analysis. Recognizing whether a variable is qualitative or quantitative, and whether a quantitative variable is discrete or continuous, tells you which tables, charts, and summaries are appropriate. These foundations will be needed as we progress to more complex topics and analyses.

Study Notes

  • Categorical Variables: Represent categories or qualities; values are labels (even if written as numbers), so arithmetic on them is not meaningful. Can be nominal or ordinal.
  • Numerical Variables: Measurable and expressed in numbers, can be discrete (countable) or continuous (measured).
  • Tables and Charts: Different variable types dictate the charts and tables appropriate for displaying data.
  • Classification Exercise: Practice classifying variables to enhance understanding of their types.
  • Significance: Recognizing variable types prevents errors in data analysis and interpretation.
