Continuous Distributions
Hey students! 👋 Welcome to one of the most fascinating topics in AS-level Further Mathematics - continuous distributions! In this lesson, we'll explore how probability works when dealing with variables that can take on any value within a range, rather than just discrete whole numbers. By the end of this lesson, you'll understand uniform, exponential, and normal distributions, know how to work with probability density functions, calculate cumulative probabilities, and apply transformation techniques. Think of it like this: instead of counting individual marbles in a jar, we're now looking at things like the exact height of students in your school or the precise time it takes for your phone battery to drain 📱
Understanding Continuous Random Variables and Probability Density Functions
Students, let's start with the fundamental concept that makes continuous distributions special. Unlike discrete random variables (like the number of heads when flipping coins), continuous random variables can take on any value within a specified range. For example, your exact height could be 170.2847... cm - there are infinitely many possible values!
Since there are infinitely many possible values, the probability of any single exact value is essentially zero. Instead, we work with probability density functions (PDFs), denoted as $f(x)$. The PDF doesn't give us probabilities directly - instead, it tells us the "density" of probability at each point. To find actual probabilities, we need to integrate the PDF over an interval.
The key properties of any PDF are:
- $f(x) \geq 0$ for all values of $x$ (probability density can't be negative)
- $\int_{-\infty}^{\infty} f(x)dx = 1$ (the total area under the curve equals 1)
- $P(a \leq X \leq b) = \int_{a}^{b} f(x)dx$ (probability equals the area under the curve)
Real-world example: If you're measuring the exact time students take to complete a math test, the PDF would show you where most completion times cluster, but you'd need to integrate to find the probability that a randomly selected student finishes between 45 and 50 minutes 📊
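To see these properties in action, here is a minimal numerical sketch. It uses an exponential density $f(x) = \lambda e^{-\lambda x}$ with $\lambda = 0.5$ (an illustrative choice, not a value from the lesson) and a simple trapezoidal rule to approximate the two integrals: the total area and the probability over an interval.

```python
import math

# Illustrative PDF: exponential density with rate lam = 0.5 (assumed value)
def f(x, lam=0.5):
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def integrate(g, a, b, n=100_000):
    """Trapezoidal-rule approximation of the integral of g on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for i in range(1, n):
        total += g(a + i * h)
    return total * h

# Property 2: the total area under the curve should be (approximately) 1.
# Integrating to 60 is enough here because the tail beyond 60 is negligible.
total_area = integrate(f, 0, 60)

# Property 3: P(2 <= X <= 5) is the area under the curve between 2 and 5.
prob_2_to_5 = integrate(f, 2, 5)

print(round(total_area, 4))    # close to 1
print(round(prob_2_to_5, 4))   # matches e^{-1} - e^{-2.5}
```

The exact value of $P(2 \leq X \leq 5)$ here is $e^{-1} - e^{-2.5} \approx 0.286$, so you can check the numerical answer against the calculus by hand.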
The Uniform Distribution - Equal Probability Everywhere
The uniform distribution is the simplest continuous distribution, students. Imagine a perfectly fair spinner that can land anywhere on a circle - that's uniform! In a uniform distribution over the interval $[a, b]$, every value within that range is equally likely.
The PDF of a uniform distribution is:
$$f(x) = \begin{cases}
\frac{1}{b-a} & \text{if } a \leq x \leq b \\
0 & \text{otherwise}
\end{cases}$$
This creates a rectangular shape when graphed, which is why it's sometimes called the rectangular distribution. The height of the rectangle is $\frac{1}{b-a}$ to ensure the total area equals 1.
For a uniform distribution on $[a, b]$:
- Mean: $\mu = \frac{a+b}{2}$ (exactly in the middle)
- Variance: $\sigma^2 = \frac{(b-a)^2}{12}$
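The mean and variance formulas can be checked by direct integration of $\int x f(x)\,dx$ and $\int x^2 f(x)\,dx - \mu^2$. A small sketch, using illustrative endpoints $a = 2$, $b = 10$:

```python
# Verify the uniform mean and variance formulas by midpoint-rule integration.
# Endpoints a = 2, b = 10 are illustrative choices.
a, b = 2.0, 10.0
n = 100_000
h = (b - a) / n

mean = 0.0
second_moment = 0.0
for i in range(n):
    x = a + (i + 0.5) * h        # midpoint of the i-th subinterval
    density = 1.0 / (b - a)      # constant uniform PDF height
    mean += x * density * h
    second_moment += x * x * density * h

variance = second_moment - mean ** 2

print(round(mean, 4))       # formula gives (a + b) / 2 = 6.0
print(round(variance, 4))   # formula gives (b - a)^2 / 12 ≈ 5.3333
```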
Real-world application: Random number generators in computers often use uniform distributions. If you ask a computer to generate a random number between 0 and 1, it's using a uniform distribution where every decimal value has equal probability of being selected 🎲
The Exponential Distribution - Modeling Waiting Times
Students, the exponential distribution is incredibly useful for modeling waiting times and the time between events. Think about how long you wait for a bus, the time between phone calls to a customer service center, or how long electronic components last before failing.
The PDF of an exponential distribution with parameter $\lambda$ (lambda) is:
$$f(x) = \lambda e^{-\lambda x} \text{ for } x \geq 0$$
The parameter $\lambda$ is called the rate parameter - it represents how frequently events occur. A larger $\lambda$ means events happen more frequently, so waiting times are shorter.
Key properties of the exponential distribution:
- Mean: $\mu = \frac{1}{\lambda}$
- Variance: $\sigma^2 = \frac{1}{\lambda^2}$
- The distribution has a "memoryless" property: $P(X > s + t | X > s) = P(X > t)$
The memoryless property is fascinating! It means that if you've already been waiting for a bus for 10 minutes, the probability of waiting another 5 minutes is the same as if you had just arrived at the bus stop. This might seem counterintuitive, but it's mathematically proven for exponential distributions 🚌
Real-world example: If customers arrive at a bank at an average rate of 2 per minute, then $\lambda = 2$, and the time between arrivals follows an exponential distribution with mean $\frac{1}{2} = 0.5$ minutes.
The Normal Distribution - The Bell Curve
The normal distribution is probably the most important distribution in statistics, students! Also known as the Gaussian distribution or bell curve, it appears everywhere in nature and human behavior. Heights, test scores, measurement errors, and countless other phenomena follow approximately normal distributions.
The PDF of a normal distribution with mean $\mu$ and standard deviation $\sigma$ is:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
This creates the famous bell-shaped curve that's symmetric around the mean. The parameters $\mu$ and $\sigma$ completely determine the shape:
- $\mu$ determines where the center of the curve is located
- $\sigma$ determines how spread out the curve is (larger $\sigma$ = wider curve)
The standard normal distribution is a special case where $\mu = 0$ and $\sigma = 1$. We denote this as $Z \sim N(0,1)$, and it's used extensively in statistical calculations.
Amazing fact: About 68% of values fall within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations. This is called the empirical rule or 68-95-99.7 rule 📈
Real-world example: If SAT scores are normally distributed with mean 1000 and standard deviation 200, then about 68% of students score between 800 and 1200, and 95% score between 600 and 1400.
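Since the standard normal CDF can be written in terms of the error function, $\Phi(z) = \tfrac{1}{2}\bigl(1 + \operatorname{erf}(z/\sqrt{2})\bigr)$, both the empirical rule and the SAT example can be verified in a few lines:

```python
import math

def Phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Empirical rule: probability within 1, 2, and 3 standard deviations.
within_1 = Phi(1) - Phi(-1)
within_2 = Phi(2) - Phi(-2)
within_3 = Phi(3) - Phi(-3)
print(round(within_1, 4), round(within_2, 4), round(within_3, 4))
# ≈ 0.6827 0.9545 0.9973

# SAT example from the text: mean 1000, sd 200, P(800 <= X <= 1200).
mu, sigma = 1000, 200
p_800_1200 = Phi((1200 - mu) / sigma) - Phi((800 - mu) / sigma)
print(round(p_800_1200, 4))   # same as within_1, about 68%
```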
Cumulative Distribution Functions and Probability Calculations
Students, while PDFs tell us about probability density, cumulative distribution functions (CDFs) give us actual probabilities. The CDF, denoted $F(x)$, represents the probability that our random variable is less than or equal to a specific value: $F(x) = P(X \leq x)$.
The relationship between PDF and CDF is:
$$F(x) = \int_{-\infty}^{x} f(t)dt$$
And conversely: $f(x) = \frac{d}{dx}F(x)$ (the PDF is the derivative of the CDF)
For our three distributions:
Uniform Distribution CDF:
$$F(x) = \begin{cases}
0 & \text{if } x < a \\
\frac{x-a}{b-a} & \text{if } a \leq x \leq b \\
1 & \text{if } x > b
\end{cases}$$
Exponential Distribution CDF:
$$F(x) = 1 - e^{-\lambda x} \text{ for } x \geq 0$$
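The PDF-CDF relationship $f(x) = \frac{d}{dx}F(x)$ is easy to check numerically for the exponential case: differentiate $F(x) = 1 - e^{-\lambda x}$ with a central difference and compare against $\lambda e^{-\lambda x}$. The values $\lambda = 1.5$ and $x = 0.8$ below are illustrative choices.

```python
import math

# Check that the numerical derivative of the exponential CDF
# recovers the PDF. lam = 1.5 and x = 0.8 are assumed values.
lam, x, h = 1.5, 0.8, 1e-6

F = lambda t: 1.0 - math.exp(-lam * t)          # exponential CDF
pdf_exact = lam * math.exp(-lam * x)            # exponential PDF
pdf_numeric = (F(x + h) - F(x - h)) / (2 * h)   # central difference

print(round(pdf_exact, 6), round(pdf_numeric, 6))
```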
Normal Distribution CDF:
The normal CDF doesn't have a simple closed form, so we use tables or technology to find values.
Transformation Techniques
Sometimes, students, we need to find the distribution of a transformed random variable. If $X$ has a known distribution and we want to find the distribution of $Y = g(X)$ for some function $g$, we can use transformation techniques.
Linear Transformations: If $Y = aX + b$ where $a$ and $b$ are constants:
- If $X$ is uniform on $[c, d]$, then $Y$ is uniform on $[ac + b, ad + b]$ (assuming $a > 0$)
- If $X$ is exponential with parameter $\lambda$, then $Y = aX$ is exponential with parameter $\frac{\lambda}{a}$ (for $a > 0$; adding a shift $b \neq 0$ produces a shifted distribution that is no longer exponential, since an exponential variable must start at 0)
- If $X$ is normal with mean $\mu$ and standard deviation $\sigma$, then $Y$ is normal with mean $a\mu + b$ and standard deviation $|a|\sigma$
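The normal case can be confirmed by simulation. A hedged sketch with illustrative values: $X \sim N(50, 10)$ and $Y = 2X - 30$, which should give mean $a\mu + b = 70$ and standard deviation $|a|\sigma = 20$.

```python
import random

# Simulate Y = aX + b for normal X and check the transformed mean and sd.
# a = 2, b = -30, mu = 50, sigma = 10 are assumed illustrative values.
random.seed(1)
a, b_shift, mu, sigma = 2.0, -30.0, 50.0, 10.0
ys = [a * random.gauss(mu, sigma) + b_shift for _ in range(200_000)]

mean_y = sum(ys) / len(ys)
var_y = sum((y - mean_y) ** 2 for y in ys) / len(ys)
sd_y = var_y ** 0.5

print(round(mean_y, 1), round(sd_y, 1))   # near 70.0 and 20.0
```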
The Jacobian Method: For more complex transformations where $g$ is monotonic (and therefore invertible), we use:
$$f_Y(y) = f_X(g^{-1}(y)) \left|\frac{d}{dy}g^{-1}(y)\right|$$
This is particularly useful when dealing with non-linear transformations 🔄
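A concrete instance of the Jacobian method: take $X \sim$ Uniform$(0, 1)$ and $Y = -\ln(X)/\lambda$. Then $g^{-1}(y) = e^{-\lambda y}$, $\left|\frac{d}{dy}g^{-1}(y)\right| = \lambda e^{-\lambda y}$, and since $f_X = 1$ on $(0, 1)$, the formula gives $f_Y(y) = \lambda e^{-\lambda y}$: $Y$ is exponential. The sketch below checks the resulting CDF by simulation ($\lambda = 2$ and $t = 0.7$ are illustrative values).

```python
import math
import random

# Transform uniform samples via Y = -ln(X)/lam and compare the empirical
# P(Y <= t) with the exponential CDF 1 - exp(-lam*t), as the Jacobian
# method predicts. lam = 2 and t = 0.7 are assumed values.
random.seed(0)
lam, t = 2.0, 0.7
# 1 - random.random() lies in (0, 1], so log() is always defined.
ys = [-math.log(1.0 - random.random()) / lam for _ in range(200_000)]

empirical = sum(1 for y in ys if y <= t) / len(ys)
theoretical = 1.0 - math.exp(-lam * t)   # exponential CDF at t

print(round(empirical, 3), round(theoretical, 3))
```

This transformation is also the basis of inverse transform sampling, which is how many libraries generate exponential random numbers from uniform ones.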
Conclusion
Students, you've now mastered the fundamentals of continuous distributions! We've explored how uniform distributions model equal probability scenarios, exponential distributions handle waiting times and reliability, and normal distributions describe countless natural phenomena. You've learned to work with probability density functions, calculate cumulative probabilities, and apply transformation techniques to find distributions of modified random variables. These concepts form the foundation for advanced statistical analysis and appear frequently in real-world applications from engineering to finance to biology.
Study Notes
- Continuous Random Variables: Can take any value in a range; probability of exact values is zero
- PDF Properties: $f(x) \geq 0$, $\int_{-\infty}^{\infty} f(x)dx = 1$, $P(a \leq X \leq b) = \int_{a}^{b} f(x)dx$
- Uniform Distribution: $f(x) = \frac{1}{b-a}$ on $[a,b]$, Mean = $\frac{a+b}{2}$, Variance = $\frac{(b-a)^2}{12}$
- Exponential Distribution: $f(x) = \lambda e^{-\lambda x}$, Mean = $\frac{1}{\lambda}$, Variance = $\frac{1}{\lambda^2}$
- Normal Distribution: $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, Bell-shaped, symmetric
- Empirical Rule: 68% within 1σ, 95% within 2σ, 99.7% within 3σ of mean
- CDF Definition: $F(x) = P(X \leq x) = \int_{-\infty}^{x} f(t)dt$
- PDF-CDF Relationship: $f(x) = \frac{d}{dx}F(x)$
- Linear Transformation: If $Y = aX + b$, then $E[Y] = aE[X] + b$, $Var(Y) = a^2Var(X)$
- Exponential Memoryless Property: $P(X > s + t | X > s) = P(X > t)$
