3. Probability Distributions

Discrete Distributions

Explore Bernoulli, Binomial, Poisson, and geometric distributions including PMFs, means, variances, and use cases.

Hey students! 👋 Today we're diving into one of the most fascinating areas of statistics: discrete distributions. Think of these as mathematical tools that help us predict outcomes when dealing with countable events - like flipping coins, counting defective products, or tracking customer arrivals. By the end of this lesson, you'll understand four major discrete distributions, know their formulas, and see how they apply to real-world situations that you encounter every day!

Understanding Discrete Distributions 📊

A discrete distribution describes the probability of different outcomes for events that can only take specific, countable values. Unlike continuous distributions (which deal with measurements like height or weight), discrete distributions work with whole numbers - you can't flip 2.5 coins or have 3.7 customers!

The key concept here is the Probability Mass Function (PMF), which tells us the probability of each possible outcome. Think of it as a recipe that assigns probabilities to each possible value our random variable can take.

For any discrete distribution, two important rules always apply:

  • Each probability must be between 0 and 1
  • All probabilities must sum to exactly 1

Real-world applications are everywhere! A streaming service might model how many episodes you'll binge-watch, a retailer might predict daily package-delivery counts, and your favorite mobile game likely uses these same mathematical tools to set reward probabilities.

Bernoulli Distribution: The Foundation 🎯

The Bernoulli distribution is the simplest discrete distribution, modeling experiments with exactly two outcomes: success or failure. Named after Swiss mathematician Jacob Bernoulli, this distribution forms the building block for more complex distributions.

Key Characteristics:

  • Only two possible outcomes: 1 (success) or 0 (failure)
  • Single parameter: $p$ (probability of success)
  • PMF: $P(X = k) = p^k(1-p)^{1-k}$ where $k \in \{0,1\}$

Mean and Variance:

  • Mean: $E[X] = p$
  • Variance: $Var(X) = p(1-p)$

Think about flipping a coin once - that's a Bernoulli trial! If we define "heads" as success, then $p = 0.5$. The mean tells us that over many single flips, we'd expect to get heads 50% of the time on average.

Real-world examples include: clicking on an online ad (click or no click), a medical test result (positive or negative), or whether a student passes an exam (pass or fail). Typical click-through rates for online ads are only a few percent at most, so an advertising Bernoulli model might use a value like $p = 0.02$.
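The ad-click example above can be sketched directly from the PMF, mean, and variance formulas; this is a minimal stdlib-only implementation, with $p = 0.02$ taken from the illustrative click-through figure:

```python
def bernoulli_pmf(k: int, p: float) -> float:
    """P(X = k) = p^k (1-p)^(1-k) for k in {0, 1}."""
    assert k in (0, 1)
    return p**k * (1 - p) ** (1 - k)

p = 0.02                               # assumed ad click-through rate
print(bernoulli_pmf(1, p))             # probability of a click: 0.02
print(bernoulli_pmf(0, p))             # probability of no click: 0.98
print(p)                               # mean E[X] = p
print(round(p * (1 - p), 4))           # variance p(1-p) = 0.0196
```

Note that the two PMF values always sum to 1, as required of any discrete distribution.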

Binomial Distribution: Multiple Trials 🎲

The binomial distribution extends the Bernoulli concept to multiple independent trials. It answers the question: "If I repeat a Bernoulli experiment $n$ times, what's the probability of getting exactly $k$ successes?"

Key Characteristics:

  • Fixed number of trials: $n$
  • Each trial is independent with same success probability $p$
  • We count the total number of successes
  • PMF: $P(X = k) = \binom{n}{k}p^k(1-p)^{n-k}$

Mean and Variance:

  • Mean: $E[X] = np$
  • Variance: $Var(X) = np(1-p)$

The binomial coefficient $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ represents the number of ways to choose $k$ successes from $n$ trials.

Consider quality control in manufacturing: if a factory produces items with a 5% defect rate, and you inspect 20 items, what's the probability of finding exactly 2 defective ones? Using the binomial distribution with $n = 20$, $p = 0.05$, and $k = 2$, we can calculate this precisely.

Another common example is standardized testing. If each multiple-choice question has 4 options and you guess randomly, $p = 0.25$. For a 10-question quiz, the expected number of correct answers would be $np = 10 \times 0.25 = 2.5$ questions.
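The quality-control calculation above can be carried out exactly; here is a sketch using the binomial PMF with $n = 20$, $p = 0.05$, $k = 2$ from the factory example:

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) = C(n, k) p^k (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability of exactly 2 defectives among 20 items at a 5% defect rate.
prob = binomial_pmf(2, 20, 0.05)
print(round(prob, 4))                  # 0.1887

# Expected correct answers when guessing a 10-question, 4-option quiz.
print(10 * 0.25)                       # mean np = 2.5
```

So even with a 5% defect rate, there is nearly a 19% chance of seeing exactly two defectives in a batch of 20.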

Poisson Distribution: Rare Events Over Time ⏰

The Poisson distribution models the number of events occurring in a fixed interval when these events happen independently at a constant average rate. Named after French mathematician SimΓ©on Denis Poisson, it's perfect for counting rare events.

Key Characteristics:

  • Events occur independently
  • Average rate $\lambda$ (lambda) remains constant
  • Counts events in fixed intervals (time, space, etc.)
  • PMF: $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$

Mean and Variance:

  • Mean: $E[X] = \lambda$
  • Variance: $Var(X) = \lambda$

Notice something cool? For Poisson distributions, the mean equals the variance!

Real-world applications include: email arrivals per hour, car accidents at an intersection per month, or customer calls to a service center per day. For example, if a coffee shop averages 15 customers per hour ($\lambda = 15$), we can calculate the probability of serving exactly 20 customers in any given hour.

Industry estimates often put the average number of emails a person receives at roughly 120 per day. Using a Poisson model with $\lambda = 121/24 \approx 5$ emails per hour, we could predict email patterns throughout the day.

The Poisson distribution also appears in unexpected places: the number of chocolate chips in cookies, radioactive decay events, and even the distribution of goals in soccer matches!

Geometric Distribution: Waiting for Success 🎯

The geometric distribution models the number of trials needed to achieve the first success in a sequence of independent Bernoulli trials. It answers: "How long do I have to wait for my first success?"

Key Characteristics:

  • Counts trials until first success (including the success)
  • Each trial is independent with success probability $p$
  • PMF: $P(X = k) = (1-p)^{k-1}p$ for $k = 1, 2, 3, ...$

Mean and Variance:

  • Mean: $E[X] = \frac{1}{p}$
  • Variance: $Var(X) = \frac{1-p}{p^2}$

The geometric distribution has a fascinating "memoryless" property: however many failures you have already seen, the distribution of the remaining wait for the first success is unchanged. Formally, $P(X > m + n \mid X > m) = P(X > n)$.

Consider job hunting: if each application has a 10% chance of success ($p = 0.1$), the expected number of applications needed is $\frac{1}{0.1} = 10$. This doesn't mean you'll definitely get a job on the 10th application, but on average, that's how many you'd expect to submit.

Another example is rolling dice until you get a six. With $p = \frac{1}{6}$, you'd expect to roll $\frac{1}{1/6} = 6$ times on average. Gaming companies use geometric distributions to model how many attempts players need to obtain rare items, helping them balance game difficulty and player engagement.
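The dice example and the memoryless property can both be verified from the formulas; this is a minimal sketch with $p = 1/6$ (rolling until the first six):

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(X = k) = (1-p)^(k-1) p, for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p

def geometric_sf(k: int, p: float) -> float:
    """P(X > k) = (1-p)^k: the first success takes more than k trials."""
    return (1 - p) ** k

p = 1 / 6
print(1 / p)   # expected rolls until a six: 6.0

# Memoryless property: P(X > m+n | X > m) equals P(X > n).
m, n = 4, 3
cond = geometric_sf(m + n, p) / geometric_sf(m, p)
assert abs(cond - geometric_sf(n, p)) < 1e-12
```

Having already rolled 4 times without a six, the chance of needing more than 3 further rolls is exactly the same as it was at the start.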

Conclusion 🎉

Discrete distributions are powerful tools that help us understand and predict countable events in our daily lives. The Bernoulli distribution handles single yes/no situations, while the binomial extends this to multiple trials. The Poisson distribution excels at modeling rare events over time, and the geometric distribution tells us how long we might wait for our first success. Each distribution has its unique PMF, mean, and variance formulas that make calculations precise and reliable. Understanding these distributions gives you the mathematical foundation to analyze everything from manufacturing quality to social media engagement, making you better equipped to interpret data and make informed decisions in our increasingly data-driven world!

Study Notes

β€’ Discrete Distribution: Probability distribution for countable outcomes (whole numbers only)

β€’ Probability Mass Function (PMF): Function that gives probability of each possible value

β€’ Bernoulli Distribution: Models single trial with two outcomes

  • PMF: $P(X = k) = p^k(1-p)^{1-k}$, $k \in \{0,1\}$
  • Mean: $E[X] = p$
  • Variance: $Var(X) = p(1-p)$

β€’ Binomial Distribution: Models number of successes in $n$ independent trials

  • PMF: $P(X = k) = \binom{n}{k}p^k(1-p)^{n-k}$
  • Mean: $E[X] = np$
  • Variance: $Var(X) = np(1-p)$

β€’ Poisson Distribution: Models rare events occurring at constant rate $\lambda$

  • PMF: $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$
  • Mean: $E[X] = \lambda$
  • Variance: $Var(X) = \lambda$

β€’ Geometric Distribution: Models number of trials until first success

  • PMF: $P(X = k) = (1-p)^{k-1}p$
  • Mean: $E[X] = \frac{1}{p}$
  • Variance: $Var(X) = \frac{1-p}{p^2}$

β€’ Key Properties: All probabilities sum to 1, each probability between 0 and 1

β€’ Applications: Quality control, customer service, gaming, manufacturing, email patterns, job hunting
