Discrete Distributions
Hey students! 👋 Welcome to one of the most exciting topics in actuarial science - discrete distributions! In this lesson, you'll discover how mathematicians and actuaries use special probability models to predict everything from insurance claims to lottery outcomes. By the end of this lesson, you'll understand four fundamental discrete distributions and know exactly when and how to use each one in real-world scenarios. Get ready to unlock the mathematical tools that help insurance companies calculate premiums and assess risks! 🎯
Understanding Discrete Distributions
Before we dive into specific distributions, let's understand what makes a distribution "discrete." Students, imagine you're counting things that come in whole numbers - like the number of car accidents in a day, defective products in a batch, or customers entering a store. These are all discrete events because you can't have 2.5 accidents or 3.7 customers!
A discrete probability distribution describes the likelihood of each possible outcome when dealing with countable events. Unlike continuous distributions (which we'll study later), discrete distributions only assign probabilities to specific, separate values.
In actuarial science, discrete distributions are incredibly valuable because many insurance-related events are naturally discrete. Think about it - an insurance policy either results in a claim or it doesn't, a person either experiences an accident or they don't, and a building either floods or it doesn't. These yes/no, count-based scenarios are perfect for discrete distribution modeling.
The Bernoulli Distribution: The Foundation of Success and Failure
Let's start with the simplest discrete distribution - the Bernoulli distribution. Named after Swiss mathematician Jacob Bernoulli, this distribution models situations with exactly two possible outcomes: success or failure, yes or no, heads or tails.
The Bernoulli distribution has just one parameter, p, which represents the probability of success. The probability of failure is simply 1-p. We can write this mathematically as:
$$P(X = 1) = p$$
$$P(X = 0) = 1-p$$
Where X = 1 represents success and X = 0 represents failure.
Real-World Example: Students, imagine you work for an auto insurance company, and you want to model whether a randomly selected driver will file a claim this year. If historical data shows that 15% of drivers file claims annually, then p = 0.15. For any individual driver, there's a 15% chance they'll file a claim (success) and an 85% chance they won't (failure).
The mean of a Bernoulli distribution is simply p, and the variance is p(1-p). This makes sense - if p = 0.5 (like flipping a fair coin), the variance is maximized at 0.25, indicating the highest uncertainty about the outcome.
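The claim-filing example above can be sketched in a few lines of Python. This is a minimal illustration using only the formulas from the lesson; the function name `bernoulli_pmf` and the p = 0.15 figure are just the lesson's running example, not a real insurer's data.

```python
def bernoulli_pmf(x: int, p: float) -> float:
    """P(X = x) for a Bernoulli(p) variable, with x in {0, 1}."""
    return p if x == 1 else 1 - p

p = 0.15                 # probability a driver files a claim this year
mean = p                 # E[X] = p
variance = p * (1 - p)   # Var(X) = p(1 - p)

print(bernoulli_pmf(1, p))   # probability of a claim (success)
print(bernoulli_pmf(0, p))   # probability of no claim (failure)
print(mean, variance)
```

Note how the variance p(1 - p) peaks at p = 0.5, matching the coin-flip intuition in the text.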
The Binomial Distribution: Counting Successes
Now, students, what happens when you repeat a Bernoulli trial multiple times? That's where the binomial distribution comes in! This distribution counts the number of successes in a fixed number of independent trials, each with the same probability of success.
The binomial distribution has two parameters:
- n: the number of trials
- p: the probability of success in each trial
The probability of getting exactly k successes in n trials is:
$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$
Where $\binom{n}{k}$ is the binomial coefficient, calculated as $\frac{n!}{k!(n-k)!}$.
Real-World Example: Let's say you're analyzing insurance claims for a company that insures 1,000 smartphones against theft. If the probability that any individual phone gets stolen in a year is 0.03 (3%), you can use the binomial distribution to calculate the probability of exactly 25 phones being stolen. Here, n = 1000, p = 0.03, and k = 25.
The mean of a binomial distribution is np, and the variance is np(1-p). In our smartphone example, you'd expect about 30 thefts per year (1000 × 0.03), with a variance of 29.1.
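The smartphone-theft example translates directly into code. The sketch below implements the binomial formula from the lesson with Python's built-in `math.comb`; the specific figures (n = 1000, p = 0.03, k = 25) are the lesson's illustrative numbers.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) = C(n, k) * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 1000, 0.03
print(binomial_pmf(25, n, p))       # probability of exactly 25 thefts
print(n * p, n * p * (1 - p))       # mean 30.0, variance 29.1
```

Summing `binomial_pmf(k, n, p)` over k = 0..n returns 1, which is a quick sanity check that a pmf is implemented correctly.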
Binomial distributions are commonly used in insurance to model the number of claims in a portfolio of independent policies, making this one of the most practical distributions for risk assessment.
The Poisson Distribution: Modeling Rare Events
The Poisson distribution is perfect for modeling rare events that occur over a fixed period or in a fixed space. Named after French mathematician Siméon Denis Poisson, this distribution is incredibly useful in actuarial science because many insurance events are relatively rare.
The Poisson distribution has just one parameter, λ (lambda), which represents both the mean and variance of the distribution. The probability of observing exactly k events is:
$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$$
Real-World Example: Students, consider modeling the number of major earthquakes (magnitude 7.0+) in California per year. Historical data shows an average of 2.1 such earthquakes annually. Using a Poisson distribution with λ = 2.1, insurance companies can calculate the probability of 0, 1, 2, or more major earthquakes in any given year.
The Poisson distribution has a fascinating property: as n increases and p decreases such that np remains constant, the binomial distribution approaches the Poisson distribution. This makes it an excellent approximation for binomial distributions when n is large and p is small (typically when n ≥ 20 and p ≤ 0.05).
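Both the earthquake example and the binomial-to-Poisson approximation can be checked numerically. The sketch below uses only the formulas stated in the lesson; the comparison values (n = 1000, p = 0.003, so np = 3) are illustrative choices for a "large n, small p" scenario.

```python
from math import comb, exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) = lam^k * e^(-lam) / k!."""
    return lam**k * exp(-lam) / factorial(k)

def binomial_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Earthquake example: lam = 2.1 major quakes per year.
print(poisson_pmf(0, 2.1))          # probability of a quake-free year

# Approximation: n large, p small, np held constant at lam = 3.
n, p = 1000, 0.003
for k in range(5):
    print(k, binomial_pmf(k, n, p), poisson_pmf(k, 3.0))
```

Printing the two columns side by side shows the binomial and Poisson probabilities agreeing to several decimal places, which is exactly the limiting behavior described above.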
Insurance companies frequently use Poisson distributions to model catastrophic events, equipment failures, and other low-probability, high-impact scenarios, and Poisson models are a standard starting point for claim-count modeling in property insurance.
The Geometric Distribution: Waiting for the First Success
The geometric distribution models a different question: "How many trials do we need until we get our first success?" This distribution is particularly useful when you're interested in waiting times or the number of attempts needed to achieve a goal.
The geometric distribution has one parameter, p, representing the probability of success on each trial. The probability that the first success occurs on the kth trial is:
$$P(X = k) = (1-p)^{k-1} p$$
Real-World Example: Imagine you're working for a life insurance company, and you want to model how many people you need to approach before finding someone willing to purchase a policy. If your success rate is 20% (p = 0.20), the geometric distribution can tell you the probability that your first sale will be the 3rd person you approach, the 5th person, and so on.
The mean of a geometric distribution is $\frac{1}{p}$, and the variance is $\frac{1-p}{p^2}$. In our insurance sales example, you'd expect to make your first sale after approaching 5 people on average (1/0.20 = 5).
The geometric distribution has a unique "memoryless" property - the probability of success on the next trial is always the same, regardless of how many failures you've already experienced. This makes it perfect for modeling situations where past failures don't affect future success probabilities.
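The sales example and the memoryless property can both be verified with a short sketch. The tail formula P(X > n) = (1-p)^n used below follows from the pmf (it is the probability of n consecutive failures); the figures p = 0.20, m = 4, n = 3 are illustrative.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(X = k) = (1-p)^(k-1) * p -- first success on trial k."""
    return (1 - p) ** (k - 1) * p

p = 0.20
print(geometric_pmf(3, p))   # first sale is the 3rd prospect: 0.8^2 * 0.2 = 0.128
print(1 / p)                 # expected prospects until the first sale: 5.0

# Memoryless check: P(X > m + n | X > m) == P(X > n),
# where P(X > n) = (1-p)^n is the probability of n straight failures.
def tail(n: int, p: float) -> float:
    return (1 - p) ** n

m, n = 4, 3
print(tail(m + n, p) / tail(m, p))   # conditional tail probability
print(tail(n, p))                    # unconditional tail probability (equal)
```

The last two printed values match: having already failed m times changes nothing about the wait still ahead, which is the memoryless property in action.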
Conclusion
Students, you've now mastered four fundamental discrete distributions that form the backbone of actuarial modeling! The Bernoulli distribution handles simple success/failure scenarios, while the binomial distribution counts successes in multiple trials. The Poisson distribution excels at modeling rare events, and the geometric distribution tracks waiting times until success. Each distribution serves specific purposes in insurance and risk assessment, giving actuaries powerful tools to quantify uncertainty and calculate appropriate premiums. Understanding these distributions will serve as your foundation for more advanced actuarial concepts! 🚀
Study Notes
• Bernoulli Distribution: Models single trial with two outcomes (success/failure)
- Parameter: p (probability of success)
- Mean: p, Variance: p(1-p)
- Used for: Individual policy claims, single event modeling
• Binomial Distribution: Counts successes in n independent trials
- Parameters: n (trials), p (success probability)
- Formula: $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$
- Mean: np, Variance: np(1-p)
- Used for: Multiple policy claims, portfolio analysis
• Poisson Distribution: Models rare events over fixed time/space
- Parameter: λ (average rate, also mean and variance)
- Formula: $P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$
- Used for: Catastrophic events, equipment failures
- Approximates binomial when n large, p small (np = λ)
• Geometric Distribution: Models trials until first success
- Parameter: p (success probability per trial)
- Formula: $P(X = k) = (1-p)^{k-1} p$
- Mean: $\frac{1}{p}$, Variance: $\frac{1-p}{p^2}$
- Memoryless property: past failures don't affect future probabilities
- Used for: Waiting times, sales processes, time to first claim
