Summer 2024 Chapter 7

Welcome to Chapter 7: Discrete Probability Distributions! In this section of our summer course, we are moving from simply describing data to building models that help us predict future outcomes. This is where statistics starts to feel like a crystal ball, allowing us to calculate the likelihood of specific events occurring.

1. Random Variables: Discrete vs. Continuous

Everything starts with a Random Variable (usually denoted $X$, with $x$ standing for a particular value it can take), which assigns a number to each outcome of a random process. Before we calculate probabilities, we must determine what type of variable we are dealing with:

Discrete Random Variables: These are values you can count. They usually involve whole numbers.
Examples: The number of pages in a textbook, the number of customers entering a store, or the number of tails in a series of coin tosses.
Continuous Random Variables: These are values you measure on a continuous scale.
Examples: The amount of electricity used, the time spent on a phone, or the height of a student.

Note: In Chapter 7, we focus exclusively on Discrete variables.

2. The Discrete Probability Distribution

A probability distribution lists all possible values of the random variable alongside their associated probabilities. For a distribution to be valid, it must meet two strict rules:

The sum of all probabilities must equal 1: $\sum P(x) = 1$
Each individual probability must be between 0 and 1, inclusive: $0 \le P(x) \le 1$
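
As a quick illustration, the two rules above can be checked in a few lines of Python. This is only a sketch: the function name `is_valid_distribution`, the small rounding tolerance, and the two-coin-toss table are assumptions made for the example, not something from the class notes.

```python
import math

def is_valid_distribution(probs, tol=1e-9):
    """Check both rules for a list of probabilities P(x)."""
    probs = list(probs)
    # Rule: every probability lies between 0 and 1, inclusive
    each_in_range = all(0 <= p <= 1 for p in probs)
    # Rule: the probabilities sum to 1 (allowing for rounding error)
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return each_in_range and sums_to_one

# Hypothetical table: number of tails in two fair coin tosses
tails = {0: 0.25, 1: 0.50, 2: 0.25}
print(is_valid_distribution(tails.values()))  # True
print(is_valid_distribution([0.5, 0.6]))      # False (sums to 1.1)
```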

Expected Value and Standard Deviation

We can find the "long-term average" of a random variable, known as the Expected Value ($E(X)$ or $\mu$). This is crucial for decision-making, such as comparing investment alternatives (as seen in the class notes).

Mean (Expected Value): $\mu = \sum [x \cdot P(x)]$
Variance: $\sigma^2 = \sum [(x-\mu)^2 \cdot P(x)]$
Standard Deviation: $\sigma = \sqrt{\sigma^2}$
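
To see the three formulas in action, here is a minimal Python sketch that applies them to a small table. The helper name `discrete_stats` and the two-coin-toss distribution are assumptions chosen for illustration.

```python
import math

def discrete_stats(dist):
    """dist maps each value x to P(x); returns (mean, variance, std dev)."""
    mu = sum(x * p for x, p in dist.items())                # sum of x * P(x)
    var = sum((x - mu) ** 2 * p for x, p in dist.items())   # sum of (x - mu)^2 * P(x)
    return mu, var, math.sqrt(var)

# Hypothetical table: number of tails in two fair coin tosses
tails = {0: 0.25, 1: 0.50, 2: 0.25}
mu, var, sigma = discrete_stats(tails)
print(mu, var, round(sigma, 4))  # 1.0 0.5 0.7071
```

Here the expected value of 1 tail matches intuition: in two fair tosses, we expect one tail on average over the long run.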

3. The Binomial Distribution

One of the most important distributions we cover is the Binomial Distribution. It applies when each trial has exactly two possible outcomes (Success or Failure). To use the Binomial formulas, an experiment must satisfy these four conditions:

There is a fixed number of trials ($n$).
The trials are identical and independent.
There are only two outcomes: Success (with probability $p$) and Failure (with probability $q = 1-p$).
The probability of success ($p$) remains constant for each trial.

The Binomial Probability Formula:
To find the probability of exactly $x$ successes in $n$ trials, we use:

$$P(X=x) = {}_nC_x \cdot p^x \cdot (1-p)^{n-x}$$
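
The formula translates almost directly into Python. The sketch below assumes Python 3.8 or later, where `math.comb(n, x)` computes ${}_nC_x$; the function name `binomial_pmf` and the coin-toss numbers are illustrative choices.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Hypothetical check: exactly 2 heads in 4 fair tosses = 6 * (0.5)^4
print(binomial_pmf(2, 4, 0.5))  # 0.375
```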

Shortcuts for Binomial Statistics:
Luckily, calculating the mean and standard deviation for a Binomial distribution is much simpler than the general discrete formula:

Mean: $\mu = np$
Standard Deviation: $\sigma = \sqrt{np(1-p)}$
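
If you want to convince yourself that the shortcuts agree with the general discrete formulas, a numerical check along these lines can help. The values $n = 10$ and $p = 0.3$ are arbitrary assumptions for the demonstration.

```python
from math import comb, sqrt, isclose

n, p = 10, 0.3  # hypothetical values, chosen only for this check

# Build the full binomial table P(x) for x = 0, 1, ..., n
pmf = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

# General discrete formulas
mu_general = sum(x * px for x, px in pmf.items())
var_general = sum((x - mu_general) ** 2 * px for x, px in pmf.items())

# Binomial shortcuts
mu_shortcut = n * p
sigma_shortcut = sqrt(n * p * (1 - p))

print(isclose(mu_general, mu_shortcut))             # True
print(isclose(sqrt(var_general), sigma_shortcut))   # True
```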

Real-World Example: Do You Believe in Ghosts?

As discussed in the class notes, if $40\%$ of Americans believe in ghosts ($p=0.40$), and we randomly select 20 people ($n=20$), we can calculate the expected number of believers easily:

$\mu = 20(0.40) = 8$ people.

We can also calculate the standard deviation to see how much the number of believers typically varies from one sample of 20 to the next:

$\sigma = \sqrt{20(0.40)(0.60)} \approx 2.19$
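
The same example can be reproduced in Python, including a cumulative probability such as $P(X \ge 12)$, which is tedious by hand because it adds up nine separate binomial terms. This is a sketch rather than a replacement for the course calculators; the helper name `binomial_pmf` is an assumption, and `math.comb` requires Python 3.8 or later.

```python
from math import comb, sqrt

n, p = 20, 0.40  # 20 people sampled, 40% believe in ghosts

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

mu = n * p
sigma = sqrt(n * p * (1 - p))

# Cumulative probability: add P(X = x) for x = 12, 13, ..., 20
p_at_least_12 = sum(binomial_pmf(x, n, p) for x in range(12, n + 1))

print(f"mean = {mu:.2f}, sd = {sigma:.2f}")  # mean = 8.00, sd = 2.19
print(f"P(X >= 12) = {p_at_least_12:.4f}")
```

Since 12 believers is almost two standard deviations above the mean of 8, the cumulative probability comes out small.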

Make sure to use the Statistics Calculators linked above for these calculations, especially when solving for cumulative probabilities (like $P(x \ge 12)$). Keep practicing those problem sets, and I'll see you in the next lecture!