Scatter Diagrams and Correlation
Explore the relationship between two quantitative variables using scatter plots and measure the strength of linear associations with the correlation coefficient.
Visualizing Relationships
Explanatory Variable
The explanatory variable (or independent variable) is the variable we believe may explain or influence changes in another variable.
Response Variable
The response variable (or dependent variable) is the variable we measure to see how it responds to changes in the explanatory variable.
LogicLens: Reading a Scatter Plot
When examining a scatter diagram, look for two key features:
Direction
- Positive: As x increases, y tends to increase
- Negative: As x increases, y tends to decrease
- None: No clear pattern
Form
- Linear: Points follow a straight-line pattern
- Nonlinear: Points follow a curved pattern
The Linear Correlation Coefficient (r)
Definition
The linear correlation coefficient (r) is a numerical measure of the strength and direction of the linear relationship between two quantitative variables.
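Sample Linear Correlation Coefficient Formula
$$ r = \frac{\sum \left( \frac{x_i - \bar{x}}{s_x} \right) \left( \frac{y_i - \bar{y}}{s_y} \right)}{n - 1} $$
Here $\bar{x}$ and $s_x$ are the sample mean and sample standard deviation of the explanatory variable, $\bar{y}$ and $s_y$ are the sample mean and sample standard deviation of the response variable, and $n$ is the number of data pairs.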
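As a minimal sketch, the sample correlation coefficient can be computed by converting each observation to a z-score, multiplying the paired z-scores, summing the products, and dividing by n − 1. The function name `correlation` and the `hours` and `scores` data are hypothetical, chosen only for illustration.

```python
from math import sqrt


def correlation(xs, ys):
    """Sample linear correlation coefficient r, computed from z-scores."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sample standard deviations (divide by n - 1, not n).
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when a variable has no variation")

    # Sum of the products of paired z-scores, divided by n - 1.
    z_products = sum(
        ((x - mean_x) / s_x) * ((y - mean_y) / s_y) for x, y in zip(xs, ys)
    )
    return z_products / (n - 1)


# Hypothetical data: hours studied and exam scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 71, 75, 88]
print(round(correlation(hours, scores), 3))
```

For this made-up data set the result is close to +1, which matches the nearly straight, upward-sloping pattern of the points.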
LogicLens: Critical Properties of r
- r is always between −1 and +1, inclusive
- r is unitless: it carries no units (such as meters or dollars)
- r = +1 or r = −1 indicates a perfect linear relationship
- r is NOT resistant: it is sensitive to outliers!
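Two of these properties are easy to check numerically. The sketch below uses hypothetical height and weight data and a small helper, `pearson_r` (a name introduced here), to show that changing units leaves r unchanged while a single unusual point can change it a great deal.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Compact helper: r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    s_xx = sum((x - mx) ** 2 for x in xs)
    s_yy = sum((y - my) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical data: heights in meters and weights in kilograms.
heights_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weights_kg = [52, 58, 69, 75, 84]
r_metric = pearson_r(heights_m, weights_kg)

# Unitless: converting to centimeters and pounds leaves r unchanged.
heights_cm = [h * 100 for h in heights_m]
weights_lb = [w * 2.20462 for w in weights_kg]
r_converted = pearson_r(heights_cm, weights_lb)

# Not resistant: adding one unusual point (tall but very light) pulls r down.
r_with_outlier = pearson_r(heights_m + [1.95], weights_kg + [40])

print(f"metric units:    r = {r_metric:.3f}")
print(f"converted units: r = {r_converted:.3f}")
print(f"with an outlier: r = {r_with_outlier:.3f}")
```

With these made-up numbers, r stays near +1 under the unit change but falls below 0.5 once the outlier is included.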
| Absolute Value of r | Interpretation |
|---|---|
| 0.90 − 1.00 | Very Strong |
| 0.70 − 0.89 | Strong |
| 0.50 − 0.69 | Moderate |
| 0.30 − 0.49 | Weak |
| 0.00 − 0.29 | Very Weak / None |
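The table can be turned into a small lookup, sketched below. The function name `describe_strength` is hypothetical, and because the table leaves small gaps between ranges (for example between 0.89 and 0.90), the sketch assumes each label applies from its lower bound up to the next lower bound.

```python
def describe_strength(r):
    """Map a correlation coefficient to the table's descriptive label."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")

    magnitude = abs(r)  # strength depends on |r|, not on the sign
    if magnitude >= 0.90:
        return "Very Strong"
    if magnitude >= 0.70:
        return "Strong"
    if magnitude >= 0.50:
        return "Moderate"
    if magnitude >= 0.30:
        return "Weak"
    return "Very Weak / None"


for value in (0.95, -0.75, 0.42, -0.10):
    direction = "positive" if value > 0 else "negative"
    print(f"r = {value:+.2f}: {describe_strength(value)} ({direction})")
```

The sign of r gives the direction of the association; only the magnitude determines the strength label.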
Determining Linearity
To formally test whether a linear relationship exists, we compare the absolute value of our calculated correlation coefficient, |r|, to a critical value from a statistical table based on the sample size n.
LogicLens: The Linearity Test
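✓ If |r| is greater than the critical value: a linear relation EXISTS
✗ If |r| is less than or equal to the critical value: NO linear relation can be concluded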
Example Critical Values
| n (sample size) | 5 | 10 | 15 | 20 | 25 |
|---|---|---|---|---|---|
| Critical Value | 0.878 | 0.632 | 0.514 | 0.444 | 0.396 |
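The comparison can be sketched in a few lines of Python. The dictionary below holds only the example critical values from the table; the names `CRITICAL_VALUES` and `linear_relation_exists` are hypothetical, and the sketch deliberately refuses sample sizes that the table does not list rather than guessing a value.

```python
# Example critical values from the table above, keyed by sample size n.
CRITICAL_VALUES = {5: 0.878, 10: 0.632, 15: 0.514, 20: 0.444, 25: 0.396}


def linear_relation_exists(r, n):
    """Return True when |r| exceeds the critical value for sample size n."""
    if n not in CRITICAL_VALUES:
        raise ValueError(f"no critical value listed for n = {n}")
    return abs(r) > CRITICAL_VALUES[n]


# The same r can lead to different conclusions at different sample sizes.
print(linear_relation_exists(0.70, 10))   # 0.70 > 0.632
print(linear_relation_exists(0.70, 5))    # 0.70 < 0.878
print(linear_relation_exists(-0.50, 20))  # |-0.50| > 0.444
```

The critical value shrinks as n grows, so a weaker correlation is enough to conclude a linear relation when more data pairs are available.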
Correlation vs. Causation
Critical Warning
Correlation does NOT imply Causation!
Just because two variables are strongly correlated does NOT mean one causes the other.
LogicLens: The Lurking Variable
A lurking variable is a variable that is not included in the study but affects both the explanatory and response variables, creating a false appearance of a direct relationship.
Classic Example: Ice Cream & Shark Attacks
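Ice cream sales and shark attacks are positively correlated, but neither causes the other. The lurking variable is hot weather, which causes both increased ice cream consumption AND more people swimming (leading to more shark encounters).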
Another Example: Fire Trucks & Damage
There's a strong positive correlation between the number of fire trucks at a fire and the amount of property damage. Does this mean fire trucks cause damage?
No! The lurking variable is the size of the fire. Larger fires require more trucks AND cause more damage.
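The fire truck example can be illustrated with a small, entirely made-up data set. In the sketch below, each record has a fire size category, a truck count, and a damage amount; the numbers are constructed so that size drives both variables while trucks and damage are unrelated among fires of the same size. The helper `pearson_r` and all data values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Compact helper: r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    s_xx = sum((x - mx) ** 2 for x in xs)
    s_yy = sum((y - my) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical records: (fire size category, trucks sent, damage in $1000s).
fires = [
    (1, 3, 11), (1, 4, 10), (1, 3, 10), (1, 4, 11),
    (2, 6, 21), (2, 7, 20), (2, 6, 20), (2, 7, 21),
    (3, 9, 31), (3, 10, 30), (3, 9, 30), (3, 10, 31),
]

# Ignoring fire size, trucks and damage look strongly related.
trucks = [t for _, t, _ in fires]
damage = [d for _, _, d in fires]
overall_r = pearson_r(trucks, damage)

# Holding fire size fixed, the relationship disappears.
within_size_r = {}
for size in sorted({s for s, _, _ in fires}):
    group = [(t, d) for s, t, d in fires if s == size]
    within_size_r[size] = pearson_r([t for t, _ in group], [d for _, d in group])

print(f"overall r = {overall_r:.3f}")
for size, r in within_size_r.items():
    print(f"size {size}: r = {r:.3f}")
```

Across all fires r is about 0.98, yet within each size category it is 0: once the lurking variable is held fixed, the apparent link between trucks and damage vanishes in this constructed example.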
Try It Yourself
[Interactive scatter plot: Hours Studied vs. Exam Score]
LogicLens Interpretation
This scatter plot shows a very strong positive linear relationship. As Hours Studied increases, Exam Score tends to increase in a fairly predictable pattern.