Section 4.3

Diagnostics on the Least-Squares Regression Line

Learn how to evaluate whether a linear model is appropriate using R², residual plots, and influential observation analysis.

1

The Coefficient of Determination (R²)

Definition

R² is the percentage of total variation in the response variable that is explained by the least-squares regression line.

The Relationship

R² equals the square of the linear correlation coefficient: R² = r².

LogicLens: Explained vs. Unexplained

R²: Percentage of variation explained by the regression line

1 − R²: Percentage of variation unexplained (due to other factors)

Example

If r = 0.85, then:

R² = (0.85)² = 0.7225 = 72.25% explained

1 − R² = 1 − 0.7225 = 27.75% unexplained
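
As a rough check on the arithmetic, 0.85² = 0.7225, which matches the 72.25% figure. The Python sketch below uses made-up data (the `xs` and `ys` values and both helper names are illustrative assumptions, not part of this lesson) to show that r² agrees with the explained share of variation, 1 − SSE/SST, for a least-squares line.

```python
from math import sqrt

def correlation(xs, ys):
    """Linear correlation coefficient r for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def explained_share(xs, ys):
    """1 - SSE/SST for the least-squares line fitted to the data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # SSE: variation left over after fitting the line
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # SST: total variation of y around its mean
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst

# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

r = correlation(xs, ys)
r_squared = r ** 2
print(f"r = {r:.4f}")
print(f"R^2 = {r_squared:.4f} ({r_squared:.2%} explained, {1 - r_squared:.2%} unexplained)")
print(f"1 - SSE/SST = {explained_share(xs, ys):.4f}")
```

The last two printed values should agree up to rounding, which is why squaring r is enough to get the explained percentage in simple linear regression.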

2

Residual Analysis & Scatter Plots

The Residual Plot

Plot the residuals (e = y − ŷ) on the y-axis against the explanatory variable (x) on the x-axis.
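
The points of a residual plot can be computed directly. The sketch below assumes the usual definition of a residual, observed minus predicted (e = y − ŷ), and uses invented data; `fit_line` and `residuals` are illustrative helper names, not functions from this lesson.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residuals(xs, ys):
    """Residual e = observed y minus predicted y-hat, one per data point."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

res = residuals(xs, ys)
for x, e in zip(xs, res):
    # Each (x, e) pair is one point on the residual plot.
    print(f"x = {x}, e = {e:+.3f}")
```

Residuals from a least-squares line with an intercept sum to zero (up to rounding), so the plot is always centered on the e = 0 line; what matters is the pattern around it.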

LogicLens: Three Things to Check

Random Scatter

Good: No discernible pattern — linear model is appropriate

Bad: U-shape or curve — non-linear relationship

Constant Variance

Good: Same spread throughout — homoscedasticity

Bad: Funnel/fan shape — heteroscedasticity

No Outliers

Good: All points near the e=0 line

Bad: Points far from zero line

✓ Good Residual Plot

Random scatter around zero — linear model appropriate

✗ U-Shaped Pattern

Systematic curve — relationship is non-linear
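
To see how a U-shaped pattern arises, the sketch below fits a straight line to data that are actually quadratic (y = x², an invented example) and lists the residuals in order of x. The helper name `line_residuals` is an assumption made for illustration.

```python
def line_residuals(xs, ys):
    """Fit the least-squares line and return the residuals e = y - y-hat."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical curved data: y = x^2 for x = 0..6.
xs = list(range(7))
ys = [x ** 2 for x in xs]

curved = line_residuals(xs, ys)
print([round(e, 6) for e in curved])
```

For this data the residuals should come out positive at both ends and negative in the middle (5, 0, −3, −4, −3, 0, 5), which is the systematic curve that signals a non-linear relationship.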

3

Influential Observations

Definition

An influential observation is a point that significantly affects the slope or intercept of the regression line when included or removed.

LogicLens: Leverage vs. Influence

High-Leverage Point

A point that is far from the mean of x (extreme in the x-direction).

Has the potential to influence the line, but doesn't necessarily do so.

Influential Point

A point that actually changes the regression line significantly.

Influential points usually have high leverage, but not all high-leverage points are influential: a point with a very large residual can shift the line even when its x-value is unremarkable.

Visual Comparison

High leverage but on the line — NOT influential

High leverage and off the pattern — INFLUENTIAL
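
One informal way to compare the two cases is to refit the line with the extra point added and see how far the slope moves. The sketch below does this with invented data; the base points, the two added points at x = 15, and the helper name `slope_of` are all assumptions for illustration.

```python
def slope_of(points):
    """Slope of the least-squares line through a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

# Hypothetical base data lying close to y = 2x + 1.
base = [(1, 3.1), (2, 4.9), (3, 7.2), (4, 8.8), (5, 11.0)]

# Both added points are far from the mean of x (high leverage).
on_line = base + [(15, 31.0)]      # follows the existing pattern
off_pattern = base + [(15, 10.0)]  # far below the existing pattern

slope_base = slope_of(base)
slope_on = slope_of(on_line)
slope_off = slope_of(off_pattern)

print(f"base slope:             {slope_base:.3f}")
print(f"with on-line point:     {slope_on:.3f}")
print(f"with off-pattern point: {slope_off:.3f}")
```

The on-line point barely changes the slope, while the off-pattern point pulls it sharply downward, so only the second point is influential.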

Try It Yourself

Residual Plot Explorer

Correlation (r): 0.9985
R² (Explained): 99.7%
1 − R² (Unexplained): 0.3%

Residual Plot

[Plot: residual (e) on the vertical axis against x (explanatory variable) on the horizontal axis]

Model Appears Valid

Residuals show random scatter around zero with no discernible pattern. The linear model is appropriate for this data.

Quick Reference: What to Look For

✓ Random Scatter

No pattern = linear model OK

✓ Constant Width

No funnel = equal variance

✓ No Outliers

Check points far from e=0
