Welcome back, Stats Spring 2025 class! In this session, we are laying the groundwork for statistical analysis by covering Chapters 2-2 and 2-3. Understanding how to classify data is not just about vocabulary; it determines which statistical tests you can use later in the course. Let's break down the key concepts from our class notes.

1. Data Classification: Qualitative vs. Quantitative

The first step in analyzing data is determining whether it describes a quality or a quantity.

  • Qualitative Data (Categorical): Consists of attributes, labels, or non-numerical entries. It describes a quality.
    Example: Eye color, car models, or your major.
  • Quantitative Data (Numerical): Consists of numbers that are measurements or counts.
    Example: Heights, weights, or the number of students in a class.

2. Discrete vs. Continuous Data

When dealing with quantitative data, we must differentiate between discrete and continuous variables. This often comes down to the difference between "counting" and "measuring."

  • Discrete: Data restricted to a specific set of values, usually integers, with gaps between them. You can count these.
    Definition: Data in which observations are restricted to a set of values (such as $1, 2, 3, 4$) that possess gaps.
    Class Example: The number of goals scored in a soccer game. You can score $2$ goals or $3$ goals, but not $2.5$ goals.
  • Continuous: Data that can take on any value within an interval. These are usually measurements.
    Class Example: The speed of cars on a highway or temperature. A car can be traveling at $60.5$ mph or $60.55$ mph.
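
The counting-versus-measuring distinction can be sketched in Python. This is a rough illustration, not a rule from the class notes: the function name `classify_values` is hypothetical, and it assumes that integers stand for counts and floats stand for measurements. Real data sets do not always follow that convention (a height recorded to the nearest inch is still continuous), so treat the result as a first guess.

```python
from numbers import Real


def classify_values(values):
    """Roughly label a list of observations (hypothetical helper).

    Assumption: ints represent counts and floats represent measurements.
    """
    # Any non-numeric entry (a label or attribute) makes the data qualitative.
    # bool is excluded explicitly because Python treats True/False as integers.
    if any(isinstance(v, bool) or not isinstance(v, Real) for v in values):
        return "qualitative"
    # Whole-number counts leave gaps between the possible values.
    if all(isinstance(v, int) for v in values):
        return "quantitative, discrete"
    # Otherwise the values may fall anywhere within an interval.
    return "quantitative, continuous"


goals = [2, 3]                  # goals scored in soccer games (counts)
speeds = [60.5, 60.55]          # car speeds in mph (measurements)
eye_colors = ["brown", "blue"]  # illustrative labels, not numbers

print(classify_values(goals))
print(classify_values(speeds))
print(classify_values(eye_colors))
```

The type check is only a stand-in for the real question, which is whether the variable is counted or measured.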

3. Levels of Measurement

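A data set's level of measurement determines which calculations on it are meaningful. The four levels run from the least informative (Nominal) to the most informative (Ratio), and each level keeps the properties of the levels before it.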

  1. Nominal: Qualitative only. Data categorized using names, labels, or qualities. No mathematical computations can be made.
    Example: Hair color (Black, Blonde, Brown, Red).
  2. Ordinal: Qualitative or Quantitative. Data can be arranged in order, or ranked, but differences between entries are not meaningful.
    Example: Rankings on a best-seller list or satisfaction surveys (1-Very Bad to 5-Very Good).
  3. Interval: Quantitative. Data can be ordered, and you can calculate meaningful differences. However, there is no inherent zero (zero is just a position on the scale, not "none").
    Example: Temperature in Fahrenheit or Celsius. $0^{\circ}C$ represents a specific temperature, not the absence of heat. The math works for subtraction ($48^{\circ} - 45^{\circ} = 3^{\circ}$ difference), but ratios do not work: $40^{\circ}$ is not twice as hot as $20^{\circ}$.
  4. Ratio: Quantitative. Similar to interval data, with the added property of a meaningful zero. Because zero implies "none," ratios are valid.
    Example: Money or Income. A friend with $\$40$ has exactly twice as much as a friend with $\$20$ because $\frac{40}{20} = 2$.
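
One way to see why ratios fail at the interval level is to change units and check whether the ratio survives. The sketch below is illustrative only: the helper names are made up, and the temperature pair (40 and 20 degrees Celsius) is an arbitrary choice. Converting dollars to cents leaves the ratio $\frac{40}{20}$ unchanged, while converting Celsius to Fahrenheit changes the ratio, because the two temperature scales put zero in different places.

```python
def dollars_to_cents(dollars):
    # Pure rescaling: zero dollars is still zero cents.
    return dollars * 100


def celsius_to_fahrenheit(celsius):
    # Rescaling plus a shift: 0 degrees C becomes 32 degrees F.
    return celsius * 9 / 5 + 32


# Ratio level: the ratio is the same in either unit.
money_a, money_b = 40, 20
ratio_dollars = money_a / money_b
ratio_cents = dollars_to_cents(money_a) / dollars_to_cents(money_b)
print(ratio_dollars, ratio_cents)

# Interval level: the "ratio" depends on the unit, so it carries no meaning.
temp_a, temp_b = 40, 20  # degrees Celsius (arbitrary illustrative values)
ratio_celsius = temp_a / temp_b
ratio_fahrenheit = celsius_to_fahrenheit(temp_a) / celsius_to_fahrenheit(temp_b)
print(ratio_celsius, round(ratio_fahrenheit, 2))
```

The money ratio is 2 in both units. The temperature ratio is 2 in Celsius but roughly 1.53 in Fahrenheit (104 divided by 68), which is why "twice as hot" is not a valid statement on either scale.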

4. Time Series vs. Cross-Sectional Data

Finally, we looked at how data relates to time:

  • Time Series: Measurements taken from a process over equally spaced intervals of time. Plotted in time order, such data can reveal trends.
    Example: Average global temperature over the last 100 years or daily stock prices.
  • Cross-Sectional: Measurements taken at approximately the same point in time across different groups.
    Example: Life expectancy across different countries in the year 2015.
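
The difference shows up in how the observations are indexed: a time series follows one subject across many time points, while a cross-section covers many subjects at one time point. Here is a minimal sketch, assuming each observation is stored as a `(subject, period, value)` tuple. The function name `data_layout` is hypothetical, and every number below is a made-up placeholder rather than a real measurement.

```python
def data_layout(observations):
    """Label observations as time series or cross-sectional (hypothetical helper).

    Each observation is a (subject, period, value) tuple.
    This sketch does not check that the periods are equally spaced.
    """
    subjects = {subject for subject, _, _ in observations}
    periods = {period for _, period, _ in observations}
    # One subject followed across several time points.
    if len(subjects) == 1 and len(periods) > 1:
        return "time series"
    # Several subjects observed at a single time point.
    if len(periods) == 1 and len(subjects) > 1:
        return "cross-sectional"
    return "neither (mixed or too few observations)"


# Placeholder values only, chosen to show the shape of each data set.
stock_prices = [
    ("Stock A", "day 1", 10.0),
    ("Stock A", "day 2", 10.5),
    ("Stock A", "day 3", 10.2),
]
life_expectancy = [
    ("Country A", 2015, 70.0),
    ("Country B", 2015, 75.0),
    ("Country C", 2015, 80.0),
]

print(data_layout(stock_prices))
print(data_layout(life_expectancy))
```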

Review the attached PDF for the specific examples we worked through in class, including the "Frosty Pops" taste test and the analysis of US fuel exports. Mastering these classifications now will make the upcoming chapters much smoother!

Keep studying and see you in the next class!