Section 4.1: Measures of Location
Welcome to Section 4.1, where we'll be exploring different ways to measure the "center" of a dataset. These measures are crucial for understanding and summarizing data, and choosing the right one depends on the specific characteristics of your data.
Arithmetic Mean
The arithmetic mean, often simply called the "average," is probably the most familiar measure of center. It's calculated by summing all the values in a dataset and dividing by the number of values. Let's define it formally:
Suppose we have $n$ observations in a dataset, denoted $x_1, x_2, ..., x_n$. Then the arithmetic mean, $\bar{x}$, is given by:
$$ \bar{x} = \frac{1}{n}(x_1 + x_2 + ... + x_n) = \frac{\sum_{i=1}^{n} x_i}{n} $$
For example, let's calculate the mean of the following sample data values: 4, 10, 7, 15.
$$ \bar{x} = \frac{4 + 10 + 7 + 15}{4} = \frac{36}{4} = 9 $$
So, the arithmetic mean of this dataset is 9.
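The same calculation can be sketched in a few lines of Python. The function name `arithmetic_mean` is our own choice for illustration and is not part of the definition above.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)


sample = [4, 10, 7, 15]          # the sample data from the example above
print(arithmetic_mean(sample))   # 9.0
```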
Weighted Mean
The weighted mean is useful when some values in a dataset are more important or occur more frequently than others. Each value is assigned a weight that reflects its importance or frequency.
The weighted mean, $ \bar{x}_w $, of a dataset with values $x_1, x_2, ..., x_n$ and corresponding weights $w_1, w_2, ..., w_n$ is given by:
$$ \bar{x}_w = \frac{w_1x_1 + w_2x_2 + ... + w_nx_n}{w_1 + w_2 + ... + w_n} = \frac{\sum_{i=1}^{n} w_ix_i}{\sum_{i=1}^{n} w_i} $$
Let's consider Meghan's grades. Her GPA is a weighted mean of her grade points.
For instance, suppose a grade of A is worth 4 points, a B is worth 3, and a C is worth 2, and each course has an associated number of credit hours. To calculate the GPA, multiply each course's point value by its credit hours (the weights), sum these products, and divide by the total credit hours.
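Here is a minimal sketch of the GPA calculation as a weighted mean. The course list is made up for illustration, since Meghan's actual grades are not listed here, and the names `weighted_mean`, `GRADE_POINTS`, and `courses` are our own.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Point values from the text: A = 4, B = 3, C = 2.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2}

# Hypothetical (letter grade, credit hours) pairs, for illustration only.
courses = [("A", 3), ("B", 4), ("C", 3)]

points = [GRADE_POINTS[grade] for grade, _ in courses]
credit_hours = [hours for _, hours in courses]

gpa = weighted_mean(points, credit_hours)
print(gpa)   # (4*3 + 3*4 + 2*3) / (3 + 4 + 3) = 30 / 10 = 3.0
```

When every weight is equal, the weighted mean reduces to the ordinary arithmetic mean.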
Median
The median is the middle value in a dataset when the values are arranged in ascending order. It's a useful measure because it's not affected by extreme values (outliers). To find the median:
- Arrange the data in ascending order.
- Determine the number of values in the data.
- If the number of data values is odd, the median is the data value in the middle.
- If the number of data values is even, the median is the mean of the two middle observations.
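The steps above can be sketched in Python as follows. The function name `median` is our own, and the five-value dataset is invented to show the odd case.

```python
def median(values):
    """Return the middle value of the sorted data (mean of the two middle values if n is even)."""
    ordered = sorted(values)      # step 1: arrange in ascending order
    n = len(ordered)              # step 2: count the values
    if n == 0:
        raise ValueError("the median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:                # odd count: take the single middle value
        return ordered[mid]
    # even count: average the two middle observations
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([4, 10, 7, 15]))      # sorted: 4, 7, 10, 15 -> (7 + 10) / 2 = 8.5
print(median([4, 10, 7, 15, 3]))   # sorted: 3, 4, 7, 10, 15 -> 7
```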
Mode
The mode is the most frequently occurring value in a dataset. A dataset can have no mode (if all values occur only once), one mode (unimodal), or multiple modes (bimodal, trimodal, etc.).
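As a rough sketch, the mode can be found by counting how often each value occurs. The helper name `modes` and the sample lists are our own. Returning an empty list when every value occurs only once follows the "no mode" convention described above.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the most frequent values, or [] if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []                 # all values occur once: no mode
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([4, 10, 7, 15]))      # []      (no mode)
print(modes([2, 3, 3, 5]))        # [3]     (unimodal)
print(modes([1, 1, 2, 2, 3]))     # [1, 2]  (bimodal)
```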
Trimmed Mean
The trimmed mean is a modification of the arithmetic mean that discards an equal percentage of the highest and lowest data values before the mean is calculated. This reduces the impact of outliers on the result.
To find the 10% trimmed mean:
- Arrange the data in ascending order.
- Delete the lowest 10% of the values.
- Delete the highest 10% of the values.
- Calculate the arithmetic mean of the remaining 80% of the values.
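The procedure above can be sketched as shown below. The function name `trimmed_mean` and the ten-value dataset are our own. The sketch also assumes that, when the given percentage of the data is not a whole number of values, the count to delete from each end is rounded down. The steps above do not specify this, and other conventions exist.

```python
def trimmed_mean(values, percent=10):
    """Drop `percent` % of the values from each end of the sorted data, then average the rest."""
    if not 0 <= percent < 50:
        raise ValueError("percent must be at least 0 and less than 50")
    ordered = sorted(values)                      # arrange in ascending order
    # Assumption: round down the number of values trimmed from each end.
    k = int(len(ordered) * percent // 100)
    kept = ordered[k:len(ordered) - k]            # delete the lowest and highest k values
    if not kept:
        raise ValueError("no values left after trimming")
    return sum(kept) / len(kept)


# Hypothetical dataset of 10 values with one extreme value (98).
data = [2, 4, 5, 6, 7, 8, 9, 10, 11, 98]

print(sum(data) / len(data))   # ordinary mean: 160 / 10 = 16.0
print(trimmed_mean(data))      # drops 2 and 98: 60 / 8 = 7.5
```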
Outliers and Resistant Measures
An outlier is a data value that differs markedly from the other measurements in the dataset. Statistical measures that are not affected by outliers are said to be resistant. The median and mode are resistant measures, while the mean is not.
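To see resistance in action, the short sketch below replaces the largest value of the earlier sample (15) with an invented outlier (150) and compares the mean and the median before and after. It uses `fmean` and `median` from Python's standard `statistics` module.

```python
from statistics import fmean, median

original = [4, 7, 10, 15]
with_outlier = [4, 7, 10, 150]   # 15 replaced by a made-up extreme value

print(fmean(original), fmean(with_outlier))     # 9.0 42.75  (the mean shifts a lot)
print(median(original), median(with_outlier))   # 8.5 8.5    (the median does not move)
```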
Keep practicing, and you'll become a master of measures of location in no time!