## 2. Measures of Center

In this section we learn three measures of central tendency, i.e., three ways to identify the “center” of a data set. You should already be somewhat familiar with these basic ideas.

### The Mode

The mode of a variable is a simple measure of center that describes the most frequent (recurring) observation in a data set. For example, in a data set such as {0,1,2,2,2,3,4,8}, the value 2 would be the mode of the data set since it appears the most times. If you refer back to the exam score data shown previously, none of the data values appears more than once. This means that the exam score data has NO mode. The mode is rarely used in serious statistics studies.
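If you would like to check a mode by computer, the counting idea above can be sketched in a few lines of Python. This is only an illustration: the helper name `find_modes` is made up for this example, and it follows this section's convention that a data set in which no value repeats has no mode.

```python
from collections import Counter

def find_modes(data):
    """Return the most frequent value(s), or [] if no value repeats."""
    counts = Counter(data)          # maps each value to how often it appears
    if not counts:
        return []                   # empty data set: nothing to report
    highest = max(counts.values())  # the largest frequency
    if highest == 1:
        return []                   # every value appears once: no mode
    return [value for value, count in counts.items() if count == highest]

print(find_modes([0, 1, 2, 2, 2, 3, 4, 8]))  # [2]
print(find_modes([81, 85, 83]))              # []
```

The helper returns a list because a data set can have more than one value tied for most frequent.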

### The Arithmetic Mean

The arithmetic mean of a variable is often what people mean by the “average.” To calculate, simply add up all the values and divide by how many there are. In statistics, the symbol x̄ (pronounced x-bar) is used to denote the mean of a sample.

As an example, the data shown in the table are the first exam scores for all 17 students from a calculus class. You should verify the average yourself by adding up the 17 scores and dividing by 17.

The mean is only valid for quantitative data and can be thought of as a balance point for the set of data. For example, if you had only three exam scores, such as 81, 85, and 83, you can see how “borrowing” two exam points from the 85 and applying them to the 81 would make all three scores equal to 83. That is your average value.
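The balance-point idea can be checked numerically. The short Python sketch below uses the three scores from the paragraph above; the variable names are just for illustration. It shows that the distances below the mean exactly cancel the distances above it.

```python
scores = [81, 85, 83]

# Add up all the values and divide by how many there are.
mean = sum(scores) / len(scores)

# How far each score sits from the mean (negative = below, positive = above).
deviations = [x - mean for x in scores]

print(mean)             # 83.0
print(deviations)       # [-2.0, 2.0, 0.0]
print(sum(deviations))  # 0.0
```

The deviations summing to zero is what "balance point" means: the 2 points the 81 is short are exactly the 2 points the 85 has to spare.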

### The Median

The median of a variable, typically denoted by the capital letter M, is another measure of the “center.” The median is simply the value that occupies the middle position of your ordered data set.

After just a moment of thinking, it’s clear that the calculation of the median of a variable is slightly different depending on whether there is an odd or an even number of data points. (Think about the middle value of 3 data points and the middle value of 4 data points.)

To calculate the median of a data set, arrange the data in order and count the number of observations, n. If the value of n is odd, then there will be a data value that is exactly in the middle. That data value is the median M. If n is even, then there will be two values on either side of the exact middle. In this case, the median is defined to be the average of these two data values. In either case, the location of the median can be found through the simple calculation:

$$\text{location of } M = \frac{n+1}{2}$$

Please be careful to note that the median is NOT the value of the fraction $\frac{n+1}{2}$. The value of this fraction is simply the location of the median.

Returning to the example of the first exam scores above, we first sort the scores in ascending order, as shown. Since there are 17 scores, an odd number, there will be an exact middle score. If your data set has just a few values, it is easy to find the middle. However, just to be sure, the location of the median comes from the formula

$$\frac{17+1}{2} = 9$$

Thus, the 9th data value gives us our median, M = 76.5.

Now suppose that the student who scored a 46 wasn’t enrolled in class in the first place. That would mean the population has n = 16 members (just ignore the 46). In that case, the location of the median would be

$$\frac{16+1}{2} = 8.5$$

Obviously there is not a score in the 8.5th location. Thus, we take the two scores from the 8th and 9th positions, 76.5 and 77, and find their average:

$$\frac{76.5 + 77}{2} = 76.75$$

Therefore, the population median would be M = 76.75.
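The odd and even cases can be sketched in Python as follows. The function names `median_location` and `median_of` are hypothetical, chosen for this illustration, and the small data sets are invented. Only the last call uses numbers from the text, namely the two middle exam scores.

```python
def median_location(n):
    """Position (counting from 1) of the median in an ordered list of n values."""
    return (n + 1) / 2

def median_of(data):
    """Median found by sorting the data and then using the location (n + 1) / 2."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    location = median_location(n)
    if n % 2 == 1:
        # Odd n: the location is a whole number, so one value sits in the middle.
        return ordered[int(location) - 1]   # subtract 1: Python counts from 0
    # Even n: the location ends in .5, so average the values on either side.
    lower = ordered[int(location) - 1]
    upper = ordered[int(location)]
    return (lower + upper) / 2

print(median_location(17))     # 9.0
print(median_location(16))     # 8.5
print(median_of([7, 3, 9]))    # 7
print(median_of([7, 3, 9, 4])) # 5.5
print(median_of([76.5, 77]))   # 76.75
```

Notice that `median_location` returns a position, not a data value, which is exactly the distinction the text warns about.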

On the TI-83/84 Calculator:

First we enter the data into the lists. To do this we press the STAT key and select the first option, 1:Edit. Then we enter the data into the first list, pressing ENTER after every value. Next we press STAT again, arrow over to the CALC menu, select 1:1-Var Stats, and press ENTER. Choose L1 as your list and press ENTER on Calculate. We notice that the value of the mean is shown as well as the size of our data set n=13. If we arrow down further we see some more statistics, including the median. Observe that the calculator does not give us the mode. For more details on using the calculator to do statistics, check out the Technology Guide on D2L.

### Comparing the Mean and Median Values

Although both the mean and median are measures of the center of a data set, it is rarely the case that the two will be exactly equal. In fact, there are times when the two will be very different. How the mean and median relate to each other tells us a lot about the distribution of the underlying data set.

Basically, it’s important to know that the mean is a measure of center that is highly sensitive to changes in the data values. Think about the process of taking an average… every data value is used in the computation. If simply one data value changes, then the average will change. The change in the value of the mean will be much more drastic if one of the outlying data values changes. In essence, the mean is not resistant to changes to extreme values.

On the other hand, the median is a measure of center that is not very sensitive to changes in the data values. Really only one or two data points determine the value of the median. The values of the other data points do not factor at all into the median. They serve only as placeholders, which lead to the middle value. Because of this, the median is resistant to changes in extreme values.

Example:

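| Data | Mean | Median |
|---|---|---|
| {1, 5, 13, 20, 28} | μ = 13.4 | M = 13 |
| {1, 5, 13, 20, 280} | μ = 63.8 | M = 13 |
| {1, 5, 13, 20} | μ = 9.75 | M = 9 |

Notice how drastically the mean changes when the largest value becomes 280, while the median stays exactly the same. Dropping a data value, as in the third set, moves both measures, since the middle position itself shifts.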
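You can confirm the values in this example with Python's built-in `statistics` module. The sketch below simply loops over the three data sets; the names `data_sets` and `results` are just for illustration.

```python
from statistics import mean, median

data_sets = [
    [1, 5, 13, 20, 28],
    [1, 5, 13, 20, 280],   # same as above, but the largest value is now extreme
    [1, 5, 13, 20],        # the largest value removed entirely
]

# Store (mean, median) for each data set, then print them side by side.
results = [(mean(data), median(data)) for data in data_sets]

for data, (avg, med) in zip(data_sets, results):
    print(data, "mean =", avg, "median =", med)
```

Changing the 280 to an even larger number and rerunning is a quick way to see that only the mean responds.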

Because of the sensitivity of the mean, it gets pulled in the direction of the tails for skewed data sets. Basically, if the distribution is:

• Symmetric: the mean will usually be close to the median
• Left (or negative) skew: the mean will usually be smaller than the median
• Right (or positive) skew: the mean will usually be larger than the median
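The three cases above can be tried out on small examples. In the Python sketch below, the data sets are invented for illustration (one symmetric, one with a long tail to the left, one with a long tail to the right), and `compare_center` is a hypothetical helper name. Keep in mind the word "usually" in the list: comparing the mean and median is a rule of thumb, not a test for skewness.

```python
from statistics import mean, median

def compare_center(data):
    """Report how the mean compares with the median for one data set."""
    avg, med = mean(data), median(data)
    if avg > med:
        return "mean > median"
    if avg < med:
        return "mean < median"
    return "mean == median"

symmetric   = [1, 2, 3, 4, 5]
left_skew   = [1, 17, 18, 18, 19, 19, 19, 20]  # one very low value (tail on the left)
right_skew  = [1, 2, 2, 3, 3, 3, 4, 20]        # one very high value (tail on the right)

print(compare_center(symmetric))   # mean == median
print(compare_center(left_skew))   # mean < median
print(compare_center(right_skew))  # mean > median
```

In each skewed set, the single value out in the tail pulls the mean toward it while the median stays with the bulk of the data.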

The following picture illustrates the graphical relationship of the mean, median, and mode in the three types of data: