1. Statistics in General

Statistics is a discipline that focuses on the collection, organization, and analysis of data (information) to answer questions or make predictions. In most cases it is impossible to know certain characteristics of an entire population, which is simply every member of the group being studied. For example, imagine trying to answer the question: “How many hours of TV, on average, do Americans watch each week?” If you wanted an exact answer, you would have to ask this question of more than 300 million people, record their answers, and finally perform mathematical calculations to reach your conclusion. This is clearly not practical, in terms of both time and expense.

With the science of statistics, however, we can ask the question of a representative sample of individuals from the overall population, and then use results from the sample to infer conclusions about the population. One difference between statistics and other math courses you’ve likely taken is that answers from statistics usually carry some uncertainty. This is because data are variable: a different sample would give slightly different results, so conclusions drawn from a sample are not certain. Is this a problem? No!
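The idea of estimating a population value from a sample can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "population" below is simulated, made-up data (not real survey results), and the names `population`, `sample_mean`, and the sample size of 500 are hypothetical choices.

```python
import random
import statistics

# Assumption: a made-up population of weekly TV hours for 100,000 people.
# These values are simulated for illustration, not real data.
rng = random.Random(42)
population = [max(0.0, rng.gauss(20, 8)) for _ in range(100_000)]


def sample_mean(values, n, rng):
    """Mean of a simple random sample of size n, drawn without replacement."""
    return statistics.mean(rng.sample(values, n))


# In practice the population mean is unknown; we compute it here only
# because the population is simulated and small enough to scan.
population_mean = statistics.mean(population)

# Estimate the same quantity from a sample of 500 individuals.
estimate = sample_mean(population, 500, rng)

print(f"population mean: {population_mean:.2f}")
print(f"sample estimate: {estimate:.2f}")
```

Running the sketch with a different seed gives a slightly different estimate, which is exactly the variability described above: the sample answer is close to the population answer, but not certain.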

Example 1.1: Identifying Population and Sample

Identify the population and the sample.

  1. In a survey, 359 college students at the University of Jackson were asked if they had tried the October flavor of the month at the campus coffee shop. Eighty‑three of the students surveyed said yes.
  2. A survey of 1125 households in the United States found that 24% subscribe to satellite radio.

Solution

  1. Population: All college students at the University of Jackson
     Sample: The 359 college students who were surveyed
  2. Population: All households in the United States
     Sample: The 1125 households in the United States that were surveyed

The characteristics of individuals under study are called variables (because the information differs from person to person). Variables fall into two basic categories. Qualitative (or categorical) variables describe a characteristic of an individual, such as hair color, gender, or favorite ice cream flavor. Quantitative variables are numerical variables that can be measured or counted, such as temperature, weight, height, or distance. Notice that all the quantitative variable examples can be ordered (from least to most, for example), whereas there is no natural “ordering” of hair color or ice cream flavor.
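One practical consequence of this distinction is that the two kinds of variables are summarized differently. The sketch below uses small made-up lists (`hair_colors` and `heights_cm` are hypothetical data) to show that counting and finding the most common value make sense for a qualitative variable, while averaging and ordering make sense for a quantitative one.

```python
import statistics
from collections import Counter

# Hypothetical data for six individuals.
hair_colors = ["brown", "black", "brown", "blond", "red", "brown"]  # qualitative
heights_cm = [162.5, 170.0, 158.2, 181.3, 175.4, 168.0]            # quantitative

# Qualitative data: count the categories and report the most common one.
color_counts = Counter(hair_colors)
most_common_color = statistics.mode(hair_colors)

# Quantitative data: arithmetic and ordering are meaningful.
mean_height = statistics.mean(heights_cm)
shortest_to_tallest = sorted(heights_cm)

print(color_counts)
print(most_common_color)
print(round(mean_height, 1))
print(shortest_to_tallest)
```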

Example 1.2: Classifying Data as Qualitative or Quantitative

Classify the following data as either qualitative or quantitative.

  1. Shades of red paint in a home improvement store
  2. Rankings of the most popular paint colors for the season
  3. Amount of red primary dye necessary to make one gallon of each shade of red paint
  4. Numbers of paint choices available at several stores

Solution

  1. Shades of paint are descriptions and cannot be measured, so these are qualitative data.
  2. Rankings are numeric but not measurements or counts, so these are qualitative data.
  3. The amounts of dye needed are measured and therefore are quantitative data.
  4. The numbers of paint choices must be counted, so they are quantitative data as well.

Quantitative variables can be further classified as either discrete (those with a finite or countable number of possible values) or continuous (those that can take any value in an interval, so the possible values cannot be counted). An example of a discrete variable is the number of offspring a raccoon produces each year. Variables such as height and weight are continuous.
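A rough way to see the difference is to look at how the values are recorded. In the sketch below, which uses made-up numbers (`litter_sizes` and `true_height_m` are hypothetical), a count can only land on whole numbers, while a measurement can always be reported to one more decimal place if the instrument allows it.

```python
# Discrete: a count can only take whole-number values.
litter_sizes = [3, 4, 2, 5, 4]  # hypothetical raccoon litter sizes
possible_counts = sorted(set(litter_sizes))

# Continuous: the recorded value depends on the precision of the instrument.
true_height_m = 1.7526384  # hypothetical height in metres
readings = {decimals: round(true_height_m, decimals) for decimals in (1, 2, 3, 4)}

print(possible_counts)
for decimals, value in readings.items():
    print(f"measured to {decimals} decimal place(s): {value}")
```

No matter how finely the height is measured, another value always fits between two neighboring readings; that is not true of the litter sizes.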

Example 1.5: Classifying Data as Continuous or Discrete

Determine whether the following data are continuous or discrete.

  1. Temperatures in Fahrenheit of cities in South Carolina
  2. Numbers of houses in various neighborhoods in a city
  3. Numbers of elliptical machines in every YMCA in your state
  4. Heights of doors

Solution

  1. Temperatures could be measured to any level of precision based on the thermometer used, so these are continuous data.
  2. Numbers of houses are discrete data because houses are counted in whole numbers. A house under construction is still a house.
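  3. The numbers of elliptical machines are counts, so these are discrete data.
  4. Heights of doors are measurements that could be taken to any level of precision, so these are continuous data.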

One word of warning: although the word quantitative means numerical, not every numerical variable is a quantitative variable. For example, Social Security numbers and zip codes are numerical, but adding, subtracting, averaging, or even ordering them gives results that make no sense. Suppose you grew up in Buhl, Idaho (zip code 83316) and now live in Manitowoc, Wisconsin (zip code 54220). Do the numbers

    83316 - 54220 = 29096

or

    (83316 + 54220) / 2 = 68768

provide any useful information about where you have lived? Definitely not! Just for fun, find out where the cities corresponding to the zip codes 29096 and 68768 are located.
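The same warning applies when data are stored on a computer. The following sketch is one possible convention, not a required one: it keeps zip codes as text so that nobody averages them by accident. The variable names and the small `residents` list are hypothetical.

```python
from collections import Counter

# Zip codes are labels, so one reasonable choice is to store them as strings.
zip_buhl = "83316"
zip_manitowoc = "54220"

# Python will happily do arithmetic if we force the labels into integers,
# but the results do not describe anything about either city.
difference = int(zip_buhl) - int(zip_manitowoc)
average = (int(zip_buhl) + int(zip_manitowoc)) / 2

# Operations that do make sense for labels: comparing for equality and counting.
residents = ["83316", "54220", "83316"]  # hypothetical survey responses
counts = Counter(residents)
same_place = zip_buhl == zip_manitowoc

# Storing a zip code as an integer also silently drops a leading zero.
boston_area_zip = "02134"
as_integer = int(boston_area_zip)

print(difference, average)
print(counts)
print(same_place)
print(boston_area_zip, as_integer)
```

Treating the codes as text keeps the only meaningful operations (matching and counting) available while making the meaningless ones harder to do by mistake.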