1. Intro

In this unit you will learn about measures of position, or location, within a data set. One important measure of position, which will be used extensively later in the course, tells us how far a data value lies from the mean, measured in standard deviations. Other measures tell us location in terms of groups (or percentages) of the data set.

Once you have created a distribution of your data, you can use its shape, center, and spread to tell the story of your underlying data.

The most important idea to take from this unit is that of a probability density curve, the graphical representation of the distribution of a continuous random variable. When you look at a histogram of continuous data, you can almost imagine a smooth curve tracing the same shape as the histogram’s bars. For example, thinking back to the example from Unit 1 concerning a state’s residents living in poverty, we produced the following histogram:

Percentage of a State's Population in Poverty

A smooth curve that has (roughly) the same shape as this histogram would be something like:

Smooth Curve

The smooth curve that represents our histogram is called a density curve, and it has two cool properties. First, it is always on or above the horizontal axis. Since the vertical axis represents a count or percentage of data falling in a particular class, there can’t be a negative amount of data in a class. Second, the total area under the entire density curve is 1 (or 100%). Since a density curve represents our data, ALL of our data…or 100% of it…must be included in the distribution. We’re going to routinely use the fact that an area (or region) under a density curve represents the probability of obtaining results falling in that region. So remember, AREA = PERCENTAGE OR PROBABILITY. Keep reminding yourself: AREA = PERCENTAGE OR PROBABILITY.
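To make the two properties concrete, here is a minimal sketch in Python that checks them numerically. It assumes a normal density curve with mean 0 and standard deviation 1 purely as an illustration; the function names `normal_density` and `area_under` are hypothetical helpers, not part of the course materials, and the area is only approximated with the trapezoid rule.

```python
import math

def normal_density(x, mu=0.0, sigma=1.0):
    """Height of a normal density curve at x (an illustrative choice of curve)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area_under(density, a, b, steps=10_000):
    """Approximate the area under `density` between a and b (trapezoid rule)."""
    width = (b - a) / steps
    total = 0.5 * (density(a) + density(b))
    for i in range(1, steps):
        total += density(a + i * width)
    return total * width

# Property 1: the curve never dips below the horizontal axis.
heights = [normal_density(x / 10.0) for x in range(-100, 101)]
never_negative = all(h >= 0.0 for h in heights)

# Property 2: the total area is 1. The range -10 to 10 is wide enough
# here that the area left outside it is negligible.
total_area = area_under(normal_density, -10.0, 10.0)

# AREA = PROBABILITY: the area between -1 and 1 is the probability
# of a result falling in that interval.
middle_area = area_under(normal_density, -1.0, 1.0)

print(f"never negative: {never_negative}")            # True
print(f"total area: {total_area:.4f}")                # 1.0000
print(f"area between -1 and 1: {middle_area:.4f}")    # 0.6827
```

In this sketch, roughly 68% of the area lies between -1 and 1, so a result in that interval has a probability of roughly 0.68.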

Again, the main purpose of a density curve is to be a smooth and continuous representation of our actual data. Because the density curve is a “model” of our data, we will use Greek letters such as μ and σ to represent the mean and standard deviation of the density curve. Statistics is full of symbols; it is most important to remember that x̄ and s represent the mean and standard deviation, respectively, of a SAMPLE, while μ and σ represent the mean and standard deviation, respectively, of a POPULATION. The density curve is a stand-in for our population.
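The difference between the sample symbols and the population symbols shows up in the arithmetic as well. The sketch below uses Python's standard `statistics` module on a small made-up list of values (the numbers are hypothetical, chosen only so the results are easy to check). Treating the list as a SAMPLE gives x̄ and s; treating the very same list as an entire POPULATION gives μ and σ.

```python
import statistics

# Hypothetical data values, invented for illustration only.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Treated as a SAMPLE: x-bar and s (s divides by n - 1).
x_bar = statistics.mean(values)
s = statistics.stdev(values)

# Treated as a whole POPULATION: mu and sigma (sigma divides by n).
mu = statistics.mean(values)
sigma = statistics.pstdev(values)

print(f"sample:     x-bar = {x_bar:.3f}, s = {s:.3f}")      # x-bar = 5.000, s = 2.138
print(f"population: mu = {mu:.3f}, sigma = {sigma:.3f}")    # mu = 5.000, sigma = 2.000
```

The mean is computed the same way in both cases; only the standard deviation changes, because s divides by n - 1 while σ divides by n.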

To begin, let’s investigate two distributions of continuous random variables, the uniform distribution and the normal distribution, which will be the focus of our statistical studies from here on out.