In this section we learn about inference methods for comparing population means from two independent samples of data. These methods apply to situations such as testing whether a new drug lowers cholesterol levels more than the current drug, or comparing two teaching methods used in different classes (or with different students within the same class).
The formulas and steps that we’ll use for hypothesis testing and confidence intervals are fundamentally the same as those we used throughout the previous sub-competencies. However, when comparing two population means from independent samples of data, the formulas become more involved, and some of their details go beyond the scope of this class, because we now have two values for each quantity, one for each of the two samples:
- The two population means: μ1 and μ2
- The two sample sizes: n1 and n2
- The two sample means: x̄1 and x̄2
- The two sample standard deviations: s1 and s2
In particular, the standard deviation of the sample difference x̄1 – x̄2 is not as intuitive because we must “combine” the standard deviations from two independent samples. Without showing any proof, it turns out that the standard deviation of the sample difference x̄1 – x̄2 is:
√(s1²/n1 + s2²/n2)
Although this standard deviation looks completely different from the previously used standard deviation of the sample mean, σx̄ = σ/√n, the difference is actually minimal, and it rests on the following (obvious) bit of algebra:
σ/√n = √(σ²/n)
(you should verify this algebraically)
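The identity σ/√n = √(σ²/n) can also be checked numerically. The short sketch below does so for one pair of values; σ = 15 and n = 25 are hypothetical numbers chosen only for illustration, and a numerical check is no substitute for the algebra.

```python
import math

# Hypothetical values, chosen only for illustration
sigma = 15.0
n = 25

lhs = sigma / math.sqrt(n)       # sigma / sqrt(n)
rhs = math.sqrt(sigma**2 / n)    # sqrt(sigma^2 / n)

# The two forms agree (up to floating-point rounding)
print(lhs, rhs)
print(math.isclose(lhs, rhs))
```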
Thus, instead of including the term σ/√n in a formula, we could just as easily use √(σ²/n).
In a rough sense, because we have a sample standard deviation from each independent sample, there will be a term of the form s²/n under the square root representing each sample: s1²/n1 for the first sample and s2²/n2 for the second. This should help illustrate why the standard deviation of the sample difference x̄1 – x̄2 looks the way it does.
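As a rough illustration, the standard deviation of the sample difference, √(s1²/n1 + s2²/n2), can be computed directly from the summary statistics of the two samples. The function name `se_difference` and the numbers below are assumptions made for this sketch, not data from any study.

```python
import math

def se_difference(s1, n1, s2, n2):
    """Standard deviation of the sample difference x̄1 - x̄2
    for two independent samples: sqrt(s1^2/n1 + s2^2/n2)."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

# Hypothetical summary statistics for the two samples
s1, n1 = 12.0, 36   # sample 1: standard deviation and size
s2, n2 = 9.0, 81    # sample 2: standard deviation and size

se = se_difference(s1, n1, s2, n2)

# Each sample contributes one term s^2/n under the square root:
term1 = s1**2 / n1   # 144/36 = 4
term2 = s2**2 / n2   # 81/81  = 1

print(round(se, 4))  # sqrt(4 + 1) = sqrt(5), about 2.2361
```

Note that the one-sample quantities s1/√n1 = 2 and s2/√n2 = 1 do not simply add to give the result. It is their squares, 4 and 1, that add under the square root.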