Skipping most of the details, the null hypothesis is the assumed condition that the proportions from both populations are equal, *H*_{0}: *p*_{1} = *p*_{2}, and the alternative hypothesis is one of the three conditions of non-equality (*p*_{1} ≠ *p*_{2}, *p*_{1} < *p*_{2}, or *p*_{1} > *p*_{2}).

When calculating the test statistic *z*_{0} (notice we use the standard normal distribution), we are assuming that the two population proportions are the same, *p*_{1} = *p*_{2}. Now if both Population 1 and Population 2 are the same in terms of the required proportion, they could be considered to be the “same” population. (Think about this a bit.) We therefore combine the two samples to estimate this common proportion, and we define *p̂* to be the **pooled** proportion:
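In the usual notation, the pooled proportion can be written as follows. The symbols *x*_{1} and *x*_{2} are an assumption introduced here: they denote the number of individuals with the characteristic in the samples of size *n*_{1} and *n*_{2}.

```latex
\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}
```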

Substituting *p̂* into the sample standard deviation expression gives:
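A sketch of that substitution, assuming the usual standard deviation of the difference between two sample proportions as the starting expression: each unknown population proportion is replaced by the pooled estimate *p̂*, and the common factor is pulled out.

```latex
\sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}
\;\longrightarrow\;
\sqrt{\frac{\hat{p}(1-\hat{p})}{n_1} + \frac{\hat{p}(1-\hat{p})}{n_2}}
= \sqrt{\hat{p}(1-\hat{p})}\,\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}
```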

The formula for the test statistic *z*_{0} becomes:
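In standard form, and assuming *p̂*_{1} and *p̂*_{2} denote the two sample proportions, the statistic can be written with and without the hypothesized difference in the numerator:

```latex
z_0 = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}
           {\sqrt{\hat{p}(1-\hat{p})}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
    = \frac{\hat{p}_1 - \hat{p}_2}
           {\sqrt{\hat{p}(1-\hat{p})}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
```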

The term *p*_{1} – *p*_{2} in the numerator disappears because we are assuming that *p*_{1} = *p*_{2}, so *p*_{1} – *p*_{2} = 0.

All other steps for the hypothesis test remain the same as discussed in sub-competency 9.

*Example*

A nutritionist claims that the proportion of individuals who have at most an eighth-grade education and consume more than the USDA’s recommended daily allowance of 300 mg of cholesterol is higher than the proportion of individuals who have at least some college and consume too much cholesterol. In interviews with 320 individuals who have at most an eighth-grade education, she determined that 114 of them consumed too much cholesterol. In interviews with 350 individuals with at least some college, she determined that 112 of them consumed too much cholesterol per day.

The most challenging part of using inference tools on problems dealing with two populations is **keeping the information straight**. Before starting any of the inference methods, take a few moments to label each population. That way you won’t get confused about which information pertains to which population. For example, Population 1 is the group of people who have at most an eighth-grade education, and *p*_{1} is the proportion of that group who consume more than the USDA’s recommended daily allowance of 300 mg of cholesterol. Population 2 is then the group of people who have at least some college education, and *p*_{2} is the proportion of that group who consume too much cholesterol.

First, let’s perform a hypothesis test on the difference in the two population proportions using a level of significance *α* = 0.05 (i.e., 5%). Keeping the information straight, we find:

| Population 1 | Population 2 |
| --- | --- |
| *n*_{1} = 320 | *n*_{2} = 350 |
| *p̂*_{1} = 114/320 = 0.35625 | *p̂*_{2} = 112/350 = 0.32 |

The null hypothesis, which is stated in Step 1, is the assumption that the two populations do not differ in terms of the characteristic of interest. We therefore need to determine the *pooled* proportion, *p̂*:
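Using the counts given in the example (114 of 320 and 112 of 350), the pooled proportion works out as below. The symbols *x*_{1} and *x*_{2} for the two counts are an assumption introduced here for illustration.

```latex
\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}
        = \frac{114 + 112}{320 + 350}
        = \frac{226}{670}
        \approx 0.3373
```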

**Step 1:** State the null and alternative hypotheses.

*H*_{0}: *p*_{1} = *p*_{2}

*H*_{1}: *p*_{1} > *p*_{2}

Notice that this is a **one-tail test**, since the nutritionist claims that *p*_{1} “… is **higher than** …” *p*_{2}.

**Step 2:** Determine the test statistic *z*_{0}.

Using the calculated information shown in the above chart, we see:
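Worked out with the sample values from the example and a pooled proportion of 226/670 ≈ 0.3373, the calculation can be sketched as:

```latex
z_0 = \frac{\hat{p}_1 - \hat{p}_2}
           {\sqrt{\hat{p}(1-\hat{p})}\,\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
    = \frac{0.35625 - 0.32}
           {\sqrt{0.3373\,(1 - 0.3373)}\,\sqrt{\dfrac{1}{320} + \dfrac{1}{350}}}
    \approx \frac{0.03625}{0.03657}
    \approx 0.99
```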

In other words, the two sample proportions differ by only about 1 standard error.

**Step 3:** Determine the *P*-value and Identify the Level of Significance

Using the test statistic *z*_{0} ≈ 0.99, a table of standard normal values indicates that the *P*-value is 0.1611. Using technology, we get the results *z*_{0} ≈ 0.9913 and *P*-value = 0.1608.
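As one way to obtain the "using technology" figures, here is a minimal Python sketch that relies only on the standard library. The function name `two_proportion_z_test` and its argument names are hypothetical, chosen for illustration, and the right-tailed *P*-value matches the alternative *H*_{1}: *p*_{1} > *p*_{2}.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z0, right-tailed P-value) for H0: p1 = p2 vs. H1: p1 > p2.

    x1, x2 are the counts with the characteristic; n1, n2 are the sample sizes.
    """
    p1_hat = x1 / n1                      # sample proportion, Population 1
    p2_hat = x2 / n2                      # sample proportion, Population 2
    p_pooled = (x1 + x2) / (n1 + n2)      # pooled proportion under H0
    # Standard error of (p1_hat - p2_hat), using the pooled proportion
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z0 = (p1_hat - p2_hat) / se
    # Area to the right of z0 under the standard normal curve
    p_value = 1 - NormalDist().cdf(z0)
    return z0, p_value


z0, p_value = two_proportion_z_test(114, 320, 112, 350)
print(f"z0 = {z0:.4f}, P-value = {p_value:.4f}")
```

Run on the example data, this should reproduce the technology results quoted above to four decimal places.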

The level of significance was given to us as *α* = 0.05.

**Step 4:** Make Appropriate Conclusions

Because our *P*-value is greater than the level of significance, *α* = 0.05, the conclusion of our hypothesis test is: **Do Not Reject the Null Hypothesis H_{0}: p_{1} = p_{2}**. In other words, the data do not provide significant evidence that the proportion of people who consume too much cholesterol is higher among those with at most an eighth-grade education than among those with at least some college.

### A Confidence Interval Approach

For fun, let’s continue with this example but use a 95% confidence interval about the difference between the two population proportions *p*_{1} – *p*_{2}. Using the information in the table above, we compute:
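A sketch of the computation, assuming the standard large-sample interval for a difference of two proportions. It uses the separate sample proportions *p̂*_{1} and *p̂*_{2} rather than the pooled proportion, with the critical value *z*_{α/2} = 1.96 for 95% confidence:

```latex
(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}
  \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}
= 0.03625 \pm 1.96
  \sqrt{\frac{0.35625\,(0.64375)}{320} + \frac{0.32\,(0.68)}{350}}
\approx 0.03625 \pm 0.0717
```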

In interval notation, our 95% confidence interval is:

(-0.0355, 0.1080)

Since the value 0 **is** contained within this interval, the data are consistent with *no difference* in the two population proportions, so we cannot conclude that the proportions differ. This supports the conclusion of our hypothesis test.
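The interval can also be checked numerically. The following is a minimal Python sketch using only the standard library. The function name `two_proportion_ci` and its parameters are hypothetical, and it assumes the standard large-sample (unpooled) interval for a difference of two proportions.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Return (lower, upper) bounds of a confidence interval for p1 - p2.

    Uses the unpooled standard error, since the interval does not assume p1 = p2.
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # Two-sided critical value, about 1.96 for 95% confidence
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_hat - p2_hat
    return diff - z_crit * se, diff + z_crit * se


lower, upper = two_proportion_ci(114, 320, 112, 350)
print(f"({lower:.4f}, {upper:.4f})")
print("Interval contains 0:", lower <= 0 <= upper)
```

For the example data, the bounds should agree with the interval above to four decimal places.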