1. Intro

A Confidence Interval for a Population Proportion

The most obvious point estimate for the true (yet unknown) population proportion p is the sample proportion p̂ = x/n, where x is the number of individuals in the sample with the characteristic and n is the sample size. This means we are hoping that p̂ ≈ p. But how far off are we? Is our point estimate too high? Is it too low? Because p̂ came from one sample, we have no clue! Does this mean we should give up hope? NO!

We will rely on the sampling distribution of p̂, which was presented previously, to quantify the accuracy and precision of our point estimate p̂. Recall that two requirements on the sample size help to ensure, respectively, the independence of the values in the sample and the approximate normality of the distribution: (1) the sample size is no more than 5% of the population size (n ≤ 0.05 ⋅ N) and (2) np(1 – p) ≥ 10.
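As a rough illustration, the two requirements can be checked in a few lines of Python. This is only a sketch: the function name check_requirements and the numbers in the usage line are hypothetical, and because p is unknown in practice, the code substitutes p̂ for p in the second condition, which is an assumption rather than something stated above.

```python
def check_requirements(x, n, population_size):
    """Return (p_hat, independence_ok, normality_ok) for one sample.

    x: number of individuals in the sample with the characteristic
    n: sample size
    population_size: N, the size of the population sampled from
    """
    p_hat = x / n  # point estimate of p

    # Requirement (1): sample is no more than 5% of the population.
    independence_ok = n <= 0.05 * population_size

    # Requirement (2): np(1 - p) >= 10. The true p is unknown, so p_hat
    # stands in for it here (an assumption made for this sketch).
    normality_ok = n * p_hat * (1 - p_hat) >= 10

    return p_hat, independence_ok, normality_ok


# Hypothetical example: 120 of 400 sampled individuals have the
# characteristic, drawn from a population of 50,000.
p_hat, independence_ok, normality_ok = check_requirements(
    x=120, n=400, population_size=50_000
)
print(p_hat, independence_ok, normality_ok)
```

If either flag comes back False, the normal approximation to the sampling distribution of p̂ should not be trusted for that sample.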

We will never actually do this, but if we repeatedly obtained random samples to get other values of p̂, sometimes our point estimate would be too high and other times it would be too low. But we know something about how the values of p̂ are distributed around the true population proportion p: the majority of the time, p̂ will fall close to p, with lower and lower probabilities of p̂ falling further away from p. In fact, the central limit theorem guarantees that, when the two requirements above are met, the sampling distribution of p̂ is approximately normal with mean p and standard deviation √(p(1 – p)/n).
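Although we would never repeat the sampling in practice, a computer can do it for us. The sketch below draws many simulated samples from a population whose true proportion is known, and compares the spread of the resulting p̂ values with the standard normal-approximation value √(p(1 – p)/n). The values p = 0.3, n = 200, the number of repetitions, and the function name simulate_p_hats are all arbitrary choices made for this illustration.

```python
import math
import random
import statistics


def simulate_p_hats(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n; return each sample's p_hat."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    p_hats = []
    for _ in range(num_samples):
        # Each individual has the characteristic with probability p.
        x = sum(rng.random() < p for _ in range(n))
        p_hats.append(x / n)
    return p_hats


p, n = 0.3, 200  # assumed "true" proportion and sample size
p_hats = simulate_p_hats(p, n, num_samples=5000)

mean_p_hat = statistics.mean(p_hats)      # should be close to p
sd_p_hat = statistics.pstdev(p_hats)      # observed spread of p_hat
theoretical_sd = math.sqrt(p * (1 - p) / n)

print("mean of p_hat values:", mean_p_hat)
print("sd of p_hat values:  ", sd_p_hat)
print("sqrt(p(1 - p)/n):    ", theoretical_sd)
```

Some simulated values of p̂ land above p and some below, but they cluster around p, and their standard deviation should be close to the theoretical value.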


A level C confidence interval for an unknown parameter has two parts:

  1. An interval of numbers calculated from the data, consisting of our point estimate with some error allowed on either side of the point estimate.
  2. A level of confidence C that represents the probability that the interval will capture the true population value in repeated samples. In other words, our confidence level indicates the expected proportion of intervals that will contain the parameter if a large number of different samples are obtained. The level of confidence is denoted by C = (1 – α) ⋅ 100%.
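The "repeated samples" reading of the confidence level can also be illustrated by simulation. The sketch below assumes the usual normal-approximation interval p̂ ± z ⋅ √(p̂(1 – p̂)/n), where z is the standard normal critical value for the chosen level; that formula is not given in the text above and is used here only as an assumption. The names wald_interval and coverage, and the values p = 0.3, n = 200, and C = 95%, are hypothetical choices.

```python
import math
import random
from statistics import NormalDist


def wald_interval(x, n, confidence):
    """Normal-approximation interval for p (an assumed formula, see lead-in)."""
    p_hat = x / n
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, about 1.96 for 95%
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def coverage(p, n, confidence, num_samples, seed=0):
    """Fraction of simulated intervals that contain the true proportion p."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(num_samples):
        x = sum(rng.random() < p for _ in range(n))
        lower, upper = wald_interval(x, n, confidence)
        hits += lower <= p <= upper  # True counts as 1
    return hits / num_samples


observed = coverage(p=0.3, n=200, confidence=0.95, num_samples=5000)
print("fraction of intervals containing p:", observed)
```

The observed fraction should land near 0.95, though not exactly on it, because the interval rests on an approximation and the simulation itself is random. Any single interval either contains p or does not; the confidence level describes the long-run behavior of the procedure.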