6. Binomial Distribution

Many real-world situations involve repeating the same trial several times, each time with the same likelihood of success or failure. For example: a basketball player shooting free throws, tossing a coin repeatedly, or looking for errors in a production line. A set of repeated trials like this is called a binomial experiment.

A binomial experiment has four characteristics:

  1. The experiment is repeated a fixed number of times, called n. These repetitions are called observations or trials.
  2. Each trial results in only one of two possible outcomes, which we call either “success” or “failure.”
  3. The probability of success on a single trial does not change as we repeat the experiment from trial to trial and is called p. The probability of failure in each trial is then (1-p).
  4. The n observations or trials are independent. That is, knowing the result of one trial does not change the probabilities we assign to the other observations.

We are interested in x, the number of successes observed in the n trials, where x = 0, 1, 2, …, n. The count x of successes in a binomial experiment has a binomial distribution. This distribution has parameters n and p, where n is the number of trials and p is the probability of success on one trial. We write x~B(n,p) or x~Bin(n,p) to say that x has such a distribution.
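The four characteristics can be illustrated with a short simulation. This is only a sketch: the function name `run_binomial_experiment` is hypothetical, and it assumes that successive draws from Python's pseudo-random generator are a reasonable stand-in for independent trials.

```python
import random

def run_binomial_experiment(n, p, rng):
    """Run n trials with success probability p and return x, the number of successes."""
    successes = 0
    for _ in range(n):            # characteristic 1: a fixed number of trials
        if rng.random() < p:      # characteristics 2 and 3: success or failure, same p every trial
            successes += 1
    return successes              # characteristic 4: each draw is treated as independent of the others

rng = random.Random(0)            # seeded only so that reruns give the same count
x = run_binomial_experiment(10, 0.5, rng)
print(x)                          # some whole number between 0 and 10
```

Running the experiment many times and tallying the values of x would give an empirical picture of the binomial distribution B(10, 0.5).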

For example, a light bulb producer might be interested in how likely it is that he has shipped a defective bulb. Instead of testing just one light bulb, he takes 10 light bulbs and tests them all. If each light bulb has a probability of 0.99 of working, what is the probability that all 10 chosen light bulbs work? This question can be answered with the binomial distribution. To see how to calculate the distribution, we will use an example.

Example:

Suppose that as an experiment you flip a coin four times and record the results. You are interested in how likely you are to get exactly two heads in four flips of a fair coin, since two heads is the result you would most expect if the coin is fair.

For us, a success is getting heads on a flip, and there are several ways to get exactly two heads in four flips. It could happen through any of the following sequences:

{HHTT; HTHT; HTTH; THHT; THTH; TTHH}

We see that there are 6 ways to get exactly 2 heads when flipping a coin 4 times, and there are 2^4 = 16 total outcomes, all equally likely for a fair coin. Combining these counts, the probability of getting two heads is 6 out of 16, or:

$$P(x = 2) = \frac{6}{16} = 0.375$$
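The count of 6 favourable sequences out of 16 can be checked by listing every possible result of four flips. This is a minimal sketch using the standard library; the variable names are illustrative only.

```python
from itertools import product

# Every possible sequence of four flips, e.g. "HHTT"
outcomes = ["".join(flips) for flips in product("HT", repeat=4)]

# Keep only the sequences with exactly two heads
two_heads = [o for o in outcomes if o.count("H") == 2]

print(len(outcomes))                    # 16
print(two_heads)                        # ['HHTT', 'HTHT', 'HTTH', 'THHT', 'THTH', 'TTHH']
print(len(two_heads) / len(outcomes))   # 0.375
```

Listing outcomes works for four flips, but the number of sequences doubles with every extra flip, so it quickly becomes impractical for larger n.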

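But a more useful way to see this is as follows. Each of the 6 sequences consists of two heads and two tails, so each has probability (1/2)^2 (1/2)^2, and:

$$P(x = 2) = 6 \left(\frac{1}{2}\right)^{2} \left(\frac{1}{2}\right)^{2} = 6 \cdot \frac{1}{16} = 0.375$$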

It is important to notice that the probability calculation is the same for all 6 possibilities: each has two successes and two failures. The more difficult part was finding the number of ways to arrange two successes in four trials, that is, figuring out that there are 6 ways to arrange two heads in four flips. Fortunately, there is a known way to count this, called the binomial coefficient.

The binomial coefficient is the number of ways to arrange k successes among n observations and is given by the formula:

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$

Where k=0, 1, 2, … ,n.

Usually $\binom{n}{k}$ is read as “n choose k” and n! is read as “n factorial”. A factorial multiplies together every whole number from n down to 1, so 6! = 6 ∙ 5 ∙ 4 ∙ 3 ∙ 2 ∙ 1 = 720.

For our heads experiment we would have calculated this as:

$$\binom{4}{2} = \frac{4!}{2!\,(4-2)!} = \frac{24}{2 \cdot 2} = 6$$
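The factorial formula for the binomial coefficient translates directly into code. The sketch below assumes Python 3.8 or later, where `math.comb` is available as a built-in cross-check; the helper name `binomial_coefficient` is illustrative.

```python
import math

def binomial_coefficient(n, k):
    """Number of ways to arrange k successes among n trials: n! / (k! (n-k)!)."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

print(math.factorial(6))            # 720
print(binomial_coefficient(4, 2))   # 6
print(math.comb(4, 2))              # 6, the standard-library equivalent
```

Integer division (`//`) is safe here because the factorial ratio is always a whole number.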

This result allows us to easily describe the binomial probabilities. In our example, using the binomial coefficient gives us:

$$P(x = 2) = \binom{4}{2} \left(\frac{1}{2}\right)^{2} \left(\frac{1}{2}\right)^{2} = 6 \cdot \frac{1}{16} = 0.375$$

Additionally, this result will allow us to find the distribution of the binomial random variable.
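The coin example generalizes. Each particular arrangement of k successes and n-k failures has probability p^k (1-p)^(n-k), and there are $\binom{n}{k}$ such arrangements, which gives the usual binomial probability $P(x = k) = \binom{n}{k} p^{k} (1-p)^{n-k}$. A minimal sketch of this formula, with the hypothetical helper name `binomial_pmf` and assuming Python 3.8 or later for `math.comb`:

```python
import math

def binomial_pmf(k, n, p):
    """P(x = k) for x ~ B(n, p): (n choose k) * p^k * (1-p)^(n-k)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Two heads in four flips of a fair coin
print(binomial_pmf(2, 4, 0.5))      # 0.375

# The whole distribution for n = 4, p = 0.5
distribution = [binomial_pmf(k, 4, 0.5) for k in range(5)]
print(distribution)                 # [0.0625, 0.25, 0.375, 0.25, 0.0625]

# The light bulb question: all 10 bulbs work when each works with probability 0.99
print(binomial_pmf(10, 10, 0.99))   # about 0.904
```

The probabilities over k = 0, 1, …, n always add up to 1, which is a quick sanity check on any computed distribution.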

Binomial Mean and Standard Deviation

If x represents the count of successes in a binomial experiment with n trials and probability of success p, then what should the mean be? We can reason as follows: each trial succeeds with probability p, and we repeat the trial n times, so on average the number of successes should be np. For example, if a light bulb works with probability 0.99 and we turn it on 200 times, we expect 198 successes. So if x has a binomial distribution with n trials and probability of success p, then x has mean μ = np and standard deviation

$$\sigma = \sqrt{np(1-p)}$$
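As a quick numerical illustration, the mean μ = np and the standard binomial standard deviation $\sigma = \sqrt{np(1-p)}$ can be computed for the light bulb example. The function names below are illustrative only.

```python
import math

def binomial_mean(n, p):
    """Mean of x ~ B(n, p)."""
    return n * p

def binomial_std(n, p):
    """Standard deviation of x ~ B(n, p)."""
    return math.sqrt(n * p * (1 - p))

# A bulb that works with probability 0.99, switched on 200 times
print(binomial_mean(200, 0.99))   # approximately 198
print(binomial_std(200, 0.99))    # approximately 1.41
```

The small standard deviation says that the number of successes in 200 tries will rarely stray far from 198.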

Example (from Moore’s Basic Practice of Statistics)

Typing errors fall into two categories: “nonword” errors (typing teh for the), which will be caught by a word processor, and word errors (using form where the word from goes), which will not be caught by a word processor. Most human proofreaders will catch about 70% of all word errors. You ask a fellow student to check an essay for you, in which you deliberately placed 10 word errors.

  • (a) What is the probability that the student will catch exactly 2 errors?
  • (b) What is the probability that the student will catch 2 or fewer errors?
  • (c) What is the mean number of errors caught?
  • (d) What is the standard deviation of the number of errors caught?
  • (e) Suppose that the proofreader catches 90% of errors. What is the standard deviation then? What about 99%? What happens to the standard deviation as the probability of success increases toward 1?

Solution

  • (a) This is a binomial experiment with n = 10. If we define a success to be that the student catches the error, then p = 0.7. Since we want exactly two errors caught, we need to find the probability that x = 2.

$$P(x = 2) = \binom{10}{2} (0.7)^{2} (0.3)^{8} = 45 (0.49)(0.00006561) \approx 0.00145$$

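  • (b)
    $$P(x \le 2) = P(x = 0) + P(x = 1) + P(x = 2) = (0.3)^{10} + \binom{10}{1}(0.7)(0.3)^{9} + \binom{10}{2}(0.7)^{2}(0.3)^{8} \approx 0.0000059 + 0.0001378 + 0.0014467 \approx 0.00159$$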
  • (c) μ = np = 10(0.7) = 7, so on average the student catches 7 errors. This explains the small probabilities in (a) and (b): catching 2 or fewer errors is far below the expected count.
  • (d)
    $$\sigma = \sqrt{np(1-p)} = \sqrt{10(0.7)(0.3)} = \sqrt{2.1} \approx 1.449$$
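  • (e)
    For p = 0.9: $$\sigma = \sqrt{10(0.9)(0.1)} = \sqrt{0.9} \approx 0.949$$
    For p = 0.99: $$\sigma = \sqrt{10(0.99)(0.01)} = \sqrt{0.099} \approx 0.315$$
    σ decreases toward 0 as p increases toward 1.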
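The answers to the proofreading example can be checked numerically. The sketch below is not part of Moore's solution; the helper `binomial_pmf` is a hypothetical name that implements the standard binomial probability $\binom{n}{k} p^{k} (1-p)^{n-k}$, and it assumes Python 3.8 or later for `math.comb`.

```python
import math

def binomial_pmf(k, n, p):
    """P(x = k) for x ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.7   # 10 planted word errors, each caught with probability 0.7

prob_exactly_2 = binomial_pmf(2, n, p)                          # part (a)
prob_at_most_2 = sum(binomial_pmf(k, n, p) for k in range(3))   # part (b): k = 0, 1, 2
mean_caught = n * p                                             # part (c)

print(f"{prob_exactly_2:.6f}")   # 0.001447
print(f"{prob_at_most_2:.6f}")   # 0.001590
print(f"{mean_caught:.1f}")      # 7.0

# Parts (d) and (e): the standard deviation shrinks as p approaches 1
std_by_p = {q: math.sqrt(n * q * (1 - q)) for q in (0.7, 0.9, 0.99)}
for q, std in std_by_p.items():
    print(q, f"{std:.3f}")       # 0.7 1.449, then 0.9 0.949, then 0.99 0.315
```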