2. Four Parts of a Hypothesis


Let’s quickly investigate the four main parts of any hypothesis test:

1. The Null and Alternative Hypotheses

In statistics, a hypothesis is a statement, or assumption, about the characteristics of one or more variables in one or more populations. Since a statement can be either true or false, there are two hypotheses to identify.

  • The null hypothesis, denoted by H0, is the statement about a value of a population parameter that we intend to test. H0 is the statement of “no difference,” the status quo that we (or someone else) believe to be true, and thus it always contains the condition of equality. The conclusions of our hypothesis test will be either: “reject H0” or “do not reject H0.” Keep in mind that we always assume the null hypothesis to be true until we get evidence from sample data that suggests otherwise.
  • The alternative hypothesis, denoted by Hα (or H1 in other resources), must be what is true if the null hypothesis is false. There are three ways to differ from the value stated in the null hypothesis: larger than, smaller than, or just plain different (not equal). Therefore, the notation for Hα will always contain a condition of inequality. If Hα contains either of the inequalities > or <, we call the hypothesis test a one-tail test, since the difference can lie in only one direction from the value in H0. If Hα contains the inequality ≠, we call the hypothesis test a two-tail test, since being not equal to something means you could be less than or greater than the given value.

To see the difference between a one-tail test and a two-tail test, imagine that I ask you to guess a number between 1 and 10, and in addition I tell you that the number is greater than 5. What numbers will you guess? You will choose numbers in only one direction from 5…those greater than 5. You won’t bother guessing numbers below 5, since I specified the direction in which you should guess. Now suppose I ask you to guess a number between 1 and 10, and all I tell you is that the number is different from (i.e., not equal to) 5. Which numbers will you guess this time? You now have to choose numbers in two directions from 5…those less than 5 and those greater than 5.

2. The Test Statistic

Once we have a statement of hypothesis, the next step is to analyze sample data (collected appropriately…see sampling in subcompetency 5) using both graphical and numerical summaries (subcompetencies 1 and 2, respectively). We specifically want numerical summaries that can be compared with what is being claimed in the null hypothesis. The hypothesis tests you will run in this course focus on either population means (μ vs. the sample mean x̄) or population proportions (p vs. the sample proportion p̂).

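The nice thing about computing a test statistic is that it is a computation you are already familiar with. In subcompetency 3 you were introduced to the z-score

$$z = \frac{x - \mu}{\sigma}$$

which in English terms is:

$$z = \frac{\text{observed value} - \text{mean}}{\text{standard deviation}}$$

For hypothesis testing, we’ll use the same form but slightly modify the terms:

$$\text{test statistic} = \frac{\text{sample statistic} - \text{value claimed in } H_0}{\text{standard error of the sample statistic}}$$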
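As a minimal sketch of that modified form, a test statistic can be computed as (sample statistic − value claimed in H0) / standard error. The function names and the numbers below are hypothetical, and the mean version assumes the population standard deviation σ is known, which the text does not specify.

```python
import math

def z_for_mean(sample_mean, null_mean, sigma, n):
    """z test statistic for a population mean (assumes sigma is known)."""
    standard_error = sigma / math.sqrt(n)
    return (sample_mean - null_mean) / standard_error

def z_for_proportion(sample_prop, null_prop, n):
    """z test statistic for a population proportion.

    The standard error uses the null value, because H0 is assumed true.
    """
    standard_error = math.sqrt(null_prop * (1 - null_prop) / n)
    return (sample_prop - null_prop) / standard_error

# Hypothetical numbers, for illustration only.
print(z_for_mean(sample_mean=52.0, null_mean=50.0, sigma=8.0, n=64))          # 2.0
print(round(z_for_proportion(sample_prop=0.6, null_prop=0.5, n=100), 3))      # 2.0
```

In both cases the sample result sits about two standard errors above the value claimed in H0.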


3. Probability Values and Statistical Significance

The key to making an appropriate conclusion to a hypothesis test is to identify results that are statistically significant. When results observed from the sample are unlikely under the assumption that the null hypothesis is true, we say the result is statistically significant. If our sample results are unlikely, then we reject the null hypothesis H0. Repeat this to yourself over and over: if we have statistically significant results, we reject H0. In other words, if the difference between our sample result and the null hypothesis claim is large (which results in a really small probability value), then we reject H0.

There are actually TWO ways to proceed with hypothesis testing, the “classical” method and the P-value (probability value) approach, and both give the same conclusions to the test. For this class we will use the P-value approach in our hypothesis testing, as it is far more prevalent in scientific research. Our goal is to calculate the probability of observing a sample statistic as extreme as, or more extreme than, the one observed from our sample under the assumption that the null hypothesis is true. Since AREA = PROBABILITY, a P-value is the total area in the tail(s) of the distribution beyond our test statistic. The main question we are trying to answer is: Could random variation alone account for the difference between the null hypothesis and our observations from a random sample? A small P-value implies that random variation through the sampling process alone is not likely to account for the observed difference. Therefore, with a small P-value, we reject H0 and are led to believe that the true value of the population parameter is significantly different from what was stated in H0. Small P-values are strong evidence against H0.

Once we know the value of our test statistic and whether our hypothesis test is a one-tail or two-tail test, we determine the P-value. Keeping in mind the idea of statistical significance, our hypothesis test boils down to two situations:

  • A small P-value implies that seeing sample results like ours is very unlikely if the null hypothesis is true. This is considered significant evidence that the null hypothesis is not true. Therefore, a small P-value means we need to reject the null hypothesis H0.
  • A large P-value implies that it is not unlikely to see our sample results given the null hypothesis is true. In this case we do not have significant evidence against the null hypothesis, so it could be true, and we do not reject the null hypothesis H0. Notice, we never conclude that the null hypothesis is indeed true!
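The step from test statistic to P-value can be sketched as follows, assuming the test statistic is a z-score that follows the standard normal distribution when H0 is true. The function names and the labels "greater", "less", and "two-sided" are hypothetical choices for this illustration.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def p_value(z, alternative):
    """P-value for a z test statistic.

    alternative: "greater" (Ha uses >), "less" (Ha uses <),
    or "two-sided" (Ha uses ≠).
    """
    if alternative == "greater":
        return normal_cdf(-z)              # area in the right tail
    if alternative == "less":
        return normal_cdf(z)               # area in the left tail
    if alternative == "two-sided":
        return 2.0 * normal_cdf(-abs(z))   # area in both tails
    raise ValueError(f"unknown alternative: {alternative!r}")

# The same test statistic gives different P-values for different alternatives.
print(round(p_value(2.0, "greater"), 4))    # 0.0228
print(round(p_value(2.0, "two-sided"), 4))  # 0.0455
```

For the same test statistic, the two-tail P-value is twice the one-tail value, because it counts the area in both directions from the null value.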

How small is “small”? We compare the P-value with the level of significance of our hypothesis test, denoted by alpha, α: if the P-value is less than or equal to α, we reject H0; otherwise, we do not reject H0. Typical values for α are: α = 0.10, α = 0.05, or α = 0.01. The value of α chosen for a hypothesis test must be reported using language such as, “Our hypothesis test has a level of significance α = 0.01.”
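A minimal sketch of how α is used: the P-value is compared with α, and the result is one of the two allowed statistical conclusions. The function name is hypothetical, and rejecting when the P-value is less than or equal to α is the common convention assumed here.

```python
def conclude(p, alpha=0.05):
    """Return the statistical conclusion for P-value p at significance level alpha."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a P-value must lie between 0 and 1")
    return "reject H0" if p <= alpha else "fail to reject H0"

# The same P-value can lead to different conclusions at different levels.
for alpha in (0.10, 0.05, 0.01):
    print(alpha, conclude(0.03, alpha))
# 0.1 reject H0
# 0.05 reject H0
# 0.01 fail to reject H0
```

This is why the chosen value of α has to be reported along with the conclusion.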

4. The Conclusions of Hypothesis Testing

It is important to keep in mind that the statistical results from a hypothesis test only deal with the null hypothesis H0. There are only two statistical conclusions to make at the end of a hypothesis test: “we reject H0” or “we fail to reject H0.” The statistical conclusions never deal with the alternative hypothesis Hα. Moreover, we never say, “we accept H0.” It is either, “we reject H0” or “we fail to reject H0.”

After stating the statistical conclusions, it is important to write a sentence or two about what our conclusions are in a way that a non-statistics person can understand. Usually this sentence starts out with, “Our data provides sufficient evidence that…” or, “Our data does not provide sufficient evidence that…” It will take lots of practice to become comfortable making the proper conclusions from a hypothesis test.

Concluding Example

Returning to the example at the beginning of our discussion, a hypothesis test would look like the following:

  1. Statement of Null and Alternative Hypotheses
    • H0: μ = 98.6°
    • Hα: μ ≠ 98.6°
  2. Calculate the Test Statistic
    • [Equation image “section9-5”: computation of the test statistic from the sample data; not available here]
  3. Compute the Probability Value (Two-Tail Test)
    • Using a table of probability values, all we’d be able to say is that the probability of seeing such a test statistic, assuming that 98.6°F is indeed the average human temperature, is much less than 0.0005 (or 0.05%). Relying on technology to calculate the probability, we find a P-value of 0.00000000142, which is as close to 0 as we’d ever care to see.
  4. Make Conclusions
    • As our P-value is lower than any commonly used level of significance (such as α = 0.05), we would state:
    • “We reject the null hypothesis H0: μ = 98.6°. Our sample data provides sufficient evidence that the average human body temperature is significantly different from 98.6°.”
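The four steps above can be strung together in a short sketch. The sample summary below (n, sample mean, σ) is hypothetical and invented purely for illustration; it is not the data behind this example and will not reproduce the P-value quoted in step 3. The sketch also assumes a z test with known σ.

```python
import math

# Step 1: hypotheses. H0: mu = 98.6, Ha: mu != 98.6 (two-tail test).
null_mean = 98.6
alpha = 0.05

# HYPOTHETICAL sample summary, invented for illustration only;
# these are NOT the data behind the example in the text.
n = 100
sample_mean = 98.3
sigma = 0.7  # assumed known population standard deviation

# Step 2: test statistic = (sample statistic - null value) / standard error.
z = (sample_mean - null_mean) / (sigma / math.sqrt(n))

# Step 3: two-tail P-value = area in both tails beyond |z|.
p = math.erfc(abs(z) / math.sqrt(2))

# Step 4: statistical conclusion, which only ever refers to H0.
conclusion = "reject H0" if p <= alpha else "fail to reject H0"
print(f"z = {z:.2f}, P-value = {p:.2g}, conclusion: {conclusion}")
```

With these made-up numbers the sample mean lies more than four standard errors below 98.6, so the P-value is far below α and H0 is rejected.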

That’s it… the entire process of carrying out a hypothesis test. To conclude this subcompetency, it must be acknowledged that even with careful data collection and precise test statistic and P-value computations, mistakes (errors) can be made! Remember, we base our conclusions on sample data. Even using proper experimental design with randomization throughout the process, sometimes by chance our sample data may not adequately represent the overall population. There is nothing we can do about that! This means there is a possibility that we could draw an inaccurate conclusion from the hypothesis test!
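A small simulation can make this point concrete. In the sketch below, H0 is true by construction, yet some random samples still lead us to reject it. All settings (population mean and standard deviation, sample size, number of trials, seed) are hypothetical, and the test is a two-tail z test with known σ.

```python
import math
import random

def two_sided_p(z):
    """Two-tail P-value for a z test statistic (standard normal)."""
    return math.erfc(abs(z) / math.sqrt(2))

def false_rejection_rate(trials=2000, n=30, mu=98.6, sigma=0.7,
                         alpha=0.05, seed=0):
    """Share of random samples that reject H0: mean = mu, although H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 really holds.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu) / (sigma / math.sqrt(n))
        if two_sided_p(z) <= alpha:
            rejections += 1
    return rejections / trials

print(false_rejection_rate())
```

The printed share should land near α, since α is the probability of seeing a result this extreme by chance alone when H0 is true.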
