3. The chi-Squared Test

Recall the problems that considered the difference of two proportions: the method took a binomial variable with probability of success p and compared it with another proportion in order to find a statistically significant difference. The χ²-test extends this concept to multinomial trials, where there are k outcomes with probabilities p1, p2, … , pk. The χ²-test compares the observed counts with the counts expected when the null hypothesis H0 is assumed to be true. The chi-squared test statistic is only appropriate when all of the expected counts are greater than or equal to 5.

The test statistic for the chi-squared test measures how far the observed values are from the expected values across all cells of the two-way table.

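$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Here O is the observed count and E is the expected count in a cell, and the sum runs over all cells of the table.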

Returning to our example, our test statistic is:

[Equation: χ² test statistic computed for the example]

Large values of the test statistic indicate that the observed and expected values are far apart, or rather that the distributions are different. This gives us evidence to suggest that our null hypothesis is not true. Be careful: as we will see later, the χ² distribution is not symmetric. The null hypothesis can be violated in many ways and directions, but the chi-squared test is one-sided, because any violation of the null hypothesis produces a large test statistic. Small values of χ² are not evidence against the null hypothesis.

The chi-Squared Distribution

The shape of the chi-squared distribution depends on the degrees of freedom present in our data, just as was true for the Student’s t-distribution. For this reason, p-values for this distribution are best calculated using technology or tables. Since there are two categorical variables to consider, the degrees of freedom are k = (r – 1)(c – 1), where r and c are the number of rows and columns, respectively. The graph below shows how the chi-squared distribution looks for three different values of k.

[Figure: chi-squared distribution for three different values of k]
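To make the dependence on k concrete, here is a minimal Python sketch that evaluates the standard chi-squared density, f(x; k) = x^(k/2 − 1) e^(−x/2) / (2^(k/2) Γ(k/2)). This textbook formula is not stated in the text above, the function name chi2_pdf is made up for illustration, and the values k = 2, 4, 6 are an assumption (the three values used in the figure are not known).

```python
import math

def chi2_pdf(x, k):
    """Chi-squared density with k degrees of freedom, for x > 0."""
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))

# Illustrative degrees of freedom (an assumption, not taken from the figure).
for k in (2, 4, 6):
    row = ", ".join(f"{chi2_pdf(x, k):.4f}" for x in (1, 3, 6, 12))
    print(f"k = {k}: density at x = 1, 3, 6, 12 -> {row}")
```

Printing a few values per k is enough to see that the density is skewed to the right, with a long upper tail.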

Example 2

The article “Determination of Carboxyhemoglobin Levels and Health Effect on Officers Working at the Istanbul Bosphorus Bridge” (G. Kocasoy and H. Yalin, Journal of Environmental Science and Health, 2004: 1129-1139) presents assessments of health outcomes of people working in an environment with high levels of carbon monoxide (CO). The following table gives the numbers of workers reporting various symptoms, categorized by work shift. Can you conclude that the proportions of workers with the various symptoms differ among shifts?

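| Ailment | Morning Shift | Evening Shift | Night Shift | Totals |
|---|---|---|---|---|
| Influenza | 16 | 13 | 18 | 47 |
| Headache | 24 | 33 | 6 | 63 |
| Weakness | 11 | 16 | 5 | 32 |
| Shortness of Breath | 7 | 9 | 9 | 25 |
| Total | 58 | 71 | 38 | 167 |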

 

Solution: First we state the null hypothesis, H0: there is no difference in the proportion of workers with the various symptoms between the shifts. Under H0, the expected count in each cell is (row total × column total) / grand total, which generates the following expected table.

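Expected

| Ailment | Morning Shift | Evening Shift | Night Shift | Totals |
|---|---|---|---|---|
| Influenza | 16.323 | 19.982 | 10.695 | 47 |
| Headache | 21.880 | 26.784 | 14.335 | 63 |
| Weakness | 11.114 | 13.605 | 7.281 | 32 |
| Shortness of Breath | 8.683 | 10.629 | 5.689 | 25 |
| Total | 58 | 71 | 38 | 167 |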
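The expected counts can be reproduced from the observed table with a few lines of Python. This is a minimal sketch using only the counts from Example 2; the variable names are illustrative.

```python
# Observed counts from Example 2 (rows: ailments; columns: morning, evening, night).
observed = [
    [16, 13, 18],  # Influenza
    [24, 33, 6],   # Headache
    [11, 16, 5],   # Weakness
    [7, 9, 9],     # Shortness of Breath
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under H0: (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 3) for e in row])

# The chi-squared test requires every expected count to be at least 5.
print(min(min(row) for row in expected) >= 5)
```

The last line checks the condition on expected counts that makes the chi-squared statistic appropriate for this table.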

 

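Next we calculate the χ² test statistic by summing (observed − expected)² / expected over all twelve cells of the table.

$$\chi^2 = \sum \frac{(O - E)^2}{E} = 17.570$$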
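As a check on the arithmetic, the statistic can be computed directly from the observed counts. The sketch below recomputes the expected counts at full precision, so its result may differ from 17.570 in the third decimal place, since hand calculations typically use rounded expected counts.

```python
# Observed counts from Example 2 (rows: ailments; columns: morning, evening, night).
observed = [
    [16, 13, 18],  # Influenza
    [24, 33, 6],   # Headache
    [11, 16, 5],   # Weakness
    [7, 9, 9],     # Shortness of Breath
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Sum (O - E)^2 / E over every cell of the table.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(round(chi_sq, 3), df)
```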

If you are familiar with Excel, you can create a table of the actual counts and a table of the expected counts, and then use the command CHISQ.TEST(actual_range, expected_range). Note that this command returns the p-value of the test directly, not the χ² test statistic. To find the p-value from the test statistic instead, first calculate the degrees of freedom, k = (4 – 1)(3 – 1) = 6, and then use the command CHIDIST(x, degrees_freedom). For this example, CHIDIST(17.570, 6) = 0.007402. This p-value is less than 5%, so there is enough evidence to reject the null hypothesis. We conclude that there is evidence to suggest that the proportion of workers with the various symptoms differs among the shifts.
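Outside of Excel, the same upper-tail probability can be sketched in plain Python. The version below relies on a closed form that holds only for an even number of degrees of freedom (which covers this example, with 6); the function name chi2_sf_even_df is hypothetical, and this is not how Excel computes CHIDIST internally.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with EVEN df.

    For even df the tail has the closed form
    exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

# Test statistic and degrees of freedom from Example 2.
p_value = chi2_sf_even_df(17.570, 6)
print(round(p_value, 6))
print(p_value < 0.05)  # True means H0 is rejected at the 5% level
```

For odd degrees of freedom a statistics library or a table would be needed instead.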

Alternatively, we could calculate a critical value from the χ² distribution with 6 degrees of freedom and compare it with our test statistic, 17.570. This can be done from a table or in Excel using the CHIINV(probability, degrees_freedom) command. Please note that this gives the critical value for the upper one-tailed test. In our example, the critical value is CHIINV(0.05, 6) = 12.592. The critical value is less than our test statistic, as 12.592 < 17.570. Hence our test statistic falls in the rejection region, which agrees with our earlier conclusion.
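The critical-value approach can also be sketched in Python by searching for the point whose upper-tail probability equals 0.05. The bisection search, the function names, and the search bounds below are illustrative assumptions, and the closed-form tail used here is valid only for even degrees of freedom.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with even df."""
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

def chi2_critical_value(alpha, df, lo=0.0, hi=200.0, tol=1e-9):
    """Find x with P(X > x) = alpha by bisection (the tail decreases as x grows)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_sf_even_df(mid, df) > alpha:
            lo = mid  # tail still larger than alpha: critical value lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

critical = chi2_critical_value(0.05, 6)
print(round(critical, 3))
print(17.570 > critical)  # True means the statistic falls in the rejection region
```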

Additional Uses of the chi-Squared Test

The chi-squared distribution can also be used for tests of significance about the variance or the standard deviation of a normal distribution. It can also be used to test the goodness of fit of a theoretical model against sample data.

Chi-Squared Testing with the TI-83/84

All of these tests can be found by pressing the [STAT] button and arrowing over to the TESTS menu.

Calculator Example: The Chi-Squared Goodness of fit test.

If births were uniformly distributed across the week, we would expect about 1/7 of all births to occur on each day of the week. How closely do the observed numbers of births fit this expected distribution? The chi-square goodness-of-fit test is used to determine whether an observed frequency distribution is significantly different from the expected distribution, that is, how well the two distributions fit each other. If we were only interested in one day of the week, we could conduct a 1-proportion z test. However, because we have seven hypothesized proportions, we need a test that considers all of them together and gives an overall indication of whether the observed distribution differs from the expected one. The chi-square goodness-of-fit test is just what we need. Let’s consider the frequency distribution of all 2008 Wisconsin births by day of the week.

[Table: frequency distribution of 2008 Wisconsin births by day of the week]

NOTE: This is not an option on all calculators; yours must have the GOF test on it.

Solution for the TI-84: 

1. Enter the observed data into L1.

2. Here we are hypothesizing that births occur in equal proportions on every day of the week. Compute the expected frequencies as Expected = n/k, where n is the total number of trials (births) and k is the number of categories (days of the week). For this example, E = 116823/7 = 16689 for every day, since the hypothesized proportions are all the same. Enter 16689 into all the rows of L2.

[Screenshot: list editor with the observed counts in L1 and the expected counts in L2]

3. Now press [STAT], arrow over to the TESTS menu, arrow down to D: χ²GOF-Test, and press ENTER. Then enter the following on the screen:

[Screenshot: χ²GOF-Test input screen with Observed: L1, Expected: L2, df: 6]

4. The degrees of freedom are k – 1 = 6. Highlight Calculate and press ENTER to get:

[Screenshot: χ²GOF-Test results screen]

5. The calculator gives you both the value of the χ² test statistic and its associated P-value. CNTRB provides a list of the contributions (CoNTRiButions) of each category to the overall χ² value. Use the arrow keys to scroll through these numbers. Round chi-square values to 3 decimal places and P-values to 3 significant figures. You could report these results as P(χ² > 3679.867) ≈ 0.
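The numbers in these steps can be checked with a short Python sketch. It uses only values quoted in the steps (n = 116823, k = 7, and the reported statistic 3679.867); the observed daily counts are not reproduced here, so the statistic itself is taken as given rather than recomputed. The closed-form tail probability used below is valid because the degrees of freedom are even.

```python
import math

n = 116823  # total number of births, from step 2
k = 7       # number of categories (days of the week)

expected = n / k  # expected count per day under H0
df = k - 1        # degrees of freedom for the goodness-of-fit test

# Test statistic reported by the calculator (taken as given).
chi_sq = 3679.867

# Upper-tail probability for even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
half = chi_sq / 2
p_value = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

print(expected, df, p_value)
```

The tail probability is so small that it underflows to zero in floating-point arithmetic, which matches the reported P-value of approximately 0.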

What does this mean?

If births were in fact distributed uniformly across the seven days of the week, a χ² value of 3679.867 or larger would occur about 0% of the time. This result is certainly unusual, so we reject H0 and conclude that the sample data are consistent with births being non-uniformly distributed across the seven days of the week.