Biology 205

Hypothesis Testing in Genetics: The Chi-Square Test

Suppose you flip a coin 100 times. If the coin is fair, or unbiased, you expect to get heads on about half of the tosses and tails on the other half. But how do you know that the coin really is fair and not biased in some way? If you are a gambler and want to know whether a die is fair or biased in some way, how would you test this hypothesis? If you are a geneticist and want to know whether the results from the F2 generation of a monohybrid cross are consistent with Mendel's law of segregation, how would you test this?

 

All three of these questions can be dealt with using a very powerful statistical test called the chi-square goodness of fit test. The basic idea of this test is to calculate a particular statistic called the observed chi square statistic. Once we have the observed statistic, we ask mathematically: what is the probability that this statistic or a larger one will occur if the data really are a random sample of experiments governed by the hypothesis being tested?

 

Let us see how this test works with a simple example. Consider these results tabulated from a series of 100 coin tosses.

Outcome   Observed number, O_i   Expected number, E_i   (O_i - E_i)^2/E_i
Heads     60                     50                     (60 - 50)^2/50 = 2.0
Tails     40                     50                     (40 - 50)^2/50 = 2.0
Total     100                    100

Observed chi square: $\chi^2_{\text{observed}} = 2.0 + 2.0 = 4.0$

 

 

The observed number column represents the actual number of heads and tails gotten by flipping the coin 100 times. The expected number is what we expect to find on average if the coin is fair. The column on the right shows how to calculate an observed chi-square statistic, which is a sum:

$\chi^2_{\text{observed}} = \sum_i (O_i - E_i)^2 / E_i$.
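The sum above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the coin table; the function name `chi_square_statistic` is my own choice and not part of any library.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O_i - E_i)^2 / E_i over all outcome classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Coin example from the table: 60 heads and 40 tails in 100 tosses,
# with 50 of each expected if the coin is fair.
coin_chi2 = chi_square_statistic([60, 40], [50, 50])
print(coin_chi2)  # 4.0
```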

 

The beauty of this statistic is that if the data really are a random sample from all possible experiments in which the hypothesis being tested is correct, then the observed chi square statistic varies randomly in a predictable way. Its distribution can be shown to follow, to a good approximation, a mathematical function called the chi square distribution with N-1 degrees of freedom, where N is the number of outcome classes, or:

$\chi^2_{N-1}$.

This is actually a complex function, but fortunately for us, we generally do not need to work directly with it, and when we do, many software packages, including spreadsheets, can calculate it for us. In statistics we usually work with a special form of the chi square called the cumulative distribution function. One minus this function gives the upper-tail probability, that is, the probability that a chi square as large as the observed one or larger will happen by chance if the statistic really does follow a chi square distribution with N-1 degrees of freedom, as it does when our data fit our hypothesis. This is why the chi square tests commonly used in statistics are often called goodness of fit tests.
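For readers who want to see the upper-tail probability computed without a spreadsheet, here is a minimal Python sketch. It relies on the closed-form expressions that exist for whole-number degrees of freedom and uses only the standard `math` module. The function name `chi_square_upper_tail` is hypothetical, and a statistics package would normally be used instead.

```python
import math


def chi_square_upper_tail(x, df):
    """P(chi square >= x) for a whole number of degrees of freedom (df >= 1)."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive whole number")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum of (x/2)^j / j! for j = 0 .. df/2 - 1
        term = 1.0
        total = 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus exp(-x/2) * sum of (x/2)^(j - 1/2) / Gamma(j + 1/2)
    term = math.sqrt(half) / math.gamma(1.5)  # the j = 1 term
    total = 0.0
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (j + 0.5)  # advance to the next term in the sum
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# Coin example: observed chi square of 4.0 with 1 degree of freedom.
print(round(chi_square_upper_tail(4.0, 1), 4))  # 0.0455
```

The result, a little under 0.05, says that a chi square of 4.0 or more would arise by chance in fewer than 5% of experiments with a fair coin.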

The idea of the chi square test is that if the observed chi square value is very large compared to the theoretical chi square expected for the appropriate degrees of freedom, then the data do not fit what is expected if the hypothesis being tested is correct. Often this comparison is done using chi square tables such as this one, which I created using the chi square function in a common spreadsheet program.

The column down the left side of the table is used to find the row that holds the chi square values for the proper degrees of freedom. Degrees of freedom (df) is best understood as the number of possible outcome classes minus one. So for the coin example, the appropriate df is 1. Put another way, the degrees of freedom is the number of classes whose counts you need in order to specify the data completely for a particular total number of outcomes. In the coin example the total number of classes is 2, but you really only need to know the number of heads along with the total number of tosses.

The other columns give the probability that a chi square as large as or larger than the one in the intersecting row for a particular df will happen by chance if the data really do fit the hypothesis being tested. Generally we pick a certain probability in advance of doing the experiment. In statistics this probability is often set at 0.05. This probability and the appropriate degrees of freedom allow us to specify what is often called a critical statistic. The observed chi-square statistic is compared to the critical statistic, and the hypothesis being tested is rejected if and only if the observed chi-square statistic is as large as or larger than the critical statistic.

For our coin data, the observed chi square is 4.0 and the critical chi square statistic, the chi square at p = 0.05 and 1 df is about 3.84. Since 4.0 is greater than 3.84 then we reject the hypothesis that the coin is fair.
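The critical value of about 3.84 can be recovered numerically. The sketch below handles only the 1 df case, where the upper-tail probability reduces to the complementary error function, and finds the critical value by bisection. The helper names and the search bounds are assumptions made for illustration.

```python
import math


def upper_tail_1df(x):
    """P(chi square >= x) for 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) if x > 0 else 1.0


def critical_value_1df(p, lo=0.0, hi=100.0, tol=1e-9):
    """Find x such that the upper-tail probability equals p (1 df only)."""
    # The upper-tail probability falls steadily as x grows, so bisection works.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if upper_tail_1df(mid) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


critical = critical_value_1df(0.05)
observed_chi2 = 4.0  # from the coin table
print(round(critical, 2))         # 3.84
print(observed_chi2 >= critical)  # True, so the fair-coin hypothesis is rejected
```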

A couple of points need to be made about this sort of hypothesis testing. First of all, the hypothesis could in fact be correct, but a chi square statistic as large as or larger than the critical statistic for p = 0.05 will happen only about 5% of the time if the hypothesis is correct. So picking a small p such as 0.05 means that your probability of rejecting the hypothesis when it is in fact true is only about 5%. Thus an observed chi square that is larger than the critical chi square gives you confidence that you are not rejecting a true hypothesis.
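The "5% of the time" claim can be checked for the coin example by adding up exact binomial probabilities rather than running the experiment. The following sketch assumes 100 tosses of a truly fair coin and the critical value 3.84; the function name is hypothetical.

```python
from math import comb


def rejection_rate_fair_coin(n_tosses=100, critical=3.84):
    """Exact probability that a fair coin gives a chi square >= critical."""
    expected = n_tosses / 2
    rejecting_outcomes = 0
    for heads in range(n_tosses + 1):
        tails = n_tosses - heads
        chi2 = (heads - expected) ** 2 / expected + (tails - expected) ** 2 / expected
        if chi2 >= critical:
            # Number of equally likely toss sequences with this many heads.
            rejecting_outcomes += comb(n_tosses, heads)
    return rejecting_outcomes / 2 ** n_tosses


rate = rejection_rate_fair_coin()
print(round(rate, 3))
```

The rate comes out close to, but not exactly, 0.05. Head counts come in whole numbers, so the chi square distribution is only an approximation to how the statistic actually behaves.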

Next, there is nothing sacred about p = 0.05. If you are conservative and demand a high degree of confidence before rejecting a hypothesis, then you might set p = 0.01 or even p = 0.001. Finally, the p value should not be interpreted as the probability that the hypothesis is wrong. The hypothesis used to calculate the observed chi square is either correct or not, so saying that a hypothesis has a certain probability of being correct does not really make sense, at least with this type of hypothesis testing.

What we have said can be distilled into a general hypothesis testing procedure:

Step 1. Specify the hypothesis to test. This typically has to be a hypothesis that makes a specific prediction. For instance, the hypothesis that a coin is biased, or not fair, does not make a specific prediction about the number of heads and tails you might observe if the coin is tossed 100 times. This is because there are an infinite number of ways the coin can be biased. But if you specify that the coin is fair, there is only one such hypothesis, namely that P(Head) = 0.5 and P(Tail) = 0.5. Sometimes the hypothesis that allows you to make a specific prediction is called the null hypothesis, because it usually assumes that the outcomes are due strictly to chance and are unbiased.

Step 2. Specify the critical statistic against which you will compare the observed statistic. This usually means for the chi square, specifying the probability level and the appropriate degrees of freedom.

Step 3. Collect the data and calculate the observed chi square.

Step 4. Reject the null hypothesis if and only if the observed chi-square is as large as or larger than the critical chi-square.

Application to genetics.

The chi square test can be used to test deviations from specific genetic hypotheses. For example, the expected phenotypic numbers given a theoretical 3 : 1 ratio for the F2's from a monohybrid cross can be tested against the observed numbers. A common use of the chi square is to detect linkage between two genes using the results of a test cross.
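As a sketch of the four-step procedure applied to a 3 : 1 ratio, the Python below uses Mendel's published F2 counts for round versus wrinkled seeds (5474 and 1850). These counts are brought in purely for illustration and are not part of this handout.

```python
# Step 1. Hypothesis: the F2 phenotypes segregate 3 : 1 (dominant : recessive).
ratio = (3, 1)

# Step 2. Critical statistic: p = 0.05 with 2 classes - 1 = 1 degree of freedom.
critical = 3.84

# Step 3. Collect the data and calculate the observed chi square.
# Illustrative data: Mendel's round vs. wrinkled seed counts (not from this handout).
observed = [5474, 1850]
total = sum(observed)
expected = [total * part / sum(ratio) for part in ratio]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Step 4. Reject only if the observed chi square reaches the critical value.
reject = chi2 >= critical

print(expected)        # [5493.0, 1831.0]
print(round(chi2, 3))  # 0.263
print(reject)          # False: the data are consistent with a 3 : 1 ratio
```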

Consider the following example examined by Thomas Hunt Morgan and discussed in Griffiths et al. (1993). He investigated linkage between the gene for purple eye color (pr) and the gene for vestigial wing (vg). He first mated flies known to be pure breeding for purple eye and vestigial wing with wild type flies pure breeding for both traits. Let pr be the allele for purple eye and vg be the allele for vestigial wing. Both of these alleles are recessive. For clarity, let pr+ and vg+ be the corresponding wild type alleles. The F1s are all of genotype pr pr+ vg vg+, written in our chromosomal notation as pr vg / + +.

When Morgan test crossed F1 females with pr vg / pr vg males, he obtained the following results:

Genotype           Number observed, O_i   Number expected, E_i   (O_i - E_i)^2/E_i
pr+ vg+ / pr vg    1139                   659.75                 348.13
pr+ vg / pr vg     151                    659.75                 392.31
pr vg+ / pr vg     154                    659.75                 387.70
pr vg / pr vg      1195                   659.75                 434.24
Total              2639                   2639                   1562.38

The observed chi square, 1562.4, is clearly much larger than the critical chi square ($\chi^2_{3\,\mathrm{df}} = 7.81$) at the 0.05 level of significance. Thus the hypothesis that the genes are unlinked is rejected. In this case the results are very likely to be related to linkage. However, one must be very careful to check that the hypothesis being tested is appropriate for the use of a simple chi square. Deviations from the expected 1 : 1 : 1 : 1 ratio may also arise if the zygotes produced by a particular female mated with the tester male differ in viability. Further, even if linkage exists, viability differences among the crossover gametes will inflate the $\chi^2$ statistic, leading to a greater chance of rejecting the hypothesis that the genes are unlinked.
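The terms for Morgan's data can be recomputed directly from the observed counts. The sketch below is one way to do so in Python; the dictionary keys label each class by the chromosome contributed by the F1 female, and the variable names are my own.

```python
# Observed test cross counts, keyed by the chromosome inherited from the F1 female.
observed = {"pr+ vg+": 1139, "pr+ vg": 151, "pr vg+": 154, "pr vg": 1195}

total = sum(observed.values())
# With no linkage, the four classes are expected in a 1 : 1 : 1 : 1 ratio.
expected = total / 4

terms = {k: (o - expected) ** 2 / expected for k, o in observed.items()}
chi2 = sum(terms.values())

critical_3df = 7.81  # p = 0.05 with 4 classes - 1 = 3 degrees of freedom

print(total, expected)  # 2639 659.75
for genotype, term in terms.items():
    print(genotype, round(term, 2))
print(round(chi2, 2))        # 1562.38
print(chi2 >= critical_3df)  # True: reject the hypothesis of no linkage
```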

 

In addition, when one knows that two genes are linked, the temptation is to use $\chi^2$ tests to distinguish between two specific hypotheses, say r = 0.3 vs. r = 0.4, where r is the recombination fraction. The problem with this approach is that the four possible outcomes are not statistically independent when linkage exists. This is because one crossover event yields the two non-parental chromatids. In this case the best $\chi^2$ test is based on two classes, namely the parental vs. non-parental offspring from the test cross. The observed $\chi^2$ would then be compared against your critical chi square statistic using only 1 degree of freedom (two classes minus one). But if you are testing r = 0.5, that is, no linkage, against any alternative, then the full test as explained earlier, and as typically explained in genetics books, is fine, since r = 0.5 is the same thing as statistical independence, assuming no viability issues.
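A minimal Python sketch of the two-class test follows, pooling Morgan's counts into parental and non-parental offspring and testing the illustrative hypotheses r = 0.3 and r = 0.4 mentioned above. The function name is hypothetical, and the critical value 3.84 is the p = 0.05 value for 1 degree of freedom (two classes minus one).

```python
def two_class_chi_square(parental, recombinant, r):
    """Chi square for parental vs. non-parental counts under recombination fraction r."""
    total = parental + recombinant
    expected_recombinant = r * total
    expected_parental = (1 - r) * total
    return ((parental - expected_parental) ** 2 / expected_parental
            + (recombinant - expected_recombinant) ** 2 / expected_recombinant)


# Pool Morgan's four classes into two.
parental = 1139 + 1195    # pr+ vg+ and pr vg offspring
recombinant = 151 + 154   # pr+ vg and pr vg+ offspring

critical_1df = 3.84
for r in (0.3, 0.4):
    chi2 = two_class_chi_square(parental, recombinant, r)
    print(r, round(chi2, 1), chi2 >= critical_1df)
```

Both hypothesized values are rejected for these data, which is not surprising since the observed fraction of non-parental offspring, 305/2639, is only about 0.12.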

There are other approaches to testing hypotheses in genetics, and while the chi square goodness of fit test is relatively easy to understand there are actually better tests for these applications. The main alternative approach is the so-called likelihood ratio test. If you want to get a feel for these tests now you might read about the LOD score test used in analyzing linkage using pedigree analysis.

pgd 02/26/02