Biology 205 General Genetics:


Hypothesis testing in Genetics I. The Chi square test.

Introduction. Suppose you flip a coin 100 times. If the coin is fair (unbiased), you expect heads on about half of the tosses and tails on the other half. But how do you know that the coin really is fair and not biased in some way? If you are a gambler and want to know whether a die is fair, how would you test this hypothesis? If you are a geneticist and want to know whether the results from the F2 generation of a monohybrid cross are consistent with Mendel's law of segregation, how would you test this?

Learning Goals.

Define null hypothesis as used in genetics

Calculate the expected results for a particular null hypothesis

Calculate an observed chi square statistic

Explain what the chi square distribution and degrees of freedom mean

Find critical chi square statistics for the chi square test for a particular p value

Outline a general procedure for hypothesis testing

Carry out the chi square test procedure for monohybrid crosses

All three of these questions can be dealt with using a very powerful statistical test called the chi-square goodness of fit test. The basic idea of this test is to calculate a particular statistic called the observed chi square statistic. Once we have the observed statistic we ask, mathematically: what is the probability that this statistic or a larger one will occur if the data really are a random sample of experiments governed by the hypothesis being tested?

The null hypothesis: The hypothesis being tested is often called a null hypothesis. The null hypothesis is the hypothesis that the data you are examining are a random sample from an infinitely large number of experiments, assuming a specific hypothesis is true. For example, if you flip a coin, the null hypothesis is that the coin is fair and that when you toss the coin, say, 100 times, the 100 tosses are indeed a random sample of tosses from a fair coin. The null hypothesis is important because it allows you to make specific predictions about the outcome of your experiment. For instance, if your null hypothesis is that a coin is fair, then the expected number of heads in 100 tosses is 50. This is a specific prediction. The alternative hypothesis, that the coin is not fair, does not allow you to make a specific prediction, since there are an infinite number of ways the coin might not be fair.
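The prediction step can be sketched in a few lines of Python. This is only an illustration: the function name `expected_counts` and the dictionary layout are assumptions made for this sketch, not part of the original handout. It multiplies the total number of trials by the probability the null hypothesis assigns to each outcome.

```python
def expected_counts(total, probabilities):
    """Expected count per outcome = total trials x hypothesized probability.

    `probabilities` maps each outcome name to the probability the null
    hypothesis assigns to it (hypothetical layout chosen for this sketch).
    """
    if abs(sum(probabilities.values()) - 1.0) > 1e-9:
        raise ValueError("hypothesized probabilities must sum to 1")
    return {outcome: total * p for outcome, p in probabilities.items()}


# Null hypothesis from the text: the coin is fair.
fair_coin = {"Heads": 0.5, "Tails": 0.5}
print(expected_counts(100, fair_coin))  # {'Heads': 50.0, 'Tails': 50.0}
```

The same function works for any null hypothesis that assigns a definite probability to every outcome, which is exactly what makes such a hypothesis testable.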

Comparing observed and expected results with a chi square test. Let us see how this test works with a simple example. Consider these results tabulated from a series of 100 coin tosses.

Outcome    Observed number, O_i    Expected number, E_i    (O_i - E_i)^2/E_i
Heads      60                      50                      (60 - 50)^2/50 = 2.0
Tails      40                      50                      (40 - 50)^2/50 = 2.0
Total      100                     100                     Observed chi square = 2.0 + 2.0 = 4.0

The observed number column represents the actual number of heads and tails gotten by flipping the coin 100 times. The expected number is what we expect to find on average if the coin is fair. The column on the right shows how to calculate an observed chi-square statistic, which is a sum:

$$\chi^2_{\text{observed}} = \sum_i \frac{(O_i - E_i)^2}{E_i}$$
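As a check on the arithmetic, the sum can be computed directly. The sketch below uses the coin counts from the table; the function name `chi_square_statistic` is an assumption made for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all outcome classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Coin data from the table: 60 heads and 40 tails, 50 of each expected.
observed = [60, 40]
expected = [50, 50]
print(chi_square_statistic(observed, expected))  # 4.0
```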

Testing the observed chi square statistic using the chi square distribution:

The beauty of this statistic is that if the data really are a random sample from all possible experiments where the hypothesis being tested is correct, then the observed chi square statistic will vary randomly in a known way. The distribution of this statistic can be shown to follow, approximately, a mathematical function called the chi square distribution with N-1 degrees of freedom, where N is the number of possible outcome classes, or:

$$\chi^2(N-1)$$

This is actually a complex function, but fortunately for us, we generally do not need to work directly with the function, and if we do, many software packages including spreadsheets can calculate it for us. In statistics we usually work with a special form of the chi-square called the cumulative distribution function, or more precisely with its upper tail (one minus the cumulative distribution function). This upper tail gives us the probability that the observed chi square or a larger one will happen by chance if the statistic really comes from a chi square distribution with N-1 degrees of freedom, that is, if our data really do fit our hypothesis. This is why the chi square tests commonly used in statistics are often called goodness of fit tests.
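For readers curious about what a spreadsheet does behind the scenes, here is a minimal sketch of the upper-tail probability using only the Python standard library. It assumes integer degrees of freedom and relies on the standard recurrence for the incomplete gamma function; the function name `chi_square_upper_tail` is hypothetical, and a statistics package would normally be used instead.

```python
import math


def chi_square_upper_tail(x, df):
    """P(chi-square with integer `df` degrees of freedom >= x)."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half_x = x / 2.0
    # Closed forms for the two smallest cases.
    if df % 2 == 1:
        k, tail = 1, math.erfc(math.sqrt(half_x))  # 1 degree of freedom
    else:
        k, tail = 2, math.exp(-half_x)             # 2 degrees of freedom
    # Recurrence: tail(k + 2) = tail(k) + (x/2)^(k/2) e^(-x/2) / Gamma(k/2 + 1).
    # Each term is evaluated through logarithms to avoid overflow at large df.
    while k < df:
        tail += math.exp((k / 2.0) * math.log(half_x) - half_x
                         - math.lgamma(k / 2.0 + 1.0))
        k += 2
    return tail


# Probability of a chi square of 4.0 or larger with 1 degree of freedom.
print(chi_square_upper_tail(4.0, 1))
```

For the coin example the result is a little under 0.05, which agrees with the conclusion reached from the table.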

The idea of the chi square test is that if the observed chi square value is very large compared to the theoretical chi square values expected for the appropriate degrees of freedom, then this suggests that the data do not fit what is expected if the hypothesis being tested is correct.

Often this comparison is done using tables such as the abbreviated chi square table given later in this handout, which I created using the chi-square function in a common spreadsheet program.

 

The column down the left side of the table is used to find the row of chi square values for the proper degrees of freedom. Degrees of freedom (df) is best understood as the number of possible outcomes minus one, so for the coin example the appropriate df is 1. Put another way, the degrees of freedom is the number of classes whose counts you need in order to specify the data completely once the total number of observations is known. In the coin example the total number of classes is 2, but you really only need to know the number of heads along with the total number of tosses; the number of tails then follows.

The other columns give the probability that the chi-square value in the intersecting row, or a larger one, will happen by chance for a particular df if the data really do fit the hypothesis being tested. Generally we pick a certain probability in advance of doing the experiment. In statistics the probability (p) is often set at 0.05. This probability and the appropriate degrees of freedom allow us to specify what is often called a critical statistic. The observed chi-square statistic is compared to the critical statistic, and the hypothesis being tested is rejected if and only if the observed chi-square statistic is as large as or larger than the critical statistic.
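A critical statistic is simply the chi-square value whose upper-tail probability equals the chosen p. The sketch below finds it by bisection for the special case of 1 degree of freedom, where the upper tail has a simple closed form. The function names are hypothetical, and restricting the code to 1 df is a simplification made to keep the example short.

```python
import math


def upper_tail_1df(x):
    """P(chi-square with 1 degree of freedom >= x)."""
    return math.erfc(math.sqrt(x / 2.0))


def critical_chi_square_1df(p, lo=0.0, hi=100.0, tol=1e-9):
    """Chi-square value whose upper-tail probability equals p (1 df only)."""
    # The tail probability shrinks as x grows, so bisection homes in on it.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if upper_tail_1df(mid) > p:
            lo = mid  # tail still too large: the critical value is higher
        else:
            hi = mid
    return (lo + hi) / 2.0


print(round(critical_chi_square_1df(0.05), 3))  # 3.841
```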

For our coin data, the observed chi square is 4.0 and the critical chi square statistic, the chi square at p = 0.05 and 1 df, is about 3.84. Since 4.0 is greater than 3.84, we reject the hypothesis that the coin is fair.

A few points need to be made about this sort of hypothesis testing. First, the hypothesis could in fact be correct, but a chi square statistic as large as or larger than the critical statistic for p = 0.05 will happen only about 5% of the time if the hypothesis is correct. So picking a small p means that if you reject only when the observed chi square is at least as large as the critical chi square, your probability of rejecting a hypothesis when it is true is no more than about 5%. Thus a chi-square that is larger than the critical chi square gives you confidence that you are not rejecting a true hypothesis. Next, there is nothing sacred about p = 0.05. If you are conservative and demand a high degree of confidence before rejecting a hypothesis, then you might set p = 0.01 or even p = 0.001. Finally, the p value should not be interpreted as the probability that the hypothesis is wrong. The hypothesis used to calculate the observed chi square is either correct or not, so saying a hypothesis has a certain probability of being correct does not really make sense, at least with this type of hypothesis testing.
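The first point can be illustrated by simulation. The sketch below repeatedly tosses a simulated fair coin 100 times and counts how often the chi square reaches the critical value of 3.841, so every rejection is a rejection of a true hypothesis. The function name, the number of repetitions, and the random seed are arbitrary choices made for this illustration.

```python
import random


def simulate_false_rejection_rate(trials=10_000, tosses=100,
                                  critical=3.841, seed=1):
    """Fraction of fair-coin experiments whose chi square reaches `critical`."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    expected = tosses / 2.0    # fair coin: half heads, half tails
    rejections = 0
    for _ in range(trials):
        heads = sum(rng.random() < 0.5 for _ in range(tosses))
        tails = tosses - heads
        chi_sq = ((heads - expected) ** 2 / expected
                  + (tails - expected) ** 2 / expected)
        if chi_sq >= critical:
            rejections += 1
    return rejections / trials


print(simulate_false_rejection_rate())
```

The fraction printed should be close to 0.05. It will not match exactly, partly because of sampling variation and partly because head counts are whole numbers while the chi square distribution is a continuous approximation.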

An aside for those of you who have had statistics: sometimes the p value chosen in advance is termed the α or "alpha" value, and it corresponds to the probability of rejecting the null hypothesis when it is in fact true, which is called a type I error. Sometimes this α is referred to as the significance level of a statistical test.

A general procedure for hypothesis testing:

What we have said can be distilled into a general hypothesis testing procedure:

Step 1. Specify the hypothesis to test. This typically has to be a hypothesis that makes a specific prediction. For instance, the hypothesis that a coin is biased or not fair does not make a specific prediction about the number of heads and tails you might observe if the coin is tossed 100 times. This is because there are an infinite number of ways the coin can be biased. But if you specify that the coin is fair there is only one such hypothesis namely that P(Head) = 0.5 and P(Tail) = 0.5. Sometimes the hypothesis that allows you to make a specific prediction is called the null hypothesis because it usually assumes that the outcomes are due strictly to chance and are unbiased.

Step 2. Specify the critical statistic against which you will compare the observed statistic. This usually means for the chi-square, specifying the probability level and the appropriate degrees of freedom.

 

Step 3. Collect the data and calculate the observed chi square.

Step 4. Reject the null hypothesis if and only if the observed chi-square is as large as or larger than the critical chi-square.
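The four steps can be put together in one short Python sketch. It is an illustration under stated assumptions rather than a definitive implementation: the function name `chi_square_test` and the returned dictionary are hypothetical, and the critical value is passed in by hand because Step 2 asks you to choose it before looking at the data.

```python
def chi_square_test(observed, probabilities, critical):
    """Run steps 1 to 4 for a goodness of fit test.

    observed      -- dict mapping each outcome to its observed count (Step 3)
    probabilities -- dict mapping each outcome to its probability under the
                     null hypothesis (Step 1)
    critical      -- critical chi square chosen in advance (Step 2)
    """
    total = sum(observed.values())
    expected = {k: total * probabilities[k] for k in observed}
    chi_sq = sum((observed[k] - expected[k]) ** 2 / expected[k]
                 for k in observed)
    return {
        "expected": expected,
        "chi_square": chi_sq,
        "df": len(observed) - 1,       # number of classes minus one
        "reject": chi_sq >= critical,  # Step 4 decision rule
    }


# Coin example from the text: 60 heads and 40 tails, p = 0.05 and 1 df.
result = chi_square_test({"Heads": 60, "Tails": 40},
                         {"Heads": 0.5, "Tails": 0.5},
                         critical=3.841)
print(result["chi_square"], result["df"], result["reject"])  # 4.0 1 True
```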


An abbreviated chi square table

Chi Square Table: produced using a spreadsheet program.

 

                           Probability levels (p)
Degrees of freedom (df)    0.5        0.1        0.05       0.01       0.005
1                          0.455      2.706      3.841      6.635      7.879
2                          1.386      4.605      5.991      9.210      10.597
3                          2.366      6.251      7.815      11.345     12.838
4                          3.357      7.779      9.488      13.277     14.860
5                          4.351      9.236      11.070     15.086     16.750
6                          5.348      10.645     12.592     16.812     18.548
7                          6.346      12.017     14.067     18.475     20.278
8                          7.344      13.362     15.507     20.090     21.955
9                          8.343      14.684     16.919     21.666     23.589
10                         9.342      15.987     18.307     23.209     25.188
11                         10.341     17.275     19.675     24.725     26.757
12                         11.340     18.549     21.026     26.217     28.300
13                         12.340     19.812     22.362     27.688     29.819
14                         13.339     21.064     23.685     29.141     31.319
15                         14.339     22.307     24.996     30.578     32.801
20                         19.337     28.412     31.410     37.566     39.997
50                         49.335     63.167     67.505     76.154     79.490
100                        99.334     118.498    124.342    135.807    140.170

 

pgd 8/02/01

Exercises(work in pairs):

1. Flip a coin 100 times and test the hypothesis that the coin is fair. Follow steps 1 through 4 for hypothesis testing and write out what you do at each step.

 

 

What can you conclude about your coin?

 

 

 

2. Repeat the chi square test for your coin data, assuming that the true probability of getting a head on a single toss is P(Head) = 0.55.

What can you conclude about your coin?

 

 

 

 

3. Pool the class data and repeat exercises 1 and 2. Does having more data change your conclusions?

 

 

 

 

 

 

 

4. Application to genetics: In corn, kernel color is controlled by a number of gene pairs. One of these gene pairs has the following effect on the phenotype:

Kernels with genotypes PP and Pp are "purple" while kernels which are pp are "yellow".

Working in pairs, select an ear of corn. This ear has kernels from the cross Pp X Pp.

If available, use a simulation program to investigate this cross.

A. What are the expected kernel genotypic and phenotypic ratios from this cross? Use a Punnett Square to do this.

 

B. Tally the number of purple and yellow kernels on at least five rows of corn and carry out a chi square test of your expected results. Show all your work.

 

 

C. Pool the class data. Does having more data change your conclusion?

 

5. Application to a dihybrid cross. In corn another gene pair, with alleles S and s, is responsible for starchy vs. sugary kernels. Use either an ear of corn showing the F2s from a dihybrid cross or a simulation program to generate data for the expected phenotypes in the F2. Try to tally at least 100 kernels. Complete the following table and carry out the appropriate chi square test.

 

Phenotype         Observed number    Expected number    Contribution to chi square: (O_i - E_i)^2/E_i
Purple starchy
Purple sugary
Yellow starchy
Yellow sugary
Total

 

 

 

 

 

 

 

 

 

 

pgd 09/10/03 revised 3/22/04