VBS Home page,VBS Course Navigator, Biology 205- Genetics, Hypothesis Testing in Genetics, Previous Page, Next Page,top of page
While most genetics books focus on the chi square goodness of fit test for investigating hypotheses, several alternative techniques exist. The best known are the likelihood ratio tests. The idea is quite simple. Suppose you have two hypotheses, H0 and H1, and you want to distinguish between them. For each hypothesis there is a probability that the observed data would have arisen if that hypothesis were true. Call the probability that the data would have arisen if H0 were true l0, and the analogous probability if H1 were true l1. The ratio l1/l0 is called the likelihood ratio (L). It measures how much better H1 explains the observed data than H0 does, and in that sense amounts to the odds in favor of H1 as opposed to H0.
L itself can be treated as a random variable, but L often has a complex distribution, and computer programs are required to handle the computations. However, some of the more useful likelihood ratio tests for genetics are well understood. Here we will examine the use of a likelihood ratio test to detect the presence of linkage. We will assume a very simple situation where H0 is the absence of linkage (r = 0.5) and H1 is the presence of linkage (0 <= r < 0.5), where r is the rate of crossing over between two loci. The test discussed here is very closely related to the LOD score test used to detect linkage in pedigree studies.
The rate of recombination r is estimated using its so-called maximum likelihood estimator, r' = M/N, in place of the real r, where N is the number of gametes observed and M is the number of recombinant gametes observed. This is very much akin to using the sample mean of N observations as an estimate of a true mean. Proving that r' = M/N is indeed the maximum likelihood estimate of r is fairly simple and merely involves finding the maximum of ln(l1).
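The maximization mentioned above can be written out in a few lines. This is a sketch that assumes the binomial form of l1 used on this page and 0 < M < N:

```latex
\begin{align*}
l_1(r) &= \frac{N!}{M!\,(N-M)!}\, r^{M}(1-r)^{N-M} \\
\ln l_1(r) &= \ln\frac{N!}{M!\,(N-M)!} + M\ln r + (N-M)\ln(1-r) \\
\frac{d}{dr}\ln l_1(r) &= \frac{M}{r} - \frac{N-M}{1-r} = 0 \\
M(1-r) &= (N-M)\,r \quad\Longrightarrow\quad r' = \frac{M}{N}
\end{align*}
```

The second derivative, -M/r^2 - (N-M)/(1-r)^2, is negative for every r between 0 and 1, so this stationary point is a maximum.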
Consider the standard test cross used to detect linkage, in which a double heterozygote, with chromosomes of the form a b/+ + when linkage is assumed, is crossed to the double homozygous recessive. Remember that these heterozygotes are obtained by mating the double homozygous dominant with the double homozygous recessive; hence, if the A and B loci are linked, the non wild type forms a and b must be on the same chromosome.
Analysis of Linkage by likelihood ratio test. Data after Morgan from Griffiths et al 1993
| Phenotype | Number observed |
| wild type eye, wild type wing | 1339 |
| wild type eye, vestigial wing | 151 |
| purple eye, wild type wing | 154 |
| purple eye, vestigial wing | 1195 |
| total | 2839 |
Let M be the number of flies with crossover phenotypes and N be the total number of flies examined.
The estimate for r is r' = (number of offspring with crossover phenotypes)/(total offspring) = (151 + 154)/2839 = 0.107.
If the genes for eye color and wing type were unlinked, the theoretical value would be r = 0.5.
The test statistic is G = 2 ln(L), where L = l1/l0. When H0 is true, G is approximately distributed as a chi square random variable with 1 degree of freedom.
To test r = 0.5 against the estimated r', we need expressions for l0 and l1, and hence for L = l1/l0:
If there is no linkage, then the probability of observing M non parental gametes (as detected by the test cross) out of a total of N gametes is merely
$$l_0 = \frac{N!}{M!\,(N-M)!}\left(\frac{1}{2}\right)^{N}.$$
This is because, without linkage, each gamete is equally likely to be a parental or a non parental gamete.
If linkage exists, then this probability becomes
$$l_1 = \frac{N!}{M!\,(N-M)!}\, r^{M}(1-r)^{N-M}.$$
L in our example (testing r < 1/2 vs r = 1/2) becomes fairly simple, because the factorial terms cancel:
$$L = \frac{l_1}{l_0} = \frac{(r')^{M}(1-r')^{N-M}}{(1/2)^{N}}$$
where r' is the maximum likelihood estimator for the recombination rate, given by r' = M/N. Thus by substitution we have
$$L = \frac{(M/N)^{M}\,(1-M/N)^{N-M}}{(1/2)^{N}}$$
and G = 2 ln(L).
For our example, with M = 305 and N = 2839, G = 1998.85.
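The arithmetic behind this value can be checked with a short script. The following is a minimal Python sketch, not part of the original course material; the function name `linkage_g_statistic` is hypothetical. It works with logarithms throughout because (1/2)^N underflows to zero in floating point for N as large as 2839.

```python
import math


def linkage_g_statistic(recombinants: int, total: int) -> float:
    """Return G = 2 ln(L) for H0: r = 0.5 against the estimate r' = M/N.

    Illustrative helper (name is an assumption). Requires 0 < M < N so
    that both logarithms below are defined.
    """
    m, n = recombinants, total
    if not 0 < m < n:
        raise ValueError("need 0 < recombinants < total")
    r_hat = m / n
    # ln of the numerator and denominator of L; the N!/(M!(N-M)!) terms cancel.
    log_l1 = m * math.log(r_hat) + (n - m) * math.log(1.0 - r_hat)
    log_l0 = n * math.log(0.5)
    return 2.0 * (log_l1 - log_l0)


# Counts from the table on this page (data after Morgan).
M = 151 + 154   # offspring with crossover phenotypes
N = 2839        # total offspring
r_hat = M / N
G = linkage_g_statistic(M, N)
print(f"r' = {r_hat:.3f}, G = {G:.2f}")
```

Run on the table data, this should reproduce r' of about 0.107 and a G value that agrees with the figure quoted above.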
We can use this test to distinguish between r = 0.5 and 0 <= r < 0.5 by calculating G and comparing it against the critical value of chi square with 1 degree of freedom at the 0.05 level, which is 3.84.
Since G = 1998.85 is far larger than 3.84, we reject H0. This again suggests that the loci studied by Morgan are linked, as we concluded with the goodness of fit chi square test.
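The comparison against the chi square critical value can also be sketched in Python. This example assumes only the standard library: for 1 degree of freedom the upper tail probability of chi square equals erfc(sqrt(x/2)), so no statistics package is needed. The helper name `chi2_sf_1df` is hypothetical, and the G value is simply the one quoted on this page.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper tail probability P(X > x) for chi square with 1 degree of freedom.

    Uses the identity P(Z^2 > x) = erfc(sqrt(x / 2)) for a standard normal Z.
    """
    if x < 0:
        raise ValueError("chi square values cannot be negative")
    return math.erfc(math.sqrt(x / 2.0))


CRITICAL_VALUE = 3.84   # chi square, 1 degree of freedom, 0.05 level
G = 1998.85             # value computed for Morgan's data on this page

reject_h0 = G > CRITICAL_VALUE
p_value = chi2_sf_1df(G)   # too small for floating point, so it underflows toward 0

print("reject H0 (no linkage):", reject_h0)
print("tail probability at the critical value:", round(chi2_sf_1df(CRITICAL_VALUE), 3))
```

The tail probability at 3.84 comes out close to 0.05, which is a quick sanity check on the critical value itself.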
Advantages and Disadvantages of Likelihood ratio tests
There are several advantages to likelihood ratio tests:
First, it follows a natural probability model which leads naturally to the so-called LOD score employed in pedigree analysis. The LOD score, which is based on log10, can easily be converted to a G score and vice versa. Second, more complex tests of this type can be developed which allow the G score to be partitioned into different components or sources of variation, much as can often be done with analysis of variance.
Third, since likelihood ratio tests allow one to estimate r, it is possible to construct a likelihood ratio test of the hypothesis that only linkage is affecting the data, but we will not pursue these tests here. Such a test would, though, be more complex than the simple test shown here.
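The conversion between a LOD score and a G score mentioned above follows from the definitions: if LOD = log10(L) and G = 2 ln(L), then G = 2 ln(10) x LOD, or roughly 4.605 x LOD. A minimal Python sketch, with hypothetical function names and assuming both scores are computed from the same likelihood ratio L:

```python
import math

LN10 = math.log(10.0)


def g_to_lod(g: float) -> float:
    """Convert G = 2 ln(L) to LOD = log10(L)."""
    return g / (2.0 * LN10)


def lod_to_g(lod: float) -> float:
    """Convert LOD = log10(L) to G = 2 ln(L)."""
    return 2.0 * LN10 * lod


# The G value from the test cross example on this page.
lod_example = g_to_lod(1998.85)
print(f"G = 1998.85 corresponds to LOD = {lod_example:.2f}")

# An arbitrary illustrative LOD value converted the other way.
print(f"LOD = 3.0 corresponds to G = {lod_to_g(3.0):.2f}")
```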
The main disadvantages of these tests are:
They are often complex to set up, especially if multiple parameters are being examined.
Except in the simplest cases, such as the one analysed here, exact test statistics are often not available or not easy to use.
Maximum likelihood and related tests are becoming more common in genetics. For example, the BLAST program for searching databases of gene and protein sequences uses likelihood ratio (log-odds) scores to detect possible homologies between sequences from different organisms, and is thus a valuable tool in understanding the evolutionary changes that have shaped the genomes of humans and other organisms.
pgd 08/06/02