Biology 205 Probability in Genetics.

Basic definitions:

Sample space.

All possible outcomes of a random experiment. For example, if you flip a coin once, the sample space is the set {H, T}. If you flip two coins, the sample space is {HH, HT, TH, TT}. An event is a subset of the sample space.

Ordered vs unordered event.

If the sequence in which several events jointly occur is important, then the event is ordered. For example, it may be important to distinguish between {HT} and {TH} when two coins are flipped; each of these is an ordered event. If instead you define your event as 'exactly one head' when two coins are flipped, the event is unordered, and it can happen in two ways: {HT} and {TH}.
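The distinction can be checked by listing the sample space directly. The following is a minimal Python sketch; the variable names are illustrative and are not part of the course material.

```python
from itertools import product

# Ordered sample space for two coins: each outcome records coin 1, then coin 2.
sample_space = ["".join(flips) for flips in product("HT", repeat=2)]
print(sample_space)  # ['HH', 'HT', 'TH', 'TT']

# The unordered event 'exactly one head' collects every ordered outcome with one H.
exactly_one_head = [outcome for outcome in sample_space if outcome.count("H") == 1]
print(exactly_one_head)  # ['HT', 'TH']
```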

Probability.

The expected frequency of a particular event when an experiment is repeated an infinite number of times. For a single toss of a fair coin, the probability of a head is 1/2. Probabilities are always real numbers between 0 and 1. Probabilities in genetics are often predicted from a hypothesis, and the predictions are then compared with real data to test that hypothesis.

We will refer to the probability of an event as P(event). So for a coin flipped once, with sample space {T, H}, P(H) = 1/2. Notice that in the absence of other information, we will often assume that elementary events, such as the results of a single toss of a coin, are equally likely. The difficulty for us will come when elementary events are combined.

Counting ordered vs. unordered events that are the joint occurrence of elementary events.

Ordered events. Example: consider flipping two coins once. For each coin there are two possible outcomes, so the total number of possible outcomes involving both coins is 2*2 = 4. If the outcomes are equally likely, the probability of each one is 1/4. So P(HH) = 1/4, etc.

In general, for N experiments with two possible outcomes each, we can ask how many ordered outcomes contain M of the first outcome and N-M of the second. For example, if you flip 10 coins you can ask how many ways there are to get 6 heads in 10 flips.

This is the number of ordered outcomes with exactly 6 heads in 10 flips {TTTTHHHHHH, THTTTHHHHH, etc.}. This number is given by the following formula:

N!/[(M!)(N-M)!]

where N! is N factorial, or 1*2*3* ... *N

For our coin example, the number of ways you can get 6 heads in 10 tosses is:

10!/(6! 4!) = (10*9*8*7)/(4*3*2*1) = 210

Make sure you understand why this is!
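One way to convince yourself is to compare the formula with a brute-force count. The sketch below is only an illustration in Python (the names `by_formula` and `by_counting` are made up for this example): it lists every ordered outcome of 10 flips and counts those with exactly 6 heads.

```python
from itertools import product
from math import comb, factorial

N, M = 10, 6  # 10 flips, 6 heads

# The formula N!/[(M!)(N-M)!] written out with factorials.
by_formula = factorial(N) // (factorial(M) * factorial(N - M))

# Brute force: list all 2^N ordered outcomes and keep those with exactly M heads.
by_counting = sum(1 for flips in product("HT", repeat=N) if flips.count("H") == M)

# math.comb computes the same quantity directly.
print(by_formula, by_counting, comb(N, M))  # 210 210 210
```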

Note that for this example the total number of possible ordered outcomes is 2^N, or 2 raised to the Nth power. So the probability of any one ordered outcome is 1/(2^N). Thus the probability of getting 6 heads in 10 coin tosses is:

(number of ordered outcomes with 6 heads)/(total number of possible ordered outcomes) =

[10!/(6! 4!)]/2^10 = 210/1024

This works since we are assuming the probabilities for all the ordered outcomes are equal.
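The same calculation can be written as a small Python function. This is a sketch that assumes a fair coin; the function name `prob_heads` is hypothetical and chosen only for this example. Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from math import comb

def prob_heads(m, n):
    """P(exactly m heads in n fair coin flips), as an exact fraction."""
    # Favorable ordered outcomes divided by all 2^n ordered outcomes.
    return Fraction(comb(n, m), 2 ** n)

p = prob_heads(6, 10)
print(p, float(p))  # 105/512 0.205078125
```

Note that 105/512 is just 210/1024 in lowest terms.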

Applications of these ideas to genetics.

The cells in your body receive half of their chromosomes from your father and half from your mother. So for each pair of homologous chromosomes, one is a maternal chromosome and one is a paternal chromosome. During meiosis, when the haploid gametes are formed, one member of each pair, but not both, ends up in a given gamete. This is the principle of segregation.

1. How many possible combinations of maternal and paternal chromosomes are there?

Answer: For any single pair of chromosomes, the gamete has either the maternal chromosome or the paternal chromosome, so there are two possibilities for each pair. Since we have 23 pairs of homologous chromosomes, there are 2^23 possible combinations of maternal vs. paternal chromosomes in the gametes.

2. How many of these combinations involve 10 maternal chromosomes and 13 paternal chromosomes?

Answer: We want the total number of ordered combinations involving 10 maternal and 13 paternal chromosomes. So use the formula:

N!/[(M!)(N-M)!] where N = 23 and M = 13
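As a quick check of the arithmetic, the formula can be evaluated in Python with `math.comb`. This is only a sketch; the variable names are illustrative.

```python
from math import comb

N = 23  # pairs of homologous chromosomes

# Choose which 13 of the 23 pairs contribute their paternal chromosome.
ways = comb(N, 13)
print(ways)  # 1144066

# Choosing the 10 pairs that contribute a maternal chromosome gives the same count.
print(ways == comb(N, 10))  # True
```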

3. What is the probability of getting a gamete with 10 maternal and 13 paternal chromosomes? Assume any ordered combination is equally likely.

 

4. Recall that DNA is made from a sequence of four nucleotides that differ in which nitrogenous base (A, T, G, or C) is present in each nucleotide.

Suppose you have a region of DNA which is 100 nucleotides long. How many possible nucleotide sequences are there for this region of DNA?

 

More on probabilities:

Two events A and B are said to be independent if the probability of both events happening jointly is the product of their individual probabilities:

P(A and B) = P(A)*P(B)

So, for example, if we flip two coins, P(H on the first coin and H on the second coin) = P(H on the first coin)*P(H on the second coin) = (1/2)*(1/2) = 1/4
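The product rule for the two coins can be verified by counting outcomes. The Python sketch below assumes the four ordered outcomes are equally likely; the helper `prob` is a made-up name for this illustration.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=2))  # 4 equally likely ordered outcomes

def prob(event):
    """Probability of an event, given as a yes/no test on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

p_first = prob(lambda o: o[0] == "H")     # H on the first coin: 1/2
p_second = prob(lambda o: o[1] == "H")    # H on the second coin: 1/2
p_both = prob(lambda o: o == ("H", "H"))  # H on both coins: 1/4

print(p_both == p_first * p_second)  # True, so the two events are independent
```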

Two events are said to be mutually exclusive if P(A and B) = 0

For example for a single flip of a coin the event H and the event T are mutually exclusive since they cannot happen at the same time.

Note that if a set of mutually exclusive events covers the whole sample space, their probabilities must sum to 1. Thus, if A and B are mutually exclusive and are the only possible outcomes, P(A or B) = P(A) + P(B) = 1.

For example, a die has 6 different faces, which are mutually exclusive. If each one is equally likely, then the probability of any one occurring on a single toss is 1/6. So all the probabilities sum to 1 for our sample space {1,2,3,4,5,6}.

This leads to the following useful trick for mutually exclusive events:

P(A particular mutually exclusive event) = 1 - P(all the others)

Example for coin tossing.

Suppose I toss a coin 10 times (or 10 coins once) and ask what the probability is of getting at least one head in 10 tosses. We could do this in two ways. One is to sum the probabilities P(exactly 1 head in 10) + P(exactly 2 heads in 10) + ... + P(10 heads in 10). Or we can simply write:

P(at least 1 head in 10) = 1 - P(no heads in 10) = 1 - 1/(2^10)
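Both routes give the same answer, which can be confirmed with a short Python sketch (a fair coin is assumed, and the variable names are illustrative only).

```python
from fractions import Fraction
from math import comb

N = 10
total = 2 ** N  # number of equally likely ordered outcomes

# Long way: add P(exactly k heads) for k = 1, 2, ..., 10.
by_summing = sum(Fraction(comb(N, k), total) for k in range(1, N + 1))

# Short way: 1 - P(no heads).
by_complement = 1 - Fraction(1, total)

print(by_summing, by_complement)  # 1023/1024 1023/1024
```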

 

Application to genetics:

5. For maternal and paternal chromosomes in the gamete example, what is the probability of a gamete having at least one maternal chromosome?

Answer:

P(at least one maternal chromosome) = 1 - P(no maternal chromosomes) = 1 - 1/2^23

 

6. For a type of tissue the probability of a randomly selected cell being in a particular stage of the cell cycle is given by the following table:

Cell cycle stage    Probability
Interphase          0.5
Prophase            0.1
Metaphase           0.05
Anaphase            0.2
Telophase           0.15

Suppose you examine 100 cells from this tissue. What is the probability of seeing at least one cell in metaphase?

 

Conditional probability:

Often our estimate of the probability of an event B will be modified based on partial information that we have, or because we know that a particular event A has taken place. This kind of probability is called conditional probability. For the events A and B we often use the phrase 'the probability of B given A'. This is written as P(B | A), where the '|' means 'given'. If two events are independent, then it is always the case that

P(B | A) = P(B), and if two events are mutually exclusive, then P(B | A) = 0.

Conditional probability is particularly useful where events are correlated with each other or in situations where we are given partial information about an event that restricts what part of the total sample space we need to examine.

Example: suppose we have two dice and we are interested in the total number of dots that come up when both dice are tossed.

What is the probability of this sum being 5?

If we don't know anything in advance, then we are interested in the outcomes 1 + 4, 2 + 3, 3 + 2, 4 + 1, since these are the only combinations of two faces (each showing 1 to 6 dots) that add up to five dots total.

The probability of this is 4*(1/6^2) = 4/36 = 1/9. Where does the 1/6^2 come from?

What is P(5 dots total for both dice | the first die comes up with 1 dot)?

This turns out to be 1/6, since now we only have to deal with the second die, and there is only one outcome for the second die that makes the total 5: the second die comes up 4.

A useful rule is the definition of conditional probability, which is the basis of Bayes' theorem:

P(B | A ) = P(A and B)/P(A)

So for our dice problem

P(5 dots total for both dice | the first die comes up with 1 dot) = P(second die comes up 4 and the first die comes up 1)/P(first die comes up 1)

or, assuming the dice are independent:

P(second comes up 4)*P(first comes up 1)/P(first comes up 1) = (1/6)*(1/6)/(1/6) = 1/6
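The rule P(B | A) = P(A and B)/P(A) can also be checked by counting over all 36 ordered rolls of two dice. The Python sketch below assumes fair dice; the helper `prob` and the variable names are made up for this illustration.

```python
from fractions import Fraction
from itertools import product

dice = list(product(range(1, 7), repeat=2))  # 36 equally likely (first, second) rolls

def prob(event):
    """Probability of an event, given as a yes/no test on a roll."""
    return Fraction(sum(1 for roll in dice if event(roll)), len(dice))

# A: the first die comes up 1.  B: the two dice total 5.
p_a = prob(lambda r: r[0] == 1)
p_a_and_b = prob(lambda r: r[0] == 1 and sum(r) == 5)

print(p_a_and_b / p_a)  # 1/6
```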

Bayes' theorem seems simple, but it is very important in genetic counseling and pedigree analysis.

Application to genetics.

7. Many types of color blindness are X-linked, that is, determined by genes on the X chromosome. Suppose a woman carries one X chromosome with the gene for a particular type of color blindness; her other X chromosome does not have this gene. If she is married to a man who does not have this gene on his X chromosome, consider the following:

A. What is the probability that her first child will carry the X chromosome with the gene associated with color blindness?

Hint: remember that the male has sex chromosomes X and Y.

 

B. An individual will have normal color vision if he or she has at least one X chromosome without the gene for color blindness. What is the probability that the child is color blind?

Hints: P(color blind) = 1 - P(at least one X chromosome without the gene for color blindness)

= 1 - [P(at least one X chromosome without the gene for color blindness | child is female)*P(child is female) +

P(at least one X chromosome without the gene for color blindness | child is male)*P(child is male)]

 

C. Suppose amniocentesis reveals that the child is male. What is the probability that the child is color blind?

Hint: use Bayes' theorem.

 

pgd 02/23/02