Recombination, linkage and genetic mapping

M.Sc. Microbiology, 2nd Semester
MCB 202: Genetics and Gene regulation
Gr. A: Fundamental Genetics

by Dr. Suman Kumar Halder

Independent assortment and Recombination

The principle of segregation states that each individual diploid organism possesses two alleles at a locus that separate in meiosis, with one allele going into each gamete. The principle of independent assortment provides additional information about the process of segregation: it tells us that, in the process of separation, the two alleles at a locus act independently of alleles at other loci. The independent separation of alleles results in recombination, the sorting of alleles into new combinations.

Recombination means that, when one of the F1 progeny reproduces, the combination of alleles in its gametes may differ from the combinations in the gametes from its parents.

Chromosomal Theory of Inheritance

Walter Sutton (left) and Theodor Boveri (right) independently developed the chromosome theory of inheritance in 1902.

The chromosome theory of inheritance is credited to papers by Walter Sutton in 1902 and 1903, as well as to independent work by Theodor Boveri during roughly the same period.

Molecular recombination

Molecular recombination (or genetic reshuffling) is the exchange of genetic material between different organisms, which leads to the production of offspring with combinations of traits that differ from those found in either parent.

• Takes place in both prokaryotes and eukaryotes.
• Best studied in bacteria, archaea, yeast and phage.
• Mediated by breakage and joining of DNA strands.
• Reasons: DNA repair, regulation of gene expression, maintenance of genetic diversity, genetic rearrangement, etc.

Molecular recombination: Types

• At least four types of naturally occurring recombination have been identified in living organisms:
– General or homologous recombination
– Illegitimate or non-homologous recombination
– Site-specific recombination
– Replicative recombination

General recombination can appear to result in either an equal or an unequal exchange of genetic information; these outcomes are called reciprocal and non-reciprocal recombination, respectively.

Homologous recombination

• Homologous recombination occurs between DNA molecules of very similar sequence, such as homologous chromosomes in diploid organisms. General recombination can occur throughout the genome of diploid organisms, using one or a small number of common enzymatic pathways.

Each homologous chromosome is shown as a different shade of blue and a distinctive thickness, with different alleles for each of the three genes on each. Recombination between genes A and B leads to a reciprocal exchange of genetic information, changing the arrangement of alleles on the chromosomes.

Reciprocal recombination

• Two homologous chromosomes are distinguished by having wild type alleles on one chromosome (A+ B+ C+) and mutant alleles on the other (A- B- C-).
• Homologous recombination between genes A and B exchanges the segment of one chromosome containing the wild type alleles of genes B and C (B+ and C+) for the segment containing the mutant alleles (B- and C-) on the homologous chromosome.
• This could be explained by breaking and rejoining of the two homologous chromosomes during meiosis.
• This process, which results in new DNA molecules that carry genetic information derived from both parental DNA molecules, is called reciprocal recombination. The number of alleles for each gene remains the same in the products of this recombination; only their arrangement has changed.

Mechanism (Holliday Model)

Single-strand break model

Holliday Model

In this model, recombination takes place through a single-strand break in each DNA duplex, strand displacement, branch migration, and resolution of a single Holliday junction.

Figure panels: non-crossover recombination; crossover recombination.

Mechanism (cont.)

Double-strand break model

Non-reciprocal recombination

• General recombination can also result in a one-way transfer of genetic information, resulting in an allele of a gene on one chromosome being changed to the allele on the homologous chromosome.
• Recombination between two homologous chromosomes A+B+C+ and A-B-C- can result in a new arrangement, A-B+C-, without a change in the parental A+B+C+.
• In this case, the allele of gene B on the bottom chromosome has changed from B- to B+ without a reciprocal change on the other chromosome.

Non-homologous recombination

• Illegitimate or non-homologous recombination occurs in regions where no large-scale sequence similarity is apparent, e.g. translocations between different chromosomes or deletions that remove several genes along a chromosome. However, when the DNA sequence at the breakpoints for these events is analyzed, short regions of sequence similarity are found in some cases. For instance, recombination between two similar genes that are several million bp apart can lead to deletion of the intervening genes in somatic cells.

Two different chromosomes (denoted by the different colors and different genes) recombine, moving, e.g., gene C so that it is now on the same chromosome as genes D and E. Although the sequences of the two chromosomes differ for most of their lengths, the segments at the sites of recombination may be related, denoted by the yellow and orange rectangles.

Non-homologous recombination (cont.)

• Non-homologous end joining is a major pathway for the repair of chromosomal double-strand breaks in the DNA of somatic cells, and it is also used in V(D)J recombination of antibody genes.
• This pathway is often used when the cell is in G1 and a sister chromatid is not available for repair through homologous recombination.
• Non-homologous end joining uses proteins that recognize the broken ends of DNA, bind to the ends, and then join them together.
• It is more error prone than homologous recombination and often leads to deletions, insertions, and translocations.

Site-specific recombination

• Site-specific recombination occurs between particular short sequences (about 12 to 24 bp) present on otherwise dissimilar parental molecules. It requires special enzymatic machinery, basically one enzyme or enzyme system for each particular site. Good examples are the systems for integration of some bacteriophages, such as λ, into a bacterial chromosome and the rearrangement of immunoglobulin genes in vertebrate animals.
• Unlike general recombination, site-specific recombination is guided by a recombination enzyme that recognizes specific nucleotide sequences present on one or both recombining DNA molecules.

Site-specific recombination leads to the combination of two different DNA molecules, illustrated here for a bacteriophage integrating into the E. coli chromosome. The reaction is catalyzed by a specific enzyme that recognizes a short sequence, called att, present in both the phage DNA and the target site in the bacterial chromosome.

Replicative recombination

• Replicative recombination generates a new copy of a segment of DNA. Many transposable elements use a process of replicative recombination to generate a new copy of the transposable element at a new location.

Replicative recombination is seen for some transposable elements (shown as red rectangles), again using a specific enzyme, in this case encoded by the transposable element.

Bacterial recombination

• Bacterial recombination is a type of genetic recombination in bacteria characterized by DNA transfer from one organism, the donor, to another organism, the recipient. This process occurs in three main ways:
– Transformation
– Transduction
– Conjugation
• The final result of conjugation, transduction, and/or transformation is the production of genetic recombinants, individuals that carry not only the genes they inherited from their parent cells but also the genes introduced to their genomes by conjugation, transduction, and/or transformation.
• Recombination in bacteria is ordinarily catalyzed by a RecA type of recombinase.
• These recombinases promote repair of DNA damage by homologous recombination.
• In the archaea, the ortholog of the bacterial RecA protein is RadA.

Linkage

The number of chromosomes in most organisms is limited, and there are certain to be more genes than chromosomes; some genes must therefore be present on the same chromosome and should not assort independently.

Genes located close together on the same chromosome are called linked genes and belong to the same linkage group.

Linked genes travel together in meiosis, eventually arriving at the same destination (the same gamete), and are not expected to assort independently.

Linkage (cont.)

Figure labels: linked genes; linkage group.

History of Linkage establishment

One of the first cases was reported in sweet peas by William Bateson, Edith Rebecca Saunders, and Reginald C. Punnett in 1905. They crossed a homozygous strain of peas having purple flowers and long pollen grains with a homozygous strain having red flowers and round pollen grains. All the F1 had purple flowers and long pollen grains, indicating that purple was dominant over red and long was dominant over round. When they intercrossed the F1, the resulting F2 progeny did not appear in the 9 : 3 : 3 : 1 ratio expected with independent assortment. An excess of F2 plants had purple flowers and long pollen or red flowers and round pollen (the parental phenotypes).
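A departure from the 9 : 3 : 3 : 1 ratio can be checked with a chi-square goodness-of-fit calculation. The sketch below is only an illustration: the function name is made up for this example, and the F2 counts are hypothetical values chosen to show an excess of parental phenotypes, not the counts reported in 1905.

```python
def chi_square_9331(observed):
    """Chi-square statistic for four F2 phenotype counts against 9:3:3:1.

    `observed` lists counts in the order: both dominant traits,
    first dominant only, second dominant only, both recessive traits.
    """
    total = sum(observed)
    expected = [total * share / 16 for share in (9, 3, 3, 1)]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical F2 counts (NOT the 1905 data): purple/long, purple/round,
# red/long, red/round, with an excess of the two parental classes.
hypothetical_f2 = [290, 20, 20, 54]
statistic = chi_square_9331(hypothetical_f2)

# 7.815 is the standard 5% critical value for 3 degrees of freedom.
print(f"chi-square = {statistic:.1f}")
print("departs from 9:3:3:1:", statistic > 7.815)
```

A statistic above the critical value suggests that the two loci are not assorting independently, which is the pattern the sweet pea cross showed.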

In light of linkage, we can now explain this result: the two loci that they examined lie close together on the same chromosome and therefore do not assort independently.

Linkage Vs. Independent Assortment

• Genes are rarely completely linked but, by assuming that no crossing over occurs, we can see the effect of linkage more clearly. We will then consider what happens when genes assort independently. Finally, we will consider the results obtained if the genes are linked but exhibit some crossing over.
• A testcross reveals the effects of linkage.
• Consider a pair of linked genes in tomato plants. One of the genes affects the type of leaf: an allele for mottled leaves (m) is recessive to an allele that produces normal leaves (M). Nearby on the same chromosome, the other gene determines the height of the plant: an allele for dwarf (d) is recessive to an allele for tall (D).
• Testing for linkage can be done with a testcross, which requires a plant heterozygous for both characteristics. A geneticist might produce this heterozygous plant by crossing a variety of tomato that is homozygous for normal leaves and tall height with a variety that is homozygous for mottled leaves and dwarf height:

MM DD (normal leaves, tall) x mm dd (mottled leaves, dwarf), giving F1 plants that are all Mm Dd (normal leaves, tall)

• The geneticist would then use these F1 heterozygotes in a testcross, crossing them with plants homozygous for mottled leaves and dwarf height:

Mm Dd (normal leaves, tall) x mm dd (mottled leaves, dwarf)

This cross will be used later to calculate the recombination frequency.

Crossing over Vs. No Crossing over

Recombination frequency

• We can use the frequency of recombination events between two genes (i.e., their degree of genetic linkage) to estimate their relative distance apart on the chromosome.
• Two genes that are very close together will have very few recombination events and be tightly linked, while two genes that are slightly further apart will have more recombination events and be less tightly linked.
• In the next section, we'll see how to calculate the recombination frequency between two genes, using information from genetic crosses.

Recombination frequency (cont.)

Let's take the example of the testcross from the marked slide:
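Recombination frequency is the number of recombinant progeny divided by the total number of progeny, expressed as a percentage. The sketch below applies that formula to the tomato testcross; the function name is illustrative and the progeny counts are hypothetical, since no counts are given in these notes for that cross.

```python
def recombination_frequency(recombinant, total):
    """Return the percentage of progeny that are recombinant."""
    if total <= 0:
        raise ValueError("total progeny must be positive")
    if not 0 <= recombinant <= total:
        raise ValueError("recombinant count must lie between 0 and total")
    return 100 * recombinant / total


# Hypothetical testcross counts (assumed for illustration only).
progeny = {
    "normal leaves, tall": 45,    # parental
    "mottled leaves, dwarf": 43,  # parental
    "normal leaves, dwarf": 7,    # recombinant
    "mottled leaves, tall": 5,    # recombinant
}
recombinants = progeny["normal leaves, dwarf"] + progeny["mottled leaves, tall"]
total = sum(progeny.values())

print(recombination_frequency(recombinants, total))  # 12.0
```

The same function reproduces the two-gene mapping figure used later in these notes: 268 recombinants among 1448 progeny gives about 18.5%.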

The closer together the two genes are, the less likely it is that cross-overs will occur between them. Linked genes have a recombination frequency that is less than 50%. The recombination frequency CANNOT be greater than 50%.

The recombination frequency CANNOT be greater than 50%... why?

• Imagine a case where there is NO linkage at all. Two genes are located on different chromosomes, so that during meiosis the segregation of alleles of one gene is completely independent of alleles of the other gene (this is stated in Mendel's Second Law and is known as the law of independent assortment).

• Start by crossing pure-bred homozygous parents MMDD x mmdd. The parental gametes will be MD and md. F1 generation offspring will all have the genotype MmDd.

With independent assortment the F1 offspring will produce the following gametes:

MD (25%) - present in parental generation
Md (25%) - NOT present in parental generation
mD (25%) - NOT present in parental generation
md (25%) - present in parental generation

50% of the F1 gametes were not present in the parental generation, so they are recombinant gametes. The recombination frequency is 50%.
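The 50% figure can be checked by listing every gamete an MmDd plant can make when the two genes assort independently. This is a small enumeration sketch in Python; the variable names are illustrative.

```python
from itertools import product

# F1 genotype MmDd: each gamete receives one allele of each gene.
leaf_alleles = ["M", "m"]
height_alleles = ["D", "d"]

# With independent assortment all four combinations are equally likely.
f1_gametes = [leaf + height for leaf, height in product(leaf_alleles, height_alleles)]

# Gametes produced by the MMDD and mmdd parents.
parental = {"MD", "md"}
recombinant = [g for g in f1_gametes if g not in parental]

recombinant_fraction = len(recombinant) / len(f1_gametes)
print(f1_gametes)            # ['MD', 'Md', 'mD', 'md']
print(recombinant_fraction)  # 0.5
```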

If the genes are linked then when the F1 gametes are formed the alleles of the two genes will not assort independently. The M and D alleles will have been located close to each other on the same chromosome from the MMDD parent (the same is true for m and d). The closer together the two genes are, the less likely it is that cross-overs will occur between them, so when gametes are formed there will be more MD and md gametes, and fewer Md and mD. Thus linked genes have a recombination frequency that is <50%.
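The argument above can be written as a formula. If r is the fraction of gametes that are recombinant, each parental gamete has frequency (1 - r)/2 and each recombinant gamete has frequency r/2. The following sketch assumes the F1 plant carries M and D on one chromosome and m and d on the other; the function name is illustrative.

```python
def f1_gamete_frequencies(r):
    """Gamete frequencies from an MD/md heterozygote.

    `r` is the recombination fraction (0 = complete linkage,
    0.5 = independent assortment).
    """
    if not 0 <= r <= 0.5:
        raise ValueError("recombination fraction must lie between 0 and 0.5")
    return {
        "MD": (1 - r) / 2,  # parental
        "md": (1 - r) / 2,  # parental
        "Md": r / 2,        # recombinant
        "mD": r / 2,        # recombinant
    }


for r in (0.0, 0.1, 0.5):
    print(r, f1_gamete_frequencies(r))
```

At r = 0.5 all four gametes reach 25%, which is the independent assortment case; any smaller r gives more MD and md gametes than Md and mD gametes.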

Recombination frequency & linkage maps

What is the benefit of calculating recombination frequency?

• One way that recombination frequencies have been used historically is to build linkage maps, chromosomal maps based on recombination frequencies.
• Recombination frequency is not a direct measure of how physically far apart genes are on chromosomes. However, it provides an estimate or approximation of physical distance.
• So, we can say that a pair of genes with a larger recombination frequency are likely farther apart, while a pair with a smaller recombination frequency are likely closer together.

Morgan’s Work

• Thomas Hunt Morgan was an American evolutionary biologist, geneticist, embryologist, and science author who won the Nobel Prize in Physiology or Medicine in 1933 for discoveries elucidating the role that the chromosome plays in heredity.
• T. H. Morgan and his students developed the idea that physical distances between genes on a chromosome are related to the rates of recombination.

Thomas Hunt Morgan's Drosophila melanogaster genetic linkage map. This was the first successful gene mapping work and provides important evidence for the chromosome theory of inheritance. The map shows the relative positions of allelic characteristics on the second Drosophila chromosome. The distances between the genes (in map units) are equal to the percentage of crossing-over events that occur between different alleles.

Linkage mapping

• Comparison of recombination frequencies can also be used to figure out the order of genes on a chromosome.
• For example, let's suppose we have three genes, A, B, and C, and we want to know their order on the chromosome (ABC? ACB? CAB?). If we look at recombination frequencies among all three possible pairs of genes (AC, AB, BC), we can figure out which genes lie furthest apart, and which other gene lies in the middle. Specifically, the pair of genes with the largest recombination frequency must flank the third gene.

Linkage mapping (cont.)
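The ordering rule for three genes (the pair with the largest recombination frequency flanks the third gene) can be sketched in Python. The function name and the example frequencies for A, B and C are assumptions made for illustration.

```python
def gene_order(pairwise_rf):
    """Order three linked genes from their pairwise recombination frequencies.

    `pairwise_rf` maps a frozenset of two gene names to a frequency in percent.
    The pair with the largest frequency is placed at the two ends. The reverse
    order describes the same map.
    """
    if len(pairwise_rf) != 3:
        raise ValueError("expected exactly three gene pairs")
    outer_pair = max(pairwise_rf, key=pairwise_rf.get)
    all_genes = set().union(*pairwise_rf)
    (middle,) = all_genes - outer_pair
    first, last = sorted(outer_pair)
    return [first, middle, last]


# Hypothetical frequencies for genes A, B and C.
example = {
    frozenset("AB"): 5.0,
    frozenset("BC"): 10.0,
    frozenset("AC"): 14.0,
}
print(gene_order(example))  # ['A', 'B', 'C']
```

With the distances from the three-gene map later in these notes (A to D 13.2, D to B 6.4, A to B 18.5), the same function places D between A and B.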

• By doing this type of analysis with more and more genes (e.g., adding in genes D, E, and F and figuring out their relationships to A, B, and C) we can build up linkage maps of entire chromosomes.
• In linkage maps, you may see distances expressed as centimorgans (cM) or map units rather than recombination frequencies.
• There's a direct relationship among these values: 1% recombination frequency is equivalent to 1 centimorgan (cM) or 1 map unit.

Process of mapping

Process of mapping (cont.)

Mapping distance between 2 genes

Recombination frequency = (268/1448) x 100 = 18.5%

The map distance between genes A and B is 18.5 cM.

Mapping distance for more than 2 genes

Testcross: A B D / a b d x a b d / a b d

Progeny classes: A B D / a b d, A B d / a b d, and so on.

Resulting map (gene order A, D, B):
A to D: 13.2 mu
D to B: 6.4 mu
A to B: 18.5 mu

Exercise 1

Exercise 2

Exercise 3

Exercise 4

Exercise 5

Exercise 6