<p>BIT150 – Fall 2008 – Homework 3</p>
<p>Due on Thursday, October 15th, by email to the TA ([email protected]) as Hwk3_Lastname, BEFORE the lab.</p>
<p>1. (35 points) The following is a multiple sequence alignment of a 41-bp fragment from a putative plant cytochrome P450 gene from rice, maize, sorghum, and rye:</p>
<p>1.1. Using the Jukes and Cantor 1-parameter model shown below:</p>
<table>
<tr><th></th><th>A</th><th>C</th><th>G</th><th>T</th></tr>
<tr><th>A</th><td>-</td><td>1</td><td>1</td><td>1</td></tr>
<tr><th>C</th><td>1</td><td>-</td><td>1</td><td>1</td></tr>
<tr><th>G</th><td>1</td><td>1</td><td>-</td><td>1</td></tr>
<tr><th>T</th><td>1</td><td>1</td><td>1</td><td>-</td></tr>
</table>
<p>- Calculate pairwise distances between the sequences and construct BY HAND a distance matrix.</p>
<p>- Show your calculations.</p>
<p>- Present the distance matrix.</p>
<p>1.2. Using the distance-based method UPGMA:</p>
<p>- Construct BY HAND a phylogenetic tree based on the distance matrix created in 1.1.</p>
<p>- Provide distances for all the branches.</p>
<p>- Include all your intermediate matrices.</p>
<p>- Show your calculations.</p>
<p>- Manually draw the phylogenetic tree.</p>
<p>2. (10 points) Sequences from the flavonoid 3’ hydroxylase gene, Fop1, are provided below.</p>
<p>>Triticum monococcum MDHSVLLLLASLAAVAVAAVWHLRSHGRRTKLPLPPGPRGWPVLGNLPQLGAMPHHTMAALARQHGPLFRLRFGSVEVVVAASAKVARS FLRAHDANFSDRPPTSGAEHLAYNYQDLVFAPYGARWRALRKLCALHLFSARALDALRTIRQDEARLMVTHLLSSSSPAGVAVNLCAIN VCATNALARAAIGRRMFGDGVGEGAREFKDMVVELMQLAGVLNIGDFVPALRWLDPQGVVAKMKRLHRRYDRMMDGFISERGQHAGEME GNDLLSVMLATMRWQSPADAGEEDGIKFTEIDIKALLLNLFTAGTDTTSSTVEWALAELIRDPCILKQLQHELDGVVGNDRLVTEADLP RLTFLAAVIKETFRLHPATPLSLPRVAAEDCEVDGYHVSKGTTLIMNVWAIARDPASWGPDPLEFRPVRFLPGGLHESADVKGGDYELI PFGAGRRICAGLGWGLRMVTLMTAMLVHAFDWSLVDGTTPEKLNMEEAYGQTLQRAVPLVVQPVPRLLSSAYTV</p>
<p>>Zea mays MCAMAREYGPLFRLRFGSAEVVVAASARVAAQFLRAHDANFSNRPPNSGAEHVAYNYQDLVFAPYGSRWRALRKLCALHLFSAKALDDL RGVREGEVALMVRELARQGERGRAAVALGQVANVCATNTLARATVGRRVFAVDGGEGAREFKEMVVELMQLAGVFNVGDFVPALAWLDP QGVVGRMKRLHRRYDDMMNGIIRERKAAEEGKDLLSVLLARMREQQPLAEGDDTRFNETDIKALLLNLFTAGTDTTSSTVEWALAELIR HPDVLRKAQQELDAVVGRDRLVSESDLPRLTYLTAVIKETFRLHPSTPLSLPRVAAEECEVDGFRIPAGTTLLVNVWAIARDPEAWPEP LEFRPARFLPGGSHAGVDVKGSDFELIPFGAGRRICAGLSWGLRMVTLMTATLVHALDWDLADGMTADKLDMEEAYGLTLQRAVPLMVR PAPRLLPSAYAE</p>
<p>>Oryza sativa 
MDVVPLPLLLGSLAVSAAVWYLVYFLRGGSGGDAARKRRPLPPGPRGWPVLGNLPQLGDKPHHTMCALARQYGPLFRLRFGCAEVVVAA SAPVAAQFLRGHDANFSNRPPNSGAEHVAYNYQDLVFAPYGARWRALRKLCALHLFSAKALDDLRAVREGEVALMVRNLARQQAASVAL GQEANVCATNTLARATIGHRVFAVDGGEGAREFKEMVVELMQLAGVFNVGDFVPALRWLDPQGVVAKMKRLHRRYDNMMNGFINERKAG AQPDGVAAGEHGNDLLSVLLARMQEEQKLDGDGEKITETDIKALLLNLFTAGTDTTSSTVEWALAELIRHPDVLKEAQHELDTVVGRGR LVSESDLPRLPYLTAVIKETFRLHPSTPLSLPREAAEECEVDGYRIPKGATLLVNVWAIARDPTQWPDPLQYQPSRFLPGRMHADVDVK GADFGLIPFGAGRRICAGLSWGLRMVTLMTATLVHGFDWTLANGATPDKLNMEEAYGLTLQRAVPLMVQPVPRLLPSAYGV</p>
<p>>Sorghum bicolor MDVPLPLLLGSLAVSVVVWCLLLRRGGNGKGKGKRPLPPGPRGWPVLGNLPQVGSHPHHTMCALAKEYGPLFRLRFGSAEVVVAASARV AAQFLRAHDANFSNRPPNSGAEHVAYNYQDLVFAPYGSRWRALRKLCALHLFSAKALDDLRGVREGEVALMVRELARHQHQHAGVPLGQ VANVCATNTLARATVGRRVFAVDGGEEAREFKDMVVELMQLAGVFNVGDFVPALAWLDLQGVVGKMKRLHRRYDDMMNGIIRERKAVEE GKDLLSVLLARMREQQSLADGEDSMINETDIKALLLNLFTAGTDTTSSTVEWALAELIRHPDVLKKAQEELDAVVGRDRLVSESDLPRL TYLTAVIKETFRLHPSTPLSLPRVAAEECEVDGFRIPAGTTLLVNVWAIARDPEAWPEPLQFRPDRFLPGGSHAGVDVKGSDFELIPFG AGRRICAGLSWGLRMVTLMTATLVHALDWDLADGMTAYKLDMEEAYGLTLQRAVPLMVRPAPRLLPSAYAAE</p>
<p>>Phyllostachys edulis MDLPLPLVLSTLAVSAIVCYVLFFRAGKARRRAPLPPGPRGWPVLGNLPQLGGKTHQTLHVMTKVYGPLLRLRFGSSDVVVAGSAAVAE QFLRIHDAKFSNRPPNSGGEHMAYNYQDVVFGPYGPRWRAMRKVCAVNLFSARALDDLRAVRERETALMVRSLVEASAPRGAPAVPLGK AVNVCTTNALSRAAVGRRVFAAGSEVAKEFKEIVLEVMQVGGVLNVGDFVPALRWLDPQGVVAKMKKLHRRYDDMMNAIIGERRAGVKP AGEEGKDLLGLLLAMMQEEQPLAGGEEDKITDTDIKALTLVS</p>
<p>2.1. Construct phylogenetic trees using the NJ and UPGMA methods:</p>
<p>- Use Number of differences as the substitution model.</p>
<p>- Use bootstrap as the test of inferred phylogeny, with 1,000 replications.</p>
<p>- Present the trees in your homework.</p>
<p>2.2. What do the bootstrap values indicate in these trees?</p>
<p>3. (15 points) From the following trees (A, B, C, D):</p>
<p>3.1. Construct BY HAND:</p>
<p>- a strict consensus tree (groups present in ALL trees);</p>
<p>- a 50% majority-rule consensus tree (groups in >50% of the trees).</p>
<p>3.2. What are consensus trees used for?</p>
<p>4. 
(10 points) From the following induced multiple sequence alignment:</p>
<p>Induced multiple sequence alignment of a segment of the ‘4-coumarate Co-A Ligase’ gene (‘-’ indicates a gap).</p>
<table>
<tr><th>H1</th><td>T</td><td>C</td><td>T</td><td>A</td><td>C</td><td>T</td><td>G</td><td>A</td><td>C</td></tr>
<tr><th>H2</th><td>A</td><td>C</td><td>-</td><td>A</td><td>C</td><td>G</td><td>G</td><td>A</td><td>C</td></tr>
<tr><th>H3</th><td>A</td><td>C</td><td>T</td><td>A</td><td>C</td><td>G</td><td>A</td><td>A</td><td>T</td></tr>
<tr><th>H4</th><td>A</td><td>C</td><td>T</td><td>G</td><td>T</td><td>G</td><td>-</td><td>-</td><td>C</td></tr>
</table>
<p>4.1. Calculate BY HAND the ‘sum-of-pairs’ distance score, scoring transitions (A<->G and C<->T) as 1 unit of distance and transversions as 2 units of distance (Kimura 2-parameter model), with affine gap penalties: gap opening 3; gap extension 1.</p>
<table>
<tr><th></th><th>A</th><th>C</th><th>G</th><th>T</th></tr>
<tr><th>A</th><td>-</td><td>2</td><td>1</td><td>2</td></tr>
<tr><th>C</th><td>2</td><td>-</td><td>2</td><td>1</td></tr>
<tr><th>G</th><td>1</td><td>2</td><td>-</td><td>2</td></tr>
<tr><th>T</th><td>2</td><td>1</td><td>2</td><td>-</td></tr>
</table>
<p>- Indicate all your calculations within the table provided:</p>
<table>
<tr><th></th><th>H1 vs. H2</th><th>H1 vs. H3</th><th>H1 vs. H4</th><th>H2 vs. H3</th><th>H2 vs. H4</th><th>H3 vs. H4</th><th>Sum of Pairs</th></tr>
<tr><th>Kimura 2-parameter</th><td></td><td></td><td></td><td></td><td></td><td></td><td></td></tr>
</table>
<p>5. (30 points) Given the following 6 CCT domain protein sequences:</p>
<p>>T._urartu_ZCCT1 MSMSCGLCGANNCPRLMVSPIHHRHHHHQEHQLREHQFFAQGNHHHHHPVPLPPANFDHSRTWTTPFHETAAAGNSSRLTLEVGAGGRP MAHLVQPPARAHIVPFYGGAFTNTISNEAIMTIDTEMMVGPAHYPTMQERAAKVMRYREKRKRRRYDKQIRYESRKAYAELRPRVNGRF VKVPEAMASPSSPASPYDPSKLHLRWFR</p>
<p>>Ae._tauschii_ZCCT-D1 MSMSCGLCGPNNCPRLMVSPIHHHHHQEHQLREHQFFAQGNHHHQHHGAAADHPVPLPPANFDHRRTWTTPFHETAAAGSSISRLTLEV GAGGRHMAHLSSARAHIVPFYGGAFTNTISNEAIMTIDTEMMVGPAHYPTMQERAAKVMRYREKRKRRRYDKQIRYESRKAYAELRPRV NGRFVKVPEAMASPSSPASPYDPSKLHLGWLR</p>
<p>>ZCCT-S2_Ae._speltoides MSMSCGLCGASNCPHHMISPVLQHHQEHGLREYQFFAQGHHHHHHDGTAADYPPPPPANCHHCKSWTTPFHETAAAGNSSRLTLEVDAG GQHLAHLLQPPAPPRATIVPFREGAFTSTISNATIMTIDTEMMVGAAHNPTMQERHAKVMRYREKRKRRRYDKQIRYESRKAYAKLRPR VNGRFVKVPEAAVSPSPPASPYDPSKLNLGLFR</p>
<p>>ZCCT2_T._tauschii MSMSCGLCGASNCPHHMNSPVLHHHHHHQEHRLCEYQFFAQGQHHHHHGAAADYPPPPPANCHHRRSWTTPFHETAAAGNSSRLTLEVD AGGQHTAHLLQPPAPPRATIVPFCGGAFTSTISNATIRTIDTEMMVGAAHNPTMQEREAKVMRYREKRKRRRYDKQIRYESRKAYAELR PRVNGRFVKVPEATASPSPPTSPYDPSKLHLGWFR</p>
<p>>Os_AAL7978 MSAASGAACGVCGGGVGECGCLLHQRRGGGGGGGGGGVRCGIAADLNRGFPAIFQGVGVEETAVEGDGGAQPAAGLQEFQFFGHDDHDS VAWLFNDPAPPGGTDHQLHRQTAPMAVGNGAAAAQQRQAFDAYAQYQPGHGLTFDVPLTRGEAAAAVLEASLGLGGAGAGGRNPATSSS 
TIMSFCGSTFTDAVSSIPKDHAAAAAVVANGGLSGGGGDPAMDREAKVMRYKEKRKRRRYEKQIRYASRKAYAEMRPRVKGRFAKVPDG ELDGATPPPPSSAAGGGYEPGRLDLGWFRS</p>
<p>>OSI Os_AP005307 MGMANEESPNYQVKKGGRIPPRSSLIYPFMSMGPAAGEGCGLCGADGGGCCSRHRHDDDGFPFVFPPSACQGIGAPAPPVHEFQFFGND GGGDDGESVAWLFDDYPPPSPVAAAAGMHHRQPPYDGVVAPPSLFRRNTGAGGLTFDVSLGERPDLDAGLGLGGGGGRHAEAAASATIM SYCGSTFTDAASSMPKEMVAAMADDGESLNPNTVVGAMVEREAKLMRYKEKRKKRCYEKQIRYASRKAYAEMRPRVRGRFAKEPDQEAV APPSTYVDPSRLELGQWFR</p>
<p>5.1. Use tCOFFEE to produce an alignment of conserved protein regions.</p>
<p>5.2. Produce a multiple sequence alignment using ClustalW. Using BOXSHADE, prepare a publishable alignment for these sequences. Paste the alignment into your homework document.</p>
<p>- Between tCOFFEE and ClustalW, which program seems to have better identified the conserved regions between these genes?</p>
<p>5.3. Construct a phylogenetic tree with the NJ method (using Number of differences as the substitution model) with bootstrap values. Include the tree here.</p>
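<p>The scoring scheme described in question 4.1 (transitions 1, transversions 2, gap opening 3, gap extension 1) can be sketched in Python as a way to check hand calculations. This is only an illustrative sketch, not a reference solution: the function names substitution_cost, pair_score, and sum_of_pairs are made up for this example. It also assumes one particular affine-gap convention: the first position of a gap run is charged the opening penalty, each further position of the same run is charged the extension penalty, and columns where both sequences carry a gap are dropped from the pairwise comparison. Other conventions (for example, charging opening plus extension for the first gap position) would give different totals.</p>

```python
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_cost(a, b, transition=1, transversion=2):
    """Cost of aligning base a with base b (0 if they are identical)."""
    if a == b:
        return 0
    # A<->G and C<->T stay within one chemical class: transitions.
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return transition if same_class else transversion


def pair_score(s1, s2, gap_open=3, gap_extend=1):
    """Distance between two rows of an induced alignment.

    Assumed convention: first gap position of a run costs gap_open,
    each following position of the same run costs gap_extend.
    """
    score = 0
    prev_gap_in = None  # which row held the gap in the previous kept column
    for a, b in zip(s1, s2):
        if a == "-" and b == "-":
            continue  # gap-only column: dropped from the pairwise alignment
        if a == "-" or b == "-":
            gap_in = 1 if a == "-" else 2
            score += gap_extend if gap_in == prev_gap_in else gap_open
            prev_gap_in = gap_in
        else:
            score += substitution_cost(a, b)
            prev_gap_in = None
    return score


def sum_of_pairs(alignment):
    """Return ({(name1, name2): score}, total) over all pairs of rows."""
    scores = {
        (n1, n2): pair_score(alignment[n1], alignment[n2])
        for n1, n2 in combinations(alignment, 2)
    }
    return scores, sum(scores.values())


# Alignment rows as given in question 4.
alignment = {
    "H1": "TCTACTGAC",
    "H2": "AC-ACGGAC",
    "H3": "ACTACGAAT",
    "H4": "ACTGTG--C",
}

if __name__ == "__main__":
    scores, total = sum_of_pairs(alignment)
    for (n1, n2), score in scores.items():
        print(f"{n1} vs. {n2}: {score}")
    print("Sum of pairs:", total)
```

<p>Running the script prints the six pairwise scores in the same order as the columns of the table in 4.1, followed by their total, so each hand-computed cell can be compared individually.</p>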