
Cuban Amazon genome annotation
Student: Sofiia Kolchanova
Scientific advisor: Pavel Dobrynin

Introduction
• Amazona leucocephala has 5 subspecies with populations distributed across Cuba, the Bahamas and the Cayman Islands. They differ in habitat, diet and plumage coloration patterns.
• Due to habitat loss and trapping for the wild parrot trade, the Cuban Amazon is now an endangered species.
• Genome annotation provides information that can be used for population studies, phylogenetics, and studies of the evolution and functioning of genes and gene families.

Main goal and objectives
• The main goal of this project is to annotate the already assembled genome of the Cuban parrot (Dobrynin, P., Rivera, I., and Oleksyk, T.L. Sequencing, assembly and comparative genomics analysis of Cuban amazon parrot genome (Amazona leucocephala)).
• Objectives:
1) Annotate repeats using different repeat masking tools
2) Annotate genes in the whole genome (based on homology and de novo), and find core genes with CEGMA (Core Eukaryotic Genes Mapping Approach)
3) Annotate SNPs and predict their possible effects (e.g. missense, nonsense)

Assembly
[figure]

Annotation of repeats
• Tools: RepeatMasker, Tandem Repeat Masker, Python
TRF: 12.1 % of the genome sequence masked
HT: 9.7 % masked

Annotation of genes
1. Annotation using homologues from the reference genome
Tools: BLASTN, Splign, Python scripts (Dobby-tools), bedtools
Total: 8403 genes annotated
With the ORF correction: 4984 genes
2. De novo search for genes (AUGUSTUS):
23870 genes predicted

Annotation of genes
CEGMA result: 205 conserved genes, of which 12_ are complete (not partial), out of the dataset of 458 core eukaryotic genes.
Intersection:
CEGMA × AUGUSTUS    97.3 %
CEGMA × Splign      98.1 %
Splign × AUGUSTUS   30.1 %

SNPs annotation
Tool: SnpEff
[figures; one panel titled "Coverage"]

Results
• Repeats (simple repeats, microsatellites, DNA transposons, retroelements etc.) were annotated in the genome of the Cuban parrot; on average, repeats constitute about 10 % of the genomic sequence.
• A de novo search for genes (prediction) was performed, as well as homology-based annotation and verification against the dataset of core eukaryotic genes (CEGMA).
• SNPs were annotated, along with prediction of their possible effects on known genes (such as amino acid substitutions or gain of a stop codon).

Plans
• Make the annotation more precise
• Place the acquired genes against gene families from TreeFam
• Expand the annotation of repeats
• Assess demographic history using PSMC
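The intersection percentages reported for the gene sets (CEGMA × AUGUSTUS, CEGMA × Splign, Splign × AUGUSTUS) can be illustrated with a small sketch. The slides do not say how the intersection was defined, so the code below assumes one plausible reading: the share of genes in the first set whose coordinates overlap at least one gene of the second set on the same scaffold. The function names, the half-open (BED-style) coordinates and the toy intervals are all hypothetical and are not taken from the project; in practice a tool such as `bedtools intersect`, which the slides list among the tools used, would likely do this step.

```python
from collections import defaultdict


def load_intervals(rows):
    """Group (scaffold, start, end) tuples by scaffold name."""
    by_scaffold = defaultdict(list)
    for scaffold, start, end in rows:
        by_scaffold[scaffold].append((start, end))
    return by_scaffold


def overlaps(span, others):
    """True if the half-open interval `span` overlaps any interval in `others`."""
    start, end = span
    return any(s < end and start < e for s, e in others)


def percent_overlapping(set_a, set_b):
    """Percentage of genes in set_a that overlap at least one gene in set_b.

    Assumption: "intersection" means coordinate overlap on the same scaffold,
    with set_a as the denominator. The source does not confirm this definition.
    """
    a = load_intervals(set_a)
    b = load_intervals(set_b)
    total = sum(len(spans) for spans in a.values())
    if total == 0:
        return 0.0
    hits = sum(
        1
        for scaffold, spans in a.items()
        for span in spans
        if overlaps(span, b.get(scaffold, []))
    )
    return 100.0 * hits / total


# Toy data (hypothetical, not from the Cuban Amazon annotation).
cegma = [("scaf1", 100, 500), ("scaf1", 1000, 1500), ("scaf2", 50, 300)]
augustus = [("scaf1", 450, 900), ("scaf2", 0, 60), ("scaf3", 10, 20)]

print(f"{percent_overlapping(cegma, augustus):.1f} %")  # 66.7 %
```

Note that this measure is asymmetric: swapping the two arguments changes the denominator, which may matter when comparing a small set (CEGMA) against a large one (AUGUSTUS).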