Genome by Yu Zhang a Dissertat

Identification and analysis of recently duplicated genes in Channel Catfish (Ictalurus punctatus) genome by Yu Zhang A dissertation submitted to the Graduate Faculty of Auburn University in partial fulfillment of the requirements for the Degree of Doctor of Philosophy Auburn, Alabama December 14, 2013 Keywords: gene duplication, catfish, genome, teleost Copyright 2013 by Yu Zhang Approved by Zhanjiang Liu, Chair, Alumni Professor of Fisheries and Allied Aquacultures Eric Peatman, Assistant Professor of Fisheries and Allied Aquacultures Rex Dunham, Professor of Fisheries and Allied Aquacultures Nannan Liu, Professor of Entomology and Plant Pathology Abstract Gene duplication is an important source of genetic innovation and structure and usually associated to the lineage-specific adaptive trait during evolution, such as genes related to immunity, development and reproduction. Channel catfish (Ictalurus punctatus) is the primary aquaculture species in the United States, which represents 70% of the total aquaculture production. However, the set size of duplicated genes and duplication pattern in channel catfish are still less well understood. In this study, a total of 2,130 duplicated gene sets were identified by using predicted mRNA in channel catfish. Similar overall distribution patterns of duplication size were observed among catfish and model fish species (zebrafish, medaka, stickle and Tetraodon) under comparison, significant differences were noted, with catfish harboring the highest percentage (69.4%) of duplicated gene sets with only two copies. A total of 1,128 duplication sets can be characterized with their duplication types with Intra-chromosomal duplications accounting for most proportion. The large number of duplicated genes with low Ks indicated that most of duplicated genes predicted in this study were recently duplicated genes. GO enrichment analysis indicated that GO terms associated to various metabolic and catabolic processes, response to stimuli and signal transduction were over-represented in duplicated genes. When compared with the four model fish species, a total of 239 catfish specific duplicated genes were identified, with 71.1% of duplication sets containing 2 copies and 59.4% of duplicated genes (pairwise) had small Ks values of ≤ 1. These results suggested that catfish species-specific duplicated genes were ii mostly recent. The catfish species-specific duplicated genes included a set of genes related to immune and stress responses suggesting a mechanism for catfish to adapt to its environment. This study is the first study on surveying genome-wide duplicated genes in channel catfish. This research is not only useful for further catfish whole genome annotation, but also useful for understanding environmental adaption in channel catfish. iii Acknowledgments First and foremost, I would like to express my deep and sincere gratitude to my advisor Dr. Zhanjiang Liu for his constant guidance, support and patience throughout my degree program. His scientific attitude and philosophy have had a profound influence on me. Under his supervision, I learned how to do scientific research. Without him, it would be impossible for me to finish this dissertation. I would like to express my gratitude to my committee: Dr. Nannan Liu, Dr. Eric Peatman, Dr. Dunham and my dissertation outside reader, Dr. Bernhard Kaltenboeck for their advice and critical reading of my dissertation. Sincere thanks will also go to Huseyin Kucuktas and Ludmilla Kaltenboeck for technical help, and the colleagues in the laboratory for their help, collaboration, and friendship. I would like to express my gratitude to the Chinese Scholarship Council for the financial support through my PhD study. Finally, I would like to thank my parents and all my family members for their love and support. iv Table of Contents Abstract ......................................................................................................................................... ii Acknowledgments........................................................................................................................ iv List of Tables .............................................................................................................................. vii List of Illustrations ..................................................................................................................... viii List of Abbreviations ................................................................................................................... ix I. INTRODUCTION AND LITERATURE REVIEW ....................................................................1 Mechanism of gene duplication .................................................................................................1 The fate of duplicated genes .......................................................................................................3 Tandem duplication .....................................................................................................................4 Segmental duplications ................................................................................................................5 Copy number variation ................................................................................................................7 Fish-specific genome duplication (FSGD) and Hox gene cluster .........................................9 Identification of gene duplication ........................................................................................... 11 II. MATERIALS AND METHODS ................................................................................................. 15 mRNA prediction, clustering and identification of duplicated gene sets .......................... 15 Analysis for the nature of duplicated genes in catfish .......................................................... 16 Identification of the catfish species-specific duplicated genes ........................................... 17 Analysis of synonymous substitution (Ks) for duplicated gene pairs ................................ 17 Gene enrichment analysis of duplicated genes ...................................................................... 18 III. RESULT .......................................................................................................................................... 18 v Generation of the duplicated gene pool ....................................................................................... 18 Analysis of duplicated copy numbers .......................................................................................... 19 Characterization of duplicated genes ........................................................................................... 22 Distribution of duplicated genes on the 29 chromosomes ........................................................ 23 Analysis of Ks distance in duplicated genes .............................................................................. 24 Evolutionary conservation of duplicated genes among teleost species .................................. 26 Catfish species-specific duplicated genes ................................................................................... 28 Functional enrichment analysis of the catfish duplication genes ............................................ 31 IV. DISCUSSIONS .............................................................................................................................. 40 V. CONCLUSION .............................................................................................................................. 45 REFERENCE ........................................................................................................................................ 103 vi List of Tables Table 1 ....................................................................................................................................... 19 Table 2 ....................................................................................................................................... 20 Table 3 ....................................................................................................................................... 22 Table 4 ....................................................................................................................................... 28 Table 5 ....................................................................................................................................... 29 Table 6 ....................................................................................................................................... 30 Table 7 ....................................................................................................................................... 33 Table 8 ....................................................................................................................................... 35 Table 9 ....................................................................................................................................... 37 Table 10 ..................................................................................................................................... 38 Table 11 ..................................................................................................................................... 39 Supplement Table 1 ..................................................................................................................

Genome by Yu Zhang a Dissertat

Identification of Potential Autoantigens in Anti-CCP-Positive and Anti-CCP-Negative Rheumatoid Arthritis Using Citrulline-Specific Protein Arrays

A Field Guide to Eukaryotic Transposable Elements

Tandemly Arrayed Genes in Vertebrate Genomes

Evolution of Orthologous Tandemly Arrayed Gene Clusters

Noelia Díaz Blanco

Correlated Variation and Population Differentiation in Satellite DNA Abundance Among Lines of Drosophila Melanogaster

Genome Analysis of Protozoan Parasites: Proceedings of a Workshop Held at ILRAD, Nairobi, Kenya, 11–13 November 1992, Ed

First Efficient CRISPR-Cas9-Mediated Genome Editing in Leishmania Parasites

Nº Ref Uniprot Proteína Péptidos Identificados Por MS/MS 1 P01024

Applying CRISPR/Cas for Genome Engineering in Plants

A Broad Analysis of Tandemly Arrayed Genes in the Genomes of Human, Mouse, and Rat

Efficient Inversions and Duplications of Mammalian