Network Mining Approach to Cancer Biomarker Discovery

NETWORK MINING APPROACH TO CANCER BIOMARKER DISCOVERY

THESIS

Presented in Partial Fulfillment of the Requirements for the Degree Master of Science in the Graduate School of The Ohio State University

By
Praneeth Uppalapati, B.E.
Graduate Program in Computer Science and Engineering

The Ohio State University
2010

Thesis Committee:
Dr. Kun Huang, Advisor
Dr. Raghu Machiraju

Copyright by Praneeth Uppalapati 2010

ABSTRACT

With the rapid development of high-throughput gene expression profiling technology, molecular profiling has become a powerful tool to characterize disease subtypes and discover gene signatures. Most existing gene signature discovery methods apply statistical methods to select genes whose expression values can differentiate subject groups. However, a drawback of these approaches is that the selected genes are not functionally related and hence cannot reveal the biological mechanisms behind the differences between patient groups. Gene co-expression network analysis can be used to mine functionally related sets of genes that can be marked as potential biomarkers through survival analysis.

We present an efficient heuristic algorithm, EigenCut, that exploits the properties of gene co-expression networks to mine functionally related and dense modules of genes. We apply this method to a brain tumor (Glioblastoma Multiforme, GBM) study to obtain functionally related clusters. If functional groups of genes with predictive power for patient prognosis can be identified, insights into the mechanisms related to metastasis in GBM can be obtained and better therapeutic plans can be developed. We predicted potential biomarkers by dividing the patients into two groups based on their expression profiles over the genes in the clusters and comparing their survival outcomes through survival analysis. We obtained 12 potential biomarkers with log-rank test p-values less than 0.01.
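The survival-analysis step described in the abstract (split patients into two groups by their expression over a gene cluster, then compare survival with a log-rank test) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the function names `split_by_module_score` and `logrank_test` are hypothetical, and the median split on mean cluster expression is an assumed grouping rule, since the abstract does not say how the two groups are formed.

```python
import math
from statistics import median


def split_by_module_score(expr, module_genes):
    """Split patients into (high, low) groups.

    expr: dict mapping patient id -> dict mapping gene -> expression value.
    Assumption: groups are formed by a median split on the mean expression
    over the module's genes; the thesis may use a different rule.
    """
    scores = {
        patient: sum(values[g] for g in module_genes) / len(module_genes)
        for patient, values in expr.items()
    }
    cut = median(scores.values())
    high = [p for p, s in scores.items() if s > cut]
    low = [p for p, s in scores.items() if s <= cut]
    return high, low


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    events_* are 1 for an observed death and 0 for a censored observation.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, per group.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Deaths occurring exactly at time t, per group.
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

Under these assumptions, a cluster would be flagged as a potential biomarker when the p-value returned for its two patient groups falls below the chosen threshold (0.01 in the abstract).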
DEDICATION

This document is dedicated to my family & friends.

ACKNOWLEDGMENTS

I would like to thank my research advisor Dr. Kun Huang for the support and guidance he has given me throughout the entire period of my work. It has been a great pleasure to work with him. I would also like to thank my advisor Dr. Raghu Machiraju for his unconditional help and suggestions.

I also thank Yang Xiang and Abhisek Kundu for the help and support they have extended to me. Their inputs and suggestions have been a great help throughout my thesis.

I would like to express my deepest gratitude to my parents, who have shown unconditional love and care throughout my life. I am thankful to my friends, who have given me moral support and helped me get through many hard times.

VITA

May 2004: Sri Chaitanya Jr. College
2008: B.E. Computer Science and Engineering, Osmania University
2008 to present: Graduate Student, Computer Science and Engineering Department, The Ohio State University
2009 to present: Graduate Research Associate, Department of Bio-Medical Informatics, The Ohio State University

FIELDS OF STUDY

Major Field: Computer Science and Engineering

TABLE OF CONTENTS

ABSTRACT
DEDICATION
ACKNOWLEDGMENTS
VITA
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1: INTRODUCTION
  1.1 Background
  1.2 Motivation
  1.3 Thesis Statement
  1.4 Contribution
  1.5 Organization
CHAPTER 2: GENE CO-EXPRESSION NETWORK ANALYSIS
  2.1 Co-expression Similarity
  2.2 Building a Gene co-expression network
  2.3 Mining Modules
  2.4 Comparing Modules
CHAPTER 3: DENSE NETWORK COMPONENT DISCOVERY METHODS
  3.1 K – Core Algorithm
  3.2 Min-Cut Algorithm
  3.3 Prune-Cut Algorithm – Modification to Min-Cut algorithm
CHAPTER 4: EIGEN CUT ALGORITHM: A NEW NETWORK APPROACH
  4.1 The Algorithm
  4.2 Performance
CHAPTER 5: APPLICATIONS
  5.1 Application 1: TCGA data on Glioblastoma Multiforme
    5.1.1 Gene – miRNA Interaction Prediction
    5.1.2 Gene Signature Discovery
      5.1.2.1 Survival Analysis
      5.1.2.3 Results
  5.2 Application 2: Breast Cancer Data - GDS2250 dataset
CHAPTER 6: CONCLUSION & FUTURE WORK
BIBLIOGRAPHY
Appendix A: Gene lists for Clusters A - L
Appendix B: Codes/Programs

LIST OF TABLES

Table 1. Running times and number of clusters for different graph sizes (no. of nodes)
Table 2. Gene Enrichment results for Cluster 1 (GO: Molecular Function)
Table 3. Gene Enrichment results for cluster 1 (GO: Biological Process)
Table 4. Gene Enrichment results for cluster 1 (GO: Cellular Component)
Table 5. List of clusters (EigenCut without overlaps and next-available-hub seed-selection) with p-values less than 0.05 in the log-rank tests
Table 6. List of clusters (EigenCut without overlaps and next-available-higher index seed-selection) with p-values less than 0.05 in the log-rank tests
Table 7. List of clusters (EigenCut with multi-merge, overlaps and next-available-hub seed-selection) with p-values less than 0.05 in the log-rank tests
Table 8. List of clusters (EigenCut on K-core) with p-values less than 0.05 in the log-rank tests
Table 9. Potential Biomarkers identified through different methods
Table 10. GO enrichment results using ToppGene for Cluster D (GO: Biological Processes)
Table 11. GO Enrichment results using ToppGene for Cluster E (GO: Biological Processes)
Table 12. GO Enrichment results using ToppGene for Cluster H (GO: Biological Processes)
Table 13. Overlap values for the basal-like cancer clusters versus non-basal-like cancer clusters
Table 14. Cluster A
Table 15. Cluster B