Supplementary Information on Material and Methods

Patients and samples

All 165 study patients were treated by surgery for lung adenocarcinoma without prior chemotherapy. Fifty-eight patients received cisplatin-based adjuvant chemotherapy. They belonged to two groups based on their smoking status. Never smokers had a lifetime exposure of less than 100 cigarettes, which was cross-validated using an ad hoc form.

The main cohort included 77 never smokers, to whom 77 ever smokers were matched by surgery center, sex and disease stage. This main cohort was designed to address differences in genomic DNA copy-number profiles between never and ever smokers. Patients were predominantly women (88%). According to the TNM system in use at the time of diagnosis, the tumors were classified as stage I for 88 patients, stage II for 18 patients and stage III for 48 patients(1). While by design sex and disease stage were distributed equally between never and ever smokers, the two groups defined by smoking status differed by age (mean age 60 years for ever smokers versus 68 years for never smokers; p-value <0.0001) and by the rates of EGFR mutation (66% for never smokers versus 17% for ever smokers; p-value <0.0001) and KRAS mutation (7% for never smokers versus 41% for ever smokers; p-value <0.0001) (supplementary Table S4).

An additional group of 11 never smokers, who had been treated at participating surgical centers, had been studied using CGH arrays. They presented with typical characteristics of never smoker patients, including female sex preponderance (10 patients), higher age (median 65 years) compared to ever smokers and a high rate of EGFR mutation (7 patients). These additional cases were included only in the present study of gene expression as the CGH array data could not be easily combined with genomic data obtained using SNP arrays.
Their RNAs were extracted, purified, quantified, assessed for quality and hybridized together with the samples from the main cohort.

The pathological diagnoses were reviewed with the help of immunohistochemical stains. Most tumors (93%) expressed the NKX2-1 protein. Cases for which a doubt about the primary site in the lung remained were excluded. All adenocarcinomas were invasive. A bronchiolo-alveolar component was recorded when a non-invasive lepidic growth was seen adjacent to a component of invasive adenocarcinoma, which corresponds to the lepidic subtype in the proposed revisions to the histological classification of lung adenocarcinoma(2)(3). A bronchiolo-alveolar component was more frequent (p-value 0.0007) in never smokers (61%) than in ever smokers (34%). A translocation involving ALK was shown using FISH in 4 tumors from never smokers with wild-type EGFR(4).

Genomic DNA and RNA were extracted from frozen tissue sections using commercial kits (Qiagen, Hilden, Germany) at Institut Gustave-Roussy. Frozen samples were sectioned after removing most of the embedding medium. Beginning and end sections were stained with haematoxylin and eosin to assess the proportion of tumor cells. Only cases with an average tumor cell proportion of 50% or more were included. Thirty to 60 sections were placed in two separate tubes and kept frozen in liquid nitrogen until nucleic acid extraction. Both RNA and DNA were assessed for integrity and quantity following stringent quality control criteria (CIT program protocols: cit.ligue-cancer.net).

Genomic array analysis

Genomic arrays were carried out on the Integragen platform (Evry, France). DNAs were hybridized on Illumina SNP HumanCNV370 chips according to the instructions provided by the array manufacturer (Illumina, San Diego, CA). Raw fluorescent signals were imported and normalized using Illumina BeadStudio software. A supplemental normalization procedure, tQN, was applied to correct for dye bias(5).
Genomic profiles were segmented using the circular binary segmentation algorithm (DNAcopy package, Bioconductor)(6). The absolute copy numbers were determined using the Genome Alteration Print method(7). The Genomic Identification of Significant Targets In Cancer (GISTIC) version 2.0 algorithm was applied to high-quality copy-number profiles with an amplitude threshold of 0.2 for copy-number amplifications or copy-number deletions(8). Scoring was performed using the GeneGISTIC 2 procedure. Significant regions were identified with a significance level of 0.25 for the residual q-value. Their peak boundaries were calculated with a confidence level of 0.85. Genes overlapping wide peak limits were listed as the most likely gene targets in each region. The frequencies of aberrations contributing to significant peak regions identified by GISTIC 2.0 were compared using chi-square tests with Bonferroni correction for multiple testing.

Gene expression analysis

Gene expression arrays were carried out on the IGBMC microarray platform (Strasbourg, France). Total RNAs were amplified, labeled and hybridized to Affymetrix Human Genome U133 Plus 2.0 GeneChip arrays, following the manufacturer’s protocol (Affymetrix, Santa Clara, CA). Microarrays were scanned with an Affymetrix GeneChip Scanner 3000 and raw intensities were quantified from the resulting images using GCOS 1.4 software (Affymetrix). Data were normalized using the Robust Multi-array Average method(9).

Unsupervised hierarchical clustering analysis of normal lung and tumor samples using the Pearson correlation metric was performed on the 1183 probe sets (quantile 0.975) with the greatest robust coefficient of variation between samples. Only probe sets with Affymetrix annotation class A and located on autosomes were considered. The tumor samples were from the LG cohort.
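The probe-set selection and the correlation metric used for clustering can be sketched as follows. This is a minimal illustration, not the authors' code: the source does not define the robust coefficient of variation, so the scaled-MAD-over-median form below is an assumption, as are the function names and the simple index-based quantile estimate.

```python
import math
import statistics


def robust_cv(values):
    """Robust CV, ASSUMED here to be scaled MAD / median.

    Expects positive values such as log2 RMA intensities.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return 1.4826 * mad / med


def top_variable(probes, quantile=0.975):
    """Return probe-set names whose robust CV reaches the given quantile.

    `probes` maps a probe-set name to its expression values across samples.
    The quantile is a simple index-based estimate (an assumption).
    """
    scores = {name: robust_cv(vals) for name, vals in probes.items()}
    ranked = sorted(scores.values())
    cutoff = ranked[min(len(ranked) - 1, int(quantile * len(ranked)))]
    return [name for name, score in scores.items() if score >= cutoff]


def pearson_distance(x, y):
    """Distance 1 - r between two profiles, r being the Pearson correlation."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 1.0 - sxy / math.sqrt(sxx * syy)
```

A matrix of `pearson_distance` values between samples, restricted to the probe sets returned by `top_variable`, could then be passed to any hierarchical clustering routine; the linkage rule is not stated in the source.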
The normal lung samples were from eleven female Asian never smokers with publicly available Human Genome U133 Plus 2.0 Affymetrix gene expression data (accession number: GSE19804). Clustering stability was assessed using a resampling strategy, noise addition and comparison of clustering procedures. Differences between sample clusters were tested using the chi-square test. Gene Ontology (GO) sets were obtained from GeneOntology.org. Genatomy was used to calculate hypergeometric enrichment with FDR correction of p-values(10)(11). Literature Vector Analysis (LitVAn) was used to infer gene cluster functionality with an evaluation of the significance of their scores (litvan.bio.columbia.edu)(12).

Both genomic and gene expression data were deposited in the ArrayExpress database (accession number: E-MTAB-923), which also includes a list of EGFR and KRAS mutations.

ATAD2 relative expression was measured in duplicate by real-time RT-PCR using the Hs00204205 Taqman® probe (Applied Biosystems, Carlsbad, CA, USA). POLR2A and YAP1 were selected as the best combination of stable internal control genes across the cohort using the Normfinder algorithm(13). The ATAD2 and internal control data were corrected for PCR efficiency and interplate variations. The comparative threshold cycle (CT) method was applied to calculate the mean ± s.d. of 2^(-ΔCT), and then the fold-change differences between groups and the corresponding two-sided Welch t-test p-values(14).

Integration of copy-number and gene expression data

We used COpy Number and EXpression In Cancer (CONEXIC) to integrate matched copy number (amplifications or deletions) and gene expression data from 80 paired samples(15). The list of potential regulators (candidate genes) for which amplification data or deletion data were considered included genes overlapping GISTIC 2.0 significant regions that were less than one third of a chromosome arm in length.
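The length criterion for candidate regions amounts to a ratio test, sketched below. The `Region` type, the `is_focal` helper and the arm length used in the usage example are hypothetical and purely illustrative; they are not taken from GISTIC or CONEXIC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A significant aberrant region (hypothetical container for illustration)."""
    arm: str    # chromosome arm label, e.g. "8q"
    start: int  # start coordinate in base pairs
    end: int    # end coordinate in base pairs


def is_focal(region, arm_lengths, max_fraction=1 / 3):
    """True when the region spans less than `max_fraction` of its chromosome arm.

    `arm_lengths` maps an arm label to its length in base pairs.
    """
    length = region.end - region.start
    return length / arm_lengths[region.arm] < max_fraction
```

For example, with an illustrative arm length of 100 Mb, a 20 Mb region passes the filter (ratio 0.2) while a 50 Mb region does not (ratio 0.5).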
Too many candidate drivers can burden CONEXIC computationally and statistically(15). We encountered this problem with the full list of candidate drivers (3048 genes) overlapping all GISTIC regions. To reduce the number of candidate genes, we could have used a lower threshold for the FDR q-value in the GISTIC analysis. Instead, we kept the FDR q-value threshold of 0.25, which is typically used in GISTIC analyses, and considered only ‘focal’ aberrations, which were arbitrarily identified by their length relative to the chromosome arm. The criterion ‘less than a third of chromosome arm’ was applied exactly by calculating the ratio of the length of each aberrant region to that of the chromosome arm on which it was located.

Gene expression data were processed by removing probe sets whose standard deviation was smaller than 0.25 on a log2 scale, resulting in 28821 probe sets measuring 16009 unique genes. Inconsistent multiple probe sets were then removed, resulting in a final set of 10358 genes. Expression values were normalized to a mean of zero and a standard deviation of one for each gene.

During the ‘Single Modulator’ step, a conservative set of potential driver genes was obtained using a Welch t-test (p-value<0.05), comparing amplified versus normal or deleted versus normal. All potential driver genes were considered during the following ‘Network Learning’ step. The ‘Single Modulator’ step was run using permutation testing (p-value<0.001) and the ‘noUpDown’ parameter such that each module contained genes positively and negatively regulated with expression of the regulators. Potential regulators were excluded as members of the resulting clusters. The parameters for the scoring function were alpha=2 and lambda=1. Non-parametric bootstrapping was applied to the 80 samples and repeated 100 times. Candidate drivers were retained when they were selected in at least 90% of the runs.
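The bootstrap selection rule can be sketched as below. This only illustrates the resampling and the 90% threshold: `select_drivers` is a hypothetical stand-in for one run of the driver-selection step on a resampled cohort and does not correspond to any CONEXIC function, and the seed is an arbitrary choice for reproducibility.

```python
import random
from collections import Counter


def bootstrap_selection(samples, select_drivers, n_runs=100, min_fraction=0.9, seed=0):
    """Return drivers selected in at least `min_fraction` of bootstrap runs.

    `samples` is the list of sample identifiers; `select_drivers` is a
    HYPOTHETICAL callable that takes a resampled list of samples and returns
    the drivers selected on it.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_runs):
        # Non-parametric bootstrap: draw as many samples as the cohort, with replacement.
        resampled = [rng.choice(samples) for _ in samples]
        # Count each driver at most once per run.
        counts.update(set(select_drivers(resampled)))
    return sorted(gene for gene, n in counts.items() if n / n_runs >= min_fraction)
```

With 80 samples and the defaults, a driver must be returned in at least 90 of the 100 runs to be kept.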
The ‘Network Learning’ step was run using permutation testing (p-value<0.05) and the ‘LikelihoodCutoff’ parameter (value 2) to remove