Acquired Copy Number Alterations in Adult Acute Myeloid Leukemia Genomes

Matthew J. Walter (a,b,c,1,2), Jacqueline E. Payton (d,1), Rhonda E. Ries (a,b,1), William D. Shannon (a), Hrishikesh Deshmukh (d), Yu Zhao (a,b), Jack Baty (e), Sharon Heath (a,b), Peter Westervelt (a,b,c), Mark A. Watson (c,d), Michael H. Tomasson (a,b,c), Rakesh Nagarajan (c,d), Brian P. O'Gara (a,b), Clara D. Bloomfield (f,g), Krzysztof Mrózek (f,g), Rebecca R. Selzer (h), Todd A. Richmond (h), Jacob Kitzman (h), Joel Geoghegan (h), Peggy S. Eis (h), Rachel Maupin (i), Robert S. Fulton (i), Michael McLellan (i), Richard K. Wilson (i), Elaine R. Mardis (i), Daniel C. Link (a,b,c), Timothy A. Graubert (a,b,c), John F. DiPersio (a,b,c), and Timothy J. Ley (a,b,c)

(a) Department of Medicine, (b) Division of Oncology, (c) Siteman Cancer Center, (d) Department of Pathology and Immunology, and (e) Division of Biostatistics, Washington University School of Medicine, St. Louis, MO 63110; (f) Division of Hematology and Oncology, Department of Medicine, Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210; (g) Cancer and Leukemia Group B, Chicago, IL 60601; (h) Roche NimbleGen, Inc., Madison, WI 53719; and (i) The Genome Center, Washington University School of Medicine, St. Louis, MO 63110

Edited by Janet D. Rowley, University of Chicago Medical Center, Chicago, IL, and approved May 18, 2009 (received for review March 23, 2009)

Author contributions: M.J.W., J.E.P., R.E.R., and T.J.L. designed research; M.J.W., J.E.P., R.E.R., R.R.S., T.A.R., J.K., J.G., P.S.E., R.M., R.S.F., and M.M. performed research; R.R.S., T.A.R., J.K., J.G., P.S.E., R.M., R.S.F., and M.M. contributed new reagents/analytic tools; M.J.W., J.E.P., R.E.R., W.D.S., H.D., Y.Z., J.B., S.H., P.W., M.A.W., M.H.T., R.N., B.P.O., C.D.B., K.M., R.R.S., T.A.R., J.K., J.G., P.S.E., R.M., R.S.F., M.M., R.K.W., E.R.M., D.C.L., T.A.G., J.F.D., and T.J.L. analyzed data; and M.J.W., J.E.P., R.E.R., and T.J.L. wrote the paper.

Conflict of interest: R.R.S., T.A.R., J.G., and J.K. are employees of Roche NimbleGen, Inc., which supplied the arrays and hybridization services for the research.

Freely available online through the PNAS open access option.

1 M.J.W., J.E.P., and R.E.R. contributed equally to this work.

2 To whom correspondence should be addressed. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/cgi/content/full/0903091106/DCSupplemental.

12950–12955 | PNAS | August 4, 2009 | vol. 106 | no. 31 | www.pnas.org/cgi/doi/10.1073/pnas.0903091106

Cytogenetic analysis of acute myeloid leukemia (AML) cells has accelerated the identification of genes important for AML pathogenesis. To complement cytogenetic studies and to identify genes altered in AML genomes, we performed genome-wide copy number analysis with paired normal and tumor DNA obtained from 86 adult patients with de novo AML using 1.85 million feature SNP arrays. Acquired copy number alterations (CNAs) were confirmed using an ultra-dense array comparative genomic hybridization platform. A total of 201 somatic CNAs were found in the 86 AML genomes (mean, 2.34 CNAs per genome), with French-American-British system M6 and M7 genomes containing the most changes (10–29 CNAs per genome). Twenty-four percent of AML patients with normal cytogenetics had CNA, whereas 40% of patients with an abnormal karyotype had additional CNA detected by SNP array, and several CNA regions were recurrent. The mRNA expression levels of 57 genes were significantly altered in 27 of 50 recurrent CNA regions <5 megabases in size. A total of 8 uniparental disomy (UPD) segments were identified in the 86 genomes; 6 of 8 UPD calls occurred in samples with a normal karyotype. Collectively, 34 of 86 AML genomes (40%) contained alterations not found with cytogenetics, and 98% of these regions contained genes. Of 86 genomes, 43 (50%) had no CNA or UPD at this level of resolution. In this study of 86 adult AML genomes, the use of an unbiased high-resolution genomic screen identified many genes not previously implicated in AML that may be relevant for pathogenesis, along with many known oncogenes and tumor suppressor genes.

AML | array CGH | genomics | SNP array

Acute myeloid leukemia (AML) is a heterogeneous group of diseases currently classified by abnormalities in bone marrow morphology, karyotype, acquired gene mutations, and alterations in gene expression (1–3). Although the identification of specific gene mutations has resulted in improved treatments and outcomes for some AML patients (4), enormous clinical heterogeneity exists and may reflect the presence of as-yet undetected initiating and cooperating mutations. Therefore, the discovery of somatic mutations in the genomes of AML patients with normal and abnormal karyotypes will advance our understanding of the genetics underlying AML and should lead to more specific therapies and better patient classification schemes.

The discovery of previously uncharacterized genes mutated in acute lymphoblastic leukemia (ALL) was recently reported using SNP array technology for DNA copy number analysis (5). SNP array platforms can detect genomic amplifications, deletions, loss of heterozygosity (LOH), and regions of uniparental disomy (UPD) (copy-neutral LOH events) in cancer cells. Early studies using SNP arrays and array comparative genomic hybridization (CGH) platforms have suggested that both copy number alterations (CNAs) and UPD are common in AML genomes (6–12). However, these studies used low-resolution arrays, often used reference DNA that was not obtained from the same patient's normal cells, and did not routinely validate copy number changes with independent platforms. These limitations made it difficult to distinguish between acquired (somatic) CNA and inherited copy number variants (CNVs) that exist in all individuals; furthermore, secondary validation methods are required to distinguish between true events and false-positive findings, which are extremely common using the current platforms.

To overcome these limitations and to definitively identify genes that are somatically altered in AML genomes, we used the Affymetrix Genome-Wide Human SNP Array 6.0 platform (containing 1.85 million probes, median interprobe spacing 680 bp) to screen paired tumor and normal DNA samples obtained from 86 adult patients with de novo AML, and validated putative CNA using an independent, ultra-dense custom Roche NimbleGen CGH 12 × 135K array (median interprobe spacing 245 bp). We identified a mean of 2.34 CNAs per genome, and 76% of the CNAs involved a known cancer-related gene. We identified 50 recurrent CNAs <5 megabases (Mb) in size in the 86 genomes, and 32 of these 50 regions contained genes not previously implicated in AML. UPD was more common in normal karyotype samples. Fifty percent of the AML genomes tested in this study had no detectable CNAs or UPD, indicating that other approaches, including whole-genome sequencing, may be required to discover the remaining genetic changes that contribute to AML pathogenesis.

Results

Patient Characteristics. A total of 86 adult patients (aged >18 years) with de novo AML were chosen for study on the basis of the availability of high-quality, abundant, paired bone marrow (tumor) and skin (normal) DNA samples. Paired samples allowed us to distinguish acquired CNA from inherited CNV. Cases were classified in accordance with the French-American-British (FAB) system upon diagnosis and banking of their bone marrow specimens. The patients include FAB M0–M7, with a median blast count of 64% (range, 30–100%) [supporting information (SI) Table S1 and Table S2].

Acquired CNA. We identified 201 acquired CNAs in the 86 AML genomes using the SNP arrays (Fig. 1 and Table S1 and Table S2). The 201 CNAs occurred in 38 of 86 AML genomes, spanned from 35 kb (34 probes) to 250 Mb (146,524 probes) in size (median, 9.15 Mb), and involved every chromosome at least once. [...]

CNAs that were not independently assessed on the custom array CGH platform had a minimum size of 300 kb and involved at least 100 probes. All putative CNAs >200 kb in size that were detected on the SNP array were validated on the custom array CGH platform (see SI Results and Fig. S1 for a complete description).

Of the 201 CNAs, 198 (99%) contained known genes, and 154 of 201 loci (77%) contained at least 1 gene that had previously been associated with cancer- or AML/myelodysplastic syndromes (MDS) [...]

Fig. 1. Copy number and UPD heatmap for 86 AML genomes. The results of copy number and UPD (copy-neutral LOH) analysis of 86 paired tumor and normal DNA samples assayed on the Affymetrix Genome-Wide SNP 6.0 arrays are shown. For each of the 86 genomes, each genome is represented by 2 columns, copy number as the log2 ratio of tumor/normal DNA is shown on the left and UPD on the right. Copy number is designated by a color range from white (deletion) to red (amplification), with pink indicating a normal copy number. The presence of UPD is shown in blue and the normal non-UPD state in gray. The y axis represents the chromosome number, with chromosome 1 at the top and Y on the bottom. The x axis displays samples grouped by common cytogenetic abnormalities. The patient number labels correspond to the patient numbers in Table S1. See Table S2 for a complete listing of miscellaneous cytogenetics.
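
The log2 ratio of tumor/normal DNA used to display copy number in Fig. 1, and the reported mean of 2.34 CNAs per genome, can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function names and the gain/loss cutoffs of ±0.3 are hypothetical and are not taken from the paper, which does not state its calling thresholds in this section.

```python
import math

# Hypothetical cutoffs on the log2(tumor/normal) ratio. These values are
# illustrative assumptions; the paper does not give its thresholds here.
GAIN_THRESHOLD = 0.3
LOSS_THRESHOLD = -0.3


def log2_ratio(tumor_signal: float, normal_signal: float) -> float:
    """Return log2(tumor/normal) for one probe; both signals must be positive."""
    if tumor_signal <= 0 or normal_signal <= 0:
        raise ValueError("probe signals must be positive")
    return math.log2(tumor_signal / normal_signal)


def classify(ratio: float) -> str:
    """Label a log2 ratio as 'gain', 'loss', or 'normal' using the assumed cutoffs."""
    if ratio >= GAIN_THRESHOLD:
        return "gain"
    if ratio <= LOSS_THRESHOLD:
        return "loss"
    return "normal"


def mean_cnas_per_genome(total_cnas: int, genomes: int) -> float:
    """Average number of CNAs per genome."""
    return total_cnas / genomes


if __name__ == "__main__":
    # One tumor copy against two normal copies gives a log2 ratio of -1.
    print(classify(log2_ratio(1.0, 2.0)))
    # 201 CNAs across 86 genomes, as reported in the text.
    print(round(mean_cnas_per_genome(201, 86), 2))
```

With equal tumor and normal signal the ratio is 0, which corresponds to the normal copy number state in the heatmap; in real samples the observed ratio for a deletion or amplification would be closer to 0 than the idealized values because tumor specimens also contain normal cells.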