Predicting Methylation from Sequence and Gene Expression Using Deep Learning with Attention

Total Page:16

File Type:pdf, Size:1020Kb

Predicting Methylation from Sequence and Gene Expression Using Deep Learning with Attention bioRxiv preprint doi: https://doi.org/10.1101/491357; this version posted December 9, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. Predicting Methylation from Sequence and Gene Expression Using Deep Learning with Attention Alona Levy-Jurgenson12, Xavier Tekpli3, Vessela N. Kristensen3 and Zohar Yakhini12 Abstract— DNA methylation has been extensively linked DNA methylation plays a key role in disease development. to alterations in gene expression, playing a key role in the Specifically, hypermethylation can lead to stable silencing of manifestation of multiple diseases, most notably cancer. For tumor suppressor genes [21]. This process has therefore been this reason, researchers have long been measuring DNA methylation in living organisms. The relationship between extensively observed and studied in the context of cancer methylation and expression, and between methylation in [24], [15], [41], [29]. While various forms of cancer are different genomic regions is of great theoretical interest from a central to the discussion on DNA methylation, it has also molecular biology perspective. Therefore, several models have been linked to other diseases and biological processes such been suggested to support the prediction of methylation status as cardiovascular disease [13] and Alzheimer’s Disease [19] in samples. These models, however, have two main limitations: (a) they heavily rely on partially measured methylation levels as well as gene expression regulation in general [40] and as input, somewhat defeating the object as one is required epigenetic editing [35]. Hence, researchers have long been to collect measurements from the sample of interest before measuring DNA methylation levels. applying the model; and (b) they are largely based on human Currently, there are several methods available for mea- mediated feature engineering, thus preventing the model from suring DNA methylation [23]. Some of these methods, unveiling its own representations. To address these limitations we used deep learning, with an attention mechanism, to however, require specialized protocols or a relatively large produce a general model that predicts DNA methylation for a DNA sample size. Hence, depending on the required task, given sample in any CpG position based solely on the sample’s the costs could be significant and the data collection may gene expression profile and the sequence surrounding the be impractical. For this reason, the prediction of DNA CpG. methylation levels through other means could prove highly We show that our model is capable of generalizing to a completely separate test set of CpG positions and subjects. useful. In addition, and perhaps more importantly, the link Depending on gene-CpG proximity conditions, our model can between gene expression and methylation is still an open- attain a Spearman correlation of up to 0.8 and MAE of 0.14 ended question and predictive analyses may provide insight for thousands of CpG sites in the test data. We also identify into this relationship. In this work we set out to address both and analyze several motifs and genes that our model suggests aspects of methylation. may be linked to methylation activity, such as Nodal and Hand1. Moreover, our approach, and most notably the use of The contribution of this work includes: (1) We provide attention mechanisms, offers a novel framework with which a practical tool that enables potential users to input any to extract valuable insights from gene expression data when CpG position for which DNA methylation was not measured, combined with sequence information. along with the sample’s gene-expression profile, and obtain The code and trained models are available at: a prediction. We do so by training a machine learning model https://github.com/YakhiniGroup/Methylation that combines gene expression data with the ambient DNA sequence at the CpG of interest. We demonstrate that this I. INTRODUCTION model is capable of generalizing across CpG sites as well as across samples. (2) From a theoretical perspective, this DNA methylation is a chemical process that modifies result provides proof for a sharp, albeit not necessarily DNA in living organisms and can significantly affect gene causal, link between sequence and expression and between expression, mostly through the inhibition of transcription. In local methylation events. Furthermore, we observe better humans, DNA methylation refers to the presence of a methyl predictability for CpGs that reside closer to genes. We group at a defined position of a cytosine and occurs mostly also unveil motifs and genes that the model identified as in CpG dinucleotides. It has been particularly shown to affect significant contributors to the prediction. Specifically, we gene expression in gene promoter regions with relatively link HAND1 and NODAL to methylation activity in the dense CpGs, known as CpG islands (CGI). When a large cohort analyzed. (3) We provide a novel model design and number of proximal CpGs are methylated, the transcription framework that support the combination of gene-expression of nearby downstream genes may be inhibited. This process data with genomic sequences to extract valuable molecular- is, for example, prominent in the silent X-chromosome in level insights. females [37]. A. Related Work 1 Technion - Israel Institute of Technology, Haifa, Israel Over the past decade, researchers have been investigating 2School of Computer Science, Interdisciplinary Center, Herzeliya, Israel 3Oslo University Hospital, The Norwegian Radium Hospital, Oslo, Nor- the use of machine learning for the prediction of methy- way lation. In [4] and [10] the authors used classifiers such bioRxiv preprint doi: https://doi.org/10.1101/491357; this version posted December 9, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. as Support Vector Machines (SVMs) and decision trees to III. METHODS determine the status of a given CpG using both structural A. Datasets and sequence-specific features. Similarly, [44] suggested a random forest classifier that uses features such as genomic We used data from two cancer cohorts: (1) 782 breast position and neighbor methylation levels. The latter were cancer patients and (2) 498 prostate cancer patients. For each noted as significant contributors but clearly require collecting patient, we obtained two types of data: (a) gene expression partial methylation data. Others [26], have used a regression data in RSEM normalized count for 17,997 genes (RNA- approach to predict continuous methylation levels across seq) and (b) methylation levels at 360,531 CpG sites (450K tissues, also using SVMs. While the use of regression is Illumina array). indeed more appropriate in the context of continous methy- In addition to the patient-specific data, we also use data lation measurements, this approach requires extensive data specific to any CpG locus: (a) the ambient sequence - 399 collection from a source tissue. More recently, [42] used a base-pairs upstream of the CpG and 399 downstream, for a deep learning model to predict whether a CpG was hypo- total of 800 base-pairs and (b) the genomic distance between or hyper-methylated by using DNA patterns and topological each gene in the profile considered and the locus of interest. features. The latter are human engineered features taken as B. Constructing the Model input by the network model. Like previous methods, this model is limited to binary classification, and is specifically Our task is to predict the methylation level at a CpG site constrained to hypo-/hyper- methylation. in a sample taken from a given subject, using the samples’s To conclude, the main limitations posed by previous gene expression profile and the ambient sequence at the CpG models include: (1) The need to measure methylation in some site. To do so, we created a multi-modal neural network (or all, in the case of learning between tissues) CpG sites. (2) comprised of four sub-networks: one CNN, which acts as Extensive use of human-engineered features. This not only a motif detector for the surrounding sequence, and three incorporates human biases, but also prevents the model from attention components which act as gene amplifiers, each unveiling novel representations. (3) The majority are binary based on the input provided. These sub-networks are then classifiers when in reality methylation levels are measured combined into a single fully-connected network to produce continuously, representing fractions of cells with any given the final prediction. status. Input Data II. APPROACH We will define a single training example to represent one To address the limitations posed by the aforementioned subject (or sample) and one CpG. It contains the following methods, we suggest a general deep learning model that does components: not require measuring methylation levels in the sample of 1) The subject’s gene expression vector e, where each interest, is not limited to specific CpGs, uses neural networks entry, ei, represents the expression level of a gene gi. as feature extractors instead of human-engineered features 2) The sequence surrounding the CpG of interest, repre- and provides continuous predictions. Specifically, we predict sented as a one-hot matrix S. methylation levels at a given CpG in a given sample based 3) A vector d, where di is computed based on the distance, on three factors: in base-pairs, between gi and the CpG of interest. Specifically, a gene residing within the first 2,000 base- • The sequence surrounding the CpG. pairs received a value of 1, the next 2,000 a value of • The sample’s gene expression profile. 0.5 and so on until the last bucket of 2,000 was given • The distance between the profiled genes and the CpG. 9 a value of 0:5 . Beyond this point di was set to 0. For We use a generalized approach that can be applied to any set genes residing on a different chromosome this value was of CpGs.
Recommended publications
  • Table S1. List of Proteins in the BAHD1 Interactome
    Table S1. List of proteins in the BAHD1 interactome BAHD1 nuclear partners found in this work yeast two-hybrid screen Name Description Function Reference (a) Chromatin adapters HP1α (CBX5) chromobox homolog 5 (HP1 alpha) Binds histone H3 methylated on lysine 9 and chromatin-associated proteins (20-23) HP1β (CBX1) chromobox homolog 1 (HP1 beta) Binds histone H3 methylated on lysine 9 and chromatin-associated proteins HP1γ (CBX3) chromobox homolog 3 (HP1 gamma) Binds histone H3 methylated on lysine 9 and chromatin-associated proteins MBD1 methyl-CpG binding domain protein 1 Binds methylated CpG dinucleotide and chromatin-associated proteins (22, 24-26) Chromatin modification enzymes CHD1 chromodomain helicase DNA binding protein 1 ATP-dependent chromatin remodeling activity (27-28) HDAC5 histone deacetylase 5 Histone deacetylase activity (23,29,30) SETDB1 (ESET;KMT1E) SET domain, bifurcated 1 Histone-lysine N-methyltransferase activity (31-34) Transcription factors GTF3C2 general transcription factor IIIC, polypeptide 2, beta 110kDa Required for RNA polymerase III-mediated transcription HEYL (Hey3) hairy/enhancer-of-split related with YRPW motif-like DNA-binding transcription factor with basic helix-loop-helix domain (35) KLF10 (TIEG1) Kruppel-like factor 10 DNA-binding transcription factor with C2H2 zinc finger domain (36) NR2F1 (COUP-TFI) nuclear receptor subfamily 2, group F, member 1 DNA-binding transcription factor with C4 type zinc finger domain (ligand-regulated) (36) PEG3 paternally expressed 3 DNA-binding transcription factor with
    [Show full text]
  • NICU Gene List Generator.Xlsx
    Neonatal Crisis Sequencing Panel Gene List Genes: A2ML1 - B3GLCT A2ML1 ADAMTS9 ALG1 ARHGEF15 AAAS ADAMTSL2 ALG11 ARHGEF9 AARS1 ADAR ALG12 ARID1A AARS2 ADARB1 ALG13 ARID1B ABAT ADCY6 ALG14 ARID2 ABCA12 ADD3 ALG2 ARL13B ABCA3 ADGRG1 ALG3 ARL6 ABCA4 ADGRV1 ALG6 ARMC9 ABCB11 ADK ALG8 ARPC1B ABCB4 ADNP ALG9 ARSA ABCC6 ADPRS ALK ARSL ABCC8 ADSL ALMS1 ARX ABCC9 AEBP1 ALOX12B ASAH1 ABCD1 AFF3 ALOXE3 ASCC1 ABCD3 AFF4 ALPK3 ASH1L ABCD4 AFG3L2 ALPL ASL ABHD5 AGA ALS2 ASNS ACAD8 AGK ALX3 ASPA ACAD9 AGL ALX4 ASPM ACADM AGPS AMELX ASS1 ACADS AGRN AMER1 ASXL1 ACADSB AGT AMH ASXL3 ACADVL AGTPBP1 AMHR2 ATAD1 ACAN AGTR1 AMN ATL1 ACAT1 AGXT AMPD2 ATM ACE AHCY AMT ATP1A1 ACO2 AHDC1 ANK1 ATP1A2 ACOX1 AHI1 ANK2 ATP1A3 ACP5 AIFM1 ANKH ATP2A1 ACSF3 AIMP1 ANKLE2 ATP5F1A ACTA1 AIMP2 ANKRD11 ATP5F1D ACTA2 AIRE ANKRD26 ATP5F1E ACTB AKAP9 ANTXR2 ATP6V0A2 ACTC1 AKR1D1 AP1S2 ATP6V1B1 ACTG1 AKT2 AP2S1 ATP7A ACTG2 AKT3 AP3B1 ATP8A2 ACTL6B ALAS2 AP3B2 ATP8B1 ACTN1 ALB AP4B1 ATPAF2 ACTN2 ALDH18A1 AP4M1 ATR ACTN4 ALDH1A3 AP4S1 ATRX ACVR1 ALDH3A2 APC AUH ACVRL1 ALDH4A1 APTX AVPR2 ACY1 ALDH5A1 AR B3GALNT2 ADA ALDH6A1 ARFGEF2 B3GALT6 ADAMTS13 ALDH7A1 ARG1 B3GAT3 ADAMTS2 ALDOB ARHGAP31 B3GLCT Updated: 03/15/2021; v.3.6 1 Neonatal Crisis Sequencing Panel Gene List Genes: B4GALT1 - COL11A2 B4GALT1 C1QBP CD3G CHKB B4GALT7 C3 CD40LG CHMP1A B4GAT1 CA2 CD59 CHRNA1 B9D1 CA5A CD70 CHRNB1 B9D2 CACNA1A CD96 CHRND BAAT CACNA1C CDAN1 CHRNE BBIP1 CACNA1D CDC42 CHRNG BBS1 CACNA1E CDH1 CHST14 BBS10 CACNA1F CDH2 CHST3 BBS12 CACNA1G CDK10 CHUK BBS2 CACNA2D2 CDK13 CILK1 BBS4 CACNB2 CDK5RAP2
    [Show full text]
  • Co-Occupancy by Multiple Cardiac Transcription Factors Identifies
    Co-occupancy by multiple cardiac transcription factors identifies transcriptional enhancers active in heart Aibin Hea,b,1, Sek Won Konga,b,c,1, Qing Maa,b, and William T. Pua,b,2 aDepartment of Cardiology and cChildren’s Hospital Informatics Program, Children’s Hospital Boston, Boston, MA 02115; and bHarvard Stem Cell Institute, Harvard University, Cambridge, MA 02138 Edited by Eric N. Olson, University of Texas Southwestern, Dallas, TX, and approved February 23, 2011 (received for review November 12, 2010) Identification of genomic regions that control tissue-specific gene study of a handful of model genes (e.g., refs. 7–10), it has not been expression is currently problematic. ChIP and high-throughput se- evaluated using unbiased, genome-wide approaches. quencing (ChIP-seq) of enhancer-associated proteins such as p300 In this study, we used a modified ChIP-seq approach to define identifies some but not all enhancers active in a tissue. Here we genome wide the binding sites of these cardiac TFs (1). We show that co-occupancy of a chromatin region by multiple tran- provide unbiased support for collaborative TF interactions in scription factors (TFs) identifies a distinct set of enhancers. GATA- driving cardiac gene expression and use this principle to show that chromatin co-occupancy by multiple TFs identifies enhancers binding protein 4 (GATA4), NK2 transcription factor-related, lo- with cardiac activity in vivo. The majority of these multiple TF- cus 5 (NKX2-5), T-box 5 (TBX5), serum response factor (SRF), and “ binding loci (MTL) enhancers were distinct from p300-bound myocyte-enhancer factor 2A (MEF2A), here referred to as cardiac enhancers in location and functional properties.
    [Show full text]
  • Supplementary Materials
    Supplementary Materials + - NUMB E2F2 PCBP2 CDKN1B MTOR AKT3 HOXA9 HNRNPA1 HNRNPA2B1 HNRNPA2B1 HNRNPK HNRNPA3 PCBP2 AICDA FLT3 SLAMF1 BIC CD34 TAL1 SPI1 GATA1 CD48 PIK3CG RUNX1 PIK3CD SLAMF1 CDKN2B CDKN2A CD34 RUNX1 E2F3 KMT2A RUNX1 T MIXL1 +++ +++ ++++ ++++ +++ 0 0 0 0 hematopoietic potential H1 H1 PB7 PB6 PB6 PB6.1 PB6.1 PB12.1 PB12.1 Figure S1. Unsupervised hierarchical clustering of hPSC-derived EBs according to the mRNA expression of hematopoietic lineage genes (microarray analysis). Hematopoietic-competent cells (H1, PB6.1, PB7) were separated from hematopoietic-deficient ones (PB6, PB12.1). In this experiment, all hPSCs were tested in duplicate, except PB7. Genes under-expressed or over-expressed in blood-deficient hPSCs are indicated in blue and red respectively (related to Table S1). 1 C) Mesoderm B) Endoderm + - KDR HAND1 GATA6 MEF2C DKK1 MSX1 GATA4 WNT3A GATA4 COL2A1 HNF1B ZFPM2 A) Ectoderm GATA4 GATA4 GSC GATA4 T ISL1 NCAM1 FOXH1 NCAM1 MESP1 CER1 WNT3A MIXL1 GATA4 PAX6 CDX2 T PAX6 SOX17 HBB NES GATA6 WT1 SOX1 FN1 ACTC1 ZIC1 FOXA2 MYF5 ZIC1 CXCR4 TBX5 PAX6 NCAM1 TBX20 PAX6 KRT18 DDX4 TUBB3 EPCAM TBX5 SOX2 KRT18 NKX2-5 NES AFP COL1A1 +++ +++ 0 0 0 0 ++++ +++ ++++ +++ +++ ++++ +++ ++++ 0 0 0 0 +++ +++ ++++ +++ ++++ 0 0 0 0 hematopoietic potential H1 H1 H1 H1 H1 H1 PB6 PB6 PB7 PB7 PB6 PB6 PB7 PB6 PB6 PB6.1 PB6.1 PB6.1 PB6.1 PB6.1 PB6.1 PB12.1 PB12.1 PB12.1 PB12.1 PB12.1 PB12.1 Figure S2. Unsupervised hierarchical clustering of hPSC-derived EBs according to the mRNA expression of germ layer differentiation genes (microarray analysis) Selected ectoderm (A), endoderm (B) and mesoderm (C) related genes differentially expressed between hematopoietic-competent (H1, PB6.1, PB7) and -deficient cells (PB6, PB12.1) are shown (related to Table S1).
    [Show full text]
  • Transcription Factor Gene Expression Profiling and Analysis of SOX Gene Family Transcription Factors in Human Limbal Epithelial
    Transcription factor gene expression profiling and analysis of SOX gene family transcription factors in human limbal epithelial progenitor cells Der Naturwissenschaftlichen Fakultät der Friedrich-Alexander-Universität Erlangen-Nürnberg zur Erlangung des Doktorgrades Dr. rer. nat. vorgelegt von Dr. med. Johannes Menzel-Severing aus Bonn Als Dissertation genehmigt von der Naturwissenschaftlichen Fakultät der Friedrich-Alexander-Universität Erlangen-Nürnberg Tag der mündlichen Prüfung: 7. Februar 2018 Vorsitzender des Promotionsorgans: Prof. Dr. Georg Kreimer Gutachter: Prof. Dr. Andreas Feigenspan Prof. Dr. Ursula Schlötzer-Schrehardt 1 INDEX 1. ABSTRACTS Page 1.1. Abstract in English 4 1.2. Zusammenfassung auf Deutsch 7 2. INTRODUCTION 2.1. Anatomy and histology of the cornea and the corneal surface 11 2.2. Homeostasis of corneal epithelium and the limbal stem cell paradigm 13 2.3. The limbal stem cell niche 15 2.4. Cell therapeutic strategies in ocular surface disease 17 2.5. Alternative cell sources for transplantation to the corneal surface 18 2.6. Transcription factors in cell differentiation and reprogramming 21 2.7. Transcription factors in limbal epithelial cells 22 2.8. Research question 25 3. MATERIALS AND METHODS 3.1. Human donor corneas 27 3.2. Laser Capture Microdissection (LCM) 28 3.3. RNA amplification and RT2 profiler PCR arrays 29 3.4. Real-time PCR analysis 33 3.5. Immunohistochemistry 34 3.6. Limbal epithelial cell culture 38 3.7. Transcription-factor knockdown/overexpression in vitro 39 3.8. Proliferation assay 40 3.9. Western blot 40 3.10. Statistical analysis 41 2 4. RESULTS 4.1. Quality control of LCM-isolated and amplified RNA 42 4.2.
    [Show full text]
  • Single Cell RNA-Sequencing Reveals the Existence of Naïve and Primed Pluripotency in Pre-Implantation Rhesus Monkey Embryos
    Downloaded from genome.cshlp.org on October 9, 2021 - Published by Cold Spring Harbor Laboratory Press Single cell RNA-sequencing reveals the existence of naïve and primed pluripotency in pre-implantation rhesus monkey embryos Running title: Pluripotency dynamics in monkey early embryos Denghui Liu#2,5, Xinyi Wang#1, 3, Dajian He#1,3,5, Chunli Sun#1,3,5, Xiechao He4, Lanzhen Yan 4, Yizhou Li2 , Jing-Dong J. Han*2, Ping Zheng*1,3,4,6 1 State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China 2 Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China 3 Yunnan Key Laboratory of Animal Reproduction, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan 650223, China 4 Primate Research Center, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, 650223, China 5 University of Chinese Academy of Sciences, Beijing, China 6 Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China 1 Downloaded from genome.cshlp.org on October 9, 2021 - Published by Cold Spring Harbor Laboratory Press # Denghui Liu, Xinyi Wang, Dajian He and Chunli Sun contributed equally to this work * Correspondence: Jing-Dong J. Han, Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai, 200031, China.
    [Show full text]
  • DNMT1 and DNMT3B Modulate Distinct Polycomb-Mediated Histone Modifications in Colon Cancer
    Published OnlineFirst September 1, 2009; DOI: 10.1158/0008-5472.CAN-09-0116 Molecular Biology, Pathobiology, and Genetics DNMT1 and DNMT3B Modulate Distinct Polycomb-Mediated Histone Modifications in Colon Cancer Bilian Jin,1 Bing Yao,2 Jian-Liang Li,4 C. Robert Fields,2 Amber L. Delmas,2 Chen Liu,3 and Keith D. Robertson1 1Department of Biochemistry and Molecular Biology and Cancer Research Center, Medical College of Georgia, Augusta, Georgia and Departments of 2Biochemistry and Molecular Biology and 3Pathology and 4Interdisciplinary Center for Biotechnology Research, University of Florida, Gainesville, Florida Abstract Aberrant DNA methylation patterns, including CpG island hyper- DNA methylation patterns are established and maintained by methylation and repetitive region hypomethylation, are hallmarks of three DNA methyltransferases (DNMT): DNMT1, DNMT3A, and transformed cells (8). Misregulation of DNMT expression clearly DNMT3B. Although essential for development, methylation contributes to tumorigenesis as shown recently by the work of Linhart and colleagues, which showed that elevated Dnmt3b1 patterns are frequently disrupted in cancer and contribute Min/+ directly to carcinogenesis. Recent studies linking polycomb expression in the Apc model enhanced colorectal carcinogen- group repression complexes (PRC1 and PRC2) to the DNMTs esis and caused tumor suppressor gene methylation (9). DNA have begun to shed light on how methylation is targeted. We methylation, however, does not act alone in repressing gene activity. identified previously a panel of genes regulated by DNMT3B. A complex and intertwined set of post-translational modifications of Here, we compare these with known polycomb group targets to the core histone tails dynamically imparts both transcriptionally show that f47% of DNMT3B regulated genes are also bound by repressive and activating signals.
    [Show full text]
  • The Transcription Factor HAND1 Is Involved in Cortical Bone Mass Through the Regulation of Collagen Expression
    International Journal of Molecular Sciences Article The Transcription Factor HAND1 Is Involved in Cortical Bone Mass through the Regulation of Collagen Expression Noriko Funato 1,2,* , Yuki Taga 3, Lindsay E. Laurie 1, Chisa Tometsuka 3, Masashi Kusubata 3 and Kiyoko Ogawa-Goto 3 1 Department of Signal Gene Regulation, Tokyo Medical and Dental University (TMDU), 1-5-45 Yushima, Bunkyo-ku, Tokyo 113-8510, Japan; [email protected] 2 Research Core, Tokyo Medical and Dental University (TMDU), 1-5-45 Yushima, Bunkyo-ku, Tokyo 113-8510, Japan 3 Nippi Research Institute of Biomatrix, 520-11 Kuwabara, Toride, Ibaraki 302-0017, Japan; [email protected] (Y.T.); [email protected] (C.T.); [email protected] (M.K.); [email protected] (K.O.-G.) * Correspondence: [email protected]; Tel.: +81-3-5803-5796 Received: 20 October 2020; Accepted: 14 November 2020; Published: 16 November 2020 Abstract: Temporal and/or spatial alteration of collagen family gene expression results in bone defects. However, how collagen expression controls bone size remains largely unknown. The basic helix-loop-helix transcription factor HAND1 is expressed in developing long bones and is involved in their morphogenesis. To understand the functional role of HAND1 and collagen in the postnatal development of long bones, we overexpressed Hand1 in the osteochondroprogenitors of model mice and found that the bone volumes of cortical bones decreased in Hand1Tg/+;Twist2-Cre mice. Continuous Hand1 expression downregulated the gene expression of type I, V, and XI collagen in the diaphyses of long bones and was associated with decreased expression of Runx2 and Sp7/Osterix, encoding transcription factors involved in the transactivation of fibril-forming collagen genes.
    [Show full text]
  • NKX2-5: an Update on This Hypermutable Homeodomain Protein and Its Role in Human Congenital Heart Disease (CHD) Stella Marie Reamon-Buettner, Juergen T Borlak
    NKX2-5: An Update on this Hypermutable Homeodomain Protein and its Role in Human Congenital Heart Disease (CHD) Stella Marie Reamon-Buettner, Juergen T Borlak To cite this version: Stella Marie Reamon-Buettner, Juergen T Borlak. NKX2-5: An Update on this Hypermutable Home- odomain Protein and its Role in Human Congenital Heart Disease (CHD). Human Mutation, Wiley, 2010, 31 (11), pp.1185. 10.1002/humu.21345. hal-00585168 HAL Id: hal-00585168 https://hal.archives-ouvertes.fr/hal-00585168 Submitted on 12 Apr 2011 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Human Mutation NKX2-5: An Update on this Hypermutable Homeodomain Protein and its Role in Human Congenital Heart Disease (CHD) For Peer Review Journal: Human Mutation Manuscript ID: humu-2010-0256.R1 Wiley - Manuscript type: Review Date Submitted by the 15-Jul-2010 Author: Complete List of Authors: Reamon-Buettner, Stella Marie; Fraunhofer Institute of Toxicology and Experimental Medicine, Molecular Medicine and Medical Biotechnology Borlak, Juergen; Fraunhofer Institute of Toxicology and Experimental Medicine, Molecular Medicine and Medical Biotechnology heart development, congenital heart disease, cardiac Key Words: malformations, transcription factors, NKX2-5, mutations John Wiley & Sons, Inc.
    [Show full text]
  • Mis-Expression of a Cranial Neural Crest Cell-Specific Gene Program in Cardiac Neural Crest Cells Modulates HAND Factor Expressi
    Journal of Cardiovascular Development and Disease Article Mis-Expression of a Cranial Neural Crest Cell-Specific Gene Program in Cardiac Neural Crest Cells Modulates HAND Factor Expression, Causing Cardiac Outflow Tract Phenotypes Joshua W. Vincentz 1,*, David E. Clouthier 2 and Anthony B. Firulli 1,* 1 Herman B Wells Center for Pediatric Research, Departments of Pediatrics, Anatomy and Medical and Molecular Genetics, Indiana Medical School, Indianapolis, IN 46202, USA 2 Department of Craniofacial Biology, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; [email protected] * Correspondence: [email protected] (J.W.V.); tfi[email protected] (A.B.F.) Received: 30 March 2020; Accepted: 14 April 2020; Published: 20 April 2020 Abstract: Congenital heart defects (CHDs) occur with such a frequency that they constitute a significant cause of morbidity and mortality in both children and adults. A significant portion of CHDs can be attributed to aberrant development of the cardiac outflow tract (OFT), and of one of its cellular progenitors known as the cardiac neural crest cells (NCCs). The gene regulatory networks that identify cardiac NCCs as a distinct NCC population are not completely understood. Heart and neural crest derivatives (HAND) bHLH transcription factors play essential roles in NCC morphogenesis. The Hand1PA/OFT enhancer is dependent upon bone morphogenic protein (BMP) signaling in both cranial and cardiac NCCs. The Hand1PA/OFT enhancer is directly repressed by the endothelin-induced transcription factors DLX5 and DLX6 in cranial but not cardiac NCCs. This transcriptional distinction offers the unique opportunity to interrogate NCC specification, and to understand why, despite similarities, cranial NCC fate determination is so diverse.
    [Show full text]
  • Single-Cell Transcriptome Analysis During Cardiogenesis Reveals Basis for Organ Level
    bioRxiv preprint doi: https://doi.org/10.1101/365734; this version posted July 9, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. Single-cell transcriptome analysis during cardiogenesis reveals basis for organ level developmental anomalies T. Yvanka de Soysa1,2, Sanjeev S. Ranade1,2, Satoshi Okawa3, Srikanth Ravichandran3, Yu Huang1,2, Hazel T. Salunga1,2, Amelia Schricker1,2, Antonio Del Sol3, Casey A. Gifford1,2*, Deepak Srivastava1,2,4,5* Affiliations: 1Gladstone Institute of Cardiovascular Disease, San Francisco, CA 94158, USA 2Roddenberry Center for Stem Cell Biology and Medicine at Gladstone, San Francisco, CA 94158, USA 3Computational Biology Group, Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg Departments of 4Pediatrics, and 5Biochemistry and Biophysics, University of California, San Francisco, San Francisco, CA 94143, USA *Corresponding author: [email protected], [email protected] 1 bioRxiv preprint doi: https://doi.org/10.1101/365734; this version posted July 9, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. Organogenesis involves integration of myriad cell types with reciprocal interactions, each progressing through successive stages of lineage specification and differentiation. Establishment of unique gene networks within each cell dictates fate determination, and mutations of transcription factors that drive such networks can result in birth defects.
    [Show full text]
  • Expression of Genes Involved in Early Cell Fate Decisions in Human Embryos and Their Regulation by Growth Factors
    REPRODUCTIONRESEARCH Expression of genes involved in early cell fate decisions in human embryos and their regulation by growth factors S J Kimber1, S F Sneddon2,3, D J Bloor1, A M El-Bareg1, J A Hawkhead4, A D Metcalfe1, F D Houghton5, H J Leese4, A Rutherford6, B A Lieberman2,3 and D R Brison1,2,3 1Faculty of Life Sciences and 2Faculty of Medical and Human Sciences, University of Manchester, Manchester M13 9PT, UK, 3Department of Reproductive Medicine, St Mary’s Hospital, Manchester M13 OJH, UK, 4Department of Biology, University of York, York YO10 5YW, UK, 5Division of Human Genetics, University of Southampton, Southampton SO16 6YD, UK and 6Assisted Conception Unit, Leeds General Infirmary, Leeds LS1 3EX, UK Correspondence should be addressed to S J Kimber who is now at University of Manchester, Core Technology Facility, 46 Grafton St, Manchester M13 9NT, UK; Email: [email protected] Abstract Little is understood about the regulation of gene expression in human preimplantation embryos. We set out to examine the expression in human preimplantation embryos of a number of genes known to be critical for early development of the murine embryo. The expression profile of these genes was analysed throughout preimplantation development and in response to growth factor (GF) stimulation. Developmental expression of a number of genes was similar to that seen in murine embryos (OCT3B/4, CDX2, NANOG). However, GATA6 is expressed throughout preimplantation development in the human. Embryos were cultured in IGF-I, leukaemia inhibitory factor (LIF) or heparin-binding EGF-like growth factor (HBEGF), all of which are known to stimulate the development of human embryos.
    [Show full text]