bioRxiv preprint doi: https://doi.org/10.1101/674093; this version posted June 19, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. A Genome Model Linking
Birth Defects to Infections Bernard Friedenson Deparment of Biochemistry and Molecular Genetics College of Medicine University of Illinois Chicago Chicago, Illinois 60612
Abstract
The purpose of this study was to test the hypothesis that infections are linked to chromosomal anomalies that cause neurodevelopmental disorders. In chil- dren with disorders in the development of their nervous systems, chromosome anomalies known to cause these disorders were compared to microbial DNA, including known teratogens. Genes essential for neurons, lymphatic drainage, immunity, circulation, angiogenesis, cell barriers, structure, epigenetic and chro- matin modifications were all found close together in polyfunctional clusters that were deleted or rearranged in neurodevelopmental disorders. In some patients, epigenetic driver mutations also changed access to large chromosome segments. These changes account for immune, circulatory, and structural deficits that ac- company neurologic deficits. Specific and repetitive human DNA encompassing large deletions matched infections and passed rigorous artifact tests. Deletions of up to millions of bases accompanied infection-matching sequences and caused massive changes in the homologies to foreign DNAs. In data from three inde- pendent studies of private, familial and recurrent chromosomal rearrangements, massive changes in homologous microbiomes were found and may drive rear- rangements and encourage pathogens. At least one chromosomal anomaly was found to consist of human DNA fragments with a gap that corresponded to a piece of integrated foreign DNA. Microbial DNAs that match repetitive or spe- cific human DNA segments are thus proposed to interfere with the epigenome and highly active recombination during meiosis, driven by massive changes in the homologous microbiome. Abnormal recombination in gametes produces zygotes containing rare chromosome anomalies which cause neurologic disorders and non-neurologic signs. Neurodevelopmental disorders may be examples of assault on the human genome by foreign DNA at a critical stage. Some infections may be more likely tolerated because they resemble human DNA segments. Further tests of this model await new technology.
Keywords: 1; genome 2; epigenetics 3; neurodevelopmental disorders; 4; chromosome anoma- lies; 5; retrotransposon; 6; chromosome rearrangement; 7; neurologic disease; 8; birth defects; 9; development 10; infection bioRxiv preprint doi: https://doi.org/10.1101/674093; this version posted June 19, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
Some infection DNA can integrate into human DNA (e.g. retroviruses) and many microbial DNAs can interfere with meiosis because of homologies to sections of human DNA (green stripes below). These events can drive deletion of gene clusters essential for coordinated neurodevelopment and can inactivate crucial epigenetic factors.
Viral,Microbial bacterial homologies Nervous system Immunity homologies Red blood cells Structure Immunity Immunity Structure Nervous system Circulation Structure Immunity to human DNA Structure Nervous system Cell adhesion to human DNA Nervous system Nervous system Immunity Structure Angiogenesis Nervous system Nervous system Angiogenesis Nervous system (green stripes) Immunity Nervous system Immunity (green stripes) Gene cluster Nervous system Infections deleted in Missing meiosis Gene Cluster, Abnornal Microbiome,Epigenetics, EpigeneticsMaternal factors Massive shiftschanges in homologous in homologous microbiome microbiome Truncated/deleted epigenetic driversdriver Intellectual disability Disrupted chromatin interactions Further infection Impaired circulation Craniofacial malformation
Introduction tion of the neurologic system. Based on DNA se- An approach to preventing neurodevelopmen- quence analyses, some chromosome rearrangements tal disorders is to gain better understanding of have been identified as causing individual congenital how neurodevelopment is coordinated and then to disorders because they disrupt genes essential for identify interference from environmental, genom- normal development 1-3. There is poor understanding ic and epigenomic factors. The development of the and no effective treatment for many of these over- nervous system requires tight regulation and coordi- whelming abnormalities. Signs and symptoms include nation of multiple functions essential to protect and autism, microcephaly, macrocephaly, behavioral prob- nourish neurons. As the nervous system develops, lems, intellectual disability, tantrums, seizures, respi- the immune system, the circulatory system, cranial ratory problems, spasticity, heart problems, hearing and skeletal systems must all undergo synchronized loss, and hallucinations 1. Because the abnormalities and coordinated development. Neurodevelopmental do not correlate well with the outcome, genetic disorders follow the disruption of this coordination. counseling is difficult and uncertain3 .
A significant advance in genome sequence level In congenital neurologic disease, inheritance is resolution of balanced cytogenetic abnormalities usually autosomal dominant and the same chromo- greatly improves the ability to document changes in somal abnormalities occur in every cell. The genetic regulation and dosage for genes critical to the func- events that lead to most neurodevelopmental dis- bioRxiv preprint doi: https://doi.org/10.1101/674093; this version posted June 19, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. page 3
4 orders are not understood but several maternal contrast to oocytes, meiotic recombination in sperm infections and other lifestyle factors are known to cells occurs continuously after puberty. interfere. The exact DNA sequences of known pathogenic DNA homology between microbes and hu- rearrangements in individual, familial and recurrent mans is a known fact and DNA swapping between congenital disorders 1-3 make it possible to test for vertebrates and invertebrates has been reported. association with foreign DNA. Even rare develop- An early draft of the human genome found human mental disorders can be screened for homology to genomes have undergone lateral gene transfer to infections within altered epigenomes and chromatin 5 incorporate microorganism genes . Lateral transfer structures. This screening may assist counseling, diag- from bacteria may have generated many candidate nosis, prevention, and early intervention. human genes 6. Genome-wide analyses in animals The results showed that DNA abnormalities in found up to hundreds of active genes generated by some neurodevelopmental disorders closely match horizontal gene transfer. Fruit flies and nematodes DNA in multiple infections that extend over long have continually acquired foreign genes as they linear stretches of human DNA and often resemble evolved. Although these transfers are thought to be repetitive human DNA sequences. Massive changes rarer in primates and humans, at least 33 previously in the identity and distribution of sets of homolo- unreported examples of horizontally acquired genes gous microorganisms that accompany chromosome were found 6. These findings argue that horizontal abnormalities may drive and stabilize them. Removing gene transfer continues to occur to a larger extent competition by changing the normal microbiome may than previously thought. Transferred genes that encourage pathogens. survive have been largely concerned with metabo- lism and make important contributions to increasing The affected sequences are shown to exist as biochemical diversity7. linear clusters of genes closely spaced in two dimen- sions. Interference from infection can also delete or The present work implicates foreign DNA damage human gene clusters and change epigenetic largely from infections as a cause of the chromo- functions that coordinate neurodevelopment. This some anomalies that cause birth defects. Infections microbial interference accounts for immune, circula- replicate within the human CNS by taking advantage tory, and structural deficits that accompany neuro- of immune deficiencies such as those traced back logic deficits. to deficient microRNA production8 or other gene losses. Disseminated infections can then interfere Congenital neurodevelopmental disorders are with the highly active DNA break repair process thus viewed as resulting from an assault on human required during meiosis. The generation of gametes DNA by foreign DNA and perhaps selection of by meiosis is the most active period of recombi- infecting microorganisms based on their similarity to nation which occurs at positions enriched in epi- host DNA. It is important to remember that effects genetic marks on chromatin. Hundreds of double in two dimensions can sometimes alter 3D topology strand breaks accompany meiotic recombination9. as well 10. Gametes with errors in how this recombination oc- Testing and verifying predictions from a viable crs cause chromosome anomalies in the zygote. In bioRxiv preprint doi: https://doi.org/10.1101/674093; this version posted June 19, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. model may spur the development of methods for genomes against human genomes and by extending identifying contributions from infections in intracta- analyses to 20000 matches. ble rare disorders that are not now available. Con- Various literature analyses have placed Alu repeats vergent arguments from testing predictions based on into 8 subfamilies having consensus sequences (Gen- any proposed model might lessen effects of limita- Bank; accession numbers U14567 - U14574). Micro- tions in currently available technology. bial sequences were independently compared to all 8 consensus Alu sequences and to 442 individual AluY sequences. Materials and Methods Chromosome localizations. The positions of Data Sources. DNA sequences from acquired microbial homologies in human chromosomes were congenital disorders were from published whole determined using BLAT or BLAST. Comparisons were genome sequences at chromosome breakpoints and also made to cDNAs based on 107,186 Reference rearrangement sites 1-3. Comparison to multiple Sequence (RefSeq) RNAs derived from the genome databases of microbial sequences determined whether sequence with varying levels of transcript or protein there was significant human homology. Emphasis was homology support. Tests for contamination by vector on cases with strong evidence that a particular human sequences in these non-templated sequences were chromosome rearrangement was pathologic for the also carried out with the BLAST program. Insert- congenital disorder. Patients in the 3 major studies 1-3 ed sequences were also compared to Mus musculus used in this analysis were 98 females and 144 males. GRCM38.p4 [GCF_000001635.24] chromosome plus Of the ages most, most (46) patients were under age unplaced and unlocalized scaffolds (reference assem- 10, 7 were age 10-20, 6 were age 20-40 and 1 patient bly in Annotation Release 106). Homology of inserted was older than 40. sequences to each other was tested using the Needle- Testing for homology to microbial sequenc- man and Wunsch algorithm. Lists of total homology es. Hundreds of different private rearrangements in scores for microbes vs human chromosome rear- patients with different acquired congenital disorders rangements were compared by the Mann-Whitney U were tested for homology 11 against non-human se- test using the StatsDirect Statistical analysis program. quences from microorganisms known to infect humans Fishers test was also used. as follows: Viruses (taxid: 10239), and retroviruses including HIV-1 (taxid:11676), human endogenous ret- Results roviruses (Taxids:45617, 87786, 11745, 135201, 166122, 228277and 35268); bacteria (taxid:2); Mycobacteria Interdependent functions are clustered together (taxid:85007); fungi (taxid; 4751), and chlamydias on the same chromosome segment. (taxid:51291) The nervous system has a close relationship Because homologies represent interspecies simi- to structures essential for immunity, circulation, cell larities, “Discontinuous Megablast” was most frequent- barriers and protective enclosures. In chromosome ly used, but long sequences were sometimes tested segments deleted in neurologic disorders, genes es- against highly similar microbial sequences. Significant sential for all these functions must develop in con- homology (indicated by homology score) occurs when cert. Genes for these related functions are located microbial and human DNA sequences have more close to each other on the same linear segment of similarity than expected by chance (E value<=e-10) 12. a chromosome (Fig. 1). Deletions at 4q34 in patient Confirmation of microorganism homologies was done DGAP161 are shown in Fig. 1. The genes within the by testing multiple variants of complete microorganism 4q34 deletion in each of four categories tested are bioRxiv preprint doi: https://doi.org/10.1101/674093; this version posted June 19, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license. page 5
Chromosome 4q34 gene cluster, deleted in patient DGAP161 Fig. 1. Chromosome 4q34 as a typical ex- Genes Developmental Functions ample of close relationships between nervous Controls neural stem cell proliferation system genes and genes for other essential V(D)J recombination DNA RNA sensor in Innate immune response to nucleic acids. developmental functions. Pleiotropic genes HMGB2 Chemokine. Anti-microbial Red blood cell dierentiation for multiple interdependent systems appear Connective tissue dierentiation in clusters on the same chromosomes. Ner- Hematopoiesis, anti-in ammatory response to glucocorticoids. vous system genes are near genes essential for SAP30 Hypoxia response the immune system, connections to lymphat- Autophagy, Anti-in ammatory, chemotaxis for immune system cells, prion response ic circulation, ability to form tight junctions, SCRG1 Bone dierentiation from stem cells structural enclosures and chromatin control. Noradrenergic properties of neurons, controls neuroblast multiplication Clusters of genes encoding these and other HAND2 Development of circulation, cardiac development interdependent functions on chromosome Craniofacial skeleton segments are deleted in private neurodevel- Pro-in ammatory gene HPGD opmental disorders. These losses increase Bone structure and connective tissue susceptibility to infections that have DNA Tonic and agonist induced currents in forebrain, auditory nerve activity, GLRA3 homologous to long stretches of human DNA. brain development, vision In the example shown, genes are listed ADAM29 Cell adhesion and membrane structure in the order they occur on 4q34. Develop- Signaling in neuron lipid rafts. Dendritic spine and synapse formation. GPM6A Candidate causal gene for schizophrenia mental functions are to the right of the gene
WDR17 Retinal development symbol. Blue genes are associated with the Apoptosis inhibitor nervous system, pink with the immune system. SPATA4 Osteoblast dierentiation Yellow genes have functions associated with ASB5 Vascular remodeling in embryogenesis, artery development angiogenesis, or lymphangiogenesis or cell SPCS3 Preprohormone convertase aects pituitary hormone ghrelin barriers. Genes for development of essential bone structure or connective tissues needed Neural stem cell activation VEGFC to protect and house the nervous system are Formation of lymphatics, Endothelial cell migration, Circulation light grey. Isoforms of the same gene may Induces neurogenesis NEIL3 encode different functions in different cellular B-cell expansion in germinal center, prevention of autoimmunity locations. Consistent results were obtained Thalamus structure from deletions involving 6 other chromosome AGA Aspartylglucosaminidase, lysosomal hydrolase, joint in ammation bands (see text).
TENM Development in nervous system Synapse organization color coded in Fig.1. Fig 1 is representative of six other Many deleted gene clusters include long stretch- chromosome bands that were also tested and gave es of DNA strongly related to infections. similar results: 2q24.3; 6q13-6q14.1; 10p14-10p15.1; 13q14.2; 18p11.22-p11.32; and 19q12-q13.1. Deletion To investigate the chromosomal segment dele- of these clustered arrangements has been correlated tions that likely cause neurodevelopmental disorders, with serious neurodevelopmental disorders1. Alterna- homologies to infection were tested in sequences with- tively-spliced forms of the same gene may encode for in and flanking deleted clusters. Strong homologies to pleiotropic functions that must be synchronized and infections were interspersed. To demonstrate the ex- coordinated among diverse cells. Multiple functions tent of these relationships, deleted 4q34 chromosome for the same gene in different cell types are commonly segment (Patient DGAP161 1) was tested for homology found. Hormonal signaling represents a major control to microbes in 200 kb chunks. mechanism13. Fig. 2 shows that stretches of homology to microorganisms are distributed throughout chromo- some 4q34. Only 10 homologous microbes are shown for each 200 kb division, but there are up to hundreds, bioRxiv preprint doi: https://doi.org/10.1101/674093; this version posted June 19, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.
page 6
Fig. 2. Example of total homologies to microbial sequences dispersed throughout the normal 4q34 chromosome segment deleted in patient DGAP161. The top 10 homologies are shown for each 200,000 bp segment and are listed within each of these 57 segments in order of their scores. The Y axis plots the total homology scores and the x axis lists the individual microorganism. The mean E value was 5.3e-90 (range 0.0 to 1e-87). F. tularensis had values of 33,368 and 79,059 which were truncated at 25000 to show more detail for the other microbial matches. - - e n irus co lete eno e page 7 page Clostridiu otulinu strain M � ulc chro oso e co lete ucal tus lo ulus icostata s iont R un u an a aher es irus isolate D ar �al eno e u an a aher es irus isolate D ar �al eno e u an a aher es irus isolate D ar �al eno e u an a aher es irus isolate C ar �al eno e