A Frameshift Mutation Is Repaired Through Nonsense-Mediated Gene
bioRxiv preprint doi: https://doi.org/10.1101/069971; this version posted October 19, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Premature termination codons signal frame repair

A frameshift mutation is repaired through nonsense-mediated gene revising in E. coli

Xiaolong Wang1*, Xuxiang Wang1, Chunyan Li1, Haibo Peng1, Yalei Wang1, Gang Chen1, Jianye Zhang1

College of Life Sciences, Ocean University of China, Qingdao, 266003, P. R. China

Abstract
The molecular mechanisms for repairing DNA damage and point mutations are well understood, but it remains unclear how a frameshift mutation is repaired. Here we report that frameshift reversion occurs in E. coli more frequently than expected and appears to be a targeted gene repair signaled by premature termination codons (PTCs), producing high-level variations in the repaired genes. Genome resequencing shows that the revertant genome is highly stable, and the single-molecule variations in the repaired genes are derived from RNA editing. A multi-omics analysis shows that the expression levels change greatly in most of the DNA- and RNA-manipulating genes. DNA replication, transcription, RNA editing, RNA degradation, nucleotide excision repair, mismatch repair, and homologous recombination were upregulated in the frameshift or revertant, but base excision repair was not. Moreover, genes and transposons in a duplicate region silenced in wild-type E. coli were activated in the frameshift. Finally, we propose a nonsense-mediated gene revising (NMGR) model for frame repair, which also acts as a driving force for molecular evolution.
In essence, nonsense mRNAs are recognized, edited, and transported to template the repair of the coding gene by RNA-directed DNA repair, nucleotide excision, mismatch repair, and homologous recombination. Thanks to NMGR, the mutation rate temporarily rises in a frameshift gene, bringing genetic diversity while repairing the frameshift mutation and accelerating the evolution process without a high mutation rate in the genome.

* To whom correspondence should be addressed: Xiaolong Wang, Department of Biotechnology, Ocean University of China, No. 5 Yushan Road, Qingdao, 266003, Shandong, P. R. China, Tel: 0086-139-6969-3150, E-mail: [email protected].

1. Introduction
DNA replication and DNA repair happen in every cell division and multiplication. Physical or chemical mutagens, such as radiation, pollutants, or toxins, can induce point mutations, insertions/deletions (InDels), and damage in DNA. In addition, because of the imperfect fidelity of DNA polymerase, spontaneous mutations and InDels also occur as replication errors or slipped-strand mispairing. If an InDel occurs in a coding DNA sequence (CDS) and its size is not a multiple of three, it causes a frameshift mutation, which substantially changes the encoded amino acid sequence. Moreover, premature termination codons (PTCs) often arise downstream of the InDel, resulting in truncated and dysfunctional proteins [1] and leading to genetic disorders or even death.
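The effect of a single-base deletion on the reading frame can be illustrated with a minimal sketch. The 24-nt sequence below is a made-up toy CDS, not the bla gene, and the helper names (`codons`, `delete_base`, `stop_positions`) are hypothetical; the only biological input is the standard set of stop codons.

```python
# Toy illustration only: the sequence is invented and is NOT the bla gene.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard stop codons (DNA alphabet)


def codons(cds):
    """Split a coding sequence into complete triplets, dropping trailing bases."""
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def delete_base(cds, pos):
    """Return the sequence with the base at 1-based position `pos` removed."""
    if not 1 <= pos <= len(cds):
        raise ValueError("position outside the sequence")
    return cds[:pos - 1] + cds[pos:]


def stop_positions(cds):
    """Return the 1-based codon indices of in-frame stop codons."""
    return [i for i, c in enumerate(codons(cds), start=1) if c in STOP_CODONS]


# Eight codons: ATG GCA CTG ATC GCT AAC GGT TAA (stop only at the end).
wild_type = "ATGGCACTGATCGCTAACGGTTAA"

# Deleting one base (not a multiple of three) shifts every downstream codon:
# ATG CAC TGA TCG CTA ACG GTT, so a premature TGA appears at codon 3.
frameshift = delete_base(wild_type, 4)

wild_type_stops = stop_positions(wild_type)
frameshift_stops = stop_positions(frameshift)
```

In this toy case the wild-type frame has its only stop at codon 8, whereas the shifted frame hits a stop at codon 3 and no longer reads the original terminal stop in frame.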
The repair of DNA damage and point mutations has been intensively studied [2], including base and nucleotide excision repair, mismatch repair, homologous recombination, and non-homologous end-joining. However, it remains unclear how a frameshift mutation is repaired. The reverse mutation phenomenon was discovered as early as the 1960s [3], but it has long been explained simply by spontaneous mutagenesis (Fig 1). In principle, the reverse mutation rate is much lower than the forward mutation rate. However, it was reported that frameshift reversion occurs more frequently than expected in Escherichia coli (E. coli), which was believed to be an adaptive mutation [4-8]. Here we report that frameshift reversion occurs in E. coli much more frequently than expected and appears to be a targeted gene repair involving many gene repair pathways.

2. Materials and Methods

2.1 Frameshift preparation and revertant screening
The G:C base pair at +136 was deleted from the wild-type bla gene (bla+) by overlapping extension polymerase chain reaction (OE-PCR), resulting in a plasmid containing a frameshift bla gene, pBR322-(bla-). Competent cells of E. coli DH5α were transformed with plasmid pBR322 or pBR322-(bla-) and propagated in tetracycline broth; dilutions were plated in parallel on tetracycline plates and ampicillin plates to screen for revertants. The recovery rates were calculated by a standard method [9]. The growth rates were evaluated by the doubling time in the exponential growth phase.
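The text does not specify how the doubling time was computed. A common approach, sketched here as an assumption and not as the authors' procedure, takes two cell-density readings within the exponential phase and applies t_d = Δt · ln 2 / ln(N2/N1). The function name and the OD600 readings below are hypothetical.

```python
import math


def doubling_time(t1, density1, t2, density2):
    """Doubling time from two density readings taken in exponential phase.

    Assumes pure exponential growth between the two time points; the result
    has the same time unit as t1 and t2.
    """
    if t2 <= t1:
        raise ValueError("t2 must be later than t1")
    if density1 <= 0 or density2 <= density1:
        raise ValueError("density must be positive and increasing")
    return (t2 - t1) * math.log(2) / math.log(density2 / density1)


# Hypothetical readings: OD600 rises from 0.1 to 0.4 in 60 minutes,
# i.e. two doublings, so the doubling time is about 30 minutes.
td_minutes = doubling_time(0, 0.1, 60, 0.4)
```

In practice a linear fit of ln(density) against time over several exponential-phase readings is less sensitive to noise than a two-point estimate.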
The plasmid DNA was extracted, and the bla genes were sequenced by the Sanger method.

2.2 Construction and expression of a PTC-substituted frameshift gene
A PTC-substituted bla- gene, denoted bla#, was derived from bla- by replacing each nonsense codon with a sense codon according to the readthrough rules (Table 1). A stop codon, TAA, was added at the 3'-end. The bla# gene was chemically synthesized by Sangon Biotech, Co. Ltd (Shanghai), inserted into the expression vector pET28a, and transformed into competent cells of E. coli strain BL21. The transformants were plated on a kanamycin plate, propagated in kanamycin broth, and plated on ampicillin plates to screen for revertants. The expression of the bla# gene was induced by 0.1 mM IPTG. The total protein samples were analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), and the product protein was purified by nickel column chromatography. The purified product was tested by an iodometry assay to measure its lactamase activity [10].

2.3 Genome resequencing and variation analysis
Genomic DNA was extracted from the frameshift (Fs) and the revertant (Rs). Novogene Co. Ltd. performed next-generation sequencing (NGS) on the Illumina HiSeq 250PE platform. For each strain, clean reads were mapped onto the reference sequences of E. coli K12 MG1655 (NC_000913.3) and pBR322 (J01749.1). Single nucleotide polymorphisms (SNPs), InDels, and structural variations (SVs) were analyzed.

2.4 Transcriptome profiling and gene expression analysis
Total RNA samples were extracted from four different strains: a wild type (Wt), a frameshift (Fs), a slow-growing revertant (Rs), and a fast-growing revertant (Rf). Novogene Co. Ltd performed RNA sequencing (RNAseq). The expected number of Fragments Per Kilobase of transcript sequence per Million base pairs sequenced (FPKM) was calculated for each gene. The threshold for significantly differential expression was set as corrected P-value (q) < 0.005 and fold change (f) ≥ 2.0. Enrichment analysis of Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways was performed for the identified differentially expressed genes (DEGs).

2.5 Integrative analysis of genome and transcriptome datasets
All NGS datasets for wild-type E. coli that are available in the Sequence Read Archive (SRA) were downloaded, including 1 genomic (SRR11474703) and 7 transcriptomic or RNAseq (SRR1149439, SRR2914524, SRR2914548, SRR2914549, SRR2914550, SRR6111094, and SRR6111095) datasets; the RNAseq datasets were pooled into a large transcriptomic dataset for wild-type E. coli. After mapping the reads onto the reference genome, the genomic and the transcriptomic coverage depths for each location (in KB) were plotted next to each other on a circular map using Circos (v0.69) [11].

2.6 Quantitative analysis of global proteomes
Total protein samples were prepared for the frameshift (Fs) and the revertant (Rs). Quantitative analysis of the proteome was performed using a service provided by PTM-Biolabs (Hangzhou), Co., Ltd. Identified protein IDs were converted into UniProt IDs and mapped to GO IDs. For the identified proteins not annotated in the UniProt-GOA database, InterProScan was used to determine their GO annotations.

3. Results and Analysis

3.1 The growth of the frameshift and the revertants
The plasmid pBR322 contains two wild-type resistance genes, a β-lactamase gene (bla+) and a tetracycline resistance gene (tet+). The +136 G:C base pair of the bla gene was deleted by OE-PCR (Fig 2A), resulting in a plasmid containing a frameshift bla, pBR322-(bla-). This deletion is a loss-of-function mutation because 17 PTCs appeared and all the active sites are located downstream of the deletion, including the acyl ester intermediate (AA 68), the proton acceptor (AA 166), and the substrate-binding site (AA 232-234), which were all destroyed.
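The claim that every listed active-site residue lies downstream of the deletion can be checked with simple codon arithmetic. The sketch below assumes that +136 is counted from the first nucleotide of the start codon (position +1) and that the residue numbers quoted above use the same sequential numbering from the initiator codon; neither convention is stated explicitly in the text, and `codon_index` is a hypothetical helper.

```python
def codon_index(nt_pos):
    """1-based index of the codon containing the 1-based nucleotide position."""
    if nt_pos < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_pos - 1) // 3 + 1


# Nucleotides 136-138 form codon 46 under the assumed numbering.
deletion_codon = codon_index(136)

# Residue positions quoted in the text (AA 68, AA 166, AA 232-234).
active_site_residues = [68, 166, 232, 233, 234]

# True when every listed residue is encoded after the deleted base.
all_downstream = all(aa > deletion_codon for aa in active_site_residues)
```

Under these assumptions the deletion falls in codon 46, so all five listed residues are encoded in the shifted frame.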