Supplementary text and tables for
A frameshift mutation is repaired through nonsense-mediated gene revising in E. coli

Xiaolong Wang1*, Xuxiang Wang1, Chunyan Li1, Haibo Peng1, Gang Chen1, Jianye Zhang1
College of Life Sciences, Ocean University of China, Qingdao, 266003, P. R. China

1. Materials and Methods in detail

1.1 Frameshift preparation and revertant screening
The G:C base pair located at +136 was deleted from the wild-type bla gene (bla+) by overlapping extension polymerase chain reaction (OE-PCR). E. coli DH5α competent cells were transformed with pBR322-(bla-) and propagated in tetracycline broth. Dilutions were plated in parallel on tetracycline plates and ampicillin plates to screen for revertants. The recovery rates were calculated by a standard method [1]. The revertants were propagated in ampicillin broth at 37°C with shaking at 200 rpm, inoculated from an overnight seed culture (1:50). The growth rates of the revertants were evaluated by the doubling time in their exponential growth phase. Their plasmid DNA was extracted, and their bla genes were sequenced by the Sanger method.

1.2 Construction and expression of a PTC-free frameshift
Usually, a suppressor tRNA gene is introduced into cells to read through a nonsense mutation and incorporate an amino acid instead of terminating translation. However, suppressor tRNA genes are not suitable for reading through the premature termination codons (PTCs) of bla- for two reasons. First, all three different PTCs (UGA, UAG, UAA) are present in bla-, so three different suppressor tRNA genes would have to be introduced, yet there is no means to restrict their function to bla and not to other genes.
If three suppressor tRNAs were introduced, all the true termination codons of other genes would also be read through, producing all sorts of odd peptides in the host. Second, even if three different suppressor tRNAs were introduced, there is no guarantee that bla- would be expressed as expected, because the nonsense mRNAs may be subject to nonsense-mediated mRNA decay or translational frameshifting pathways.

* To whom correspondence should be addressed: Xiaolong Wang, Department of Biotechnology, Ocean University of China, No. 5 Yushan Road, Qingdao, 266003, Shandong, P. R. China, Tel: 0086-139-6969-3150, E-mail: [email protected].

A PTC-free frameshift bla, denoted as bla#, was derived from bla- by replacing each nonsense codon with a sense codon according to the readthrough rules (Table 1). A stop codon, TAA, was added at the 3'-end. The bla# gene was chemically synthesized by Sangon Biotech Co., Ltd. (Shanghai), inserted into the expression vector pET28a, and transformed into competent cells of E. coli strain BL21. Since pET28a carries a kanamycin resistance gene, the transformants were plated on a kanamycin plate, propagated in kanamycin broth, and then plated on ampicillin plates to screen for revertants. The expression of the bla# gene was induced by 0.1 mM IPTG. Total protein samples were extracted, and the product was purified by nickel column chromatography and analyzed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE). The purified product was tested by an iodometric assay to measure its lactamase activity.

1.3 Plasmid and genomic DNA extraction
The revertant (DH5α/pBR322/bla*) was transferred into LB containing ampicillin and tetracycline, and the frameshift (DH5α/pBR322/bla-) was inoculated into LB containing tetracycline. After inoculation, the cells were cultured at 37°C and 200 rpm for 12 h.
For each strain, 1.0 mL of culture was sent to Sangon Biotech Co., Ltd. (Shanghai) for Sanger sequencing of the bla and tet genes. The sequencing primer was tet-f: TAA CGC AGT CAG GCA CCG t (bases 65-83 of plasmid pBR322). The genomic DNA of the above strains was extracted using a genomic DNA extraction kit (Tiangen).

1.4 Genome resequencing and structure/variation analysis
Library preparation and genome sequencing were conducted by a commercial service provided by Novogene Co. Ltd. Paired-end reads were obtained on an Illumina HiSeq 250PE platform. The raw reads were trimmed by Trimmomatic (v0.39) to remove adapters and low-quality sequences for each sample. The clean reads were mapped onto the reference genome of E. coli K12 MG1655 (NC_000913.3) and the reference sequence of plasmid pBR322 by bwa (v0.7.17); the output alignments were indexed by Samtools (v0.1.18) [2] and sorted by Picard (v2.23.4); then the reads around indels were realigned, and single nucleotide polymorphisms (SNPs) and indels were called by GATK (v4.1.2.0) [3]. Structural variations (SVs) were scanned by BreakDancer (v1.3.6) [4, 5].

1.5 RNA extraction
The wild type (DH5α/pBR322/bla+) and the revertants (DH5α/pBR322/bla*) were inoculated into LB containing ampicillin and tetracycline, and the frameshift (DH5α/pBR322/bla-) was inoculated into LB containing tetracycline. After inoculation, the cells were cultured at 37°C and 200 rpm for 12 h. The wild type, the frameshift and the revertants were then inoculated into fresh LB with appropriate antibiotics at a ratio of 1:100 and cultured at 37°C and 200 rpm for 3 h to log phase. The total RNA samples of the above strains were extracted using an RNA extraction kit (Tiangen).

1.6 Transcriptome analysis
Library preparation and RNA sequencing were conducted by a commercial service (Novogene Co. Ltd). The high-quality clean reads for each sample were mapped using bwa (v0.7.17) onto the reference sequence of the E.
coli K12 MG1655 genome (NC_000913.3) and of the plasmid pBR322 (J01749.1). Using Circos (v0.69) [6], the coverage depths of the transcriptomic reads were displayed on a ring diagram next to those of the corresponding genomic reads.

The expression level for each gene was calculated and analyzed as follows:

(1) Quantification of gene expression level
Bowtie (v2-2.2.3) was used for aligning the clean reads to the reference genome [7]. HTSeq (v0.6.1) was used to count the number of reads mapped to each gene [8]. The expected number of Fragments Per Kilobase of transcript sequence per Million base pairs sequenced (FPKM) was calculated for each gene.

(2) Identification of differentially expressed genes
To identify differentially expressed genes (DEGs), read counts were adjusted by the edgeR program for each genotype [9]. Differential expression analysis of two conditions was performed on the read counts using the DEGSeq R package (1.20.0) [10]. The P-values were adjusted by the Benjamini & Hochberg method. The threshold for significant differential expression was set as corrected P-value (q-value) < 0.005; fold changes ≥ 2.0 and ≥ 1.5 were considered great and moderate changes, respectively.

(3) GO and KEGG enrichment of the DEGs
Gene Ontology (GO) enrichment of the DEGs was implemented by the GOseq R package [11], in which the potential gene length bias was corrected, and GO terms with corrected P-value < 0.05 were considered significantly enriched by DEGs. KOBAS 2.0 [12] was used to test the statistical enrichment of DEGs in the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (http://www.genome.jp/kegg/).

1.7 Quantitative analysis of global proteome
Quantitative analysis of the global proteome was performed by PTM-Biolabs (Hangzhou), Co., Ltd.
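
For the expression analysis described in Section 1.6, the FPKM normalization, the Benjamini & Hochberg adjustment and the DEG thresholds can be illustrated with a minimal Python sketch. It is not the pipeline used in this study (which relied on HTSeq, edgeR and DEGSeq); the function names, the symmetric treatment of down-regulation, and the "minor" label for significant changes below 1.5-fold are assumptions made only for illustration.

```python
from typing import List


def fpkm(fragments: int, gene_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values (q-values) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the adjusted
    # values never increase as the rank decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def classify_change(q_value: float, fold_change: float, q_cutoff: float = 0.005) -> str:
    """Label a gene using the thresholds quoted in Section 1.6.

    Assumption: fold_change is a positive ratio, and a decrease is scored by
    its reciprocal so that 0.5 counts as a 2-fold change.
    """
    if q_value >= q_cutoff:
        return "not significant"
    magnitude = max(fold_change, 1.0 / fold_change)
    if magnitude >= 2.0:
        return "great"
    if magnitude >= 1.5:
        return "moderate"
    return "minor"  # hypothetical label, not used in the source text


if __name__ == "__main__":
    print(fpkm(500, 1000, 10_000_000))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
    print(classify_change(0.001, 2.5))
```

Processing the P-values from largest to smallest keeps the adjusted values monotone, which matches the standard step-up formulation of the procedure.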
Peptides were loaded onto a reverse-phase pre-column (Acclaim PepMap 100), separated on a reverse-phase analytical column and analyzed by a Q Exactive™ hybrid quadrupole-Orbitrap mass spectrometer (ThermoFisher Scientific).

The MS/MS data were processed using the Mascot search engine (v.2.3.0). The tandem mass spectra were searched against the UniProt database. GO annotation of the proteome was derived from the UniProt-GOA database (http://www.ebi.ac.uk/GOA/). Identified protein IDs were converted into UniProt IDs and mapped to GO IDs. For the proteins unannotated in the UniProt-GOA database, InterProScan was used to determine their GO annotations.

2. A review of previous studies that are supportive of the NMGR model
In the main text, the proposed NMGR model suggests that frame repair is triggered by PTCs and based on RNA-directed DNA repair, and that nonsense mRNAs are edited before directing the repair of the coding DNA. NMGR integrates several key links that have already been intensively studied, including mRNA surveillance, mRNA processing, mRNA editing, DNA recombination and repair. Like the connected pathways, frame repair should be widespread and highly conserved among species. NMGR is consistent with many previous studies. This supplementary text reviews previous studies covering a wide range of prokaryotic and eukaryotic species to gain deeper knowledge.

2.1 mRNA decay is linked to other nonsense mRNA processing pathways
In eukaryotes, NMD uses the presence of an exon junction complex (EJC) downstream of PTCs as a second signal to distinguish a PTC from a true stop codon [13]. In addition to NMD, nonsense mRNAs may also be subject to other pathways, including translational repression, transcriptional silencing, and alternative splicing, all of which are linked to NMD [14-16].
In eukaryotes, NMD-regulating factors, including three interacting proteins, UPF1, UPF2 (also known as NMD2), and UPF3, are encoded by highly conserved genes originally identified in yeast [17].