Investigation of Intergenic Regions of Mycoplasma hyopneumoniae and Development of Statistical Methods for Analyzing Small-Scale RT-qPCR Assays
Stuart W. Gardner
Iowa State University Capstones, Theses and Dissertations
Graduate Theses and Dissertations
2010

Part of the Statistics and Probability Commons

Recommended Citation
Gardner, Stuart W., "Investigation of intergenic regions of Mycoplasma hyopneumoniae and development of statistical methods for analyzing small-scale RT-qPCR assays" (2010). Graduate Theses and Dissertations. 11238.
https://lib.dr.iastate.edu/etd/11238

Investigation of intergenic regions of Mycoplasma hyopneumoniae and development of statistical methods for analyzing small-scale RT-qPCR assays

by

Stuart W. Gardner

A dissertation submitted to the graduate faculty in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY

Co-majors: Microbiology; Statistics

Program of Study Committee:
F. Chris Minion, Co-major Professor
Max D. Morris, Co-major Professor
Daniel S. Nettleton
Dan J. Nordman
Qijing Zhang

Iowa State University
Ames, Iowa
2010

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS
ABSTRACT

CHAPTER 1. GENERAL INTRODUCTION
    Introduction
    Dissertation Organization
    Summary with Goals and Hypothesis
    References

CHAPTER 2. LITERATURE REVIEW
    Mycoplasmas
    Mycoplasma hyopneumoniae
    Promoter Structure in Bacteria
    Transcriptional Control in Mycoplasmas
    Quantitative Real-Time Reverse Transcriptase PCR
    Microarray
    Top-Down and Bottom-Up Methods of Analysis
    Delta Method
    References

CHAPTER 3. DETECTION AND QUANTIFICATION OF INTERGENIC TRANSCRIPTION IN MYCOPLASMA HYOPNEUMONIAE
    Summary
    Introduction
    Methods
    Results
    Discussion
    Acknowledgements
    References
    Supplemental Data
        Supplemental File: Example Calculation of a Relative Quantity Index b/m

CHAPTER 4. STATISTICAL METHODS TO ANALYZE SMALL-SCALE RT-qPCR EXPERIMENTS AND TO COMPARE RELATIVE QUANTITIES ACROSS MULTIPLE TARGET SEQUENCES
    Abstract
    Background
    Methods
    Results
    Discussion
    Authors' Contributions
    Acknowledgements
    References
    Supplemental Data
        Additional file 1: Demonstration of Equivalence of Fold Change Calculations
        Additional file 2: Details on the Analysis of the Design I Assays
        Additional file 3: Details on the Analysis of the Design II Assays

CHAPTER 5. DISCUSSION
    General Discussion
    Recommendations for Future Research
    References

APPENDIX A. MICROARRAY EXPERIMENTS AND SCRIPT FOR ANALYSIS
    High vs Low Adherent Variants
    High vs Low Passage in M. hyopneumoniae Strain 232
    Hydrogen Peroxide Stress in M. hyopneumoniae Strain 232
    In vitro vs in vivo Grown M. hyopneumoniae Strain 232
    Norepinephrine Stress in M. hyopneumoniae Strain 232
    Comparative Hybridization Analysis of M. hyopneumoniae Field Isolates
    References

APPENDIX B. CUSTOM R FUNCTIONS
    bh.fdr.R
    compress.R
    cum.mean.R
    log.0.R
    msn.R
    normalize.R
    plot.gene.R
    q.value.R
    slide.comp.R

ACKNOWLEDGMENTS

LIST OF FIGURES

Figure 3.1. qRT-PCR analysis of IG regions of M. hyopneumoniae
Figure 3.2. Transcriptional analysis of IG regions flanking dnaK following heat shock
Figure 3.S1. Histogram of IG region size
Figure 3.S2. Reference gene amplification plot for standard curve dilution series
Figure 3.S3. Graph raw data
Figure 3.S4. Standard curve for reference gene
Figure 4.1. 96-well plate layout for Design I assay
Figure 4.2. 96-well plate layout for Design II assay
Figure 4.3. Fold change from both assay designs
Figure 4.4. Design I relative quantities
Figure 4.5. Design II relative quantities
Figure 4.6. Fold change of Design I vs Design II
Figure 4.7. Within treatment relative quantity of Design I versus Design II
Figure 4.8. Standard error ratios for fold change estimates
Figure 4.9. Design I relative quantity standard errors
Figure 4.10. Design II relative quantity standard errors
Figure 4.11. Within SE calculation method comparison of designs
Figure A.1. Scatter plot of genetic variation of field isolates
Figure A.2. Permutation test of hot spots showing variation around the M. hyopneumoniae genome

LIST OF TABLES

Table 2.1. Possible outcomes for m null hypothesis tests
Table 3.1. Primers used for qRT-PCR analysis
Table 3.S1. The 343 IG regions of 50 bp or more
Table 3.S2. Data from amplification plot of reference gene
Table 3.S3. Averaged duplicate Cq values from reference gene
Table 3.S4. Reference gene Cq values for individual samples
Table 3.S5. Cq values for target sequence ig-070/071 PS1
Table 4.1. Primer sets for RT-qPCR assays
Table 4.2. Parameter estimates, efficiency calculations, and model fit statistics for Design I assays
Table 4.3. Parameter estimates, efficiency calculations, and model fit statistics for Design II assays
Table 4.4. Summary table of the RT-qPCR assay results and model fitting
Table A.1. Most significant differentially expressed genes in high vs low adherent variants

LIST OF ABBREVIATIONS

ANOVA – analysis of variance
b – intercept in the model Cq = mx + b
BALF – bronchial alveolar lavage fluid
bp – base pair
BU_AS – bottom-up all separate statistics
BU_C – bottom-up combined statistics
C – control
cDNA – complementary DNA
CDS – coding sequence
CI – confidence interval
CP – crossing point
Cq – quantification cycle, threshold fluorescence value
CRISPR – clustered regularly interspaced short palindromic repeats
Ct – crossing threshold
D – dye
df – degrees of freedom
DNA – deoxyribonucleic acid
dsDNA – double stranded DNA
FC – fold change
FDR – false discovery rate
FWER – family wise error rate
G – gene
gDNA – genomic DNA
GFP – green fluorescent protein
HS – heat shock
IG – intergenic region
ISU – Iowa State University
lowess – locally weighted linear regression
m – slope in the model Cq = mx + b
Mbp – million bp
mRNA – messenger RNA
ncRNA – small non-coding RNA
nm – nanometers
nt – nucleotide
ORF – open reading frame
p1 – plate 1
p2 – plate 2
PCR – polymerase chain reaction
PS – primer set
qPCR – relative quantitative PCR
qRT-PCR – quantitative real-time-PCR
R² – coefficient of determination
REML – restricted maximum likelihood
Rep – replicate
RG – reference gene
R-M – restriction modification
RNA – ribonucleic acid
RNAP – RNA polymerase
rRNA – ribosomal RNA
RT – reverse transcriptase
RT-qPCR – relative quantitative reverse transcription PCR
S – slide
SD – Shine-Dalgarno
SE – standard error(s)
SEM – standard error of the mean
sRNA – small non-coding RNA
T – treatment/treated
TBD – to be determined
TD – top-down statistics
TF – transcription factor
TFBS – transcription factor-binding site
tRNA – transfer RNA
TU – transcriptional units
UTR – untranslated region
UV – ultraviolet

ABSTRACT

The intergenic region (IG) transcriptional activity of Mycoplasma hyopneumoniae strain 232 was studied via two-color microarrays and quantitative real-time polymerase chain reactions (RT-qPCR). Two types of microarrays were constructed, one consisting of PCR products and the other of synthesized oligonucleotides. The PCR-array consisted of 994 PCR products (probes) that cover 98% (683/698) of the total open reading frames (ORFs) of strain 232, five structural ribosomal RNA probes, and 159 IG probes for 112 of 215 IG regions greater than 124 bp. The oligonucleotide-array consisted of 528 oligonucleotide probes ranging in size between 50 and 60 bp, and was designed for IG regions for which PCR products were not constructed or whose length (50-124 bp) fell below the size range covered by the PCR-array. Transcriptional signals were identified in 93.6% (321/343) of the IG regions larger than 49 bp. From these IG regions with transcriptional activity, five large (>500 bp) IG regions and the region upstream of dnaK were chosen for further analysis by RT-qPCR. A novel method to compare the relative quantity estimates of several different targets was developed for the RT-qPCR assays, and various methods were investigated to obtain error estimates of the fold change and relative quantity by applying top-down or bottom-up statistical approaches for two different experimental designs. The results from these assays indicate that no single transcriptional start site can account for transcriptional activity within IG regions. Transcription can end abruptly at the end of an ORF, but this does not seem to occur at high frequency. Rather, transcription continues past the end of the ORF, with RNA polymerase gradually releasing the template. Transcription can also be initiated within IG regions in the absence of accepted promoter-like sequences. Also, when conducting small-scale RT-qPCR studies, the error in estimation of amplification efficiency should not be ignored in determining statistically significant differences. An assay design that uses serial dilutions of each individual sample to determine the amplification efficiency of a target sequence is favored over an assay design that uses the Stock I methodology to evaluate target sequence amplification efficiencies. In summary, methods to analyze the transcriptional activity of M. hyopneumoniae have been developed, and the results have shown that IG regions are transcriptionally active and under some regulatory control.

CHAPTER 1. GENERAL INTRODUCTION

Introduction

The subject of this study is RNA transcription in the microorganism Mycoplasma hyopneumoniae. This highly contagious bacterium, which causes chronic disease, is associated with respiratory disease and reduced productivity in swine (2). Pathogenesis is known to occur through the adherence of the organism to the cilia of the respiratory epithelia in the swine lung; however, little else is known about the pathogenic mechanisms (1).