Abstract
COKER, JEFFREY SCOTT. The systemic response to fire damage in tomato plants: A case study in the development of methods for gene expression analysis using sequence data. (Under the direction of Dr. Eric Davies.)
Fire is a natural component of most terrestrial ecosystems and can act as a local wound stimulus to plants. The ultimate goal of this work was to characterize the array of
transcripts which systemically accumulate in plants after fire damage. Before this could be
accomplished, substantial development of methods for gene expression analysis using sequence data was necessary. This involved developing methods for identifying
contamination in DNA sequence data (Chapter 2), identifying over 78,000 false sequences in
GenBank and several thousand more in the indica rice genome (Chapter 2), developing a
novel method for identifying housekeeping controls using sequence data (Chapter 3),
performing relative expression analyses for 127 potential housekeeping control transcripts
(Chapter 3), and characterizing 23 transcripts which encode all 13 subunits of vacuolar H+-ATPases in tomato plants (Chapter 4). A subtractive cDNA library served as a starting point to identify and characterize 9 novel tomato transcripts systemically up-regulated in leaves in the first hour after a distant leaf is flame wounded (Chapter 5). Real-time RT-PCR using leaf RNA isolated at different times after flaming showed that the most common pattern of transcript accumulation was an increase within 30 to 60 minutes, followed by a return to basal levels within 3 hours. Expression analyses also showed that most up-regulated transcripts were already present in unwounded tissues. A total of 46 different transcripts
were identified from the subtractive cDNA library (Chapter 6). Compared with the entire
tomato transcriptome, these 46 transcripts are very highly conserved in plants. The vast majority fell into 5 classes: enzymes of general metabolism; protein synthesis, modification,
and transport; transcription; membrane transport; and photosynthesis and respiration. At
least half of the transcripts have been previously associated with wounding or stress,
suggesting that the systemic response to fire damage has components similar to those of other
wound and stress responses. On the other hand, 30% of transcripts were associated with
photosynthesis and respiration, suggesting that part of the response to fire damage is notably different from other wound and stress responses. Conclusions and future directions are included in Chapter 7.
THE SYSTEMIC RESPONSE TO FIRE DAMAGE IN TOMATO PLANTS: A CASE STUDY IN THE DEVELOPMENT OF METHODS FOR GENE EXPRESSION ANALYSIS USING SEQUENCE DATA
by JEFFREY SCOTT COKER
A dissertation submitted to the Graduate Faculty of North Carolina State University in partial fulfillment of the requirements for the Degree of Doctor of Philosophy
DEPARTMENT OF BOTANY
Raleigh 2004
APPROVED BY:
Dr. Judy Thomas, Advisory committee member
Dr. Jack Wheatley, Advisory committee member
Dr. Dominique Robertson, Advisory committee member
Dr. Chris Brown, Advisory committee member
Dr. Eric Davies, Chair of Advisory Committee
Dedication
The dissertation of Jeffrey S. Coker, which completes the Degree of Doctor of Philosophy, is dedicated to the educators of Plymouth, North Carolina.
Leafie Bryant Carolyn Watlington Julia Towe Joyce Hardison Rita Rhodes Donna Whitfield Frances Callander A. Willingham Ann Bland Robert Moore Doris Downing Kevin Cutler Ruth Pharr Leroy Bland Beth Thompson Kathy Stanfield Shirley Thomas Kerry Koeppl Sally Woolard Donald Rote Glenda Smith Pam Benson Bea Waters Becky Brown Judy Wynn Robert Cody Ms. Wilkins Alma Phifer Senya Norman Frances Jones Roxanna Brown Marian Floyd Judy Bragg Ed Clark Mary Kay Bradshaw Geraldine Rodgers Janet Swain Charlene Evans Joyce Hardison Susan Owens Dianne Staten Hector Palacios Donald Hassell Susie Jakeman Louis Spencer Victor Davis Marty Alligood Patrick Parr Mr. and Mrs. Sermons Michelle Stewart Julius Walker Robert Cody
Biography
Jeffrey Scott Coker was born the son of Jerry and Debra Coker in the small town of
Plymouth, North Carolina. His interest in plants is probably due to a family of gardeners, pulp and paper engineers, and wood-workers, as well as a community where farms, forests, ball fields, and swamps are plentiful. Jeffrey attended Davidson College, where he studied biology and ancient Greek and Roman civilizations, and played baseball. After graduation, he worked for one year at the Helen Paesler School in Raleigh, NC, teaching high school biology, chemistry, and calculus, as well as middle school science/math. It was during this year that he found a passion for teaching science and decided to pursue it at the college level.
Jeffrey entered graduate school at N.C. State University in 1999 as an RA/TA in the
Botany Department, where he taught laboratories in Botany and Biotechnology, and co-taught a new Whole Plant Physiology course. He earned an M.Ed. in Science Education in the spring of 2001, and formally became a Ph.D. student in Botany (under Dr. Eric Davies) shortly thereafter. He has been recognized for his teaching at N.C. State by receiving the
CALS Outstanding Teaching Assistant Award, the Martha Sue Sebastian Memorial Award for Excellence in Teaching, a GSA Outstanding Teaching Award, an Alcoa Teaching
Fellowship, and a NACTA Graduate Student Teaching Award. Student researchers under his supervision have been recognized locally and nationally for their work.
While in Raleigh, Jeffrey met a wonderful girl named Beth, and they were married on
December 20, 2003, in Greenville, N.C. Beginning in August of 2004, Jeffrey will be an
Assistant Professor in the Biology Department at Elon University. He looks forward to a successful career in teaching and research, and to spending many happy years with Beth.
Acknowledgements
There are many people who have supported me over the last five years in various capacities. My committee members have been extremely supportive, and for that I am most grateful. Dr. Eric Davies has been an outstanding research advisor in every sense. His openness to new ideas, support of my work, willingness to integrate scientific and educational pursuits, careful review of manuscripts, daily friendliness, and general guidance have all been invaluable. Perhaps the most distinct impression Eric has left on me is the amount of effort he spends helping to advance the lives and careers of his students and colleagues. I cannot think of a more admirable quality. We have had many conversations about how many students do not fully appreciate a teacher or mentor until years later. Let me assure you that I am fully aware of what an outstanding advisor I have had. Dr. Judy Thomas has been an excellent mentor and friend, and was an instrumental part of my success in graduate school. She believed in me when others were skeptical, and set me on the right path more times than I can count. Dr. Chris Brown has been a role model for me in terms of professionalism, teaching, and the leadership of research and teaching collaborations. He introduced me to concepts of Space Biology which changed the way I look at my own discipline. I credit Dr. Niki Robertson with shaping my earliest thoughts about biotechnology, and value her thoughts very highly. Her enthusiastic and insightful approaches to science and life are contagious among her students. Dr. Jack Wheatley’s presence on my committee is especially meaningful because he represents good teaching and educational scholarship. I am thankful for his guidance, patience, and insightful reviews of my teaching.
A number of people worked alongside me in the laboratory, and provided daily assistance for which I am thankful. Dr. Raul Salinas was especially helpful and patient.
Most of my “co-workers” were high school and undergraduate student researchers who always made the lab a more enjoyable place. In particular, I am thankful to have worked with Derek Jones regarding vacuolar ATPases and enjoyed both his enthusiasm and friendship. Other student researchers included Katie Grant, Jessica Staley, Holly Cline, Ryan
Parks, Ashwynn Stanger, John Pollard, and Turqouise Ross.
Dr. Gerald Van Dyke has been an invaluable teaching mentor and friend. His excitement about teaching and commitment to students have inspired me to seek excellence in the classroom. My time at N.C. State would not have been the same without the friendship and conversation of Dr. Isaac Bruck. I am also thankful for the Botany administrative staff, especially Sue Vitello and Vicki Lemaster, who dealt with many issues on my behalf.
I am blessed with a loving family which has provided support in many forms. Mom,
Dad, Grandmother, Chris, Eric, Laura, Sheila, Mike, Josh, and Debbie have all played important roles in my life. On at least two occasions, family members (Eric and Mom) helped me to overcome significant research difficulties.
Finally, I could not dream of having a more supportive wife. Beth has been at my side through virtually every step of my dissertation research. She has assisted me in the field, in the laboratory, and in the classroom. She has read my papers, inspected tables and figures, listened to whole lectures just so I could practice and, perhaps most importantly, encouraged me to work long hours when deadlines approached or I became really excited about something (which happens frequently). She must be, as we joke, the “best chemical engineering botanist” in the country. Any success I have must also be hers.
Part of the research and travel associated with this dissertation was funded by grants from the Plant Molecular Biology Consortium, Sigma Xi, and the American Society of Plant
Biologists. Acknowledgements of a more technical nature are provided at the end of each chapter.
Table of Contents
List of Tables ...... xi
List of Figures...... xiv
1. Introduction...... 1
2. Sequence quality control ...... 6
A. Identifying adaptor contamination when mining DNA sequence data...... 7
    Abstract...... 7
    Acknowledgments...... 11
    References...... 12
B. Cleaning data mined from the indica rice genome...... 16
    Abstract...... 16
    SmaI-linearized pUC18 plasmid...... 16
    Regions of other cloning vector(s)...... 18
    Phytophthora...... 19
    Conclusions...... 20
    References...... 21
C. Correction of the 5’ end of the human com1/p8 gene...... 26
    Letter...... 26
    References...... 26
3. Selection of candidate housekeeping controls in tomato plants using EST data ...... 28
    Abstract...... 29
    Introduction...... 29
    Materials and methods ...... 30
    Data mining...... 30
    Calculation of relative expression levels ...... 30
    Calculation of fold ranges and transcript variation...... 30
    Results and discussion ...... 31
    Acknowledgements...... 33
    References...... 33
4. Identification, conservation, and relative expression of V-ATPase cDNAs in tomato plants...... 34
    Abstract...... 35
    Introduction...... 35
    Materials and methods ...... 37
    Identification of V-ATPase ESTs ...... 37
    Relative expression analyses...... 37
    Gene nomenclature ...... 37
    Results and discussion ...... 40
    23 V-ATPase genes identified in tomato...... 40
    Hexamer rings are highly conserved...... 40
    Relative expression levels in different tissues ...... 41
    V-ATPase relative expression increases during fruit ripening ...... 45
    Conclusion ...... 46
    Acknowledgements...... 46
    References...... 47
5. Identification, accumulation, and functional prediction of novel tomato transcripts systemically up-regulated after fire damage ...... 49
    Abstract...... 50
    Introduction...... 51
    Results...... 55
    Discussion...... 59
    CSWR-1 Acyl carrier protein ...... 60
    CSWR-2 Adenylyl-sulfate reductase...... 60
    CSWR-3 Unknown protein...... 61
    CSWR-4 Photosystem II oxygen-evolving complex protein 3...... 61
    CSWR-5 Putative anion:sodium symporter...... 61
    CSWR-6 Unknown wound/stress protein...... 62
    CSWR-7 Chloroplast-specific ribosomal protein ...... 63
    CSWR-8 Alpha/beta fold family protein ...... 63
    CSWR-9 Histidine triad family protein ...... 63
    Materials and Methods...... 65
    Plant material, growth conditions, and tissue collection...... 65
    Subtractive cDNA library construction, screening and sequencing ...... 65
    DNA sequence analysis and data mining...... 66
    Verification of consensus sequences ...... 67
    Real-time RT-PCR assays...... 67
    Relative expression analyses...... 68
    Polypeptide sequence analysis...... 69
    Acknowledgements...... 69
    Literature cited...... 70
6. Fire damage causes the systemic up-regulation of a set of highly conserved transcripts in tomato plants ...... 84
    Abstract...... 85
    Introduction...... 86
    Materials and methods ...... 89
    Plant material, growth conditions, and tissue collection...... 89
    Subtractive cDNA library construction, screening and sequencing ...... 89
    DNA sequence analysis ...... 90
    Comparisons with the Arabidopsis genome ...... 90
    Results...... 92
    Overview of the subtractive cDNA library...... 92
    Library validation...... 94
    Conservation between tomato and Arabidopsis...... 95
    Discussion...... 97
    Transcripts common to other wound and stress responses ...... 97
    Transcripts not common to other wound and stress responses...... 101
    References...... 103
7. Conclusions and future directions...... 108
Conclusions and future directions regarding the development of methods for gene expression analysis using sequence data: Blueprint for a universal sequencing-based method of gene expression analysis...... 109
    Abstract...... 109
    Disadvantages of binding-radiation methods...... 110
    Advantages of sequencing methods...... 112
    Obstacles and specifications for a universal sequencing-based method...... 117
    References...... 120
Conclusions and future directions regarding the biology of systemic responses to fire damage ...... 121
Appendices...... 124
Appendix 1: V-ATPase amino acid alignments...... 125
Appendix 2: Annotated sequences for novel tomato transcripts/proteins ...... 141
Appendix 3: Perspectives on student research experiences in plant biology...... 152
Overview...... 152
A. Involvement of plant biologists in undergraduate and high school student research ...... 153
    Abstract...... 153
    Introduction...... 153
    Methods...... 153
    Member participation...... 153
    Advantages and disadvantages of research training ...... 154
    References...... 156
B. A national perspective on mentoring student researchers in plant biology...... 157
    Abstract...... 157
    Introduction...... 158
    Materials and methods ...... 161
    Results and discussion ...... 164
    Acknowledgements...... 176
    References...... 177
C. Evaluation of teaching and research experiences undertaken by botany majors at N.C. State University...... 185
    Abstract...... 185
    Introduction...... 186
    Methods...... 188
    Results and discussion ...... 188
    Acknowledgements...... 195
    Literature cited...... 195
List of Tables
Chapter 2-A
Table 1. Sequences and search parameters to identify entries in GenBank contaminated by 7 commercial adaptor sequences ...... 14
Chapter 2-B
Table 1. Matches in the indica genome with the pUC18 SmaI site...... 22
Table 2. Examples of internal pUC18 artifacts (≥14 bp) in indica scaffolds ...... 24
Table 3. Examples of phytophthora-like sequences in the indica genome...... 25
Chapter 3
Table 1. Summary of tentative consensus sequences (TCs) from the TIGR TGI that were analyzed for their potential as housekeeping control genes...... 30
Table 2. Highest-ranking housekeeping control genes in various tomato plant tissues ...... 31
Chapter 4
Table 1. V-ATPase genes in Arabidopsis and tomato ...... 38
Chapter 5
Table 1. Sequence extension and polypeptide deduction for unidentifiable tomato cDNA fragments that are "candidates for the systemic wound response" (CSWR) ...... 76
Table 2. PCR primers specific to 9 novel tomato cDNAs that were used to verify putative open reading frame sequences and perform real-time RT-PCR experiments...... 77
Chapter 6
Table 1. Summary of a subtractive cDNA library containing transcripts systemically up-regulated in the hour after fire damage...... 93
Chapter 7
Table 1. Specifications for a universal sequencing-based method of gene expression analysis...... 118
Appendix 1
Table 1. Subunit c amino acid identities...... 127
Table 2. Subunit c” amino acid identities ...... 128
Table 3. Subunit d amino acid identities...... 129
Table 4. Subunit e amino acid identities...... 130
Table 5. Subunit A amino acid identities...... 132
Table 6. Subunit B amino acid identities...... 134
Table 7. Subunit C amino acid identities...... 135
Table 8. Subunit D amino acid identities...... 135
Table 9. Subunit E amino acid identities ...... 137
Table 10. Subunit F amino acid identities ...... 138
Table 11. Subunit G amino acid identities...... 139
Table 12. Subunit H amino acid identities...... 140
Appendix 3-A
Table 1. ASPB member involvement and satisfaction with supporting undergraduate and high school research...... 154
Table 2. Frequencies of ASPB member comments regarding the potential advantages of supporting undergraduate (UG) and high school (HS) research...... 154
Table 3. Frequencies of ASPB member comments regarding the potential disadvantages of supporting undergraduate (UG) and high school (HS) research...... 155
Appendix 3-B
Table 1. Population demographics of respondents to a survey of the American Society of Plant Biologists (ASPB) ...... 180
Table 2. Percentages of respondents who have mentored various numbers of undergraduates ...... 181
Table 3. Respondent perceptions of institutional incentives for mentoring student researchers ...... 182
List of Figures
Chapter 1
Figure 1. Strategy to identify and analyze cDNAs up-regulated in tomato leaf tissue during a systemic wound response to fire damage...... 5
Chapter 2-A
Figure 1. The path from sequencing a cDNA to an improperly edited sequence...... 15
Chapter 2-B
Figure 1. Matches of 20 bp, 19 bp, 18 bp, etc. in the indica genome corresponding to the pUC18 SmaI site...... 23
Chapter 3
Figure 1. Percentage of tomato cDNA libraries (n = 27) which contain ESTs for given genes within various fold ranges of relative expression ...... 32
Chapter 4
Figure 1. Amino acid identity of tomato V-ATPase subunits compared to Arabidopsis ...... 42
Figure 2. Relative expression levels of V-ATPase ESTs in different cDNA libraries of the TIGR TGI...... 43
Figure 3. Relative expression levels of individual V-ATPase cDNAs...... 44
Figure 4. Cumulative relative expression levels of tomato V-ATPase subunits ...... 44
Figure 5. Similarity between V-ATPase relative expression in developing tomatoes and V-ATPase activity in developing grapes (grape data from Terrier et al., 2001)...... 46
Chapter 5
Figure 1. Strategy to identify and characterize cDNAs up-regulated in tomato leaf tissue during a systemic wound response to fire damage ...... 78
Figure 2. Confirmation of the existence of 9 putative consensus sequences for unknown tomato cDNAs ...... 79
Figure 3. Expressed sequence tag analysis of 9 cDNAs that are candidates for the systemic wound response (CSWR)...... 80
Figure 4. Organ-specific relative abundance of CSWR-1 through CSWR-9 in unwounded tomato plants...... 81
Figure 5. Systemic transcript accumulation of 9 tomato cDNAs (CSWR-1 through CSWR-9) in leaf 4 after flame wounding leaf 3...... 82
Figure 6. Structural and functional prediction of 9 tomato proteins, encoded by CSWR-1 through CSWR-9 ...... 83
Chapter 6
Figure 1. Conservation of transcript sequences between tomato and Arabidopsis...... 95
Figure 2. Phenylpropanoid biosynthesis from phenylalanine...... 98
Figure 3. The methyl cycle and ethylene synthesis ...... 99
Chapter 7
Figure 1. Comparisons that can be made between 2 transcript populations using binding-radiation (a) and sequencing (b) methods...... 114
Figure 2. Theoretical blueprint for a universal sequencing-based method of gene expression analysis...... 119
Appendix 1
Figure 1. Alignment of c subunits in tomato ...... 125
Figure 2. Alignment of c subunits in tomato and Arabidopsis ...... 126
Figure 3. Alignment of c” subunits in tomato...... 127
Figure 4. Alignment of c” subunits in tomato and Arabidopsis ...... 128
Figure 5. Alignment of d subunits in tomato and Arabidopsis...... 129
Figure 6. Alignment of e subunits in tomato ...... 130
Figure 7. Alignment of e subunits in tomato and Arabidopsis ...... 130
Figure 8. Alignment of A subunits in tomato and Arabidopsis ...... 131
Figure 9. Alignment of B subunits in tomato ...... 132
Figure 10. Alignment of B subunits in tomato and Arabidopsis ...... 133
Figure 11. Alignment of C subunits in tomato and Arabidopsis ...... 134
Figure 12. Alignment of D subunits in tomato and Arabidopsis...... 135
Figure 13. Alignment of E subunits in tomato...... 136
Figure 14. Alignment of E subunits in tomato and Arabidopsis...... 136
Figure 15. Alignment of F subunits in tomato and Arabidopsis...... 137
Figure 16. Alignment of G subunits in tomato ...... 138
Figure 17. Alignment of G subunits in tomato and Arabidopsis...... 138
Figure 18. Alignment of H subunits in tomato and Arabidopsis...... 139
Appendix 3-A
Figure 1. ASPB member comments regarding potential advantages of supporting undergraduate researchers...... 155
Figure 2. ASPB member comments regarding potential advantages of supporting high school researchers...... 155
Figure 3. Number of ASPB member comments regarding undergraduate and high school research ...... 155
Appendix 3-B
Figure 1. Percentages of plant biologists who mentored various numbers of undergraduates in different “length of their mentoring career” categories ...... 183
Figure 2. Total number of undergraduates mentored by plant biologists of different academic ranks at land-grant universities, other research universities, and primarily undergraduate institutions (PUIs) ...... 184
Figure 3. Percentages of plant biologists of different academic rank at land-grant universities, other research universities, and primarily undergraduate institutions (PUIs) who perceive institutional incentives for mentoring undergraduate researchers ...... 184
Appendix 3-C
Figure 1. Average levels of student involvement in typical teaching-related activities ...... 198
Figure 2. Average levels of student involvement in typical research-related activities ...... 198
Figure 3. Student perceptions of their research and/or teaching experience ...... 199
Chapter 1
Introduction
The ultimate goal of this dissertation was to identify transcripts that are systemically
up-regulated in response to fire damage in tomato plants. In order to accomplish this task,
several advances for sequencing-based methods of gene expression analysis had to be
developed and refined before meaningful analysis of a subtractive cDNA library could be
achieved. In Chapter 2, methods for improving sequence quality control and identifying
false sequences are presented. A method for identifying adaptor contaminants was
developed and used to identify over 78,000 false sequences in GenBank. One of the many
contaminated sequences was from the human p8/com1 gene, which has implications for
research on breast cancer. Other types of sequence contamination include sequences from
vectors and foreign organisms (pathogens, etc.), which were found in several thousand
locations in the indica rice genome. In Chapter 3, a novel method for identifying and
evaluating housekeeping genes using sequence data is presented. Using this method with
tomato sequences, relative expression analyses for 127 potential housekeeping control
transcripts were performed. These analyses provided potential housekeeping transcripts
which were used for real-time RT-PCR experiments later in the dissertation (Chapter 5).
In order to characterize the array of transcripts which systemically accumulate in
plants after fire damage, a subtractive cDNA library was used for their isolation and
identification, and these are described in Chapters 4-6. Chapter 4 (with Appendix 1) presents
the identification and characterization of 23 transcripts which encode all 13 subunits of
vacuolar H+-ATPases in tomato plants. This study stemmed from the discovery that one of the transcripts from the library encoded a c subunit of vacuolar H+-ATPase. In Chapter 5
(with Appendix 2), the library served as a starting point to identify and characterize 9 novel tomato transcripts systemically up-regulated in leaves in the first hour after a distant leaf is
flame wounded. Real-time RT-PCR using leaf RNA isolated at different times after flaming
showed that the most common pattern of transcript accumulation was an increase within 30
to 60 minutes, followed by a return to basal levels within 3 hours. Expression analyses also
showed that most up-regulated transcripts were already present in unwounded tissues.
Structural and functional predictions were also performed for each of the 9 novel transcripts.
In Chapter 6, a total of 46 different transcripts are described which were identified from the subtractive cDNA library. Compared with the entire tomato transcriptome, these 46 wound-up-regulated transcripts are very highly conserved. The vast majority fell into 5 classes: enzymes of general metabolism; protein synthesis, modification, and transport; transcription; membrane transport; and photosynthesis and respiration. At least half of the transcripts have been previously associated with wounding or stress, suggesting that the systemic response to fire damage has components similar to those of other wound and stress responses. On the other hand, 30% of transcripts were associated with photosynthesis and respiration, suggesting that part of the response to fire damage is notably different from other wound and stress responses. In addition to furthering knowledge on systemic responses to fire damage,
Chapters 4-6 (and Appendices 1 and 2) demonstrate how sequence data can be used simultaneously for gene discovery and expression analyses.
In Chapter 7, conclusions and future directions are provided for gene expression analyses using sequence data and for the biology of systemic responses to fire damage.
Future directions include a universal sequencing-based method of gene expression analysis,
as well as experiments to address whether or not the 46 transcripts lead to proteins which
actually function during the systemic response to fire damage.
Appendix 3 presents several educational studies on how to involve undergraduates
and high school students in research projects such as the ones presented in this dissertation.
The overall flow of work for this dissertation is shown in Figure 1. Work began with a subtractive cDNA library containing tomato transcripts up-regulated during a systemic response to flame wounding. From the subtractive cDNA library, tomato cDNA fragments were isolated and sequenced. The sequences were then screened for various types of contamination (using methods developed in Chapter 2). Blast searches of GenBank databases allowed the sequences to be divided into 3 classes based on their similarity to known genes: known tomato genes, homologous to known genes (but not known in tomato), and unidentifiable. The cDNA fragments which were unidentifiable were then analyzed in much more detail. Using expressed sequence tags (ESTs) in public databases, the full-length open reading frames of the transcripts were pieced together with the aid of bioinformatics tools. These full-length sequences were then checked experimentally by building PCR primers, amplifying them from a cDNA sample, and sequencing. The ESTs from public databases were also used to perform expression analyses. Using the full-length open reading frame sequences, extensive bioinformatics work was performed to predict the structures and functions of the putative proteins. Finally, real-time RT-PCR was performed over a 6 hour time course after flame wounding to better understand the kinetics of transcript accumulation. Housekeeping controls which were used in real-time RT-PCR experiments were chosen using the methods presented in Chapter 3.
[Figure 1 flow diagram. Subtractive cDNA library of tomato genes up-regulated during a systemic wound response; clone isolation and sequencing; sequence quality control (VecScreen, bacterial database searches); Blast searches of GenBank, which divide the ESTs into ESTs from known tomato genes, ESTs homologous to known genes, and unidentifiable ESTs; relative expression analysis using the TIGR TGI and sequence extension using the TIGR TGI; sequence verification (PCR and sequencing); real-time RT-PCR (6 hr. time course) with housekeeping controls; Blast searches of GenBank; protein family analysis (PROSITE, Pfam, PRINTS, ProDom, SMART, TIGRFAMS); structural analysis of localization signals (TargetP), transmembrane regions (PHDhtm, HMMTOP), alpha helices / beta sheets (PROFsec), interacting proteins (DIP), and coiled-coils / leucine zippers (COILS, 2ZIP).]
Figure 1. Strategy to identify and analyze cDNAs up-regulated in tomato leaf tissue during a systemic wound response to fire damage. Chapter 2 addresses issues of sequence quality control (light gray), Chapter 3 deals with selection of housekeeping controls (dark gray), and Chapters 4-6 present analyses beginning with the subtractive cDNA library and extending the length of the flow diagram.
Chapter 2
Sequence Quality Control
Jeffrey S. Coker and Eric Davies
Eric Davies provided guidance and editorial assistance.
This chapter is divided into three separate papers. Data associated with the first paper were reported to the National Center for Biotechnology Information in 2001, leading to the correction of numerous RefSeqs (curated gene sequences). The first paper has been accepted for publication in Biotechniques, and the second will be submitted. The third paper was published in 2002 in the journal Cancer Research 62, 4164-4165, and led to the correction of the human p8 cDNA sequence in GenBank.
Identifying adaptor contamination when mining DNA sequence data
Jeffrey S. Coker and Eric Davies
Department of Botany, North Carolina State University, Campus Box 7612, Raleigh, North Carolina 27695. email: [email protected]
Abstract
Meaningful analysis of DNA sequences depends on the accuracy of the sequences themselves, and so false sequences in public databases are a major concern for bioinformatics research. We describe a simple screen which has identified adaptor contamination in over
78,000 eukaryotic sequences in GenBank. Most of these entries were found in the GenBank
EST databases, but 4,528 were found in the GenBank/EMBL/DDBJ/PDB “nr” database. Out of a subset of 210 contaminated “nr” database entries, adaptor sequence was present in 82
(39%) as part of a gene or cDNA and in 11 (5%) as part of an open reading frame. Adaptor contamination was found to extend beyond public databases since 108 of the 210 “nr” entries are linked to peer-reviewed publications. Bioinformatics work which uses data mined from public sequence databases should include a simple check for adaptor contamination.
Detection of adaptor sequence contamination is made far easier by knowing that over 99% of adaptor contaminants appear near the ends of sequences, are flanked by vector, or involve adaptor dimerization.
Analysis of DNA sequences can only be as correct as the sequences themselves, and so contamination in public databases is a major concern for bioinformatics research. Here we describe a simple screen which identified adaptor contamination in over 78,000 eukaryotic sequences in GenBank. Awareness that over 99% of adaptor contaminants appear near the ends of sequences, are flanked by vector, or involve adaptor dimerization allows the detection of 99% of these sequences (Fig. 1).
A contaminated sequence is defined as “one that does not faithfully represent the genetic information from the biological source organism/organelle because it contains one or more sequence segments of foreign origin” (http://www.ncbi.nlm.nih.gov/VecScreen/contam.html). Sources of contamination for nuclear DNA and cDNA include vector sequence (1-6), plasmid vector insertion sequences (7), impure tissue sources (8), faulty laboratory protocols (9-10), mitochondrial DNA (11), and ribosomal DNA/RNA (12). There is one published account of contamination due to adaptor sequences, where it was shown that commercial adaptor sequences matched the 5’ or 3’ end of 728 GenBank and EMBL sequences (13). Strategies to decrease contamination in database sequences have emphasized vector sequences (4-6, 8) and given little attention to adaptor contamination.
An adaptor is a short oligonucleotide that is ligated to the ends of cDNAs for incorporation into a vector cloning site (Fig. 1). Usually adaptors consist of several restriction sites, one blunt end (for ligation to cDNA), and one cohesive end (for ligation to a vector). Adaptors are frequently used in the construction of cDNA libraries and in generating cDNA ends using RACE (rapid amplification of cDNA ends) PCR.
The presence of adaptor sequences in organismal sequences in public databases has the potential to cause many different errors of interpretation (14,15), which include the following:
False hits for others using public databases.
Added difficulties in identifying genes and joining contigs.
Misconstruction of PCR primers, microarrays, probes, etc.
Incorrect conclusions regarding evolution and differences between organisms.
Incorrect conclusions about gene structure, mRNA splicing, and mRNA transport.
Incorrect conclusions about protein sequence, structure, transport, and function.
To investigate adaptor contamination in public databases, BLASTn searches of
GenBank (release 140.0; Feb. 15, 2004) eukaryotic sequences were performed using the
search parameters shown in Table 1. The search parameters returned perfect matches (100%
identity) with the respective adaptor sequences (Table 1). It should be noted that 3 separate
searches of the EST databases were performed for Stratagene Zap and Clontech P1/PN1
adaptors (human, mouse, and non-human/mouse ESTs were searched separately using the E-
values in Table 1) because searching all ESTs simultaneously returned more hits than the
server could process. Manual review of individual GenBank entries, literature review, and
personal communications were used to investigate several hundred matches further.
GenBank entries with adaptor contamination were also screened for vector contamination
using VecScreen (www.ncbi.nlm.nih.gov/VecScreen/VecScreen.html), the tool commonly
used to screen GenBank submissions.
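The perfect-match criterion used in these searches can be illustrated with a small sketch. It is not the BLASTn procedure itself: it substitutes a plain exact-string scan of both strands, which only approximates what a 100% identity filter returns. The adaptor sequences are taken from Table 1; the function names and the coordinate convention are assumptions made for illustration.

```python
# Adaptor sequences as listed in Table 1 of this chapter.
ADAPTORS = {
    "Clontech P1/PN1": "TCGAGCGGCCGCCCGGGCAGGT",
    "Clontech P2/PN2": "AGGGCGTGGTGCGGAGGGCGGT",
    "Clontech EcoRI": "AATTCGCGGCCGCGTCGAC",
    "Promega EcoRI": "AATTCCGTTGCTGTCG",
    "Stratagene/Amersham Pharmacia EcoRI/NotI": "AATTCGCGGCCGC",
    "Stratagene ZAP": "AATTCGGCACGAG",
    "Stratagene ZAP (dimer)": "CTCGTGCCGAATTCGGCACGAG",
    "Life Technologies 3' RACE": "GGCCACGCGTCGACTAGTAC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_adaptor_hits(seq, adaptors=ADAPTORS):
    """Return (adaptor name, strand, start, end) for each exact match.

    Coordinates are 0-based, half-open, and refer to the sequence as given.
    This is a hypothetical helper, not part of any BLAST toolkit.
    """
    seq = seq.upper()
    hits = []
    for name, adaptor in adaptors.items():
        patterns = [("+", adaptor)]
        rc = reverse_complement(adaptor)
        if rc != adaptor:  # a palindromic sequence (the ZAP dimer) is searched once
            patterns.append(("-", rc))
        for strand, pattern in patterns:
            start = seq.find(pattern)
            while start != -1:
                hits.append((name, strand, start, start + len(pattern)))
                start = seq.find(pattern, start + 1)
    return hits
```

Because the Stratagene ZAP adaptor is contained in its own dimer, a dimer-containing entry is reported under both names; this mirrors the parenthesized dimer counts in Table 1, which are not added to the totals.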
The searches and subsequent analyses identified over 78,000 contaminated sequences
in GenBank (Table 1). Most contaminated sequences were found in the GenBank expressed
sequence tag (EST) database, but the “nr” database (which contains annotated genes, etc.)
also contained 4,528 false sequences. There were also a large number of shorter matches with adaptors that were not included when using the search parameters in Table 1, making it evident that the actual number of contaminated sequences is much higher than shown in Table 1. Simply increasing the E-value will return these shorter matches.
Within the contaminated GenBank sequences, over 99% of adaptors were within 50 bp of an end, contiguous with a vector sequence match as shown by VecScreen, or involved in dimerization (Fig. 1). The majority of matches not near the 5’ or 3’ end involved dimerization of Stratagene’s ZAP adaptor as shown in Figure 1. We performed BLASTn searches using the full sequences of many GenBank entries that included putative dimer sequences in the gene or cDNA sequence. In these searches, some GenBank entries typically matched the query on one side of the dimer while entirely different entries matched the other side, suggesting that the query sequences actually contained two unrelated sequences that were joined via dimerization. This has the potential to create significant errors, especially since the dimer is often in the middle of sequences where it is more likely to be interpreted as part of the open reading frame.
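The three categories described above can be expressed as a simple decision rule. The sketch below assumes that adaptor coordinates and vector-match spans (for example from a VecScreen report) are already available; the function name, argument names, category labels, and the order in which the categories are tested are illustrative assumptions, and only the 50 bp window comes from the text.

```python
def classify_adaptor_hit(seq_len, hit_start, hit_end, vector_spans=(),
                         is_dimer=False, end_window=50, max_gap=0):
    """Assign an adaptor match to one of the categories of Fig. 1.

    Coordinates are 0-based and half-open. `vector_spans` is an iterable of
    (start, end) vector matches supplied by the caller (hypothetical input).
    """
    # 1) Within `end_window` bp of the 5' or 3' end of the entry.
    if hit_start <= end_window or seq_len - hit_end <= end_window:
        return "near_end"
    # 2) Contiguous with (touching or overlapping) a vector match.
    for v_start, v_end in vector_spans:
        if v_start <= hit_end + max_gap and v_end >= hit_start - max_gap:
            return "flanked_by_vector"
    # 3) Part of an adaptor dimer.
    if is_dimer:
        return "dimer"
    # Anything else needs manual review.
    return "unexplained"
```

For instance, `classify_adaptor_hit(1000, 500, 513, vector_spans=[(513, 600)])` falls in the vector-flanked category because the vector match begins exactly where the adaptor ends.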
A subset of 210 matches (from the “nr” database) with Clontech’s Marathon primer adaptors were examined more closely. These adaptors are part of Clontech’s suppression subtractive hybridization procedure (U.S. patents 5,565,340 and 5,759,822) used originally to make cDNA libraries and probes (16,17). Currently, a single 44 bp adaptor (P1/PN1) is used in both Marathon and PCR-Select products. The first guanine residue in P1 has been changed to a cytosine in recent Clontech kits.
P1: STAATACGACTCACTATAGGGC    PN1: TCGAGCGGCCGCCCGGGCAGGT
In the first Clontech libraries utilizing this technology, a second adaptor (P2/PN2) was also used (16).
P2: TGTAGCGTGAAGACGACAGAA    PN2: AGGGCGTGGTGCGGAGGGCGGT
Of 210 matches with Clontech Marathon adaptors, at least 82 (39%) are contaminated in
regions designated as gene or cDNA sequence, including 11 open reading frames (5%).
Through literature review and personal communications, we confirmed that Clontech
protocols had been used. Published literature shows these false sequences appearing in
transposons, protein sequences, regions used to join contigs, and other biologically relevant
regions. In fact, we found published accounts of (unrecognized) contaminated sequence in
most major journals of genetics and molecular biology.
The recognition of adaptor contamination has the potential to resolve many problems
in the literature (14,15). It is expected that removing adaptor contamination will clarify
many gene sequences as individual labs reinterpret their own sequences, and will prevent
those mining data from amplifying such errors.
Acknowledgments
We thank the scientists who corresponded with us regarding their GenBank entries,
Sophia Clotho for advice, Ron Sederoff for critical review, and staff at NCBI for their
correspondence.
References
1. Lamperti, E.D., J.M. Kittelberger, T.F. Smith, and L. Villakomaroff. 1992. Corruption of genomic databases with anomalous sequence. Nucl. Acids Res. 20:2741-2747.
2. Lopez, R., T. Kristensen, and H. Prydz. 1992. Database contamination. Nature 355:211.
3. Reynolds, T.L. 1994. Vector DNA artifacts in the nucleotide-sequence database. Biotechniques 16:1124-1125.
4. Harger, C., M. Skupski, J. Bingham, A. Farmer, S. Hoisie, P. Hraber, D. Kiphart, L. Krakowski, et al. 1998. The Genome Sequence DataBase (GSDB): improving data quality and data access. Nucl. Acids Res. 26:21-26.
5. Miller, C., J. Gurd, and A. Brass. 1999. A RAPID algorithm for sequence database comparison: application to the identification of vector contamination in the EMBL databases. Bioinformatics 15:111-121.
6. Seluja, G.A., A. Farmer, M. McLeod, C. Harger, and P.A. Schad. 1999. Establishing a method of vector contamination identification in database sequences. Bioinformatics 15:106-110.
7. Binns, M. 1993. Contamination of DNA database sequence entries with Escherichia coli insertion sequences. Nucl. Acids Res. 21:779-779.
8. White, O., T. Dunning, G. Sutton, M. Adams, J.C. Venter, and C. Fields. 1993. A quality-control algorithm for DNA-sequencing projects. Nucl. Acids Res. 21:3829-3838.
9. Gersuk, V.H. and T.M. Rose. 1993. Database contamination. Science 260:606.
10. Dean, M. and R. Allikmets. 1995. Contamination of cDNA libraries and expressed-sequence-tags databases. Am. J. Hum. Genet. 57:1254-1255.
11. Wenger, R.H. and M. Gassmann. 1995. Mitochondria contaminate databases. Trends Genet. 11:167-168.
12. Gonzalez, I.L. and J.E. Sylvester. 1997. Incognito rRNA and rDNA in databases and libraries. Genome Res. 7:65-70.
13. Yoshikawa, T., A.R. Sanders, and S.D. Detera Wadleigh. 1997. Contamination of sequence databases with adaptor sequences. Am. J. Hum. Genet. 60:463-466.
14. Coker, J.S. and E. Davies. 2002. Correspondence re: A.H. Ree et al., Expression of a Novel Factor in Human Breast Cancer Cells with Metastatic Potential (Cancer Res., 59: 4675-4680, 1999). Cancer Res. 62:4164-4165.
15. Forster, P. 2003. To err is human. Annals of Human Genetics 67:2-4.
16. Diatchenko, L., Y-F. Chris Lau, A.P. Campbell, A. Chenchik, F. Moqadam, B. Huang, S. Lukyanov, K. Lukyanov, et al. 1996. Suppression subtractive hybridization: A method for generating differentially regulated or tissue-specific cDNA probes and libraries. Proc. Natl. Acad. Sci. USA 93:6025-6030.
17. Jin, H., X. Cheng, L. Diatchenko, P.D. Siebert, and C.C. Huang. 1997. Differential screening of a subtracted cDNA library: a method to search for genes preferentially expressed in multiple tissues. Biotechniques 23:1084-6.
Table 1. Sequences and search parameters to identify entries in GenBank contaminated by 7 commercial adaptor sequences.
                                                                  Detected by Search parameters                     Matches in Eukaryota
Adaptor                                   Sequence to search      VecScreen?  Filter  E-value  Word size  Identity  nr database  EST database
Clontech P1/PN1                           TCGAGCGGCCGCCCGGGCAGGT  Yes         none    1        7          100       255          11655
Clontech P2/PN2                           AGGGCGTGGTGCGGAGGGCGGT  No          none    1        7          100       13           705
Clontech EcoRI                            AATTCGCGGCCGCGTCGAC     Yes         none    0.05     7          100       156          15071
Promega EcoRI                             AATTCCGTTGCTGTCG        No          none    5        7          100       120          1167
Stratagene/Amersham Pharmacia EcoRI/NotI  AATTCGCGGCCGC           No          none    150      7          100       765          16196
Stratagene ZAP                            AATTCGGCACGAG           No          none    150      7          100       3166         28830
Stratagene ZAP (dimer)                    CTCGTGCCGAATTCGGCACGAG  No          none    0.005    7          100       (778)        (24106)
Life Technologies 3' RACE                 GGCCACGCGTCGACTAGTAC    Yes         none    10       7          100       53           66
Total                                                                                                               4528         73690  (= 78218)
[Figure 1 diagram: a cDNA flanked by adaptors inside a bacterial plasmid, with the sequencing start site marked, leading to unedited sequences 1, 2, and 3, which illustrate the 3 types of adaptor contamination: 1) 5’ or 3’ end, 2) flanked by vector, 3) adaptor dimers. Stratagene ZAP adaptor: AATTCGGCACGAG / GCCGTGCTC. Dimer sequence: CTCGTGC CG AATTCGGCACGAG / GAGCACGGCTTAA GCCGTGCTC.]
Figure 1. The path from sequencing a cDNA to an improperly edited sequence. More than 99% of sequences contaminated with adaptors fall into one of the 3 groups shown at the bottom.
Cleaning data mined from the indica rice genome draft
Jeffrey S. Coker and Eric Davies
Department of Botany, North Carolina State University, Campus Box 7612, Raleigh, North Carolina 27695. email: [email protected]
Filtering out false sequences is a challenge for every genome project. Because the
Oryza sativa L. ssp. indica genome draft (1) is a major resource for efforts to improve the world food supply, its accuracy is of paramount importance and thus needs to be scrutinized very closely. The analysis presented here is intended especially for those mining data from the indica genome, and indicates false sequences of three different types: short (< 21 bp) remnants of SmaI-linearized pUC18 plasmid, regions of other cloning vector(s), and genomic sequence from an unidentified species of Phytophthora.
Recommendations are given for how to identify each type of false sequence when using data mined from the indica genome draft. Removal of false sequences is necessary to avoid errors in calculating polymorphism rates, gene discovery, estimating lateral gene transfer, and many other forms of bioinformatics research.
SmaI-linearized pUC18 plasmid
It was reported that a SmaI-linearized pUC18 plasmid was used for cloning rice genomic fragments (1), and thus it follows that each rice sequence would have been flanked by pUC18 before the sequence was “cleaned”. We have found that short remnants of pUC18 are still scattered throughout the indica genome. As shown in Table
1, 98% of matches with the pUC18 SmaI site (≥14 bp) in both the unassembled data and fully masked reads end within 5 bp of a 5’ or 3’ end. All but four sequences in the unassembled data and one fully masked read are within 15 bp of an end. This suggests
that the vast majority of matches with the pUC18 SmaI site derive from cloning vector
and are not genuine rice sequences. Peripheral contaminants in unassembled data are not
a problem as long as they are removed before assembly.
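The end-proximity measurement summarized in Table 1 can be sketched as follows. The code assumes, in line with the fragment series described for Fig. 1 (20 bp, 19 bp, 18 bp, and so on), that contaminants are fragments of the pUC18 site anchored at its SmaI end, and it replaces the BLAST search with an exact-match scan. Function names and the coordinate convention are hypothetical.

```python
PUC18_SMAI_SITE = "GTCGACTCTAGAGGATCCCC"  # sequence given in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def _forward_fragments(seq, site, min_len):
    """Find fragments of `site` that keep its last `min_len` bases intact."""
    core = site[-min_len:]
    found = []
    pos = seq.find(core)
    while pos != -1:
        start = pos
        k = len(site) - min_len  # number of site bases left of the core
        # Extend leftward while the preceding bases still match the site.
        while start > 0 and k > 0 and seq[start - 1] == site[k - 1]:
            start -= 1
            k -= 1
        found.append((start, pos + min_len))
        pos = seq.find(core, pos + 1)
    return found


def find_site_fragments(seq, site=PUC18_SMAI_SITE, min_len=14):
    """Return sorted (start, end, strand) fragments on both strands (0-based, half-open)."""
    seq = seq.upper()
    n = len(seq)
    hits = [(s, e, "+") for s, e in _forward_fragments(seq, site, min_len)]
    rc = reverse_complement(seq)
    hits += [(n - e, n - s, "-") for s, e in _forward_fragments(rc, site, min_len)]
    return sorted(hits)


def distance_to_nearest_end(seq_len, start, end):
    """Number of bases separating a fragment from the closer sequence end."""
    return min(start, seq_len - end)
```

A fragment would be counted as terminal, in the sense of Table 1, when `distance_to_nearest_end` returns 5 or less.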
A much more significant problem occurs when these contaminants become
internalized as sequences are joined together. Table 2 shows examples of internalized
pUC18 artifacts which were found in the scaffolds listed in Table 1. The ratio of
internalized contaminants to total contaminants leads us to conclude that 5-7% of
peripheral contaminants were internalized during contig/scaffold construction. Each
scaffold in Table 2 matches japonica rice entries in GenBank directly before and after the short region in question but not within it, proving that each is a false sequence. For example, Scaffold 9177 (GenBank acc. no. AAAA01009177) contains a pUC18 fragment at 6913 bp, and matches japonica sequences on both sides of the fragment (Table 2).
Although the pUC18 fragment is only 20 bp long, the “hole” in the indica sequence
(compared to japonica) is 517 bp long. There are many examples of such holes which are clearly not biological in origin. From a comparison of Chromosome 4 between indica and japonica, it has been suggested that japonica sequence may be “larger” because of insertions of transposable elements, and the average frequency of single-nucleotide polymorphisms is 1 SNP per 268 bp (3). However, since many apparent insertions and SNPs are due to the presence of false sequences and holes in the indica draft, such conclusions about differences between indica and japonica may be premature.
Since contamination by 14-20 bp fragments is present, a much larger number of scaffolds are expected to contain 1-13 bp bits of the pUC18 SmaI site. For instance, random chance would furnish only 4.5 matches with the 13 nucleotide sequence preceding the SmaI site (CTAGAGGATCCCC), but indica scaffolds have 1274 matches, while japonica has only 10 (2). Comparing the number of possible pUC18 artifacts (7-20 bp) with the number of matches one would expect by chance (E-values) leads to a prediction of over 13,000 contaminants (Fig. 1), or 0.029% of the total contig length. The 7-20 bp pUC18 fragments alone (not including 1-6 bp fragments and the “holes” they often represent) could account for 14% of the SNPs (1 SNP per 269 bp) between indica and japonica (3).
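The logic of comparing observed matches with chance expectation can be illustrated with a rough calculation. The sketch assumes a uniform random sequence with equal base frequencies, which is a simplification and will not reproduce BLAST Expect values exactly; the function names are hypothetical, and any target length supplied by the reader is an assumption rather than a figure from this chapter.

```python
def expected_chance_matches(target_length_bp, k, both_strands=True):
    """Expected number of exact k-mer matches in a uniform random sequence.

    Each of the (target_length_bp - k + 1) start positions matches a given
    k-mer with probability 1 / 4**k; searching both strands doubles the count.
    """
    positions = max(target_length_bp - k + 1, 0)
    strands = 2 if both_strands else 1
    return strands * positions / 4 ** k


def excess_over_chance(observed, expected):
    """Matches not explained by chance (floored at zero)."""
    return max(observed - expected, 0)


# Using the figures quoted in the text for the 13 nucleotide sequence:
# 1274 observed matches in indica scaffolds versus 4.5 expected by chance.
indica_excess = excess_over_chance(1274, 4.5)
```

Summing such excesses over each fragment length from 7 to 20 bp is one way to arrive at an estimate of the total number of contaminants of the kind plotted in Fig. 1.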
For those mining data from the indica rice genome, we recommend the following steps:
1) Search all sequences for fragments of the pUC18 SmaI site (GTCGACTCTAGAGGATCCCC).
2) Remove the pUC18 sequences when they occur at the end(s).
3) For internal pUC18 matches, take 200-500 bp of sequence surrounding each possible pUC18 artifact and Blast it against japonica and/or other rice sequences in GenBank.
If the region is not genuine rice sequence, the retrieved sequences may match indica on either side of the SmaI site, but will not match indica within the SmaI site. Closer examination usually reveals a “hole” in the indica sequence ranging from 10 bp to several thousand base pairs. Data miners should also be aware that every pUC18 contaminant that is at least 12 bp contains a potential false “STOP” site (TAG) from base 10 to 12.
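Steps 2 and 3 of the recommendations above can be sketched as two small helpers, assuming the coordinates of each pUC18 fragment have already been found in step 1. The helper names are hypothetical; the 5 bp end window is borrowed from Table 1 and the 300 bp default flank is an arbitrary choice within the recommended 200-500 bp range. The Blast comparison itself is not shown.

```python
def trim_terminal_fragment(seq, start, end, end_window=5):
    """Step 2: remove a fragment lying within `end_window` bp of either end.

    The fragment and any bases between it and that end are discarded.
    Fragments farther from both ends are left untouched.
    """
    if start <= end_window:
        return seq[end:]
    if len(seq) - end <= end_window:
        return seq[:start]
    return seq


def flanking_window(seq, start, end, flank=300):
    """Step 3: return (left, right, subsequence) around an internal fragment.

    The window is clipped at the sequence boundaries, so fragments close to
    an end yield a shorter query.
    """
    left = max(start - flank, 0)
    right = min(end + flank, len(seq))
    return left, right, seq[left:right]
```

The subsequence returned by `flanking_window` is what would be submitted as the query against japonica; a match that covers both flanks but skips the fragment marks the region as suspect.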
Regions of other cloning vector(s)
It appears that vectors other than pUC18 were also used for indica library construction. In some cases, matches with a particular vector appear on both ends of a scaffold and correspond with a restriction site in that vector. For example, over 100 bp of Life Technologies pZL1 from Lambda ZipLox (or a similar vector) is at the ends of at least 25 scaffolds (e.g. Scaffold 89563) (4). In other cases such as Scaffolds 39078 (1276 bp), 45670 (1105 bp), and 82154 (691 bp), entire indica scaffolds are 99-100% identical to several dozen common vectors but match no rice sequences in GenBank or Syd (2). In other more ambiguous cases (e.g. Scaffold 101296), scaffolds are near perfect matches with both vectors and rice ESTs in GenBank, but still match nothing in Syd. Judging by the large size of these matches, it is unlikely that all vectors used in library construction were accounted for in decontamination screens.
For those mining data from the indica genome, we recommend that sequences of particular interest are compared to the VecScreen database (4) and/or bacterial databases.
Phytophthora
Phytophthora are well-known stramenopiles that commonly parasitize a wide variety of plant species. There are several dozen indica scaffolds that match
Phytophthora sequences but do not closely match sequences either in japonica or any other higher plant (Table 3). For example, Scaffold 45690 (Contig 77125) has 99.7% identity with 1107 bp of P. infestans mitochondrial DNA coding for three ribosomal proteins, but has no significant match with any plant sequence. Searches of indica identified 226 scaffolds that match GenBank Phytophthora sequences with an E-value of 1 × 10^-10 or lower (5). Many of these may be highly conserved rice sequences and not from Phytophthora. Even so, since it is evident that there are sequences from
Phytophthora present (Table 3) and no Phytophthora genome has been completely sequenced, these and perhaps many other scaffolds must be reassessed.
There are three possible explanations for Phytophthora-like sequences in the indica genome: pathogen-infected tissue, cross-contamination of libraries, and lateral gene transfer. It is quite possible that pathogen-infected rice tissue was used for DNA isolation since pathogens are notoriously prevalent in plant tissue. The more exciting explanation would be lateral gene transfer after the divergence of indica from japonica. However, we are unaware of any example of simultaneous lateral gene transfer of nuclear genes encoding mRNA (e.g. ric1 and actA) and rRNA (e.g. 18S), and mitochondrial genes encoding mRNA (e.g. rpl2, rps19, and rps3) and rRNA (e.g. 16S rRNA), all of which seem to be present in indica (Table 3).
For those mining data from the indica genome, we recommend that sequences of particular interest are compared to Phytophthora and japonica sequences (including
ESTs). Contaminants will be nearly identical to Phytophthora sequences (if they have been sequenced in Phytophthora). On the other hand, if the indica sequence is nearly identical to a japonica sequence, then it is not likely to be a contaminant.
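The comparison recommended above amounts to a simple rule, sketched here. The text specifies only "nearly identical", so the 0.98 identity threshold is an assumption chosen for illustration, as are the function name and the category labels. The example identities are taken from Table 3.

```python
def classify_indica_sequence(phytophthora_identity, japonica_identity,
                             near_identical=0.98):
    """Classify a sequence from its best Phytophthora and japonica matches.

    Identities are fractions between 0 and 1, or None when no significant
    match was found. The threshold is a hypothetical cut-off.
    """
    if japonica_identity is not None and japonica_identity >= near_identical:
        return "likely rice"
    if phytophthora_identity is not None and phytophthora_identity >= near_identical:
        return "likely contaminant"
    return "unresolved"


# Scaffold AAAA01045690 (Table 3): 1104/1107 identity to P. infestans,
# no japonica match.
call_45690 = classify_indica_sequence(1104 / 1107, None)

# Scaffold AAAA01055069 (Table 3): 555/584 to P. infestans,
# 832/904 to a japonica cDNA clone.
call_55069 = classify_indica_sequence(555 / 584, 832 / 904)
```

Under this threshold the first scaffold is called a likely contaminant, while the second stays unresolved. Because no Phytophthora genome had been completely sequenced, the absence of a Phytophthora match cannot by itself clear a sequence.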
Conclusions
The indica rice genome draft has already been used to evaluate monocot and eudicot divergence (6), sequence variation between varieties of rice (3, 7), single nucleotide polymorphisms in rice varieties (3, 8), characteristics of various gene families
(9, 10), and many other important topics. It serves as an important resource for improving world food supply and will be used extensively in the future, and so it is critical that those mining the indica genome be aware of its imperfections.
References
1. J. Yu et al., Science 296, 79 (2002); http://210.83.138.53/rice/.
2. S.A. Goff et al., Science 296, 92 (2002); http://portal.tmri.org/rice/.
3. Q. Feng et al., Nature 420, 316 (2002).
4. Kitts, P.A., Madden, T.L., Sicotte, H. & Ostell, J.A. Manuscript in preparation; http://www.ncbi.nlm.nih.gov/VecScreen/VecScreen.html.
5. All GenBank Phytophthora sequences (including ESTs) were searched against the indica genome using MegaBlast. Scaffolds with significant matches were then used to search all GenBank sequences (BLASTn).
6. M. Vincentz et al., Plant Physiol. 134, 951 (2004).
7. C. Li et al., Theor. Appl. Genet. 108, 392 (2004).
8. S. Nasu et al., DNA Res. 9, 163 (2002).
9. S. Griffiths et al., Plant Physiol. 131, 1855 (2003).
10. L. Jia et al., Plant Physiol. 134, 575 (2004).
11. We thank R. Dean and T. Houfek for confirming our Phytophthora findings and for discussing all of our results, J. Xiang for her thoughts on the conservation of rRNA, E. Coker for computer expertise, and Sophia Clotho for her thoughts.
Table 1. Matches in the indica genome with the pUC18 SmaI site (GTCGACTCTAGAGGATCCCC). Matches shown are at least 14 bp long (Expect ≤ 5.7). pUC18 sequences are typically on an end (within 5 bp) of raw genomic sequences such as those in the unassembled data and fully masked reads, but became internalized as contigs and scaffolds were pieced together.
Sequence type       Matches  Matches at a 5’ or 3’ end
Unassembled data    1990     98.1% (1953)
Fully masked reads  4342     98.0% (4255)
Contigs             944      85.9% (811)
Scaffolds           990      70.3% (696)
[Fig. 1 plot: number of sequences (y-axis, 0 to 35000) against length of match with the pUC18 SmaI site (x-axis, 20 down to 7 bp), with two series: matches in the indica genome and Expect value.]
Fig. 1. Matches of 20 bp (GTCGACTCTAGAGGATCCCC), 19 bp (TCGACTCTAGAGGATCCCC), 18 bp (CGACTCTAGAGGATCCCC), etc. in the indica genome corresponding to the pUC18 SmaI site. The expect values approximate the number of hits one would expect by chance, assuming a random genome sequence. This leads to a prediction of over 10,000 contaminants of 7 bp or longer.
Table 2. Examples of internal pUC18 artifacts (≥14 bp) in indica scaffolds. In each case shown, the corresponding japonica sequence matches the indica scaffold directly before and after the artifact. The “holes” in the indica sequences range from 14 to several thousand bp long. All artifacts shown are more than 100 bp from a scaffold end and from any unfilled gaps within scaffolds (designated by a stretch of N's in GenBank). Scaffolds are listed as their GenBank accession numbers (AAAA01 + scaffold number) to facilitate further review.
Scaffold      Length (bp)  Artifact location  Corresponding japonica match
AAAA01000517  40212        6774               AP005289.2
AAAA01000875  33584        15658              AC124836.2
AAAA01000879  34316        29865              AC090484.4
AAAA01001305  30163        10745              AC137634.3
AAAA01001453  29009        8647               AP003282.2
AAAA01002627  22827        863                AC146893.1
AAAA01004136  18264        9608               AC137073.2
AAAA01005244  16035        1108               AP004762.3
AAAA01006321  14292        8637               AC137999.2
AAAA01008123  11429        2056               AE017073.1
AAAA01009177  10884        6913               AP003204.3
AAAA01009685  10424        3987               AL663008.3
AAAA01011822  8659         452                AP003988.2
AAAA01011882  8621         330                AP004262.2
AAAA01011939  8590         1286               AP005002.2
AAAA01014582  6939         1294               AC136520.2
AAAA01015702  6320         5417               AL663018.4
AAAA01018944  4789         176                AC135928.2
AAAA01019811  4431         3969               AE017063.1
AAAA01019999  4366         4148               AP003301.3
AAAA01020286  4259         1857               AL606992.3
AAAA01022160  3609         819                AC137607.2
AAAA01029543  2088         540                AP003518.2
AAAA01054885  966          812                AE017102.1
Table 3. Examples of Phytophthora-like sequences in the indica genome. "Closest" matches are defined as those with the lowest E-value (E<10) in GenBank databases. In all cases shown here, the Phytophthora match spanned the majority of the scaffold and had an effective E-value of 0. Short regions (18-80 bp) on the ends of 8 of these scaffolds are also contaminated by plasmid sequences.
                 Closest match in all organisms                                               Closest match in japonica genome
Indica scaffold  Identity          Acc. No.    Description                                    Identity        Acc. No.    Description
AAAA01045690     1104/1107 (99%)   U17009.2    P. infestans rib. prot. L2, S19, and S3        ------
AAAA01065444     838/844 (99%)     AJ238654.1  P. undulata 18S rRNA gene                      536/617 (86%)   AP004778.3  Genomic DNA, chromosome 2
AAAA01078719     705/709 (99%)     X54265.1    P. megasperma 16S rRNA                         613/715 (85%)   AP004778.3  Genomic DNA, chromosome 2
AAAA01076286     639/647 (98%)     BE776357.1  P. infestans unidentified cDNA                 ------
AAAA01070180     630/633 (99%)     BE776214.1  P. infestans unidentified cDNA                 ------
AAAA01084216     630/636 (99%)     BE777367.1  P. infestans unidentified cDNA                 ------
AAAA01070144     581/584 (99%)     BE775905.1  P. infestans unidentified cDNA                 381/437 (87%)   AK063121.1  cDNA clone:001-111-E07
AAAA01090700     579/587 (98%)     AJ133023.1  P. infestans ric1 gene                         ------
AAAA01091080     556/557 (99%)     U50844.1    P. infestans host-specific elicitor inf1 gene  ------
AAAA01082659     556/559 (99%)     BE776610.1  P. infestans unidentified cDNA                 ------
AAAA01086249     557/567 (98%)     BE776104.1  P. infestans unidentified cDNA                 ------
AAAA01055069     555/584 (95%)     BE776247    P. infestans unidentified cDNA                 832/904 (92%)   AK060330.1  cDNA clone:001-008-B01
AAAA01049644     489/498 (98%)     BE777164    P. infestans unidentified cDNA                 ------
AAAA01063300     444/445 (99%)     M59715.1    P. infestans actin (actA) gene                 387/457 (84%)   AK059967.1  cDNA clone:006-211-F12
AAAA01102792     237/237 (100%)    AF339424.1  P. infestans 5.8S rRNA (and spacer)            ------
Chapter 3
Selection of Candidate Housekeeping Controls in Tomato Plants using EST Data
Jeffrey S. Coker and Eric Davies
Eric Davies provided guidance and editorial assistance.
This chapter was published in 2003 in the journal Biotechniques 35, 740-748. It is currently being considered for a patent under the title “Method for Identifying Constantly Expressed Genes Using Nucleic Acid Sequence Data” (NCSU Disclosure File Number 04-064).
Chapter 4
Identification, Conservation, and Relative Expression of V-ATPase cDNAs in Tomato Plants
Jeffrey S. Coker, Derek Jones, and Eric Davies
Derek Jones assisted in mining data for c subunit cDNAs. Eric Davies provided guidance and editorial assistance.
This chapter was published in 2003 in the journal Plant Molecular Biology Reporter 21, 145-158.
Chapter 5
Identification, Accumulation, and Functional Prediction of Novel Tomato Transcripts Systemically Up-regulated after Fire Damage
Jeffrey S. Coker, Alan Vian, and Eric Davies
Alan Vian constructed the subtractive cDNA library. Eric Davies provided guidance and editorial assistance.
This chapter has been submitted for publication.
Abstract
Despite the major impacts of fire on plants, responses to fire damage have not been closely studied on the level of gene expression. Here we present analyses of novel transcripts from tomato (Lycopersicon esculentum) which are systemically up-regulated in leaves after a distant leaf is wounded by flame. Nine cDNA fragments were isolated from a subtractive cDNA library of leaf tissue 1 hour after flaming. Using data mining and PCR, full-length open reading frames were predicted, amplified, and then sequenced.
Comparisons with the Arabidopsis genome suggested that 8 of the encoded proteins are slow-evolving. Real-time RT-PCR using leaf RNA after flaming confirmed the systemic accumulation of 4 and 7 transcripts within 30 and 60 minutes, respectively, before returning to basal levels within 3 hours. During this same time course, proteinase inhibitor I levels gradually increased over 30-fold in 6 hours. Expression analyses also showed that 8 of the transcripts are present in unwounded leaf, stem, and root tissues.
The predicted proteins include an acyl carrier, adenylyl sulfate reductase, PS II oxygen-evolving complex protein 3, anion:sodium symporter, chloroplast-specific ribosomal protein, a histidine triad family protein, and an unknown wound/stress-related protein.
Homologues of several of these proteins have been associated with other types of wound and stress responses. It appears that within an hour after being damaged by fire, plants systemically up-regulate a variety of genes involved with basic cell metabolism and upkeep, in addition to classic defense genes such as proteinase inhibitor
Introduction
Plants must cope with a wide variety of natural wounding stimuli such as fire, herbivory, wind, rain, hail, UV radiation, sand, and trampling. Because plants are sessile and cannot escape these stimuli, to ensure survival they often respond to tissue damage by changes in gene expression (Graham et al., 1986; Braam and Davis, 1990; Schaller and Ryan, 1996; León et al., 2001) in both damaged tissues (local responses) and in undamaged tissues (systemic responses). Many “systemic wound response proteins”
(Schaller and Ryan, 1996), which are expressed in undamaged tissues following the intercellular transmission of a wound signal, have been previously identified in tomato plants. These include proteinase inhibitors (Green and Ryan, 1972), systemin (Pearce et al., 1991), an aspartic protease (Schaller and Ryan, 1996), chloroplast mRNA-binding protein (Vian et al., 1999), a bZIP DNA-binding protein (Stanković et al., 2000), allene oxide synthase and fatty acid hydroperoxide lyase (Howe et al., 2000), and others.
Further characterization of the array of systemically up-regulated genes is necessary to better understand plant defense and stress response mechanisms.
Knowledge of systemically up-regulated genes is also necessary to characterize the intercellular signals that move from wounded to unwounded tissue. Systemic signals that have been proposed include proteinase inhibitor-inducing factor (Ryan, 1974), systemin (Pearce et al., 1991), abscisic acid (Peña-Cortés et al., 1991), oligosaccharides
(Ryan and Farmer, 1991), methyl jasmonate (Herde et al., 1996), action potential
(Stanković and Davies, 1996), and variation potential (Wildon et al., 1992; Vian et al.,
1996). It is clear that the systemic wound response is a complex network(s) induced by many different signals, and that the extent and timing of these signals may vary
significantly depending on the plant species and the precise nature of the wound. For
example, evidence from Arabidopsis microarray experiments suggests that there are
fundamental differences in gene expression in response to mechanical wounding and
insect feeding (Reymond et al., 2000). On the other hand, there is clear evidence for
cross-talk between defense responses such as those that are herbivore- and pathogen-
directed (Stennis et al., 1998). Much about how responses to fire damage compare with
other types of wound responses is unknown.
Fire impacts most terrestrial ecosystems, and plants have evolved mechanisms to
survive fire (Bond and van Wilgen, 1996; DeBano et al., 1998). For example, in the
southeastern United States, shrubs and herbaceous plants in savannas, forests, evergreen
shrub bogs, wire grass sand-hills, swamps, and other ecosystems often survive fires and
are able to resprout and reproduce in future years (Bond and van Wilgen, 1996; DeBano
et al., 1998; Wells, 2002). In fact, some of the most species-rich plant ecosystems (i.e.
the herbaceous groundcover of longleaf pine savannahs) require fire to persist (Platt et
al., 1988; Drewa et al., 2002). A common misconception is that all wildfires kill all
plants in the burned area. The National Park Service has used a 5-tiered “burn severity class” system to describe vegetation damage following a wildfire, which includes undamaged (tier 1), scorched (tier 2; leaf litter is singed and foliage is slightly yellowed),
and low severity (tier 3; leaf litter is partly/mostly consumed but foliage remains intact)
classes (USDI, 1992). Resprouting after fire damage can occur from partially burned
above-ground organs or from roots after complete destruction of above-ground organs.
Despite the major impacts of fire on plants, responses to fire damage have not
been closely studied on the level of gene expression. From an experimental standpoint,
flame causes severe, yet reproducible, damage without moving the plant. Leaf flaming
has already proven useful for identifying novel components of the systemic wound
response to fire such as Pin 1 (Wildon et al., 1992; Stanković and Davies, 1996),
chloroplast mRNA-binding protein (Vian et al., 1999) and a bZIP DNA-binding protein
(Stanković et al., 2000).
To study the impacts of fire damage (flame wounding), tomato plants have
several advantages. First, since extensive work with other wound stimuli has been done
using tomato plants, it is possible to compare flame-induced gene expression with this
previous work. Second, a substantial amount is known about wound signaling events in
tomato plants which will facilitate understanding of the timing of the response. Finally, like many species in the Solanaceae, tomato plants (both wild and cultivated) possess many characteristics which typically allow many herbaceous plants to survive fires.
These characteristics include being a perennial (Taylor, 1986), having carbohydrate reserves stored in underground organs (Peres et al., 2001; Verdaguer and Ojeda, 2002),
and the ability to regenerate shoots from hypocotyls, roots, or other tissues (Takashina et al., 1998; Bertram and Lercari, 2000; Peres et al., 2001). It has been found that smoke extract stimulates the growth of tomato roots in vitro (Taylor and van Staden, 1998), and that growth of species within the Solanaceae can be regulated by fire regimes (Preston and Baldwin, 1999). In addition, a bZIP gene similar to the one we found to be up-regulated by flame-wounding (Stanković et al., 2000) has been associated with adventitious shoot regeneration (Low et al., 2001). Thus, tomato plants are the preferred model system for work on the systemic wound responses to fire damage.
For genes previously examined, the most common pattern of transcript
accumulation in leaf 4 of three-week-old tomato plants following a flame wound on leaf 3 is an increase that peaks within an hour, followed by a rapid decrease (Davies et al.,
1997; Vian et al., 1999). These rapid changes are then followed by a more gradual period of increased, decreased, or unchanged transcript accumulation. This has been shown most vividly for Pin 1 (Stanković and Davies, 1997), CMBP (Vian et al., 1999), and a bZIP DNA-binding protein (Stanković et al., 2000). The complexity of responses to wounding for individual transcripts (rapid increases and decreases) and the variation between transcripts (different time points for increase/decrease) suggests that different genes are being up-regulated by different systemic signals, or combinations of signals.
This cannot be deciphered without characterizing a wider array of transcripts that accumulate systemically following flame wounding.
Here we present analyses of 9 previously unidentified tomato cDNAs which are systemically up-regulated after a distant leaf is wounded by flame. These cDNAs were isolated from a subtractive cDNA library (wound minus control) from tissue harvested one hour after flaming.
Results
Our strategy for identifying and characterizing clones from a subtractive cDNA
library of wound-induced transcripts is shown in Figure 1. Clones from the library were
labeled as “candidates for the systemic wound response” (CSWR). The 9 clones initially
isolated from the cDNA library ranged from 59 to 647 bp and had an average length of
292 bp (Table 1). Attempts to identify them using Blast searches of GenBank were
inconclusive and/or ambiguous. Therefore, we searched expressed sequence tags (ESTs)
in the TIGR Tomato Gene Index (TGI) to identify identical matches and extend the
cDNA sequences using consensus sequence information (Table 1). The resulting putative
cDNAs ranged from 596 to 1830 bp and had an average length of 1048 bp (Table 1). These
putative cDNAs were confirmed by performing PCR (Fig. 2) and sequencing the PCR
products using the primers in Table 2.
Blast searches using the extended sequences returned matches with protein
sequences in GenBank ranging from 43% to 83% identical (Table 1). The putative
translations of all 9 cDNAs suggested full-length proteins which were approximately the
same size as their respective GenBank matches. Therefore, all 9 cDNAs encode proteins similar to those sequenced in other plants, although the exact functions of most are still
unknown.
By comparing tomato Unigenes in the TIGR TGI with the Arabidopsis genome
(using tBlastx), Van der Hoeven et al. (2002) divided tomato ESTs into “not
homologous” (E value ≥ 0.1), “fast-evolving” (1.0E-15 < E value < 0.1), “intermediate
evolving” (1.0E-50 < E value < 1.0E-15), and “slow-evolving” (E value < 1.0E-50)
classes. Only about 22% of all Unigenes fell into the “slow-evolving” class. By
repeating their methodology, we found that 8 of our cDNAs could be considered as
“slow-evolving” and 1 (CSWR-1) as “intermediate-evolving”. This high degree of
conservation could be related to responses to fire damage being ancestral (Bond and van
Wilgen, 1996) and/or incorporating elements of basic cell metabolism/upkeep. Most
tomato genes involved directly in cell rescue, defense, cell death and aging are not fast
evolving as a group (Van der Hoeven et al., 2002). Homologues for 5 of the 9 tomato
cDNAs (CSWR-1, 2, 4, 6, and 8) were found on Arabidopsis chromosome 4, which is
interesting since a high proportion (approximately 12%) of all genes on chromosome 4
have been associated with defense and disease responses (Mayer et al., 1999).
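The E-value classes quoted above from Van der Hoeven et al. (2002) can be expressed as a simple threshold function. The following Python sketch is illustrative only: the function name is hypothetical, and because the published ranges leave the exact boundary values (1.0E-15 and 1.0E-50) unassigned, the sketch assumes that each boundary value falls into the more conserved class.

```python
def conservation_class(e_value: float) -> str:
    """Map a tBlastx E value to one of the four conservation classes.

    Thresholds follow the ranges quoted in the text. Assumption: a value
    exactly equal to 1.0E-15 or 1.0E-50 is placed in the more conserved
    class, since the source ranges do not cover those two points.
    """
    if e_value < 0:
        raise ValueError("E values cannot be negative")
    if e_value >= 0.1:
        return "not homologous"
    if e_value > 1.0e-15:
        return "fast-evolving"
    if e_value > 1.0e-50:
        return "intermediate-evolving"
    return "slow-evolving"


# Hypothetical E values, used only to show the classification.
for e in (0.5, 1.0e-5, 1.0e-30, 1.0e-80):
    print(e, conservation_class(e))
```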
ESTs are an excellent tool for the preliminary analysis of gene expression (Adams
et al., 1995; Coker et al., 2003; Coker and Davies, 2003), and over 155,000 tomato ESTs
from a variety of tissues are available in a single collection in the TIGR TGI
(Van der Hoeven et al., 2002). To further characterize our 9 cDNAs, organ-specific
expression analysis was performed using data mining (Fig. 3) and experimental
approaches (Fig. 4).
The EST analysis in Figure 3 and PCR experiments in Figure 4 both support
several trends. First, all 9 cDNAs are present in unwounded leaf tissue (usually at low levels). Although unanticipated, this is not necessarily surprising since the subtractive
library technique we used screened for up-regulated genes and not just those present in
one tissue and absent in another. Second, although our subtractive cDNA library was
constructed from leaf tissue, none of the cDNAs are leaf-specific. CSWR-1 was
represented by ESTs only from leaf/shoot tissue (Fig. 3), but PCR showed that it is also
present in roots (Fig. 4). All other cDNAs were present in multiple tissues in both
analyses. Third, CSWR-1, CSWR-3, CSWR-4, and CSWR-7 are more abundant in leaves than other tissues. Fourth, CSWR-6 and CSWR-8 are most abundant in root tissues. Fifth, CSWR-2 and CSWR-5 appear to be present at relatively constant levels in different organs. Finally, CSWR-9 has very low abundance in all tissues.
The mRNA accumulation of CSWR-1 through CSWR-9 in leaf 4 following flaming of leaf 3 is shown in Figure 5. Two real-time RT-PCR experiments are shown at each timepoint. Two additional biological replicates for the 0 and 60 minute timepoints were processed in a separate set of experiments and further support transcript up-regulation after flame wounding (data not shown). All technical considerations suggested that the real-time RT-PCR reactions successfully amplified a specific cDNA with high efficiency. Melting curves for all reactions showed only one peak, suggesting only one
PCR product. The efficiency of real-time PCR reactions can be calculated from the slope of a Ct vs. quantity graph using a 10-fold standard dilution (ideally -3.32 when quantity is on a log scale), and was above 99% across more than 4 orders of magnitude on all of our plates.
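The relationship between standard-curve slope and amplification efficiency can be sketched in Python as follows. This is a minimal illustration, not the calculation performed by the instrument software: it assumes the commonly used conversion E = 10^(-1/slope) - 1, fits the slope by ordinary least squares, and uses hypothetical function names and made-up dilution data.

```python
def standard_curve_slope(log10_quantities, ct_values):
    """Least-squares slope of Ct plotted against log10(quantity)."""
    n = len(log10_quantities)
    if n < 2 or n != len(ct_values):
        raise ValueError("need at least two matched points")
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, ct_values))
    return sxy / sxx


def amplification_efficiency(slope):
    """Efficiency as a fraction (1.0 = 100%), assuming E = 10**(-1/slope) - 1."""
    if slope >= 0:
        raise ValueError("a standard-curve slope should be negative")
    return 10.0 ** (-1.0 / slope) - 1.0


# Hypothetical 10-fold dilution series spanning 4 orders of magnitude.
logq = [0.0, 1.0, 2.0, 3.0, 4.0]
cts = [30.0 - 3.32 * x for x in logq]
slope = standard_curve_slope(logq, cts)
print(round(slope, 2), round(100 * amplification_efficiency(slope), 1))
```

A slope near -3.32 corresponds to a doubling of product in each cycle; shallower slopes imply efficiencies above 100% and steeper slopes imply efficiencies below 100%.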
Proteinase inhibitor I (Pin 1) was used as a positive control and actin as a housekeeping control. Pin 1 increased 5-fold over 60 minutes and 33-fold over 6 hours
(Fig. 5). In all experiments, actin levels at 60 minutes (the peak of accumulation for
CSWR-1 through CSWR-9) were not significantly different from control levels (data not shown).
The average transcript levels of seven of the nine candidates for the systemic wound response (CSWR-1, 2, 4, 5, 6, 7, and 9) more than doubled after flame wounding
(Fig. 5). The average transcript levels of CSWR-1, 2, 4, and 7 more than doubled after
only 30 minutes (Fig. 5). Although transcript up-regulation was evident in both experiments, there was some variation in the timing of the response since transcripts tended to peak at 30 and 60 minutes in experiments 1 and 2, respectively (Fig. 5). After 3 hours, transcript levels in both experiments had decreased to near the original levels. It is possible that CSWR-6 and CSWR-7 maintain slightly increased levels after 6 hours, but we cannot confirm this statistically in our experiments.
On the other hand, CSWR-3 and CSWR-8 levels were not increased significantly relative to the 0 timepoint and showed somewhat erratic patterns of expression early in the timecourse (Fig. 5). Both cDNAs actually decreased to below 50% of their original levels after 6 hours. Also, all cDNAs except Pin 1 seemed to slightly decrease after 5 minutes, although this was statistically significant only for CSWR-3 and CSWR-8. This decrease could be part of a general transcriptional response to flame wounding caused by increased degradation or an interruption of transcription.
The predicted proteins encoded by CSWR-1 through CSWR-9 are shown in
Figure 6. It is notable that all 4 transcripts which were more prevalent in leaves (Fig. 3 and 4) encode proteins with chloroplast transit peptides (Fig. 6). This suggests consistency between our experimental and bioinformatics approaches. The success rates for correctly predicting localization signals, transmembrane regions, and alpha helices/beta sheets with the chosen software are 85% (Emanuelsson et al., 2000), 89-94%
(Tusnády and Simon, 2001; Rost, 1996), and 72% (Rost, 1996), respectively. The
COILS program used to predict coiled-coil regions yields a set of probabilities that reflect the coiled-coil forming potential of a sequence. We accepted coiled-coils with at least
80% probability that were at least 28 amino acids long.
Discussion
Our results illustrate that flame wounding induces the systemic up-regulation of numerous transcripts within an hour. The majority of studies involving the up-regulation of genes during systemic wound responses have examined time courses from 1 to 24 hours. Nevertheless, a number of studies suggest that the systemic response begins in distant leaves within the first hour after wounding. For example, Orozco-Cardenas and
Ryan (1999) found that hydrogen peroxide generated in response to leaf crushing can be found in distant tomato leaf veins within an hour after wounding. The systemic mRNA increase of ethylene-responsive transcription factors (ERFs) peaks within the first 30 minutes after crushing parts of a tobacco leaf before returning to the original levels after an hour (Nishiuchi et al., 2002). Cutting a petiole results in systemic accumulation of
ERF3 and ERF4 in the first 10 minutes (Nishiuchi et al., 2002). Similarly, during systemic responses in tomato leaves, levels of phosphatidic acid increase fourfold within
5 minutes, while lysophosphatidylcholine and lysophosphatidylethanolamine increase twofold within 15 minutes (Lee et al., 1997). Microarray experiments suggest that mechanical wounding induces up-regulation of at least 20 genes after 15 minutes, some of which fall rapidly to their original level (Reymond et al., 2000). In summary, there is significant evidence that various components of systemic responses reach leaves distant from a wound within minutes, and our results confirm this observation for fire-inflicted wounding.
When fire burns an organic material, an oxidation-reduction reaction takes place where O-H bonds are broken and heat is released. When first heated, fuels produce water vapor and mostly noncombustible gases which include terpenes and aromatic aldehydes
(DeBano et al., 1998). Heat then causes pyrolysis, the chemical decomposition of fuel materials to yield organic vapors and charcoal, and eventually combustion. Inevitably, flame causes significant stress to a plant (oxidative, hydraulic, toxic, etc.) in addition to causing a local wound. Responses to wounding and other stresses could explain the up-regulation of many of the flame-induced transcripts described below.
CSWR-1 Acyl carrier protein
The CSWR-1 protein is 54% homologous to the acyl carrier protein ACP4 in
Arabidopsis, which plays a major role in the biosynthesis of fatty acids (Branen et al.,
2003). Like ACP4, CSWR-1 is small (14 kD), expressed mostly in leaves (Fig. 3 and 4), and appears to be localized to the chloroplast (Fig. 6). ACP4 carries growing acyl chains through the various steps of fatty acid biosynthesis, which occurs mostly in plastids.
Fatty acids function as crucial components of membrane lipids and as precursors to some signaling and defense compounds such as jasmonate. ACP4 mutants have a bleached appearance, reduced photosynthetic efficiency, and a reduced lipid composition (Branen et al., 2003).
CSWR-2 Adenylyl-sulfate reductase
The CSWR-2 protein (51 kD) is 75% identical to APR1 in Arabidopsis, which has oxidoreductase activity (acting on sulfur groups) and is involved in sulfate assimilation by which inorganic sulfate is processed and incorporated into sulfated compounds (Bick et al., 1998). This leads to the synthesis of cysteine and the antioxidant glutathione (Bick
et al., 2001). APR1 is regulated by oxidative stress (ozone, oxidized glutathione, etc.) and provides a mechanism to control the glutathione production necessary to combat oxidative stress (Bick et al., 2001). Like APR1 (Bick et al., 1998),
CSWR-2 contains a chloroplast localization signal, a reductase domain, and a thioredoxin-like domain near the carboxyl terminus (Fig. 6).
CSWR-3 Unknown protein
CSWR-3 encodes a highly conserved 25 kD protein with no known function in any plant. Although no functional domains were detected, the protein is proline-rich and there appears to be one transmembrane region (Fig. 6). There is also a putative chloroplast localization signal, corresponding with the observation that it is more prevalent in leaves (Fig. 3 and 4).
CSWR-4 Photosystem II oxygen-evolving complex protein 3 (PsbQ)
CSWR-4 is 67% identical to the Arabidopsis photosystem II oxygen-evolving complex protein 3, and is characterized by a chloroplast localization signal and a C-terminal domain with 4 major alpha helices (Balsera et al., 2003). The transcriptional up-regulation of a homologue to this gene has been associated with salt stress (Sugihara,
2000), but not wounding (to our knowledge). Consistent with its function in chloroplasts,
CSWR-4 is expressed primarily in green tissue (Fig. 3 and 4).
CSWR-5 Putative anion:sodium symporter
CSWR-5 encodes a 44 kD, leucine-rich (13% leucine) membrane protein with approximately 10 transmembrane domains and a conserved anion:sodium symporter
domain (Fig. 6). It also contains a putative leucine zipper motif at the C-terminus where leucine is repeated every 7 amino acids (Fig. 6), although this could be a coincidence resulting from high leucine content. Although close homologues exist in other plants, they have not been investigated. Homologues in yeast are necessary for coping with toxins such as arsenate (Bobrowicz et al., 1997), and homologues in animals act as bile acid:sodium symporters in the liver (Hagenbuch et al., 1991).
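The heptad spacing that defines a putative leucine zipper (a leucine at every seventh position) and the overall leucine content can both be checked directly on a protein sequence. The Python sketch below is a simplified illustration and is not the 2ZIP algorithm; the function names, the requirement of four consecutive heptad leucines, and the test sequence are all assumptions made for the example.

```python
import re


def leucine_heptad_starts(protein, repeats=4):
    """Return 0-based start positions of runs with a leucine every 7 residues.

    A run is L-x(6)-L-x(6)-... containing `repeats` leucines. A lookahead is
    used so that overlapping runs are all reported.
    """
    if repeats < 2:
        raise ValueError("repeats must be at least 2")
    pattern = "L" + "(?:.{6}L)" * (repeats - 1)
    return [m.start() for m in re.finditer("(?=" + pattern + ")", protein.upper())]


def leucine_fraction(protein):
    """Fraction of residues that are leucine."""
    if not protein:
        raise ValueError("empty sequence")
    return protein.upper().count("L") / len(protein)


# Made-up sequence (not CSWR-5): leucines at positions 2, 9, 16 and 23.
seq = "MA" + "LAAAAAA" * 3 + "L" + "GG"
print(leucine_heptad_starts(seq), round(leucine_fraction(seq), 3))
```

In a leucine-rich protein, heptad-spaced leucines become more likely by chance, which is why a match from a pattern search of this kind is only suggestive.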
CSWR-6 Unknown wound/stress protein
Although CSWR-6 represents a highly conserved, 20 kD protein found in many higher plants, the molecular functions of all homologues are currently unknown.
Nevertheless, there is an unmistakable pattern of close homologues in other plants (E value < 1.0E-50) being sequenced from stress-related cDNA libraries, including dehydration stress in Brassica napus (GenBank acc. no. AAK01359.1), dehydration stress in Arabidopsis (AAM65891.1 and AAM62648.1), hypersensitive response following infection by tobacco mosaic virus in Capsicum annuum (AAF63515.1 and
AAO49266.1), response to elicitors in Nicotiana tabacum (BAB13708.1), and cold stress in Capsicum annuum (AAR83862.1). Interestingly, it is also related (E value = 0.002) to several genes in rats and humans which underlie polycystic kidney disease.
The key feature of the protein is the lipoxygenase homology (LH2) domain, also called the PLAT (polycystin-1, lipoxygenase, alpha-toxin) domain or the PLAT/LH2 domain (Fig. 6). This domain is found in a variety of membrane or lipid associated proteins. The predicted localization signal of CSWR-6 would target it to one of several membranous structures involved in transport within the cell (i.e. Golgi,
endoplasmic reticulum).
CSWR-7 Chloroplast-specific ribosomal protein
CSWR-7 is homologous (E value = 1.0E-100) to a family of proteins containing the
Sigma 54 modulation protein and the chloroplast-specific ribosomal protein S30 (Johnson
et al., 1990). This family contains a number of transcripts known to be repressed by light
(Tan et al., 1994). CSWR-7 has a chloroplast transit peptide (Fig. 6), consistent with its
prevalence in green tissues (Fig. 3 and 4). CSWR-7 also has low homology (E value =
0.16) to phosphatidylinositol 4-kinase and a calmodulin-binding protein family which
contains an IQ calmodulin-binding motif.
CSWR-8 Alpha/beta fold family protein
CSWR-8 encodes a 22 kD protein related to the alpha/beta fold superfamily of
proteins (Fig. 6), which includes a wide range of catalytic enzymes. CSWR-8 is
predicted to have catalytic activity, and most likely acts as a hydrolase. No transit peptide was detected and no close homologue has been studied in detail. CSWR-8 is most abundant in root tissue (Fig. 3 and 4).
CSWR-9 Histidine triad family protein
CSWR-9 encodes a 16 kD protein of the histidine triad (HIT) family (Fig. 6), which is known to be involved in cell cycle regulation. However, molecular functions of close homologues of CSWR-9 are unknown. CSWR-9 is a low abundance transcript in all tissues examined (Fig. 3), and contains a predicted transit peptide at the amino terminus (Fig. 6).
We have characterized transcripts which accumulate systemically in tomato leaves within 1 hour after flaming a distant leaf. In all likelihood, they are involved with a wide variety of metabolic functions and are functional under non-stress conditions.
Nevertheless, homologues in other organisms have been associated with defense (CSWR-1),
oxidative stress (CSWR-2), salt stress (CSWR-4), removal of toxins (CSWR-5), and numerous other stresses (CSWR-6, etc.).
Since the systemic response to fire damage has not been well characterized, the results presented here lead us to point out an important general observation. Previous experiments using flame-wounding have treated the stimulus as a “generalized” wound
(used largely out of experimental convenience) with the intention of simulating wounds from herbivores or pathogens. Nevertheless, the fact remains that fire is nearly ubiquitous in terrestrial ecosystems and plants have evolved mechanisms to deal with fire damage. This is extremely well documented in the ecological literature, and it is thought that fire-response mechanisms are ancestral characteristics (Bond and van Wilgen, 1996).
Therefore, the fact that several of the CSWR genes shown here, Pin 1, and CMBP (Vian et al., 1999) have been associated with responses to multiple wounds/stresses suggests that fire damage invokes a systemic response with components similar to other wound and stress responses in the natural environment. Currently, there is virtually no understanding of how responses to fire damage might be unique compared to other wound/stress responses on a molecular level. This will be an important topic for future work.
Materials and Methods
Plant material, growth conditions, and tissue collection
Tomato plants (Lycopersicon esculentum cv. Heinz) were obtained from Stokes
Seeds (Buffalo, New York) and grown under controlled conditions in the NCSU
Phytotron on a gravel/Peat-Lite substrate (developed at Cornell University, Ithaca, NY).
Growth chambers maintained an environment of 16 h light (300 µmol s⁻¹ m⁻²) at 26 °C and 8 h dark at 21 °C. A butane lighter flame was held for 2 seconds 1 cm below the third leaf of 3- to 4-week-old plants (about 12 cm in height with the fourth leaf not fully expanded), causing immediate, localized tissue damage. For construction of the subtractive cDNA library, the fourth leaf was harvested from wounded and control plants
1 hour after wounding and immediately frozen in liquid nitrogen. For RT-PCR experiments, the fourth leaf of individual plants was harvested at 0, 5, 10, 20, 30, 60, 180, and 360 minutes after wounding and immediately frozen in liquid nitrogen. For comparison of RNA levels in different organs (in unwounded plants), tissue from roots, stems (up to the cotyledonary node), and leaves was harvested from 3 plants and pooled together in liquid nitrogen before grinding. All experiments were performed in duplicate.
Subtractive cDNA library construction, screening and sequencing
A subtractive library was constructed using a PCR-Select cDNA subtraction kit
(Clontech Laboratories, Inc. Palo Alto, CA, USA) such that wound-specific cDNAs were preferentially amplified. Subtraction was performed according to the manufacturer’s recommendations, with only slight modifications as described in Vian et al. (1999). The final PCR-amplified cDNAs were ligated into the T/A vector pT-7 Blue (Novagen,
Madison, WI) for 2 h at room temperature using T4 DNA ligase (Gibco-BRL).
Library clones were grown on LB/ampicillin plates, and single colonies were picked and
grown in LB/ampicillin suspension culture. Most clone cDNAs were prepared for
sequencing by PCR amplification (35 cycles; 94° for 45s denaturing, 56° for 60s
annealing, 72° for 60s extension) using primers specific for the pT-7 Blue cloning site
(ACCATGATTACGCCAAGCTC and TAAAACGACGGCCAGTGAAT) and purified with a QIAquick PCR Purification kit (Qiagen, Valencia, CA). Other clones were prepared for sequencing using plasmid mini-preps (Qiagen, Valencia, CA). Sequencing was performed in forward (T7 primer) and reverse (pUC/M13 reverse primer) directions using a Beckman/Coulter CEQ2000XL 8-capillary DNA sequencer (dye-terminator chemistry) at the Genomics Core Research Facility of the University of Nebraska-Lincoln.
DNA sequence analysis and data mining
All sequences were screened for vector, primer, and adaptor contamination
(Coker and Davies, 2002) using VecScreen
(http://www.ncbi.nlm.nih.gov/VecScreen/VecScreen.html). Blast searches (Altschul et al., 1997) were performed in GenBank (version 2.2.7) nucleotide databases using tBlastx and Blastn, and in protein databases using Blastx. The initial short sequences of the nine cDNAs presented here could not be identified, and were therefore used to search expressed sequence tags (ESTs) in the TIGR Tomato Gene Index (version 9.0). Identical matches allowed the putative extension of our sequences using consensus sequence information.
Verification of consensus sequences
To verify consensus sequences, PCR primers (see Table 2) were designed to amplify the entire predicted open reading frames using Primer3 (http://www-genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi) and OligoAnalyzer 3.0
(http://207.32.43.70/biotools/oligocalc/oligocalc.asp). PCR was performed for 35 cycles in an MJ Research MiniCycler (94° for 45s denaturing, 54° for 60s annealing, 72° for 60s extension) using Platinum PCR Supermix (Invitrogen, Carlsbad, CA) and a pooled tomato cDNA sample as template. PCR products were sequenced as described above in forward and reverse directions using the respective primers in Table 2.
Real-time RT-PCR assays
Real-time reverse transcriptase polymerase chain reaction (real-time RT-PCR) allows the detection of low-abundance mRNA with great sensitivity and quantification with great accuracy (Bustin, 2000). Total RNA was extracted using an RNeasy Plant
Mini kit (Qiagen, Valencia, CA, USA) and further purified using a DNA-free kit
(Ambion, Austin, TX, USA). To make cDNA, reverse transcription was performed on 10 µl of RNA samples (at 50 ng/µl) using an Omniscript RT kit (Qiagen, Valencia, CA, USA) with the primer
TTCTAGAATTCAGCGGCCGCTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTVN in the presence of RNase inhibitor (ABI, Foster City, CA, USA). The cDNA samples were then diluted to 2.5 ng/µl.
PCR primers were constructed using OligoAnalyzer 3.0 to amplify 80-120 bp products that were 100-1000 bp from the 3’ ends of cDNAs in regions that we had
verified through DNA sequencing (Table 2). All primers had 59-60° melting temperatures, 3’ G/C caps, 40-60% G/C content in the last 5 bp on the 3’ ends, 40-60%
G/C content overall, and matched only one tentative consensus sequence in the TIGR
TGI (Table 2).
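The sequence-based primer criteria listed above lend themselves to an automated check. The Python sketch below is a hedged illustration rather than the procedure used with OligoAnalyzer: the function names are hypothetical, a "3' G/C cap" is assumed to mean that the final 3' base is G or C, and the melting temperature and uniqueness criteria are not evaluated. The two example sequences are the actin primers given later in this section.

```python
def gc_within(seq, low=40, high=60):
    """True if the G/C percentage of seq lies in [low, high] (inclusive).

    Integer arithmetic is used so that boundary cases such as 2 of 5 bases
    (exactly 40%) are not affected by floating-point rounding.
    """
    gc = sum(base in "GC" for base in seq)
    return low * len(seq) <= 100 * gc <= high * len(seq)


def passes_sequence_rules(primer):
    """Check the 3' cap, 3'-end G/C content, and overall G/C content."""
    p = primer.upper()
    if len(p) < 5:
        raise ValueError("primer is too short to evaluate")
    has_gc_cap = p[-1] in "GC"        # assumed meaning of "3' G/C cap"
    end_ok = gc_within(p[-5:])        # last 5 bases at the 3' end
    overall_ok = gc_within(p)         # whole primer
    return has_gc_cap and end_ok and overall_ok


for primer in ("TGGTCGTACCACCGGTATTGTG", "AATGGCATGTGGAAGGGCATAC"):
    print(primer, passes_sequence_rules(primer))
```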
We recently published an analytical method for identifying housekeeping controls in various tissues of tomato plants (Coker and Davies, 2003). In the present study, an actin gene (GenBank acc. no. U60480.1) was used as a housekeeping control to confirm the consistency of tissue collection, mRNA extraction, and reverse transcription. Primers
(TGGTCGTACCACCGGTATTGTG and AATGGCATGTGGAAGGGCATAC) were designed to amplify a 91 bp product. The forward primer crossed an intron/exon junction to ensure that genomic DNA was not amplified. Also, no-RT controls were included as negative controls to ensure no contamination by genomic DNA. PCR was performed in an ABI Prism® 7900HT Sequence Detection System (95° for 10 min. followed by 40 cycles of 95° for 15s denaturing and 60° for 60s annealing/extension) using 25 µl 1x
SYBR Green PCR Mastermix (ABI, Foster City, CA, USA), 2 µl cDNA, and 3 µl primers (0.25 µM). Dissociation curve analysis was performed for each sample following PCR. Data were analyzed using ABI SDS software, and quantified relative to the standard curve of a serial dilution.
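Quantification relative to a standard curve can be sketched as follows. This is an illustrative Python example, not the ABI SDS implementation: it assumes a linear standard curve of the form Ct = slope × log10(quantity) + intercept, and the function names, curve parameters, and Ct values are hypothetical.

```python
def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve Ct = slope * log10(quantity) + intercept."""
    if slope == 0:
        raise ValueError("slope must be non-zero")
    return 10.0 ** ((ct - intercept) / slope)


def fold_change(ct_sample, ct_reference, slope, intercept):
    """Quantity in a sample relative to a reference (e.g. the 0 min timepoint)."""
    return (quantity_from_ct(ct_sample, slope, intercept)
            / quantity_from_ct(ct_reference, slope, intercept))


# Hypothetical values: a Ct that is 3.32 cycles lower than the reference on a
# curve with slope -3.32 corresponds to roughly a 10-fold increase.
print(round(fold_change(25.00, 28.32, slope=-3.32, intercept=35.0), 2))
```

Because both quantities are read from the same curve, the intercept cancels in the ratio, so the fold change depends only on the Ct difference and the slope.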
Relative expression analyses
There are 27 tomato cDNA libraries represented in the TIGR TGI database
(version 9.0) with large sample sizes (>500 ESTs), which were constructed from a variety of tissue types and developmental stages. We searched these 27 libraries for particular
EST sequences and calculated relative expression values based on the number of ESTs found in a given population. Analyses were performed as described in Coker et al.
(2003).
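The idea of deriving relative expression values from EST counts can be illustrated with a short Python sketch. The full procedure is the one described in Coker et al. (2003); the version below is only an assumption-laden simplification in which relative expression is taken to be the number of matching ESTs divided by the total number of ESTs in a library, and only libraries with more than 500 ESTs are kept. The function name, library names, and counts are hypothetical.

```python
def relative_expression(matching_ests, library_sizes, min_size=500):
    """EST frequency per library for libraries larger than min_size.

    matching_ests: {library name: number of ESTs matching the transcript}
    library_sizes: {library name: total number of ESTs in the library}
    Libraries with no matching ESTs are reported with a frequency of 0.
    """
    result = {}
    for library, size in library_sizes.items():
        if size > min_size:
            result[library] = matching_ests.get(library, 0) / size
    return result


# Made-up libraries and counts, for illustration only.
sizes = {"leaf": 5000, "root": 2000, "small_library": 300}
hits = {"leaf": 10, "small_library": 3}
print(relative_expression(hits, sizes))
```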
Polypeptide sequence analysis
Alignments and other basic sequence analyses were performed using Vector NTI
7.1 (Informax, Bethesda, MD). Searches to determine protein families, domains, and functional sites were performed using the InterPro database (www.ebi.ac.uk/interpro;
Mulder et al., 2003), which integrates PROSITE, Pfam, PRINTS, ProDom, SMART, and
TIGRFAMs. Structural analyses included the prediction of the following: localization signals using TargetP (www.cbs.dtu.dk/services/TargetP/; Emanuelsson et al., 2000); presence and orientation of transmembrane regions using PHDhtm
(http://cubic.bioc.columbia.edu/predictprotein/; Rost et al., 1996), HMMTOP
(http://www.enzim.hu/hmmtop/; Tusnády and Simon, 2001), and the hydropathy index of
Kyte and Doolittle (1982); alpha helices and beta strands using PROFsec
(http://cubic.bioc.columbia.edu/predictprotein/; Rost, 1996); possible interacting proteins using DIP (http://dip.doe-mbi.ucla.edu/; Xenarios et al., 2002); and coiled-coils and leucine zippers using COILS (http://cubic.bioc.columbia.edu/predictprotein/; Lupas,
1996) and 2ZIP (http://2zip.molgen.mpg.de/index.html; Bornberg-Bauer et al., 1998).
The overall strategy for cDNA analyses is outlined in Figure 1.
Acknowledgements
We thank Heike Winter-Sederoff and Raul Salinas for their assistance in accessing equipment, and Sophia Clotho for her advice.
Literature Cited
Adams MD, Kerlavage AR, Fleischmann RD, Fuldner RA, Bult CJ, Lee NH, Kirkness EF, Weinstock KG, Gocayne JD, White O, et al. (1995) Initial assessment of human gene diversity and expression patterns based upon 83 million nucleotides of cDNA sequence. Nature 377: 3-174.
Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796-815.
Balsera M, Arellano JB, Gutierrez JR, Heredia P, Revuelta JL, De Las Rivas J (2003) Structural analysis of the PsbQ protein of photosystem II by Fourier transform infrared and circular dichroic spectroscopy and by bioinformatic methods. Biochem 42: 1000-1007.
Becker-Andre M, Schulze-Lefert P, Hahlbrock K (1991) Structural comparison, modes of expression, and putative cis-acting elements of the two 4-coumarate: CoA ligase genes in potato. J Biol Chem 266: 8551-8559.
Bertram L, Lercari B (2000) Phytochrome A and phytochrome B1 control the acquisition of competence for shoot regeneration in tomato hypocotyl. Plant Cell Reports 19: 604-609.
Bick JA, Aslund F, Chen Y, Leustek T (1998) Glutaredoxin function for the carboxyl-terminal domain of the plant-type 5'-adenylylsulfate reductase. Proc Natl Acad Sci USA 95: 8404-8409.
Bick JA, Setterdahl AT, Knaff DB, Chen Y, Pitcher LH, Zilinskas BA, Leustek T (2001) Regulation of the plant-type 5'-adenylyl sulfate reductase by oxidative stress. Biochem 40: 9040-9048.
Bobrowicz P, Wysocki R, Owsianik G, Goffeau A, Ulaszewski S (1997) Isolation of three contiguous genes, ACR1, ACR2 and ACR3, involved in resistance to arsenic compounds in the yeast Saccharomyces cerevisiae. Yeast 13: 819-28.
Bond WJ, van Wilgen BW (1996) Fire and plants. Chapman & Hall: London.
Bornberg-Bauer E, Rivals E, Vingron M (1998) Computational approaches to identify leucine zippers. Nucleic Acids Res 26: 2740-2746.
Braam J, Davis RW (1990) Rain-, wind-, and touch-induced expression of calmodulin related genes in Arabidopsis. Cell 60:357-364.
Branen JK, Shintani DK, Engeseth NJ (2003) Expression of antisense acyl carrier protein-4 reduces lipid content in Arabidopsis leaf tissue. Plant Physiol 132: 748-756.
Bustin SA (2000) Absolute quantification of mRNA using real-time reverse transcription polymerase chain reaction assays. J Molecular Endocrinology 25: 169-193.
Coker JS, Davies E (2002) Correspondence re: A.H. Ree et al., Expression of a novel factor in human breast cancer cells with metastatic potential (Cancer Res., 59: 4675-4680, 1999). Cancer Res 62: 4164-4165.
Coker JS, Jones D, Davies E (2003) Identification, conservation, and relative expression of V-ATPase cDNAs in tomato plants. Plant Molecular Biology Reporter 21: 145-158.
Coker JS, Davies E (2003) Selection of candidate housekeeping controls in tomato plants using EST data. Biotechniques 35: 740-748.
DeBano LF, Neary DG, Ffolliott PF (1998) Fire effects on ecosystems. Wiley & Sons, Inc.: New York.
Drewa PB, Platt WJ, Moser EB (2002) Fire effects on resprouting of shrubs in headwaters of southeastern longleaf pine savannas. Ecology 83: 755-767.
Edwards K, Cramer CL, Bolwell GP, Dixon RA, Schuch W, Lamb CJ (1985) Rapid transient induction of phenylalanine ammonia-lyase mRNA in elicitor-treated bean cells. Proc Natl Acad Sci USA 82: 6731-6735.
Emanuelsson O, Nielsen H, Brunak S, von Heijne G (2000) Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J Mol Biol 300: 1005-1016.
Espartero J, Pintor-Toro JA, Pardo JM (1994) Differential accumulation of S-adenosylmethionine synthetase transcripts in response to salt stress. Plant Mol Biol 25: 217-227.
Goff SA, Ricke D, Lan TH, Presting G, Wang R, Dunn M, Glazebrook J, Sessions A, Oeller P, Varma H et al. (2002) A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science 296: 92-100.
Graham JS, Hall G, Pearce G, Ryan CA (1986) Regulation of synthesis of proteinase inhibitors I and II mRNAs in leaves of wounded tomato plants. Planta 169: 399-405.
Green TR, Ryan CA (1972) Wound-induced proteinase inhibitor in plant leaves – possible defense mechanism against insects. Science 175: 776-777.
Hagenbuch B, Stieger B, Foguet M, Lubbert H, Meier PJ (1991) Functional expression cloning and characterization of the hepatocyte Na+/bile acid cotransport system. Proc Natl Acad Sci USA 88: 10629-10633.
Herde O, Atzorn R, Fisahn J, Wasternack C, Willmitzer L, Peña-Cortés H (1996) Localized wounding by heat initiates the accumulation of proteinase inhibitor II in abscisic acid-deficient plants by triggering jasmonic acid biosynthesis. Plant Physiol 112: 853-860.
Howe GA, Lee GI, Itoh A, Li L, DeRocher AE (2000) Cytochrome P450-dependent metabolism of oxylipins in tomato. Cloning and expression of allene oxide synthase and fatty acid hydroperoxide lyase. Plant Physiol 123: 711-24.
Johnson CH, Kruft V, Subramanian AR (1990) Identification of a plastid-specific ribosomal protein in the 30S subunit of chloroplast ribosomes and isolation of the cDNA clone encoding its cytoplasmic precursor. J Biol Chem 22: 12790-12795.
Kyte J, Doolittle RF (1982) A simple method for displaying the hydropathic character of a protein. J Mol Biol 157:105-132.
Lawton MA, Lamb CJ (1987) Transcriptional activation of plant defense genes by fungal elicitor, wounding and infection. Mol Cell Biol 7: 335-341.
Lee D, Douglas CJ (1996) Two divergent members of a tobacco 4-coumarate:coenzyme A ligase (4CL) gene family. cDNA structure, gene inheritance and expression, and properties of recombinant proteins. Plant Physiol 112: 193-205.
Lee SM, Suh S, Kim S, Crain RC, Kwak JM, Nam HG, Lee YS (1997) Systemic elevation of phosphatidic acid and lysophospholipid levels in wounded plants. Plant J 12: 547-556.
León J, Enrique R, Sánchez-Serrano JJ (2001) Wound signalling in plants. J Exper Bot 52: 1-9.
Low RK, Prakash AP, Swarup S, Goh CJ, Kumar PP (2001) A differentially expressed bZIP gene is associated with adventitious shoot regeneration in leaf cultures of Paulownia kawakamii. Plant Cell Reports 20: 696-700.
Lu M, Holliday S, Zhang L, Dunn WA, Gluck SL (2001) Interaction between aldolase and vacuolar H+-ATPase. J Biol Chem 32: 30407-30413.
Lupas A (1996) Prediction and analysis of coiled-coil structures. Methods in Enzymology 266: 513-525.
Mayer K, Schuller C, Wambutt R, Murphy G, Volckaert G, Pohl T, et al. (1999) Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana. Nature 402: 769-777.
Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Barrell D, Bateman A, Binns D, Biswas M, Bradley P, Bork P, et al. (2003) The InterPro Database, 2003 brings increased coverage and new features. Nucleic Acids Res 31: 315-318.
Nishiuchi T, Suzuki K, Kitajima S, Sato F, Shinshi H (2002) Wounding activates immediate early transcription of genes for ERFs in tobacco plants. Plant Mol Biol 49: 473-482.
Orozco-Cardenas M, Ryan CA (1999) Hydrogen peroxide is generated systemically in plant leaves by wounding and systemin via the octadecanoid pathway. Proc Natl Acad Sci USA 96: 6553-6557.
Pearce G, Strydom D, Johnson S, Ryan CA (1991) A polypeptide from tomato leaves induces wound-inducible proteinase inhibitor proteins. Science 253: 895-898.
Peña-Cortés H, Willmitzer L, Sanchez-Serrano J (1991) Abscisic acid mediates wound induction but not developmental-specific expression of the proteinase inhibitor II gene family. Plant Cell 3: 963-972.
Peres LE-P, Morgante PG, Vecchi C, Kraus JE, van Sluys MA (2001) Shoot regeneration capacity from roots and transgenic hairy roots of tomato cultivars and wild related species. Plant Cell Tissue and Organ Culture 65: 37-44.
Platt WJ, Evans GW, Davis MM (1988) Effects of fire season on flowering of forbs and shrubs in longleaf pine forests. Oecologia 76: 353-363.
Preston CA, Baldwin IT (1999) Positive and negative signals regulate germination in the post-fire annual, Nicotiana attenuata. Ecology 80: 481-494.
Reymond P, Weber H, Damond M, Farmer EE (2000) Differential gene expression in response to mechanical wounding and insect feeding in Arabidopsis. Plant Cell 12: 707-719.
Rost B (1996) PHD: predicting one-dimensional protein structure by profile based neural networks. Methods in Enzymology 266: 525-539.
Ryan CA (1974) Assay and biochemical properties of the proteinase inhibitor inducing factor, a wound hormone. Plant Physiol 54: 328-332.
Ryan CA, Farmer EE (1991) Oligosaccharide signals in plants: a current assessment. Annu Rev Plant Physiol Plant Mol Biol 42: 651-674.
Sugihara K, Hanagata N, Dubinsky Z, Baba S, Karube I (2000) Molecular characterization of cDNA encoding oxygen evolving enhancer protein 1 increased by salt treatment in the mangrove Bruguiera gymnorrhiza. Plant Cell Physiol 41: 1279-1285.
Schaller A, Ryan CA (1996) Molecular cloning of a tomato leaf cDNA encoding an aspartic protease, a systemic wound response protein. Plant Mol Biol 31: 1073-1077.
Stanković B, Davies E (1996) Both action potentials and variation potentials induce proteinase inhibitor gene expression in tomato. FEBS Lett 390: 275-279.
Stanković B, Vian A, Henry-Vian C, Davies E (2000) Molecular cloning and characterization of a tomato cDNA encoding a systemically wound-inducible bZIP DNA-binding protein. Planta 212: 60-66.
Stennis MJ, Chandra S, Ryan CA, Low PS (1998) Systemin potentiates the oxidative burst in cultured tomato cells. Plant Physiol 117: 1031-1036.
Takashina T, Suzuki T, Egashira H, Imanishi S (1998) New molecular markers linked with the high shoot regeneration capacity of the wild tomato species Lycopersicon chilense. Breeding Science 48: 109-113.
Tan X, Varughese M, Widger WR (1994) A light-repressed transcript found in Synechococcus PCC 7002 is similar to a chloroplast-specific subunit protein and to a transcription modulator protein associated with Sigma 54. J Biol Chem 269: 20905-20912.
Taylor IB (1986) Biosystematics of the tomato. The tomato crop: A scientific basis for improvement. Eds Atherton JG and Rudich J. Chapman and Hall Ltd: New York.
Taylor JLS, van Staden J (1998) Plant-derived smoke solutions stimulate the growth of Lycopersicon esculentum roots in vitro. Plant Growth Regulation 26: 77-83.
Tusnády GE, Simon I (2001) The HMMTOP transmembrane topology prediction server. Bioinformatics 17: 849-850.
USDI-National Park Service (1992) Fire monitoring handbook. Natl Park Serv, Western Region. San Francisco, CA. 134 p. plus appendices.
Van der Hoeven R, Ronning C, Giovannoni J, Martin G, Tanksley S (2002) Deductions about number, organization, and evolution of genes in the tomato genome based on analysis of a large expressed sequence tag collection and selective genomic sequencing. Plant Cell 14: 1441-1456.
Verdaguer D, Ojeda F (2002) Root starch storage and allocation patterns in seeder and resprouter seedlings of two Cape Erica (Ericaceae) species. Amer J Botany 89: 1189-1196.
Vian A, Henry-Vian C, Schantz R, Ledoigt G, Frachisse JM, Desbiez MO (1996) Is membrane potential involved in calmodulin gene expression after external stimulation in plants? FEBS Lett 380: 93-96.
Vian A, Henry-Vian C, Davies E (1999) Rapid and systemic accumulation of chloroplast mRNA-binding protein transcripts after flame stimulus in tomato. Plant Physiol 121: 517-524.
Wells BW (2002) The natural gardens of North Carolina. Rev. ed. UNC Press: Chapel Hill.
Wildon DC, Thain JF, Minchin PEH, Gubb IR, Reilly AJ, Skipper YD, Doherty HM, O’Donnell PJ, Bowles DJ (1992) Electrical signaling and systemic proteinase inhibitor induction in the wounded plant. Nature 360: 62-65.
Xenarios I, Salwinski L, Duan XJ, Higney P, Kim S, Eisenberg D (2002) DIP: The Database of Interacting Proteins. A research tool for studying cellular networks of protein interactions. Nucleic Acids Res 30: 303-305.
Table I. Sequence extension and polypeptide deduction for unidentifiable tomato cDNA fragments that are "candidates for the systemic wound response" (CSWR). Tentative contigs (labeled as TC) corresponding to each cDNA fragment were identified in the TIGR Tomato Gene Index and used for GenBank searches and polypeptide analyses. Putative protein and cDNA sequences were annotated and deposited into GenBank under the given accession numbers.
Library clone name | Library clone length (bp) | TIGR TGI match acc. # | TIGR TGI match length (bp) | GenBank match amino acid identity (Blastx) | GenBank match protein acc. # | GenBank submission amino acids | kD | pI | GenBank submission protein acc. #
CSWR-1 | 229 | TC128541 | 596 | 57% (78/136) | AAL25091.1 | 133 | 14.2 | 4.9 | AY568716
CSWR-2 | 189 | TC116602 | 1830 | 83% (389/467) | AAB05871.2 | 461 | 51.1 | 6.0 | AY568717
CSWR-3 | 333 | TC124827 | 776 | 74% (132/178) | AAM14118.1 | 238 | 25.0 | 4.8 | AY568718
CSWR-4 | 119 | TC116141 | 966 | 67% (156/232) | S00008 | 230 | 24.6 | 9.6 | AY568719
CSWR-5 | 647 | TC124664 | 1716 | 77% (319/414) | NP_850089.1 | 407 | 43.7 | 8.9 | AY568720
CSWR-6 | 206 | TC123928 | 733 | 83% (156/186) | AAR83862.1 | 184 | 20.3 | 6.1 | AY568721
CSWR-7 | 303 | TC117392 | 1454 | 63% (181/286) | AAK43963.1 | 311 | 34.8 | 5.9 | AY568722
CSWR-8 | 59 | TC128899 | 662 | 65% (128/196) | NP_195379.1 | 208 | 22.4 | 7.0 | AY568723
CSWR-9 | 543 | TC121082 | 702 | 43% (56/130) | ZP_00019821.1 | 150 | 16.1 | 7.1 | AY568724
Table II. PCR primers specific to 9 novel tomato cDNAs that were used to verify putative open reading frame sequences and perform real-time RT-PCR experiments.
cDNA | ORF verification forward primer | ORF verification reverse primer | ORF verification product size (bp) | Real-time RT-PCR forward primer | Real-time RT-PCR reverse primer | Real-time RT-PCR product size (bp)
CSWR-1 | CCATTTCTTCTCTCTGCATTTCTC | GCAAAAAGAATTCAATCCAAGACC | 548 | AAACGGAGGCACTGTGAAGTTG | CAACACGAAGGCGGGATGTTC | 89
CSWR-2 | TGACAAGCAATTTCTTTGCTG | TGTCAAAACAATGGTGTGATTG | 1500 | TGTAAATGGCGCTGCCCAAAC | AGGTTCTCAACTCCAGGCTTACTC | 99
CSWR-3 | ACACAGCCAATCGAAGAAGC | CAAGCAAAATGATTTGTCCTAAG | 850 | CGGGCTTCAACTGACGATTCTG | ACCAACCACTTTGGTGTTGTCC | 111
CSWR-4 | CACCAAAACAAAAAGGTCCTG | CAAGTTGAGGCTCTCGGATG | 850 | TCACTGTCAGAGCCCAACAGG | AAGCTGCTTTAGCAACGGAACC | 105
CSWR-5 | AACGAACCCTGCCCTAAACT | CAGCCAAGAAATGCAAACAA | 1360 | CGGCACTCGGATTTCTACTTGC | ACCAAGTGCCATGCAGACAAC | 92
CSWR-6 | CATTTTCTAGAGAAAGAGCACAAGG | CCATGACAGCAAAGACATGC | 687 | ATGCAACCAGCAGTTGTTCACC | AGCTCACCGGACTCATACTTGTTC | 111
CSWR-7 | CGAGCAACAAAGCAACTGTG | CTCTCAAGGAGTGAACATTATGC | 1160 | TCCGGAATGAGGAAACTGGTGAG | TCAACCTCCAAGGGCTCTAACTTC | 118
CSWR-8 | GGTGAACTTGGTTGAAGCAC | TGAAAATCCCCAAACCATTG | 590 | GATTTAGTTGAGGCGTTGGTGGTG | AAACCATTGAGCGTGGTAGTGC | 80
CSWR-9 | CTCCACCGATGGTGAAAATC | CATACCTCGATCTGAACGACAG | 531 | TTTGGGCACTCGCTTGTCATC | TTGAACTCATGGCAGCCACAAC | 85
[Flowchart. Subtractive cDNA library of tomato genes up-regulated during a systemic wound response; clone isolation and sequencing; sequence quality control (VecScreen, bacterial database searches); Blast searches of GenBank, sorting clones into ESTs from known tomato genes, ESTs homologous to known genes, and unidentifiable ESTs. Expression: relative expression analysis using the TIGR TGI; real-time RT-PCR (6 hr. timecourse) with housekeeping controls. Unidentifiable ESTs: sequence extension using the TIGR TGI; sequence verification (PCR & sequencing); Blast searches of GenBank; protein family analysis (PROSITE, Pfam, PRINTS, ProDom, SMART, TIGRFAMS); structural analysis of localization signals (TargetP), transmembrane regions (PHDhtm, HMMTOP), alpha helices / beta sheets (PROFsec), interacting proteins (DIP), and coiled-coils / leucine zippers (COILS, 2ZIP).]
Figure 1. Strategy to identify and characterize cDNAs up-regulated in tomato leaf tissue during a systemic wound response to fire damage. Clones were sequenced from a subtractive cDNA library and separated into 3 classes based on how well they could be identified by Blast searches. Sequences for 9 unidentifiable cDNA fragments were extended and verified, and used for further Blast searches, protein family analysis, and structural analysis. Expression studies included expressed sequence tag analysis using the TIGR TGI and real-time RT-PCR of leaf tissue over a 6-hour timecourse after flame wounding.
[Gel image. Lanes 1 to 13; ladder bands labeled 2000, 1200, 800, 400, and 200 bp.]
Figure 2. Confirmation of the existence of 9 putative consensus sequences for unknown tomato cDNAs. PCR was performed using a pooled cDNA sample and the primer pairs shown in Table II. Products were run on a 2% agarose gel stained with ethidium bromide. Lanes 1 and 13 show 8 µl of Low DNA Mass Ladder (Invitrogen), lane 11 is a positive PCR control, and lane 12 is a negative control (no PCR primers). Lanes 2-10 contain PCR products corresponding to the putative open reading frames for CSWR-1 through CSWR-9.
[Bar chart. y-axis: ESTs per 1000 total ESTs (0 to 0.5); x-axis: cDNA (CSWR-1 through CSWR-9, the CSWR 1-9 average, and the initiation factor); bars grouped by tissue: leaves, shoots, flowers, culture/callus, fruits, roots.]
Figure 3. Expressed sequence tag analysis of 9 cDNAs that are candidates for the systemic wound response (CSWR). Each bar represents the relative expression value for a particular gene from cDNA libraries in the TIGR TGI, grouped according to tissue. For example, the TIGR TGI contains 2 CSWR-1 ESTs from leaves, representing 0.1 CSWR-1 ESTs for every 1000 total ESTs. CSWR-4 expression levels for leaves and shoots are off the scale (shown by an arrow) at 2.2 and 1.9, respectively. Translation initiation factor 5A-3 (TIGR acc. no. TC124277) is shown as a ubiquitous cDNA with comparative expression level.
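The normalization described in the caption can be sketched as follows. This is a minimal illustration rather than the procedure used to build the figure: the function name is hypothetical, and the total of 20,000 leaf ESTs is an assumed library size chosen only because it reproduces the value of 0.1 quoted for CSWR-1.

```python
def ests_per_thousand(gene_ests: int, total_ests: int) -> float:
    """Return the number of ESTs for one gene per 1000 ESTs in a tissue."""
    if total_ests <= 0:
        raise ValueError("total_ests must be positive")
    return 1000.0 * gene_ests / total_ests


# 2 CSWR-1 ESTs from leaves comes from the caption; the total of 20,000
# leaf ESTs is an assumed, illustrative library size.
leaf_value = ests_per_thousand(2, 20_000)
print(f"{leaf_value:.1f} CSWR-1 ESTs per 1000 leaf ESTs")
```

Expressing counts per 1000 ESTs lets tissues with differently sized EST collections be compared on one scale.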
[Gel images. One panel per cDNA (CSWR-1 through CSWR-9), each with root, stem, and leaf lanes.]
Figure 4. Organ-specific relative abundance of CSWR-1 through CSWR-9 in unwounded tomato plants. Each image shows PCR products for a given cDNA using root, stem, and leaf cDNA as template. Each band represents 8 µl of PCR product (35 cycles) on a 1.5% agarose gel stained with ethidium bromide.
[Line plots. One panel each for Pin 1 and CSWR-1 through CSWR-9. x-axis: Time (min.), 0 to 360; y-axis: Relative transcript accumulation.]
Figure 5. Systemic transcript accumulation of 9 tomato cDNAs (CSWR-1 through CSWR-9) in leaf 4 after flame wounding leaf 3. Two real-time RT-PCR experiments (solid and dotted lines) were performed on leaf mRNA from 0, 5, 10, 20, 30, 60, 180, and 360 minute timepoints and quantified relative to the standard curve of a serial dilution. Pin 1 (GenBank accession no. K03290) is a well-documented systemic wound gene shown for comparison. Error bars indicate ± standard error (n=2).
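Quantification against the standard curve of a serial dilution, as mentioned in the caption, can be sketched as below. This is an illustrative outline under stated assumptions, not the analysis used for Figure 5: the dilution series, the threshold-cycle (Ct) readings, and both function names are hypothetical.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def relative_quantity(ct, slope, intercept):
    """Invert the standard curve to turn a Ct value into a relative quantity."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical 10-fold serial dilution with made-up Ct readings.
dilutions = [1.0, 0.1, 0.01, 0.001]
cts = [20.0, 23.3, 26.6, 29.9]
slope, intercept = fit_standard_curve(dilutions, cts)
sample = relative_quantity(25.0, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, quantity={sample:.4f}")
```

Each timepoint's Ct value would be converted this way, so that the plotted values are relative quantities rather than raw cycle numbers.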
[Protein diagrams, one per polypeptide, with the following labels.
CSWR-1 (133 aa): chloroplast l.s.; acyl carrier protein phosphopantetheine domain; phosphopantetheine attachment site (Ser-90).
CSWR-2 (461 aa): chloroplast l.s.; serine-rich region; phosphoadenosine phosphosulfate reductase domain; thioredoxin domain 2.
CSWR-3 (238 aa): chloroplast l.s.; transmembrane region.
CSWR-4 (230 aa): chloroplast l.s.; transmembrane region; photosystem II O-evolving complex precursor (labeled twice).
Labels between the CSWR-4 and CSWR-5 diagrams: coiled coil (28 bp); L-307, L-314, L-321, L-328.
CSWR-5 (407 aa): mitochondrial or chloroplast l.s.; transmembrane regions; sodium bile acid symporter domain.
CSWR-6 (184 aa): secretory l.s.; transmembrane region; lipoxygenase homology domain.
CSWR-7 (311 aa): chloroplast l.s.; sigma 54 modulation protein domain.
CSWR-8 (208 aa): three transmembrane regions; alpha/beta hydrolase domain.
CSWR-9 (150 aa): other l.s.; HIT family domain.]
Figure 6. Structural and functional prediction of 9 tomato proteins, encoded by CSWR-1 through CSWR-9. White cylinders represent alpha helices, gray ovals represent beta sheets, and block arrows at the amino termini represent localization signals. The locations of other structural elements are shown with black lines beneath each protein. “Other l.s.” refers to a signal peptide localizing somewhere other than chloroplasts, mitochondria, or the secretory pathway. CSWR-1 through CSWR-9 represent GenBank entries AY568716 through AY568724. Abbreviations: l.s., localization signal; Chloro., chloroplast; dom., domain; Mito., mitochondrion; prec., precursor; Secr., secretory pathway (i.e. Golgi apparatus or endoplasmic reticulum); transmem. reg., transmembrane region.
Chapter 6
Fire Damage Causes the Systemic Up-regulation of a Set of Highly Conserved Transcripts in Tomato Plants
Jeffrey S. Coker, Alan Vian, and Eric Davies
Alan Vian constructed the subtractive cDNA library. Eric Davies provided guidance and editorial assistance.
This chapter will be submitted for publication.
Abstract
Fire is a natural component of most terrestrial ecosystems and can act as a local wound stimulus to plants. Nevertheless, there have been no previous attempts to catalogue the array of genes which are up-regulated after fire damage. We have constructed a subtractive cDNA library using PCR-based suppression subtractive hybridization and used it to identify 46 different transcripts which are systemically up-regulated in leaves in the first hour after a distant leaf is flame wounded. Compared with the entire tomato transcriptome, these 46 transcripts are very highly conserved (slow-evolving) in plants. All but 4 of the identifiable transcripts fall into 5 classes: enzymes of general metabolism; protein synthesis, modification, and transport; transcription; membrane transport; and photosynthesis and respiration. At least half of the up-regulated transcripts have been previously associated with other types of wounds or stresses. These include phenylalanine ammonia-lyase, 4-coumarate:coenzyme A ligase, S-adenosyl-L-homocysteine hydrolase, S-adenosyl-L-methionine synthetase, catalase, leucine aminopeptidase, phantastica, and a metallothionein-like protein. Most of those which have not been associated with other wounding or stress stimuli are associated with photosynthesis and/or respiration. These include pyruvate kinase, rubisco small subunit, chlorophyll a/b binding proteins, and subunits of photosystems I and II.
Introduction
Because plants are sessile and cannot escape natural wounding stimuli, they often
respond to tissue damage by changes in gene expression (Graham et al., 1986; Braam and
Davis, 1990; Schaller and Ryan, 1996; León et al., 2001) in both damaged tissues (local
responses) and in undamaged tissues (systemic responses). Many systemic wound genes
have been previously identified in tomato plants including proteinase inhibitors (Green
and Ryan, 1972), systemin (Pearce et al., 1991), aspartic protease (Schaller and Ryan,
1996), allene oxide synthase and fatty acid hydroperoxide lyase (Howe et al., 2000), and
others. Further characterization of the array of systemically up-regulated genes is
necessary to better understand plant defense and stress response mechanisms.
Fire is a wounding and stress stimulus that impacts most terrestrial ecosystems,
and therefore plants have evolved mechanisms to survive it (Bond and van Wilgen, 1996;
DeBano et al., 1998). In fact, some of the most species-rich plant ecosystems (e.g. the
herbaceous groundcover of longleaf pine savannahs) require fire to persist (Platt et al.,
1988; Drewa et al., 2002). Despite the major impacts of fire on plants, responses to fire
damage have not been closely studied on the level of gene expression.
Like many species in the Solanaceae, tomato plants (both wild and cultivated)
possess many of the characteristics that allow herbaceous plants to survive fires. These
characteristics include being a perennial (Taylor, 1986), having carbohydrate reserves
stored in underground organs (Peres et al., 2001; Verdaguer and Ojeda, 2002), and the
ability to regenerate shoots from hypocotyls, roots, or other tissues (Takashina et al.,
1998; Bertram and Lercari, 2000; Peres et al., 2001). It has been found that smoke extract stimulates the growth of tomato roots in vitro (Taylor and van Staden, 1998), and
that growth of species within the Solanaceae can be regulated by fire regimes (Preston
and Baldwin, 1999). Also, a bZIP gene similar to one we found to be up-regulated by
flame-wounding (Stanković et al., 2000) has been associated with adventitious shoot
regeneration (Low et al., 2001). Thus, tomato plants are currently the preferred model
system for work on systemic responses to fire damage.
This work presents 46 tomato transcripts which were up-regulated during a
systemic response to fire damage. The transcripts were isolated from a subtractive cDNA
library constructed using PCR-based suppression subtractive hybridization, which is a
powerful method for identifying genes which are differentially expressed between two
tissues (Diatchenko et al., 1999). Two mRNA populations (tester and driver) are
converted to cDNA and hybridized. The hybrid sequences are then removed, leaving
unhybridized cDNAs which represent genes more highly expressed in one of the mRNA
populations (the tester). The differentially expressed cDNAs are then preferentially
amplified by PCR (using tester specific adaptors) to further minimize the chances of
generating false positives. The subtractive cDNA library analyzed here contains cDNAs present at higher levels after flame wounding (tester) than in an unwounded control
(driver). More specifically, it contains transcripts systemically up-regulated in leaf 4 of tomato plants in the first hour after leaf 3 was flamed. Transcripts isolated previously from this library include chloroplast mRNA-binding protein (Vian et al., 1999) and a bZIP
DNA-binding protein (Stanković et al., 2000), as well as an acyl carrier, adenylyl sulfate
reductase, PS II oxygen-evolving complex protein 3, anion:sodium symporter,
chloroplast-specific ribosomal protein, and a histidine triad family protein (Coker et al.,
2004). Here we summarize all unique transcripts isolated from the library, place them
into functional categories, assess their extent of conservation in eudicots, and discuss how the transcripts compare with those up-regulated by other wounds and stresses.
Materials and Methods
Plant material, growth conditions, and tissue collection
Tomato plants (Lycopersicon esculentum cv. Heinz) were obtained from Stokes
Seeds (Buffalo, New York) and grown under controlled conditions in the NCSU
Phytotron on a gravel/Peat-Lite substrate (developed at Cornell University, Ithaca, NY).
Growth chambers maintained an environment of 16 h light (300 µmol s-1 m-2) at 26°C and 8 h dark at 21°C. A butane lighter flame was held for 2 seconds 1 cm below the third leaf of 3- to 4-week-old plants (about 12 cm in height with the fourth leaf not fully expanded), causing immediate, localized tissue damage. For construction of the subtractive cDNA library, the fourth leaf was harvested from wounded and control plants
1 hour after wounding and immediately frozen in liquid nitrogen.
Subtractive cDNA library construction, screening and sequencing
A subtractive library was constructed using a PCR-Select cDNA subtraction kit
(Clontech Laboratories, Inc. Palo Alto, CA, USA) such that wound-specific cDNAs were preferentially amplified. Subtraction was performed according to the manufacturer’s recommendations, with only slight modifications as described in Vian et al. (1999). The final PCR-amplified cDNAs were ligated into the T/A vector pT-7 Blue (Novagen,
Madison, WI) for 2 h at room temperature using T4 DNA ligase (Gibco-BRL).
Library clones were grown on LB/ampicillin plates, and single colonies were picked and grown in LB/ampicillin suspension culture. Most clone cDNAs were prepared for sequencing by PCR amplification (35 cycles; 94°C for 45 s denaturing, 56°C for 60 s annealing, 72°C for 60 s extension) using primers specific for the pT-7 Blue cloning site (ACCATGATTACGCCAAGCTC and TAAAACGACGGCCAGTGAAT) and purified with a QIAquick PCR Purification kit (Qiagen, Valencia, CA). Other clones were prepared for sequencing using plasmid mini-preps (Qiagen, Valencia, CA). Sequencing was performed in forward (T7 primer) and reverse (pUC/M13 reverse primer) directions using a Beckman/Coulter CEQ2000XL 8-capillary DNA sequencer (dye-terminator chemistry) at the Genomics Core Research Facility of the University of Nebraska-Lincoln.
DNA sequence analysis
All sequences were screened for vector, primer, and adaptor contamination
(Coker and Davies, 2002) using VecScreen
(http://www.ncbi.nlm.nih.gov/VecScreen/VecScreen.html). Blast searches (version 2.2.7) were performed in GenBank nucleotide databases using tBlastx and Blastn, and in protein databases using Blastx. Alignments and other basic sequence analyses were performed using Vector NTI 7.1 (Informax, Bethesda, MD).
Comparisons with the Arabidopsis genome
Over 155,000 tomato expressed sequence tags (ESTs) representing a variety of tissues are compiled in a single collection in the TIGR Tomato Gene Index (TGI; version 9.0; http://www.tigr.org/tigr-scripts/tgi/T_index.cgi?species=tomato). By comparing tomato ESTs in the TIGR TGI with the Arabidopsis genome (using tBlastx),
Van der Hoeven et al. (2002) divided tomato transcripts into “not homologous” (E value
≥ 0.1), “fast-evolving” (1.0E-15 < E value < 0.1), “intermediate-evolving” (1.0E-50 < E value < 1.0E-15), and “slow-evolving” (E value < 1.0E-50) classes. We identified the
TIGR TGI tentative consensus sequences (Unigenes) used by Van der Hoeven et al.
(2002) which corresponded with the cDNAs in our library. We then repeated the methodology of Van der Hoeven et al. (2002) by performing tBlastx searches against the
Arabidopsis genome (www.Arabidopsis.org).
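The E-value thresholds above can be expressed as a small classifier. The sketch below is only an illustration: the function names and the example E values are hypothetical, and because the stated class boundaries use strict inequalities, the handling of values exactly equal to 1.0E-15 or 1.0E-50 (assigned here to the more conserved class) is an assumption.

```python
from collections import Counter


def conservation_class(e_value: float) -> str:
    """Map a tBlastx E value onto the four classes described in the text."""
    if e_value >= 0.1:
        return "not homologous"
    if e_value > 1.0e-15:
        return "fast-evolving"
    if e_value > 1.0e-50:
        return "intermediate-evolving"
    # Assumption: exact boundary values fall into the more conserved class.
    return "slow-evolving"


def summarize(e_values):
    """Count how many transcripts fall into each conservation class."""
    return Counter(conservation_class(e) for e in e_values)


# Hypothetical best-hit E values, one per transcript.
example_e_values = [2.0, 3.0e-4, 1.0e-22, 1.0e-80, 0.0]
for name, count in summarize(example_e_values).items():
    print(f"{name}: {count}")
```

Only the best Arabidopsis hit for each tentative consensus sequence would be passed to such a function.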
Results
Overview of the subtractive cDNA library
Approximately 100 clones were sequenced from the subtractive cDNA library.
After redundant clones and clones representing different parts of the same transcripts
were discounted, there were 46 cDNAs remaining which represent unique transcripts.
This set of unique transcripts is shown in Table 1. The average length of unique cDNAs
from the library was 270 bp, which was expected since it was constructed using a 4-base
restriction enzyme (suggesting an average fragment length of ~256 bp).
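The expected value of 256 bp follows from simple probability, sketched below. The calculation assumes a random sequence in which the four bases are equally frequent and independent, which real cDNA only approximates; the function name is hypothetical.

```python
def expected_fragment_length(site_length: int, base_frequency: float = 0.25) -> float:
    """Average spacing between recognition sites of a given length.

    Assumes each base occurs independently with the same frequency, so a
    specific site of n bases occurs with probability base_frequency ** n.
    """
    return 1.0 / (base_frequency ** site_length)


# A 4-base recognition site is expected once every 4**4 bases.
print(expected_fragment_length(4))
```

Under the same assumptions, a 6-base recognition site would occur far less often, which is why a 4-base cutter yields short fragments such as those in this library.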
Identifications of transcripts in Table 1 were made using Blast searches of
GenBank and are considered putative, since many transcripts have not been previously
described in tomato plants. Those which have not been described in any plant are listed
as unknowns. Thirty-six of the 46 transcripts fell into 5 functional classes: enzymes of
general metabolism; protein synthesis, modification, and transport; transcription; membrane transport; and photosynthesis and respiration (Table 1). Five transcripts were placed in an “other” class, and 5 unidentifiable cDNAs were labeled as unknowns (Table
1). The largest functional class in terms of number of unique transcripts was
“photosynthesis and respiration” (14 transcripts). A large number of transcripts were also associated with synthesizing, modifying, and/or transporting RNA and protein (10 transcripts). Perhaps most striking was the presence of 4 out of the 6 key enzymes in phenylpropanoid biosynthesis and the activated methyl cycle (PAL, 4CL, SAHH, and
SAMS), suggesting that the plants were increasing their capacity to make secondary metabolites.
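The class sizes quoted above can be checked with a short tally. The counts for the four smaller functional classes are not stated in the text; they are row counts read from Table 1 and should be treated as such.

```python
# Number of unique transcripts per class. The values 14, 5, and 5 are stated
# in the text; the others are row counts read from Table 1.
class_sizes = {
    "enzymes of general metabolism": 8,
    "protein synthesis, modification, and transport": 7,
    "transcription": 3,
    "membrane transport": 4,
    "photosynthesis and respiration": 14,
    "other": 5,
    "unknown": 5,
}

functional = sum(
    size for name, size in class_sizes.items() if name not in ("other", "unknown")
)
rna_and_protein = (
    class_sizes["protein synthesis, modification, and transport"]
    + class_sizes["transcription"]
)
total = sum(class_sizes.values())

print(f"{functional} of {total} transcripts fall into the 5 functional classes")
print(f"{rna_and_protein} transcripts relate to RNA and protein synthesis, modification, or transport")
```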
Table 1. Summary of a subtractive cDNA library containing transcripts systemically up-regulated in the hour after fire damage.
Functional category | Putative identification of cDNA from subtractive library | Size (bp) | Putative function | Previous publication
Enzymes of general metabolism | Phenylalanine ammonia-lyase (PAL5) | 471 | Phenylpropanoid metabolism; catalyzes conversion of phenylalanine to cinnamate |
 | 4-coumarate:coenzyme A ligase (4CL-1) | 241 | Phenylpropanoid metabolism; catalyzes conversion of 4-coumarate to 4-coumaroyl-CoA, etc. |
 | S-adenosyl-L-homocysteine hydrolase (SAHH) | 113 | Activated methyl cycle; catalyzes conversion of S-adenosyl-homocysteine to homocysteine |
 | S-adenosyl-L-methionine synthetase (SAMS) | 117 | Activated methyl cycle; catalyzes conversion of methionine to S-adenosyl-methionine |
 | Catalase | 464 | Degrades hydrogen peroxide |
 | Alpha/beta fold family protein | 59 | Catalytic enzyme, most likely with hydrolase activity | Coker et al., 2004
 | Adenylyl-sulfate reductase | 189 | Sulfate assimilation | Coker et al., 2004
 | Aspartokinase/homoserine dehydrogenase | 121 | Amino acid biosynthesis (lysine, threonine, isoleucine, and methionine) |
Protein synthesis, modification, and transport | 30S ribosomal protein S5 (RPS5) | 279 | Translation; mRNA binding |
 | Elongation factor 1-alpha (LeEF-1) | 147 | Translation; brings aminoacyl-tRNA to the ribosome |
 | Leucine aminopeptidase (LAP) | 330 | Catalyzes the hydrolysis of amino acids from the N terminus of peptides/proteins |
 | UDP-glucose:protein transglucosylase (UPTG2) | 168 | Glycosyltransferase involved in cell wall biosynthesis |
 | Glycosyltransferase | 314 | Transfers oligosaccharides to proteins in the ER to make glycoproteins |
 | Chloroplast-specific ribosomal protein | 303 | Translation in the chloroplast | Coker et al., 2004
 | ARF family GTP-binding protein (ARF1) | 357 | ADP-ribosylation factor; regulation of vesicle-mediated protein transport |
Transcription | Basic leucine zipper (BZIP) | 808 | Leucine zipper domain transcription factor (DNA binding protein) | Stanković et al., 2000
 | Chloroplast mRNA binding protein (CMBP) | 521 | Allows correct processing of chloroplast mRNAs; forms stem-loop structure within 3'-UTR | Vian et al., 1999
 | Phantastica (PHAN) | 180 | Myb family transcription factor required for meristem establishment |
Membrane transport | Aquaporin (MIP2) | 91 | Forms water-selective membrane channels |
 | c subunit of V-ATPase (LeVHA-c2) | 132 | Couples the hydrolysis of ATP to the transport of protons across membranes; alters pH levels | Coker et al., 2003
 | AUX1-like permease (LAX2) | 191 | Auxin transport |
 | Putative anion:sodium symporter | 647 | Transporting anions with sodium through membranes | Coker et al., 2004
Photosynthesis and respiration | Pyruvate kinase (cytosolic isozyme) | 105 | Converts PEP to pyruvate during glycolysis; reaction is the primary regulator of glycolysis |
 | Hydroxypyruvate reductase (HPR) | 398 | Conversion of hydroxypyruvate to glycerate |
 | Rubisco small subunit (RBCS) | 44 | Carboxylation of ribulose-1,5-bisphosphate (RuBP) | Davies et al., 1997
 | Rubisco activase | 252 | Activates rubisco by removing RuBP (in the presence of ATP) |
 | Plastidic aldolase (AldP) | 311 | Catalyzes a reaction in the Calvin Cycle |
 | Photosystem I subunit precursor | 375 | Photosystem I polypeptide |
 | Photosystem I reaction center subunit | 226 | Photosystem I polypeptide |
 | Chlorophyll a/b-binding protein (similar to CAB-1A) | 206 | Photosystem I polypeptide |
 | Chlorophyll a/b-binding protein (similar to CAB-1B) | 208 | Photosystem I polypeptide |
 | Chlorophyll a/b-binding protein (similar to CAB-1C) | 216 | Photosystem I polypeptide |
 | Chlorophyll a/b-binding protein (CAB-11) | 467 | Photosystem I polypeptide |
 | Chlorophyll b-binding protein (CAB-10B) | 329 | Photosystem II polypeptide |
 | Photosystem II oxygen-evolving complex protein 3 | 119 | Photosystem II polypeptide | Coker et al., 2004
 | Photosystem II 10 kD polypeptide | 98 | Photosystem II polypeptide |
Other | Leucine-rich repeat (LRP) protein | 83 | Receptors involved in cell surface recognition of ligands produced by pathogens |
 | Histidine triad (HIT) family protein | 543 | Cell-cycle regulation | Coker et al., 2004
 | Glucosyltransferase (similar to zeatin O-glucosyltransferase) | 483 | Glycosylation of zeatin (a cytokinin important for protection against cytokinin oxidases) |
 | Metallothionein-like protein (LEMT4) | 425 | Binds heavy metals for uptake and detoxification; may protect cellular constituents from oxidative damage |
 | Acyl carrier | 229 | Shuttles intermediates of type II fatty acid synthase system | Coker et al., 2004
Unknown | Unknown; similar to 5-hydroxytryptamine receptor in snails | 286 | ------ |
 | Unknown; similar to queuine tRNA-ribosyltransferase | 150 | ------ |
 | Unknown | 333 | ------ | Coker et al., 2004
 | Unknown | 206 | ------ | Coker et al., 2004
 | Unknown | 83 | ------ |
Library validation
Several lines of evidence suggest that the transcripts presented here are systemically up-regulated after fire damage. First, the transcripts were isolated from the subtractive cDNA library. Second, Northern blots and real-time RT-PCR experiments for various library clones (using RNA derived from tissue independent of the library) consistently show higher mRNA levels in leaf tissue collected in the hour after wounding than in control tissue (Davies et al., 1997; Vian et al., 1999; Stanković et al., 2000; Coker et al., 2004). Detailed timecourse experiments of mRNA accumulation kinetics have been performed for each transcript with a “previous publication” in Table 1. Therefore, the accumulation of at least 1 transcript from each functional class has been explored in detail. For genes previously examined, the most common pattern of transcript accumulation in leaf 4 of three-week old tomato plants following a flame wound on leaf 3 is an increase that peaks within an hour, followed by a rapid decrease. These rapid changes are then followed by a more gradual period of increased, decreased, or unchanged transcript accumulation (Davies et al., 1997; Vian et al., 1999; Stanković et al., 2000; Coker et al., 2004). Finally, many of the transcripts in Table 1 (and homologues of these transcripts) have been implicated in other types of wound and stress responses in previous studies (see Discussion). Thus, there is substantial evidence that the transcripts presented here are, in fact, systemically up-regulated after a leaf is damaged by fire.
Conservation between tomato and Arabidopsis
By comparing tomato transcripts in the TIGR TGI with the Arabidopsis genome
(using tBlastx), Van der Hoeven et al. (2002) divided tomato transcripts into “not
homologous”, “fast-evolving”, “intermediate-evolving”, and “slow-evolving” classes.
The percentage of all tomato transcripts in the TIGR TGI falling into each category is
shown in Figure 1. By repeating their methodology (Blast searching the Arabidopsis genome using the TIGR TGI unigenes which corresponded to our transcripts), we found that 3 (7%), 2 (4%), 10 (22%), and 31 (67%) of the 46 unique transcripts in our library
could be considered as not homologous, fast-evolving, intermediate-evolving, and slow-
evolving, respectively (Figure 1). Therefore, two-thirds of the transcripts in the
subtractive cDNA library are highly conserved (slow-evolving). It follows that fire
damage causes the systemic up-regulation of a set of highly conserved transcripts.
[Figure 1: pie charts. a) Not homologous 17%, fast-evolving 24%, intermediate-evolving 37%, slow-evolving 22%. b) Not homologous 7%, fast-evolving 4%, intermediate-evolving 22%, slow-evolving 67%.]
Figure 1. Conservation of transcript sequences between tomato and Arabidopsis. a) Entire tomato transcriptome (data from Van der Hoeven et al., 2002). b) Transcripts systemically up-regulated in the hour after fire damage.
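The percentages reported for the library transcripts follow directly from the class counts given in the text (3, 2, 10, and 31 of 46). A minimal Python sketch of that arithmetic is shown here; the function and variable names are illustrative assumptions and are not part of the original analysis.

```python
# Counts of library transcripts per conservation class, as reported in the text.
counts = {
    "not homologous": 3,
    "fast-evolving": 2,
    "intermediate-evolving": 10,
    "slow-evolving": 31,
}


def class_percentages(class_counts):
    """Return each class's share of the total, rounded to a whole percent."""
    total = sum(class_counts.values())
    return {name: round(100 * n / total) for name, n in class_counts.items()}


total_transcripts = sum(counts.values())
percentages = class_percentages(counts)
for name, pct in percentages.items():
    print(f"{name}: {counts[name]} of {total_transcripts} ({pct}%)")
```

Rounding to whole percentages reproduces the values quoted above; the same function could be applied to class counts from any other library.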
This high degree of conservation could be related to responses to fire damage (or wounding in general) being ancestral (Bond and van Wilgen, 1996). Most tomato genes involved directly in cell rescue, defense, cell death and aging are not fast evolving as a group (Van der Hoeven et al., 2002).
A potential argument against this conclusion could be that the clones in a cDNA library most likely to be sequenced are those which are highly abundant, and those with high abundance may be highly conserved. However, unlike most cDNA libraries, subtractive libraries constructed using suppression subtractive hybridization have a greatly enhanced presence of low abundance transcripts (Diatchenko et al., 1999).
Furthermore, we have performed EST analysis for the library transcripts and found no evidence that a large proportion of them are highly abundant (Coker et al., 2003; Coker et al., 2004). Finally, most transcripts in each of the 5 functional classes (Table 1) are slow-evolving, and so the trend of high conservation clearly extends beyond just those which might be highly abundant.
Discussion
Transcripts common to other wound and stress responses
Since little work has been done on the response to fire damage at the level of gene
expression, a fundamental question is how it compares with responses to other wounds
and stresses. Homologues of at least half of the transcripts reported in Table 1 have been
previously associated with other wounds or stresses.
Eight transcripts from the subtractive cDNA library encoded enzymes of general
metabolism (Table 1); all of these were slow-evolving, and at least 6 have been
associated with other wounds/stresses. The up-regulation of phenylalanine ammonia-
lyase (encoded by PAL5) and 4-coumarate:coenzyme A ligase (encoded by 4CL) has
very important implications during responses to mechanical wounding, herbivory,
dehydration, and pathogen infection (Edwards et al., 1985; Lawton and Lamb, 1987;
Arimura et al., 2000; Reymond et al., 2000). Phenylalanine is a starting material for the
biosynthesis of coumarins, benzoic acid derivatives, lignin, anthocyanins, isoflavones, condensed tannins, simple phenylpropanoids, and other secondary phenolics (Figure 2).
PAL catalyzes the conversion of L-phenylalanine to trans-cinnamic acid, which is the first committed step of phenylpropanoid biosynthesis (Figure 2). The up-regulation of
PAL5 and 4CL is known to be coordinately enhanced by environmental stresses
(Somssich and Hahlbrock, 1998), and often leads to the production of phenolic, defense-related compounds.
[Figure 2: pathway diagram. Phenylalanine is converted by PAL (releasing NH3) to trans-cinnamic acid, by C4H to para-coumaric acid, and by 4CL (with CoA-SH) to para-coumaroyl CoA. Products shown branching from the pathway: benzoic acid derivatives, simple phenylpropanoids, coumarins, lignin precursors, flavonoids, anthocyanins, and condensed tannins.]
Figure 2. Phenylpropanoid biosynthesis from phenylalanine. The key enzymes phenylalanine ammonia-lyase (PAL), cinnamate 4-hydroxylase (C4H), and 4-coumarate:coenzyme A ligase (4CL) are necessary for downstream production of a wide variety of phenolic compounds (indicated in shaded boxes).
S-adenosyl-L-methionine synthetase (SAMS) and S-adenosyl-L-homocysteine hydrolase (SAHH) are enzymes of the activated methyl cycle (Figure 3) which can methylate practically every class of plant metabolite (Moffatt and Weretilnyk, 2001).
The methyl cycle also leads to the biosynthesis of ethylene (Figure 3). SAMS and SAHH are known to be up-regulated in response to mechanical wounding, pathogen infection, herbivory, and salt stress (Kawalleck et al., 1992; Espartero et al., 1994; Arimura et al.,
2000; Reymond et al., 2000). For example, Kawalleck et al. (1992) found that fungal elicitor strongly up-regulated both mRNAs in parsley.
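[Figure 3: cycle diagram. Labeled compounds: methionine, S-adenosylmethionine, S-adenosylhomocysteine, homocysteine, ACC, and ethylene. Labeled enzymes: SAMS, SAHH, HMT, ACC synthase, and ACC oxidase. Labeled outputs: furanocoumarins and methylated products. The diagram also indicates the methionine salvage cycle.]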
Figure 3. The methyl cycle and ethylene synthesis. The key enzymes of the methyl cycle, homocysteine S-methyltransferase (HMT), S-adenosyl-L-methionine synthetase (SAMS), and S-adenosyl-L-homocysteine hydrolase (SAHH), promote methylation of a variety of compounds, as well as the production of ethylene through the activity of ACC synthase and ACC oxidase.
Catalase serves as a scavenging enzyme to protect against oxidative damage, and is known to be up-regulated during dehydration and freezing protection (Knight and
Knight, 2001). Similarly, adenylyl-sulfate reductase is up-regulated during oxidative stress, and functions to process sulfates (Bick et al., 2001; Coker et al., 2004).
Seven transcripts from the subtractive cDNA library have been associated with protein synthesis, modification, and transport (Table 1), and 6 of these were slow-evolving. Most notable in this category is leucine aminopeptidase (LAP), which is up-regulated after mechanical wounding, pathogen infection, dehydration, and salt stress (Gu et al., 1996; Chao et al., 1999). Although not generally considered “wound proteins”, transcript levels of ribosomal proteins, EF-1 alpha, and other protein modifiers/transporters are sensitive to certain wounds (Arimura et al., 2000).
Three transcripts from the subtractive cDNA library encoded transcription factors
(Table 1). Two of these were slow-evolving, and all 3 have been previously associated with other wounds or stresses. Basic leucine zippers have been associated with pathogen infection (Jakoby et al., 2002), chloroplast mRNA-binding protein with mechanical wounding (Vian et al., 1999), and phantastica with the feeding sites of root-knot nematodes (Koltai et al., 2001).
Four transcripts from the subtractive cDNA library are associated with membrane transport (Table 1). Three of these were slow-evolving, and homologues of at least 3 have been previously associated with other wounds or stresses. Aquaporins, c subunits of vacuolar ATPase, and the anion:sodium symporter are up-regulated during herbivory
(Arimura et al., 2000), salt stress (Chen et al., 2002), and the presence of toxins
(Bobrowicz et al., 1997), respectively.
Five transcripts from the library do not fit the other functional categories (Table
1). Most notable among these was a metallothionein-like protein associated with mechanical wounding, herbivory, dehydration, and metal detoxification (Giritch et al.,
1998; Arimura et al., 2000; Reymond et al., 2000). Also, leucine-rich repeat proteins are commonly associated with pathogen infection (Tornero et al., 1996).
In summary, there are many transcripts up-regulated during a systemic response to fire damage similar to those up-regulated in response to other wounds and stresses.
Since most transcripts in all 5 functional categories (67% overall) were slow-evolving, it
also appears that these transcripts are highly conserved in plants. Taken together, both
conclusions support the notion that plants respond to multiple wounds/stress stimuli by
common, highly conserved mechanisms.
Transcripts not common to other wound and stress responses
Most previous studies show that photosynthetic genes such as rubisco small
subunit are unaffected or down-regulated by wounding, stress, or pathogen infection
(Hermsmeier et al., 2001). This down-regulation is associated with a shift of carbon from
primary metabolism to defense. In the current investigation of the systemic response to
fire damage, however, 14 transcripts from the subtractive cDNA library encoded proteins
involved in photosynthesis and respiration (Table 1). The systemic accumulation of
rubisco small subunit and photosystem II oxygen-evolving complex protein 3 after fire
damage has been described previously (Davies et al., 1997; Coker et al., 2004). It may also be notable that an enzyme of general metabolism in this study which has not been previously associated with wound/stress responses, aspartokinase/homoserine dehydrogenase, is regulated by photosynthetic-related signals (Zhu-Shimoni and Galili,
1998).
It seems evident that fire damage could provoke a very different response than other wounds with regard to energy metabolism. For example, it is possible that oxidative damage and leaf tissue damage caused by fire decrease photosynthetic capacity, which must then be restored. In the natural environment, fire damage is very different from pathogen attack or herbivory in that photosynthesis and growth can be of immediate importance (perhaps to replace leaves). It is also not unprecedented for photosynthetic genes to play a role in a stress response. For example, homologues of 3 genes in this study (plastidic aldolase, photosystem II OEC protein 3, and rubisco activase) have been associated with responses to salt stress (Yamada et al., 2000;
Sugihara et al., 2000; Gu et al., 2004). The systemic up-regulation of photosynthetic genes after fire damage raises a very important question: Are photosynthetic genes necessary components of the early response to fire damage, or is their up-regulation merely the result of a perturbation in an interconnected network? This cannot yet be answered, and will be an interesting topic for future study.
References
Arimura G, Tashiro K, Kuhara S, Nishioka T, Ozawa R, Takabayashi J (2000) Gene responses in bean leaves induced by herbivory and by herbivore-induced volatiles. Biochem Biophys Res Commun 277: 305-310.
Bertram L, Lercari B (2000) Phytochrome A and phytochrome B1 control the acquisition of competence for shoot regeneration in tomato hypocotyl. Plant Cell Reports 19: 604-609.
Bick JA, Setterdahl AT, Knaff DB, Chen Y, Pitcher LH, Zilinskas BA, Leustek T (2001) Regulation of the plant-type 5'-adenylyl sulfate reductase by oxidative stress. Biochem 40: 9040-9048.
Bobrowicz P, Wysocki R, Owsianik G, Goffeau A, Ulaszewski S (1997) Isolation of three contiguous genes, ACR1, ACR2 and ACR3, involved in resistance to arsenic compounds in the yeast Saccharomyces cerevisiae. Yeast 13: 819-28.
Bond WJ, van Wilgen BW (1996) Fire and plants. Chapman & Hall: London.
Braam J, Davis RW (1990) Rain-, wind-, and touch-induced expression of calmodulin related genes in Arabidopsis. Cell 60:357-364.
Chao WS, Gu Y-Q, Pautot V, Bray EA, Walling LL (1999) Leucine aminopeptidase RNAs, proteins, and activities increase in response to water deficit, salinity, and wound signals systemin, methyl jasmonate, and abscisic acid. Plant Physiol 120: 979-992.
Chen X, Kanokporn T, Zeng Q, Wilkins TA, Wood AJ (2002) Characterization of the V-type H(+)-ATPase in the resurrection plant Tortula ruralis: accumulation and polysomal recruitment of the proteolipid c subunit in response to salt-stress. J Exp Bot 53: 225-32.
Coker JS, Davies E (2002) Correspondence re: A.H. Ree et al., Expression of a novel factor in human breast cancer cells with metastatic potential (Cancer Res., 59: 4675-4680, 1999). Cancer Res 62: 4164-4165.
Coker JS, Jones D, Davies E (2003) Identification, conservation, and relative expression of V-ATPase cDNAs in tomato plants. Plant Molecular Biology Reporter 21: 145-158.
Coker JS, Vian A, Davies E (2004) Identification, accumulation, and functional prediction of novel tomato transcripts systemically up-regulated after fire damage. Submitted.
Davies E, Vian A, Vian C, Stankovic B (1997) Rapid systemic up-regulation of genes after heat-wounding and electrical stimulation. Acta Physiologiae Plantarum 4: 571-576.
Debano LF, Neary DG, Ffolliott PF (1998) Fire effects on ecosystems. Wiley & Sons, Inc.: New York.
Diatchenko L, Lukyanov S, Lau YF, Siebert PD (1999) Suppression subtractive hybridization: a versatile method for identifying differentially expressed genes. Methods Enzymol 303: 349-80.
Drewa PB, Platt WJ, Moser EB (2002) Fire effects on resprouting of shrubs in headwaters of southeastern longleaf pine savannas. Ecology 83: 755-767.
Edwards K, Cramer CL, Bolwell GP, Dixon RA, Schuch W, Lamb CJ (1985) Rapid transient induction of phenylalanine ammonia-lyase mRNA in elicitor-treated bean cells. Proc Natl Acad Sci USA 82: 6731-6735.
Espartero J, Pintor-Toro JA, Pardo JM (1994) Differential accumulation of S-adenosylmethionine synthetase transcripts in response to salt stress. Mol Biol 25: 217-227.
Giritch A, Ganal M, Stephan UW, Baumlein H (1998) Structure, expression and chromosomal location of the metallothionein-like gene family of tomato. Plant Molecular Biol 37: 701-714.
Graham JS, Hall G, Pearce G, Ryan CA (1986) Regulation of synthesis of proteinase inhibitors I and II mRNAs in leaves of wounded tomato plants. Planta 169: 399-405.
Green TR, Ryan CA (1972) Wound-induced proteinase inhibitor in plant leaves – possible defense mechanism against insects. Science 175: 776-777.
Gu YQ, Chao WS, Walling LL (1996) Localization and post-translational processing of the wound-induced leucine aminopeptidase proteins of tomato. J Biol Chem 271: 25880-25887.
Gu R, Fonseca S, Puskas LG, Hackler L Jr, Zvara A, Dudits D, Pais MS (2004) Transcript identification and profiling during salt stress and recovery of Populus euphratica. Tree Physiol 24:265-76.
Hermsmeier D, Schittko U, Baldwin IT (2001) Molecular interactions between the specialist herbivore Manduca sexta (Lepidoptera, Sphingidae) and its natural host Nicotiana attenuata. I. Large-scale changes in the accumulation of growth- and defense-related plant mRNAs. Plant Physiol 125: 683-700.
Howe GA, Lee GI, Itoh A, Li L, DeRocher AE (2000) Cytochrome P450-dependent metabolism of oxylipins in tomato. Cloning and expression of allene oxide synthase and fatty acid hydroperoxide lyase. Plant Physiol 123: 711-24.
Jakoby M, Weisshaar B, Droge-Laser W, Vicente-Carbajosa J, Tiedemann J, Kroj T, Parcy F (2002) bZIP transcription factors in Arabidopsis. Trends in Plant Science 7: 106-111.
Kawalleck P, Plesch G, Hahlbrock K, Somssich IE (1992) Induction by fungal elicitor of S-adenosyl-L-homocysteine hydrolase mRNAs in cultured cells and leaves of Petroselinum crispum. Proc Natl Acad Sci USA 89: 4713-4717.
Knight H, Knight MR (2001) Abiotic stress signaling pathways: specificity and cross-talk. Trends in Plant Science 6: 262-267.
Koltai H, Dhandaydham M, Opperman C, Thomas J, Bird D (2001) Overlapping plant signal transduction pathways induced by a parasitic nematode and a rhizobial endosymbiont. Mol Plant Microbe Interact 14:1168-77.
Lawton MA, Lamb CJ (1987) Transcriptional activation of plant defense genes by fungal elicitor, wounding and infection. Mol Cell Biol 7: 335-341.
León J, Enrique R, Sánchez-Serrano JJ (2001) Wound signalling in plants. J Exper Bot 52: 1-9.
Low RK, Prakash AP, Swarup S, Goh CJ, Kumar PP (2001) A differentially expressed bZIP gene is associated with adventitious shoot regeneration in leaf cultures of Paulownia kawakamii. Plant Cell Reports 20: 696-700.
Moffatt BA, Weretilnyk EA (2001) Sustaining S-adenosyl-L-methionine-dependent methyltransferase activity in plant cells. Physiologia Plantarum 113: 435-442.
Pearce G, Strydom D, Johnson S, Ryan CA (1991) A polypeptide from tomato leaves induces wound-inducible proteinase inhibitor proteins. Science 253: 895-898.
Peres LE-P, Morgante PG, Vecchi C, Kraus JE, van Sluys MA (2001) Shoot regeneration capacity from roots and transgenic hairy roots of tomato cultivars and wild related species. Plant Cell Tissue and Organ Culture 65: 37-44.
Platt WJ, Evans GW, Davis MM (1988) Effects of fire season on flowering of forbs and shrubs in longleaf pine forests. Oecologia 76: 353-363.
Preston CA, Baldwin IT (1999) Positive and negative signals regulate germination in the post-fire annual, Nicotiana attenuata. Ecology 80: 481-494.
Reymond P, Weber H, Damond M, Farmer EE (2000) Differential gene expression in response to mechanical wounding and insect feeding in Arabidopsis. Plant Cell 12: 707-719.
Schaller A, Ryan CA (1996) Molecular cloning of a tomato leaf cDNA encoding an aspartic protease, a systemic wound response protein. Plant Mol Biol 31: 1073-1077.
Somssich IE, Hahlbrock K (1998) Pathogen defence in plants – a paradigm of biological complexity. Trends in Plant Science 3: 86-90.
Stanković B, Vian A, Henry-Vian C, Davies E (2000) Molecular cloning and characterization of a tomato cDNA encoding a systemically wound-inducible bZIP DNA-binding protein. Planta 212: 60-66.
Sugihara K, Hanagata N, Dubinsky Z, Baba S, Karube I (2000) Molecular characterization of cDNA encoding oxygen evolving enhancer protein 1 increased by salt treatment in the mangrove Bruguiera gymnorrhiza. Plant Cell Physiol 41: 1279-1285.
Takashina T, Suzuki T, Egashira H, Imanishi S (1998) New molecular markers linked with the high shoot regeneration capacity of the wild tomato species Lycopersicon chilense. Breeding Science 48: 109-113.
Taylor IB (1986) Biosystematics of the tomato. The tomato crop: A scientific basis for improvement. Eds Atherton JG and Rudich J. Chapman and Hall Ltd: New York.
Taylor JLS, van Staden J (1998) Plant-derived smoke solutions stimulate the growth of Lycopersicon esculentum roots in vitro. Plant Growth Regulation 26: 77-83.
Tornero P, Mayda E, Gomez M, Canas L, Conejero V, Vera P (1996) Characterization of LRP, a leucine-rich repeat (LRR) protein from tomato plants that is processed during pathogenesis. Plant J 10: 315-330.
Van der Hoeven R, Ronning C, Giovannoni J, Martin G, Tanksley S (2002) Deductions about number, organization, and evolution of genes in the tomato genome based on analysis of a large expressed sequence tag collection and selective genomic sequencing. Plant Cell 14: 1441-1456.
Verdaguer D, Ojeda F (2002) Root starch storage and allocation patterns in seeder and resprouter seedlings of two Cape Erica (Ericaceae) species. Amer J Botany 89: 1189-1196.
Vian A, Henry-Vian C, Davies E (1999) Rapid and systemic accumulation of chloroplast mRNA-binding protein transcripts after flame stimulus in tomato. Plant Physiol 121: 517-524.
Yamada S, Komori T, Hashimoto A, Kuwata S, Imaseki H, Kubo T (2000) Differential expression of plastidic aldolase genes in Nicotiana plants under salt stress. Plant Sci 154: 61-69.
Zhu-Shimoni JX, Galili G (1998) Expression of an Arabidopsis aspartate kinase/homoserine dehydrogenase gene is metabolically regulated by photosynthesis-related signals but not by nitrogenous compounds. Plant Physiol 116: 1023-1028.
Chapter 7
Conclusions and Future Directions
Conclusions and future directions regarding the development of methods for gene expression analysis using sequence data: Blueprint for a universal sequencing-based method of gene expression analysis
Abstract
Modern methods of measuring gene expression rely upon complementary or antibody binding, followed by some measurement of radiation emitted or absorbed by a large population of molecules. These indirect measurements of gene expression have many limitations which are inherent to all binding-radiation methods. A direct measurement of gene expression would be to sequence individual cDNA molecules to establish actual numbers of molecules present. This section presents the advantages of sequencing-based measurements of gene expression and offers a set of specifications upon which future analyses could be based. The advantages of a direct sequencing method include integration of gene discovery with expression analyses, universal comparability between transcript frequencies, and universal comparability between experiments. Currently, the major obstacles for a universal sequencing-based method for gene expression analysis involve 1) technology for sequencing individual cDNA molecules, 2) sequence quality control, and 3) methods for data analysis. To elucidate what advances are needed to overcome these obstacles, specifications and a blueprint for a system which would allow gene expression analysis in any species are provided.
Disadvantages of binding-radiation methods
Microarrays, northern and western blots, ELISA, RT-PCR, and the other popular
methods for measuring gene expression have two common traits. First, they all rely upon complementary binding (for measuring RNA or cDNA) or binding to an antibody (for measuring proteins) for specificity. Second, they depend upon some measurement of electromagnetic radiation (some wavelength of light), which is (in theory) proportional to the amount of cDNA, RNA, or protein present. Both characteristics allow one to conclude that all of these methods are indirect – they do not directly measure the number of cDNA, RNA, or protein molecules. In other words, binding-radiation methods measure an amount of radiation associated with the binding capacity of large populations of molecules, but do not find exact numbers of individual molecules.
Two sets of disadvantages are associated with binding-radiation methods. The first set is related to binding. Indirect methods require hybridization probes and antibodies for every gene in a species. Generating probes and antibodies is time/labor intensive and prone to error, making it practical only for economically important species.
Furthermore, because all probes and antibodies are different, their binding efficiencies are different. Therefore, it is not usually possible to make quantitative comparisons between the expression levels of different genes, or even comparisons using the same gene when different probes/antibodies are used. By contrast, direct methods require one to isolate RNA (or protein), convert RNA to cDNA, perhaps clone the cDNA (depending upon the sequencing technology), and then perform sequencing. Therefore, after RNA
(or protein) has been isolated, there could be one protocol suitable for measuring the expression of every gene for every species, and one software tool to analyze the data.
The second set of disadvantages associated with binding-radiation methods is related to detection of the light/radiation. Indirect methods usually have an upper limit of quantification due to saturation (too much light to measure accurately) and/or a lower limit due to a threshold of detection (too little light to measure accurately, if at all). All such problems are eliminated by direct methods. In other words, direct methods will reliably quantify the expression of both low and high-abundance transcripts (and proteins).
Advantages of sequencing methods
The direct method of measuring gene expression would be to sequence the individual transcript (or protein) molecules of a given cell or tissue and find the number of individual transcripts. Thus, by sequencing each transcript (or protein), one attains a direct, absolute measurement of gene expression. Sequencing individual full-length transcripts (or proteins) from a given cell or tissue represents the ultimate measurement of gene expression. Because of the increased specificity and precision of sequencing data, direct methods will be more reliable for measuring gene expression in the following scenarios:
a) Comparing homologous genes in the same species.
b) Comparing homologous genes in different species.
c) Comparing splice variants of the same gene.
d) Identifying SNPs and/or unexpected sequence variations between individuals.
e) Measuring small variations in gene expression.
f) Any other experiment where exact quantification (number of a particular transcript in a population of transcripts) is needed.
The indirect binding-radiation methods are already widely used, whereas the direct sequencing methods are used less because of the time and high costs needed to sequence thousands of cDNAs (or other type of molecule). However, sequencing costs are decreasing and technology is being developed so that individual DNA molecules can be sequenced (Braslavsky et al., 2003). In the coming decades, advances in nanotechnology are likely to make direct sequencing methods a reality and indirect binding-radiation methods obsolete for reasons given below.
Integration of gene discovery with expression analyses
Gene discovery and measurement of gene expression are currently two distinct
steps. In general, sequencing is the preferred method of gene discovery, whereas
binding-radiation methods are the preferred methods of measuring gene expression. To
design an experiment to measure gene expression using binding-radiation methods, one
must already know something about the gene to be measured. Using sequencing methods
(i.e. sequencing a cDNA library), however, one needs no prior knowledge of the genes in
question – both gene discovery and measurement of gene expression occur using one
method.
Universal comparability between transcript frequencies
As stated above, differences in binding efficiencies prevent direct, quantitative comparisons between expression levels of different genes. Figure 1 illustrates the number of direct comparisons which could be made between 2 transcript populations using binding-radiation and sequencing methods. For binding-radiation methods, binding efficiencies and light emission may be different for different transcripts, allowing direct comparisons to be made only for identical transcripts (i.e. comparing transcript A in population 1 with transcript A in population 2). On the other hand, sequencing methods generate transcript frequencies which allow direct comparisons between any 2 transcripts.
In the example using only 3 transcripts from 2 transcript populations in Figure 1, a
binding-radiation method would allow only 3 direct comparisons, while a sequencing
method would allow 15 comparisons.
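[Figure 1: diagram. In each panel, transcripts A, B, and C are listed for Population 1 and Population 2, with lines joining the transcripts that can be reliably compared. Panel a) Binding-radiation method. Panel b) Sequencing method.]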
Figure 1. Comparisons that can be made between 2 transcript populations using binding-radiation (a) and sequencing (b) methods. Each line represents a reliable comparison between transcript levels. Binding-radiation methods allow reliable comparisons only between identical transcripts, whereas sequencing methods allow comparisons between all transcripts.
The mathematical expression for the number of comparisons which could be made using a sequencing method is described by
$$C = \frac{n^{2} - n}{2} \qquad \text{[Eq. 1]}$$
where C is the number of possible comparisons and n is the total number of transcript frequencies in all populations. For example, a modern microarray experiment using 2 gene chips with 30,000 probes each allows 30,000 comparisons to be made. A sequencing method which generated data for the same 30,000 transcripts from the same 2 transcript populations would allow 1.8 × 10^9 comparisons (where n = 30,000 × 2). Thus, the potential usefulness of the data increases 60,000-fold. This number is actually an underestimate, since direct comparisons between groups of transcripts would also be possible (i.e. all actin genes, all photosynthetic genes, etc.).
The scale of complexity of this problem in plants may be visualized by an analogy
using plant ecology in the United States. There are approximately 30,000 plant species
and 50 states in the U.S. Similarly, there are approximately 30,000 genes and 50 cell
types in higher plants. Therefore, in terms of complexity, comparing the numbers of all
transcripts between 2 cell types is analogous to comparing the numbers of all species
between 2 U.S. states (equating to 1.8 × 10^9 possible comparisons). Comparing transcript
populations for all 50 cell types (or species in 50 states) would result in 1.125 × 10^12 possible comparisons. Thus, monumental advances in bioinformatics and computational biology will be necessary to deal with the vast amount of data generated by sequence-based expression studies.
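The figures quoted in this section can be checked against Equation 1 with a few lines of Python. This is only an arithmetic sketch: the function name is an illustrative assumption, and n is taken, as in the text, to be the number of transcripts multiplied by the number of populations.

```python
def possible_comparisons(n):
    """Eq. 1: number of distinct pairs among n transcript frequencies."""
    return (n * n - n) // 2


# 3 transcripts in each of 2 populations (the Figure 1 example).
small_example = possible_comparisons(3 * 2)

# 30,000 transcripts in each of 2 populations.
two_populations = possible_comparisons(30_000 * 2)

# 30,000 transcripts in each of 50 cell types.
fifty_populations = possible_comparisons(30_000 * 50)

# Gain over a microarray experiment that allows 30,000 comparisons.
fold_gain = two_populations / 30_000

print(small_example)       # 15
print(two_populations)     # 1799970000, i.e. about 1.8 x 10^9
print(fifty_populations)   # 1124999250000, i.e. about 1.125 x 10^12
print(fold_gain)           # 59999.0, i.e. about 60,000-fold
```

The exact values are slightly below the rounded figures in the text because Equation 1 excludes comparing a transcript frequency with itself.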
Universal comparability between experiments
A common problem with binding-radiation methods is the difficulty in comparing
one experiment with another. As stated above, 2 different hybridization probes for the
exact same gene often do not allow a direct, quantifiable comparison. The result is that
the number of comparisons which can be made is restricted to the number of experiments
in a particular laboratory (or sometimes using a particular method). For example, using
binding-radiation methods, it is usually difficult (or impossible) for a laboratory which
has quantified expression levels for 3 transcripts to make direct comparisons with another
laboratory which quantified the same 3 transcripts in a different tissue. On the other
hand, sequencing methods allow such comparisons and would therefore facilitate
universal comparability within the literature and gene expression databases using
standard units such as “number of a particular transcript / number of total transcripts” or
“number of a particular transcript / cell”.
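To illustrate the first of these units, the following Python sketch converts lists of sequenced cDNAs (already assigned to transcripts) into transcript frequencies. The library contents, transcript labels, and function name are hypothetical and serve only to show that frequencies expressed in the same unit can be compared between any two transcripts and any two libraries.

```python
from collections import Counter
from fractions import Fraction


def transcript_frequencies(assignments):
    """Map each transcript to (its count / total transcripts sequenced)."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {name: Fraction(count, total) for name, count in counts.items()}


# Hypothetical libraries: each entry is one sequenced cDNA, labeled by transcript.
library_1 = ["A"] * 6 + ["B"] * 3 + ["C"] * 1     # 10 cDNAs sequenced
library_2 = ["A"] * 2 + ["B"] * 2 + ["C"] * 16    # 20 cDNAs sequenced

freq_1 = transcript_frequencies(library_1)
freq_2 = transcript_frequencies(library_2)

# Same transcript, different libraries.
change_in_c = freq_2["C"] / freq_1["C"]

# Different transcripts, different libraries: also a valid comparison,
# because both values are fractions of the total transcripts sequenced.
a1_versus_b2 = freq_1["A"] / freq_2["B"]

print(change_in_c, a1_versus_b2)
```

Exact fractions are used here instead of floating-point numbers so that the ratios are not affected by rounding; real data would also need the statistical treatment discussed later in this chapter.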
Obstacles and specifications for a universal sequencing-based method
Despite the future potential of sequencing-based methods for gene expression analysis, binding-radiation methods are currently more practical and robust for most experimental questions. Because sequencing is currently slow and expensive, it is impractical for many labs to sequence enough to complete an entire gene expression study. For tomato plants, which are among the most economically important species, there are 27 cDNA libraries publicly available (in the TIGR Tomato Gene Index) with more than 500 ESTs. Therefore, although there are enough data to perform gene expression analyses using sequence data, the number of experimental questions which can be addressed is limited.
Currently, there are 3 major obstacles impeding the use of sequencing for studies of gene expression (Table 1): 1) technology for sequencing individual cDNA molecules, 2) sequence quality control, and 3) methods for data analysis. With regard to the first obstacle, it has been shown that it is possible to sequence an individual DNA molecule (Braslavsky et al., 2003). However, the current method is successful for fewer than 10 nucleotides at a time. For single-molecule sequencing to be useful, full-length cDNA molecules must be sequenced rapidly, demanding significant advances in sequencing technology (Table 1).
The second obstacle is a problem which is currently plaguing genome projects, transcriptome projects, and public sequence databases: sequence quality control (Table 1). Identifying false sequences (vectors, primers, adaptors, DNA from other organisms, etc.) in a sequencing project has proven to be a formidable task. The three main reasons for this are 1) contaminating sequences are often very short (<20 bp), 2) the full DNA sequences of only a few organisms are known, and so identifying transcripts as “native” or “contaminating” is often challenging, and 3) a wide variety of molecular rearrangements (due to transposons, adaptor dimerization, etc.) can take place during the creation of a DNA library. When using sequencing for gene expression analysis, poor sequence quality control could lead to miscalculating the total number of transcripts in a population and, worse, associating the wrong cDNA with an organism.
Currently, there is no tool that will reliably identify all types of contaminating DNA sequences. To eliminate such problems, an “EST Quality Algorithm” (EQUAL) must be devised and linked to public sequence databases such as GenBank and the TIGR Gene
Indices.
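To make the quality-control problem concrete, the following is a minimal sketch of the simplest kind of screen that an algorithm such as EQUAL would need to include: an exact-match search of each EST for known artifact sequences on both strands. The artifact entries, function names, and example read are assumptions made for illustration and are not taken from Chapter 2. Exact matching of this kind would not detect the very short (<20 bp) or rearranged contaminants described above, which is part of why no reliable tool yet exists.

```python
# Sketch of an exact-match screen for cloning artifacts in EST reads.
# The artifact entries are placeholders for illustration, not a curated library.
ARTIFACTS = {
    "example_adaptor": "GGCCACGCGTCGACTAGTAC",
    "example_primer": "GTAAAACGACGGCCAGT",
}

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_artifacts(est, artifacts=ARTIFACTS):
    """Return a list of (artifact name, 0-based start, strand) exact matches."""
    est = est.upper()
    hits = []
    for name, artifact in artifacts.items():
        forward = artifact.upper()
        probes = [("+", forward)]
        reverse = reverse_complement(forward)
        if reverse != forward:  # avoid double-reporting palindromic probes
            probes.append(("-", reverse))
        for strand, probe in probes:
            start = est.find(probe)
            while start != -1:
                hits.append((name, start, strand))
                start = est.find(probe, start + 1)
    return hits


if __name__ == "__main__":
    # Hypothetical read with the placeholder primer embedded at position 12.
    read = "ACGTACGTACGT" + "GTAAAACGACGGCCAGT" + "TTTT"
    print(find_artifacts(read))
```

A practical screen would also need approximate matching and a comparison against sequences from other organisms, neither of which is attempted here.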
Table 1. Specifications for a universal sequencing-based method of gene expression analysis. Specifications are based upon the 3 major obstacles shown in the left-hand column. A method which fulfilled the proposed specifications would allow for rapid gene discovery and gene expression analyses in any species. Significant advances are necessary for each specification to become a practical reality.
Obstacles (I-III), each followed by the numbered specifications for a universal sequencing-based method and their references.

I. Technology for sequencing individual cDNA molecules
   1. Able to sequence individual cDNA molecules. Reference: Braslavsky et al., 2003
   2. Able to sequence individual full-length cDNA molecules. Reference: (none)
   3. Able to sequence a large number of individual full-length cDNA molecules rapidly. Reference: (none)

II. Sequence quality control
   4. High sequence quality (Phred scores) for the entire length of cDNA molecules. Reference: Ewing and Green, 1998
   5. Able to distinguish cDNA of one organism from that of pathogens, etc. Reference: See Chapter 2
   6. Able to distinguish genuine cDNA from cloning artifacts (vectors, primers, adapters, etc.). Reference: See Chapter 2

III. Methods for data analysis
   7. Clustering algorithms. Reference: Pertea et al., 2003
   8. Statistics (and visualization tools) to compare transcript frequencies. References: Stekel et al., 2000; Romualdi et al., 2001
   9. Methods for evaluating internal controls and/or housekeeping controls. Reference: See Chapter 3 (Coker and Davies, 2003)
   10. Public database of expression data which is integrated with traditional sequence databases. Reference: Quackenbush et al., 2001
The third obstacle, methods for data analysis, is currently being addressed in the context of cDNA library analysis (Table 1). To allow uniformity in the treatment of data among researchers, algorithms for clustering identical/similar transcripts, computing transcript frequencies, comparing transcript frequencies, ensuring sound internal/housekeeping controls, and storing and searching sequences must be brought together into one integrated “EST Analysis Algorithm” (EANAL). To be most useful, EANAL must be linked to public sequence databases such as GenBank and the TIGR Gene Indices.
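Two of the steps listed above, computing and comparing transcript frequencies, can be sketched as follows. The sketch assumes that each EST has already been assigned to a cluster and that a library is simply a list of cluster identifiers; the identifiers, function names, and the scaling to 10,000 ESTs are assumptions made for illustration. The ratio it reports is descriptive only and is not one of the statistical tests cited in Table 1.

```python
from collections import Counter


def frequencies_per_10k(cluster_ids):
    """Map each cluster ID to its EST count per 10,000 ESTs in one library."""
    counts = Counter(cluster_ids)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no ESTs")
    return {cid: 10_000 * n / total for cid, n in counts.items()}


def compare_libraries(lib_a, lib_b):
    """Return rows of (cluster ID, freq in A, freq in B, B/A ratio)."""
    freq_a = frequencies_per_10k(lib_a)
    freq_b = frequencies_per_10k(lib_b)
    rows = []
    for cid in sorted(set(freq_a) | set(freq_b)):
        a = freq_a.get(cid, 0.0)
        b = freq_b.get(cid, 0.0)
        # A transcript absent from library A has an undefined ratio; report inf.
        ratio = b / a if a > 0 else float("inf")
        rows.append((cid, a, b, ratio))
    return rows


if __name__ == "__main__":
    # Hypothetical libraries: one cluster ID per sequenced EST.
    control = ["TC1"] * 2 + ["TC2"] * 8
    wounded = ["TC1"] * 5 + ["TC2"] * 5
    for row in compare_libraries(control, wounded):
        print(row)
```

Normalizing to a fixed library size keeps libraries of different depths comparable, but deciding whether a difference is significant still requires a test designed for tag-count data.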
Based on the major obstacles stated above, a basic set of specifications emerges (Table 1) which would lead to a universal sequencing-based method of gene expression analysis, shown in generalized form in Figure 2.
[Figure 2 diagram. Panel I, DNA sequencing microchip: well for application of cDNA sample; purifier (to filter non-DNA); separator (of individual cDNAs); stabilizer (of individual cDNA molecules); microprocessor; sequencer; output: sequence data. Panel II, EST quality algorithm (EQUAL). Panel III, EST analysis algorithm (EANAL).]
Figure 2. Theoretical blueprint for a universal sequencing-based method of gene expression analysis.
References
Braslavsky I., Herbert B., Kartalov E., Quake S.R. 2003. Sequence information can be obtained from single DNA molecules. Proc. Natl. Acad. Sci. 100: 3960-3964.
Coker J.S., Davies E. 2003. Selection of candidate housekeeping controls in tomato plants using EST data. Biotechniques 35: 740-748.
Ewing B., Green P. 1998. Base-calling of automated sequencer traces using Phred. II. Error probabilities. Genome Res. 8: 186-194.
Pertea G., Huang X., Liang F., Antonescu V., Sultana R., Karamycheva S., Lee Y., White J., Cheung F., Parvizi B., Tsai J., Quackenbush J. 2003. TIGR Gene Indices clustering tools (TGICL): a software system for fast clustering of large EST datasets. Bioinformatics 19: 651-652.
Quackenbush J., Cho J., Lee D., Liang F., Holt I., Karamycheva S., Parvizi B., Pertea G., Sultana R., White J. 2001. The TIGR Gene Indices: analysis of gene transcript sequences in highly sampled eukaryotic species. Nucleic Acids Res. 29: 159-164.
Romualdi C., Bortoluzzi S., Danieli G.A. 2001. Detecting differentially expressed genes in multiple tag sampling experiments: comparative evaluation of statistical tests. Hum. Mol. Genet. 10: 2133-2141.
Stekel D.J., Git Y., Falciani F. 2000. The comparison of gene expression from multiple cDNA libraries. Genome Res. 10: 2055-2061.
Conclusions and future directions regarding the biology of systemic responses to fire damage
The overall goal of this dissertation was to characterize the array of transcripts which systemically accumulate after fire damage. Several conclusions regarding the biology of systemic responses to fire damage have resulted.
After a tomato leaf is damaged by fire, many different transcripts accumulate in other parts of the plant. Most of these transcripts are highly conserved in plants, suggesting that the observed systemic response to fire damage is not unique to tomato plants and could even be a universal phenomenon in higher plants. Most of the transcripts fall into 5 functional classes: 1) enzymes of general metabolism; 2) protein synthesis, modification, and transport; 3) transcription; 4) membrane transport; 5) photosynthesis and respiration. Most of the transcripts were already present in unwounded tissues, but at lower levels than after wounding. After wounding, the accumulation of most transcripts peaked within 30 to 60 minutes, followed by a return to basal levels within 3 hours.
The systemic response to fire damage has components similar to those of other wound and stress responses. These include 4 of the 6 key enzymes of phenylpropanoid biosynthesis and the activated methyl cycle, proteinase inhibitors, leucine aminopeptidase, and many others. These common components suggest that there is some universality in plant responses to different types of wounding and/or stress. On the other hand, the systemic response to fire damage has components different from those of other wound responses. Most notable among these were transcripts associated with photosynthesis and respiration. It is unclear if the accumulation of photosynthesis
transcripts is just a perturbation in an interconnected genetic network, or if it indicates a
fundamental difference in the response to fire damage versus other wound responses.
Based on these conclusions, there are several future directions for studies on the
systemic response to fire damage (flame wounding) in plants. Most importantly, the 46
transcripts which have been shown to accumulate after fire damage should be tested for
functionality. Do their corresponding polysomal mRNA levels increase? Do their
corresponding protein levels increase? If polysomal mRNA and/or protein levels do increase, then how do their kinetics compare with the accumulation of total mRNA?
What are these proteins doing? Are the transcripts associated with photosynthesis, in particular, being used to make functional proteins?
The second future direction is to extend the studies of gene expression in this dissertation to include root tissues. In natural ecosystems, fire often burns most or all of the aboveground mass of a plant. Therefore, it makes sense that a fundamental part of the systemic response to fire damage might include gene expression in the roots. Do the 46 transcripts which accumulate in leaves also accumulate in roots?
The third future direction is to understand how the entire plant transcriptome changes after flame wounding. The subtractive cDNA library allowed the discovery of up-regulated genes, but cannot tell us how many genes are up-regulated. Nor can the subtractive library tell us how many genes are down-regulated. Microarray experiments using tomato or Arabidopsis thaliana would provide a global perspective of transcript
changes in plants. Microarray experiments could also be used to answer parts of the first
and second future directions above (accumulation of polysomal mRNA and transcript
accumulation in roots).
Because fire in natural ecosystems is varied in its intensity, speed, and chemical composition, laboratory experiments should eventually test how biological responses change as the stimulus (fire) changes. How does systemic transcript accumulation change when a leaf is charred? Do different types of fires evoke different types of responses? Can smoke by itself cause changes in gene expression?
The final future direction must be to take the knowledge gained from tomato plants in a laboratory setting and apply it to studying plants in the natural environment.
Even though the laboratory is powerful in its ability to separate one variable affecting a plant from conflicting variables, we can never assume we know the full story until we study plants in their own environment. There seems little doubt that responses to fire damage in nature are far richer in complexity than we could ever see using a single model organism in a laboratory. If only to appreciate the full scope and beauty of this complexity, physiological studies of fire damage in nature should be pursued.
Appendices
Appendix 1 was published electronically in 2003 as Supplementary Materials for Chapter 4.
Appendix 2 contains GenBank entries associated with Chapter 5 which will be publicly released on September 1, 2004.
Appendix 3 consists of educational research completed over the course of this dissertation. Appendix 3-A was published in 2002 in the Journal of Natural Resources and Life Science Education 31, 44-47, Appendix 3-B will be submitted for publication, and Appendix 3-C has been submitted for publication. In addition, curricula associated with this dissertation have been published electronically in 2003 in the Genetics section of the Biology Lab Clearinghouse (http://blc.biolab.udel.edu/Coker-Davies/).
Appendix 1: V-ATPase amino acid alignments in Lycopersicon and Arabidopsis
Supplementary materials for Coker, Jones, and Davies (2003)
1 50
LeVHA-c1  -MSNFAGDETAPFFGFLGAAAALVFSCMGAAYGTAKSGVGVASMGVMRPE
LeVHA-c2  MASTFSGDETAPFFGFLGAAAALVFSCMGAAYGTAKSGVGVASMGVMRPE
LeVHA-c3  -MSNFAGDETAPFFGFLGAAAALVFSCMGAAYGTAKSGVGVASMGVMRPE
LeVHA-c4  MASTFSGDETAPFFGFLGAAAALVFSCMGAAYGTAKSGVGVASMGVMRPE
Consensus MMSTFAGDETAPFFGFLGAAAALVFSCMGAAYGTAKSGVGVASMGVMRPE

51 100
LeVHA-c1  LVMKSIVPVVMAGVLGIYGLIIAVIISTGINPKTKSYYLFDGYAHLSSGL
LeVHA-c2  LVMKSIVPVVMAGVLGIYGLIIAVIISTGINPKTKSYYLFDGYAHLSSGL
LeVHA-c3  LVMKSIVPVVMAGVLGIYGLIIAVIISTGINPKTKSYYLFDGYAHLSSGL
LeVHA-c4  LVMKSIVPVVMAGVLGIYGLIIAVIISTGINPKTKSYYLFDGYAHLSSGL
Consensus LVMKSIVPVVMAGVLGIYGLIIAVIISTGINPKTKSYYLFDGYAHLSSGL

101 150
LeVHA-c1  ACGLAGLSAGMAIGIVGDAGVRANAQQPKLFVGMILILIFAEALALYGLI
LeVHA-c2  ACGLAGLSAGMAIGIVGDAGVRANAQQPKLFVGMILILIFAEALALYGLI
LeVHA-c3  ACGLAGLSAGMAIGIVGDAGVRANAQQPKLFVGMILILIFAEALALYGLI
LeVHA-c4  ACGLAGLSAGMAIGIVGDAGVRANAQQPKLFVGMILILIFAEALALYGLI
Consensus ACGLAGLSAGMAIGIVGDAGVRANAQQPKLFVGMILILIFAEALALYGLI

151 165
LeVHA-c1  VGIILSSRAGQSRAE
LeVHA-c2  VGIILSSRAGQSRAE
LeVHA-c3  VGIILSSRAGQSRAD
LeVHA-c4  VGIILSSRAGQSRAE
Consensus VGIILSSRAGQSRAE
Figure 1. Alignment of c subunits in tomato.
1 50
LeVHA-c1 (1)    --MSNFAGDETAPFFGFLGAAAALVFSCMGAAYGTAKSGVGVASMGVMRP
LeVHA-c3 (1)    --MSNFAGDETAPFFGFLGAAAALVFSCMGAAYGTAKSGVGVASMGVMRP
LeVHA-c4 (1)    -MASTFSGDETAPFFGFLGAAAALVFSCMGAAYGTAKSGVGVASMGVMRP
LeVHA-c2 (1)    -MASTFSGDETAPFFGFLGAAAALVFSCMGAAYGTAKSGVGVASMGVMRP
AtVHA-c1 (1)    --MSTFSGDETAPFFGFLGAAAALVFSCMGAAYGTAKSGVGVASMGVMRP
AtVHA-c2 (1)    -MASTFSGDETAPFFGFLGAAAALVFSCMGAAYGTAKSGVGVASMGVMRP
AtVHA-c3 (1)    --MSTFSGDETAPFFGFLGAAAALVFSCMGAAYGTAKSGVGVASMGVMRP
AtVHA-c4 (1)    MASSGFSGDETAPFFGFLGAAAALVFSCMGAAYGTAKSGVGVASMGVMRP
AtVHA-c5 (1)    --MSTFSGDETAPFFGFLGAAAALVFSCMGAAYGTAKSGVGVASMGVMRP
Consensus (1)   MSTFSGDETAPFFGFLGAAAALVFSCMGAAYGTAKSGVGVASMGVMRP

51 100
LeVHA-c1 (49)   ELVMKSIVPVVMAGVLGIYGLIIAVIISTGINPKTKSYYLFDGYAHLSSG
LeVHA-c3 (49)   ELVMKSIVPVVMAGVLGIYGLIIAVIISTGINPKTKSYYLFDGYAHLSSG
LeVHA-c4 (50)   ELVMKSIVPVVMAGVLGIYGLIIAVIISTGINPKTKSYYLFDGYAHLSSG
LeVHA-c2 (50)   ELVMKSIVPVVMAGVLGIYGLIIAVIISTGINPKTKSYYLFDGYAHLSSG
AtVHA-c1 (49)   ELVMKSIVPVVMAGVLGIYGLIIAVIISTGINPKAKSYYLFDGYAHLSSG
AtVHA-c2 (50)   ELVMKSIVPVVMAGVLGIYGLIIAVIISTGINPKAKSYYLFDGYAHLSSG
AtVHA-c3 (49)   ELVMKSIVPVVMAGVLGIYGLIIAVIISTGINPKAKSYYLFDGYAHLSSG
AtVHA-c4 (51)   ELVMKSIVPVVMAGVLGIYGLIIAVIISTGINPKAKSYYLFDGYAHLSSG
AtVHA-c5 (49)   ELVMKSIVPVVMAGVLGIYGLIIAVIISTGINPKAKSYYLFDGYAHLSSG
Consensus (51)  ELVMKSIVPVVMAGVLGIYGLIIAVIISTGINPKAKSYYLFDGYAHLSSG

101 150
LeVHA-c1 (99)   LACGLAGLSAGMAIGIVGDAGVRANAQQPKLFVGMILILIFAEALALYGL
LeVHA-c3 (99)   LACGLAGLSAGMAIGIVGDAGVRANAQQPKLFVGMILILIFAEALALYGL
LeVHA-c4 (100)  LACGLAGLSAGMAIGIVGDAGVRANAQQPKLFVGMILILIFAEALALYGL
LeVHA-c2 (100)  LACGLAGLSAGMAIGIVGDAGVRANAQQPKLFVGMILILIFAEALALYGL
AtVHA-c1 (99)   LACGLAGLSAGMAIGIVGDAGVRANAQQPKLFVGMILILIFAEALALYGL
AtVHA-c2 (100)  LACGLAGLSAGMAIGIVGDAGVRANAQQPKLFVGMILILIFAEALALYGL
AtVHA-c3 (99)   LACGLAGLSAGMAIGIVGDAGVRANAQQPKLFVGMILILIFAEALALYGL
AtVHA-c4 (101)  LACGLAGLSAGMAIGIVGDAGVRANAQQPKLFVGMILILIFAEALALYGL
AtVHA-c5 (99)   LACGLAGLSAGMAIGIVGDAGVRANAQQPKLFVGMILILIFAEALALYGL
Consensus (101) LACGLAGLSAGMAIGIVGDAGVRANAQQPKLFVGMILILIFAEALALYGL

151 166
LeVHA-c1 (149)  IVGIILSSRAGQSRAE
LeVHA-c3 (149)  IVGIILSSRAGQSRAD
LeVHA-c4 (150)  IVGIILSSRAGQSRAE
LeVHA-c2 (150)  IVGIILSSRAGQSRAE
AtVHA-c1 (149)  IVGIILSSRAGQSRAE
AtVHA-c2 (150)  IVGIILSSRAGQSRAE
AtVHA-c3 (149)  IVGIILSSRAGQSRAE
AtVHA-c4 (151)  IVGIILSSRAGQSRAE
AtVHA-c5 (149)  IVGIILSSRAGQSRAE
Consensus (151) IVGIILSSRAGQSRAE
Figure 2. Alignment of c subunits in tomato and Arabidopsis.
Table 1. Subunit c amino acid identities.
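         | LeVHA-c1 | LeVHA-c3 | LeVHA-c4 | LeVHA-c2 | AtVHA-c1 | AtVHA-c2 | AtVHA-c3 | AtVHA-c4 | AtVHA-c5
LeVHA-c1 | 100 | 99 | 98 | 98 | 98 | 97 | 98 | 96 | 98
LeVHA-c3 | | 100 | 97 | 97 | 98 | 96 | 98 | 96 | 98
LeVHA-c4 | | | 100 | 100 | 98 | 99 | 98 | 97 | 98
LeVHA-c2 | | | | 100 | 98 | 99 | 98 | 97 | 98
AtVHA-c1 | | | | | 100 | 99 | 100 | 98 | 100
AtVHA-c2 | | | | | | 100 | 99 | 98 | 99
AtVHA-c3 | | | | | | | 100 | 98 | 100
AtVHA-c4 | | | | | | | | 100 | 98
AtVHA-c5 | | | | | | | | | 100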
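For readers who wish to reproduce identity values of this kind, a minimal sketch of a pairwise percent-identity calculation from two aligned sequences is given here. The appendix does not state how the identities were computed, so the convention used (columns with a gap in either sequence are excluded) and the function name are assumptions; other conventions give slightly different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Final alignment block of LeVHA-c1 and LeVHA-c3 from Figure 1.
    print(round(percent_identity("VGIILSSRAGQSRAE", "VGIILSSRAGQSRAD"), 1))
```

The example uses only the last block of Figure 1; a full-length value requires the complete aligned sequences.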
1 50
LeVHA-c''1 (1)   MSAASTMAVMGASSSWSRALIQISPYTFSAVGIAIAIGVSVLGAAWGIYI
LeVHA-c''2 (1)   ------MAGPSSSWSRALVQISPYTFAAVGIAIAIGVSVLGAAWGIYI
Consensus (1)    M G SSSWSRALIQISPYTFAAVGIAIAIGVSVLGAAWGIYI

51 100
LeVHA-c''1 (51)  TGSSLIGAAIKAPRITSKNLISVIFCEAVAIYGVIVAIILQTKLESVPAS
LeVHA-c''2 (43)  TGSSLIGAAIKAPRITSKNLISVIFCEAVAIYGVIVAIILQTKLESVPAS
Consensus (51)   TGSSLIGAAIKAPRITSKNLISVIFCEAVAIYGVIVAIILQTKLESVPAS

101 150
LeVHA-c''1 (101) KIYAAESLRAGYAIFASGIIVGFANLVCGLCVGIIGSSCALSDAQNSTLF
LeVHA-c''2 (93)  QIYAPESLRAGYAIFASGIIVGFANLVCGLCVGIIGSSCALSDAQNSSLF
Consensus (101)  IYA ESLRAGYAIFASGIIVGFANLVCGLCVGIIGSSCALSDAQNSSLF

151 185
LeVHA-c''1 (151) VKILVIEIFGSALGLFGVIVGIIMSAQATWPSKTA
LeVHA-c''2 (143) VKILVIEIFGSALGLFGVIVGIIMSAQASWPSKGA
Consensus (151)  VKILVIEIFGSALGLFGVIVGIIMSAQASWPSK A
Figure 3. Alignment of c” subunits in tomato.
1 50
LeVHA-c''1 (1)   MSAASTMAVMGASSSWSRALIQISPYTFSAVGIAIAIGVSVLGAAWGIYI
LeVHA-c''2 (1)   ------MAGPSSSWSRALVQISPYTFAAVGIAIAIGVSVLGAAWGIYI
AtVHA-c''1 (1)   ---MSGVVALGHASSWGAALVRISPYTFSAIGIAISIGVSVLGAAWGIYI
AtVHA-c''2 (1)   -----MSGVAIHASSWGAALVRISPYTFSAIGIAISIGVSVLGAAWGIYI
Consensus (1)    S MAVAGHASSWSRALVRISPYTFSAIGIAIAIGVSVLGAAWGIYI

51 100
LeVHA-c''1 (51)  TGSSLIGAAIKAPRITSKNLISVIFCEAVAIYGVIVAIILQTKLESVPAS
LeVHA-c''2 (43)  TGSSLIGAAIKAPRITSKNLISVIFCEAVAIYGVIVAIILQTKLESVPAS
AtVHA-c''1 (48)  TGSSLIGAAIEAPRITSKNLISVIFCEAVAIYGVIVAIILQTKLESVPSS
AtVHA-c''2 (46)  TGSSLIGAAIEAPRITSKNLISVIFCEAVAIYGVIVAIILQTKLESVPSS
Consensus (51)   TGSSLIGAAIKAPRITSKNLISVIFCEAVAIYGVIVAIILQTKLESVPAS

101 150
LeVHA-c''1 (101) KIYAAESLRAGYAIFASGIIVGFANLVCGLCVGIIGSSCALSDAQNSTLF
LeVHA-c''2 (93)  QIYAPESLRAGYAIFASGIIVGFANLVCGLCVGIIGSSCALSDAQNSSLF
AtVHA-c''1 (98)  KMYDAESLRAGYAIFASGIIVGFANLVCGLCVGIIGSSCALSDAQNSTLF
AtVHA-c''2 (96)  KMYDAESLRAGYAIFASGIIVGFANLVCGLCVGIIGSSCALSDAQNSTLF
Consensus (101)  KIYDAESLRAGYAIFASGIIVGFANLVCGLCVGIIGSSCALSDAQNSTLF

151 185
LeVHA-c''1 (151) VKILVIEIFGSALGLFGVIVGIIMSAQATWPSKTA
LeVHA-c''2 (143) VKILVIEIFGSALGLFGVIVGIIMSAQASWPSKGA
AtVHA-c''1 (148) VKILVIEIFGSALGLFGVIVGIIMSAQATWPTK--
AtVHA-c''2 (146) VKILVIEIFGSALGLFGVIVGIIMSAQATWPTK--
Consensus (151)  VKILVIEIFGSALGLFGVIVGIIMSAQATWPSK A
Figure 4. Alignment of c” subunits in tomato and Arabidopsis.
Table 2. Subunit c” amino acid identities.
           | LeVHA-c''1 | LeVHA-c''2 | AtVHA-c''1 | AtVHA-c''2
LeVHA-c''1 | 100 | 90 | 87 | 86
LeVHA-c''2 | | 100 | 86 | 87
AtVHA-c''1 | | | 100 | 96
AtVHA-c''2 | | | | 100
1 50
AtVHA-d1 (1)    MYGFEALTFNIHGGYLEAIVRGHRAGLLTTADYNNLCQCENLDDIKMHLS
AtVHA-d2 (1)    MYGFEALTFNIHGGYLEAIVRGHRAGLLTTADYNNLCQCENLDDIKMHLS
LeVHA-d1 (1)    MYGFEALTFNIHSGYLEAIVRGHRSGLLTAADYNNLCQCETLDDIKMHLS
Consensus (1)   MYGFEALTFNIHGGYLEAIVRGHRAGLLTTADYNNLCQCENLDDIKMHLS

51 100
AtVHA-d1 (51)   ATKYGSYLQNEPSPLHTTTIVEKCTLKLVDDYKHMLCQATEPMSTFLEYI
AtVHA-d2 (51)   ATKYGPYLQNEPSPLHTTTIVEKCTLKLVDDYKHMLCQATEPMSTFLEYI
LeVHA-d1 (51)   ATEYGPYLQNEPSPLHTTTIVEKCTVKLVDEFNHMLCQATEPLSTFLEYI
Consensus (51)  ATKYGPYLQNEPSPLHTTTIVEKCTLKLVDDYKHMLCQATEPMSTFLEYI

101 150
AtVHA-d1 (101)  RYGHMIDNVVLIVTGTLHERDVQELIEKCHPLGMFDSIATLAVAQNMREL
AtVHA-d2 (101)  RYGHMIDNVVLIVTGTLHERDVQELIEKCHPLGMFDSIATLAVAQNMREL
LeVHA-d1 (101)  RYGHMIDNVVLIVTGTLHERDVQELLEKCHPLGMFDSIASLAVAQNMREL
Consensus (101) RYGHMIDNVVLIVTGTLHERDVQELIEKCHPLGMFDSIATLAVAQNMREL

151 200
AtVHA-d1 (151)  YRLVLVDTPLAPYFSECLTSEDLDDMNIEIMRNTLYKAYLEDFYKFCQKL
AtVHA-d2 (151)  YRLVLVDTPLAPYFSECLTSEDLDDMNIEIMRNTLYKAYLEDFYNFCQKL
LeVHA-d1 (151)  YRLVLVDTPLAPYFSECITSEDLDDMNIEIMRNTLYKAYLEDFYRFCQKL
Consensus (151) YRLVLVDTPLAPYFSECLTSEDLDDMNIEIMRNTLYKAYLEDFYKFCQKL

201 250
AtVHA-d1 (201)  GGATAEIMSDLLAFEADRRAVNITINSIGTELTREDRKKLYSNFGLLYPY
AtVHA-d2 (201)  GGATAEIMSDLLAFEADRRAVNITINSIGTELTREDRKKLYSNFGLLYPY
LeVHA-d1 (201)  GGATAEIMSDLLSFEADRRAVNITINSIGTELTRDDRRKLYSNFGLLYPY
Consensus (201) GGATAEIMSDLLAFEADRRAVNITINSIGTELTREDRKKLYSNFGLLYPY

251 300
AtVHA-d1 (251)  GHEELAICEDIDQVRGVMEKYPPYQAIFSKMSYGESQMLDKAFYEEEVRR
AtVHA-d2 (251)  GHEELAICEDIDQVRGVMEKYPPYQAIFSKMSYGESQMLDKAFYEEEVRR
LeVHA-d1 (251)  GHEELAICEDIDQVRGVMEKYPPYQSIFSKLSYGESQMLDKAFYEEEVKR
Consensus (251) GHEELAICEDIDQVRGVMEKYPPYQAIFSKMSYGESQMLDKAFYEEEVRR

301 350
AtVHA-d1 (301)  LCLAFEQQFHYAVFFAYMRLREQEIRNLMWISECVAQNQKSRIHDSVVYM
AtVHA-d2 (301)  LCLAFEQQFHYAVFFAYMRLREQEIRNLMWISECVAQNQKSRIHDSVVYM
LeVHA-d1 (301)  LCLSFEQQFHYGVFFSYIRLREQEIRNLMWISECVSQNQKTRVHDSVVFI
Consensus (301) LCLAFEQQFHYAVFFAYMRLREQEIRNLMWISECVAQNQKSRIHDSVVYM

351
AtVHA-d1 (351)  F
AtVHA-d2 (351)  F
LeVHA-d1 (351)  F
Consensus (351) F
Figure 5. Alignment of d subunits in tomato and Arabidopsis.
Table 3. Subunit d amino acid identities.
         | AtVHA-d1 | AtVHA-d2 | LeVHA-d1
AtVHA-d1 | 100 | 99 | 91
AtVHA-d2 | | 100 | 92
LeVHA-d1 | | | 100
1 50
LeVHA-e1 (1)    MGFLVTTLIFVAIGVIASLCARICCNRGPSTNLLHLTLIITATVCCWMMW
LeVHA-e3 (1)    MGFAVTSLIFVVVGVIASFGAGICCNRGPSTNLLHLTLIITATVCCWMMW
Consensus (1)   MGF VTSLIFV IGVIAS A ICCNRGPSTNLLHLTLIITATVCCWMMW

51 71
LeVHA-e1 (51)   AIVYLAQLKP-LIVPVLSEGE
LeVHA-e3 (51)   AIVYLAQLKPPLIVPILSEGE
Consensus (51)  AIVYLAQLKP LIVPILSEGE
Figure 6. Alignment of e subunits in tomato.
1 50
LeVHA-e1 (1)    MGFLVTTLIFVAIGVIASLCARICCNRGPSTNLLHLTLIITATVCCWMMW
LeVHA-e3 (1)    MGFAVTSLIFVVVGVIASFGAGICCNRGPSTNLLHLTLIITATVCCWMMW
AtVHA-e1 (1)    MGFLITTLIFVVVGIIASLCVRICCNRGPSTNLLHLTLVITATVCCWMMW
AtVHA-e2 (1)    MAFVVTSLIFAVVGIIASICTRICFNKGPSTNLLHLTLVITATVCCWMMW
Consensus (1)   MGFLVTSLIFVVVGIIASLCARICCNRGPSTNLLHLTLIITATVCCWMMW

51 71
LeVHA-e1 (51)   AIVYLAQLKP-LIVPVLSEGE
LeVHA-e3 (51)   AIVYLAQLKPPLIVPILSEGE
AtVHA-e1 (51)   AIVYIAQMNP-LIVPILSETE
AtVHA-e2 (51)   AIVYIAQMNP-LIVPILSEVE
Consensus (51)  AIVYIAQLNP LIVPILSEGE
Figure 7. Alignment of e subunits in tomato and Arabidopsis.
Table 4. Subunit e amino acid identities.
         | LeVHA-e1 | LeVHA-e3 | AtVHA-e1 | AtVHA-e2
LeVHA-e1 | 100 | 87 | 84 | 76
LeVHA-e3 | | 100 | 80 | 77
AtVHA-e1 | | | 100 | 86
AtVHA-e2 | | | | 100
1 50
LeVHA-A (1)     MPSIVGGPMTTFEDSEKESEYGYVRKVSGPVVVADGMGGAAMYELVRVGH
AtVHA-A (1)     MPAFYGGKLTTFEDDEKESEYGYVRKVSGPVVVADGMAGAAMYELVRVGH
Consensus (1)   MPA GG LTTFED EKESEYGYVRKVSGPVVVADGMAGAAMYELVRVGH

51 100
LeVHA-A (51)    DNLIGEIIRLEGDSATIQVYEETAGLMVNDPVLRTHKPLSVELGPGILGN
AtVHA-A (51)    DNLIGEIIRLEGDSATIQVYEETAGLTVNDPVLRTHKPLSVELGPGILGN
Consensus (51)  DNLIGEIIRLEGDSATIQVYEETAGL VNDPVLRTHKPLSVELGPGILGN

101 150
LeVHA-A (101)   IFDGIQRPLKTIAKRSGDVYIPRGVSVPALDKDILWEFQPKKIGEGDLLT
AtVHA-A (101)   IFDGIQRPLKTIARISGDVYIPRGVSVPALDKDCLWEFQPNKFVEGDTIT
Consensus (101) IFDGIQRPLKTIAK SGDVYIPRGVSVPALDKD LWEFQP K EGD IT

151 200
LeVHA-A (151)   GGDLYATVFENSLMEHRVALPPDAMGKITYIAPAGQYSLNDTVLELEFQG
AtVHA-A (151)   GGDLYATVFENTLMNHLVALPPDAMGKITYIAPAGQYSLKDTVIELEFQG
Consensus (151) GGDLYATVFENSLM H VALPPDAMGKITYIAPAGQYSL DTVIELEFQG

201 250
LeVHA-A (201)   VKKQVTMLQTWPVRSPRPVASKLAADTPLLTGQRVLDALFPSVLGGTCAI
AtVHA-A (201)   IKKSYTMLQSWPVRTPRPVASKLAADTPLLTGQRVLDALFPSVLGGTCAI
Consensus (201) IKK TMLQSWPVRSPRPVASKLAADTPLLTGQRVLDALFPSVLGGTCAI

251 300
LeVHA-A (251)   PGAFGCGKTVISQALSKYSNSDTVVYVGCGERGNEMAEVLMDFPQLTMTL
AtVHA-A (251)   PGAFGCGKTVISQALSKYSNSDAVVYVGCGERGNEMAEVLMDFPQLTMTL
Consensus (251) PGAFGCGKTVISQALSKYSNSD VVYVGCGERGNEMAEVLMDFPQLTMTL

301 350
LeVHA-A (301)   PDGREESVMKRTTLVANTSNMPVAAREASIYTGITIAEYFIDMGYNVSMM
AtVHA-A (301)   PDGREESVMKRTTLVANTSNMPVAAREASIYTGITIAEYFRDMGYNVSMM
Consensus (301) PDGREESVMKRTTLVANTSNMPVAAREASIYTGITIAEYF DMGYNVSMM

351 400
LeVHA-A (351)   ADSTSRWAEALREISGRLAEMPADSGYPAYLAARLASFYERAGKVKCLGG
AtVHA-A (351)   ADSTSRWAEALREISGRLAEMPADSGYPAYLAARLASFYERAGKVKCLGG
Consensus (351) ADSTSRWAEALREISGRLAEMPADSGYPAYLAARLASFYERAGKVKCLGG

401 450
LeVHA-A (401)   PERTGSVTIVGAVSPPGGDFSDPVTSATLGIVQVFWGLDKKLAQRKHFPS
AtVHA-A (401)   PERNGSVTIVGAVSPPGGDFSDPVTSATLSIVQVFWGLDKKLAQRKHFPS
Consensus (401) PER GSVTIVGAVSPPGGDFSDPVTSATL IVQVFWGLDKKLAQRKHFPS

451 500
LeVHA-A (451)   VNWLISYSKYSGALESFYEKFDPDFINIRTKAREVLQREDDLNEIVQLVG
AtVHA-A (451)   VNWLISYSKYSTALESFYEKFDPDFINIRTKAREVLQREDDLNEIVQLVG
Consensus (451) VNWLISYSKYS ALESFYEKFDPDFINIRTKAREVLQREDDLNEIVQLVG

501 550
LeVHA-A (501)   KDALAETDKITLETAKLLREDYLAQNAFTPYDKFCPFYKSVWMLRNIIHF
AtVHA-A (501)   KDALAEGDKITLETAKLLREDYLAQNAFTPYDKFCPFYKSVWMMRNIIHF
Consensus (501) KDALAE DKITLETAKLLREDYLAQNAFTPYDKFCPFYKSVWMLRNIIHF

551 600
LeVHA-A (551)   YNLANQAVERGAGMDGQKITYTLIKHRLGDLFYRLVSQKFEDPAEGEDVL
AtVHA-A (551)   YNLANQAVERAAGMDGQKITYTLIKHRLGDLFYRLVSQKFEDPAEGEDTL
Consensus (551) YNLANQAVERAAGMDGQKITYTLIKHRLGDLFYRLVSQKFEDPAEGED L

601 623
LeVHA-A (601)   VGKFQKLHDDLVAGFRNLEDETR
AtVHA-A (601)   VEKFKKLYDDLNAGFRALEDETR
Consensus (601) V KF KLHDDL AGFR LEDETR
Figure 8. Alignment of A subunits in tomato and Arabidopsis.
Table 5. Subunit A amino acid identities.
        | LeVHA-A | AtVHA-A
LeVHA-A | 100 | 94
AtVHA-A | | 100
1 50
LeVHA-B1 (1)    MGSAPNSIE-MEEGTLEVGMEYRTVSGVAGPLVILDKVKGPKYQEIVNIR
LeVHA-B2 (1)    MGKAKKNIENMEEGTLEVGMEYRTVSGVAGPLVILEKVKGPKYQEIVNIR
Consensus (1)   MG A IE MEEGTLEVGMEYRTVSGVAGPLVILDKVKGPKYQEIVNIR

51 100
LeVHA-B1 (50)   LGDGTTRRGQVLEVDGEKAVVQVFEGTSGIDNKYTTVQFTGEVLKTPVSL
LeVHA-B2 (51)   LGDGTTRRGQVLEVDGEKAVVQVFEGTSGIDNKYTTVQFTGEVLKTPVSL
Consensus (51)  LGDGTTRRGQVLEVDGEKAVVQVFEGTSGIDNKYTTVQFTGEVLKTPVSL

101 150
LeVHA-B1 (100)  DMLGRIFNGSGKPIDNGPPILPEAYRDISGSSINPSERTYPEEMIQTGIS
LeVHA-B2 (101)  DMLGRIFNGSGKPIDNGPPILPEAYRDISGSSINPSERTYPEEMIQTGIS
Consensus (101) DMLGRIFNGSGKPIDNGPPILPEAYRDISGSSINPSERTYPEEMIQTGIS

151 200
LeVHA-B1 (150)  TVDVMNSIARGQKIPLFSAAGLPHNEIAAQICRQAGLVKRLEKSDNLLEG
LeVHA-B2 (151)  TIDVMNSIARGQKIPLFSAAGLPHNEIAAQICRQAGLVKRLEKSENLLED
Consensus (151) TIDVMNSIARGQKIPLFSAAGLPHNEIAAQICRQAGLVKRLEKSDNLLE

201 250
LeVHA-B1 (200)  GEEDNFAIVFAAMGVNMETAQFFKRDFEENGSMERVTLFLNLANDPTIER
LeVHA-B2 (201)  SEADNFAIVFAAMGVNMETAQFFKRDFEENGSMERVTLFLNLANDPTIER
Consensus (201) E DNFAIVFAAMGVNMETAQFFKRDFEENGSMERVTLFLNLANDPTIER

251 300
LeVHA-B1 (250)  IITPRIALTTAEYLAYECGKHVLVILTDMSSYADALREVSAAREEVPGRR
LeVHA-B2 (251)  IITPRIALTTAEYLAYECGKHVLVILTDMSSYADALREVSAAREEVPGRR
Consensus (251) IITPRIALTTAEYLAYECGKHVLVILTDMSSYADALREVSAAREEVPGRR

301 350
LeVHA-B1 (300)  GYPGYMYTDLATIYERAGRIEGRTGSITQIPILTMPNDDITHPTPDLTGY
LeVHA-B2 (301)  GYPGYMYTDLATIYERAGRIEGRTGSITQIPILTMPNDDITHPTPDLTGY
Consensus (301) GYPGYMYTDLATIYERAGRIEGRTGSITQIPILTMPNDDITHPTPDLTGY

351 400
LeVHA-B1 (350)  ITEGQIYIDRQLHNRQIYPPINVLPSLSRLMKSAIGEGMTRRDHSDVSNQ
LeVHA-B2 (351)  ITEGQIYIDRQLHNRQIYPPINVLPSLSRLMKSAIGEGMTRRDHADVSNQ
Consensus (351) ITEGQIYIDRQLHNRQIYPPINVLPSLSRLMKSAIGEGMTRRDHADVSNQ

401 450
LeVHA-B1 (400)  LYANYAIGKDVQAMKAVVGEEALSSEDLLYLEFLDKFERKFVSQGAYDTR
LeVHA-B2 (401)  LYANYAIGKDVQAMKAVVGEEALSSEDLLYLEFLDKFERKFVSQGAYDTR
Consensus (401) LYANYAIGKDVQAMKAVVGEEALSSEDLLYLEFLDKFERKFVSQGAYDTR

451 489
LeVHA-B1 (450)  NIFQSLDLAWTLLRIFPRELLHRIPAKTLDQYYSRDASN
LeVHA-B2 (451)  NIFQSLDLAWTLLRIFPRELLHRIPAKTLDQYYSRDAPN
Consensus (451) NIFQSLDLAWTLLRIFPRELLHRIPAKTLDQYYSRDA N
Figure 9. Alignment of B subunits in tomato.
1 50
LeVHA-B1 (1)    MGSAPNSIE-MEEGTLEVGMEYRTVSGVAGPLVILDKVKGPKYQEIVNIR
LeVHA-B2 (1)    MGKAKKNIENMEEGTLEVGMEYRTVSGVAGPLVILEKVKGPKYQEIVNIR
AtVHA-B2 (1)    MGAAENNLE--MEGTLEIGMEYRTVSGVAGPLVILEKVKGPKYQEIVNIR
AtVHA-B3 (1)    --MVETSID-MEEGTLEIGMEYRTVSGVAGPLVILDKVKGPKYQEIVNIR
Consensus (1)   MGAAENSIE MEEGTLEIGMEYRTVSGVAGPLVILDKVKGPKYQEIVNIR

51 100
LeVHA-B1 (50)   LGDGTTRRGQVLEVDGEKAVVQVFEGTSGIDNKYTTVQFTGEVLKTPVSL
LeVHA-B2 (51)   LGDGTTRRGQVLEVDGEKAVVQVFEGTSGIDNKYTTVQFTGEVLKTPVSL
AtVHA-B2 (49)   LGDGTTRRGQVLEVDGEKAVVQVFEGTSGIDNKYTTVQFTGEVLKTPVSL
AtVHA-B3 (48)   LGDGSTRRGQVLEVDGEKAVVQVFEGTSGIDNKFTTVQFTGEVLKTPVSL
Consensus (51)  LGDGTTRRGQVLEVDGEKAVVQVFEGTSGIDNKYTTVQFTGEVLKTPVSL

101 150
LeVHA-B1 (100)  DMLGRIFNGSGKPIDNGPPILPEAYRDISGSSINPSERTYPEEMIQTGIS
LeVHA-B2 (101)  DMLGRIFNGSGKPIDNGPPILPEAYRDISGSSINPSERTYPEEMIQTGIS
AtVHA-B2 (99)   DMLGRIFNGSGKPIDNGPPILPEAYLDISGSSINPSERTYPEEMIQTGIS
AtVHA-B3 (98)   DMLGRIFNGSGKPIDNGPPILPEAYLDISGSSINPSERTYPEEMIQTGIS
Consensus (101) DMLGRIFNGSGKPIDNGPPILPEAYRDISGSSINPSERTYPEEMIQTGIS

151 200
LeVHA-B1 (150)  TVDVMNSIARGQKIPLFSAAGLPHNEIAAQICRQAGLVKRLEKSDNLLEG
LeVHA-B2 (151)  TIDVMNSIARGQKIPLFSAAGLPHNEIAAQICRQAGLVKRLEKSENLLED
AtVHA-B2 (149)  TIDVMNSIARGQKIPLFSAAGLPHNEIAAQICRQAGLVKRLEKSDNLLEH
AtVHA-B3 (148)  TIDVMNSIARGQKIPLFSAAGLPHNEIAAQICRQAGLVKRLEKTENLIQE
Consensus (151) TIDVMNSIARGQKIPLFSAAGLPHNEIAAQICRQAGLVKRLEKSDNLLED

201 250
LeVHA-B1 (200)  GE-EDNFAIVFAAMGVNMETAQFFKRDFEENGSMERVTLFLNLANDPTIE
LeVHA-B2 (201)  SE-ADNFAIVFAAMGVNMETAQFFKRDFEENGSMERVTLFLNLANDPTIE
AtVHA-B2 (199)  QE-DDNFAIVFAAMGVNMETAQFFKRDFEENGSMERVTLFLNLANDPTIE
AtVHA-B3 (198)  DHGEDNFAIVFAAMGVNMETAQFFKRDFEENGSMERVTLFLNLANDPTIE
Consensus (201) E EDNFAIVFAAMGVNMETAQFFKRDFEENGSMERVTLFLNLANDPTIE

251 300
LeVHA-B1 (249)  RIITPRIALTTAEYLAYECGKHVLVILTDMSSYADALREVSAAREEVPGR
LeVHA-B2 (250)  RIITPRIALTTAEYLAYECGKHVLVILTDMSSYADALREVSAAREEVPGR
AtVHA-B2 (248)  RIITPRIALTTAEYLAYECGKHVLVILTDMSSYADALREVSAAREEVPGR
AtVHA-B3 (248)  RIITPRIALTTAEYLAYECGKHVLVILTDMSSYADALRFCCSRRSS--WK
Consensus (251) RIITPRIALTTAEYLAYECGKHVLVILTDMSSYADALREVSAAREEVPGR

301 350
LeVHA-B1 (299)  RGYPGYMYTDLATIYERAGRIEGRTGSITQIPILTMPNDDITHPTPDLTG
LeVHA-B2 (300)  RGYPGYMYTDLATIYERAGRIEGRTGSITQIPILTMPNDDITHPTPDLTG
AtVHA-B2 (298)  RGYPGYMYTDLATIYERAGRIEGRKGSITQIPILTMPNDDITHPTPDLTG
AtVHA-B3 (296)  TWISGVYYTDLATIYERAGRIEGRKGSITQIPILTMPNDDITHPTPDLTG
Consensus (301) RGYPGYMYTDLATIYERAGRIEGRTGSITQIPILTMPNDDITHPTPDLTG

351 400
LeVHA-B1 (349)  YITEGQIYIDRQLHNRQIYPPINVLPSLSRLMKSAIGEGMTRRDHSDVSN
LeVHA-B2 (350)  YITEGQIYIDRQLHNRQIYPPINVLPSLSRLMKSAIGEGMTRRDHADVSN
AtVHA-B2 (348)  YITEGQIYIDRQLHNRQIYPPINVLPSLSRLMKSAIGEGMTRRDHSDVSN
AtVHA-B3 (346)  YITEGQIYIDRQLHNRQIYPPINVLPSLSRLMKSAIGEGMTRKDHSDVSN
Consensus (351) YITEGQIYIDRQLHNRQIYPPINVLPSLSRLMKSAIGEGMTRRDHSDVSN

401 450
LeVHA-B1 (399)  QLYANYAIGKDVQAMKAVVGEEALSSEDLLYLEFLDKFERKFVSQGAYDT
LeVHA-B2 (400)  QLYANYAIGKDVQAMKAVVGEEALSSEDLLYLEFLDKFERKFVSQGAYDT
AtVHA-B2 (398)  QLYANYAIGKDVQAMKAVVGEEALSSEDLLYLEFLDKFERKFVAQGAYDT
AtVHA-B3 (396)  QLYANYAIGKDVQAMKAVVGEEALSSEDLLYLEFLDKFERKFVMQGAYDT
Consensus (401) QLYANYAIGKDVQAMKAVVGEEALSSEDLLYLEFLDKFERKFVSQGAYDT

451 490
LeVHA-B1 (449)  RNIFQSLDLAWTLLRIFPRELLHRIPAKTLDQYYSRDASN
LeVHA-B2 (450)  RNIFQSLDLAWTLLRIFPRELLHRIPAKTLDQYYSRDAPN
AtVHA-B2 (448)  RNIFQSLDLAWTLLRIFPRELLHRIPAKTLDQFYSRDTTN
AtVHA-B3 (446)  RNIFQSLDLAWTLLRIFPRELLHRIPAKTLDQFYSRDSTS
Consensus (451) RNIFQSLDLAWTLLRIFPRELLHRIPAKTLDQFYSRDATN
Figure 10. Alignment of B subunits in tomato and Arabidopsis.
Table 6. Subunit B amino acid identities.
         | LeVHA-B1 | LeVHA-B2 | AtVHA-B1 | AtVHA-B2 | AtVHA-B3
LeVHA-B1 | 100 | 97 | 19 | 97 | 93
LeVHA-B2 | | 100 | 19 | 96 | 92
AtVHA-B1 | | | 100 | 20 | 18
AtVHA-B2 | | | | 100 | 93
AtVHA-B3 | | | | | 100
1 50
LeVHA-C (1)     MASRYWVVSLPVQQNSSTTSLWSRLQESISRHSFDTPLYRFNIPNLRVGT
AtVHA-C (1)     MTSRYWVVSLPVKD--SASSLWNRLQEQISKHSFDTPVYRFNIPNLRVGT
Consensus (1)   M SRYWVVSLPV S SSLW RLQE ISKHSFDTPLYRFNIPNLRVGT

51 100
LeVHA-C (51)    LDSLLALSDDLIKSNSFIEGVCSKTRRQIEELERVSGVLSSSLTVDGVPV
AtVHA-C (49)    LDSLLALGDDLLKSNSFVEGVSQKIRRQIEELERISGVESNALTVDGVPV
Consensus (51)  LDSLLAL DDLIKSNSFIEGV K RRQIEELERISGV S ALTVDGVPV

101 150
LeVHA-C (101)   DSYLTRFAWDEAKYPTMSPLKEIVDGIHSQVAKIEDDLKVRVSEYNNVRS
AtVHA-C (99)    DSYLTRFVWDEAKYPTMSPLKEVVDNIQSQVAKIEDDLKVRVAEYNNIRG
Consensus (101) DSYLTRF WDEAKYPTMSPLKEIVD I SQVAKIEDDLKVRVAEYNNIR

151 200
LeVHA-C (151)   QLNAINRKQTGSLAVRDLSNLVKPADVVTSEHLTTLLAVVSKYSQKDWLS
AtVHA-C (149)   QLNAINRKQSGSLAVRDLSNLVKPEDIVESEHLVTLLAVVPKYSQKDWLA
Consensus (151) QLNAINRKQSGSLAVRDLSNLVKP DIV SEHL TLLAVV KYSQKDWLA

201 250
LeVHA-C (201)   SYETLTTYVVPRSSKMLYEDNEYALYTVTLFNRDADNFKNKARERGFQIR
AtVHA-C (199)   CYETLTDYVVPRSSKKLFEDNEYALYTVTLFTRVADNFRIAAREKGFQVR
Consensus (201) YETLT YVVPRSSK LFEDNEYALYTVTLF R ADNFK AREKGFQIR

251 300
LeVHA-C (251)   DFEHNPETQESRKQELEKLMQDQETFRSSLLQWCYTSYGEVFSSWMHFCA
AtVHA-C (249)   DFEQSVEAQETRKQELAKLVQDQESLRSSLLQWCYTSYGEVFSSWMHFCA
Consensus (251) DFE E QESRKQEL KLMQDQES RSSLLQWCYTSYGEVFSSWMHFCA

301 350
LeVHA-C (301)   VRIFAESILRYGLPPSFLSVVLAPSIKSEKKVRSILESLCDSSNSNFWKA
AtVHA-C (299)   VRTFAESIMRYGLPPAFLACVLSPAVKSEKKVRSILERLCDSTNSLYWKS
Consensus (301) VR FAESILRYGLPPAFLA VLAPAIKSEKKVRSILE LCDSSNS FWKA

351 377
LeVHA-C (351)   D-DEGGMAGFGGDTEAHPYVSFTINLV
AtVHA-C (349)   EEDAGAMAGLAGDSETHPYVSFTINLA
Consensus (351) D D GAMAG AGDSE HPYVSFTINL
Figure 11. Alignment of C subunits in tomato and Arabidopsis.
Table 7. Subunit C amino acid identities.
        | LeVHA-C | AtVHA-C
LeVHA-C | 100 | 80
AtVHA-C | | 100
1 50
LeVHA-D (1)     MSGQTNRLVVVPTVTMLGVIKARLVGATRGHALLKKKSDALTVQFRQILK
AtVHA-D (1)     MAGQNARLNVVPTVTMLGVMKARLVGATRGHALLKKKSDALTVQFRALLK
Consensus (1)   MAGQ RL VVPTVTMLGVIKARLVGATRGHALLKKKSDALTVQFR ILK

51 100
LeVHA-D (51)    KIVSTKESMGDVMKNSSFALTEAKYAAGENIKHVVLENVQTATLKVRSRQ
AtVHA-D (51)    KIVTAKESMGDMMKTSSFALTEVKYVAGDNVKHVVLENVKEATLKVRSRT
Consensus (51)  KIVS KESMGDMMK SSFALTE KY AGDNIKHVVLENV ATLKVRSR

101 150
LeVHA-D (101)   ENIAGVKLPKFEHFSEGETKNDLTGLARGGQQVQACRAAYVKSIELLVEL
AtVHA-D (101)   ENIAGVKLPKFDHFSEGETKNDLTGLARGGQQVRACRVAYVKAIEVLVEL
Consensus (101) ENIAGVKLPKFDHFSEGETKNDLTGLARGGQQV ACR AYVKAIELLVEL

151 200
LeVHA-D (151)   ASLQTSFLTLDEAIKTTNRRVNALENVVKPRLENTVLYIKGELDELERED
AtVHA-D (151)   ASLQTSFLTLDEAIKTTNRRVNALENVVKPKLENTISYIKGELDELERED
Consensus (151) ASLQTSFLTLDEAIKTTNRRVNALENVVKPKLENTI YIKGELDELERED

201 250
LeVHA-D (201)   FFRLKKIQGYKKREVEKQMAAARLYAAEKSAEEFSLKRGISLGSAHNLLS
AtVHA-D (201)   FFRLKKIQGYKRREVERQAANAKEFAEEMVLEDISMQRGISINAARNFLV
Consensus (201) FFRLKKIQGYKKREVEKQ A AK FA E ED SL RGISI AA N L

251 261
LeVHA-D (251)   HASQKDDDIIF
AtVHA-D (251)   GGAEKDSDIIF
Consensus (251) AA KD DIIF
Figure 12. Alignment of D subunits in tomato and Arabidopsis.
Table 8. Subunit D amino acid identities.
        | LeVHA-D | AtVHA-D
LeVHA-D | 100 | 80
AtVHA-D | | 100
1 50
LeVHA-E1 (1)    MNDADVSKQIQQMVRFIRQEAEEKANEISVSAEEEFNIEKLQLVEAEKKK
LeVHA-E2 (1)    MNDADVSKQIQQMVRFIRQEAEEKANEISVSAEEEFNIEKLQLVEAEKKK
Consensus (1)   MNDADVSKQIQQMVRFIRQEAEEKANEISVSAEEEFNIEKLQLVEAEKKK

51 100
LeVHA-E1 (51)   IRQEYERKEKQVDVRKKIEYSMQLNASRIKVLQAQDDLVNTMKEAAAKEL
LeVHA-E2 (51)   IRQEYERKEKQVDVRKKIEYSMQLNASRIKVLQAQDDLVCSMKEAASKEL
Consensus (51)  IRQEYERKEKQVDVRKKIEYSMQLNASRIKVLQAQDDLV SMKEAAAKEL

101 150
LeVHA-E1 (101)  LNVSHHEHGIIDSILHHHHGGYKKLLHDLIVQSLLRLKEPCVLLRCRKHD
LeVHA-E2 (101)  LNVSHHHN------HHIYKKLLQALIVQSLLRLKEPSVLLRCREDD
Consensus (101) LNVSHH H YKKLL LIVQSLLRLKEP VLLRCR D

151 200
LeVHA-E1 (151)  VHLVEHVLEGVKEEYAEKASVHQPEIIVDEIHLPPAPSHHNMHGPSCSGG
LeVHA-E2 (141)  VPLVEDVLDAAKEEYAEKSQVHAPEVIVDQIYLPPAPSHHNAHGPSCSGG
Consensus (151) V LVE VLDA KEEYAEKA VH PEIIVD IHLPPAPSHHN HGPSCSGG

201 241
LeVHA-E1 (201)  VVLASRDGKIVCENTLDARLEVVFRKKLPEIRKCLFGQVAA
LeVHA-E2 (191)  VVLASRDGKIVCENTLDARLEVVFRKKLPEIRKCLFGQVAV
Consensus (201) VVLASRDGKIVCENTLDARLEVVFRKKLPEIRKCLFGQVA
Figure 13. Alignment of E subunits in tomato.
1 50
LeVHA-E1 (1)    MNDADVSKQIQQMVRFIRQEAEEKANEISVSAEEEFNIEKLQLVEAEKKK
LeVHA-E2 (1)    MNDADVSKQIQQMVRFIRQEAEEKANEISVSAEEEFNIEKLQLVEAEKKK
AtVHA-E1 (1)    MNDADVSKQIQQMVRFIRQEAEEKANEISVSAEEEFNIEKLQLVEAEKKK
AtVHA-E2 (1)    MNDADVSKQIQQMVRFIRQEAEEKANEISISAEEEFNIERLQLLESAKRK
AtVHA-E3 (1)    MNDADASIQIQQMVRFIRQEAEEKANEISISSEEEFNIEKLQLVEAEKKK
Consensus (1)   MNDADVSKQIQQMVRFIRQEAEEKANEISVSAEEEFNIEKLQLVEAEKKK

51 100
LeVHA-E1 (51)   IRQEYERKEKQVDVRKKIEYSMQLNASRIKVLQAQDDLVNTMKEAAAKEL
LeVHA-E2 (51)   IRQEYERKEKQVDVRKKIEYSMQLNASRIKVLQAQDDLVCSMKEAASKEL
AtVHA-E1 (51)   IRQEYERKEKQVDVRKKIEYSMQLNASRIKVLQAQDDLVNTMKEAAAKEL
AtVHA-E2 (51)   LRQDYDRKLKQVDIRKRIDYSTQLNASRIKYLQAQDDVVTAMKDSAAKDL
AtVHA-E3 (51)   IRQEYEKKEKQVDVRKKIDYSMQLNASRIKVLQAQDDIVNAMKEEAAKQL
Consensus (51)  IRQEYERKEKQVDVRKKIEYSMQLNASRIKVLQAQDDLVNTMKEAAAKEL

101 150
LeVHA-E1 (101)  LNVSHHEHGIIDSILHHHHGGYKKLLHDLIVQSLLRLKEPCVLLRCRKHD
LeVHA-E2 (101)  LNVSHHHN------HHIYKKLLQALIVQSLLRLKEPSVLLRCREDD
AtVHA-E1 (101)  LNVSHHEHGIIDSILHHHHGGYKKLLHDLIVQSLLRLKEPCVLLRCRKHD
AtVHA-E2 (101)  LRVSNDKN------NYKKLLKSLIIESLLRLKEPSVLLRCREMD
AtVHA-E3 (101)  LKVSQHGF------FNHHHHQYKHLLKDLIVQCLLRLKEPAVLLRCREED
Consensus (101) LNVSHH HHH YKKLL DLIVQSLLRLKEPSVLLRCRE D

151 200
LeVHA-E1 (151)  VHLVEHVLEGVKEEYAEKASVHQPEIIVDE---IHLPPAPSHHNMHGPSC
LeVHA-E2 (141)  VPLVEDVLDAAKEEYAEKSQVHAPEVIVDQ---IYLPPAPSHHNAHGPSC
AtVHA-E1 (151)  VHLVEHVLEGVKEEYAEKASVHQPEIIVDE---IHLPPAPSHHNMHGPSC
AtVHA-E2 (139)  KKVVESVIEDAKRQYAEKAKVGSPKITIDEKVFLPPPPNPKLPDSHDPHC
AtVHA-E3 (145)  LDIVESMLDDASEEYCKKAKVHAPEIIVDKD--IFLPPAPSDDDPHALSC
Consensus (151) V LVE VLEGAKEEYAEKA VHAPEIIVDE IHLPPAPSHHN HGPSC

201 250
LeVHA-E1 (198)  SGGVVLASRDGKIVCENTLDARLEVVFRKKLPEIRKCLFGQVAA------
LeVHA-E2 (188)  SGGVVLASRDGKIVCENTLDARLEVVFRKKLPEIRKCLFGQVAV------
AtVHA-E1 (198)  SGGVVLASRDGKIVCENTLDARLEVVFRKKLPEIRKCLFGQVAA------
AtVHA-E2 (189)  SGGVVLASQDGKIVCENTLDARLDVAFRQKLPQIRTRLVGAPETSRA---
AtVHA-E3 (193)  AGGVVLASRDGKIVCENTLDARLEVAFRNKLPEFCSKGSFLEMCVDPKVA
Consensus (201) SGGVVLASRDGKIVCENTLDARLEVVFRKKLPEIRKCLFGQVA

251 300
LeVHA-E1 (242)  ------
LeVHA-E2 (232)  ------
AtVHA-E1 (242)  ------
AtVHA-E2 (236)  ------
AtVHA-E3 (243)  LRQGWCSLMSDSNFITKEKLRDAKSMNPTGRRCPDPNGVEKKSMCYSSCK
Consensus (251)

301 323
LeVHA-E1 (242)  ------
LeVHA-E2 (232)  ------
AtVHA-E1 (242)  ------
AtVHA-E2 (236)  ------
AtVHA-E3 (293)  TQGFMGGSCQGHKGNYMCECYEG
Consensus (301)
Figure 14. Alignment of E subunits in tomato and Arabidopsis.
Table 9. Subunit E amino acid identities.
            LeVHA-E1  LeVHA-E2  AtVHA-E1  AtVHA-E2  AtVHA-E3
LeVHA-E1         100        89       100        72        55
LeVHA-E2                   100        89        75        56
AtVHA-E1                             100        72        55
AtVHA-E2                                       100        48
AtVHA-E3                                                 100
                1                                               50
LeVHA-F   (1)   MANRAPVRTNNSALIAMIADEDTITGFLLAGVGNVDLRRKTNYLIVDSKT
AtVHA-F   (1)   MAGRATIPARNSALIAMIADEDTVVGFLMAGVGNVDIRRKTNYLIVDSKT
Consensus (1)   MA RA I NSALIAMIADEDTI GFLLAGVGNVDIRRKTNYLIVDSKT

                51                                             100
LeVHA-F   (51)  TVKQIEDAFKEFTTREDIAIVLISQYVANMIRFLVDSYNKPIPAILEIPS
AtVHA-F   (51)  TVRQIEDAFKEFSARDDIAIILLSQYIANMIRFLVDSYNKPVPAILEIPS
Consensus (51)  TVKQIEDAFKEFS RDDIAIILISQYIANMIRFLVDSYNKPIPAILEIPS

                101                        130
LeVHA-F   (101) KDHPYDPAHDSVLSRVKYLFSTESVAGDRR
AtVHA-F   (101) KDHPYDPAHDSVLSRVKYLFSAESVSQR--
Consensus (101) KDHPYDPAHDSVLSRVKYLFS ESVA
Figure 15. Alignment of F subunits in tomato and Arabidopsis.
Table 10. Subunit F amino acid identities.
           LeVHA-F  AtVHA-F
LeVHA-F        100       82
AtVHA-F                 100
                1                                               50
LeVHA-G1  (1)   MESSRGGQNGIQLLLAAEQEAQRIVNVARTAKQARLKQAKEEAEKEIAEF
LeVHA-G2  (1)   MESNRGNQNGIQQLLGAEQEAQHIVNAARSAKQARLKQAKDEAEKEIAEF
Consensus (1)   MES RG QNGIQ LLAAEQEAQ IVN ARSAKQARLKQAKDEAEKEIAEF

                51                                             100
LeVHA-G1  (51)  RAYMEAEFQRKLEQTSGDSGANVKRLEIETNEKIEHLKTEASRVSADVVQ
LeVHA-G2  (51)  RAFMEAEFQRKLEQTSGDSGANVKRLDQETFAKIQHLKAESESISNDVVQ
Consensus (51)  RAFMEAEFQRKLEQTSGDSGANVKRLD ET KI HLK EA IS DVVQ

                101     111
LeVHA-G1  (101) MLLRHVTTVKN
LeVHA-G2  (101) MLLRQVTTVKN
Consensus (101) MLLR VTTVKN
Figure 16. Alignment of G subunits in tomato.
                1                                               50
LeVHA-G1  (1)   MESSRGGQNGIQLLLAAEQEAQRIVNVARTAKQARLKQAKEEAEKEIAEF
LeVHA-G2  (1)   MESNRGNQNGIQQLLGAEQEAQHIVNAARSAKQARLKQAKDEAEKEIAEF
AtVHA-G1  (1)   MESNRG-QGSIQQLLAAEVEAQHIVNAARTAKMARLKQAKEEAEKEIAEY
AtVHA-G2  (1)   MES-----AGIQQLLAAEREAQQIVNAARTAKMTRLKQAKEEAETEVAEH
AtVHA-G3  (1)   MDSLRG-QGGIQMLLTAEQEAGRIVSAARTAKLARMKQAKDEAEKEMEEY
Consensus (1)   MES RG QGGIQQLLAAEQEAQ IVNAARTAKMARLKQAKEEAEKEIAEF

                51                                             100
LeVHA-G1  (51)  RAYMEAEFQRKLEQTSGDSGANVKRLEIETNEKIEHLKTEASRVSADVVQ
LeVHA-G2  (51)  RAFMEAEFQRKLEQTSGDSGANVKRLDQETFAKIQHLKAESESISNDVVQ
AtVHA-G1  (50)  KAQTEQDFQRKLEETSGDSGANVKRLEQETDTKIEQLKNEASRISKDVVE
AtVHA-G2  (46)  KTSTEQGFQRKLEATSGDSGANVKRLEQETDAKIEQLKNEATRISKDVVD
AtVHA-G3  (50)  RSRLEEEYQTQVSGT--DQEADAKRLDDETDVRITNLKESSSKVSKDIVK
Consensus (51)  RA ME EFQRKLE TSGDSGANVKRLEQETD KIEQLK EASRISKDVV

                101     111
LeVHA-G1  (101) MLLRHVTTVKN
LeVHA-G2  (101) MLLRQVTTVKN
AtVHA-G1  (100) MLLKHVTTVKN
AtVHA-G2  (96)  MLLKNVTTVNN
AtVHA-G3  (98)  MLIKYVTTTAA
Consensus (101) MLLKHVTTVKN
Figure 17. Alignment of G subunits in tomato and Arabidopsis.
Table 11. Subunit G amino acid identities.
            LeVHA-G1  LeVHA-G2  AtVHA-G1  AtVHA-G2  AtVHA-G3
LeVHA-G1         100        81        77        69        54
LeVHA-G2                   100        75        68        54
AtVHA-G1                             100        81        55
AtVHA-G2                                       100        49
AtVHA-G3                                                 100
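The identity values in these tables can be checked, at least approximately, against the alignments themselves. The following is a minimal Python sketch, assuming that identity is the number of identical residues divided by the number of alignment columns in which at least one of the two sequences has a residue. The formula used by the original alignment software is not stated here, so this definition is an assumption, and `percent_identity` is a hypothetical helper name. The sketch is applied to the two tomato G subunits from Figure 16, which align without gaps.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two rows taken from the same alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue  # column is empty in both rows, so it is not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residues to compare")
    return 100.0 * matches / compared


# Rows copied from the tomato G subunit alignment (Figure 16).
LEVHA_G1 = (
    "MESSRGGQNGIQLLLAAEQEAQRIVNVARTAKQARLKQAKEEAEKEIAEF"
    "RAYMEAEFQRKLEQTSGDSGANVKRLEIETNEKIEHLKTEASRVSADVVQ"
    "MLLRHVTTVKN"
)
LEVHA_G2 = (
    "MESNRGNQNGIQQLLGAEQEAQHIVNAARSAKQARLKQAKDEAEKEIAEF"
    "RAFMEAEFQRKLEQTSGDSGANVKRLDQETFAKIQHLKAESESISNDVVQ"
    "MLLRQVTTVKN"
)

identity = percent_identity(LEVHA_G1, LEVHA_G2)
print(f"LeVHA-G1 vs LeVHA-G2: {identity:.0f}% identity")
```

Under this definition the two rows differ at 21 of 111 positions, which rounds to the 81% reported for LeVHA-G1 and LeVHA-G2 in Table 11. Pairs that contain gaps may give slightly different values depending on how gap columns are treated.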
                1                                               50
LeVHA-H   (1)   MTTESVELTTEEVLRRDIPWETYMTTKLITGTGLQLLRRYDKKAESYKAQ
AtVHA-H   (1)   --MDQAELSIEQVLKRDIPWETYMNTKLVSAKGLQLLRRYDKKPESARAQ
Consensus (1)   D ELS E VLKRDIPWETYM TKLISA GLQLLRRYDKK ES KAQ

                51                                             100
LeVHA-H   (51)  LLDDDGPGYVRVFVTILRDIFKEETVEYVLALIDEMLTANPKRARLFHDK
AtVHA-H   (49)  LLDEDGPAYVHLFVSILRDIFKEETVEYVLALIYEMLSANPTRARLFHDE
Consensus (51)  LLDDDGPAYV LFVSILRDIFKEETVEYVLALI EMLSANP RARLFHD

                101                                            150
LeVHA-H   (101) SLADEDTYEPFLRLLWKGNWFIQEKSCKILSLTVSARSKVQNGADANGDA
AtVHA-H   (99)  SLANEDTYEPFLRLLWKGNWFIQEKSCKILAWIISARPKAGNAVIGNG--
Consensus (101) SLA EDTYEPFLRLLWKGNWFIQEKSCKILA ISAR K NA ANG

                151                                            200
LeVHA-H   (151) SSSKKKITTIDDVLAGVVEWLCAQLRKPTHPTRSIASTINCLSTLLKEPV
AtVHA-H   (147) ---------IDDVLKGLVEWLCAQLKQPSHPTRGVPIAISCLSSLLKEPV
Consensus (151) IDDVL GLVEWLCAQLK PSHPTR I I CLSSLLKEPV

                201                                            250
LeVHA-H   (201) VRSSFVRADGVKLLVPLISPASTQQSIQLLYETCLCVWLLSYYEPAIEYL
AtVHA-H   (188) VRSSFVQADGVKLLVPLISPASTQQSIQLLYETCLCIWLLSYYEPAIEYL
Consensus (201) VRSSFV ADGVKLLVPLISPASTQQSIQLLYETCLCIWLLSYYEPAIEYL

                251                                            300
LeVHA-H   (251) ATSRALTRLIEVVKGSTKEKVVRVVILTLRNLLSKGTFSAHMVDLGVLQI
AtVHA-H   (238) ATSRTMQRLTEVVKHSTKEKVVRVVILTFRNLLPKGTFGAQMVDLGLPHI
Consensus (251) ATSR L RL EVVK STKEKVVRVVILT RNLL KGTF A MVDLGL I

                301                                            350
LeVHA-H   (301) VQSLKAQAWSDEDLLDALNQLEQGLKENIKKLSSFDKYKQEVLLGHLDWS
AtVHA-H   (288) IHSLKTQAWSDEDLLDALNQLEEGLKDKIKKLSSFDKYKQEVLLGHLDWN
Consensus (301) I SLK QAWSDEDLLDALNQLE GLKD IKKLSSFDKYKQEVLLGHLDW

                351                                            400
LeVHA-H   (351) PMHKDPIFWRENINNFEENDFQILRVLITILDTSSDARTLAVACYDLSQF
AtVHA-H   (338) PMHKETNFWRENVTCFEENDFQILRVLLTILDTSSDPRSLAVACFDISQF
Consensus (351) PMHKD FWRENI FEENDFQILRVLITILDTSSD RSLAVACFDISQF

                401                                            450
LeVHA-H   (401) IQCHSAGRIIVNDLKAKERVMRLLNHDNAEVTKNALLCIQRLFLGAKYAS
AtVHA-H   (388) IQYHAAGRVIVADLKAKERVMKLINHENAEVTKNAILCIQRLLLGAKYAS
Consensus (401) IQ HAAGRIIV DLKAKERVMKLINHDNAEVTKNAILCIQRL LGAKYAS

                451
LeVHA-H   (451) FLQA
AtVHA-H   (438) FLQA
Consensus (451) FLQA
Figure 18. Alignment of H subunits in tomato and Arabidopsis.
Table 12. Subunit H amino acid identities.
           LeVHA-H  AtVHA-H
LeVHA-H        100       77
AtVHA-H                 100
Appendix 2: Annotated sequences for novel tomato transcripts/proteins
LOCUS       404 bp    mRNA    linear   PLN 08-MAR-2004
DEFINITION  Acyl carrier protein.
ACCESSION   AY568716
KEYWORDS
SOURCE      tomato.
  ORGANISM  Lycopersicon esculentum
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            asterids; lamiids; Solanales; Solanaceae; Solanum; Lycopersicon.
REFERENCE   1  (bases 1 to 404)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Identification, accumulation, and functional prediction of novel
            tomato transcripts systemically up-regulated after fire damage
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 404)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Direct Submission
  JOURNAL   Submitted (08-MAR-2004) Botany, North Carolina State University,
            Gardner Hall, Campus Box 7612, Raleigh, NC 27695, USA
FEATURES             Location/Qualifiers
     source          1..404
                     /organism="Lycopersicon esculentum"
                     /db_xref="taxon:4081"
     CDS             1..402
                     /codon_start=1
                     /product="Acyl carrier protein"
                     /translation="MASLSATCLRFGCSVNTSQINGGTVKLVSVGWGRSSAGFPSLRT
                     SRLRVAAAKAETIDKVISIVRKQLALPADTKVSPESTFTKDLGADSLDTVEIVMALEE
                     EFGIAVEEENSENIVTVQDAADLIEKLVEKK"
     transit_peptide 1..144
                     /note="Predicted chloroplast transit peptide"
     misc_feature    166..381
                     /note="Acyl carrier protein phosphopantetheine domain"
     misc_feature    268..270
                     /note="Phosphopantetheine attachment site (Serine-90)"
BASE COUNT      120 a     75 c    104 g    105 t
ORIGIN
        1 atggctagtc tttcagctac ttgtctcaga tttggctgtt ctgtcaacac atctcagata
       61 aacggaggca ctgtgaagtt ggtttcagtg ggttggggaa ggagtagtgc tggtttccct
      121 tctctaagaa catcccgcct tcgtgttgca gctgcaaagg cagagacaat tgataaggta
      181 ataagcatag tgagaaaaca actagcttta ccagcagaca ctaaggtcag ccctgaaagt
      241 actttcacta aggacctcgg agccgactct ctggacactg tagaaattgt gatggcccta
      301 gaagaagagt ttgggattgc agtagaagaa gagaactctg agaatattgt aacagttcaa
      361 gatgctgctg acttgattga aaaacttgtt gagaagaagt agac
//
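The annotations in a record such as AY568716 can be spot-checked by translating part of the ORIGIN sequence and comparing the result with the /translation qualifier. The following is a minimal Python sketch, assuming the standard genetic code (the record does not name a translation table, so this is an assumption). `translate` and `CODON_TABLE` are hypothetical names introduced only for illustration.

```python
# Standard genetic code, written in the conventional t, c, a, g base order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    dna = dna.lower().replace(" ", "")
    protein = []
    for start in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE.get(dna[start:start + 3], "X")  # X = unknown codon
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 30 bases of the ORIGIN block of AY568716 (acyl carrier protein).
first_residues = translate("atggctagtc tttcagctac ttgtctcaga")
print(first_residues)

# Bases 268..270 lie in the ORIGIN line that starts at base 241.
line_241 = "actttcacta aggacctcgg agccgactct".replace(" ", "")
site_codon = line_241[268 - 241:270 - 241 + 1]
print(site_codon, translate(site_codon))
```

The first ten residues agree with the start of the /translation qualifier (MASLSATCLR), and the codon at 268..270 is `tct`, a serine codon, consistent with the annotated phosphopantetheine attachment site (Serine-90).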
LOCUS       1386 bp    mRNA    linear   PLN 08-MAR-2004
DEFINITION  Adenylyl-sulfate reductase.
ACCESSION   AY568717
KEYWORDS
SOURCE      tomato.
  ORGANISM  Lycopersicon esculentum
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            asterids; lamiids; Solanales; Solanaceae; Solanum; Lycopersicon.
REFERENCE   1  (bases 1 to 1386)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Identification, accumulation, and functional prediction of novel
            tomato transcripts systemically up-regulated after fire damage
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 1386)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Direct Submission
  JOURNAL   Submitted (08-MAR-2004) Botany, North Carolina State University,
            Gardner Hall, Raleigh, NC 27695, USA
FEATURES             Location/Qualifiers
     source          1..1386
                     /organism="Lycopersicon esculentum"
                     /db_xref="taxon:4081"
     transit_peptide 1..213
                     /note="Predicted chloroplast transit peptide"
     CDS             1..1386
                     /codon_start=1
                     /product="Adenylyl-sulfate reductase"
                     /translation="MALTFTSSSAIHGSLSSSSSSYEQPKVSQLGTFQPLDRPQLLSS
                     TVLNSRRRSAVKPLYAEPKRNDSIVPSAATIVAPEVGESVEAEDFEKLAKELQNASPL
                     EVMDKALEKFGDDIAIAFSGAEDVALIEYAHLTGRPYRVFSLDTGRLNPETYQLFDTV
                     EKHYGIRIEYMFPDSVEVQALVRTKGLFSFYEDGHQECCRVRKVRPLRRALKGLRAWI
                     TGQRKDQSPGTRSEIPIVQVDPSFEGLDGGAGSLVKWNPVANVDGKDIWNFLRAMNVP
                     VNSLHSQGYVSIGCEPCTRPVLPGQHEREGRWWWEDAKAKECGLHKGNIKDETVNGAA
                     QTNGTATVADIFDTKDIVTLSKPGVENLVKLEDRREPWLVVLYAPWCQFCQAMEGSYV
                     ELAEKLAGSGVKVGKFRADGDQKAFAQEELQLGSFPTILFFPKHSSKAIKYPSEKRDV
                     DSLLAFVNALR"
     misc_feature    19..63
                     /note="Serine-rich region"
     misc_feature    346..963
                     /note="Phosphoadenosine phosphosulfate reductase domain"
     misc_feature    1048..1377
                     /note="Thioredoxin domain 2"
BASE COUNT      361 a    280 c    368 g    377 t
ORIGIN
        1 atggctttga ctttcacttc ttcatctgca attcatggct ctttgtcttc ttcatcttct
       61 tcttatgaac aacccaaagt atcccaattg ggtacctttc agccattgga taggcctcaa
      121 ctattgtcgt caactgtttt gaattctcgg aggcgttcgg cagtgaagcc attgtatgct
      181 gaacctaaga ggaatgattc aatagttccg tcagcagcta ccatcgtggc tcctgaggta
      241 ggagagagtg ttgaggcaga ggactttgag aaattggcta aggagcttca aaatgcttcc
      301 cctcttgagg ttatggacaa agcacttgag aaatttggag atgacattgc tattgctttc
      361 agtggtgctg aagatgttgc tttgatagag tacgcacatt taactggacg accatacaga
      421 gtattcagcc ttgatactgg gaggttgaac ccggagacat accaattatt tgacacagtg
      481 gagaagcact atggcattcg cattgaatac atgttccctg attcagttga agttcaggcg
      541 ttggttagga ccaaagggct tttctctttc tatgaggatg gccaccaaga gtgttgccgt
      601 gtaaggaagg ttaggccttt gaggagagct ctaaagggct tacgcgcctg gatcacaggc
      661 cagcgtaaag atcagtcccc tggaactcga tcagaaatcc ccattgttca ggtggaccct
      721 tcttttgagg ggttggatgg cggtgctggt agcttggtga agtggaaccc tgtggctaat
      781 gtggacggaa aagatatttg gaacttcctg cgtgccatga atgtgcctgt gaactcattg
      841 cattcacaag gatatgtatc cattggatgc gaaccttgca caaggccagt tctaccaggg
      901 caacacgaga gagagggaag atggtggtgg gaagatgcca aggccaagga gtgtggcttg
      961 cacaagggca acatcaagga tgaaactgta aatggcgctg cccaaacaaa tggtactgct
     1021 accgttgctg atatttttga taccaaggac attgttacct tgagtaagcc tggagttgag
     1081 aacctagtaa aattggaaga ccgaagagag ccttggctcg ttgttcttta tgcaccttgg
     1141 tgccaatttt gccaggcaat ggaaggatcc tatgttgaat tggctgagaa gttggctggt
     1201 tctggtgtga aagtagggaa attcagggca gatggtgacc agaaagcatt tgcacaagaa
     1261 gaattgcagc ttggcagctt ccctacaata ctcttcttcc caaagcactc ttcaaaggcc
     1321 attaagtacc cttcagagaa gagggacgta gactccttgc tggcttttgt gaatgctctc
     1381 agatga
//
LOCUS       717 bp    mRNA    linear   PLN 08-MAR-2004
DEFINITION  Unknown protein.
ACCESSION   AY568718
KEYWORDS
SOURCE      tomato.
  ORGANISM  Lycopersicon esculentum
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            asterids; lamiids; Solanales; Solanaceae; Solanum; Lycopersicon.
REFERENCE   1  (bases 1 to 717)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Identification, accumulation, and functional prediction of novel
            tomato transcripts systemically up-regulated after fire damage
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 717)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Direct Submission
  JOURNAL   Submitted (08-MAR-2004) Botany, North Carolina State University,
            Gardner Hall, Raleigh, NC 27695, USA
FEATURES             Location/Qualifiers
     source          1..717
                     /organism="Lycopersicon esculentum"
                     /db_xref="taxon:4081"
     transit_peptide 1..132
                     /note="Predicted chloroplast transit peptide"
     CDS             1..717
                     /codon_start=1
                     /product="Unknown protein"
                     /translation="MACAALSANSCTIASSSTGRLSFSTYQKDSKLRQRHSLVRFRVR
                     ASTDDSDCNAEECAPDKEVGKVSMEWVAMDNTKVVGTFPPRKPRGWTGYVEKDTAGQT
                     NIYSVEPAVYVAESAISSGTAGTSSDGAENTKAISAGIALISVAAASSILLQVGKNSP
                     PPIQTVEYRGPSLSYYINKLKPAEIVQASITEAPTAPETEEVAITPEVESSAPEAPAP
                     QVEVQSEAPQDTSSSSSNIS"
     misc_feature    403..459
                     /note="Predicted transmembrane region"
BASE COUNT      210 a    174 c    165 g    168 t
ORIGIN
        1 atggcttgtg ctgctttatc agcaaacagc tgcaccatag cttcatcgtc tactggacga
       61 ttgagctttt ccacatacca aaaggactca aaattgaggc aaagacacag tctcgtccga
      121 ttcagagttc gggcttcaac tgacgattct gattgcaatg ctgaagaatg tgccccagac
      181 aaggaggttg ggaaggtgag catggaatgg gtagccatgg acaacaccaa agtggttggt
      241 acatttccac ctcgtaagcc gcgtggctgg acagggtatg ttgagaagga tactgctggg
      301 cagacaaata tatactctgt tgagcctgca gtttatgtag cagaaagtgc tataagctct
      361 ggtactgcag gcacctcatc tgatggagca gagaacacca aagctatttc agctgggata
      421 gccttaatct ctgttgcagc tgcttcatcg attctccttc aagttgggaa gaactcacct
      481 cctccgatac aaacagtgga gtacagggga ccatccctta gctactatat caacaagctt
      541 aagccagcgg aaatagtcca agcttcaata accgaagcac caactgcacc agaaaccgaa
      601 gaagtagcaa ttacaccaga agttgaaagc tctgctccag aagctcctgc tccacaagtt
      661 gaagtccaat ctgaagcccc tcaggacact tcaagttcaa gttctaacat ctcttag
//
LOCUS       693 bp    mRNA    linear   PLN 08-MAR-2004
DEFINITION  Photosystem II oxygen-evolving complex protein 3 (PsbQ).
ACCESSION   AY568719
KEYWORDS
SOURCE      tomato.
  ORGANISM  Lycopersicon esculentum
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            asterids; lamiids; Solanales; Solanaceae; Solanum; Lycopersicon.
REFERENCE   1  (bases 1 to 693)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Identification, accumulation, and functional prediction of novel
            tomato transcripts systemically up-regulated after fire damage
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 693)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Direct Submission
  JOURNAL   Submitted (08-MAR-2004) Botany, North Carolina State University,
            Gardner Hall, Raleigh, NC 27695, USA
FEATURES             Location/Qualifiers
     source          1..693
                     /organism="Lycopersicon esculentum"
                     /db_xref="taxon:4081"
     CDS             1..693
                     /codon_start=1
                     /product="Photosystem II oxygen-evolving complex protein 3
                     (PsbQ)"
                     /translation="MAHAMASMGGLIGSSQTVLDGSLQLSGSARLSTVSTNRIALSRP
                     GLTVRAQQGSVDIETSRRAMIGLVAAGLAGSVAKAAFAEARSIKVGPPPPPSGGLPGT
                     LNSDEARDFSLPLKNRFYLQPLTPAEAAQRVKDSAKEIVSVKDFIDKKAWPYVQNDLR
                     LRAEYLRYDLKTVISAMPKEQKGKLQDLSGKLFKTISDLDHAAKTKNSAEAQKYYAET
                     VTTLNDVLANLG"
     transit_peptide 1..147
                     /note="Predicted chloroplast transit peptide"
     misc_feature    193..246
                     /note="Predicted transmembrane region"
     misc_feature    553..636
                     /note="Predicted coiled coil"
BASE COUNT      178 a    165 c    171 g    179 t
ORIGIN
        1 atggctcatg ctatggcttc tatgggtggc ctaattggtt cttcacaaac tgtcttggat
       61 ggtagcctcc agcttagtgg ctcagcccgc ttgagtactg ttagcaccaa cagaattgcc
      121 ttgtctagac caggactcac tgtcagagcc caacaggggt ctgttgacat cgaaactagc
      181 cgtagagcca tgattggtct tgttgctgct ggcctagctg gttccgttgc taaagcagct
      241 tttgctgaag ccaggtcaat taaggttggc cccccacctc ctccctcggg tggattgcct
      301 ggaactttga actcagatga ggcaagggac ttcagtttgc cattgaagaa taggttttac
      361 cttcaaccgt tgactccagc tgaggcagcc cagagagtta aggattcagc caaggagatt
      421 gttagtgtca aggatttcat cgacaagaag gcctggcctt acgtccagaa tgaccttcgt
      481 ctcagagcag aataccttcg ctatgacctt aagactgtta tctctgctat gccaaaagaa
      541 cagaagggaa aactccagga tctgtctgga aagctcttta agaccattag tgatctggac
      601 catgcagcaa agaccaagaa cagtgctgaa gcacagaagt actatgctga aactgtaact
      661 accttaaatg atgttttggc caacctgggc tag
//
LOCUS       1224 bp    mRNA    linear   PLN 08-MAR-2004
DEFINITION  Putative anion:sodium symporter.
ACCESSION   AY568720
KEYWORDS
SOURCE      tomato.
  ORGANISM  Lycopersicon esculentum
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            asterids; lamiids; Solanales; Solanaceae; Solanum; Lycopersicon.
REFERENCE   1  (bases 1 to 1224)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Identification, accumulation, and functional prediction of novel
            tomato transcripts systemically up-regulated after fire damage
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 1224)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Direct Submission
  JOURNAL   Submitted (08-MAR-2004) Botany, North Carolina State University,
            Gardner Hall, Raleigh, NC 27695, USA
FEATURES             Location/Qualifiers
     source          1..1224
                     /organism="Lycopersicon esculentum"
                     /db_xref="taxon:4081"
     transit_peptide 1..93
                     /note="Predicted mitochondria_chloroplast transit peptide"
     CDS             1..1224
                     /note="Involved in sulfate assimilation by which inorganic
                     sulfate is processed and incorporated into sulfated
                     compounds."
                     /codon_start=1
                     /product="Putative anion:sodium symporter"
                     /translation="MASLSRFIGKQCKLQCSDTLQRPSYGFCVRRSPTHLSMGMRNKD
                     EIGRYNLFINQNQSKTSLVQSPCNRKIVCCEAASNVSGESSSTGMTQYEKIIETLTTL
                     FPLWVILGTIIGIYKPSAVTWLETDLFTLGLGFLMLSMGLTLTFDDFRRCLRNPWTVG
                     VGFLAQYFIKPLLGFTIAMALKLSAPLATGLILVSCCPGGQASNVATYISKGNVALSV
                     LMTTCSTVGAIVMTPLLTKLLAGQLVPVDAAGLAISTFQVVLVPTVIGVLSNEFFPKF
                     TSKIVTITPLIGVILTTLLCASPIGQVADVLKTQGAQLLLPVAALHAAAFFLGYQISK
                     FSFGESTSRTISIECGMQSSALGFLLAQKHFTNPLVAVPSAVSVVCMALGGSALAVYW
                     RNQPIPVDDKDDFKE"
     misc_feature    295..348
                     /note="Predicted transmembrane region"
     misc_feature    384..919
                     /note="Putative leucine zipper motif"
     misc_feature    385..438
                     /note="Predicted transmembrane region"
     misc_feature    475..528
                     /note="Predicted transmembrane region"
     misc_feature    547..600
                     /note="Predicted transmembrane region"
     misc_feature    637..705
                     /note="Predicted transmembrane region"
     misc_feature    742..801
                     /note="Predicted transmembrane region"
     misc_feature    838..903
                     /note="Predicted transmembrane region"
     misc_feature    940..993
                     /note="Predicted transmembrane region"
     misc_feature    1030..1083
                     /note="Predicted transmembrane region"
     misc_feature    1120..1173
                     /note="Predicted transmembrane region"
BASE COUNT      314 a    252 c    268 g    390 t
ORIGIN
        1 atggcttctc tgtccagatt tattgggaaa caatgtaaat tgcagtgttc agacacactt
       61 cagagaccaa gttatgggtt ttgtgttaga aggagtccga cccatttgag tatgggtatg
      121 agaaataaag atgagattgg aagatataat ttgttcatca atcaaaatca aagtaagact
      181 tccctagttc aatccccgtg caatcgcaaa atagtatgtt gcgaggcagc atcaaatgtg
      241 tctggggaaa gctcttccac tggaatgacc caatatgaga aaataattga gactttgacc
      301 accctttttc ctctatgggt tatattgggt acaatcattg gcatatataa accttctgcg
      361 gtcacttggt tggaaacaga tctcttcact ctgggtttgg gatttctaat gctttcaatg
      421 ggtttgacac taacatttga cgacttccga agatgtttaa ggaacccatg gactgtaggt
      481 gttggatttc tcgctcagta cttcattaaa ccactcttag gcttcaccat agcaatggct
      541 ctaaagttgt ccgccccact tgctactggt ctgatcttgg tgtcatgctg tcctggaggc
      601 caagcttcta atgtggcaac atatatttca aaggggaatg tagccctctc tgttctaatg
      661 acaacgtgtt caacagttgg agctattgtg atgacacccc tgctgactaa gcttttagct
      721 ggtcagcttg tcccagttga tgctgccggt cttgctatca gcacctttca agttgtgcta
      781 gtgccaacag ttattggagt tctatcaaat gagttttttc ctaagtttac gtcaaaaatc
      841 gtcaccatca cacctttaat tggagttatt ctgactactc ttctttgtgc tagtccgatt
      901 ggtcaagtcg cagatgtgct gaaaactcag ggagcacagt tacttctccc tgtggcggcc
      961 ttgcatgctg cagcattttt tctgggttac cagatttcaa aattttcatt tggtgaatca
     1021 acatccagaa ctatttcgat agaatgtgga atgcagagtt cggcactcgg atttctactt
     1081 gcacaaaagc atttcacaaa ccctcttgtt gctgtacctt ctgctgttag tgttgtctgc
     1141 atggcacttg gtggaagtgc tctagctgtg tactggagga atcaaccaat tcctgttgat
     1201 gacaaggatg attttaagga gtaa
//
LOCUS       555 bp    mRNA    linear   PLN 08-MAR-2004
DEFINITION  Unknown wound/stress protein.
ACCESSION   AY568721
KEYWORDS
SOURCE      tomato.
  ORGANISM  Lycopersicon esculentum
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            asterids; lamiids; Solanales; Solanaceae; Solanum; Lycopersicon.
REFERENCE   1  (bases 1 to 555)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Identification, accumulation, and functional prediction of novel
            tomato transcripts systemically up-regulated after fire damage
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 555)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Direct Submission
  JOURNAL   Submitted (08-MAR-2004) Botany, North Carolina State University,
            Gardner Hall, Campus Box 7612, Raleigh, NC 27695, USA
FEATURES             Location/Qualifiers
     source          1..555
                     /organism="Lycopersicon esculentum"
                     /db_xref="taxon:4081"
     transit_peptide 1..84
                     /note="Predicted secretory pathway transit peptide (i.e.
                     golgi or ER)"
     CDS             1..555
                     /codon_start=1
                     /product="Unknown wound/stress protein"
                     /translation="MGVAAQVNQMWFNLMIVLFFVSISSISAEDCVYTAYIRTGSIIK
                     AGTDSNISLTLYDANGYGLRIKNIEAWGGLMGPGYNYFERGNLDIFSGKGPCVNGPIC
                     KMNLTSDGTGPHHGWYCNYVEVTVTGAKKQCNQQLFTVNQWLGTDVSPYKLTAIRNNC
                     KNKYESGELKPLYDSESFSIVDVI"
     misc_feature    91..465
                     /note="Lipoxygenase homology domain"
BASE COUNT      158 a    110 c    124 g    163 t
ORIGIN
        1 atgggagtag cagctcaagt taaccaaatg tggttcaatc tcatgatcgt cctcttcttc
       61 gtctctattt cttctatttc tgctgaagat tgtgtttaca cagcttacat tcgcactggt
      121 tcaatcataa aagctggtac cgattcaaac atttcgttga ctctctacga tgccaatggc
      181 tatggacttc gaataaaaaa catagaggcc tggggtggac ttatgggtcc aggttacaac
      241 tactttgaaa gaggaaactt ggatatcttc agtgggaaag gtccttgtgt gaatggaccg
      301 atctgtaaaa tgaatttgac ttcagatggt actggaccac accatggatg gtactgtaac
      361 tacgtggaag tcaccgttac cggagctaaa aaacaatgca accagcagtt gttcaccgtg
      421 aatcagtggc tgggcactga tgtttcgccg tataagctaa cggccatcag gaataactgt
      481 aagaacaagt atgagtccgg tgagctaaag cccctttatg attctgaatc attttctata
      541 gttgatgtaa tttaa
//
LOCUS       936 bp    mRNA    linear   PLN 08-MAR-2004
DEFINITION  Chloroplast-specific ribosomal protein.
ACCESSION   AY568722
KEYWORDS
SOURCE      tomato.
  ORGANISM  Lycopersicon esculentum
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            asterids; lamiids; Solanales; Solanaceae; Solanum; Lycopersicon.
REFERENCE   1  (bases 1 to 936)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Identification, accumulation, and functional prediction of novel
            tomato transcripts systemically up-regulated after fire damage
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 936)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Direct Submission
  JOURNAL   Submitted (08-MAR-2004) Botany, North Carolina State University,
            Gardner Hall, Campus Box 7612, Raleigh, NC 27695, USA
FEATURES             Location/Qualifiers
     source          1..936
                     /organism="Lycopersicon esculentum"
                     /db_xref="taxon:4081"
     transit_peptide 1..225
                     /note="Predicted chloroplast signal peptide"
     CDS             1..936
                     /codon_start=1
                     /product="Chloroplast-specific ribosomal protein"
                     /translation="MATLSLSPSVGTTFHSLHSYPNGSSSYSSSCPATASPALSLTLS
                     STNSRFLNSAFKMNEINVPVRNRVTKSFGVRMSWDGPLSSVKLILQGKNLELTPAVKD
                     YVEEKLGKAVQKHSHLAREVDVRLSVRGGELGKGPKIRRCEVTLFTKKHGVIRAEEDG
                     ESIYGSIDMVSSIIQRKLRKIKEKDSDRGRHMKGFDRLKVRDPEALLVQEDLETLSQE
                     EEVEDDKSDGFVTEVVRKKSFDMPPLSVNEAIEQLENVDHDFYGFRNEETGEINIVYR
                     RKEGGYGLIIPKEDGKTEKLEPLEVEPEKEPSIAE"
     misc_feature    256..564
                     /note="Sigma 54 modulation protein domain"
BASE COUNT      281 a    170 c    232 g    253 t
ORIGIN
        1 atggcgactc tttccctttc cccttccgtg ggaacaactt ttcactctct ccatagctac
       61 ccaaatggtt cctcatccta ttcttcttct tgtcccgcta ctgcttctcc agctttgtca
      121 ctgacattgt catctaccaa ttcacgattt ttaaattcag ctttcaagat gaatgaaatt
      181 aatgttcctg tcaggaatag ggtgacaaaa tcctttgggg tccggatgtc ttgggatggt
      241 ccactttctt ctgttaaact cattcttcaa gggaaaaatc ttgagttaac acctgctgta
      301 aaggactatg tggaagagaa gttgggtaag gcagttcaaa agcacagcca tctagccagg
      361 gaagtggatg ttaggctgtc tgttcgaggt ggagagcttg gaaaaggccc aaaaattcga
      421 agatgtgaag ttactctatt tacgaaaaag catggagtga ttcgtgcaga ggaagacggt
      481 gagtcaattt atggaagtat agatatggta tcatcaatta tacagagaaa gttgcggaaa
      541 attaaggaga aggattcaga ccgtggtcgc cacatgaagg gcttcgatag gctgaaagtc
      601 agggacccag aggcgctgtt agttcaagag gatcttgaaa cactttccca agaggaagaa
      661 gttgaagatg acaagagtga tggctttgtt actgaggttg ttcgtaagaa gtcctttgac
      721 atgccacctt taagtgtcaa tgaagcaatt gaacagctgg aaaatgtcga ccatgacttc
      781 tatggtttcc ggaatgagga aactggtgag attaacatcg tttacagacg aaaagaaggg
      841 ggttatggac ttattattcc aaaggaagat ggtaaaacag agaagttaga gcccttggag
      901 gttgaaccag agaaagaacc gtcgatagca gaataa
//
LOCUS       627 bp    mRNA    linear   PLN 08-MAR-2004
DEFINITION  Alpha/beta fold family protein.
ACCESSION   AY568723
KEYWORDS
SOURCE      tomato.
  ORGANISM  Lycopersicon esculentum
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            asterids; lamiids; Solanales; Solanaceae; Solanum; Lycopersicon.
REFERENCE   1  (bases 1 to 627)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Identification, accumulation, and functional prediction of novel
            tomato transcripts systemically up-regulated after fire damage
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 627)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Direct Submission
  JOURNAL   Submitted (08-MAR-2004) Botany, North Carolina State University,
            Gardner Hall, Campus Box 7612, Raleigh, NC 27695, USA
FEATURES             Location/Qualifiers
     source          1..627
                     /organism="Lycopersicon esculentum"
                     /db_xref="taxon:4081"
     CDS             1..627
                     /codon_start=1
                     /product="Alpha/beta fold family protein"
                     /translation="MVNLVEAQKPLLHGLMKLAGIRPHSIEIEPGTIMNFWVPSETII
                     QKTKKNKKITTTTPLSNNQYAISPDSTTEPDPNKPVVVLIHGFAGEGIVTWQFQIGAL
                     TKKYSVYVPDLLFFGGSVTDSSDRSPGFQAECLGKGLRKLGVEKCVVVGFSYGGMVAF
                     KMAEMFPDLVEALVVSGSILAMTDSISTTTLNGLGIFIFFGAAAAYLC"
     misc_feature    235..504
                     /note="Alpha/beta hydrolase domain"
     misc_feature    238..306
                     /note="Predicted transmembrane region"
     misc_feature    439..498
                     /note="Predicted transmembrane region"
     misc_feature    553..621
                     /note="Predicted transmembrane region"
BASE COUNT      181 a    129 c    145 g    172 t
ORIGIN
        1 atggtgaact tggttgaagc acaaaaacca ttgttacatg gcctaatgaa attagctgga
       61 atcagacctc atagtataga gatagaacca ggcacaatta tgaatttttg ggttccttct
      121 gaaaccataa ttcaaaaaac gaagaaaaac aaaaaaatca caaccactac tcctctctcc
      181 aacaaccaat atgctatttc ccctgattcc accaccgaac ccgacccgaa caaacccgtg
      241 gtcgtactaa tccacggctt tgccggcgaa ggaatagtga cgtggcaatt tcaaatcggt
      301 gcattaacta aaaaatactc tgtttatgta ccggacctac ttttcttcgg cggatcagtt
      361 acggatagct ccgatagatc gccgggtttt caagcagagt gtttgggtaa agggctgagg
      421 aaattaggcg tggaaaaatg cgtagtggtt ggatttagtt atggaggaat ggtggcgttt
      481 aagatggcgg aaatgtttcc agatttagtt gaggcgttgg tggtgtctgg atcgatatta
      541 gcgatgactg attccattag cactaccacg ctcaatggtt tggggatttt catcttcttc
      601 ggagctgctg ctgcctacct ctgttaa
//
LOCUS       453 bp    mRNA    linear   PLN 08-MAR-2004
DEFINITION  Histidine triad family protein.
ACCESSION   AY568724
KEYWORDS
SOURCE      tomato.
  ORGANISM  Lycopersicon esculentum
            Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta;
            Spermatophyta; Magnoliophyta; eudicotyledons; core eudicots;
            asterids; lamiids; Solanales; Solanaceae; Solanum; Lycopersicon.
REFERENCE   1  (bases 1 to 453)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Identification, accumulation, and functional prediction of novel
            tomato transcripts systemically up-regulated after fire damage
  JOURNAL   Unpublished
REFERENCE   2  (bases 1 to 453)
  AUTHORS   Coker,J.S., Vian,A. and Davies,E.
  TITLE     Direct Submission
  JOURNAL   Submitted (08-MAR-2004) Botany, North Carolina State University,
            Gardner Hall, Campus Box 7612, Raleigh, NC 27695, USA
FEATURES             Location/Qualifiers
     source          1..453
                     /organism="Lycopersicon esculentum"
                     /db_xref="taxon:4081"
     transit_peptide 1..90
                     /note="Predicted transit peptide (unknown location)"
     CDS             1..453
                     /note="Involved in cell cycle regulation."
                     /codon_start=1
                     /product="Histidine triad family protein"
                     /translation="MIVRRKTPALKVYEDDVCLCILDANPLCFGHSLVIPKSHFTSLQ
                     ETPSSVVAAMSSKLPLISSAVMKATGCDSFNLLVNNGAAAGQVIYHTHIHIIPRKASD
                     CLWTSETLSRCPLKSDEAQKLADGIRENLSISSNIEDSKGQGSSLVVN"
     misc_feature    4..315
                     /note="Histidine triad family domain"
BASE COUNT      127 a     93 c    100 g    133 t
ORIGIN
        1 atgattgttc gacgtaaaac acctgcttta aaggtctatg aggatgatgt atgcctctgc
       61 attttggatg caaacccatt gtgttttggg cactcgcttg tcatcccaaa gtctcatttt
      121 acttctttgc aagaaactcc atcatcagtt gtggctgcca tgagttcaaa attgcccttg
      181 attagcagtg cagtcatgaa agccactggt tgtgattcgt tcaacttgtt agttaacaac
      241 ggggcagcag ctggccaggt tatatatcat acccacattc atataattcc tcgtaaagca
      301 agcgattgcc tctggacttc tgagacctta agtagatgtc cgctgaagtc agacgaggct
      361 cagaaacttg cagatggtat tagagaaaac ttatcaattt cgagcaacat tgaagatagt
      421 aaggggcaag gatcaagtct cgttgtaaac tag
//
Appendix 3: Perspectives on student research experiences in plant biology
Overview
Student research is a vital part of our national science education infrastructure and the essence of inquiry-based learning. Research training efforts in plant biology are especially important now because plants are central to unprecedented 21st-century challenges such as world food supply, environmental protection, and genetic modification. A series of surveys has been administered to the American Society of Plant Biologists to understand the following aspects of student research experiences: the extent of participation among plant biologists (and subgroups therein) in training student researchers, the advantages and disadvantages of research training, the effectiveness of various training techniques, and the mentor perception of institutional incentives. Overall, 89% and 49% of potential mentors have supported undergraduate and high school student researchers, respectively, and the average plant biologist trains 1.3 undergraduates and 0.3 high school students per year. Time efficiency seems to be the most important issue for mentor participation and success. For example, the vast majority of comments about disadvantages involve the time spent by senior researchers training students, the time constraints of students, and the effects of training students on lab productivity. Similarly, many respondents who report great successes with young researchers mention strategies for saving time, maximizing productivity, and using resources wisely. Even though the vast majority of plant biologists find that mentoring student researchers is rewarding, only 49% and 17% perceive institutional incentives for working with undergraduates and high school students, respectively. To assess educational outcomes from the student perspective, another set of surveys has been administered to undergraduate students in the Botany Department at N.C. State University.
Positive educational outcomes that were rated especially highly included a greater appreciation for teaching/research, greater initiative towards pursuing a career, an increase in skills, and greater consideration of attending graduate school. Students found that the experiences were effective at building 5 “leadership skills” (teamwork, problem-solving, getting along with others, analytical skills, and time management) and somewhat effective at developing 4 others (writing, speaking, work ethic, and integrity).
A National Perspective on Mentoring Student Researchers in Plant Biology
Jeffrey S. Coker and Eric Davies
Abstract
Student research is a vital part of our national scientific infrastructure. We have surveyed
the American Society of Plant Biologists to measure participation levels and to
understand mentor perspectives on student research experiences in the plant sciences.
Overall, the average numbers of undergraduates and high school students mentored per
year by plant biologists are 1.3 and 0.3, respectively. Whereas most mentors oversee
undergraduates regularly, 9% of the mentoring population hosts more high school
students than the other 91%. Only 49% and 17% of plant biologists perceive institutional
incentives for mentoring undergraduate and high school student researchers, respectively.
The numbers of students mentored per year and the percentages of mentors perceiving
institutional incentives vary according to institutional type and academic rank. For example,
faculty at primarily undergraduate institutions host more undergraduates per year (1.9)
and perceive more institutional support (65%) for undergraduate research than those from all other institutional types. The highest-ranked institutional incentives were funding, academic credit for both student and mentor, and consideration in promotion/tenure decisions. Over 90% of mentor comments involving successful training techniques fell into one of three categories: designing a project that is simple and has clear goals, providing hands-on supervision, and ensuring good communication and explanations.
Introduction
Over the next century, plant biologists will be faced with at least 3 challenges of great importance to humanity: ensuring a sustainable food supply for over 10 billion people; developing safe, responsible practices with regard to genetically modified organisms; and protecting the planet’s flora from a multitude of threats to the environment. A substantial group of highly trained scientists will be needed to deal with the complexity and breadth of these challenges. Accordingly, we have begun to assess research training in plant biology on a national level to understand what reforms might improve it in terms of both training practices and institutional support. We define research as any investigation that attempts to make an original intellectual or creative contribution to a discipline. Despite our focus on the plant sciences, we anticipate that our findings will be applicable to all areas of laboratory-based research.
Research training at the high school and undergraduate levels has direct implications for the national economy. Although scientists and engineers in research and development (R & D) constitute less than 1% of the total U.S. workforce, they drive innovations and new technologies that improve health, agriculture, environment, consumer products, and national defense. As the National Science Foundation (NSF) stated, “The funding and conduct of R&D has always been viewed as essential to the
Nation” (NSF, 2000). The United States spent $265 billion on R & D in the year 2000 alone, $30 billion of which was spent by academic institutions (NSF, 2002). In this context, funding agencies such as NASA and the NSF are expanding student research opportunities (Service, 2002) and considering “integration of research and education” as
one of four criteria for reviewing research grants (NSF, 2003). Furthermore, a number of national organizations have recommended the expansion and improvement of efforts to include students in college/university research (Sigma Xi, 1989; NSF, 1996; Howard Hughes
Medical Institute, 2002; Council on Undergraduate Research, 2003). The first recommendation of the Boyer Commission on Educating Undergraduates in the Research
University (1998) was to “make research-based learning the standard”.
Given the high level of investment and emphasis, studies assessing the prevalence and/or quality of research training experiences are surprisingly scarce. The vast majority of “assessment” involves the self-assessment of particular programs for major funding agencies. Although self-assessments are important for improving individual programs and training techniques, it is difficult to make comparisons between them and they may not have much value outside the funding agency. Furthermore, these assessments give few clues about the research training habits of mentors without major funding (a large proportion of mentors). Thus, the quantity and quality of student research experiences on a national level remain uncertain. Such uncertainty should be a concern since the number of scientific/technical articles published by American biologists dropped 22% between
1992 and 1999 (National Science Foundation, 2002).
Most of the published literature on research training, though rich in examples of specific programs, has similar limitations. Over the last decade, publications on research training have fallen into three major categories: non-specific large-scale accounts
(Seago, Jr. 1992; Austin 1997; Schowen 1998; Druger 1998; Craig 1999; Levesque and
Wise 2001), descriptions of particular programs/courses (Ortez 1994; Heppner 1996;
Nikolova Eddins et al. 1997; Chaplin et al. 1998; Krasny 1999; McLean 1999; Henderson and Buising 2000; Boersma et al. 2001; Hutchison and Atwood, 2002), and descriptions of
particular training methods (Beer 1995; Durso 1997; Lewis et al., 2002; Griffin et al.,
2003). Recently, there have also been several surveys of larger populations of student researchers in chemistry and biology (Mabrouk and Peters, 2000), psychology (Landrum and Nelsen, 2002), and medicine (Solomon et al., 2003), as well as a survey of mentors in plant biology (Coker and Davies, 2002) and an institutional survey of liberal arts colleges
(Research Corporation, 2001). In the study presented here, we synthesize data from mentors in order to understand the plant biology research training landscape in broad terms.
We have administered a series of surveys via the Education Committee of the
American Society of Plant Biologists (ASPB; www.aspb.org). The ASPB is a professional organization that promotes the interests of plant scientists and publishes two well-respected journals, Plant Physiology and The Plant Cell. Its members work in a
diverse array of government, industry, and academic environments across six continents
and are largely representative of laboratory-based plant biology researchers, although
most are college and university faculty in the United States.
Our first survey found that 89% and 49% of ASPB members have supported
undergraduate and high school student researchers, respectively (Coker and Davies,
2002). Respondents discussed the advantages and disadvantages of supporting student
researchers (Coker and Davies, 2002). In the current report, we present findings from a
second survey involving the numbers of undergraduate and high school students
mentored and mentor perceptions of institutional support. Respondents also indicated
which training techniques, research project ideas, and institutional incentives they have
found to be effective and ineffective.
Materials and Methods
Data collection
Two mass emails were sent via the ASPB Education Committee to everyone on the society’s membership list (fall 2001). These emails briefly explained the survey and referred members to a website where the survey was posted. We received responses from about 7% of the society (338 members out of around 5,000). It seemed possible that our sample population was a biased subgroup of the whole society and might contain a disproportionate number of members who were especially interested in student research. To explore this possibility, we administered the quantitative portion of the survey to 50 ASPB members selected at random at the national meeting in Denver (2002).
The survey instrument contained several questions that have unknown, yet quantifiable, answers regarding student research mentoring.
• How many years have you been in a position where you could host high school and/or undergraduate researchers?
  A) 0-5 B) 5-10 C) 11-15 D) 16-25 E) Over 25
• About how many high school researchers have you supported in your career (if any)?
  A) 0-5 B) 5-10 C) 11-15 D) 16-25 E) Over 25
• About how many undergraduate researchers have you supported in your career (if any)?
  A) 0-5 B) 5-10 C) 11-15 D) 16-25 E) Over 25
• Does your institution give any incentives for you to mentor undergraduate/high school research?
  Undergraduate: Yes / No
  High school: Yes / No
It also contained five open-response items asking for the following:
• Training techniques or research projects that work well with beginning researchers
• Training techniques or research projects that do not work well with beginning researchers
• Institutional incentives for mentoring student researchers which are most effective
• Institutional incentives for mentoring student researchers which are least effective
• Incentives not available but which would be appealing to mentor, student, or administration
Finally, several questions allowed us to gather the following demographic information for correlative analysis (Table 1):
• Type of institution
• Position at that institution
• Gender
• Ethnicity
• Country of residence
Data were grouped and summarized for correlative analysis relative to numbers of
undergraduates mentored (Table 2) and perceptions about institutional support (Table 3).
Quantitative analyses
The survey data are discrete (not continuous) in that each respondent is classified
based on the various categories in Tables 2 and 3, leading us to test hypotheses using
chi-squared statistics. We constructed r x c tables to calculate a chi-squared statistic to test
each of our null hypotheses. Our null hypotheses were always that there is no
relationship between the row and column categories in Tables 2 and 3. Following
standard convention, chi-squared values corresponding to a probability of .05 or less were considered significant.
For example, we tested the null hypothesis “The number of undergraduates mentored is independent of the type of institution” by building an r x c table with r (rows) being types of institutions and c (columns) being categories of number of undergraduates mentored. We then calculated a chi-squared value for each box in the table based on its deviation from an expected value (the row total multiplied by the column total, divided by the overall total). After finding that the sum of all chi-squared values in the table corresponded to a p-value less than .05, we concluded that we “Can reject the null hypothesis”, or rather, “The number of undergraduates trained could be dependent on the type of institution.” We then proceeded to test more specific hypotheses, such as “The
number of undergraduates mentored is independent of the mentor being an academic or
non-academic.” Our final conclusions were based upon r x c tables that had expected
values in each box of at least 5, since lower values can skew chi-squared statistics. Thus, we do not make separate conclusions about mentors from government, industry, and the private sector because of low sample sizes. Instead, we pooled these groups into a “non-academics” category. Also, we cannot make conclusions involving mentor nationality or ethnicity.
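The procedure described above can be illustrated with a minimal sketch. The counts below are hypothetical and are not the survey data; the function name and the small table of critical values at p = .05 are likewise assumptions made for illustration. Expected values are computed as (row total x column total) / overall total, and the sum of the per-box contributions is compared with the critical value for (r - 1)(c - 1) degrees of freedom.

```python
# Hypothetical counts, NOT the survey data: rows are institution types,
# columns are categories of number of undergraduates mentored.
observed = [
    [30, 40, 50],  # e.g. research universities
    [10, 20, 40],  # e.g. primarily undergraduate institutions
    [20, 10, 5],   # e.g. non-academics
]

# Standard critical chi-squared values at p = .05, keyed by degrees of freedom.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 6: 12.592}


def chi_squared(table):
    """Return (statistic, degrees of freedom, expected counts) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count for each box under the null hypothesis of independence.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    # Sum of (observed - expected)^2 / expected over all boxes.
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


statistic, dof, expected = chi_squared(observed)
# Check that every expected value is at least 5 before drawing conclusions.
all_cells_ok = all(e >= 5 for row in expected for e in row)
reject_null = statistic > CRITICAL_05[dof]
print(f"chi-squared = {statistic:.2f}, df = {dof}, reject null: {reject_null}")
```

With these made-up counts the null hypothesis of independence would be rejected; with other counts, or with expected values below 5, the conclusion could differ.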
Qualitative analyses
Each open-response item was analyzed by grouping similar comments together and counting the total number of comments in each group. If a respondent mentioned multiple techniques/incentives in the same comment, the comment was included in the count for all applicable groups. Comments were ranked according to their total counts.
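As a rough illustration of this counting scheme, the sketch below tallies hypothetical coded comments; the group labels and comments are invented for the example, and the grouping itself (done by hand in the study) is assumed to have already happened. A comment that mentions several groups is counted once in each.

```python
from collections import Counter

# Hypothetical coded comments: each entry lists every group one comment mentions.
coded_comments = [
    ("clear project", "supervision"),
    ("communication",),
    ("clear project",),
    ("clear project", "communication", "supervision"),
]

# A comment counts toward every group it mentions.
counts = Counter(group for comment in coded_comments for group in comment)

# Rank groups by total count (ties broken alphabetically for a stable order).
ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
for group, total in ranked:
    print(group, total)
```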
Results and Discussion
Demographics
The majority of respondents were professors at colleges and universities in the
United States (Table 1). About 65% of the whole sample worked at either land-grant or other research universities, whereas 20% worked at primarily undergraduate institutions
(PUIs). Those representing PUIs were mostly from liberal arts colleges (85%). Most of those from academia were either assistant, associate, or full professors, although a few post-docs and graduate students (nearly all from universities) also completed the survey.
Respondents not in academia (13%) were a mix of researchers from government, industry, and the private sector.
The length of time respondents had been in a position to mentor was well distributed. A little more than half (56%) were in the first 10 years of their mentoring career, whereas the rest were evenly distributed among the 11-15, 16-25, and over 25 year categories. In general, those in academia were evenly spread throughout these categories, whereas non-academics were more frequently in the 0-5 year category.
The vast majority (74%) worked in the United States, whereas the remaining 26% were from 26 different countries on 6 continents. Most were Caucasian (68%) or Asian
(11%) in descent. Despite the international diversity of the sample, diversity was relatively low with respect to Hispanics and Latinos (4.1%), as well as African-
Americans (0.3%).
Sampling concerns
To ensure that respondents to the online survey were not a biased subgroup of the
ASPB membership, we administered the quantitative portion to 50 ASPB members at random at an ASPB national meeting. Because these randomly administered surveys gave statistics that were very similar to those from the online survey (data not shown), we conclude that the online survey did, indeed, provide a representative sample of the whole population.
Numbers of undergraduates trained
The percentages of respondents who have mentored various numbers of undergraduates are shown in Table 2. Overall, 30%, 22%, 14%, 13%, and 22% of respondents mentored 0-5, 5-10, 11-15, 16-25, and >25 undergraduates, respectively
(Table 2). In Table 2, percentages are also shown for various respondent groups including type of institution, academic position, years in a position to mentor, number of high school students mentored, and gender. For purposes of comparing groups, the total number of students mentored is less important than the rate of mentoring (the number of students per mentor year).
In order to estimate the number of undergraduates trained per mentor, we assume that the average response in any given category is equal to the middle of that category’s scale (e.g., someone who marked the “5-10” category has been a potential mentor for 7.5 years). To be conservative, we used 1.5 students as the average number in the “0-5” category for number of undergraduates mentored, and estimated that those who have mentored “Over 25” students average 42 students over the whole population. This number comes from a linear extrapolation based on the assumption that those who reach the “Over 25” student category in a given length of time will continue to mentor at the same rate for the rest of their career. We also assume that those who have been a potential mentor for “Over 25” years average 30 years. Based on these estimations, we calculate the average number of undergraduates per mentor to be 15 (Table 2). Since the average length of mentoring career in the population was 12 years, we conclude that the average number of undergraduates per mentor year is about 1.3 (Table 2).
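The estimate can be approximated with a short calculation. This is a sketch under stated assumptions rather than the exact computation behind Table 2: it uses the rounded overall percentages quoted in the text (which sum to 101% and are renormalized here), category midpoints for the middle categories, and the stated values of 1.5 and 42 for the lowest and highest categories. The variable names are illustrative.

```python
# Share of respondents in each "undergraduates mentored" category (rounded
# percentages from the text; they sum to 1.01 and are renormalized below).
shares = {"0-5": 0.30, "5-10": 0.22, "11-15": 0.14, "16-25": 0.13, "Over 25": 0.22}

# Representative number of students per category: midpoints for the middle
# categories, plus the stated 1.5 ("0-5") and 42 ("Over 25").
representative = {"0-5": 1.5, "5-10": 7.5, "11-15": 13.0, "16-25": 20.5, "Over 25": 42.0}

AVERAGE_CAREER_YEARS = 12  # average length of mentoring career stated in the text

total_share = sum(shares.values())
students_per_mentor = (
    sum(shares[k] * representative[k] for k in shares) / total_share
)
students_per_mentor_year = students_per_mentor / AVERAGE_CAREER_YEARS

print(f"{students_per_mentor:.1f} undergraduates per mentor")
print(f"{students_per_mentor_year:.1f} undergraduates per mentor year")
```

Under these assumptions the sketch gives roughly 15.7 undergraduates per mentor and about 1.3 per mentor year, close to the reported values of 15 and 1.3; the small difference in the first figure may reflect rounding of the percentages or details of the original calculation that are not given here.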
The relationships between length of mentoring career and the number of undergraduates trained are shown in Figure 1. In general, most mentors seem to be active throughout their careers, as shown by the steady increase in the numbers of undergraduates trained over time (Fig. 1). Another overall trend is that, within any
“length of mentoring career” category, there is significant variation in the numbers of students trained (Fig. 1; also reflected by error bars in Figs. 2 and 3). This is likely due to the broad spectrum of job responsibilities (and motivations) across the research community.
As one would expect, the total number of undergraduates mentored increases with higher academic positions. On the average, graduate and post-doctoral students mentored
2.2 and 3.7 total students (Fig. 2), respectively, which equates to 0.6-0.7 students per mentor year (Table 2). Assistant, associate, and full professors mentored 10, 19, and 27 total students, respectively (Fig. 2). Whereas there is no significant difference between the mentoring rates of assistant and associate professors (1.9-2.0 undergraduates per year), the rate of full professors is significantly lower (1.3 undergraduates per year). This
seems to indicate less involvement in research mentoring among full professors. There
are several possible explanations for this trend, including a shift toward administrative or
teaching duties over time, a drop in productivity among faculty after tenure, being more
selective in recruiting students, and a realization over time that mentoring is not rewarded
professionally. Another explanation could be, as one respondent said, “Since they
(student researchers) are extremely time-intensive and spend at least their share of supply money from my grant, I have had fewer as my career progresses.” On the other hand, it is also possible that student research has been emphasized more in recent years, making the overall mentoring rate higher among the younger faculty even though full professors may have mentored as many students recently.
The type of institution had a significant effect on the numbers and rates of undergraduates trained. Current faculty at universities averaged 1.3 undergraduates per year, while faculty at PUIs averaged 1.9 (Table 2). This difference results from 3 factors: full professors at PUIs mentoring slightly more than those at universities (Fig. 2), full professors making up a higher percentage of the total population than assistant or associate professors (Table 1), and PUI professors having been in a position to mentor 5-6 fewer years than those at universities (Table 2), which leads to a higher mentoring rate.
Researchers from government agencies and industry averaged 0.6-0.7 undergraduates per year. Thus, the data indicate that the average PUI professor mentors around 45% more undergraduates than faculty at universities and roughly three times as many as researchers in government/industry. Although we did not necessarily expect these results, they are understandable given the greater focus of PUIs on undergraduate education and the time that university professors must spend directing graduate and post-doctoral students (who,
in turn, also mentor undergraduates). These findings are also consistent with a recent study of 136 liberal arts colleges which found that the number of students engaged in some type of research rose 70% in the past decade (Research Corp., 2001).
Number of high school students trained
Using the same logic as above, we estimate the number of high school student researchers per mentor to be 3.6, and the average number per mentor year to be about 0.3.
Unlike undergraduate research mentoring, high school mentoring is driven by a small group of mentors. Over half of high school researchers who are mentored by professional plant biologists are guided by 9% of the mentor population. Furthermore, the medians of all “length of mentoring career” categories correspond to 0-5 high school students. For example, 96% of mentors in the 0-5 years category, as well as 62% in the
25+ years category, have trained 5 or fewer high school students.
What are the most prolific mentors doing?
Around 10% of mentors trained over 25 undergraduates in their first 10 years
(Fig. 1), suggesting that the most prolific mentors train at least 2-3 undergraduates per year and could train nearly 100 students in their career. A few mentors make significant efforts to train both undergraduate and high school students. For example, 10% of 25+ year mentors have trained over 25 high school students, and 4 of these 6 individuals also mentored more than 25 undergraduates. One respondent commented, “I train 3 to 12 undergraduate students each semester. Seven have been coauthors on published papers
since 1997.” On the other hand, a typical mentor trains 1-2 undergraduates per year and
0 high school students.
We wish to be clear here that numbers of students say nothing about the quality of student research experiences or the duration of those experiences. Whereas getting more students involved in research is advantageous, increasing the number of students trained can be counterproductive if the quality of experiences declines in the process. This being said, we feel certain that many mentors are doing an excellent job with a few students, while others are doing an excellent job with many students.
Effective training techniques and research projects
Over 90% of respondent comments regarding effective training techniques fell into one of the following three categories, each of which had subcategories:
1. Design a project that is simple and has clear goals (38%).
   a. Well-structured
   b. Uses a technique(s) common to the lab
   c. Uses a single technique or set of techniques
   d. Achievable in a short amount of time
2. Provide hands-on supervision (27%).
   a. Partner inexperienced students with experienced students
   b. Have students work in teams
3. Ensure good communication and explanations (27%).
   a. Explain theory, background, and context
   b. Provide clear, written directions
   c. Be available to listen and answer questions
For the most part, comments were remarkably consistent, and many respondents actually mentioned all three major points. There was only one detectable difference between the comments from various categories of respondents, which involved providing hands-on supervision. For researchers at universities, hands-on supervision equated to working
with a graduate student or post-doc (almost every comment said this). For those at PUIs, on the other hand, it meant direct supervision by a professor, working in teams, or partnering inexperienced undergraduates with experienced undergraduates. It has been suggested that the traditional roles of undergraduates, graduate students, and post-docs are blurring because lower-level students are participating more in research and higher-level students are demanding to be better mentored (González, 2001). Responses to our survey suggest that the roles of upper-level students and faculty may also be blurring, at least in the context of research mentoring, since upper-level students are often expected to mentor lower-level students.
Respondent comments about particular research projects that work well were largely a reflection of their own lab work. A wide array of techniques and projects were mentioned including PCR, DNA sequencing, morphology, histology, mutant screens, enzyme assays, protein purification, cloning, dye uptake into xylem, computer-based projects, etc. At the same time, no particular technique or project was mentioned significantly more than others. The message seems to be that students can successfully perform just about any technique for a research project, so long as the technique is standard in the lab and can be properly overseen. Some respondents also expressed a preference for techniques which are inexpensive and relevant to answering many different research questions.
Ineffective training techniques and research projects
The ineffective training techniques identified by respondents were the opposites
of those they had identified as effective. Again, over 90% of comments fell into three
classes:
1. Projects that are not simple and lack clear goals (53%).
2. No hands-on supervision (27%).
3. Poor communication and explanations (11%).
Similarly, the subcategories for ineffective techniques were the opposites of those
mentioned above for effective training techniques. In describing ineffective techniques
and projects, respondents mentioned project design 15 percentage points more, and communication 16
percentage points less, than they had when mentioning effective techniques. Other notable ineffective
techniques mentioned by several respondents included using students as technicians (to do nothing but routine tasks) and attempting to mentor students on projects not directly related to other work in the lab.
The common theme for all ineffective training techniques, whether during project design, in the lab, or in providing explanations and interpreting data, was that passive mentoring does not work. In other words, pointing undergraduate or high school students in a general direction and leaving them to figure things out on their own tends to fail.
Effective mentoring, on the other hand, involves an active process of project design, goal-setting, hands-on training, and guidance.
There was no consensus on particular laboratory projects that are ineffective for
training student researchers. In fact, nearly all that were mentioned as being ineffective
had been mentioned by someone else as a potentially effective project. The one
exception involved the use of radioactivity-based techniques during student projects,
which several respondents discouraged and another mentioned as “illegal” for those 18 and younger in Australia.
Perceptions of institutional incentives
Table 3 shows the percentages of respondents who think their institution gives incentives for mentoring student research. Overall, about one-half (49%) perceived institutional incentives for mentoring undergraduates, and only one-sixth (17%) perceived incentives for mentoring high school students. The vast majority of those respondents with incentives for high school mentoring also have incentives for undergraduate mentoring (88%), whereas only 2% of all mentors have incentives exclusive to high school researchers compared to 32% for undergraduates. Only 14% of all mentors perceive institutional incentives for mentoring both undergraduates and high school students, whereas 46% perceive no incentives for either.
Almost 50% of those at land grant and other research universities perceive that there are incentives for mentoring undergraduates (Table 3). If only faculty members at these universities are considered (i.e. excluding graduate and post-doctoral students), the percentage rises slightly to 53% (Table 3). In contrast, significantly more faculty at PUIs perceive incentives for undergraduate research (65%). The discrepancy between university faculty and PUI faculty involving undergraduate-related incentives derives mainly from the responses of full professors and partly from assistant and associate professors (Fig. 3). Seventy percent of full professors and 57% of associate professors at
PUIs responded that they have incentives compared with 55% and 47% at land-grant universities and other research universities, respectively (Figure 3). Although almost
equal numbers of assistant professors at “other research universities” (67%) and at PUIs
(69%) perceived they had undergraduate-related incentives, significantly fewer associate
and full professors at “other research universities” (38%) perceived institutional incentives than those at PUIs (55%; Fig. 3). At land-grant universities, the same percentage (55%) of faculty perceived institutional incentives regardless of academic rank (Fig. 3).
Between 16% and 18% of those at universities felt that there are incentives for mentoring high school students (whether or not graduate and post-doctoral students are included in the percentage). On the other hand, fewer of those at PUIs (8%) felt that they have incentives to mentor high school researchers (Table 3). This trend was supported by several comments by PUI faculty who said that their specific job is to work with undergraduates and not high school students.
Among government employees, the percentages of those perceiving incentives for undergraduate and high school research mentoring were both 54% (Table 3). In industry and the private sector, very few respondents (less than 15%) perceived incentives for mentoring student researchers (Table 3).
As one might expect, within colleges and universities there seem to be significantly more incentives to support undergraduates than high school students. On the other hand, respondents from government perceive little difference between incentives available for mentoring undergraduate and high school research. We were surprised to find that the group most positive about incentives for high school mentoring was from government, but must note that the sample size for this group was relatively low (n=22).
Most effective institutional incentives
The ranks of the 3 most effective institutional incentives from the mentor perspective were very clear:
1. Funding
2. Academic credit (for student AND mentor)
3. Consideration in promotion/tenure decisions
Most funding-related comments mentioned general laboratory needs such as supplies, training costs, stipends, travel, etc. Many noted that funds specifically for students are important, especially in the summer when students tend to need money. Methods of student payment include stipends, scholarships, fellowships, and free tuition. A small number of mentors also mentioned supplementing mentor salary as an incentive.
The second-most effective incentive, academic credit, has two separate elements.
First, students must receive course credit for their work. Next, mentors (who are usually college faculty) should receive teaching credit proportional to the number of students they mentor, and this credit should have practical value in determining overall workloads.
This relates to results from our previous survey, which showed that the biggest disadvantage of mentoring student researchers is that it takes a lot of time (Coker and
Davies, 2002). Most mentors will not commit to many students without gaining time somewhere else in their schedules.
The third-most effective incentive is consideration in career advancement decisions. We suspect that part of the reason why assistant professors tend to perceive more institutional incentives than their more experienced colleagues (Table 3) is that they have not yet been through the tenure-review process and thus do not realize how little
they may be rewarded for mentoring student researchers. “Institutionalizing” student research must include appropriate acknowledgement for those who put forth considerable mentoring effort.
Other incentives mentioned include the following: co-authorship of papers/posters/presentations by students, being in a culture where research mentoring is expected, having students add to research productivity, administrative support for practical arrangements (housing, parking, etc.), involvement with an honors program, and the expectation that it is (or will become) necessary for research funding.
Least effective institutional incentives
Mentors did not have a unified voice on particular incentives that do not work.
Nearly every comment dealt with a poor method of implementing or enforcing an incentive that would have been effective otherwise. The common theme of all comments was that mission statements, administrative encouragement, and departmental expectations are not effective in the absence of more tangible institutional incentives.
The most desirable institutional incentive not currently available
When asked to list incentives that are unavailable but which would be appealing to mentor, student, or administration, all three of the “most effective” incentives listed above were mentioned frequently. However, the most mentioned incentive was for mentors to receive teaching credit for mentoring student researchers. The following comments represent this opinion:
Academic credit would be most useful and provide for the time necessary for such a worthwhile undertaking.
We receive no "credit" for taking undergraduates in lab. Our work loads are calculated based on course loads, and although undergraduates who do year-long thesis projects with us do sign up with us as taking a course, we receive no credit for teaching such a course.
Recommendations for future studies
It is important to emphasize that the number of students mentored by an individual mentor does not indicate the quality and overall value of those experiences. In fact, a highly effective mentor could have a greater student retention rate which results in fewer total students than a less effective mentor. For this reason, we caution that individual mentors should not be evaluated based solely on numbers of students trained.
Student outcomes, nature and length of student research projects, and other quality measures should be taken into consideration along with numbers of students trained.
Thus, we suggest that future surveys attempt to measure both the quantity and quality of research experiences.
Finally, longitudinal data on student research would allow more informed decisions to be made by teachers, researchers, and administrators. Only with knowledge of long-term trends can research experiences be optimized and evaluated on a national scale. Therefore, we think that replicating this study in 10-20 years would have substantial value.
Acknowledgements
We thank members of the ASPB for their cooperation in filling out surveys and Sophia
Clotho for her advice.
References
Austin, C.A. (1997). A survey of final-year undergraduate laboratory projects in biochemistry and related degrees in Great Britain. Biochem. Educ. 25, 12-14.
Beer, R.H. (1995). Guidelines for the supervision of undergraduate research. J. Chem. Educ. 72, 721-722.
Boersma, S., Hluchy, M., Godshalk, G., Crane, J., DeGraff, D., and Blauth, J. (2001). Student-designed interdisciplinary science projects. J. Coll. Sci. Teach. 30, 397-402.
Boyer Commission on Educating Undergraduates in the Research University. (1998). Reinventing undergraduate education: A blueprint for America’s research universities.
Chaplin, S.B., Manske, J.M., and Cruise, J.L. (1998). Introducing freshmen to investigative research – A course for biology majors at Minnesota’s University of St. Thomas. J. Coll. Sci. Teach. 27, 347-350.
Coker, J.S., and Davies, E. (2002). Involvement of plant biologists in undergraduate and high school student research. J. Nat. Resour. Life Sci. Educ. 31, 44-47.
Council on Undergraduate Research. (2003). The Council on Undergraduate Research.
Craig, N.C. (1999). The joys and trials of doing research with undergraduates. J. Chem. Educ. 76, 595-597.
Druger, M. (1998). Teaching versus research – An ongoing issue at the college level. J. Nat. Resour. Life Sci. Educ. 27, 134-135.
Durso, F.T. (1997). Corporate-sponsored undergraduate research as a capstone experience. Teaching of Psychology 24, 54-56.
González, C. (2001). Undergraduate research, graduate mentoring, and the university mission. Science 293, 1624-1626.
Griffin, V., McMiller, T., Jones, E., and Johnson, C.M. (2003). Identifying novel helix-loop-helix genes in Caenorhabditis elegans through a classroom demonstration of functional genomics. Cell Biol. Educ. 2, 51-62.
Henderson, L., and Buising, C. (2000). A research-based molecular biology laboratory. J. Coll. Sci. Teach. 30, 322-327.
Heppner, F. (1996). Learning science by doing science. Am. Biol. Teach. 58, 372-374.
Howard Hughes Medical Institute. (2002). Undergraduate science education at research universities.
Hutchison, A.R., and Atwood, D.A. (2002). Research with first- and second-year undergraduates: a new model for undergraduate inquiry at research universities. J. Chem. Educ. 79, 125-126.
Krasny, M.E. (1999). Reflections on nine years of conducting high school research programs. J. Nat. Resour. Life Sci. Educ. 28, 17-23.
Landrum, E.R., and Nelsen, L.R. (2002). The undergraduate research assistantship: an analysis of the benefits. Teaching of Psychology 29, 15-19.
Levesque, M.J., and Wise, M. (2001). The Elon experience: Supporting undergraduate research across all disciplines. CUR Quarterly, Mar, 113-116.
Lewis, J.R., Kotur, M.S., Butt, O., Kulcarni, S., Riley, A.A., Ferrell, N., Sullivan, K.D., and Ferrari, M. (2002). Biotechnology apprenticeship for secondary-level students: Teaching advanced cell culture techniques for research. Cell Biol. Educ. 1, 26-42.
Mabrouk, P.A., and Peters, K. (2000). Student perspectives on undergraduate research experiences in chemistry and biology. CUR Quarterly, Sept, 25-33.
McLean, R.J.C. (1999). Original research projects – A major component of an undergraduate microbiology course. J. Coll. Sci. Teach. 29, 38-40.
National Science Foundation. (1996). Shaping the future: New expectations for undergraduate education in science, mathematics, engineering, and technology.
National Science Foundation. (2000). Science and engineering indicators-2000 (NSB 00-1).
National Science Foundation. (2002). Science and engineering indicators-2002 (NSB 02-1).
National Science Foundation. (2003). Grant proposal guide (NSF 03-041).
Nikolova Eddins, S.G., Williams, D.F., Bushek, D., Porter, D., and Kineke, G. (1997). Searching for a prominent role of research in undergraduate education: Project Interface. J. Excellence in College Teaching 8, 69-81.
Ortez, R.A. (1994). Investigative research in nonmajor freshman biology classes. J. Coll. Sci. Teach. 23, 296-300.
Research Corporation. (2001). Academic Excellence: The Sourcebook.
Schowen, K.B. (1998). Research as a critical component of the undergraduate educational experience. K.B. Schowen (Ed.), Washington, D.C.: National Academy Press. pp 73-81.
Seago, J.L., Jr. (1992). The role of research in undergraduate instruction. Am. Biol. Teach. 54, 401-405.
Service, R.F. (2002). New lure for young talent: extreme research. Science 297, 1633-1634.
Sigma Xi. (1989). An exploration of the nature and quality of undergraduate education in science, mathematics and engineering. A report of the National Advisory Group of Sigma Xi, The Scientific Research Society.
Solomon, S.S., Tom, S.C., Pichert, J., Wasserman, D., and Powers, A.C. (2003). Impact of medical student research in the development of physician-scientists. J. Investig. Med. 51, 149-156.
Table 1. Population demographics of respondents to a survey of the American Society of Plant Biologists (ASPB).

Category                                  % of pop.

Land-grant university                          41.0
Other research university                      24.2
Primarily undergraduate institution            20.1
Government                                      6.5
Industry                                        3.8
Institute, museum, private organization         2.4
Unknown                                         2.1

Full professor                                 31.4
Assistant professor                            17.5
Associate professor                            16.6
Post-doc                                        9.8
Graduate student                                8.3
Research director                               5.0
Research scientist                              3.8
Lab manager                                     3.6
Retired professor                               2.1
Other                                           1.2
Unknown                                         0.9

0-5 yrs in a position to mentor                30.0
5-10 yrs                                       25.5
11-15 yrs                                      14.8
16-25 yrs                                      14.2
Over 25 yrs                                    15.4

United States                                  74.3
Canada                                          4.1
Japan                                           2.7
Germany                                         2.4
Australia                                       1.5
Mexico                                          1.2
Taiwan                                          1.2
United Kingdom                                  0.9
Portugal                                        0.9
18 other countries                              7.1
Unknown                                         3.8

Caucasian                                      67.5
Asian                                          10.7
Hispanic/Latino                                 4.1
African-American                                0.3
Other                                           0.6
Unknown                                        16.9

Females                                        31.6
Males                                          64.6
Unknown                                         3.8
Table 2. Percentages of respondents who have mentored various numbers of undergraduates. Estimates of average number of undergraduates mentored and average number of undergraduates per mentor year are based on these percentages. Shaded regions show the most important trends that are supported by reasonable sample sizes (n). UG=Undergraduate researchers; HS=High school researchers.

                                                                                       Estim.    Estim.       Length of  Estim. avg.
                                                                                    avg. # UG   total #  avg. mentoring     # UG per
                                                                                          per        UG          career       mentor
                                n    0-5 UG   5-10 UG  11-15 UG  16-25 UG   > 25 UG    mentor  mentored           (yrs)         year
Overall                       339    29.6 %    21.8 %    14.0 %    13.1 %    21.5 %        15      5247            12.0          1.3

Institution
  Land-grant univ.            139      33.3      18.4      12.8      12.1      23.4        16      2186            12.9          1.2
    (only current faculty)     94      16.3      20.2      16.3      16.3      30.8        20      1879            15.4          1.3
  Other research univ.         82      22.5      11.3      18.8      23.8      23.8        18      1500            13.1          1.4
    (only current faculty)     58      11.3      22.6      11.3      24.2      30.6        21      1216            16.1          1.3
  PUIs                         68      13.0      24.6      21.7      15.9      24.6        18      1242             9.7          1.9
  Government                   22      40.9      36.4      13.6       4.5       4.5         8       172            11.6          0.7
  Industry                     13      69.2       7.8      15.4       0.0       7.8         7        89            12.3          0.6
  Inst., mus., priv. org.       8      62.5      25.0       0.0       0.0      12.5         8        65             7.9          1.0

Academic position
  Graduate student             28      85.2      14.8       0.0       0.0       0.0         2        67             3.8          0.6
  Post-doc                     33      71.9      21.9       6.3       0.0       0.0         4       116             5.3          0.7
  Assistant professor          59      27.1      37.3      13.6      18.6       3.4        10       593             5.3          1.9
  Associate professor          56       7.1      23.2      21.4      25.0      23.2        19      1079             9.8          2.0
  Full professor              106       4.8      12.4      16.2      17.1      49.5        27      2887            21.3          1.3

Years could mentor
  0-5                         101      66.0      25.0       8.0       1.0       0.0         4       411             2.5          1.6
  5-10                         86      20.9      26.7      17.4      24.4      10.5        14      1185             7.5          1.8
  11-15                        50      16.0      20.0      18.0      18.0      28.0        19       968            12.5          1.5
  16-25                        48       8.3      16.7      14.6      16.7      43.8        25      1196            20.0          1.2
  Over 25                      52       7.7      13.5      15.4       9.6      53.8        28      1435            30.0          0.9

# of HS mentored
  0-5                         261      32.8      23.6      14.3      13.9      15.4        13      3471            10.4          1.3
  5-10                         36      16.7      13.9        25       8.3      36.1        21       765            17.6          1.2
  11-15                        15        20        20         0      13.3      46.7        24       361            19.0          1.3
  16-25                         8      12.5      12.5      12.5         0      62.5        29       232            18.4          1.6
  Over 25                       8         0      12.5         0      12.5        75        35       280            27.1          1.3

Gender of mentor
  Male                        219      27.4      23.7        13        13      22.8        16      3502            13.1          1.2
  Female                      107      31.8      18.7      16.8        14      18.7        15      1566            10.1          1.4
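The estimate columns of Table 2 appear to be related arithmetically: the estimated total divided by n gives the average per mentor, and that average divided by the length of the average mentoring career gives the rate per mentor year. The following Python sketch checks this reading against a few rows copied from the table. The function and variable names are illustrative, and the relationship itself is an inference from the tabulated values rather than a procedure stated in the text.

```python
# Selected rows copied from Table 2:
# (n, estimated total UG mentored, average mentoring career in years,
#  reported avg UG per mentor, reported avg UG per mentor year)
ROWS = {
    "Overall": (339, 5247, 12.0, 15, 1.3),
    "PUIs": (68, 1242, 9.7, 18, 1.9),
    "Graduate student": (28, 67, 3.8, 2, 0.6),
    "Full professor": (106, 2887, 21.3, 27, 1.3),
}


def derived_columns(n, total_ug, career_years):
    """Return (avg UG per mentor, avg UG per mentor year).

    Assumes (not stated in the source) that both columns are derived
    from the estimated total, the sample size, and the career length.
    """
    per_mentor = total_ug / n
    per_mentor_year = per_mentor / career_years
    return per_mentor, per_mentor_year


def check_rows(rows):
    """Map each row label to True if the derived values round to the reported ones."""
    results = {}
    for label, (n, total_ug, career, reported_avg, reported_rate) in rows.items():
        per_mentor, per_year = derived_columns(n, total_ug, career)
        results[label] = (
            round(per_mentor) == reported_avg
            and round(per_year, 1) == reported_rate
        )
    return results


if __name__ == "__main__":
    for label, consistent in check_rows(ROWS).items():
        print(f"{label}: {'consistent' if consistent else 'differs'}")
```

Only four rows are included here; the same check can be extended to the remaining rows by adding them to `ROWS`.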
Table 3. Respondent perceptions of institutional incentives for mentoring student researchers. Shaded regions show the most important trends that are supported by reasonable sample sizes (n). UG=Undergraduate researchers; HS=High school researchers.

                                n  % Yes for UG  % Yes for HS
Overall                       339          49.2          16.5

Institution
  Land-grant univ.            139          47.5          17.2
    (only current faculty)     94          55.3          18.1
  Other research univ.         82          48.1          13.0
    (only current faculty)     58          51.7          10.7
  PUIs                         68          65.2           8.1
  Government                   22          54.5          54.5
  Industry                     13           7.7           7.7
  Inst., mus., priv. org.       8          12.5           0.0

Academic position
  Graduate student             28          38.5          23.1
  Post-doc                     33          29.0           9.7
  Assistant professor          59          67.2          17.3
  Associate professor          56          50.0           7.1
  Full professor              106          55.2          16.7

Years could mentor
  0-5                         101          40.8          12.2
  5-10                         86          47.0          12.0
  11-15                        50          68.0          28.0
  16-25                        48          45.8          14.6
  Over 25                      52          53.8          17.3

# of UG mentored
  0-5                          99          30.4          15.9
  5-10                         72          68.5          24.7
  11-15                        47          46.8          12.8
  16-25                        44          54.5           6.8
  Over 25                      72          52.8          12.5

# of HS mentored
  0-5                         261          49.2          11.7
  5-10                         36          47.2          25.0
  11-15                        15          66.7          40.0
  16-25                         8          75.0          50.0
  Over 25                       8          25.0          12.5

Gender of mentor
  Male                        219          46.5          18.1
  Female                      107          53.8          13.1
[Figure 1 plot. Vertical axis: % of respondents (0-70). Horizontal axes: Undergraduates mentored (0-5, 6-10, 11-15, 16-25, Over 25) and Length of mentoring career (yrs) (0-5, 6-10, 11-15, 16-25, Over 25).]
Figure 1. Percentages of plant biologists who mentored various numbers of undergraduates in different “length of their mentoring career” categories. For example, of the plant biologists who were in a position to mentor for 0-5 years, over 60% mentored 0-5 undergraduates.
[Figure 2 plot. Vertical axis: Total undergraduates mentored (0-40). Horizontal axis: Grad students, Post-docs, Assistant professors, Associate professors, Full professors. Series: Land-grant univ., Other research univ., PUIs.]
Figure 2. Total number of undergraduates mentored by plant biologists of different academic ranks at land-grant universities, other research universities, and primarily undergraduate institutions (PUIs).
[Figure 3 plot. Vertical axis: % perceiving institutional incentives (0-90). Horizontal axis: Grad students, Post-docs, Assistant professors, Associate professors, Full professors. Series: Land-grant univ., Other research univ., PUIs.]
Figure 3. Percentages of plant biologists of different academic rank at land-grant universities, other research universities, and primarily undergraduate institutions (PUIs) who perceive institutional incentives for mentoring undergraduate researchers.
Evaluation of Teaching and Research Experiences Undertaken by Botany Majors at N.C. State University
Jeffrey S. Coker and C. Gerald Van Dyke Department of Botany, N.C. State University, Raleigh, North Carolina 27695
Abstract
Many science departments require undergraduate students to complete either a teaching or research experience. We have developed a survey instrument to measure outcomes of student teaching and research experiences from the student perspective. Our results in the Botany Department at N.C. State University show that those doing research are involved mainly in data collection and analysis, whereas those who are teaching are mainly involved with hands-on laboratory instruction. Nearly all students rated their experiences as very good overall and would recommend them to other students. Several positive educational outcomes were rated especially high, including a greater appreciation for teaching/research, greater initiative towards pursuing a career, an increase in skills, and greater consideration for attending graduate school. Students found that the experiences were effective at building 5 “leadership skills” which included team-work, problem-solving, getting along with others, analytical skills, and time-management, and somewhat effective at developing 4 others which included writing, speaking, work ethic, and integrity. Students rated academic-related outcomes relatively low overall, suggesting that motivation to make better grades or to take different courses changed little as a result of research or teaching experiences.
Introduction
Experiential learning in the forms of teaching and research can be extremely rewarding for undergraduate students. These experiences allow students to put classroom knowledge into practice and explore potential career paths. Teaching and research settings frequently present rich opportunities to build leadership skills such as team-work, problem-solving, getting along with others, analytical skills, time-management, writing, speaking, work ethic, and integrity. Perhaps most importantly, both teaching and research pose significant, open-ended challenges to students that provide opportunities for high achievement and excellence.
An increased emphasis has been placed on experiential learning in recent years, resulting in a greater need for assessment. Funding agencies such as NASA and the NSF are expanding student research opportunities (Service, 2002) and considering “integration of research and education” as 1 of 4 criteria to review scientific research grants (NSF, 2003). Furthermore, a number of national organizations have recommended the expansion and improvement of efforts to include undergraduates in college/university research (NSF, 1996; Boyer Commission, 1998; Howard Hughes Medical Institute, 2002). Similarly, the concept of student-assisted teaching has been strongly advocated (Miller et al., 2001), and it is known that most laboratory instruction at U.S. universities is done by teaching assistants (Sundberg and Armstrong, 1993).
Recently, there have been surveys of student researchers in chemistry and biology (Mabrouk and Peters, 2000), psychology (Landrum and Nelsen, 2002), and medicine (Solomon et al., 2003), as well as a national survey of mentors in plant biology (Coker and Davies, 2002) and an institutional survey of liberal arts colleges (Research Corporation, 2001). Previous authors have also described student research projects in particular courses (Chaplin et al., 1998; McLean, 1999; Henderson and Buising, 2000). We are unaware of any recent survey of undergraduate teaching assistants in the sciences which sought to determine educational outcomes. Nevertheless, the role of graduate teaching assistants in the sciences has been examined (Druger, 1997; Sundberg et al., 2000), and surveys of teaching assistants have been performed in communications (Socha, 1998) and sociology (Fingerson and Culley, 2001).
Many science departments nationwide require that students complete an out-of-classroom experience in order to graduate. Undergraduates majoring in Botany at N.C. State University are required to complete either a teaching or research experience as part of the required departmental curriculum. Such experiences include (but are not limited to) laboratory teaching assignments in botany or biology courses, faculty-supervised research, and off-campus internships. We have developed a survey instrument to measure outcomes of teaching and research experiences in the Botany Department at N.C. State. The results were used to determine what students did during research/teaching experiences, the educational outcomes, and the overall success of the requirement. They will also be used to improve experiences and better advise students on which experiences to pursue in the future.
Methods
The survey instrument developed for assessing teaching and research experiences consisted of 60 multiple-choice items and 16 open-response items. For those who may be interested in administering similar surveys at their institutions, we have posted this survey at www.cals.ncsu.edu/botany/faculty/gvandyke/undergraduatesurvey.html.
Botany majors at N.C. State University were asked individually to complete the survey after they had finished a research or teaching experience. Most students took about 15 minutes to complete the survey. A total of 25 surveys were completed between the fall of 2002 and the spring of 2004, covering student experiences over a 3-year period (2001-2004). These respondents constitute most of the students who graduated from the Botany Department over this period.
Results and discussion
Overview of the students
The 25 students who completed surveys were Botany majors with an average GPA of 3.5 (ranging from 2.1 to 4.0). Students major in Botany at N.C. State University for many different reasons. The Botany curriculum is structured to allow students to customize their program to fit career objectives. Student interests include space biology, ethnobotany, pharmaceutical aspects of medicinal plants, plant identification (wetlands, rare and endangered plants, forest plants, grasses, etc.), plant ecology, plant systematics, plant pathology, plant physiology, molecular botany and many others. Some majors may even pursue careers in scientific writing.
Overview of research/teaching experiences
Of the 25 students in this survey, 23 had 1 teaching/research experience and 2 had multiple experiences. Nineteen students performed research, 6 taught, and 2 had an experiential internship. Teaching experiences typically involved teaching assistant duties in Introductory Botany laboratories at N.C. State University. Research was performed in a broad array of settings such as the following: research labs on campus, Syngenta, BASF, Baylor College of Medicine, Reynolda Gardens at Wake Forest University, the U.S. National Arboretum, national forests, and the U.S. Department of Agriculture.
Typical teaching experiences occupied 7-10 hours per week for 1-2 semesters, and ranged from 3-6 hours per week for 1 semester to 10 hours per week for 3 semesters. Typical research experiences during a school year occupied 10-20 hours per week for 2 semesters, whereas typical summer research experiences were 40 hours per week for the entire summer (9-12 weeks). The extent of research experiences ranged from 8-10 hours per week for 1 semester to 10 hours per week for 6 semesters (including summer work).
Levels of involvement in specific activities
Figures 1 and 2 show the levels of involvement of students in teaching and research-specific activities, respectively. The most prevalent teaching activities were related to hands-on laboratory instruction, including set-up (3.2), brief presentations (3.8), and other routine tasks (Fig. 1). Students were somewhat involved in other educational activities such as writing objectives (1.8), developing course material (1.8), writing exams (1.8), and grading exams (2.8). Few to none were involved in traditional professorial duties such as giving full-length lectures (1.2) or performing teaching research (1.0).
Students who participated in research reported being most involved in the attainment of data, including performing experiments, collecting data, and then analyzing data (Fig. 2). Moving from top to bottom along the y-axis of Figure 2 represents a typical progression of activities in a professional research setting. Students reported being somewhat involved in early research stages such as generating hypotheses (2.3) and designing experiments (2.5), and also in late stages such as interpreting results (2.5) and making conclusions (2.2). The lowest-ranking categories were more advanced activities that demand a greater time commitment, especially involvement in the grant process (1.3) and presentation of research (1.1-1.8). Nevertheless, there was at least some student involvement in all stages of research (Fig. 2).
Effects on leadership skills
General questions asked students about the effectiveness of their teaching/research experiences in helping them to “increase skills,” to “develop leadership skills,” and to “show them the need for developing leadership skills.” Students rated these at 4.2, 4.0, and 4.2, respectively, demonstrating that teaching/research experiences were effective to very effective, in general, at building leadership skills (Fig. 3). In further support of this, student comments regarding skills/rewards gained through a teaching/research experience included many references to leadership skills. Among these were public speaking, time-management, self-organization, working with others, “asking for help,” “experiencing the dynamics of working with other members of the lab on a project,” and “thinking of different ways to accomplish a goal.”
The survey also contained questions which asked students to rate the effectiveness of their teaching/research experiences in developing 9 particular leadership skills. Students rated them as follows: teamwork (4.0), getting along with others (4.0), problem-solving (3.9), time-management (3.9), analytical skills (3.8), speaking (3.4), writing (2.9), integrity (2.9), and work ethic (2.9). Therefore, students felt that teaching/research experiences were somewhat effective to effective in developing all 9 leadership skills. The fact that none of the ratings for particular skills were quite as high as ratings for skills, in general, is probably related to students having many different types of experiences which enhanced different sets of skills. In other words, all experiences developed leadership skills, but each developed a different combination of them.
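One way to read these mean ratings against the 5-point effectiveness scale is to assign each mean to its nearest scale point. The Python sketch below does this for the 9 skill ratings reported above. The nearest-point rule and the function names are our illustrative assumptions, not a procedure described in the survey. Under that assumption, the grouping matches the split into 5 "effective" and 4 "somewhat effective" skills given in the Abstract.

```python
# Effectiveness scale as defined in the Figure 3 caption.
SCALE = {
    1: "not applicable",
    2: "not effective",
    3: "somewhat effective",
    4: "effective",
    5: "very effective",
}

# Mean ratings for the 9 leadership skills, copied from the text.
SKILL_MEANS = {
    "teamwork": 4.0,
    "getting along with others": 4.0,
    "problem-solving": 3.9,
    "time-management": 3.9,
    "analytical skills": 3.8,
    "speaking": 3.4,
    "writing": 2.9,
    "integrity": 2.9,
    "work ethic": 2.9,
}


def nearest_label(mean_rating):
    """Return the label of the scale point closest to a mean rating.

    Assumption for illustration only: exact ties (x.5) fall to the
    lower scale point, because min() keeps the first minimum it sees.
    """
    point = min(SCALE, key=lambda p: abs(p - mean_rating))
    return SCALE[point]


def group_by_label(means):
    """Group skill names by the nearest scale label, keeping input order."""
    groups = {}
    for skill, rating in means.items():
        groups.setdefault(nearest_label(rating), []).append(skill)
    return groups


if __name__ == "__main__":
    for label, skills in group_by_label(SKILL_MEANS).items():
        print(f"{label}: {', '.join(skills)}")
```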
With regard to the lowest-ranking leadership categories, most students felt that they already had integrity and work ethic and so any effects of research/teaching on developing them were minimal. The next two lowest categories, speaking and writing, were pulled down by the ratings of students with research experiences. Survey results are consistent in that activities that research students said they were less involved in (grant writing and presenting research) match the skills that they said were less developed by their experience (speaking and writing). This suggests to us that research experiences could be improved by putting more emphasis on speaking and writing, which equates to fostering environments where students will present their work.
Effects on academics and broader education
It seems that immediate effects of teaching and research experiences on undergraduate academics were minimal (Fig. 3). Most students found that experiences were either not effective or only somewhat effective at causing them to take different courses (2.5), motivating them to take more difficult courses (2.8), or motivating them to increase their GPA (2.9).
Nevertheless, student comments suggest that their experiences had a large impact on their education in a broader sense. For example, one student wrote, “My research experience on campus has really made my education MUCH more well-rounded. I understand the things we are taught in class because I have done them. And what I learn in class supplements my understanding of techniques.” Most students also reported that their experiences were effective (4.0) at helping them learn more about botany. Taken together, these data suggest that teaching/research experiences were highly educational even though they had little effect on undergraduate perceptions of academics.
Although teaching and research did not often cause students to change their undergraduate courses or improve grades, their experiences were effective (4.0) at causing them to consider further studies such as graduate school (Fig. 3). This is ironic since academic achievement is necessary to get into graduate school. The trend of impacting future academic plans while having little impact on current academics may be related to most students having their teaching/research experiences as upperclassmen, and also to their GPAs already being high (average 3.5). It is unclear how teaching/research experiences might affect the academic performance of underclassmen and/or a more random sample of the student population, where academics may have more room for improvement.
Effects on career goals
Students rated teaching/research experiences as somewhat effective (3.0) at “changing” their career goals (Fig. 3), usually because they had already established goals. Student comments frequently referred to experiences “reinforcing”, “refining”, and “encouraging” with regard to their future careers, suggesting that their goals were being positively affected although not changed.
Also, it seems that student attitudes towards pursuing a career were significantly affected (Fig. 3). Students found that experiences were effective at helping them to develop more initiative towards pursuing a career (4.3) and at helping them to be more flexible in their outlook on career possibilities (4.1). Interestingly, the more general effects on initiative were rated more highly than effects of motivating students specifically toward a career in teaching (3.8) or research (3.9).
Teaching/research experiences are also potentially valuable for showing students what they will not be happy with as a career. This was an outcome for two students, one who would prefer to avoid research and another who is less likely to get a job in industry. Nevertheless, students on average found that their experiences were not effective at “helping them to determine that they did not” want to teach (2.0) or do research (2.1). In fact, these were the lowest-ranking categories on the effectiveness scale (Fig. 3). Although discovering what one does not like is a valid educational outcome, we view these scores as a further indication that teaching and research experiences are having a positive influence on students.
Summary
For college/university departments with teaching, research, and/or internship requirements, assessment can be very useful for improving experiences and better advising students. In the Botany Department at N.C. State University, we found that those doing research are involved mainly in data collection and analysis, whereas those who are teaching are mainly involved with hands-on laboratory instruction. Nearly all students rated their experiences as very good overall and would recommend them to other students. Several positive educational outcomes were rated especially high, including a greater appreciation for teaching/research, greater initiative towards pursuing a career, an increase in skills, and greater consideration of graduate school. Students also found that the experiences were effective at building a range of “leadership skills”, but rated academic-related outcomes relatively low. Our results have been used to determine what students did during research/teaching experiences, the educational outcomes, and the overall success of the requirement. This study has given us knowledge of how to improve particular experiences and better advise students on which experiences to pursue in the future. Because every department (and every student) is different, we anticipate that much of the value of this study lies in the actual survey instrument and strategy for analysis. Therefore, we invite others to adapt this assessment strategy in their own departments.
Acknowledgements
We thank Drs. Gary Moore and Jim Flowers (Dept. of Agricultural and Extension Education at N.C. State) for their valuable feedback on an early draft of the survey, Dr. Arnold Oltmans (Dept. of Agricultural and Resource Economics at N.C. State) for commenting on the manuscript, Sophia Clotho for her advice, and Botany undergraduates for completing surveys.
Literature cited
Boyer Commission on Educating Undergraduates in a Research University. 1998. Reinventing undergraduate education: A blueprint for America’s research universities.
Chaplin, S.B., J.M. Manske, and J.L. Cruise. 1998. Introducing freshmen to investigative research – A course for biology majors at Minnesota’s University of St. Thomas. Jour. Coll. Sci. Teach. 27: 347-350.
Coker, J.S. and E. Davies. 2002. Involvement of plant biologists in undergraduate and high school student research. Jour. Nat. Resour. Life Sci. Educ. 31: 44-47.
Druger, M. 1997. Preparing the next generation of college science teachers. Jour. Coll. Sci. Teach. 26: 424-427.
Fingerson, L. and A.B. Culley. 2001. Collaborators in teaching and learning: Undergraduate teaching assistants in the classroom. Teaching Sociology 29: 299-315.
Henderson, L. and C. Buising. 2000. A research-based molecular biology laboratory. Jour. Coll. Sci. Teach. 30: 322-327.
Howard Hughes Medical Institute. 2002. Undergraduate science education at research universities.
Landrum, E.R. and L.R. Nelsen. 2002. The undergraduate research assistantship: an analysis of the benefits. Teaching of Psychology 29: 15-19.
Mabrouk, P.A. and K. Peters. 2000. Student perspectives on undergraduate research experiences in chemistry and biology. CUR Quarterly, Sept.: 25-33.
McLean, R.J.C. 1999. Original research projects – A major component of an undergraduate microbiology course. Jour. Coll. Sci. Teach. 29: 38-40.
Miller, J.E., J.E. Groccia, and M.S. Miller (Eds.). 2001. Student-assisted teaching: A guide to faculty-student teamwork. Anker Publ. Co.: Bolton, MA.
National Science Foundation. 1996. Shaping the future: New expectations for undergraduate education in science, mathematics, engineering, and technology.
National Science Foundation. 2003. Grant proposal guide (NSF 03-041).
Research Corporation. 2001. Academic Excellence: The Sourcebook.
Service, R.F. 2002. New lure for young talent: extreme research. Science 297: 1633-1634.
Socha, T.J. 1998. Developing an undergraduate teaching assistant program in communication: Values, curriculum, and preliminary assessment. Jour. Assoc. for Communication Admin. 27: 77-83.
Solomon, S.S., S.C. Tom, J. Pichert, D. Wasserman, and A.C. Powers. 2003. Impact of medical student research in the development of physician-scientists. Jour. Investig. Med. 51: 149-156.
Sundberg, M.D. and J.E. Armstrong. 1993. The status of laboratory instruction for introductory biology in U.S. universities. Amer. Biol. Teacher 55: 144-146.
Sundberg, M.D., J.E. Armstrong, M.L. Dini, and E.W. Wischusen. 2000. Some practical tips for instituting investigative biology laboratories. Jour. Coll. Sci. Teach. 29: 353-359.
[Figure 1 plot. Horizontal axis: Level of involvement (1-4). Vertical axis: Teaching activity, with the following categories: Had training in teaching techniques; Wrote objectives; Helped develop a lecture or lab; Set up a lab; Gave brief presentations in lab; Gave full-length lecture(s); Wrote exams; Graded exams; Presented course material on the internet; Routine tasks; Collected teaching research data; Analyzed teaching research data.]
Figure 1. Average levels of student involvement in typical teaching-related activities, based on the following scale: 1 – not involved, 2 – somewhat involved, 3 – involved, 4 – very involved. Error bars represent standard error.
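For readers adapting the survey, the mean and standard error plotted for each item can be computed as in the following Python sketch. The ratings shown are hypothetical values on the 4-point involvement scale, not data from this study. We also assume here that the standard error is the sample standard deviation divided by the square root of the number of responses, which the source does not spell out.

```python
import math
import statistics


def mean_and_standard_error(ratings):
    """Return (mean, standard error of the mean) for a list of ratings.

    Standard error is taken as the sample standard deviation (n - 1
    denominator) divided by sqrt(n); this definition is an assumption.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("at least two ratings are needed for a standard error")
    mean = statistics.mean(ratings)
    standard_error = statistics.stdev(ratings) / math.sqrt(n)
    return mean, standard_error


# Hypothetical responses from six students to one survey item
# (1 = not involved ... 4 = very involved). Not real survey data.
example_ratings = [3, 4, 4, 2, 3, 4]

if __name__ == "__main__":
    mean, se = mean_and_standard_error(example_ratings)
    print(f"mean = {mean:.2f}, standard error = {se:.2f}")
```

With small groups such as the 6 students who taught, the standard error can be large relative to differences between items, so item means are best compared with their error bars in view.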
[Figure 2 plot. Horizontal axis: Level of involvement (1-4). Vertical axis: Research activity, with the following categories from top to bottom: Made observations that led to a hypothesis; Formulated hypothesis based on observations; Designed experiments; Wrote grant proposal; Performed experiments; Collected data; Analyzed data; Interpreted results of experiment; Made conclusions about results; Presented research orally; Presented research as a poster; Submitted a manuscript for publication; Presented research on the internet.]
Figure 2. Average levels of student involvement in typical research-related activities, based on the following scale: 1 – not involved, 2 – somewhat involved, 3 – involved, 4 – very involved. Error bars represent standard error.
[Figure 3 plot. Horizontal axis: Effectiveness scale (1-5). Vertical axis: “My research/teaching/other experience has”, with the following items: Given me a new appreciation for teaching; Given me a new appreciation for research; Shown me the need for teamwork skills; Helped me develop more initiative towards pursuing a career; Increased my skills; Shown me that I have a good work ethic; Shown me the need for developing leadership skills; Has caused to consider graduate school; Helped me to learn more about botany; Helped me develop teamwork skills; Helped me to determine that I would like a research career; Helped me develop leadership skills; Helped me to be more flexible in my outlook on getting along with others …; Helped me to be more disciplined with my time; Motivated me towards a career in research; Has enhanced my problem-solving skills; Helped me to be more flexible in my outlook on career possibilities; Has enhanced my analytical skills; Helped me to see that I am disciplined with my time; Motivated me towards a career in teaching; Helped me see that I am disciplined in being on time; Enhanced my speaking skills; Helped me to determine that I would like a teaching career; Has caused me to consider continuing in the same company; Changed my career goals; Motivated me to increase my GPA; Enhanced my writing skills; Helped me to see that I need to develop integrity; Shown me that I need to develop a better work ethic; Motivated me to take more difficult courses; Caused me to make course changes; Helped me to see that I do NOT want to do research; Helped me to determine I do NOT want to teach.]
Figure 3. Student perceptions of their research and/or teaching experience, based on the following effectiveness scale: 1 – not applicable, 2 – not effective, 3 – somewhat effective, 4 – effective, 5 – very effective. Error bars represent standard error.