PROTEIN-PROTEIN INTERACTION ASSAY IN PHYTOPHTHORA SOJAE USING YEAST TWO-HYBRID SYSTEM
Abasi Aikebaierjiang
A Dissertation
Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY
May 2020
Committee:
Vipaporn Phuntumart, Advisor
Pavel Anzenbacher Graduate Faculty Representative
Raymond Larsen
Paul Morris
Scott Rogers
© 2020
Abasi Aikebaierjiang
All Rights Reserved
ABSTRACT
Vipaporn Phuntumart, Advisor
An oomycete pathogen, Phytophthora sojae, is one of the most serious threats to soybean production worldwide. Transcription factors are crucial for the survival of all living organisms, including oomycetes, which makes them promising targets for control methods.
A potential transcription factor in P. sojae, Ps1365 (PHYSODRAFT_342624), was discovered via a yeast one-hybrid system (Rutter, 2012). In this research, I aimed to find the proteins that interact with Ps1365, with the hypothesis that these interacting proteins function together with Ps1365 to activate expression of other genes. Ps1365 is a small protein of 156 amino acids (17,424 Da). Genomic analysis indicated that Ps1365 is one of 22 paralogous proteins in P. sojae, but only four orthologs were noted in P. infestans. Ps1365 was used as bait in yeast two-hybrid (Y2H) analysis to screen a cDNA library of P. sojae mycelia. The assays showed that Ps1365 lacks autoactivation and toxicity in the yeast strain Saccharomyces cerevisiae Y2HGold. Following the Y2H, high-throughput sequencing was performed and revealed two prey sequences that are potential partners of Ps1365. These sequences were identified as PHYSODRAFT_291312 and PHYSODRAFT_356433.
Analysis of the deduced amino acid sequences of PHYSODRAFT_291312 and PHYSODRAFT_356433 showed that they are small globular proteins of 8,288 Da and 11,737.68 Da, respectively, and contain α-helices, the simplest form of a transcription factor. SignalP analysis showed that both PHYSODRAFT_291312 and PHYSODRAFT_356433 lack signal peptides, and neither contains a nuclear localization signal. These predictions partly support our hypothesis that they can freely pass through the nuclear membrane because of their small sizes to interact with Ps1365. Co-expression analysis using existing data from the Fungal and Oomycete Genomics Resource (FungiDB) as well as from the National Center for Biotechnology Information (NCBI) showed that Ps1365, PHYSODRAFT_291312 and PHYSODRAFT_356433 are expressed together during mycelial growth and during infection, further supporting the idea that Ps1365 and the two candidate prey proteins may function together to drive expression of downstream genes.
Altogether, these results support the hypothesis that the two candidate P. sojae proteins, PHYSODRAFT_291312 and PHYSODRAFT_356433, may interact with Ps1365 to regulate the expression of downstream genes.
This Dissertation is dedicated to my lovely parents, Abbas Abla and Maramnisa Mosayuf.
ACKNOWLEDGMENTS
First of all, I would like to express my deep appreciation to my dissertation advisor, Dr. Vipaporn Phuntumart, for the great role she has played, the expert guidance she has provided, and both the academic and technical support she has given during the process of initiating, conducting and finalizing my research. When I joined her research group, Dr. Phuntumart made the process of adjusting to her lab much easier by patiently demonstrating to me how to use the equipment and tools in the lab. During the process of conducting my research, Dr. Phuntumart provided me with constructive criticism, guidance on how to develop new scientific ideas, ways to troubleshoot lab protocols, and many research and writing tips. Without this very generous and selfless provision of her time, it would have been almost impossible for me to complete my research and dissertation. I can say that Dr. Phuntumart is one of the best and most conscientious advisors I have ever seen in my life.
At this point, I also express my special gratitude to the other members of my committee. While I was doing my research, my committee members provided me with new research ideas and constructive criticism, which steered my research work in a more fruitful direction. Their generous support also resulted in the successful completion of my preliminary examination, proposal, dissertation defense and final oral examination.
At the very beginning of this research, Dr. Paul Morris provided me with the first sample, the mycelium of Phytophthora sojae strain P6497. When my research encountered difficulties and bottlenecks, Dr. Morris provided me with expert direction and support. He also helped evaluate my analysis results and showed me the crucial points to consider. During the process of taking courses and doing research, Dr. Scott Rogers made me fully realize the importance of thinking about biological phenomena in the context of evolution, and this played a great role when I was working on my research. When I was in the process of constructing the phylogenetic tree of Ps1365, Dr. Rogers helped me evaluate my analysis and showed me essential points to consider and supplement. When I was writing my dissertation, Dr. Rogers took his precious time to help me review my dissertation despite his own busy schedule and heavy workload, gave me important suggestions for revision and crucial criticism, and showed me the correct way to express and organize ideas in academic writing. Dr. Raymond Larsen made me realize the importance of research methods by repeatedly emphasizing that we should not forget to learn about the research methods previous researchers utilized in light of their “flashy” results, and this learning strategy was a great help to me. In addition, Dr. Larsen gave me many valuable comments and criticisms that enabled me to orient my research strategies. Dr. Pavel Anzenbacher ensured the successful completion of my preliminary exam, proposal meeting, dissertation defense and final oral exam according to the rules of the Graduate College by closely collaborating with me as well as all the other members of the committee.
Here, I also would like to thank all the former and current members of the Phuntumart Lab. Eric Budge, Rebecca Cull and Angsana Keeratijarut showed me how to correctly use the lab equipment whenever I asked them. Dilshan Beligala, Shannon Miller, James Artman, Alexander Howard, Gayathri Beligala, Satyaki Ghosh, Maheshi Kukulekanagme and Kevin Rowlands provided me with their generous help whenever I needed assistance. I would say that without their help, it would not have been easy for me to successfully complete the research.
Moreover, I express my gratitude to Bowling Green State University and the United States Department of Agriculture - National Institute of Food and Agriculture (USDA-NIFA) Agriculture and Food Research Initiative Oomycete-Soybean CAP (award 2011-68004-30104) for funding my research. I also express my gratitude to the staff of both the Department of Biological Sciences and Bowling Green State University for providing me with a supportive learning environment. I also wish to thank the Graduate College for providing me with excellent academic service.
Finally, I would like to express my thankfulness to both of my parents, Abbas Abla and Maramnisa Mosayuf, for the material and emotional support they provided during the entire process of my study at BGSU. They gave me courage and confidence whenever my study encountered problems and hardships. Without their support, it would have been difficult to achieve today’s results.
TABLE OF CONTENTS
Page
CHAPTER 1. INTRODUCTION ...... 1
Introduction ...... 1
I Phytophthora sojae ...... 1
General Information ...... 1
Taxonomy ...... 2
Life Cycle ...... 5
II Gene Regulation ...... 6
III Gene Regulation in Oomycetes...... 8
Changes of Transcription Levels in Oomycetes ...... 8
Gene Silencing in Oomycetes ...... 9
Regulatory Elements in Oomycetes ...... 11
Transcription Factors in Oomycetes ...... 13
References ...... 17
CHAPTER 2. YEAST TWO-HYBRID ANALYSIS ...... 23
Introduction ...... 23
I Yeast Two-hybrid Assay (Y2H) ...... 23
II Phylogenetic Analysis ...... 25
Hypotheses and Aims ...... 26
Material and Methods ...... 27
I Biological Material ...... 27
II Bioinformatic Analysis of Ps1365 ...... 31
III Y2H Bait Construction ...... 35
IV Bait Autoactivation Assay ...... 45
V Bait Toxicity Assay ...... 50
VI Prey Library Construction ...... 50
VII Yeast Two-hybrid Screening ...... 54
VIII Prey Colony Insert-Checking ...... 55
IX Plasmid Rescue ...... 56
Results ...... 57
Aim 1) Analysis of the Bait Sequence, Ps1365 ...... 57
Bait Sequence Analysis ...... 57
Hydrophobicity Analysis ...... 61
Signal Peptide and Conserved Domain Prediction ...... 61
Secondary Structure Prediction ...... 63
Subcellular Localization Prediction ...... 64
In Silico Gene Expression Analysis ...... 66
Phylogenetic Analysis ...... 66
Aim 2) Cloning the Bait Sequence, Ps1365 into Yeast, Saccharomyces
cerevisiae Y2HGold ...... 69
Bait Vector Construction ...... 69
Analysis of Bait Construct...... 76
Bait Transformation and Autoactivation Assay ...... 78
Bait Toxicity Assay ...... 81
Aim 3) Identify Interactive Proteins of a Novel P. sojae Transcription Factor,
Ps1365 Using Yeast Two-hybrid Assay ...... 81
Prey Library Construction ...... 81
Y2H Screening ...... 86
Plasmid Rescue ...... 92
Discussion ...... 95
References ...... 97
CHAPTER 3. ANALYSIS OF THE POTENTIAL INTERACTOR PROTEINS ...... 103
Introduction ...... 103
I Bioinformatic Analysis of Interactor Proteins of Ps1365 ...... 103
Hypotheses and Aims ...... 104
Materials and Methods ...... 105
I Yeast Two-hybrid Assay ...... 105
II Bioinformatic Analysis ...... 106
Results ...... 110
Aim 1) Analysis of the Potential Interactive Sequences of Ps1365 Obtained
from Yeast Two-hybrid Assay ...... 110
Analysis of the Potential Prey Inserts ...... 110
Analysis of the Potential Prey Inserts by BLAST ...... 112
Validation of the Gene Models of the Prey Candidates ...... 112
Protein Features of PHYSODRAFT_291312 ...... 116
PHYSODRAFT_291312 against Protein Databases ...... 117
PHYSODRAFT_291312: Globular or Membrane ...... 117
Domain and Protein Family Prediction of PHYSODRAFT_291312 ..... 121
Secondary Structure of PHYSODRAFT_291312 ...... 121
Tertiary Structure and Functional Analysis of
PHYSODRAFT_291312 ...... 122
Signal Peptide and Subcellular Localization Prediction of
PHYSODRAFT_291312 ...... 126
Protein Features of PHYSODRAFT_356433 ...... 129
PHYSODRAFT_356433: Globular or Membrane ...... 129
Analysis of PHYSODRAFT_356433 against Protein Databases...... 133
Tertiary Structure and Functional Prediction of
PHYSODRAFT_356433 ...... 135
Signal Peptide and Subcellular Localization Analyses of
PHYSODRAFT_356433 ...... 139
Co-expression Analysis of Bait and Prey Protein Genes ...... 140
Discussion ...... 143
References ...... 148
APPENDIX A. CONSENSUS SEQUENCES OF THE FOUR PREY CONTIGS
OBTAINED FROM SEQUENCHER ...... 160
LIST OF FIGURES
Figure Page
1.1 Eukaryotic tree of life ...... 4
2.1 The two-hybrid principle ...... 24
2.2 Parameters used for constructing the phylogenetic tree of Ps1365, its 21
homologous proteins in P. sojae as well as some of their homologous counterparts
in other oomycetes ...... 33
2.3 pGBKT7 Vector picture ...... 37
2.4 Vectors used for the autoactivation assay ...... 49
2.5 Partial amino acid sequence of Ps1365 ...... 59
2.6 Full-length nucleotide (A) and deduced amino acid sequences (B) of Ps1365 ...... 60
2.7 The Kyte-Doolittle Hydropathy Plot analysis of Ps1365 ...... 61
2.8 SignalP analysis showed that Ps1365 did not contain any cleavage site indicating
a recognizable signal peptide ...... 62
2.9 Predicted secondary structure of Ps1365 by JPred 4, a secondary structure
predictor server ...... 64
2.10 Prediction of Ps1365 subcellular localization by CELLO ...... 65
2.11 In silico gene expression analysis of Ps1365 in three different developmental
stages of P. sojae, mycelium, cyst and infection...... 66
2.12 Phylogenetic divergences of the 22 proteins belonging to a novel protein family in
P. sojae, including Ps1365 ...... 68
2.13 Agarose gel electrophoresis analysis of Ps1365 gradient PCR products ...... 70
2.14 Agarose gel electrophoresis analysis of pGBKT7 (BD) vector after restriction
digestion with EcoRI and/or BamHI at 37°C for one hour ...... 72
2.15 Agarose gel electrophoresis analysis of colony PCR reactions of transformed
TOP10 E. coli cells containing the vector pGBKT7-Ps1365 ...... 74
2.16 Agarose gel electrophoresis of colony PCR of E. coli transformant colony #4
containing pGBKT7-Ps1365 ...... 75
2.17 Alignment of the Ps1365 gene sequence in plasmid pGBKT7-Ps1365 with the
Ps1365 database gene sequence ...... 78
2.18 Autoactivation assay of Ps1365-encoding gene in S. cerevisiae Y2HGold ...... 80
2.19 Toxicity assay of Ps1365-encoding gene in S. cerevisiae Y2HGold...... 81
2.20 Agarose gel electrophoresis of total RNA extracted from P. sojae mycelium ...... 83
2.21 Agarose gel electrophoresis of P. sojae mycelium first-strand cDNAs synthesized
using SMART III Oligo, CDSIII and CDSIII/6 primers ...... 84
2.22 Agarose gel electrophoresis of LD-PCR amplicons of P. sojae mycelium cDNAs
synthesized using 5’ PCR and 3’ PCR primers ...... 85
2.23 Agarose gel electrophoresis of bait colony PCR using Ps1365 gene-specific
forward and reverse primers ...... 87
2.24 Agarose gel electrophoresis of blue diploid colonies insert-check PCR using
Matchmaker Insert Check PCR Mix 2 ...... 89
2.25 Agarose gel electrophoresis of blue diploid colonies batch insert-check PCR
reactions using Matchmaker Insert Check PCR Mix 2 ...... 91
2.26 Agarose gel electrophoresis of plasmids extracted from E. coli after transformed
with prey plasmids ...... 92
2.27 Agarose gel electrophoresis of E. coli transformants colony PCR and direct
PCR of plasmids from some E. coli colonies using Matchmaker Insert Check
PCR Mix 2 ...... 94
3.1 Default parameters of Sequencher 5.4.5 used to align the prey sequences ...... 106
3.2 Alignment of the PHYSODRAFT_291312 (shown here as
PHYSODRAFT_291312-t26_1) gene model with RNA-Seq data ...... 113
3.3 DNA sequence of the predicted new gene model of PHYSODRAFT_291312
after aligning the original gene model from FungiDB with RNA-Seq data from
the same database ...... 114
3.4 Alignment of PHYSODRAFT_356433 gene model (shown here as
PHYSODRAFT_356433-t26_1) with RNA-Seq data ...... 116
3.5 Features of PHYSODRAFT_291312 protein obtained from FungiDB ...... 117
3.6 Prediction of potential transmembrane domains in PHYSODRAFT_291312 ...... 120
3.7 Secondary structure prediction of PHYSODRAFT_291312 ...... 122
3.8 The structural model of PHYSODRAFT_291312 predicted by Phyre2 web
server ...... 123
3.9 Tertiary structure of PHYSODRAFT_291312 predicted by RaptorX ...... 124
3.10 SignalP Signal Peptide Prediction of PHYSODRAFT_291312 ...... 125
3.11 Analysis of PHYSODRAFT_291312 by CoSiDe Combined Signal Peptide
Predictor predicted the best cleavage site at the 23rd amino acid residue ...... 127
3.12 Subcellular localization prediction of PHYSODRAFT_291312 using Phobius .... 128
3.13 Features of PHYSODRAFT_356433 protein obtained from FungiDB ...... 129
3.14 Prediction of potential transmembrane domains in PHYSODRAFT_356433 ...... 130
3.15 Secondary structure prediction of PHYSODRAFT_356433 ...... 134
3.16 Structure model of PHYSODRAFT_356433 by Phyre2 web server represented
in ribbon diagram ...... 136
3.17 Prediction of PHYSODRAFT_356433 tertiary structure by RaptorX ...... 137
3.18 SignalP Signal Peptide Prediction showed that PHYSODRAFT_356433 contains
no signal peptide because no cleavage site was observed from all the three scores 138
3.19 Prediction of subcellular localization of PHYSODRAFT_356433 by Phobius ..... 139
3.20 Transcription levels of three P. sojae protein genes: Ps1365
(PHYSODRAFT_342624), PHYSODRAFT_291312 and
PHYSODRAFT_356433 during the three developmental stages of P. sojae ...... 142
LIST OF TABLES
Table Page
2.1 Primers in this research ...... 38
3.1 NCBI BLAST analysis results of the contigs of P. sojae prey inserts ...... 111
CHAPTER 1. INTRODUCTION
Introduction
I Phytophthora sojae
General Information
Phytophthora sojae is one of the most important plant pathogens, causing substantial
losses in worldwide soybean production every year. Some wildflower species, for example,
lupins, have also been reported to be hosts for P. sojae (Tyler, 2007). Phytophthora
sojae belongs to Class Oomycota, a name that literally means “egg fungus”. Numerous
species belong to the Genus Phytophthora, and almost all of them are plant pathogens. For
example, P. infestans causes blight disease in potatoes (Erwin, Bartnicki-Garcia, & Tsao,
1983), and P. citrophthora and P. nicotianae cause fruit and root rot, as well as gummosis, on
citrus (Ahmed et al., 2012). Some Phytophthora species have very high host specificity; for
example, P. sojae infects soybean as its primary host (Tyler, 2007). Other members of
Phytophthora can infect a wide range of plants; P. nicotianae, for example, infects more than
72 genera of plants (Grote et al., 2002).
The Genus Phytophthora may encompass 200-600 species (Brasier, 2007), and more than 50 of them have been reported to severely affect plants, including economically important crops (Erwin et al., 1983). Examples include P. infestans, which caused the Irish Great Famine of 1845-49 (Erwin et al., 1983; Yoshida et al., 2013), and P. ramorum, which is responsible for sudden oak death (Tyler et al., 2006). Meanwhile, P. sojae has been causing substantial decreases in soybean production worldwide for decades. It has been reported to cause $1-2 billion of economic loss worldwide every year, including $200 million annually in the United States alone (Tyler, 2007).
Taxonomy
Oomycetes are in the protist group called Heterokonta (Stramenopiles), while fungi are within the Opisthokonta, which includes Animalia and Fungi. The morphology of oomycetes is similar to that of fungi. In addition, oomycetes and fungi are similar to each other in the following physiological and pathological aspects:
1) They are osmotrophs (obtain nutrients by direct absorption);
2) They are mainly filamentous (grow by extending hyphae);
3) They reproduce by forming spores;
4) Although not universal, both true fungi and oomycetes include parasitic members
(Money, 1998).
On the other hand, the differences between them can be categorized into the following aspects:
1) Cell wall chemistry: the cell walls of oomycetes are composed mainly of glucans
(β-1,3 and β-1,6) and cellulose (reviewed in Erwin et al., 1983). In contrast, the cell walls of true fungi consist primarily of chitin and/or chitosan (Judelson & Blanco, 2005; Lamour &
Kamoun, 2009; Tyler, 2007).
2) Hypha structure: oomycete hyphae have no septa and are coenocytic; among true fungi, many are unicellular, while the cells of multicellular hyphae are separated by septa, with each cell containing one or more nuclei (Judelson & Blanco, 2005; Lamour & Kamoun, 2009;
Tyler, 2007).
3) Motile asexual spore: motile asexual spores are very common in oomycetes and
usually are biflagellated zoospores; in true fungi, motile asexual spores are uncommon
(Judelson & Blanco, 2005; Lamour & Kamoun, 2009; Tyler, 2007).
4) Predominant asexual spore formation: in oomycetes, the predominant asexual spores are produced from a sporangium, which is usually not desiccated and consists of a single multinucleate cell; in true fungi, the predominant asexual spores usually form as desiccated conidia, which are either unicellular or multicellular, with each cell containing only a single haploid nucleus (Judelson & Blanco, 2005; Lamour & Kamoun, 2009; Tyler, 2007).
5) Sexual spores: oomycetes only use oospores as sexual spores, while true fungi have
various types of sexual spores. Sexual spores of oomycetes develop on hyphal termini; in
fungi, sexual spores form within enclosing structures and in large quantities (Judelson &
Blanco, 2005; Lamour & Kamoun, 2009; Tyler, 2007).
6) Ploidy of hyphae: oomycete hyphae are mostly diploid, but the hyphal cells of true
fungi are haploid, dikaryotic, or polynucleate (Judelson & Blanco, 2005; Lamour &
Kamoun, 2009; Tyler, 2007).
7) Biochemistry: oomycete spores use mycolaminarin and lipids as energy reserves,
while true fungal spores use glycogen and trehalose; toxic secondary metabolites are very
common in fungi but have not been reported in oomycetes; fungi use peptides as mating
hormones, but oomycetes may use lipids; and oomycetes usually lack pigments, whereas
pigments are very common in fungi (Judelson & Blanco, 2005; Lamour & Kamoun,
2009; Tyler, 2007).
Fungi and oomycetes also differ in their taxonomic positions.
Together with diatoms and brown algae, oomycetes belong to Stramenopiles (Tyler, 2007).
By contrast, fungi belong to Opisthokonta (an unranked taxonomic unit) (Lee, Ristaino, &
Heitman, 2012), close to animals (Judelson & Blanco, 2005; Lamour & Kamoun, 2009; Tyler,
2007). These differences show that oomycetes are not fungi, despite the similarities
in their morphology and lifestyle (Figure 1.1).
Figure 1.1. Eukaryotic tree of life. Oomycetes (red arrow) are phylogenetically distant from fungi (light blue arrow). This figure was adapted from Baldauf (2003) and Lee, Ristaino, and Heitman (2012).
One existing hypothesis about the origin of oomycetes holds that
oomycetes, as well as other stramenopiles such as brown algae, are the products of a symbiosis event between eukaryotes. According to this hypothesis, heterokonts originated when a heterotrophic eukaryotic ancestor engulfed a red alga (Rogers, 2012).
Life Cycle
Just like other oomycetes and many fungi, P. sojae uses hyphae to absorb nutrients
(i.e., osmotrophy) from its host (usually a soybean plant) or environment. Oomycetes appear to have gained osmotrophy via horizontal transfer of genes from fungi. Phytophthora species reproduce both sexually and asexually. For asexual reproduction, P. sojae hyphae produce sporangia. A sporangium has two possible fates: either it germinates directly to form hyphae, or it forms many zoospores inside. Zoospores are asexual spores with two flagella, and they can move freely within water or wet soil. Flooding or irrigation induces P. sojae to produce zoospores. The life span of zoospores is very short. Once they find the root of a soybean plant, they attach themselves to the root surface and form cysts. At this point, the zoospores lose their flagella and start developing hyphae, which penetrate the host root system to absorb nutrients.
In harsh environments, P. sojae produces chlamydospores instead of zoospores.
Chlamydospores have very thick cell walls and can withstand harsh environments (Tyler,
2007).
Phytophthora sojae utilizes oospores for sexual reproduction. Like other oomycetes, it has two sexual organs: antheridia (male reproductive organs) and oogonia (female reproductive organs). These are the sites of male and female meiosis, respectively, where haploid gametes are produced. During fertilization, the oogonium and antheridium fuse with each other, the haploid nucleus from the antheridium enters the oogonium, and fertilization occurs, followed by the production of many oospores. Oospores lack flagella but
have thick cell walls. Compared with zoospores, oospores have a longer life span. Under
favorable conditions, oospores can germinate to form functional hyphae (Tyler, 2007).
II Gene Regulation
Gene regulation is one of the most important processes in living organisms. In
eukaryotes, regulation of gene expression can occur at different levels, including
chromatin-level gene regulation, transcriptional regulation, post-transcriptional regulation,
translational regulation and post-translational regulation. Among them, transcriptional
regulation is the key process of overall gene expression, as transcription is the bridge between
genes and proteins. Transcription is regulated by the interactions between trans-factors
(transcription factors) and cis-elements (enhancer/silencer sequences on DNA). Transcription
is simpler in prokaryotes: in some cases RNA polymerase alone is sufficient for initiating
transcription, while in other cases a few activator proteins are needed. In eukaryotes, however,
transcriptional initiation is more complex and requires numerous transcription factors.
Eukaryotic transcription factors can be divided into two main categories:
1) General Transcription Factors (GTFs) - These transcription factors are essential for
initiating transcription: they bind to the TATA box in the promoter region of
protein-coding genes and form the basal transcription machinery with RNA polymerase II and mediator proteins. GTFs are common to the transcription of all class II genes in eukaryotes, and they are highly conserved. There are six GTFs: TFIIA, TFIIB, TFIID,
TFIIE, TFIIF, and TFIIH.
2) Specific Transcription Factors - These transcription factors (TFs) are specific for a single target gene or a group of target genes. They recognize and bind to specific DNA sequence motifs and create a complex that allows RNA polymerase to begin RNA synthesis. Since TFs control the rate of mRNA synthesis, they are crucial for the regulation of gene transcription in eukaryotes. According to their effects on transcription, they can be divided into activators and repressors. Activators activate the transcription of target genes by binding to DNA sequences called enhancers, while repressors inhibit the transcription of target genes by binding to enhancers or silencers, or by directly interacting with activators to suppress their functions (Krebs, Goldstein, & Kilpatrick, 2011).
Although very diverse, transcription factors share one feature: they all have DNA-binding domains (DBDs). Some transcription factors also have trans-activating domains (TADs). DBDs bind to promoter or enhancer sites on DNA, while TADs bind the proteins needed to initiate transcription. Transcription factors can be divided into several main families based on the structure and shape of their DNA-binding domains:
(1) Leucine Zippers: every seventh amino acid residue of TFs in this group is a leucine.
Two such polypeptides are held together by hydrophobic interactions between the leucine residues of the two chains, which interlock like a zipper, so leucine zippers always appear as dimers (Krebs et al., 2011).
(2) Helix-turn-helix (HTH): TFs in this group are composed of two helices. One lies in a major groove of the DNA double helix, and the other also associates with the major groove; the two are connected by a section of the protein that allows them to shift relative to one another
(i.e., the turn) (Krebs et al., 2011).
(3) Zinc-finger: this domain is composed of a zinc-binding site and a loop of 23 amino acid residues (Krebs et al., 2011).
(4) Basic Helix-loop-helix (bHLH): this domain is composed of two regions, an
N-terminal basic region and a C-terminal helix-loop-helix (Peng, Shan, Kuang, Lu, & Chen,
2013). Each helix of this structure is amphipathic, i.e., one side is hydrophobic while the other
side is hydrophilic. A connecting loop of 12 to 28 amino acids lies between the two α-helices. TFs with this domain usually form homodimers or heterodimers, and the basic regions of these domains are responsible for binding to DNA
(Krebs et al., 2011). The basic region recognizes and binds to an E-box (5'-CANNTG-3') of the target gene (Peng et al., 2013).
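Two of the sequence patterns named in the list above lend themselves to a simple computational check. The following Python sketch is illustrative only and is not part of the methods used in this research. It scans a DNA string for E-box sites (5'-CANNTG-3') and a protein string for leucines spaced seven residues apart. The function names, the default of four heptad repeats, and the toy sequences are assumptions made for illustration.

```python
import re

# E-box consensus 5'-CANNTG-3', where N is any nucleotide.
# The lookahead group lets overlapping sites be reported as well.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(dna):
    """Return (0-based start, matched site) for every E-box in dna."""
    seq = dna.upper()
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq)]


def heptad_leucine_run(protein, min_repeats=4):
    """Return the 0-based start of the first run of leucines spaced
    seven residues apart, or None if no such run exists.

    min_repeats is an illustrative threshold, not a published cutoff.
    """
    seq = protein.upper()
    for start in range(len(seq)):
        if all(
            start + 7 * k < len(seq) and seq[start + 7 * k] == "L"
            for k in range(min_repeats)
        ):
            return start
    return None


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from P. sojae.
    print(find_eboxes("ttCACGTGaaCAGCTGcc"))
    print(heptad_leucine_run("MK" + "LAAAAAA" * 4))
```

With the toy DNA string, the scan reports E-box sites at positions 2 (CACGTG) and 10 (CAGCTG). Because CANNTG is its own reverse complement, scanning one strand is sufficient to find sites on both.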
III Gene Regulation in Oomycetes
Although the genome sequence of P. sojae has been completed and published (Tyler et al., 2006), oomycete gene regulation remains understudied (Seidl, Wang, Ackerveken, Govers, & Snel, 2012).
Knowledge of the promoter motifs and transcription factors associated with development is essential to fully understand an organism (Xiang, Kim, Roy, & Judelson, 2009).
Changes of Transcription Levels in Oomycetes
Genes have different expression levels at different stages of a life cycle and under
different living conditions. cDNA macroarray analysis found that this is also true for
oomycetes (Kim & Judelson, 2003). Through this analysis, Kim & Judelson (2003) identified
a set of genes whose transcript levels change during the formation of
sporangia, hyphae and cysts. They also found that some of those genes were activated during
starvation or in a non-sporulating mutant strain. They predicted that the products of those
genes include possible regulators, enzymes, and transporters; the regulators include
TFs, while the enzymes were mainly dehydrogenases. The same study showed that
dehydrogenase-encoding genes were mainly expressed during sporulation and stress
responses. A notable finding of their study is a creatine kinase-like enzyme
in oomycetes, because creatine kinase had previously been found only in animals and
trypanosomes (Kim & Judelson, 2003).
Wang et al. (2009) studied the genes involved in asexual sporulation by developing
mutant P. sojae strains (using UV irradiation) that cannot produce oospores. They then used
suppression subtractive hybridization to compare gene expression in the mutant strain and
the normal strain during asexual sporogenesis. Their results indicated that 39
putative genes were expressed at high levels in the normal strain, and they predicted that
the products of those genes function in processes such as metabolism, cell cycle control, signaling, cell defense, protein biosynthesis, and regulation of transcription. They also reported that the following six proteins were related to the formation of oospores: developmental protein
DG1037, glycoside hydrolase, hypothetical protein UB145, FAD-dependent pyridine nucleotide-disulphide oxidoreductase, phosphatidylinositol-4-phosphate 5-kinase, and a sugar transporter (Wang et al., 2009).
Gene Silencing in Oomycetes
Gene silencing methods are valuable for understanding the
mechanisms of transcriptional regulation. Although several gene-silencing methods have been developed for
oomycetes, they differ in efficiency
(Ah-Fong, Bormann-Chung, & Judelson, 2008). Gene silencing takes place at both the
transcriptional and post-transcriptional levels (Ah-Fong et al., 2008). Transcriptional
silencing can be achieved by transferring a copy of the target gene into the target cell or organism, which may result in changes, such as chromatin remodeling, that affect
transcription. In contrast, post-transcriptional silencing is achieved by introducing a
double-stranded RNA homologous to the target gene into the target cell so that the transcripts
of the target gene are degraded by the RNA-induced silencing complex (RISC) (Ah-Fong
et al., 2008). However, recent studies showed that there is no distinct boundary between
transcriptional and post-transcriptional silencing, and double-stranded RNAs used for
post-transcriptional silencing can also trigger transcriptional silencing (Ah-Fong et al., 2008).
Gene silencing has been reported in some oomycete species. For example, van West,
Kamoun, van ’t Klooster, and Govers (1999) silenced the P. infestans inf1 gene by introducing inf1 sense, antisense and promoter-less sequences into P. infestans. Ah-Fong et al. (2008) transferred sense, antisense, and hairpin sequences into P. infestans, analyzed their efficiency in silencing the inf1 elicitin gene, and reported that a hairpin sequence was most effective in silencing
the target gene. They also detected small RNAs that were 21 nucleotides in length and
homologous to inf1 in the partially silenced P. infestans strains, which they suggested
might be a manifestation of RNAi-like silencing of the target gene.
RNAi has three pathways: miRNA, siRNA and piRNA (Wilson & Doudna, 2013).
miRNAs and siRNAs silence the expression of the target gene by interfering with the target
mRNA through RISC complexes (in cytoplasm) (Wilson & Doudna, 2013); in addition,
siRNAs inhibit the transcription of the target gene through RITS (RNA-induced initiation of
transcriptional silencing) complexes (inside nucleus) (Deng et al., 2015; Verdel et al., 2004).
In addition, it was found that silencing signals can be transmitted between the two nuclei in a
heterokaryon (in oomycetes) (van West et al., 1999). Artificial gene silencing also facilitates
the understanding of the role of RNA helicase in oomycetes. Walker & van West (2007)
silenced an RNA helicase-encoding gene using RNA interference, and showed that
RNA-helicase is essential for the normal formation of zoospores.
Regulatory Elements in Oomycetes
Specific DNA sequences play important roles in transcriptional regulation, and these
sequences include promoters, terminators and their specific motifs. The length of intergenic
regions in oomycetes is usually less than 500 bp, and this may imply that oomycete
promoters are not as complex as plant and metazoan promoters (Xiang et al., 2009). Judelson
et al. (1992), using transient assays with β-glucuronidase as a reporter, observed the activities of promoter and terminator sequences in three oomycete species (P. infestans, P. megasperma
f. sp. glycinea, and Achlya ambisexualis). Their results indicated that there was a vast
difference between oomycete and higher fungi transcriptional machineries. Roy, Poidevin,
Jiang, and Judelson (2013) analyzed the core promoters in oomycetes and showed that the promoter regions of some oomycetes have initiator-like sequences that are
16-19 nt long and flanked by FPR (flanking promoter region) sequences. Using expectation maximization, they found sequences resembling INR (initiator), FPR, and a novel regulatory sequence called DPEP (Downstream Promoter Element Peronosporales), but no TATA box was found (Roy et al., 2013). They also conducted mutagenesis analysis and showed that
DPEP was a core motif. Their genome-wide search also indicated that only a small portion of
P. infestans genes have INRs and/or FPRs, and that promoters without INR/FPR motifs possess pyrimidine-rich sequences in the regions close to transcription start sites (Roy et al.,
2013). Their findings indicated the following correlations between the distribution of
INR/FPR motifs and the target gene functions:
1) Genes with combined INR+FPR motifs were expressed at higher levels and were related to infection and development;
2) Genes with DPEP and FPR motifs were expressed constitutively (Roy et al.,
2013).
The correlation between these motifs and development/infection indicates that oomycete promoter motifs not only participate in the initiation of general transcription, but also in the regulation of the expression of life-cycle-related genes (Roy et al., 2013). INR,
FPR, and combined INR+FPR motifs are also present in other oomycete species, such as P. sojae, Pythium ultimum, Hyaloperonospora arabidopsidis, and Saprolegnia parasitica, while
DPEP exists in all studied oomycetes, except S. parasitica (Roy et al., 2013). They estimated that stramenopiles other than oomycetes have INR motifs, but not FPR and DPEP motifs. The difference in core elements between oomycetes and animals/fungi/plants may explain why animal/fungal/plant core promoters do not work well in oomycetes. In addition, TATA-like motifs were found in some oomycete genes, but there was no conclusive evidence about their function (Roy et al., 2013).
Seidl et al. (2012) predicted 19 conserved DNA motifs from the genome sequences of
three Phytophthora spp. (P. infestans, P. sojae, and P. ramorum) and in planta gene
expression data. Some of these DNA motifs were found on the regions upstream of
transporter/RXLR (a protein motif composed of four amino acid residues: Arg-X-Leu-Arg)
(Li, 2010) effector/transcriptional regulator-encoding genes. Specific elements near the
INR or INR/FPR sequences of the sporulation-related genes Cdc14 and Pks1 have been
reported to be essential for the normal expression of those genes. A very
specific motif called a cold-box, which is only 7 nt-long, has been shown to mediate the
temperature-related expression of the genes specific to zoospore formation (Seidl et al., 2012).
In their study, 19 conserved motifs were predicted by comparing the upstream regions of
co-expressed genes in P. infestans with the upstream regions of their orthologs in two other
Phytophthora species (P. sojae and P. ramorum). They also predicted that some of those
motifs are upstream of effector or transcription regulator genes (Seidl et al., 2012).
Xiang et al. (2009) analyzed Pks1, a protein kinase in P. infestans using a Pks1
promoter fused with the β-glucuronidase reporter gene, and showed that the expression of the
gene occurs during the intermediate stage of sporulation (Xiang et al., 2009). They also
reported that transcription start sites of the gene were located within Inr-like and T-rich
regions. They identified a CCGTTG sequence, located 110 nt upstream of the transcription start site, as a major regulator of sporulation-specific
transcription. The sequence was
also found in other sporulation-induced promoters (Xiang et al., 2009).
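As an illustration of how such a short motif can be located relative to a transcription start site (TSS), the following minimal sketch scans a promoter sequence on both strands. It is not the procedure used by Xiang et al. (2009); the function names, the toy sequence, the decision to search both strands, and the convention that the last base of the input is position -1 are all assumptions made for this example.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif(promoter: str, motif: str = "CCGTTG"):
    """List (position, strand) for every motif occurrence in a promoter.

    Assumption: `promoter` is the sequence immediately upstream of the TSS,
    written 5'->3', so its last base is position -1. The reported position
    is that of the leftmost base of the match.
    """
    promoter = promoter.upper()
    hits = []
    for strand, pattern in (("+", motif.upper()), ("-", reverse_complement(motif))):
        start = promoter.find(pattern)
        while start != -1:
            hits.append((start - len(promoter), strand))
            start = promoter.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical 120-nt promoter with the motif starting 110 nt upstream of the TSS.
toy_promoter = "A" * 10 + "CCGTTG" + "A" * 104
motif_hits = find_motif(toy_promoter)
```

With the toy sequence above, the only hit is on the forward strand at position -110. Real promoters would be retrieved from a genome database rather than constructed by hand.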
Transcription Factors in Oomycetes
Transcription factors are one of the key components of transcription regulation.
Alteration of TFs gives rise to morphological and physiological diversity of organisms
(Gamboa-Meléndez, Huerta, & Judelson, 2013). It has been estimated that Phytophthora spp.
genomes each contain approximately 700 transcription factors, and this number is similar to
that of fungi, such as Fusarium graminearum and Magnaporthe oryzae (Judelson, 2012).
Transcription factors can be divided into several families according to their domain structure and conformation, and previous studies have identified some important members of those families in oomycetes. Among them, basic leucine zipper (bZIP) family transcription factors regulate development and stress response in eukaryotes (Gamboa-Meléndez et al., 2013). bZIP transcription factors exist exclusively in eukaryotes and they control physiological processes of animals, plants, and fungi (Ye, Wang, Dong, Tyler, & Wang, 2013). To date, the bZIP transcription factors in oomycetes are still not well understood (Ye et al., 2013). The only bZIP whose function is known in oomycetes is Pibzp1 in P. infestans; it is associated with zoospore motility and plant infection, and it interacts with a protein kinase
(Ye et al., 2013).
Gamboa-Meléndez et al. (2013) computationally identified 38 bZIP-family transcription factors from P. infestans using PFAM, INTERPRO, BLASTP, and SMART. Half of the bZIP TFs have amino acids other than asparagine at the site corresponding to residue
235 of GCN4 in S. cerevisiae, while bZIP TFs in non-oomycete eukaryotes always have asparagine at the same site. Through interspecific comparisons they identified these amino
acid substitutions as specific for oomycetes (Gamboa-Meléndez et al., 2013). They also
observed that transcription levels of approximately two-thirds of P. infestans bZIP TFs changed dramatically in the life cycle of P. infestans, and the transcription of the majority of
those genes increased in zoospores, sporangia, or cysts formed by zoospores. Their findings
also indicated that the function of eight P. infestans bZIP TFs was to mediate defense of P.
infestans against peroxide damage (Gamboa-Meléndez et al., 2013). Additionally, Blanco &
Judelson (2005) showed that a member of the bZIP TF family, Pibzp1, is essential for the
swimming activity of zoospores and formation of appressoria in P. infestans. A study by Ye et al. (2013) predicted potential bZIPs in several oomycetes (including P. sojae), two diatoms,
and two fungi. They hypothesized that the expansions of novel bZIPs in oomycete genomes
are the products of gene duplication. They also found that bZIP transcription levels changed
mainly during the zoosporangium, zoospore, cyst and infection stages. Many putative
bZIPs also have novel DNA-binding domains (Ye et al., 2013). In addition, Pibzp1-silenced
strains (Pibzp1 is a bZIP transcription factor from P. infestans) have the following
characteristics:
1) High efficiency in cyst germination;
2) Low efficiency in pathogenicity because they cannot produce appressoria (Blanco
& Judelson, 2005; Walker & van West, 2007).
LIM is a zinc-binding protein motif which mediates protein-protein interactions
(Ravinder & Goyal, 2017). Tani, Kim, and Judelson (2005) identified seven genes potentially
encoding NIFs (nuclear LIM interactor-interacting factors) from P. infestans, and among
them, four were associated with sporulation, while three were associated with hyphae. The promoter fusion constructs of these genes with a β-glucuronidase reporter gene showed specific spatial and temporal activity. In their previous study, they identified two novel transcription regulators whose expression increased significantly during sporulation and zoosporogenesis, and both of them were similar to nuclear LIM interactor-interacting factors.
Another important group of TFs is the Myb family, whose members use Myb domains as DNA-binding domains; each Myb domain contains a helix-turn-helix (HTH) structure (Ambawat, Sharma, Yadav, & Yadav, 2013). In P. infestans, Xiang and Judelson (2014) found that transcription levels of some Myb genes fluctuated with the day-and-night cycle, and showed a general trend of increase following the number of sporangia formed. In contrast, the transcription of the Myb2R4-encoding gene does not follow this pattern. When overexpressed, Myb2R4 doubled the amount of sporulation and greatly increased the
transcription of Myb2R1. Using chromatin immunoprecipitation, they found that Myb2R4
binds to the promoter of Myb2R1. When they tried to silence all eight Myb genes using
DNA-directed RNA interference, only one of them, Myb2R3, was silenced, which resulted in reduced sporulation. They also found that seven of those genes had negative effects on vegetative growth when overexpressed, while Myb3R6 negatively affected the dormancy of sporangia. Their research showed that the effectiveness of silencing triggered by hairpin constructs was determined by their copy number, and that induced abnormal expression is interrupted by epigenetic silencing and excision of the transgene in P. infestans (Xiang &
Judelson, 2014).
Myb proteins are extremely diverse DNA-binding proteins and Myb TFs usually have two or three tandem arrays of Myb domains (Xiang & Judelson, 2010). In oomycetes, Myb
TF R2R3 proteins possess helices similar to c-Myb, while Myb TF R1R2R3 proteins have either c-Myb-like or novel sequences. They also found that the transcription levels of eight
R2R3 and R1R2R3 proteins increased at the sporulation stage and the expression of three
R2R3 and R1R2R3 proteins increased when zoospores were being released. The oomycete species Hyaloperonospora arabidopsidis, which has fewer R2R3 and R1R2R3 genes than
Phytophthora, cannot produce zoospores. Their work showed that the expression of most R2R3 and R1R2R3 genes is specific for germination or sporulation.
Zhang et al. (2012) studied the role of P. sojae Myb-family TFs in the functioning of the protein kinase PsSAK1. They sequenced the transcriptome of a PsSAK1-silenced P. sojae strain during the cyst stage and at 1.5 h after infection, and they found that the transcription levels of several Myb-family TFs were altered, including an R2R3 Myb TF, PsMYB1. Their results
showed that the transcription factor PsMYB1 works downstream of PsSAK1 and is essential
for the development of zoospores.
References
Ah-Fong, A. M. V., Bormann-Chung, C. A., & Judelson, H. S. (2008). Optimization of
transgene-mediated silencing in Phytophthora infestans and its association with
small-interfering RNAs. Fungal Genetics and Biology, 45(8), 1197-1205.
doi:10.1016/j.fgb.2008.05.009
Ahmed, Y., D'onghia, A. M., Ippolito, A., Shimy, H. E., Cirvilleri, G., & Yaseen, T.
(2012). Phytophthora nicotianae is the predominant Phytophthora species in citrus
nurseries in Egypt. Phytopathologia Mediterranea, 51(3), 519-527.
Ambawat, S., Sharma, P., Yadav, N., & Yadav, R. (2013). MYB transcription factor genes as
regulators for plant responses: An overview. Physiology and Molecular Biology of
Plants, 19(3), 307-321. doi:10.1007/s12298-013-0179-1
Baldauf, S. L. (2003). The deep roots of eukaryotes. Science, 300(5626), 1703-1706.
doi:10.1126/science.1085544
Blanco, F. A., & Judelson, H. S. (2005). A bZIP transcription factor
from Phytophthora interacts with a protein kinase and is required for zoospore
motility and plant infection. Molecular Microbiology, 56(3), 638-648.
doi:10.1111/j.1365-2958.2005.04575.x
Brasier, C. (2007). Phytophthora biodiversity: How many Phytophthora species are
there? Proceedings of the Fourth Meeting of the International Union of Forest
Research Organizations (IUFRO) Working Party S07.02.09: Phytophthoras in Forest
and Natural Ecosystems, Gen. Tech., 101-115.
Deng, X., Zhou, H., Zhang, G., Wang, W., Mao, L., Zhou, X., . . . Lu, H. (2015). Sgf73, a
subunit of SAGA complex, is required for the assembly of RITS complex in fission
yeast. Scientific Reports, 5(1), 14707. doi:10.1038/srep14707
Erwin, D., Bartnicki-Garcia, S., & Tsao, P. (1983). Phytophthora: Its biology, taxonomy,
ecology, and pathology. St. Paul, Minn: American Phytopathological Society.
Gamboa-Meléndez, H., Huerta, A. I., & Judelson, H. S. (2013). bZIP transcription factors in
the oomycete Phytophthora infestans with novel DNA-binding domains are involved
in defense against oxidative stress. Eukaryotic Cell, 12(10), 1403-1412.
doi:10.1128/EC.00141-13
Grote, D., Olmos, A., Kofoet, A., Tuset, J. J., Bertolini, E., & Cambra, M. (2002). Specific
and sensitive detection of Phytophthora nicotianae by simple and
nested-PCR. European Journal of Plant Pathology, 108(3), 197-207.
doi:1015139410793
Judelson, H. S. (2012). Dynamics and innovations within oomycete genomes: Insights into
biology, pathology, and evolution. Eukaryotic Cell, 11(11), 1304-1312.
doi:10.1128/EC.00155-12
Judelson, H. S., & Blanco, F. A. (2005). The spores of Phytophthora: Weapons of the plant
destroyer. Nature Reviews Microbiology, 3(1), 47-58. doi:10.1038/nrmicro1064
Judelson, H. S., Tyler, B. M., & Michelmore, R. W. (1992). Regulatory sequences for
expressing genes in oomycete fungi. Molecular & General Genetics : MGG, 234(1),
138-146. doi:10.1007/bf00272355
Kim, K. S., & Judelson, H. S. (2003). Sporangium-specific gene expression in the oomycete
phytopathogen Phytophthora infestans. Eukaryotic Cell, 2(6), 1376-1385.
doi:10.1128/EC.2.6.1376-1385.2003
Krebs, J. E., Goldstein, E. S., & Kilpatrick, S. T. (2011). Lewin's essential genes (3rd ed.).
Burlington, MA: Jones & Bartlett Learning.
Lamour, K., & Kamoun, S. (2009). Oomycete genetics and genomics: diversity, interactions
and research tools. Hoboken, NJ: Wiley-Blackwell.
Lee, S. C., Ristaino, J. B., & Heitman, J. (2012). Parallels in intercellular communication in
oomycete and fungal pathogens of plants and humans. PLoS Pathogens, 8(12),
e1003028. doi:10.1371/journal.ppat.1003028
Li, S. (2010). Characterization of soybean GmPUB1 proteins that interact with the
Phytophthora sojae effector Avr1b protein (Master's thesis, Iowa State University).
Money, N. P. (1998). Why oomycetes have not stopped being fungi. Mycological
Research, 102(6), 767-768. doi:10.1017/S095375629700556X
Peng, H., Shan, W., Kuang, J., Lu, W., & Chen, J. (2013). Molecular characterization of
cold-responsive basic helix-loop-helix transcription factors MabHLHs that interact
with MaICE1 in banana fruit. Planta, 238(5), 937-953.
doi:10.1007/s00425-013-1944-7
Ravinder, R., & Goyal, N. (2017). Cloning, characterization and subcellular localization of
nuclear LIM interactor interacting factor gene from Leishmania donovani. Gene, 611,
1-8. doi:10.1016/j.gene.2017.02.007
Rogers, S. O. (2012). Integrated molecular evolution (1st ed.). Boca Raton, FL: CRC Press.
Roy, S., Poidevin, L., Jiang, T., & Judelson, H. S. (2013). Novel core promoter elements in
the oomycete pathogen Phytophthora infestans and their influence on expression
detected by genome-wide analysis. BMC Genomics, 14(1), 106.
doi:10.1186/1471-2164-14-106
Seidl, M. F., Wang, R. P., Van den Ackerveken, G., Govers, F., & Snel, B. (2012).
Bioinformatic inference of specific and general transcription factor binding sites in the
plant pathogen Phytophthora infestans. PLoS One, 7(12), e51295.
doi:10.1371/journal.pone.0051295
Tani, S., Kim, K., & Judelson, H. (2005). A cluster of NIF transcriptional regulators with
divergent patterns of spore-specific expression in Phytophthora infestans. Fungal
Genetics and Biology, 42(1), 42-50. doi:10.1016/j.fgb.2004.09.005
Tyler, B. M. (2007). Phytophthora sojae: Root rot pathogen of soybean and model
oomycete. Molecular Plant Pathology, 8(1), 1-8.
doi:10.1111/j.1364-3703.2006.00373.x
Tyler, B. M., Tripathy, S., Zhang, X., Dehal, P., Jiang, R. H. Y., Aerts, A., . . . Boore, J. L.
(2006). Phytophthora genome sequences uncover evolutionary origins and
mechanisms of pathogenesis. Science, 313(5791), 1261-1266.
doi:10.1126/science.1128796
van West, P., Kamoun, S., van ’t Klooster, J. W., & Govers, F. (1999). Internuclear gene
silencing in Phytophthora infestans. Molecular Cell, 3(3), 339-348.
doi:10.1016/S1097-2765(00)80461-X
Verdel, A., Jia, S., Gerber, S., Sugiyama, T., Gygi, S., Grewal, S. I. S., & Moazed, D. (2004).
RNAi-mediated targeting of heterochromatin by the RITS
complex. Science, 303(5658), 672-676. doi:10.1126/science.1093686
Walker, C. A., & van West, P. (2007). Zoospore development in the oomycetes. Fungal
Biology Reviews, 21(1), 10-18. doi:10.1016/j.fbr.2007.02.001
Wang, Z., Wang, Z., Shen, J., Wang, G., Zhu, X., & Lu, H. (2009). Identification
of Phytophthora sojae genes involved in asexual sporogenesis. Journal of
Genetics, 88(2), 141-148. doi:10.1007/s12041-009-0021-2
Wilson, R. C., & Doudna, J. A. (2013). Molecular mechanisms of RNA interference. Annual
Review of Biophysics, 42(1), 217-239. doi:10.1146/annurev-biophys-083012-130404
Xiang, Q., & Judelson, H. S. (2010). Myb transcription factors in the
oomycete Phytophthora with novel diversified DNA-binding domains and
developmental stage-specific expression. Gene, 453(1), 1-8.
doi:10.1016/j.gene.2009.12.006
Xiang, Q., & Judelson, H. S. (2014). Myb transcription factors and light regulate sporulation
in the oomycete Phytophthora infestans. PLoS One, 9(4), e92086.
doi:10.1371/journal.pone.0092086
Xiang, Q., Kim, K. S., Roy, S., & Judelson, H. S. (2009). A motif within a complex promoter
from the oomycete Phytophthora infestans determines transcription during an
intermediate stage of sporulation. Fungal Genetics and Biology, 46(5), 400-409.
doi:10.1016/j.fgb.2009.02.006
Ye, W., Wang, Y., Dong, S., Tyler, B. M., & Wang, Y. (2013). Phylogenetic and
transcriptional analysis of an expanded bZIP transcription factor family
in Phytophthora sojae. BMC Genomics, 14(1), 839. doi:10.1186/1471-2164-14-839
Yoshida, K., Schuenemann, V. J., Cano, L. M., Pais, M., Mishra, B., Sharma, R., . . .
Burbano, H. A. (2013). The rise and fall of the Phytophthora infestans lineage that
triggered the Irish potato famine. eLife, 2, e00731. doi:10.7554/eLife.00731
Zhang, M., Lu, J., Tao, K., Ye, W., Li, A., Liu, X., . . . Wang, Y. (2012). A Myb transcription
factor of Phytophthora sojae, regulated by MAP kinase PsSAK1, is required for
zoospore development. PLoS One, 7(6), e40246. doi:10.1371/journal.pone.0040246
CHAPTER 2. YEAST TWO-HYBRID ANALYSIS
Introduction
I Yeast Two-hybrid Assay (Y2H)
Most cellular processes depend on protein-protein interactions. Several methods have been developed for identifying potential interactive protein pairs or complexes.
These include affinity chromatography, coimmunoprecipitation (Phizicky & Fields, 1995), microscale thermophoresis (Wienken, Baaske, Rothbauer, Braun, & Duhr, 2010), bimolecular fluorescence complementation (Ohad & Yalovsky, 2010) and yeast two-hybrid (Y2H) analysis (Coates & Hall, 2003). Among them, Y2H is widely used because it is effective and accurate and does not require special or expensive instruments.
The principle of Y2H is based on the very nature and function of transcription factors.
In general, a transcription factor has a DNA-binding domain (DBD) and a trans-activating domain (TAD). While the DBD is responsible for recognizing and binding to the regulatory sites of the target gene, the TAD binds with other proteins to facilitate transcription initiation. However, to actually initiate transcription, the DBD and the TAD must come into close proximity to activate the basal transcription machinery. To investigate the interaction of two proteins, A and B, two fusion proteins are constructed: protein A is fused with the DBD and is called the "bait", and protein B is fused with the TAD and is called the "prey". If A and B interact, they bind to each other, and this binding physically brings the DBD and the TAD into close proximity, so that transcription of the downstream gene, known as a reporter gene, is activated (Figure 2.1).
Figure 2.1. The two-hybrid principle. Matchmaker® Gold Yeast Two-Hybrid
System User Manual, [PT4084-1], ©2010 Clontech Laboratories, Inc. (n/k/a Takara Bio
USA, Inc.). In this system, one protein is fused with the DNA-binding domain of GAL4 (orange, marked as “GAL4 DNA-BD”), and this protein of interest is called the “bait”. Another protein is fused with the trans-activating domain of GAL4 (green, marked as “GAL4 AD”), and this protein is called the “prey”. If the bait and prey proteins interact, the GAL4 BD and GAL4 AD come into close proximity and transcription of the reporter genes occurs. This figure was adapted from Matchmaker® Gold Yeast Two-Hybrid System User Manual, Protocol No.:
PT4084-1 (Clontech Laboratories, Inc., Mountain View, CA).
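The logic of the two-hybrid readout, including the reason a bait must be checked for autoactivation before screening, can be summarized in a small Boolean sketch. This is only a conceptual model: the function name, the placeholder protein names ("BaitX", "PreyY", "PreyZ"), and the representation of interactions as unordered pairs are assumptions made for illustration and do not correspond to any software used in this study.

```python
from typing import Collection, FrozenSet, Optional, Set


def reporter_on(
    bait: str,
    prey: Optional[str],
    interacting_pairs: Set[FrozenSet[str]],
    autoactivating_baits: Collection[str] = (),
) -> bool:
    """Conceptual model of reporter activation in a two-hybrid assay.

    The reporter is expected to be on when the DBD-bait fusion activates
    transcription by itself (autoactivation), or when a TAD-prey fusion is
    present and binds the bait, bringing the DBD and TAD together.
    """
    if bait in autoactivating_baits:
        return True  # false positive: no prey is needed
    if prey is None:
        return False  # bait-only control with a well-behaved bait
    return frozenset((bait, prey)) in interacting_pairs


# Hypothetical example: BaitX binds PreyY but not PreyZ.
known_pairs = {frozenset(("BaitX", "PreyY"))}
true_interaction = reporter_on("BaitX", "PreyY", known_pairs)
no_interaction = reporter_on("BaitX", "PreyZ", known_pairs)
bait_only_control = reporter_on("BaitX", None, known_pairs)
```

In this model, a bait that switches the reporter on in the bait-only control would make every prey look positive, which is why a bait that lacks autoactivation is required before a library screen is interpreted.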
The viability of the Y2H system has been well-established in many studies. For example, Peng et al. (2013) utilized Y2H to successfully show the interactions of five
MabHLHs that form heterodimers as well as their interactions with MaICE1 in banana fruit.
Rajagopala et al. (2012) tested the feasibility of Y2H by analyzing the known structures of the E. coli DNA polymerase III complex and the MntR complex using a Y2H assay, as well as by
reviewing the literature of previously conducted Y2H assays on the protein complexes,
Varicella Zoster Virus ribonucleotide reductase, bacteriophage λ, yeast proteasome, and
human spliceosome. Their analysis showed that Y2H is suitable for analyzing interaction
networks of subunits within a protein complex. In P. sojae, H. Brar and M. K. Bhattacharyya
used Y2H to identify several proteins from soybean that potentially interact with Avr1b, one
of the well-studied effector proteins from P. sojae (unpublished, reviewed in Li, 2010). Li
(2010) used Y2H to further confirm that some conserved residues in the Avr1b protein were
needed for the interactions between Avr1b and soybean GmPUB1s. This study demonstrated
that Y2H analysis is suitable for studying the interactions of P. sojae proteins. In addition,
Cheng et al. (2018) studied the molecular mechanism of GmPIB1, a soybean bHLH
transcription factor that is expressed in resistant soybean against P. sojae infection. Using Y2H,
they showed that GmPIB1 can form homodimers. Taken together, these studies show that Y2H is
suitable for studies of TFs, including TFs that form homodimers. Naveed, Bibi and Ali (2019)
used Y2H to identify potential interaction targets of the P. infestans RXLR effector PiAvrblb2
in tomato.
II Phylogenetic Analysis
Phylogenetic analysis plays an important role in biological studies of the evolutionary relationships of organisms, genes, and proteins. Through phylogenetic analysis, evolutionary connections can be made within and between species using similarities between nucleotide and amino acid sequences. This is especially useful when studying conserved domains of proteins. Comparative genomics can help identify conserved domains/motifs in proteins under study through comparison and alignment with known protein sequences. For example, Yan et al. (2013) studied the phylogeny of ERF proteins in
sorghum. In their study, they determined the hidden Markov model (HMM) of the conserved AP2
domain by analyzing Arabidopsis AP2/ERF proteins via the PFAM database
(http://pfam.janelia.org/search/sequence), then they used this HMM to find AP2/ERF proteins
in the sorghum genome. An additional study conducted by Martinez-Duncker et al. (2003) using a
neighbor-joining phylogenetic analysis of four groups of enzymes (POFUT1, POFUT2, α2-
and α6-FUTs) concluded that the four groups are, in fact, four distinct enzyme families, i.e.
the members within each group are closer to each other than to the members of other groups.
Furthermore, they were able to infer the approximate time of the divergence events of the
enzymes relative to the evolution and emergence of the host species from the phylogenetic
tree.
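To make the neighbor-joining idea concrete, the sketch below computes pairwise p-distances (the proportion of differing sites) from a small alignment and then applies the standard neighbor-joining selection criterion, Q(i, j) = (n - 2)d(i, j) - Σd(i, k) - Σd(j, k), to pick the first pair of sequences to be joined. It is a simplified illustration, not the implementation in any phylogenetics package: the toy sequences and function names are hypothetical, and a full analysis would continue joining nodes, estimate branch lengths, and typically use a corrected distance model.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are ignored.
    """
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in columns) / len(columns)


def distance_matrix(alignment: dict) -> dict:
    """Symmetric matrix of p-distances, stored as a dict of dicts."""
    names = list(alignment)
    matrix = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        matrix[a][b] = matrix[b][a] = p_distance(alignment[a], alignment[b])
    return matrix


def first_nj_pair(matrix: dict) -> tuple:
    """Pair minimizing the neighbor-joining Q criterion (first joining step)."""
    names = list(matrix)
    n = len(names)
    row_sums = {a: sum(matrix[a].values()) for a in names}
    return min(
        combinations(names, 2),
        key=lambda pair: (n - 2) * matrix[pair[0]][pair[1]]
        - row_sums[pair[0]]
        - row_sums[pair[1]],
    )


# Hypothetical toy alignment: A/B and C/D are the two closely related pairs.
toy_alignment = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "TTGTCCGT",
    "D": "TTGTCCGA",
}
toy_matrix = distance_matrix(toy_alignment)
joined_first = first_nj_pair(toy_matrix)
```

In the toy alignment the two closely related pairs are equally good candidates for the first join, so either may be returned depending on iteration order.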
Hypotheses and Aims
In this chapter, I hypothesized that a potential TF, Ps1365, interacts with one or more
proteins in P. sojae. This study has three aims:
1) Bioinformatic and phylogenetic analysis of the bait sequence, Ps1365.
2) Cloning of the bait sequence into yeast strain Saccharomyces cerevisiae
Y2HGold to construct the bait yeast strain.
3) Identification of proteins that interact with Ps1365 via Y2H.
Materials and Methods
I Biological Materials
Phytophthora sojae: P. sojae strain P6497 was grown using V8 agar plates. To prepare
1 L of V8 agar media, 200 mL V8 vegetable juice was added to 2.5 g CaCO3, and adjusted to
1 L by adding deionized water (dH2O). The pH was adjusted to 6-7 by using one or more
drops of 10 M NaOH. Then, 15 g of Bacto agar was added, and the mixture was stirred until homogeneous and autoclaved at 121°C for 30 min. The autoclaved media was poured into sterile petri
dishes. An autoclaved polycarbonate membrane (Sigma-Aldrich, St. Louis, MO) was
carefully placed onto each newly-made V8 agar plate using sterile forceps. To propagate P.
sojae mycelium, a plug with well-grown P. sojae mycelia was removed from an old V8 agar
plate using a cork borer (the cork borer penetrated through the V8 agar media until it hit the
bottom of the petri dish). Then the plug with mycelia on top was placed onto the center of the
polycarbonate membrane on the newly-made V8 agar plate using sterile forceps or a sterile
needle, and the new V8 agar plate was incubated at room temperature (about 18-22°C) for
5-7 days. The mycelia were scraped off from the polycarbonate membrane surface using a razor blade and placed into sterile screw-capped 2.0 ml microfuge tubes and flash-frozen in liquid nitrogen, then immediately stored at -80°C until use.
Escherichia coli: One Shot® Mach1™-T1R and One Shot® TOP10 Chemically
Competent E. coli cells (Invitrogen, Carlsbad, CA) were used to propagate the yeast plasmids pGBKT7 and pGADT7-Rec. Cells were cultured on LB agar plates and in LB broth with 50
μg/mL kanamycin or 100 μg/mL ampicillin for selection. To prepare 500 mL of LB broth, 500
mL dH2O was added to 12.5 g LB ready-made powder (Lab Express International Inc.,
Fairfield, NJ), then stirred until the powder was completely dispersed. The pH
was adjusted to 7.0, if needed. Then, the media was autoclaved at 121°C for 30 min. LB agar
was prepared similarly, except that 7.5 g of Bacto agar was added before autoclaving. It
was then poured into sterile petri dishes after cooling to about 55°C. To make LB broth
with kanamycin (50 μg/mL) or LB broth with ampicillin (100 μg/mL), one volume of 50
mg/mL kanamycin stock solution or 100 mg/mL ampicillin stock solution was added to
1000 volumes of LB broth when the LB broth was at room temperature, then the container
holding the LB broth was vigorously shaken to mix well. LB agar kanamycin (50 μg/mL) or LB
agar ampicillin (100 μg/mL) media was made by adding one volume of 50 mg/mL kanamycin
stock solution or 100 mg/mL ampicillin stock solution to 1000 volumes of LB agar when the
LB agar cooled down to about 55°C, then the container holding the LB agar was vigorously
shaken to mix well. The resulting media containing antibiotics were immediately poured into
sterile petri dishes. E. coli cells on LB agar plates were incubated at 37°C
overnight; E. coli cells in LB broth were cultured at 37°C overnight with shaking at 225 rpm.
Saccharomyces cerevisiae: two yeast strains, S. cerevisiae Y2HGold and Y187 were
used. Both of them were grown on YPDA agar plates if they did not contain any vector.
Y2HGold with bait vector was grown on SD/-Trp single dropout (SDO) agar plates, while
Y187 with prey constructs was grown on SD/-Leu agar plates for selection. The diploid yeast
cells which were formed by the budding and mating of the bait and prey yeast strains were
grown on SD/–Leu/–Trp double dropout media (DDO). X-α-Gal and Aureobasidin A (AbA)
were used for observing the activity of reporter genes. Ready-made media powder pouches from Clontech were used for the preparation of all the above YPDA and SD media (including all the nutrient-dropout media, e.g. SDO, DDO). Broth media were prepared by dissolving a pouch of the corresponding ready-made broth media powder from Clontech in 500 mL of dH2O, adjusting the pH to approximately 5.8 with 10 M NaOH, and autoclaving at 121°C
for 15 min. Agar media were prepared by either of two procedures: (1) using
a pouch of the corresponding ready-made broth media powder from Clontech. The procedure was
exactly the same as that for the broth media, except that 10 g of Bacto agar was
added to each 500 mL of the broth media before autoclaving; (2) directly using ready-made agar media powder from Clontech. The procedure was exactly the same as that for the broth media, except that the mixture was simply homogenized before autoclaving. The components of YPDA agar media were as follows:
peptone 20 g/L
yeast extract 10 g/L
agar 20 g/L
adenine 120 mg/L
The components of SD agar (includes SDO and DDO) media were as follows:
yeast nitrogen base 6.7 g/L
agar 20 g/L
amino acids see below “amino acid compositions of SD media”
dextrose 2%
The amino acid compositions of SD media are as follows:
L-Adenine hemisulfate salt 20 mg/L
L-Arginine HCl 20 mg/L
L-Histidine HCl monohydrate 20 mg/L
L-Isoleucine 30 mg/L
L-Leucine 100 mg/L
L-Lysine HCl 30 mg/L
L-Methionine 20 mg/L
L-Phenylalanine 50 mg/L
L-Threonine 200 mg/L
L-Tryptophan 20 mg/L
L-Tyrosine 30 mg/L
L-Uracil 20 mg/L
L-Valine 150 mg/L
SDO and DDO lack the corresponding amino acids. For example, SD/-Trp media lacks L-Tryptophan in its amino acid composition.
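The relationship between the complete SD supplement list and the dropout media can be expressed as a simple lookup. The sketch below encodes the composition listed above and derives the supplement list of a dropout medium from its name; the function name and the mapping of the abbreviations "Trp" and "Leu" to table entries are assumptions introduced for this example, and only the two dropouts used in this section are included.

```python
# Supplement concentrations in mg/L, taken from the composition listed above.
SD_SUPPLEMENTS_MG_PER_L = {
    "Adenine hemisulfate salt": 20,
    "Arginine HCl": 20,
    "Histidine HCl monohydrate": 20,
    "Isoleucine": 30,
    "Leucine": 100,
    "Lysine HCl": 30,
    "Methionine": 20,
    "Phenylalanine": 50,
    "Threonine": 200,
    "Tryptophan": 20,
    "Tyrosine": 30,
    "Uracil": 20,
    "Valine": 150,
}

# Assumed mapping from dropout abbreviations to table entries.
DROPOUT_CODES = {"Trp": "Tryptophan", "Leu": "Leucine"}


def dropout_supplements(medium_name: str) -> dict:
    """Return the supplements present in a medium such as 'SD/-Leu/-Trp'."""
    # Accept both hyphens and en dashes in the medium name.
    parts = medium_name.replace("\u2013", "-").split("/")
    if parts[0] != "SD":
        raise ValueError(f"not an SD medium: {medium_name}")
    omitted = {DROPOUT_CODES[part.lstrip("-")] for part in parts[1:]}
    return {
        name: amount
        for name, amount in SD_SUPPLEMENTS_MG_PER_L.items()
        if name not in omitted
    }


sdo_trp = dropout_supplements("SD/-Trp")      # bait selection (SDO)
ddo = dropout_supplements("SD/-Leu/-Trp")     # diploid selection (DDO)
```

Here `sdo_trp` keeps 12 of the 13 supplements and `ddo` keeps 11, which mirrors the way each selection medium omits only the nutrient supplied by the plasmid marker being selected.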
X-α-Gal stock solution was prepared by dissolving 250 mg of X-α-Gal in 12.5 mL of dimethylformamide (DMF), and stored at -20°C. AbA stock solution was prepared by dissolving 1 mg of AbA in 2 mL of ethanol and stored at 4°C. To make yeast media containing X-α-Gal, 1 mL of X-α-Gal stock solution (20 mg/mL) was added to 500 mL of yeast media (after the media had cooled to 55°C following autoclaving) and thoroughly mixed by shaking the container. To make yeast media containing AbA, 200 µL of AbA stock solution (500 µg/mL) was added to 500 mL of yeast media (after the media had cooled to 55°C following autoclaving) and thoroughly mixed by shaking the container.
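The stock and working concentrations stated above follow from simple dilution arithmetic, which the following sketch reproduces. The function names are hypothetical, and the calculation assumes that the small volume of stock added (1 mL or 200 µL) does not meaningfully change the 500 mL media volume.

```python
def stock_ug_per_ml(mass_mg: float, solvent_ml: float) -> float:
    """Stock concentration in µg/mL from a mass (mg) dissolved in a volume (mL)."""
    return mass_mg * 1000.0 / solvent_ml


def final_ug_per_ml(stock_conc_ug_per_ml: float, stock_ml: float, media_ml: float) -> float:
    """Working concentration in µg/mL, ignoring the volume of stock added."""
    return stock_conc_ug_per_ml * stock_ml / media_ml


# X-α-Gal: 250 mg in 12.5 mL DMF, then 1 mL of stock per 500 mL of media.
x_gal_stock = stock_ug_per_ml(250, 12.5)               # 20,000 µg/mL = 20 mg/mL
x_gal_final = final_ug_per_ml(x_gal_stock, 1.0, 500)   # 40 µg/mL

# AbA: 1 mg in 2 mL ethanol, then 200 µL of stock per 500 mL of media.
aba_stock = stock_ug_per_ml(1, 2)                      # 500 µg/mL
aba_final = final_ug_per_ml(aba_stock, 0.2, 500)       # 0.2 µg/mL = 200 ng/mL
```

The same calculation reproduces the antibiotic dilutions used for LB media: one volume of a 50 mg/mL kanamycin stock in 1000 volumes of media gives 50 µg/mL.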
II Bioinformatic Analysis of Ps1365
To obtain full-length amino acid and nucleotide sequences of Ps1365, its partial amino acid sequence was used as a query to search against P. sojae nucleotide and protein databases at NCBI (National Center for Biotechnology Information) (https://www.ncbi.nlm.nih.gov),
JGI (U.S. Department of Energy Joint Genome Institute) (https://jgi.doe.gov) and FungiDB
(Stajich et al., 2012) (https://fungidb.org/fungidb/) using BLAST tools. The NCBI BLAST
parameters were: database was selected as “Non-redundant protein sequences (nr)”, algorithm
was “blastp (protein-protein BLAST)”. The JGI BLAST parameters were: the alignment
program was set as blastp (blast protein vs. protein), and the database was selected as
"Phytophthora sojae v3.0 filtered model proteins", all the other parameters were set as
default. The FungiDB BLAST parameters were: “Target Data Type” was set as “Proteins”;
“BLAST Program” was set as “blastp”; “Target Organism” was set as “Phytophthora sojae
strain P6497”; all the other parameters were set as default.
The amino acid sequences of the 30 homologous proteins of Ps1365 from P. sojae
were provided by Brian Rutter. The eight homologous protein sequences of Ps1365 from
other oomycetes were retrieved from FungiDB. Proteins with abnormal insertions or
deletions indicating that they are pseudogenes were removed. The alignment of Ps1365 with
its homologous P. sojae proteins and with other related oomycete proteins as well as the
construction of their phylogenetic tree were performed using MEGA 7.0.26 (Kumar, Stecher,
& Tamura, 2016). Two different methods, neighbor-joining (NJ) and maximum likelihood
(ML) were used for constructing the nucleic acid and protein phylogenetic trees. The parameters in Figure 2.2a were used for constructing the nucleic acid NJ tree; the parameters in Figure 2.2b were used for constructing the nucleic acid ML tree; the parameters in Figure 2.2c were used for constructing the protein NJ tree; and the parameters in Figure 2.2d were used for constructing the protein ML tree.
Figure 2.2. Parameters used for constructing the phylogenetic tree of Ps1365, its 21 homologous proteins in P. sojae as well as some of their homologous counterparts in other oomycetes. The software MEGA 7.0.26 was used for the tree construction. a) nucleic acid NJ tree parameters; b) nucleic acid ML tree parameters; c) protein NJ tree parameters; d) protein ML tree parameters.
A Kyte-Doolittle hydropathy plot (Kyte & Doolittle, 1982) was used to analyze the hydrophobicity of Ps1365, with the window size set to 9. The signal peptide was predicted using SignalP - Signal Peptide Prediction (Petersen, Brunak, von Heijne, & Nielsen, 2011): the organism group was selected as “Eukaryotes”, “D-cutoff values” was set as default, and “Method” was set as “Input sequences may include TM regions”. Ps1365 was analyzed for potential conserved domains using Pfam (Finn et al., 2016). The secondary structure of Ps1365 was predicted using JPred 4 (Drozdetskiy, Cole, Procter, & Barton, 2015) with default parameters. The subcellular localization of Ps1365 was predicted by CELLO (Yu, Lin, & Hwang, 2004; Yu, Chen, Lu, & Hwang, 2006); “Eukaryotes” was selected for “ORGANISMS”, and “Protein” was selected for “SEQUENCES”.
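To illustrate what a Kyte-Doolittle analysis with a window size of 9 computes, a minimal Python sketch is given below. It is not the tool used in this study; the function name is hypothetical, and the sketch simply averages the published Kyte-Doolittle hydropathy values over each window of nine consecutive residues. The example input is the N-terminal stretch of Ps1365.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return the mean hydropathy of every full window along the sequence.

    The i-th value covers residues i .. i + window - 1 (0-based), so a
    sequence of length n yields n - window + 1 values.
    """
    scores = [KD_SCALE[residue] for residue in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# N-terminal residues of Ps1365 (illustrative input only)
n_terminus = "MNVRGRTRDGKFIYATTMK"
for start, value in enumerate(hydropathy_profile(n_terminus), start=1):
    print(f"window starting at residue {start}: {value:+.2f}")
```

Positive window averages indicate hydrophobic stretches and negative averages indicate hydrophilic stretches.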
III Y2H Bait Construction
DNA extraction was performed using frozen P. sojae mycelia ground with a mortar
and pestle in liquid nitrogen. The resulting dry P. sojae mycelium powder was immediately
used for DNA extraction with the Qiagen DNeasy Plant Mini Kit (Qiagen, Redwood City, CA)
according to the manufacturer’s protocol.
Cloning of Ps1365 bait gene into E. coli: the Y2H bait was constructed using
Matchmaker Gold Yeast Two-Hybrid System according to its manual (protocol No.:
PT4084-1, Clontech Laboratories, Inc., Mountain View, CA). PCR amplification of
Ps1365-encoding gene was performed using gene-specific forward and reverse primers
(Table 2.1) and genomic DNA of P. sojae as template with the following PCR program:
initialization 95°C 2 min
30 cycles of:
    denaturation 95°C 30 sec
    annealing 67°C 30 sec
    extension 72°C 1 min
final elongation 72°C 5 min
The PCR products were purified using the QIAGEN MinElute PCR Purification Kit
(QIAGEN, Redwood City, CA) according to its manual. First, 5 volumes of Buffer PB were added to the PCR reaction; the resulting mixture was yellow, indicating that its pH was ≤ 7.5. The mixture was passed through MinElute columns by centrifugation at 13,000 rpm at room temperature for 1 min. The DNA bound to the MinElute column membrane was then washed with 750 μL of Buffer PE by centrifugation at 13,000 rpm at room temperature for 1 min, and the column was centrifuged once more at 13,300 rpm at room temperature for 1 min. Finally, the DNA was eluted with 30 μL dH2O by centrifugation at 13,000 rpm for 1 min. The purified PCR product was sequenced, using the Ps1365-encoding gene forward primer, to confirm the gene sequence and its open reading frame (DNA Analysis, LLC, Cincinnati, OH).
Figure 2.3. pGBKT7 Vector picture. pGBKT7 Vector Information, [PT3248-5], ©2008
Clontech Laboratories, Inc. (n/k/a Takara Bio USA, Inc.). This plasmid was used as bait. This figure was adapted from Matchmaker® Gold Yeast Two-Hybrid System User Manual,
Protocol No.: PT4084-1 (Clontech Laboratories, Inc., Mountain View, CA).
Table 2.1. Primers in this research
Primer             Sequence (5’-3’)                            PCR annealing temperature (°C)
Ps1365F            CATGGAGGCCGAATTCATGAATGTGCGCGGTAGAACGCGT    67
Ps1365R            GCAGGTCGACGGATCCTCACTCTTCTTCCTTGTCACCCGG    67
SMART III Oligo    AAGCAGTGGTATCAACGCAGAGTGGCCATTATGGCCGGG     70.8
CDS III            ATTCTAGAGGCCGAGGCGGCCGACATG-d(T)30VN        42
CDS III/6          ATTCTAGAGGCCGAGGCGGCCGACATG-NNNNNN          42
5’ PCR Primer      TTCCACCCAAGCAGTGGTATCAACGCAGAGTGG           68
3’ PCR Primer      GTATCGATGCCCACCCTCTAGAGGCCGAGGCGGCCGACA     68
After sequencing, the full-length sequence of the Ps1365-encoding gene was cloned into
the bait vector pGBKT7 (BD) (Figure 2.3) by homologous recombination. Both the gene-specific forward and reverse primers (used for cloning the Ps1365 gene) carry overhangs homologous, respectively, to the two sequences flanking the multiple cloning site of the yeast vector pGBKT7 (BD), so the amplified sequence could be inserted into the multiple cloning site of the vector. To prepare the vector for homologous recombination, the circular pGBKT7 vector was double digested using EcoRI and BamHI at
37°C for one hour, then incubated at 65°C for 20 min. The double-digested vector was then purified by ethanol precipitation (Lamitina Lab, 2007).
The ethanol precipitation was conducted as follows:
1) 1/10 volumes of sodium acetate (pH 5.2) was added to the digested vector.
2) Mixed by gently pipetting.
3) 2.5 volumes (vector + sodium acetate) of cold ethanol (stored at -20°C) was added to the reaction.
4) Mixed by gently pipetting.
5) Stored at -20°C for three hours.
6) Centrifuged at 16,300 g (max) for 15 min.
7) Carefully decanted supernatants.
8) Added 1 mL 70% ethanol. Centrifuged at 16,300 g for 2 min, residual ethanol was carefully decanted.
9) Centrifuged once again at 16,300 g for 2 min, residual ethanol was carefully decanted.
10) DNA pellets were resuspended in 15 μL dH2O.
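The reagent volumes in steps 1) and 3) scale with the sample volume, which can be sketched as follows. This Python snippet is only an illustration; the function name and the 50 μL example volume are hypothetical and are not values from this study.

```python
def ethanol_precipitation_volumes(sample_ul):
    """Return (sodium acetate uL, cold ethanol uL) for a given sample volume.

    Sodium acetate: 1/10 of the sample volume.
    Ethanol: 2.5 volumes of (sample + sodium acetate).
    """
    sodium_acetate_ul = sample_ul / 10
    ethanol_ul = 2.5 * (sample_ul + sodium_acetate_ul)
    return sodium_acetate_ul, ethanol_ul


# Hypothetical example: a 50 uL digest
acetate, ethanol = ethanol_precipitation_volumes(50)
print(f"sodium acetate: {acetate} uL, cold ethanol: {ethanol} uL")
```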
The purified double-digested pGBKT7 vector and the Ps1365 gene insert were joined
via homologous recombination using an In-Fusion HD Cloning Kit (Clontech
Laboratories, Inc., Mountain View, CA) according to its manual. First, the following reagents were combined in the volumes shown below:
5× In-Fusion HD Enzyme Premix 4 μL
linearized pGBKT7 vector 10.3 μL (50.573 ng)
Ps1365 5.4 μL (50.274 ng)
dH2O 0.3 μL
The reaction was then incubated at 50°C for 15 min.
The above product of the In-Fusion reaction was then transformed into Invitrogen One
Shot® TOP10 Chemically Competent E. coli cells (Thermo Fisher Scientific Inc., Waltham,
MA) according to its protocol below:
1) Frozen TOP10 cells were taken out from -80°C and immediately placed on ice. Then, 2 μL of the In-Fusion reaction was added to each tube of TOP10 cells and mixed gently.
2) Incubated on ice for 10 min.
3) Heat shocked the cells at 42°C water bath for 30 sec.
4) Immediately incubated on ice for 2 min.
5) Added 250 μL of room temperature SOC medium to each tube of TOP10 cells.
6) Tubes were mixed by shaking at 200 rpm, 37°C for one hour.
7) Plated each tube on one LB agar plate with 50 μg/mL kanamycin.
8) Incubated at 37 °C overnight.
The components of the SOC medium are as follows:
Tryptone 2%
Yeast Extract 0.5%
NaCl 10 mM
KCl 2.5 mM
MgCl2•6H2O 10 mM
glucose 20 mM
The pGBKT7 vectors containing Ps1365 gene insert were extracted from the E. coli
cells using Zyppy™ Plasmid Miniprep Kit (Zymo Research Corporation, Irvine, CA)
according to its protocol. First, 100 μL of 7× Lysis Buffer was added to each 600 μL of
bacterial liquid culture, and mixed. Then, 350 μL of 4°C Neutralization Buffer was added to each reaction and mixed. The reactions were centrifuged at 16,000 for 4 min. Supernatants were filtered through Zymo-Spin IIN columns to trap the plasmids on column membranes by centrifugation at 16,000 g for 30 sec. Each of the column membranes was washed with 200
μL Endo-Wash Buffer by centrifugation at 16,000 rpm for 30 sec. This step was repeated with
400 μL Endo-Wash Buffer. The plasmids on each column membrane were eluted with 30 μL dH2O by centrifugation at 16,000 g for 30 sec. The purified plasmid was sequenced by DNA
Analysis, LLC (Cincinnati, OH) to ensure that the gene is in the correct open reading frame
within the vector pGBKT7.
Sub-cloning of the Ps1365 gene in yeast strain S. cerevisiae Y2HGold: the pGBKT7
vector containing a full-length sequence of Ps1365 gene (abbreviated as pGBKT7-Ps1365)
was transformed into the yeast strain S. cerevisiae Y2HGold using Yeastmaker™ Yeast
Transformation System 2 (Clontech Laboratories, Inc., Mountain View, CA) according to its protocol. First, competent yeast cells were prepared as follows:
1) Yeast cells were incubated on YPDA agar plates at 30°C until the colony diameters reached 2-3 mm.
2) Each single colony with a diameter of 2-3 mm was transferred into 3 mL YPDA broth and incubated with shaking at 250 rpm at 30°C for 12 h.
3) A 5 μL aliquot of the culture was transferred into 50 mL YPDA broth and incubated with shaking at 250 rpm at 30°C until OD600 reached 0.194.
4) The yeast culture was centrifuged at 700 g for 5 min to pellet the yeast cells, and the supernatant was discarded. Cell pellets were resuspended in 100 mL fresh YPDA broth and incubated with shaking at 250 rpm at 30 °C until OD600 reached 0.447.
5) The resulting yeast culture was divided into two 50 mL volumes in two sterile
Falcon conical tubes, and centrifuged at 700 g for 5 min to pellet the yeast cells. The
supernatants were discarded.
6) Cells were resuspended with 30 mL dH2O.
7) The suspension was centrifuged at 700 g for 5 min to discard the supernatants.
Resuspended the cell pellets in 1.5 mL 1.1×TE/LiAc (prepared by mixing 1.1 mL of 10X TE
Buffer and 1.1 mL of 1 M LiAc (10X) solution and adding sterile dH2O to make the total
volume 10 mL).
8) The suspension was centrifuged at 16,300 g for 1 min.
9) Supernatants were discarded and cells were resuspended in 600 μL 1.1×TE/LiAc,
then placed on ice.
Then, yeast transformation was conducted as follows:
1) The following components were mixed in each empty tube:
Plasmid [pGBKT7-Ps1365] 2 μL (118.6 ng)
Yeastmaker Carrier DNA denatured 5 μL (50 μg)
2) To each tube above, 50 μL of competent cells was added and mixed by gently vortexing.
3) 500 μL of PEG/LiAc (prepared by combining 8 mL of 50% PEG 3350, 1 mL of
10X TE Buffer, and 1 mL of 1 M LiAc (10X)) was added to each tube and mixed by gently vortexing.
4) Incubated at 30°C for 30 min. Cells were mixed every 10 min by gently vortexing.
5) Added 20 μL DMSO (dimethylsulfoxide) and mixed by gently vortexing.
6) Incubated in 42°C water bath for 15 min. Cells were mixed at every 5 min intervals by gently vortexing.
7) Yeast cells were pelleted by centrifugation at 16,300 g for 1 min.
8) Supernatants were discarded. The cells were resuspended in 1 mL YPD Plus
Medium.
9) Yeast cells were pelleted by centrifugation at 16,300 g for 1 min.
10) Supernatants were discarded. The cells were resuspended in 1 mL 0.9% NaCl solution.
11) Cells were diluted to 1/10 and 1/100 concentrations by adding 0.9% NaCl solution and plated on SD/-Trp agar plates, then incubated at 30°C until colonies 2-3 mm in diameter appeared (five days).
Then, glycerol stocks of Y2HGold [pGBKT7-Ps1365] (Y2HGold cells including pGBKT7-Ps1365) were made and stored at -80°C.
To confirm that the bait strain, the S. cerevisiae Y2HGold transformants, contained the Ps1365-encoding gene sequence, yeast colony PCR was performed on two single bait colonies as follows: first, Y2HGold [pGBKT7-Ps1365] cells were grown on SD/-Trp agar plates at 30°C for 5 days, then single yeast colonies 2-3 mm in diameter were picked and lysed with NaOH as follows:
1) Each single yeast colony was picked and suspended in 20 μl of 20 mM NaOH.
2) Incubated the yeast-NaOH suspension at 95°C for 45 minutes to break down yeast cell walls.
3) Centrifuged at maximum speed (20,817 g) for 10 minutes.
Using the supernatants obtained as templates, yeast colony PCR was performed using
Taq 2× Master Mix (New England Biolabs, Ipswich, MA) and Ps1365 gene-specific primers
(Table 2.1). The supernatant from untransformed S. cerevisiae Y2HGold cells was used as a negative control, while the vector pGBKT7-Ps1365 was used as a positive control. The following PCR program was used:
initiation 95°C 2 min
30 cycles of:
    denaturation 95°C 30 sec
    annealing 67°C 30 sec
    extension 72°C 1 min
final elongation 72°C 5 min
The components of the Taq 2× Master Mix used are as follows:
Tris-HCl 20 mM
KCl 100 mM
MgCl2 3 mM
dNTPs 0.4 mM
Glycerol 10%
IGEPAL® CA-630 0.16%
Tween® 20 0.1%
Taq DNA Polymerase 50 Units/mL
IV Bait Autoactivation Assay
The pGBKT7-Ps1365 bait strain was tested for autoactivation to confirm that Ps1365 would not be able to activate the expression of the reporter genes without the presence of prey proteins. The bait autoactivation assay was performed using the Matchmaker Gold Yeast
Two-Hybrid System according to the manufacturer’s protocol (protocol No.: PT4084-1,
Clontech Laboratories, Inc., Mountain View, CA).
In order to conduct the autoactivation test, five test groups were set up:
① empty cell group: Y2HGold cells without any vector
② empty vector group: Y2HGold cells transformed with pGBKT7 (BD) vector without any insert
③ test group: Y2HGold transformed with pGBKT7-Ps1365
④ negative control: Y2HGold [pGBKT7-Lam] X Y187 [pGADT7-T]
⑤ positive control: Y2HGold [pGBKT7-53] X Y187 [pGADT7-T]
Group ④, the negative control, consists of the diploid yeast cells resulting from the mating of
Y2HGold [pGBKT7-Lam] and Y187 [pGADT7-T] cells. These diploid yeast cells will not express the reporter genes because of the lack of interaction between the bait (Lam) and prey
(T antigen). Group ⑤, the positive control, consists of the diploid yeast cells resulting from the mating of
Y2HGold [pGBKT7-53] and Y187 [pGADT7-T] cells. These diploid yeast cells will express the reporter genes because of the interaction between the bait (p53) and prey (T antigen).
Groups ①, ② and ③ were plated on the following three agar media, respectively, with 1/10, 1/100, 1/1000 dilutions:
(1) SDO (SD/–Trp). Only the yeast cells harboring the pGBKT7 vector can grow on this media.
(2) SDO/X (SD/–Trp/X-α-Gal). Only the yeast cells harboring the pGBKT7 vector can grow on this media. In addition, if the MEL1 gene they contain is activated, the cells form a blue pigment on this media.
(3) SDO/X/A (SD/–Trp/X-α-Gal/AbA). Only the yeast cells that harbor the pGBKT7 vector and in which the AUR1-C gene is activated can grow on this media. In addition, if the MEL1 gene is activated, the yeast cells form typical blue colonies on this media.
The two control groups were plated on the following three different agar media, respectively, with 1/10, 1/100, 1/1000 dilutions:
(1) DDO (SD/–Leu/–Trp). Only the yeast cells harboring both pGBKT7 and pGADT7 vectors can grow on this media.
(2) DDO/X (SD/–Leu/–Trp/X-α-Gal). Only the yeast cells harboring both pGBKT7 and pGADT7 vectors can grow on this media. In addition, activation of the MEL1 gene they contain causes the formation of a blue pigment.
(3) DDO/X/A (SD/–Leu/–Trp/X-α-Gal/AbA). Only the yeast cells that harbor both pGBKT7 and pGADT7 vectors and in which the AUR1-C gene is activated can
grow on this media. As mentioned above, activation of their MEL1 gene causes the
formation of a blue pigment.
The plates were incubated at 30°C until colony diameters reached 2-3 mm. If blue
colonies appeared on both SDO/X and SDO/X/A agar plates of the test group, this meant that
Ps1365 caused autoactivation. If white or pale blue colonies appeared on the test group
SDO/X plates and no colonies on the test group SDO/X/A plates, this meant that Ps1365
failed to autoactivate the reporter genes without the presence of prey proteins.
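The decision rule described above can be summarized in a small sketch. The Python function below is only an illustration of the scoring logic; its name, arguments, and the "inconclusive" category for outcomes not covered by the two rules are assumptions introduced here.

```python
def interpret_autoactivation(sdo_x_color, sdo_x_a_color):
    """Classify the test group from its colony phenotypes.

    sdo_x_color:   colony color on SDO/X ("blue", "pale blue" or "white")
    sdo_x_a_color: colony color on SDO/X/A, or None if no colonies grew
    """
    if sdo_x_color == "blue" and sdo_x_a_color == "blue":
        # Both reporters (MEL1 and AUR1-C) active without any prey
        return "autoactivation"
    if sdo_x_color in ("white", "pale blue") and sdo_x_a_color is None:
        # No AbA resistance and little or no MEL1 activity
        return "no autoactivation"
    # Any other combination is not covered by the two rules above
    return "inconclusive"


print(interpret_autoactivation("white", None))
print(interpret_autoactivation("blue", "blue"))
```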
The positive and negative control groups were constructed (as follows) according to
Matchmaker Gold Yeast Two-Hybrid System User Manual:
The diploid yeast cells containing pGBKT7-53 and pGADT7-T were used as a positive control, while the diploid yeast cells containing pGBKT7-Lam and pGADT7-T were used as a negative control in the autoactivation test.
Firstly, S. cerevisiae Y2HGold cells were divided into two groups: one group was transformed with pGBKT7-53 (Figure 2.4a), while the other group was transformed with the pGBKT7-Lam vector (Figure 2.4b). S. cerevisiae Y187 cells were transformed with the pGADT7-T vector (Figure 2.4c). All these transformations were performed using Yeastmaker
Yeast Transformation System 2 according to its protocol. The resultant Y2HGold
[pGBKT7-53] and Y2HGold [pGBKT7-Lam] cells were screened on SD/-Trp agar plates, while Y187 [pGADT7-T] cells were screened on SD/-Leu agar plates.
Within yeast cells, pGBKT7-53 expressed the GAL4 BD-fused p53 protein; pGBKT7-Lam expressed the GAL4 BD-fused lamin, while pGADT7-T expressed the GAL4
AD-fused T antigen. pGBKT7-53 was used for a positive control because the p53 protein expressed by pGBKT7-53 interacts with the T antigen (which is expressed by pGADT7-T), while pGBKT7-Lam was used for a negative control because lamin expressed by pGBKT7-Lam cannot interact with the T antigen. Thus, the diploid yeast cells containing pGBKT7-53 and pGADT7-T are the positive control, while the diploid yeast cells containing pGBKT7-Lam and pGADT7-T are the negative control in the autoactivation test.
Figure 2.4. Vectors used for the autoactivation assay. a) pGBKT7-53 Vector picture.
Matchmaker® Gold Yeast Two-Hybrid System User Manual, [PT4084-1], ©2010 Clontech
Laboratories, Inc. (n/k/a Takara Bio USA, Inc.). This vector contains a built-in murine p53 insert instead of the bait gene. b) pGBKT7-Lam DNA-BD Control Vector picture.
Matchmaker™ Library Construction & Screening Kits User Manual, [PT3955-1], ©2007
Clontech Laboratories, Inc. (n/k/a Takara Bio USA, Inc.). This vector contains a human lamin
C-encoding gene instead of bait gene. c) pGADT7-T Vector picture. Matchmaker® Gold
Yeast Two-Hybrid System User Manual, [PT4084-1], ©2010 Clontech Laboratories, Inc.
(n/k/a Takara Bio USA, Inc.). This vector contains an SV40 large T-antigen gene. p53 and
T-antigen can interact with each other, while lamin C and T-antigen cannot, so the diploid
yeast cells (formed by Y2H mating) that contain both pGBKT7-53 and pGADT7-T
function as the positive control, while the diploid yeast cells that contain both pGBKT7-Lam and pGADT7-T function as the negative control.
For mating experiments, when the colony diameters of yeast cells reached
2-3 mm on SD/-Trp and SD/-Leu plates, mating was performed by co-culturing the
Y2HGold [pGBKT7-53] or Y2HGold [pGBKT7-Lam] cells with Y187 [pGADT7-T] cells in 500
μL of 2X YPDA with shaking at 200 rpm at 30°C overnight.
V Bait Toxicity Assay
To construct the control group for the toxicity test, the S. cerevisiae Y2HGold cells were directly transformed with pGBKT7 (BD) vectors without any insert and then were plated on SDO agar plates for screening. Both the test group and empty vector group were plated separately on SDO agar plates and incubated at 30 °C for five days. The growth of yeast cells and/or the size of the colonies were used as the indicator for toxicity. If Ps1365 is toxic for yeast, there would be no colonies on the test group SDO plates, or the colony diameters of yeast cells containing Ps1365 gene would be significantly smaller than that of empty vector group. If there is no obvious difference between the colony diameters of empty vector group and that of test group on SDO plates, it can be confirmed that Ps1365 is not toxic for yeast.
VI Prey Library Construction
Total RNAs were extracted from P. sojae mycelia using QIAGEN RNeasy Plant Mini
Kit (Qiagen, Redwood City, CA) according to its protocol. At first, 10 μL of
β-mercaptoethanol (β-ME) was added to each 1 mL of Buffer RLT (from QIAGEN Plant
Mini Kit) and mixed. Tubes of Phytophthora sojae mycelium powder were quickly taken out from -80°C and immediately put into liquid nitrogen. Then, the tubes were taken out from liquid nitrogen and 450 μL of Buffer RLT (containing β-ME) was immediately added to each of the tubes and vortexed vigorously to homogenize the sample. The lysates were filtered through QIAshredder spin columns by centrifuging at 14,000 rpm for 2 min. Next, 0.5 volumes of RNase-free ethanol was added to the supernatant of each flow-through, and each was immediately filtered through an RNeasy spin column by centrifuging at 11,000 rpm for 1 min. The flow-throughs were discarded. The filter membrane of each RNeasy spin column was washed with 700 μL Buffer RW1 through centrifugation at
11,000 rpm for 15 sec. Each spin column membrane was washed again with 500 μL Buffer
RPE by centrifugation at 11,000 rpm for 15 sec, then washed once again with 500 μL Buffer
RPE by centrifugation at 11,000 rpm for 2 min. All traces of Buffer RPE were removed by centrifuging once again at 14,000 rpm for 1 min. The RNAs were eluted by passing 50 μL RNase-free water through each column membrane by centrifugation at 11,000 rpm for 1 min. This step was repeated using the resulting eluate.
Both the first-strand cDNA synthesis and the long-distance PCR (LD-PCR) were performed according to the protocol of Clontech Make Your Own “Mate & Plate™” Library
System (Clontech Laboratories, Inc., Mountain View, CA) using the primers listed in Table 2.1. For the first-strand cDNA synthesis, 3 μL of template RNA (between 450 ng and 560 ng) and 1 μL of 10 μM CDSIII or CDSIII/6 were combined, incubated at 72°C for 2 min, cooled on ice for 3 min, and then centrifuged at 14,000 rpm at 4°C for 10 sec. To each of the reactions above, 5 μL of the following mixture was added:
5× First-strand Buffer 2 μL
10 mM DTT (dithiothreitol) 1 μL
10 mM dNTP Mix (10 mM of each) 1 μL
SMART MMLV Reverse Transcriptase (200 U/μL) 1 μL
The reactions were incubated at 42°C for 10 min (CDSIII/6-primed reactions were incubated at 25°C for 10 min before this step). To each tube 1 μL 10 μM SMARTIII-modified oligo primer was added, mixed and incubated at 42°C for one hour, then placed at 75°C for
10 min to terminate the first-strand synthesis. Then, the reactions were cooled to 20°C, and 1
μL RNase H (2 units) was added to each reaction, incubated at 37°C for 20 min. Finally, the
reactions were stored at -20°C.
To amplify first-strand cDNAs through LD-PCR, 2 μL of first-strand cDNA was taken
for each reaction. A mixture was prepared using the following reagents in the shown volume:
sterile dH2O 368 μL
10× Advantage 2 PCR Buffer 80 μL
10 mM dNTPs (10 mM of each) 16 μL
5’ PCR primer (10 μM) (Table 2.1) 16 μL
3’ PCR primer (10 μM) (Table 2.1) 16 μL
5M Betaine Solution 272 μL
50× Advantage 2 Polymerase Mix 16 μL
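As a consistency check on the mixture above, the sketch below works out how many reactions it covers and the per-reaction composition, assuming (as stated in this section) that 98 μL of the mixture is combined with 2 μL of first-strand cDNA to give a 100 μL reaction. The Python code is illustrative only and the variable names are hypothetical.

```python
# Volumes (uL) of each reagent in the LD-PCR mixture
MASTER_MIX_UL = {
    "sterile dH2O": 368,
    "10x Advantage 2 PCR Buffer": 80,
    "10 mM dNTPs": 16,
    "5' PCR primer (10 uM)": 16,
    "3' PCR primer (10 uM)": 16,
    "5 M betaine": 272,
    "50x Advantage 2 Polymerase Mix": 16,
}
MIX_PER_REACTION_UL = 98   # mixture added to each reaction
CDNA_PER_REACTION_UL = 2   # first-strand cDNA per reaction

total_mix_ul = sum(MASTER_MIX_UL.values())
n_reactions = total_mix_ul // MIX_PER_REACTION_UL
reaction_ul = MIX_PER_REACTION_UL + CDNA_PER_REACTION_UL

# Volume of each reagent that ends up in a single reaction
per_reaction_ul = {name: vol / n_reactions for name, vol in MASTER_MIX_UL.items()}

# Final concentrations in the 100 uL reaction (stock conc x volume fraction)
buffer_fold = 10 * per_reaction_ul["10x Advantage 2 PCR Buffer"] / reaction_ul
dntp_mm = 10 * per_reaction_ul["10 mM dNTPs"] / reaction_ul
primer_um = 10 * per_reaction_ul["5' PCR primer (10 uM)"] / reaction_ul
betaine_m = 5 * per_reaction_ul["5 M betaine"] / reaction_ul

print(f"total mixture: {total_mix_ul} uL, enough for {n_reactions} reactions")
print(f"buffer: {buffer_fold:g}x, dNTPs: {dntp_mm:g} mM each, "
      f"primers: {primer_um:g} uM each, betaine: {betaine_m:g} M")
```

Under these assumptions, the listed volumes correspond to eight 100 μL reactions.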
The components of the 10× Advantage 2 PCR Buffer used are as follows:
Tricine-KOH (pH 8.7 at 25°C) 400 mM
KOAc 150 mM
Mg(OAc)2 35 mM
BSA 37.5 µg/ml
Tween 0.05 %
Nonidet-P40 0.05 %
The 50× Advantage 2 Polymerase Mix used contained Taq polymerase, and its buffer composition is as follows:
Glycerol 50%
Tris-HCl (pH 8.0) 15 mM
KCl 75 mM
EDTA 0.05 mM
To each 2 μL of first-strand cDNA, 98 μL of the mixture above was added, and the
following PCR program was run:
95°C 30 sec
22 cycles of:
    95°C 10 sec
    68°C 6 min (increased by 5 sec in each subsequent cycle)
68°C 5 min
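Because the extension step lengthens with every cycle, the total programmed hold time is not simply 22 times the first cycle. The Python sketch below estimates it, assuming that the first cycle uses a 6 min extension, that each later cycle adds 5 sec to the previous one, and that ramping time between temperatures is ignored. The function name is hypothetical.

```python
def ld_pcr_hold_time_s(cycles=22, initial_denaturation_s=30, denaturation_s=10,
                       extension_s=360, increment_s=5, final_extension_s=300):
    """Total programmed hold time in seconds (temperature ramping excluded)."""
    cycling_s = sum(
        denaturation_s + extension_s + increment_s * cycle_index
        for cycle_index in range(cycles)  # cycle_index 0 = first cycle
    )
    return initial_denaturation_s + cycling_s + final_extension_s


total_s = ld_pcr_hold_time_s()
last_extension_s = 360 + 5 * (22 - 1)  # extension time in the final cycle
print(f"total hold time: {total_s} s ({total_s / 60:.1f} min)")
print(f"extension in the last cycle: {last_extension_s} s")
```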
The LD-PCR products were purified by ethanol precipitation. For this, 1/10 volume of
3M sodium acetate (pH 5.2) was added to each LD-PCR reaction, then mixed by gently
pipetting. 2.5 volumes (the volume of LD-PCR reaction + sodium acetate) of cold ethanol
(stored at -20°C) was added to the mixture, then mixed by gently pipetting. The reactions
were incubated at -20°C overnight. Then, the reactions were taken out from the freezer and
centrifuged at 14,000 rpm for 20 min at room temperature. Supernatants were discarded and
the precipitates were resuspended by adding 1 mL 70% ethanol to each reaction and mixing by gentle pipetting, then centrifuged at 14,000 rpm for 2 min at room temperature, and the supernatants were carefully decanted. The remaining pellets were centrifuged once again at
14,000 rpm for 2 min, and the supernatants were carefully decanted. Finally, 21 μL of dH2O (pH
8.11) was added to each reaction to rehydrate the DNAs.
The library stocks were then made according to Clontech Make Your Own “Mate &
Plate™” Library System User Manual (Clontech Laboratories, Inc., Mountain View, CA). For making library stocks, SmaI-linearized pGADT7-Rec vectors and cDNAs were co-transformed into S. cerevisiae Y187 cells using Yeastmaker Yeast Transformation System
2 (Clontech Laboratories, Inc., Mountain View, CA) according to its protocol. The resultant transformants were incubated on SD/-Leu agar plates (with 50 μg/ml kanamycin) at 30°C for
5 days. The ligation process was automatically completed within S. cerevisiae Y187 cells.
The colonies were collected to produce glycerol stocks and stored at -80 °C.
VII Yeast Two-hybrid Screening
The Y2H was performed using Matchmaker® Gold Yeast Two-hybrid System
(Clontech Laboratories, Inc., Mountain View, CA) according to its protocol as follows:
1) A fresh colony (2-3 mm in diameter) of the bait yeast construct was
cultured at 30°C in 50 mL SD/-Trp broth with shaking at 260 rpm until OD600 reached 0.781.
2) The cells were pelleted by centrifugation at 1000 g for 5 min. Cell pellets were
resuspended in 4 mL of SD/-Trp broth to a cell density of 1.16 × 10^8 cells/mL.
3) Next, 4 mL of the bait strain was mixed with two tubes of P. sojae mycelium prey
library (1 mL each) in a 2 L flask. Then 45 mL of 2X YPDA broth (containing 50 µg/mL
kanamycin) was added to the mixture and incubated with shaking at 50 rpm at 30°C for 20 h.
Microscopic observation was used to confirm the mating of yeast cells. If mating had not occurred, incubation was continued.
4) Once three-lobed structures indicating successful mating were observed, the culture was centrifuged at 1000 g for 10 min to pellet the cells. The supernatant was discarded and cell pellets were resuspended in 10 mL 0.5X YPDA (with
50 µg/mL kanamycin), then were plated on SD/-Trp, SD/-Leu and SD/–Leu/–Trp plates respectively with 1/10, 1/100, 1/1000 and 1/10000 dilutions (100 µL to each plate).
5) The remaining culture was plated on 150 mm DDO/X/A agar plates (200 µL to each plate).
6) All plates were incubated at 30°C for five days.
7) Numbers of independent colonies on SDO and DDO plates were counted, and then the number of screened diploid colonies (formed by mating) and mating efficiency were calculated.
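A minimal sketch of how such values might be derived from plate counts is given below. It assumes the commonly used calculation in which viable cells per mL are estimated from colonies, dilution and plated volume; mating efficiency is the diploid titer as a percentage of the titer of the limiting partner; and the number of screened clones is the diploid titer multiplied by the resuspension volume. The colony counts are hypothetical placeholders, not data from this study; the 100 µL plated volume and 10 mL resuspension volume are taken from step 4 above. Function names are hypothetical.

```python
def cfu_per_ml(colonies, dilution_factor, plated_ml=0.1):
    """Viable titer from one plate; dilution_factor is 1000 for a 1/1000 dilution."""
    return colonies * dilution_factor / plated_ml


def mating_summary(bait_cfu_ml, prey_cfu_ml, diploid_cfu_ml, resuspension_ml=10.0):
    """Mating efficiency (%) relative to the limiting partner, and clones screened."""
    limiting_cfu_ml = min(bait_cfu_ml, prey_cfu_ml)
    return {
        "mating_efficiency_percent": 100.0 * diploid_cfu_ml / limiting_cfu_ml,
        "screened_clones": diploid_cfu_ml * resuspension_ml,
    }


# Hypothetical colony counts (NOT measured values)
bait_titer = cfu_per_ml(colonies=200, dilution_factor=10_000)    # SD/-Trp plate
prey_titer = cfu_per_ml(colonies=50, dilution_factor=10_000)     # SD/-Leu plate
diploid_titer = cfu_per_ml(colonies=100, dilution_factor=1_000)  # SD/-Leu/-Trp plate

summary = mating_summary(bait_titer, prey_titer, diploid_titer)
print(summary)
```

In this sketch the partner with the lower viable titer is treated as limiting, so the efficiency reflects the fraction of that partner that formed diploids.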
VIII Prey Colony Insert-checking
The blue diploid yeast colonies indicating possible positive reactions were randomly analyzed for the potential prey protein gene inserts by using Clontech Insert-check PCR Mix
2 (Clontech Laboratories, Inc., Mountain View, CA) according to its protocol as follows:
1) Each blue yeast colony was picked and suspended in 20 μl of 20 mM NaOH.
2) Incubated the yeast-NaOH suspension at 95°C for 45 minutes to break down
yeast cell walls.
3) The yeast cell debris was pelleted by centrifugation at 20,817 g for 10 min. Then,
8 μL nano-pure water, 2 μL of the supernatant, and 10 μL of Matchmaker Insert-check PCR
Mix 2 (PCR polymerase, dNTPs, primers, and buffer) were combined for each yeast colony.
Then, the following insert-check PCR program was run:
94°C 1 min
30 cycles of:
    98°C 10 sec
    68°C 3 min
IX Plasmid Rescue
Plasmid extraction was performed on the single blue yeast colonies on DDO/X/A agar
plates using Zymoprep™ Yeast Plasmid Miniprep II Kit (Zymo Research Corp, Irvine, CA)
according to its protocol. For this, each yeast colony was dispensed into 200 μL of Solution 1,
then 5 μL of Zymolyase (5 Units/μL) was added to each reaction. The reactions were
incubated at 37°C for 60 min, then 200 μL of Solution 2 was added to each reaction, after that,
400 μL of Solution 3 was added (to each reaction). The reactions were centrifuged at 20,817
g for 3 min. The supernatants were transferred to Zymo-Spin-1 columns, then centrifuged at
20,817 g for 30 sec, and the flow-throughs were discarded. The spin-column membranes were washed using 550 μL Wash Buffer by centrifugation at 20,817 g for 2 min. Finally, plasmids were eluted from the spin-column membranes using 10 μL of Zyppy Elution Buffer by centrifugation at 20,817 g for 1 min. Then, the plasmids from each individual colony were transformed into One Shot® Mach1™-T1R and One Shot® TOP10 Chemically Competent E. coli cells (Thermo Fisher Scientific Inc., Waltham, MA) according to the manufacturer’s manual. For the transformation, the frozen Mach1™-T1R and One Shot® TOP10 cells were first taken out from -80°C and immediately placed on ice to thaw. Then, 5 μL of the plasmid extraction from yeast was added to each tube of thawed competent cells and mixed by gentle tapping. The reactions were incubated on ice for 30 min, then heat-shocked in a 42°C water bath for 30 sec. After that, the reactions were incubated on ice for 2 min. Next, 250 μL of room-temperature SOC media was added to each tube, which was then incubated at 37°C with shaking at 225 rpm for one hour. One tube of transformed E. coli cells was plated on each
LB agar plate with 100 μg/mL ampicillin and incubated at 37°C overnight to make primary E.
coli plates. The resultant individual colonies were transferred onto new LB agar plates with
100 μg/ml ampicillin to make secondary E. coli plates. Two secondary E. coli plates and one primary E. coli plate were sent to University of Chicago Comprehensive Cancer Center DNA
Sequencing & Genotyping Facility (UCCCC-DSF), Illinois for high-throughput plasmid extraction and Sanger sequencing.
Results
Aim 1) Analysis of the Bait Sequence, Ps1365
Bait Sequence Analysis
Rutter et al. (2012) identified a potential novel family of transcription factors in P.
sojae. They reported that the upstream regions of 20% of the P. sojae genes contain a
common motif “GCCGCC”. Their study used yeast one-hybrid analysis (Y1H) to identify
proteins that bind to the “GCCGCC” motif using P. sojae cDNA libraries from both zoospore
and mycelia. They identified 31 proteins that were able to bind to the GCCGCC motif and
found that each of these proteins contains a conserved region of 50 amino acids. Using
BLAST, they identified six P. infestans proteins, and a protein from P. ramorum, that are orthologous proteins of Ps1365.
Through re-analysis of Ps1365 in the genome of P. sojae, out of 31 genes previously
identified, 22 were predicted to be hypothetical proteins, containing complete open reading
frames. Nine proteins were identified as pseudogenes. The following six proteins were
considered to be pseudogene proteins because they have abnormal deletions:
PHYSODRAFT_342634-t26_1
PHYSODRAFT_288539-t26_1
PHYSODRAFT_283985-t26_1
PHYSODRAFT_320623-t26_1
PHYSODRAFT_534655-t26_1
PHYSODRAFT_342550-t26_1
The following six proteins were removed as pseudogenes because of their abnormal insertions:
PHYSODRAFT_288539-t26_1
PHYSODRAFT_283985-t26_1
PHYSODRAFT_288533-t26_1
PHYSODRAFT_289262-t26_1
PHYSODRAFT_342550-t26_1
PHYSODRAFT_257212-t26_1
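The counts behind the two lists can be checked with a short Python sketch. It uses only the identifiers listed above (the "-t26_1" suffix is dropped for brevity); the variable names are illustrative and not part of the original analysis. Because three identifiers occur in both lists, the union contains nine pseudogenes, which leaves 22 of the 31 candidates.

```python
# Identifiers copied from the two lists above ("-t26_1" suffix omitted).
with_deletions = {
    "PHYSODRAFT_342634", "PHYSODRAFT_288539", "PHYSODRAFT_283985",
    "PHYSODRAFT_320623", "PHYSODRAFT_534655", "PHYSODRAFT_342550",
}
with_insertions = {
    "PHYSODRAFT_288539", "PHYSODRAFT_283985", "PHYSODRAFT_288533",
    "PHYSODRAFT_289262", "PHYSODRAFT_342550", "PHYSODRAFT_257212",
}

TOTAL_CANDIDATES = 31  # proteins reported by the yeast one-hybrid screen

pseudogenes = with_deletions | with_insertions   # flagged in either list
in_both = with_deletions & with_insertions       # flagged in both lists
remaining = TOTAL_CANDIDATES - len(pseudogenes)  # complete open reading frames

print(len(pseudogenes), "pseudogenes,", len(in_both), "in both lists,",
      remaining, "remaining")
```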
Among the remaining 22 hypothetical proteins, Ps1365 was selected for further characterization because it encodes a small protein and has no introns, which makes it convenient for cloning and manipulation. Additionally, Ps1365 is expressed in various important stages, including mycelium, zoospores and during infection. The partial amino acid sequence of Ps1365 (Figure 2.5) was identified and provided by Rutter (2012).
NVRGRTRDGKFIYATTMKGNLEPKKKGNASEVKEESQTIGSQAKKRAKIESNPGVSS
VSSVSEEDPVSTICLRLLNRVTYPFLSEDVASILSNMVGSQSDPLGLVEGDLDYRDEK
KK
Figure 2.5. Partial amino acid sequence of Ps1365. This partial protein sequence was
identified by Rutter (2012). The region labeled red is a conserved region which is shared by a
family of 22 proteins in P. sojae.
The full-length sequence of Ps1365 was retrieved from NCBI via a BLAST search (identical sequences were found in the JGI and FungiDB databases) and is shown in Figure 2.6. The coding sequence is 471 nt long, contains no introns, and encodes a putative small protein of 156 amino acids (Figure 2.6). Ps1365 is associated with accession numbers XP_009539397.1 (NCBI), PHYSODRAFT_342624-t26_1 (FungiDB) and PHYSODRAFT_342624 (JGI version 3). The gene is located at positions 572879 – 573349 on the forward strand of the P. sojae genome (the P. sojae genome sequence can be downloaded from https://fungidb.org/fungidb/showQuestion.do?questionFullName=GenomicSequenceQuestions.SequencesByTaxon). According to FungiDB, the product of PHYSODRAFT_342624-t26_1 is a hypothetical protein of 156 amino acids with a molecular weight of 17,424 Da and an isoelectric point of 8.39.
A) Nucleotide Sequence of 1365 (XP_009539397.1, PHYSODRAFT_342624)
ATGAATGTGCGCGGTAGAACGCGTGACGGCAAGTTCATCTATGCCACGACAATGAAAGACAATCTT
AAGCCCAAGAAGAAAGGCAACGCCTCTGAGGTGAAGGAGGAGAGTCAGACGATCGGGAGTCAG
GCGAAGAAGCGGGCGAAGATCGAGAGCAATCCTGGCGTCTCAAGTGTGTCGAGTGTCTCGGAGG
AGGATCCTGTTTTCACGATCTGCCTCCGTTTGCTCAACAGGGTCACCTACCCGTTCCTGAGCGAGG
ACGTGGCTTCCATTCTATCAAACATGGTGGAGTCTCAGTATGACCCGCTGGGGCTTGTGGAGGGCG
ACCTGGACTATCGCGATGAAGATGAAAAACGAAACAAGAACGTTAGTTCGTATGCAAAGTTGTCGC
TTACCGCGACGCGCGACAAGTTGTTGCAGAACGCCCGTAAAGCACTAGCGTACGCGCCGGGTGAC
AAGGAAGAAGAGTGA
B) Amino Acid Sequence of 1365 (XP_009539397.1, PHYSODRAFT_342624)
MNVRGRTRDGKFIYATTMKDNLKPKKKGNASEVKEESQTIGSQAKKRAKIESNPGVSSVSSVSEEDP
VFTICLRLLNRVTYPFLSEDVASILSNMVESQYDPLGLVEGDLDYRDEDEKRNKNVSSYAKLSLTATRDK
LLQNARKALAYAPGDKEEE*
Figure 2.6. Full-length nucleotide (A) and deduced amino acid sequences (B) of
Ps1365. The partial amino acid sequence of Ps1365 was used to query the NCBI, JGI and
FungiDB databases. BLAST results showed that Ps1365 is associated with accession numbers XP_009539397.1 (NCBI), PHYSODRAFT_342624-t26_1 (FungiDB), and
PHYSODRAFT_342624 (JGI version 3). Start and stop codons are underlined. An asterisk (*) indicates the stop codon in the amino acid sequence.
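The relationship between the 471-nt coding sequence and the 156-residue protein can be reproduced with a minimal Python sketch. It assumes the standard genetic code (NCBI translation table 1) and uses the nucleotide sequence exactly as printed in Figure 2.6A; the function and variable names are illustrative, and this is not the tool used to produce the figure.

```python
# Standard genetic code (NCBI translation table 1), built from the compact
# TCAG-ordered amino acid string. Assumed to apply to this nuclear gene.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Coding sequence copied from Figure 2.6A.
PS1365_CDS = (
    "ATGAATGTGCGCGGTAGAACGCGTGACGGCAAGTTCATCTATGCCACGACAATGAAAGACAATCTT"
    "AAGCCCAAGAAGAAAGGCAACGCCTCTGAGGTGAAGGAGGAGAGTCAGACGATCGGGAGTCAG"
    "GCGAAGAAGCGGGCGAAGATCGAGAGCAATCCTGGCGTCTCAAGTGTGTCGAGTGTCTCGGAGG"
    "AGGATCCTGTTTTCACGATCTGCCTCCGTTTGCTCAACAGGGTCACCTACCCGTTCCTGAGCGAGG"
    "ACGTGGCTTCCATTCTATCAAACATGGTGGAGTCTCAGTATGACCCGCTGGGGCTTGTGGAGGGCG"
    "ACCTGGACTATCGCGATGAAGATGAAAAACGAAACAAGAACGTTAGTTCGTATGCAAAGTTGTCGC"
    "TTACCGCGACGCGCGACAAGTTGTTGCAGAACGCCCGTAAAGCACTAGCGTACGCGCCGGGTGAC"
    "AAGGAAGAAGAGTGA"
)


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


protein = translate(PS1365_CDS)
print(len(PS1365_CDS), "nt ->", len(protein.rstrip("*")), "aa")
```

For the sequence in Figure 2.6A this reports 471 nt and 156 aa, with a single stop codon at the end of the reading frame.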
Hydrophobicity Analysis
The subcellular localization of Ps1365 was first examined with a Kyte-Doolittle hydropathy plot (Kyte & Doolittle, 1982). With a window size of 9, most of Ps1365 was hydrophilic; only the region between amino acid residues 50 and 95 showed several possibly hydrophobic stretches (Figure 2.7). This indicates that Ps1365 is probably not a membrane protein but rather a globular protein.
Figure 2.7. The Kyte-Doolittle Hydropathy Plot analysis of Ps1365. The full-length
amino acid sequence of Ps1365 was analyzed with a window size of 9. The region between amino acid residues 50 and 95 showed possible hydrophobic stretches. The numbers on the horizontal axis represent the positions of the amino acid residues; the numbers on the vertical axis indicate the degree of hydrophobicity at each position.
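The sliding-window calculation behind a hydropathy plot can be sketched in a few lines of Python. The scale values are the published Kyte-Doolittle hydropathy indices; the function name and the short example fragment (the first 30 residues of Figure 2.6B) are illustrative, and the web tool used for Figure 2.7 may smooth or plot the values differently.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> list:
    """Return the mean hydropathy of every full window along the sequence.

    Positive values indicate hydrophobic stretches, negative values
    hydrophilic ones. A sequence of length n yields n - window + 1 values.
    """
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


# Example: first 30 residues of Ps1365 (Figure 2.6B), window size 9.
fragment = "MNVRGRTRDGKFIYATTMKDNLKPKKKGNA"
profile = hydropathy_profile(fragment, window=9)
print(len(profile), "windows; most hydrophobic window mean:", round(max(profile), 2))
```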
Signal Peptide and Conserved Domain Prediction
The secretory signal peptide of Ps1365 was analyzed with SignalP - Signal Peptide Prediction (Petersen, Brunak, von Heijne, & Nielsen, 2011) (http://sigpep.services.came.sbg.ac.at/signalblast.html). The C-score is the cleavage site score, which is highest at a cleavage site. Amino acid residues with higher S-scores are potentially part of a signal peptide, while residues with lower S-scores are potentially part of the mature protein. The Y-score is highest where the C-score is significantly elevated and the S-score drops abruptly, which marks a potential cleavage site. The results indicated that Ps1365 does not contain any recognized signal peptide (Figure 2.8). Additionally, the Pfam tool (pfam.xfam.org) did not identify any conserved domain in Ps1365, suggesting either that Ps1365 has no functional domains or that its domains are not yet represented in the database.
Figure 2.8. SignalP analysis of Ps1365. No cleavage site indicative of a recognizable signal peptide was detected. The x-axis shows the amino acid residues and their positions, while the y-axis shows the values of the three scores (C, S and Y), which together indicate the potential existence of a cleavage site.
Secondary Structure Prediction
Secondary structure prediction of Ps1365 with JPred 4 (Drozdetskiy, Cole, Procter, & Barton, 2015) (http://www.compbio.dundee.ac.uk/jpred/) indicated a structure containing either a helix-loop-helix (HLH) or a helix-turn-helix (HTH) motif (Figure 2.9). In the JPred output, Jnet is the final secondary structure prediction, jhmm is the JNet HMM profile prediction, and jpssm is the JNet PSIBLAST PSSM profile prediction. The presence of more than one α-helix in the Jnet prediction in Figure 2.9 implies the potential existence of an HLH or HTH domain in Ps1365.
Figure 2.9. Predicted secondary structure of Ps1365 by JPred 4, a secondary structure predictor server. “H” represents helical, “E” represents extended, “-” represents other types of structures. Jnet = Final secondary structure prediction for query, jhmm = JNet HMM profile prediction, jpssm = JNet PSIBLAST PSSM profile prediction.
Subcellular Localization Prediction
The potential subcellular localization of Ps1365 was predicted by analyzing its full-length putative amino acid sequence with CELLO (Yu, Lin, & Hwang, 2004; Yu, Chen, Lu, & Hwang, 2006). CELLO is a subcellular localization predictor that combines five classifiers based on amino acid composition, N-peptide composition, partitioned sequence composition, physico-chemical composition, and neighboring sequence composition. The prediction indicated that Ps1365 most likely localizes to the nucleus, with a reliability score of 4.306 (Figure 2.10).
Figure 2.10. Prediction of Ps1365 subcellular localization by CELLO. The score for nuclear localization (4.306) was markedly higher than that for any other subcellular location, implying that Ps1365 might be a nuclear protein. The values next to the subcellular locations/organelles are the reliability scores for localization of the protein to that location/organelle. Each reliability score is the sum of the reliabilities of five classifiers: amino acid composition, N-peptide composition, partitioned sequence composition, physico-chemical composition, and neighboring sequence composition. The location/organelle with the highest score is reported as the potential localization of the protein.
In Silico Gene Expression Analysis
The transcription levels of the Ps1365-encoding gene were obtained directly from the transcriptome data deposited in FungiDB (Figure 2.11). Transcription of Ps1365 was relatively higher during the mycelium and cyst stages than during the infection stage. The highest level of transcription occurred during mycelial growth (transcription of the Ps1365 gene increased following maturation).
Figure 2.11. In silico gene expression analysis of Ps1365 in three different developmental stages of P. sojae, mycelium, cyst and infection (Tyler 2014, FungiDB: fungidb.org).
Phylogenetic Analysis
Phylogenetic analyses were performed on Ps1365 and its 21 homologous proteins from P. sojae, as well as several related proteins from other oomycete species (Figure 2.12). The 22 P. sojae protein sequences were provided by Brian Rutter; the protein sequences from other oomycetes were obtained by BLAST searches in FungiDB. The sequences were aligned using ClustalW as embedded in MEGA7. All 22 P. sojae proteins formed a distinct protein subfamily with a bootstrap value of 93. The first divergence among the 22 P. sojae proteins was supported by a bootstrap value of 98, but the analysis failed to resolve any subsequent divergence events within the two larger subgroups. The phylogenetic tree indicated a closer relationship between the four P. infestans proteins and the P. sojae proteins, with a bootstrap value of 93. Although the tree showed the P. sojae-P. infestans group (22 proteins from P. sojae, four proteins from P. infestans) and the other-oomycete group (two proteins from P. infestans, one protein from P. ramorum, one protein from Pythium vexans) as two major branches, the relationship between the two major branches was not supported by a significant bootstrap value.
Figure 2.12. Phylogenetic divergence of the 22 proteins belonging to a novel protein family in P. sojae, including Ps1365. The evolutionary history was inferred using the Maximum Likelihood method, based on the JTT matrix-based model (Jones, Taylor, & Thornton, 1992). The tree with the best ln likelihood (-488.30) is shown. The percentage of trees in which the associated taxa clustered together in bootstrap analyses is shown next to the branches. The initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Joining and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and then selecting the topology with the superior ln likelihood value. The tree is drawn to scale, with branch lengths measured in the number of substitutions per site (scale at lower left). The analysis involved 30 amino acid sequences. All positions containing gaps and missing data were eliminated. There were a total of 41 positions in the final dataset.
Evolutionary analyses were conducted in MEGA7 (Kumar, Stecher, & Tamura, 2016).
Bootstrap values ≥ 50 % (1000 replicates) are given at the branching points. Ps1365 was labeled as jgi|Physo3|342624|gm1.22199 g.
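Two steps named in the caption, the elimination of all positions containing gaps or missing data and the bootstrap resampling of alignment positions, can be illustrated with a minimal Python sketch. The toy alignment, sequence names and function names are hypothetical, and the code only mirrors the general idea; it is not MEGA7's implementation and does not build trees.

```python
import random


def complete_deletion(alignment: dict) -> dict:
    """Drop every column in which any sequence has a gap ('-') or missing data ('?')."""
    rows = list(alignment.values())
    keep = [
        col for col in range(len(rows[0]))
        if all(row[col] not in "-?" for row in rows)
    ]
    return {name: "".join(seq[col] for col in keep) for name, seq in alignment.items()}


def bootstrap_replicate(alignment: dict, rng: random.Random) -> dict:
    """Resample alignment columns with replacement (one bootstrap pseudo-replicate)."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[col] for col in columns) for name, seq in alignment.items()}


# Hypothetical toy alignment; the real analysis used 30 sequences and 41 positions.
toy = {"seqA": "MK-LV?A", "seqB": "MKTLVSA", "seqC": "MR-LVSA"}
cleaned = complete_deletion(toy)
replicate = bootstrap_replicate(cleaned, random.Random(0))
print(cleaned)
print(replicate)
```

In a full analysis, a tree would be inferred from each of the 1000 replicates, and the bootstrap value of a branch is the percentage of replicate trees that contain it.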
Aim 2) Cloning the Bait Sequence, Ps1365 into Yeast, Saccharomyces cerevisiae Y2HGold
Bait Vector Construction
All three PCR reactions, run at three different annealing temperatures (65°C, 66°C and 67°C), produced bands of about 500 bp, the expected size of the Ps1365-encoding gene plus the two sticky ends (Figure 2.13). The PCR products were purified using the QIAGEN MinElute PCR Purification Kit and then sent to DNA Analysis, LLC, Cincinnati, OH for sequencing.
Figure 2.13. Agarose gel electrophoresis analysis of Ps1365 gradient PCR products. The Ps1365-encoding gene was amplified from the P. sojae genome using Ps1365 gene-specific forward and reverse primers at three different annealing temperatures: 65°C, 66°C and 67°C. The annealing time was 30 sec. PCR products were subjected to electrophoresis on a 1% agarose gel in 1X TAE buffer. Lane “100 bps”: NEB (New England BioLabs, Ipswich, MA) 100 bp DNA ladder; the numbers on the figure are in “bp”. The temperatures on the figure show the annealing temperature of the PCR reaction in the corresponding lane.
The pGBKT7 (BD) vector was prepared for cloning by restriction digestion with EcoRI and/or BamHI at 37°C for one hour. The digestion products were electrophoresed on a 0.8% agarose gel. Both the single and the double digestions were successful: all digested vectors showed sharp, narrow bands, while the undigested vector showed a diffuse, wide and faint band (Figure 2.14). The Ps1365 insert was prepared for cloning by direct PCR amplification and subsequent PCR purification. The pGBKT7 (BD) vector and the Ps1365 insert were then combined in an In-Fusion homologous recombination reaction to insert Ps1365 into the vector, and TOP10 cells were transformed with the In-Fusion product. From the numerous colonies obtained, eight randomly selected single E. coli colonies were analyzed by colony PCR at an annealing temperature of 47°C, using the universal T7 promoter primer and a specifically designed T7 terminator primer. The PCR products were subjected to electrophoresis on a 1% agarose gel. Among the eight colonies, only colony #4 showed a band larger than approximately 500 bp, implying the possible presence of the Ps1365 gene insert.
Figure 2.14. Agarose gel electrophoresis analysis of the pGBKT7 (BD) vector after restriction digestion with EcoRI and/or BamHI at 37°C for one hour. “M” = 1 kb DNA ladder (NEB, MA); the numbers on the ladder are in “kb”. “-” is the circular pGBKT7 (BD) vector without any digestion, which served as the control; “E” is the pGBKT7 (BD) vector single-digested with EcoRI; “B” is the pGBKT7 (BD) vector single-digested with BamHI; “D” is the pGBKT7 (BD) vector double-digested with EcoRI and BamHI.
Numerous colonies were obtained after transforming E. coli cells with the vector pGBKT7-Ps1365. Eight single colonies were randomly selected, and colony PCR was performed using the universal T7 promoter primer and a specifically designed T7 terminator primer (Table 1).
The PCR products were subjected to electrophoresis on a 1% agarose gel. A colony carrying the Ps1365 gene insert was expected to yield a band of about 500 bp or larger. The expected band is longer than the Ps1365-encoding gene itself because the amplicon includes, in addition to the Ps1365 insert, the flanking 5’ and 3’ vector sequences (the vector sequence downstream of the T7 promoter site and upstream of the T7 terminator site). Such a band was observed for colony #4 (Figure 2.15), which implies the possible presence of the Ps1365 gene insert.
The plasmid was extracted from this colony and analyzed for the presence of the Ps1365-encoding gene by PCR. Two separate reactions were run: one with the T7 promoter primer and the specifically designed T7 terminator primer described above, and one with the Ps1365 gene-specific forward and reverse primers. Both reactions showed that the recombinant vector contained the Ps1365-encoding gene (Figure 2.16).
Figure 2.15. Agarose gel electrophoresis analysis of colony PCR reactions of
transformed TOP10 E. coli cells containing the vector pGBKT7-Ps1365. Among the eight randomly selected single colonies, only colony #4 showed a band larger than approximately 500 bp, implying the possible presence of the Ps1365 gene insert. “M” = 50 bp DNA Ladder (NEB, MA); the ladder band sizes are in “bp”. “W” is water (negative control); “V” is the pGBKT7 vector without any insert; “1”-“8” are the colony PCR reactions of the eight randomly selected single E. coli colonies obtained after transformation with the In-Fusion product of the pGBKT7 vector and the Ps1365 gene insert. Each lane represents one single colony.
Figure 2.16. Agarose gel electrophoresis of PCR products from E. coli transformant colony #4 containing pGBKT7-Ps1365. The purified plasmid was PCR-amplified using either T7 primers or Ps1365 gene-specific primers. “M” = 100 bp DNA ladder (NEB, MA); the numbers on the figure are in “bp”. “T7” marks PCR products amplified with the T7-PCR program; “G” marks PCR products amplified with the Ps1365-PCR program; “W” means water was used instead of the colony #4 plasmid template; “V” means empty pGBKT7 vector without any insert was used instead of the colony #4 plasmid template; “T” means the colony #4 plasmid was used as the template.
Analysis of Bait Construct
To confirm that the inserted Ps1365-encoding gene was in frame with the host vector pGBKT7, the plasmid was extracted from the E. coli colony and sequenced using the T7 promoter primer. The sequencing result was then aligned with the Ps1365-encoding gene sequence from the databases using Clustal Omega (Sievers et al., 2011). The alignment (Figure 2.17) showed that the sequence inserted into pGBKT7 (BD) was identical to the Ps1365 gene and in frame with the pGBKT7 (BD) start site. This implies that the Ps1365-encoding gene was successfully inserted into the vector pGBKT7 (BD).
Figure 2.17. Alignment of the Ps1365 gene sequence in plasmid pGBKT7-Ps1365 with the Ps1365 database gene sequence. The two sequences were aligned using Clustal Omega (Sievers et al., 2011) (http://www.ebi.ac.uk/Tools/msa/clustalo/). The Ps1365 gene sequence within the plasmid pGBKT7-Ps1365 matched the database Ps1365 gene sequence exactly and was in the correct reading frame of the vector pGBKT7. “Ps1365” is the Ps1365 gene sequence within the plasmid pGBKT7-Ps1365; “PHYSODRAFT_342624-t26_1” is the database Ps1365 gene sequence. A dash (-) means that the input sequence has no nucleotide at that position. An asterisk (*) means an exact match. Red boxes at the beginning indicate the codons, and thus the reading frame, of the pGBKT7 vector, while red boxes at the end show the stop codons of both the database Ps1365 gene sequence and the Ps1365 gene sequence within the plasmid pGBKT7-Ps1365.
Bait Transformation and Autoactivation Assay
The constructed pGBKT7-Ps1365 plasmid was then transformed into the yeast strain S. cerevisiae Y2HGold using the Clontech Yeastmaker Yeast Transformation System 2 according to its manual. Yeast transformants were incubated on SD/-Trp selective agar plates at 30°C. After five days of incubation, typical single yeast colonies with a diameter of 2-3 mm were obtained, as described in the manual.
To test for autoactivation, the MEL1 and AUR1-C genes were used as reporter genes in the Matchmaker Gold Yeast Two-Hybrid System. Positive control diploid cells were incubated on DDO/X and DDO/X/A media, while Y2HGold cells carrying pGBKT7-Ps1365 were incubated on SDO/X and SDO/X/A media. Incubation was conducted at 30°C for 5 days. When the AUR1-C gene is expressed, yeast can grow on agar plates containing Aureobasidin A, which is lethal to yeast cells when AUR1-C is silent. In addition, when the MEL1 gene is expressed, yeast cells convert X-α-Gal into a blue pigment, indole (Chevalier, Roy, & Savoie, 1991), so that the colonies appear blue. If the MEL1 gene is inactive, the colonies appear white. In the autoactivation test, Y2HGold [pGBKT7-Ps1365] failed to form blue colonies on SDO/X (Figure 2.18a) and SDO/X/A plates (Figure 2.18b), which implies that Ps1365 alone cannot activate the reporter genes MEL1 and AUR1-C. In contrast, the positive control group (constructed by mating, i.e. co-culturing, Y2HGold [pGBKT7-53] and Y187 [pGADT7-T] cells, which carry the true interaction between p53 and the T antigen) produced blue colonies on both DDO/X (Figure 2.18c) and DDO/X/A plates (Figure 2.18d), implying the activation of both MEL1 and AUR1-C. These results indicated that Ps1365 cannot activate the reporter genes in the yeast strain Y2HGold without the participation of prey proteins.
Figure 2.18. Autoactivation assay of Ps1365-encoding gene in S. cerevisiae Y2HGold.
The plates were incubated at 30°C for 5 days. a) Growth of Y2HGold cells including pGBKT7-Ps1365 on a SDO/X plate. b) Growth of Y2HGold cells including pGBKT7-Ps1365 on a SDO/X/A plate. c) Growth of positive control group yeast colonies on a DDO/X plate. d) Growth of positive control group yeast colonies on a DDO/X/A plate.
Positive control cells on both DDO/X and DDO/X/A media appeared blue as expected.
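The reasoning used to read the plates in Figure 2.18 can be written down as a small Python sketch. The function names are illustrative, and the observations are encoded from the text, with one labeled assumption: for the bait on SDO/X/A the text only states that no blue colonies formed, so "no growth" is assumed here.

```python
def reporters_activated(medium: str, grows: bool, blue: bool) -> dict:
    """Infer reporter status from a single plate.

    MEL1 is read from colony color (blue on X-alpha-Gal). AUR1-C can only be
    read on media that contain Aureobasidin A (medium names ending in '/A').
    """
    has_aureobasidin = medium.endswith("/A")
    return {
        "MEL1": grows and blue,
        "AUR1-C": grows if has_aureobasidin else None,  # None: not tested here
    }


def any_reporter_active(plates: list) -> bool:
    """True if at least one plate shows at least one active reporter."""
    return any(
        status is True
        for medium, grows, blue in plates
        for status in reporters_activated(medium, grows, blue).values()
    )


# (medium, colonies grow, colonies are blue)
bait_alone = [
    ("SDO/X", True, False),     # white colonies: MEL1 inactive
    ("SDO/X/A", False, False),  # ASSUMED no growth: AUR1-C inactive
]
positive_control = [
    ("DDO/X", True, True),
    ("DDO/X/A", True, True),
]

print("bait autoactivates:", any_reporter_active(bait_alone))
print("positive control activates reporters:", any_reporter_active(positive_control))
```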
Bait Toxicity Assay
In the toxicity test, a bait that is toxic to Y2HGold cells would impede or even completely stop their growth. No significant difference was observed between the colony diameters of the empty-vector group and the test group after five days of incubation at 30°C on SDO agar medium (Figure 2.19), indicating that Ps1365 is not toxic to the yeast strain Y2HGold and can be used for further analysis.
Figure 2.19. Toxicity assay of Ps1365-encoding gene in S. cerevisiae Y2HGold. a)
Y2HGold cells transformed with pGBKT7-Ps1365. b) Y2HGold cells transformed with pGBKT7, as control. No significant difference was found between the colony diameters of the two groups.
Aim 3) Identify Interactive Proteins of a Novel P. sojae Transcription Factor, Ps1365 Using Yeast Two-hybrid Analysis
Prey Library Construction
The construction of the P. sojae mycelium prey library started with the extraction of total RNA from P. sojae mycelium as described earlier; electrophoresis indicated that RNA was successfully extracted from P. sojae mycelia (Figure 2.20). SMART III Oligo, CDSIII and CDSIII/6 primers were used to synthesize the first-strand cDNAs, with total RNA extracted from P. sojae mycelium as the template. SMART III Oligo has homology with the 3’ end of the pGADT7-Rec vector, while the CDSIII and CDSIII/6 primers have homology with the 5’ end of the pGADT7-Rec vector. The CDSIII primer is the equivalent of an oligo-dT primer, while CDSIII/6 is the equivalent of a random primer. Two reactions were used for cDNA synthesis: one with the combination of SMART III and CDSIII primers, the other with the combination of SMART III and CDSIII/6 primers. Both the CDSIII-primed and the CDSIII/6-primed reactions yielded cDNA bands (Figure 2.21). The first-strand cDNAs were amplified via LD-PCR (Make Your Own “Mate & Plate” Library System, Mountain View, CA) using 5’ and 3’ PCR primers, and the resulting PCR products also produced the expected cDNA bands (Figure 2.22).
Figure 2.20. Agarose gel electrophoresis of total RNA extracted from P. sojae mycelium. Lane M = 1 kb ladder (NEB, Ipswich, MA); Lanes 1-4 are RNA samples extracted from four independent hyphal mats of P. sojae. Amount of RNA (ng/lane): Lane 1, 172.96; Lane 2, 39.69; Lane 3, 152.87; Lane 4, 185.51.
Figure 2.21. Agarose gel electrophoresis of P. sojae mycelium first-strand cDNAs synthesized using SMART III Oligo, CDSIII and CDSIII/6 primers. Lane “1 kb” = 1 kb ladder (NEB, Ipswich, MA); Lanes “RNA” are total RNAs extracted from P. sojae mycelium;
Lane “CDSIII” is CDSIII-primed cDNA synthesis; “CDSIII/6” is CDSIII/6-primed cDNA synthesis; Lane “100 bps” = 100 bp DNA Ladder (NEB, Ipswich, MA). Unit of the numbers on the figure is “kb”.
Figure 2.22. Agarose gel electrophoresis of LD-PCR amplicons of P. sojae mycelium
cDNAs synthesized using 5’ PCR and 3’ PCR primers. Lane “1 kb” = 1 kb ladder (NEB,
Ipswich, MA); Lane “(-)” is the negative control in which sterile dH2O was used instead of
DNA template; Lanes “CDSIII” are the LD-PCR reactions whose template is CDSIII-primed
first-strand cDNAs; Lanes “CDSIII/6” are the LD-PCR reactions whose template is
CDSIII/6-primed first-strand cDNAs. The unit of the numbers on the figure is “kb”.
Y2H Screening
Before conducting the Y2H screening, the bait yeast strain was recovered from
glycerol stocks and PCR was used to validate the presence of Ps1365. Agarose gel
electrophoresis showed the expected band of Ps1365 at 500 bp (Figure 2.23), which
confirmed the presence of bait construct pGBKT7-Ps1365 within Y2HGold cells.
To prepare the bait strain for Y2H mating, the bait cells were grown in SD/-Trp liquid medium and counted with a hemocytometer to ensure that the bait cell density exceeded 10⁸ cells/mL. In addition, the cell density of the P. sojae mycelium prey library was measured by library titration before the Y2H screening. The titration was performed in duplicate and showed that the cell density of both prey library tubes was higher than the minimum required by the manufacturer, Takara Bio, Inc. (2 × 10⁷ cfu/mL). Tube #1 had a cell density of 4.04 × 10⁷ cfu/mL and tube #2 of 4.34 × 10⁷ cfu/mL.
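A library titer of this kind is typically calculated from a plate count as colonies divided by the product of the plated volume and the dilution factor. The Python sketch below shows that arithmetic; the colony count, plated volume and dilution are hypothetical values chosen only so that the result matches the titer reported for tube #1, since the raw counts are not given here.

```python
MIN_LIBRARY_TITER = 2e7  # cfu/mL, minimum stated by the manufacturer


def titer_cfu_per_ml(colonies: int, plated_volume_ml: float, dilution_factor: float) -> float:
    """cfu/mL = colonies / (plated volume in mL x dilution factor).

    dilution_factor is the fraction of the original culture in the plated
    sample, e.g. 1e-4 for a 1:10,000 dilution.
    """
    return colonies / (plated_volume_ml * dilution_factor)


# Hypothetical plate count: 404 colonies from 0.1 mL of a 1:10,000 dilution.
titer = titer_cfu_per_ml(colonies=404, plated_volume_ml=0.1, dilution_factor=1e-4)
meets_minimum = titer >= MIN_LIBRARY_TITER
print(f"{titer:.2e} cfu/mL, meets minimum: {meets_minimum}")
```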
Figure 2.23. Agarose gel electrophoresis of bait colony PCR using Ps1365 gene-specific forward and reverse primers. Lane “100 bps” = 100 bp DNA Ladder (NEB,
Ipswich, MA); Lane “water” is the negative control group in which water was used instead of
the DNA template; Lane “empty Y2HGold cells” is the negative control group in which
Y2HGold cells without any vector transformed were used instead of
Y2HGold[pGBKT7-Ps1365]; Lane “pGBKT7-Ps1365” is the positive control group in which
the plasmid pGBKT7-Ps1365 was used as PCR template instead of cell lysate supernatants;
Lanes “Bait Yeast Colony 1” and “Bait Yeast Colony 2” are the respective colony PCR
reactions of the two tested bait yeast colonies.
To perform the Y2H mating, liquid cultures of the bait strain and the prey library were co-incubated for 24 h in 2X YPDA broth using the Clontech Matchmaker® Gold Yeast Two-Hybrid System according to its manual. The mating culture was then plated on DDO/X/A plates and incubated at 30°C for five days, which yielded 120 blue colonies. The blue colonies were analyzed by colony PCR using Matchmaker Insert Check PCR Mix 2, and subsequent agarose gel electrophoresis showed that the PCR amplicons were of different lengths (Figure 2.24). Because of the large number of blue colonies, batch colony PCR was performed by combining every 10 colonies into a single PCR reaction. All colony groups produced bands longer than 500 bp (Figure 2.25), indicating the insertion of prey DNA from P. sojae (the amplicon from the empty vector is about 300 bp); the Y2H screening was therefore considered successful.
Figure 2.24. Agarose gel electrophoresis of insert-check PCR of blue diploid colonies using Matchmaker Insert Check PCR Mix 2. a) Insert-check PCR of three randomly selected blue diploid colonies. b) Insert-check PCR of 11 randomly selected blue diploid colonies. c) Insert-check PCR of another 11 randomly selected blue diploid colonies. Lane “100 bps” = 100 bp DNA Ladder (NEB, Ipswich, MA); Lane “1 kb” = 1 kb ladder (NEB, Ipswich, MA); Lane “-” is the negative control in which nano-pure H2O was used instead of yeast lysate supernatant; Lane “+” is the positive control in which circular pGADT7-Rec vector was used instead of yeast lysate supernatant; Lanes “colonies” are the insert-check PCR reactions of blue diploid colonies, and each lane represents one single colony. The unit of the numbers on the figure is “bp”.
Figure 2.25. Agarose gel electrophoresis of blue diploid colonies batch insert-check
PCR reactions using Matchmaker Insert Check PCR Mix 2. Lane “100 bps” = 100 bp DNA
Ladder (NEB, Ipswich, MA); Lane “1 kb” = 1 kb ladder (NEB, Ipswich, MA); Lane “-” is the
negative control in which nano-pure H2O was used instead of yeast lysate supernatant; Lane
“+” is the positive control in which circular pGADT7-Rec vector was used instead of yeast lysate supernatant; Lanes “colonies” are the insert-check PCR reactions of blue diploid colonies and each lane represents at most ten colonies. The unit of the numbers on the figure is “bp”.
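The batch design described for Figure 2.25 (pools of up to ten colonies, scored against the roughly 300 bp amplicon of the empty vector) can be sketched in Python as follows. The function names and the example band sizes are hypothetical; only the pool size, the colony number and the empty-vector amplicon length come from the text.

```python
EMPTY_VECTOR_AMPLICON_BP = 300  # approximate insert-check amplicon without insert


def pool_colonies(colony_ids: list, pool_size: int = 10) -> list:
    """Split colony identifiers into consecutive pools of at most pool_size."""
    return [colony_ids[i:i + pool_size] for i in range(0, len(colony_ids), pool_size)]


def pool_has_insert(band_sizes_bp: list) -> bool:
    """A pool is scored positive if any band is longer than the empty-vector amplicon."""
    return any(size > EMPTY_VECTOR_AMPLICON_BP for size in band_sizes_bp)


pools = pool_colonies(list(range(1, 121)))  # the 120 blue colonies
print(len(pools), "pools of", len(pools[0]), "colonies")

# Hypothetical band sizes read from one lane of the gel.
print(pool_has_insert([650, 900, 1200]))
```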
Plasmid Rescue
The prey inserts from positive colonies were sequenced to identify the potential interactive proteins. To obtain sufficient plasmid for sequencing, plasmids were extracted from blue yeast colonies using the Zymoprep™ Yeast Plasmid Miniprep II Kit (Zymo Research, Irvine, CA). The extracted plasmids were transformed into Invitrogen Mach1-T1R and TOP10 competent E. coli cells. Numerous E. coli colonies grew on LB agar plates containing 100 μg/mL ampicillin. To confirm the presence of the plasmids in E. coli, plasmids were extracted from four randomly selected E. coli colonies using the Zyppy Plasmid Miniprep Kit according to its protocol. Gel electrophoresis (Figure 2.26) indicated that each of the E. coli colonies tested contained plasmids from the yeast.
Figure 2.26. Agarose gel electrophoresis of plasmids extracted from E. coli after transformation with prey plasmids. Lane M = 1 kb ladder (NEB, Ipswich, MA); Lanes “P1”-“P4” are the plasmid extractions from the four single E. coli colonies, respectively. The unit of the numbers on the figure is “kb”.
Following the successful initial test of those four colonies, additional group colony PCR (four single colonies per group) was performed. A total of five groups were tested using the Insert Check PCR Mix, following the PCR program in its manual. Electrophoresis showed that all five colony groups and the four plasmid extractions from E. coli yielded bands longer than 500 bp (Figure 2.27), indicating the presence of DNA inserts derived from P. sojae.
Figure 2.27. Agarose gel electrophoresis of colony PCR of E. coli transformants and direct PCR of plasmids from some E. coli colonies using Matchmaker Insert Check PCR Mix 2. Lane “100 bps” = 100 bp DNA Ladder (NEB, Ipswich, MA). Lane “1 kb” = 1 kb ladder (NEB, Ipswich, MA). Lane “water” is the negative control in which water was used instead of PCR templates. Lane “empty vector” is the positive control in which circular pGADT7-Rec vector was used as DNA template. Lanes “E. coli colonies” are the colony PCR reactions of E. coli colonies, and each lane represents four E. coli colonies. Lanes “plasmids extracted from E. coli” are the PCR reactions of four plasmid extractions from E. coli. The unit of the numbers on the figure is “kb”.
Discussion
Transcription factors (TFs) play important roles in gene regulation. Identifying TFs associated with pathogenicity and zoospore functions could provide new targets for the control of diseases caused by Phytophthora species. Ps1365 was identified as a member of a novel TF family in P. sojae based on a yeast one-hybrid assay (Rutter, 2012). In silico analysis of Ps1365 showed that it is a small protein of 156 aa with a mass of 17,424 Da and without a signal peptide or transmembrane domains. Further analysis revealed that Ps1365 may contain an HLH/HTH domain, indicating a potential function as a TF, since the HTH domain is known to bind DNA (reviewed in Brennan & Matthews, 1989). This partly supports my hypothesis that Ps1365 is a TF. No nuclear localization sequence (NLS) was detected in Ps1365 with the software currently available, which appears to contradict the hypothesis that Ps1365 is a TF that can enter the nucleus. However, considering that the molecular weight of Ps1365 is about 17.4 kDa and that the nuclear envelope permits passage of molecules up to 40-60 kDa (Zanta, Belguise-Valladier, & Behr, 1999), Ps1365 may be able to enter the nucleus without an NLS because of its relatively small size (Kalderon, Roberts, Richardson, & Smith, 1984). In addition, the available prediction tools are limited in their ability to predict the subcellular localization of proteins correctly (Min, 2010). Therefore, the absence of a predicted NLS is not enough to refute the hypothesis that Ps1365 is a TF. The protein domain/family-prediction software (e.g., Pfam) failed to return a match, which seemingly implies that Ps1365 does not have any known characteristic domain. However, the expansion of the domain databases lags behind that of the protein databases, current domain-prediction methods have their shortcomings (Ochoa, 2013), and studies of TFs in Phytophthora are still in their infancy. Therefore, it is possible that Ps1365 has one or more functional domains that have not yet been added to current databases.
Phylogenetic analysis of Ps1365 using MEGA7 indicated that all 22 proteins from P. sojae including Ps1365 form a protein family (Figure 2.12 – refer to the phylogram).
Compared with the similar proteins in other oomycetes, four proteins from P. infestans appear
to be evolutionarily close to the 22 P. sojae proteins. This implies that the divergence of the P.
sojae and P. infestans proteins occurred during and after the separation of P. sojae and P.
infestans.
The 22 P. sojae proteins are phylogenetically closer to each other than to other
oomycete proteins, indicating that these proteins evolved from a common ancestor, and the
emergence of the 22 proteins occurred after the divergence of P. sojae and P. infestans. In other words, the 22 P. sojae proteins are the products of gene duplications from an ancestral gene sequence within P. sojae, not the result of speciation from a common ancestral species with P. infestans or other oomycetes.
In this study, Ps1365 was used as a bait sequence in Y2H screening. Before conducting Y2H screening, Ps1365 was tested for autoactivation and toxicity. The autoactivation results indicated that the Ps1365 protein alone cannot activate the expression of the reporter genes AUR1-C and MEL1 in yeast. The toxicity test showed that Ps1365 does not deter the growth of Y2HGold cells, indicating that Ps1365 is not toxic to yeast cells. Taken together, these results show that Ps1365 can be used as a bait in the Matchmaker Gold Y2H System.
After the mating of the bait and prey strains, 120 blue colonies resulted, indicating the interaction of Ps1365 and the prey proteins from P. sojae. At the same time, white colonies also appeared, which may be because the GAL4-AD-prey fusion protein cannot bind with its
target upstream activating sequence (UAS) (Clontech Laboratories, Inc., 2013).
The number of blue colonies obtained in this study is comparable to previous reports. Wang et al. (2002) reported 65 potential positive colonies when they screened a human liver cDNA library for proteins interacting with hepatitis B virus X protein, while Shahheydari et al. (2014) reported 204 potential positive colonies when searching for interacting partners of tumor protein D52 in a human breast carcinoma cDNA library. The 120 positive colonies obtained in this experiment fall between the numbers reported in these two studies and are therefore consistent with them.
After Y2H screening, potential prey inserts were detected using insert-check PCR. The PCR amplicons showed bands longer than 500 bp, indicating that the Y2H screening was successful and confirming that the yeast cells contained recombinant plasmids with inserted genes from P. sojae. Amplicons with bands below 500 bp were excluded from the analysis because they might have come from transformants containing the empty pGADT7-Rec vector.
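The size-based exclusion described above can be sketched as a simple filter. The colony labels and band sizes below are hypothetical and serve only as an illustration; the 500 bp cutoff is the value stated in the text, and treating a band of exactly 500 bp as excluded is an assumption.

```python
# Illustrative sketch of the insert-check PCR filter.
# Colony labels and band sizes are hypothetical, not data from this study.

MIN_INSERT_BP = 500  # cutoff stated in the text


def keep_candidates(amplicons_bp: dict, min_bp: int = MIN_INSERT_BP) -> list:
    """Return the labels of colonies whose amplicon is longer than the cutoff."""
    return sorted(name for name, size in amplicons_bp.items() if size > min_bp)


example = {"colony_01": 350, "colony_02": 900, "colony_03": 1200}
print(keep_candidates(example))
```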
References
Brennan, R. G., & Matthews, B. W. (1989). The helix-turn-helix DNA binding motif. The
Journal of Biological Chemistry, 264(4), 1903-1906.
Cheng, Q., Dong, L., Gao, T., Liu, T., Li, N., Wang, L., . . . Zhang, S. (2018). The bHLH
transcription factor GmPIB1 facilitates resistance to Phytophthora sojae in Glycine
max. Journal of Experimental Botany, 69(10), 2527-2541. doi:10.1093/jxb/ery103
Chevalier, P., Roy, D., & Savoie, L. (1991). X-α-gal-based medium for simultaneous
enumeration of bifidobacteria and lactic acid bacteria in milk.
doi:10.1016/0167-7012(91)90034-N
Coates, P., & Hall, P. (2003). The yeast two‐hybrid system for identifying protein–protein
interactions. The Journal of Pathology, 199(1), 4-7. doi:10.1002/path.1267
Drozdetskiy, A., Cole, C., Procter, J., & Barton, G. J. (2015). JPred4: A protein secondary
structure prediction server. Nucleic Acids Research, 43(W1), W389-W394.
doi:10.1093/nar/gkv332
Finn, R. D., Coggill, P., Eberhardt, R. Y., Eddy, S. R., Mistry, J., Mitchell, A. L., . . .
Bateman, A. (2016). The Pfam protein families database: Towards a more sustainable
future. Nucleic Acids Research, 44(D1), D279-D285. doi:10.1093/nar/gkv1344
Goujon, M., McWilliam, H., Li, W., Valentin, F., Squizzato, S., Paern, J., & Lopez, R. (2010).
A new bioinformatics analysis tools framework at EMBL-EBI. Nucleic Acids
Research, 38(Web Server issue), W695-W699. doi:10.1093/nar/gkq313
Jones, D. T., Taylor, W. R., & Thornton, J. M. (1992). The rapid generation of mutation data
matrices from protein sequences. Computer Applications in the Biosciences : CABIOS,
8(3), 275-282.
Kalderon, D., Roberts, B. L., Richardson, W. D., & Smith, A. E. (1984). A short amino acid
sequence able to specify nuclear location. Cell, 39(3), 499-509.
doi:10.1016/0092-8674(84)90457-4
Kumar, S., Stecher, G., & Tamura, K. (2016). MEGA7: Molecular evolutionary genetics
analysis version 7.0 for bigger datasets. Molecular Biology and Evolution, 33(7),
1870-1874. doi:10.1093/molbev/msw054
Kyte, J., & Doolittle, R. F. (1982). A simple method for displaying the hydropathic character
of a protein. Journal of Molecular Biology, 157(1), 105-132.
doi:10.1016/0022-2836(82)90515-0
Lamitina Lab. (2007). Ethanol precipitation of DNA. Unpublished manuscript. Retrieved
from
http://docs.wixstatic.com/ugd/803ab9_1cd1cb09279649b388391953899ae1f9.pdf
Li, S. (2010). Characterization of soybean GmPUB1 proteins that interact with the
Phytophthora sojae effector Avr1b protein (Master's thesis, Iowa State University)
Martinez-Duncker, I., Mollicone, R., Candelier, J., Breton, C., & Oriol, R. (2003). A new
superfamily of protein-O-fucosyltransferases, α2-fucosyltransferases,
and α6-fucosyltransferases: Phylogeny and identification of conserved peptide motifs.
Glycobiology, 13(12), 1-5. doi:10.1093/glycob/cwg113
McWilliam, H., Li, W., Uludag, M., Squizzato, S., Park, Y. M., Buso, N., . . . Lopez, R.
(2013). Analysis tool web services from the EMBL-EBI. Nucleic Acids Research,
41(Web Server issue), W597-W600. doi:10.1093/nar/gkt376
Min, X. J. (2010). Evaluation of computational methods for secreted protein prediction in
different eukaryotes. Journal of Proteomics & Bioinformatics, 3(4), 143-147.
doi:10.4172/jpb.1000133
Naveed, Z. A., Bibi, S., & Ali, G. S. (2019). The Phytophthora RXLR effector Avrblb2
modulates plant immunity by interfering with Ca2+ signaling pathway. Frontiers in
Plant Science, 10 doi:10.3389/fpls.2019.00374
Ochoa, A. (2013). Protein domain prediction using context statistics, the false discovery rate,
and comparative genomics, with application to Plasmodium falciparum (Doctoral
dissertation, Princeton University)
Ohad, N., & Yalovsky, S. (2010). Utilizing bimolecular fluorescence complementation (BiFC)
to assay protein-protein interaction in plants. Methods in Molecular Biology (Clifton,
N.J.), 655, 347-358.
Peng, H., Shan, W., Kuang, J., Lu, W., & Chen, J. (2013). Molecular characterization of
cold-responsive basic helix-loop-helix transcription factors MabHLHs that interact
with MaICE1 in banana fruit. Planta, 238(5), 937-953.
doi:10.1007/s00425-013-1944-7
Petersen, T. N., Brunak, S., von Heijne, G., & Nielsen, H. (2011). SignalP 4.0:
Discriminating signal peptides from transmembrane regions. Nature Methods, 8(10),
785-786. doi:10.1038/nmeth.1701
Phizicky, E. M., & Fields, S. (1995). Protein-protein interactions: Methods for detection and
analysis. Microbiological Reviews, 59(1), 94-123.
Rajagopala, S. V., Sikorski, P., Caufield, J. H., Tovchigrechko, A., & Uetz, P. (2012).
Studying protein complexes by the yeast two-hybrid system. Methods, 58(4), 392-399.
doi:10.1016/j.ymeth.2012.07.015
Rutter, B. D. (2012). Catch of the day: a yeast one-hybrid assay identifies a novel
DNA-binding domain in Phytophthora sojae (Master's thesis, Bowling Green State
University)
Shahheydari, H., Frost, S., Smith, B., Groblewski, G., Chen, Y., & Byrne, J. (2014).
Identification of PLP2 and RAB5C as novel TPD52 binding partners through yeast
two-hybrid screening. Molecular Biology Reports, 41(7), 4565-4572.
doi:10.1007/s11033-014-3327-y
Sievers, F., Wilm, A., Dineen, D., Gibson, T. J., Karplus, K., Li, W., . . . Higgins, D. G.
(2011). Fast, scalable generation of high‐quality protein multiple sequence
alignments using Clustal Omega. Molecular Systems Biology, 7(1), 539-n/a.
doi:10.1038/msb.2011.75
Stajich, J. E., Harris, T., Brunk, B. P., Brestelli, J., Fischer, S., Harb, O. S., . . . Roos, D. S.
(2012). FungiDB: An integrated functional genomics database for fungi. Nucleic Acids
Research, 40(D1), D675-D681. doi:10.1093/nar/gkr918
Wang, X. Z., Jiang, X. R., Chen, X. C., Chen, Z. X., Li, D., Lin, J. Y., & Tao, Q. M. (2002).
Seek protein which can interact with hepatitis B virus X protein from human liver
cDNA library by yeast two-hybrid system. World Journal of Gastroenterology, 8(1),
95-98. doi:10.3748/wjg.v8.i1.95
Wienken, C. J., Baaske, P., Rothbauer, U., Braun, D., & Duhr, S. (2010). Protein-binding
assays in biological liquids using microscale thermophoresis. Nature Communications,
1(7), 100. doi:10.1038/ncomms1093
Yan, H. W., Hong, L., Zhou, Y. Q., Jiang, H. Y., Zhu, S. W., Fan, J., & Cheng, B. J. (2013).
A genome-wide analysis of the ERF gene family in sorghum. Genetics and Molecular
Research : GMR, 12(2), 2038-2055. doi:10.4238/2013.May.13.1
Yu, C., Chen, Y., Lu, C., & Hwang, J. (2006). Prediction of protein subcellular localization.
Proteins: Structure, Function, and Bioinformatics, 64(3), 643-651.
doi:10.1002/prot.21018
Yu, C., Lin, C., & Hwang, J. (2004). Predicting subcellular localization of proteins for
gram‐negative bacteria by support vector machines based on n‐peptide
compositions. Protein Science, 13(5), 1402-1406. doi:10.1110/ps.03479604
Zanta, M. A., Belguise-Valladier, P., & Behr, J. (1999). Gene delivery: A single nuclear
localization signal peptide is sufficient to carry DNA to the cell nucleus. Proceedings
of the National Academy of Sciences of the United States of America, 96(1), 91-96.
doi:10.1073/pnas.96.1.91
CHAPTER 3. ANALYSIS OF THE POTENTIAL INTERACTOR PROTEINS
Introduction
I Bioinformatic Analysis of Interactor Proteins of Ps1365
Bioinformatic analysis allows sequences obtained from laboratory experiments to be analyzed efficiently in order to predict the possible structures and functions of genes and proteins. Genome and transcriptome databases can be used to assist and complement laboratory experiments. For example, Kaur, Kocher & Gupta (2012) cloned a potential alkaline protease gene from Bacillus circulans MTCC 7906 and analyzed the sequence using the Basic Local Alignment Search Tool (BLAST) at the National Center for Biotechnology Information (NCBI) website, followed by phylogenetic analysis of the alkaline protease gene. From these analyses, they concluded that the cloned sequence is indeed a novel alkaline protease from B. circulans MTCC 7906. Genome and transcriptome databases have also been used in studies of oomycetes. Tian et al. (2004) performed similarity and motif searches using public databases, including the GenBank nonredundant database, to analyze expressed sequence tags (ESTs) from tomato leaves infected with P. infestans. They discovered a potential member of the Kazal serine protease inhibitor family and named its gene epi1, which was later shown to have a protease-inhibiting function. The genome of P. sojae strain P6497 was sequenced by the United States Department of Energy Joint Genome Institute using a whole genome shotgun strategy at 9x coverage (Tyler et al., 2006). The project has the accession AAQY00000000 in the GenBank database at NCBI. The genome size is approximately 95 Mb (Tyler et al., 2006), and the genome contains 26,489 coding genes and 25 pseudogenes (Howe et al., 2020; Kersey et al., 2018; Protists.ensembl.org, 2018). Currently, P. sojae transcriptomic data from ten different developmental stages (mycelia, zoosporangia, zoospores, cysts, germinating cysts, and five infection stages) are available under the accession number SRP006969 (Ye et al., 2011).
In this chapter, existing genomic and transcriptomic databases were used to analyze the genes encoding potential interactor proteins of Ps1365, a potential transcription factor identified in a previous study (Rutter, 2012). After sequencing, several software packages were employed to analyze the sequences obtained. The identities, structures and functions of the interactive proteins were determined according to their gene annotations in GenBank, FungiDB and JGI. Additional software included Sequencher 5.4.5 (Gene Codes Corporation, Ann Arbor, MI, USA), which automatically compares the forward and reverse-complementary orientations to assemble the best possible contigs, so that DNA assembly can be done regardless of orientation. Sequencher also trims ends to remove poor-quality and vector sequences that may mislead data analysis. Additionally, a series of protein analysis tools was used to predict the identity, structure and functions of the prey protein candidates.
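The idea of orientation-independent comparison can be sketched as follows. This is not Sequencher's assembly algorithm, which performs overlap-based assembly; it only illustrates why a read and its reverse complement can be treated as the same insert. The function names are hypothetical.

```python
# Illustrative sketch: orientation-independent comparison of two reads.
# This is not Sequencher's algorithm; function names are hypothetical.

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def same_insert(read_a: str, read_b: str) -> bool:
    """True if the reads are identical in either orientation."""
    a, b = read_a.upper(), read_b.upper()
    return a == b or a == reverse_complement(b)


print(reverse_complement("ATGC"))
```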
Hypotheses and Aims
Ps1365 was identified as a novel transcription factor via yeast one-hybrid assay (Rutter, 2012). This chapter aims to analyze the potential interactor proteins of Ps1365 that were obtained from the yeast two-hybrid assay (Y2H). In Chapter 2, the bait vector pGBKT7 (Clontech, CA) was used to clone the Ps1365-encoding gene and screen the cDNA library of P. sojae mycelium. The bait plasmid containing the Ps1365 gene insert showed neither self-activation nor toxicity in the yeast strain Y2HGold. After the Y2H assay, the prey plasmids were sequenced. The hypotheses for this chapter are:
1) The potential prey candidates are transcription factors, or at least nuclear proteins.
2) The bait gene and the two potential prey candidate genes show the same temporal and spatial transcription patterns.
In this chapter, there were two aims:
1) Analyze the interactive proteins of the potential novel P. sojae transcription factor, Ps1365, that were indicated by Y2H.
2) Confirm the co-expression of the bait and the potential prey proteins using existing transcriptome data available in FungiDB.
Materials and Methods
I Yeast Two-hybrid Assay
The detailed procedures of the lab experiment are provided in Chapter 2. Briefly, the "bait" plasmid was constructed by cloning the Ps1365 gene from P. sojae into the pGBKT7 vector. Following sequencing and alignment analysis, the "bait" plasmid pGBKT7-Ps1365 was transformed into the yeast strain S. cerevisiae Y2HGold. Neither autoactivation nor toxicity of Ps1365 was observed. Y2H was then performed by mating Y2HGold cells containing pGBKT7-Ps1365 with Y187 cells containing P. sojae mycelium cDNA library plasmids. Once three-lobed structures were observed, indicating successful mating, the mated yeast cells were plated on double-dropout medium and assayed for X-α-Gal color and Aureobasidin A resistance. The interaction between Ps1365 and its interactive proteins was observed visually as positive blue colonies (see details in Chapter 2). The plasmids were extracted from the blue yeast colonies and re-transformed into competent E. coli cells in order to amplify the plasmids sufficiently for sequencing. Single colonies of E. coli transformants were subcultured, and plasmids were extracted and sequenced by the University of Chicago Comprehensive Cancer Center DNA Sequencing & Genotyping Facility (UCCCC-DSF).
II Bioinformatic Analysis
Sequencher 5.4.5 (Gene Codes Corporation, Ann Arbor, MI) was used to screen for sequence redundancy and infer consensus sequences. To remove redundant sequences and analyze the potential prey inserts, sequences belonging to the vectors were trimmed, and the remaining parts of the sequences were aligned with each other using Sequencher 5.4.5 with default parameters (Figure 3.1).
Figure 3.1. Default parameters of Sequencher 5.4.5 used to align the prey sequences.
The resulting consensus sequences, as well as acceptable-quality in-contig sequences (highly similar sequences assembled into the same group, i.e., a contig) and out-of-contig sequences (sequences that were not assembled into any group because of their differences from the in-contig sequences), were subjected to BLAST searches against transcript, nucleotide and protein databases via the National Center for Biotechnology Information (NCBI)
(https://www.ncbi.nlm.nih.gov/) (Zhang, Schwartz, Wagner, & Miller, 2000) and FungiDB
(http://fungidb.org/fungidb/) (Stajich et al., 2012; Tyler et al., 2006). The RNA-Seq data of
the hypothetical protein hits were retrieved directly from FungiDB. The gene models of the
hypothetical protein hits in FungiDB were aligned with RNA-Seq data to correct the gene models. The molecular weight and isoelectric point of the hypothetical protein hits obtained were calculated using the ExPASy Compute pI/Mw tool (https://web.expasy.org/compute_pi/)
(Gasteiger et al., 2005). The putative amino acid sequences (retrieved from FungiDB) were searched against protein databases using the NCBI blastp tool (Altschul et al., 1997; Altschul et al.,
2005) as well as PSI-BLAST (https://www.ebi.ac.uk/Tools/sss/psiblast/) (Altschul et al.,
1997). Other characteristics of the deduced proteins were also analyzed. The potential
transmembrane domains were predicted by Kyte-Doolittle Hydropathy Plot
(https://web.expasy.org/protscale/) (Gasteiger et al., 2005; Kyte & Doolittle, 1982), TMHMM
Transmembrane Helix Prediction (http://www.cbs.dtu.dk/services/TMHMM/) (Krogh,
Larsson, von Heijne, & Sonnhammer, 2001; Sonnhammer, von Heijne, & Krogh, 1998), DAS
- Transmembrane Prediction Server (https://tmdas.bioinfo.se/DAS/) (Cserzö, Wallin, Simon,
von Heijne, & Elofsson, 1997), HMMTOP transmembrane topology prediction server
(http://www.enzim.hu/hmmtop/) (Tusnády & Simon, 1998; Tusnády & Simon, 2001) and
TMpred Prediction of Transmembrane Regions and Orientation
(https://embnet.vital-it.ch/software/TMPRED_form.html) (Hofmann & Stoffel, 1993).
Possible domains/motifs and protein family memberships were predicted using InterPro
protein sequence analysis & classification (http://www.ebi.ac.uk/interpro/) (Mitchell et al.,
2019; Jones et al., 2014), CDART Conserved Domain Architecture Retrieval Tool
(https://www.ncbi.nlm.nih.gov/Structure/lexington/lexington.cgi) (Geer, Domrachev, Lipman,
& Bryant, 2002), NCBI CD-Search (Conserved Domain Search)
(https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) (Marchler-Bauer & Bryant, 2004;
Marchler-Bauer et al., 2011; Marchler-Bauer et al., 2015; Marchler-Bauer et al., 2017),
Scooby-domain (Sequence hydrophobicity predicts domains)
(http://www.ibi.vu.nl/programs/scoobywww/) (George, Lin, & Heringa, 2005; Pang, Lin,
Wouters, Heringa, & George, 2008), Pfam protein families database (http://pfam.xfam.org/)
(Finn et al., 2016), and SMART Simple Modular Architecture Research Tool
(http://smart.embl-heidelberg.de/) (Letunic & Bork, 2018; Schultz, Milpetz, Bork, & Ponting,
1998). Several servers were used to predict secondary structure of the proteins. These servers
included Jpred 4 Protein Secondary Structure Prediction Server
(http://www.compbio.dundee.ac.uk/jpred/) (Drozdetskiy, Cole, Procter, & Barton, 2015),
NetSurfP Protein Surface Accessibility and Secondary Structure Predictions
(http://www.cbs.dtu.dk/services/NetSurfP/) (Petersen, Petersen, Andersen, Nielsen, &
Lundegaard, 2009), GOR protein secondary structure prediction server
(https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_gor4.html) (Garnier, Gibrat, &
Robson, 1996; Sen, Jernigan, Garnier, & Kloczkowski, 2005), NetTurnP β-turn region predictor (http://www.cbs.dtu.dk/services/NetTurnP/) (Petersen, Lundegaard, & Petersen,
2010), PORTER Protein Secondary Structure Prediction at University College Dublin
(http://distillf.ucd.ie/porter/) (Pollastri & McLysaght, 2005), and SOPMA Secondary Structure
Prediction Server (https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=npsa_sopma.html)
(Geourjon & Deléage, 1995). Their potential tertiary structures and possible functions were
determined with Phyre2 Protein Fold Recognition Server
(http://www.sbg.bio.ic.ac.uk/phyre2/html/page.cgi?id=index) (Kelley, Mezulis, Yates, Wass,
& Sternberg, 2015), (PS)2 Protein Structure Prediction Server (http://ps2v3.life.nctu.edu.tw/)
(Huang et al., 2015), RaptorX Protein Structure Prediction Server
(http://raptorx.uchicago.edu/StructurePrediction/predict/) (Källberg et al., 2012; Peng & Xu,
2011), PFP (Protein Function Prediction) (http://kiharalab.org/pfp.php) (Hawkins, Chitale,
Luban, & Kihara, 2009) and ESG (Extended Similarity Group method)
(http://kiharalab.org/esg.php) (Chitale, Hawkins, Park, & Kihara, 2009). The existence of
signal peptide and subcellular localization of the hypothetical proteins were predicted by
SignalP signal peptide prediction server (http://www.cbs.dtu.dk/services/SignalP/) (Petersen
et al., 2011), Wolf PSORT (https://wolfpsort.hgc.jp/) (Horton et al., 2007), TargetP protein
subcellular localization predictor server (http://www.cbs.dtu.dk/services/TargetP/)
(Emanuelsson et al., 2000), CELLO Subcellular Localization Predictive System
(http://cello.life.nctu.edu.tw/) (Yu, Lin, & Hwang, 2004; Yu, Chen, Lu, & Hwang, 2006),
iPSORT subcellular localization site predictor (http://ipsort.hgc.jp/) (Bannai et al., 2002;
Bannai et al., 2001), and CoSiDe Combined Signal Peptide Predictor
(http://sigpep.services.came.sbg.ac.at/coside.html) (Frank, 2013).
Co-expression analysis of the bait gene PHYSODRAFT_342624-t26_1, as well as the
two prey candidate genes was conducted using the transcriptome data of those genes
available in FungiDB.
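One common way to quantify co-expression is a correlation coefficient computed across developmental stages. The text does not state which measure was applied, so the use of Pearson correlation here is an assumption, and the expression values in the example are hypothetical placeholders rather than FungiDB data.

```python
# Illustrative sketch: Pearson correlation between two expression profiles.
# Pearson correlation is an assumed measure; the text does not name one.
from math import sqrt


def pearson(x: list, y: list) -> float:
    """Pearson correlation coefficient of two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical expression values across four stages (not real data).
bait_profile = [5.0, 7.5, 2.0, 9.0]
prey_profile = [4.0, 8.0, 1.5, 9.5]
print(round(pearson(bait_profile, prey_profile), 3))
```

A coefficient close to +1 would be consistent with the two genes sharing a temporal expression pattern, although correlation alone does not demonstrate a physical interaction.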
Results
Aim 1) Analysis of the Potential Interactive Sequences of Ps1365 Obtained from Yeast
Two-hybrid Assay
Analysis of the Potential Prey Inserts
E. coli transformants were transferred onto two new LB agar ampicillin plates (40 colonies/plate on average) and incubated at 37°C overnight to make secondary plates. The plates were sent to UCCCC-DSF for sequencing, and a total of 188 potential prey sequences were obtained. Among them, 48 sequences were discarded due to poor sequencing quality. The remaining 140 sequences were of good quality and were used in the analysis.
Initially, Sequencher 5.4.5 identified four contigs: Contig[0001], Contig[0003], Contig[0007] and Contig[0074], with 76, 71, 7 and 6 members, respectively (Table 3.1). However, all six sequences from Contig[0074] were of low quality; therefore, this contig was discarded. The consensus sequences of the remaining three contigs are provided in the Appendix.
Table 3.1. NCBI BLAST analysis results of the contigs of P. sojae prey inserts.
Contig No.   Number of raw sequences   Number of good-quality sequences   BLAST                                    E-value   %ID
[0001]       76                        64                                 18S ribosomal RNA gene                   0.0       100%
[0003]       71                        69                                 PHYSODRAFT_291312 / PHYSODRAFT_356433    0.0       99%
[0007]       7                         6                                  28S large subunit ribosomal RNA gene     5e-175    99%
[0074]       6                         0                                  N/A                                      N/A       N/A
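The selection logic summarized in Table 3.1 can be sketched as follows, using the values from the table. The class and function names are hypothetical, and the rule of setting aside ribosomal RNA hits is an assumption inferred from the fact that only the hits of Contig[0003] are treated as prey protein candidates.

```python
# Illustrative sketch of contig selection using the values in Table 3.1.
# Names are hypothetical; the rRNA exclusion rule is an assumption.
from typing import NamedTuple


class Contig(NamedTuple):
    name: str
    raw: int    # number of raw sequences
    good: int   # number of good-quality sequences
    hit: str    # BLAST hit


CONTIGS = [
    Contig("Contig[0001]", 76, 64, "18S ribosomal RNA gene"),
    Contig("Contig[0003]", 71, 69, "PHYSODRAFT_291312 / PHYSODRAFT_356433"),
    Contig("Contig[0007]", 7, 6, "28S large subunit ribosomal RNA gene"),
    Contig("Contig[0074]", 6, 0, "N/A"),
]


def usable(contigs: list) -> list:
    """Keep contigs that contain at least one good-quality sequence."""
    return [c for c in contigs if c.good > 0]


def protein_candidates(contigs: list) -> list:
    """Keep usable contigs whose hit is not a ribosomal RNA gene."""
    return [c for c in usable(contigs) if "ribosomal RNA" not in c.hit]


print([c.name for c in protein_candidates(CONTIGS)])
```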
Analysis of the Potential Prey Inserts by BLAST
The consensus sequences of the three contigs were analyzed using the NCBI BLAST tool against both P. sojae nucleotide and protein databases. BLAST analysis of Contig[0001] against the NCBI, FungiDB and JGI databases gave the same results and indicated that it is an 18S ribosomal RNA gene, while that of Contig[0007] indicated that it is a 28S large subunit ribosomal RNA gene (Table 3.1). For Contig[0003], BLAST analysis indicated that it is either hypothetical protein PHYSODRAFT_291312 or PHYSODRAFT_356433 (Table 3.1).
Validation of the Gene Models of the Prey Candidates
Because the gene models of the two hypothetical protein candidates in FungiDB were predicted by annotation software rather than determined by laboratory experimentation, and because the PHYSODRAFT_291312-t26_1 gene model in FungiDB includes an extremely long intron, RNA-Seq data were used to verify the gene structures of the two prey protein candidates.
Alignment of the FungiDB gene model of PHYSODRAFT_291312 with the existing RNA-Seq data (available in FungiDB) showed that the revised model of the PHYSODRAFT_291312 gene is substantially shorter than the originally predicted gene model (Figure 3.2). According to the alignment, the new gene model of PHYSODRAFT_291312 is now predicted to contain three exons and two introns (Figures 3.2 and 3.3), while the predicted gene model of PHYSODRAFT_356433 appeared to be correct (Figure 3.4).
Figure 3.2. Alignment of the PHYSODRAFT_291312 gene model (shown here as PHYSODRAFT_291312-t26_1) with RNA-Seq data. (a) P. sojae transcription log scale graph obtained from FungiDB. (b) Predicted gene models in FungiDB. (c) Newly devised gene model of PHYSODRAFT_291312-t26_1 after aligning the original gene model with RNA-Seq data. White rectangles represent untranslated regions (UTRs), black rectangles represent coding sequences (CDSs), and solid lines represent introns.
>JH159173 | Phytophthora sojae unplaced genomic scaffold PHYSOscaffold_23, whole genome shotgun sequence. | 174389 to 178361
CACGACTTCAGCGAGTGCAACTGACGGCGTCGTGCCATCGTCAGTGCAAGTGAGTGACGC TTAGTGAAAAGCCGGCCCACCACGTCAAGGTGGTCACTCGTCGGACCGGGAAAGGACACC CGGATTGCTCGGGGTGCCCCGAGAACGCGTCAACGTCCCCAAGGGAGGCTCAAACTACCT GTGGGTCTGGTAAAATGCGGCTTCATCACCTAGGCGTTCCACGGCCCAAGCTATCCCCCG CAAGTCGATTGGAAGCGACAAGTTCGACGACTCAAGGTGCTTTCACGCGGAATGGAACCG CGGCTGACCGTCTCCCTTTCAGGACGGCCACACTTTGAGAGTCTCCCCGGACACACCCCA ACCTAGATCCGATCAAGTCGAATCCGCCAGCCTCCCCCCGCTACAACGAGCAGGTTCGAC CGTACTGCACAATGTGCAAGCTCCACCCACGATTCATCTCAAGCATGCGGCCTTAGGGAC CCCACGCGAGGACTACCTGCGTAAGTGACGACTCCTCTCGATCCACGCACGAATTGCAAC ACCAAAGCCACCGCTGAGTTTCCTAGGCTCAACAGATCCTCGATGCCAACCCCCTTGCAG ATTGAGACTCGTCCGTTTTGCCCGACGAGAACCTGGTCCGAATCGCGAAGCGTTTCTTCT ACTCGCCCGGCGGCTCCAAGGTGTCTGGACGATTTTCATCATGACCTATCAAGGTCGGTC CCGACTCCCTGCCGCGCCGTTTGAGGGCTCACTCCATGCCCCAAGTCGTCTGGAAGACTT TGTGTCAAGGTCGGTCAGACCCATCCAGATGATGACTTGTCACACTTCGTGACTGCTTCG CAATCCCACTGGTGATCAGCTCGCCAGCCAACTCCTTTATCGCCCGCAGCAGAACAGCCA AGCCCAAGTTCATCCAAGCCGAAGCCAGACAGACAAGAACCCAGCCATTCCGCCGCAAAC TTCCGGAATCGGCCGTTGATCATCTCACGTTGTTCTCGTCGACGTCTCGAGTACACCAAG ATGCTTTACGCCGCCATACGTGCGTTTTTTCGGCTGCTCACCGAATACCGATAACCCATG GCGGAGCTCTCCCGGTGCGACGAACTGTCTTAGTTCCAGGCACAGACGTTTCCAGCCGAC CACTCCCCCGGTCCGACCGTACTGCGTCAGCGACTCTGAAGCTGCAGCCACCTTTGTTGT GTGGACATCTGCCTCCCCCCGCTCACCCGGGCAGGAATCAGACTGCGCTTCGTCGTCTCA TTAAATTCGCACGCCCAGGCCGAAGCCCCGACGCTGATACTCGTTTCCAAGTTCTTCGAC ATCATGACATCTCCAGGTTGGTATTCTGTGTCCCCCTCTCATCCCAGGGCGGAACTCAGG CCGCAACTCGAAGAATCTCTCAAGAACACAACGTCCGGACCGAAGTCAAGACTACGTTCT TCAACGACTGCTACAACACGCTGCGCCCCCCGTAAGGTTGAAGTCTACATCCAGCCCCCA TCCCCCTCTCGCCGGGACAGGAATTGGAATATACTACGCAAGCTCGCAATCAAGATCAAC GGATCAGTGCCCAAGACCGAAGTCCTGGACCCATCTCCACCAGGTTTCCAGCCGGGGTTT GTACCCCCAGACATCCAGAGCGCACTGGACCATGCATCGCAGGCCGGAAGCTGTTTTTTA GAGAGGAGGTCTCCTCTATATACGCTAGCCTCTTGCGGAGAATTCTCCCCGCGTTCTACA TATTTGCTACTTTAGGTTTGGAATTAAGTCAATGTAATCAACATTCCCAGCCAAACTCCC CACCTGACGATGTCTTCCACGTAGCTCACCCAGACAAAAGCCCGAGATTACAACTAGAAC TGAGCGACCGCGAGGCCACCCAGATGCTACATCATGGTATAAGTAAAACAACATTGAGAG 
TAGTGGTATTTCACGGACGGCAGAGCCTCCCACTTATTCTACACCTCCCAAGTCATTTCA CAACGTCAGACTAGAGTCAAGCTCAACAGGGTCTTCTTTCCCCGCTGATTATTCCAAGCC CGTTCCCTTGGCTGTGGGTTCGCTAGACAGTAGATAGGGACAGTGGGAATCTCATTAATC CATTCATGCGCGTCACTAGTTAGATGACGAGGCATTTGGCTACCTTAAGAGAGTCATAGT TACTCCCGCCGTTTACCCGCGCTTGGTTGAATCTCTTCACTTTGACATTCAGAGCACTGG GCAGAAATCACATTGTGTCAACACCGTCTCCGGCCATCACAATGCTTTGTTTTAATTAAA CAGTCGGATTCCCCTTGTCCGCTCCAGTTCTGAGCCGGTTGTTCAACGCACTAGGGAAAC AGCCGCCAGCCCGAAAGCCAACGACCTTTCTCTCCGGCGCAGCAAGAGCAGCCCGACCGC 115
CGGGCCGCATCCGGTCCCCGAAAGGTCCAGACGCAGCCCAGCGTAGGCCGCCACAAGCTC TCCAAAAAGACCCCTAGGCCCAACCCTTAGAGCCAATCCTTTTCCCGAAGTTACGGATCT ATTTTGCCGACTTCCCTTATCTACATTCTTCTATCAACTAGAGGCTGCTAACCTTGGAGA CCTGATGCGGTTATGAGTACGAACGAGGGTGCGAATAAATCTCTAGCCAGGATTTTCAAG GGCTGTCGTGGGCGCACAGGACACTTCAAAAAGTAAAGTGCTTTGCCAAGGCGTCCTCCT TATCGCCGGATAATCCGTTTCCAAGGCAGGACATGCTTGTTAAAAAGAAAAGAGAACTCT TCCCTGGGCACACGCTAGCGTCTCCTGGGTCGGTTGTGTTGCCACACATTATCCACGTCT CGGTTCAGGAATATTAACCTGATTCCCTTTCGATACACGAGGCTGCATACAAAACGCGAG GACGAACCCCACGCCCCAAACAGCCCAGCTTTCAAACGGAATTATCCTATCTCTTAGGAT CGACTCACCCGTGTCCAATTACTGATCACACGGAACCTTTCTCCACTTCAGTCTTCAAAG TTCGCATTTGAATATTTGCTACTACCACCAAGATCTGCACTAAAGGCCGTTTCACTCAGG CTCACGCCACGAGCTTCTTCACGACCCTCACGCCCTCCTACTCATTAACGCGTACATTAC ATAT GCCAT CGCGCTAACGGCAAAGTATAAGTAGCCCGCTTTAGCGCCATCCATTTTCAG GGCTAGTTCATTCGGCAGGTGAGTTGTTACACACTCCTTAGCGGATTCCGACTTCCATGG CCACCGTCCTGCTGTCTAAATGAACCAACACCTTTTATGGTATCTAGGTGAGCGGGCATT TTGGCACTTTAACTTTGCGTTCGGTTCATCCCGCATCGCCAGACGAGCTTACCCCGTATG GCCCACTAGCAACTTGATATTCACATCCACAAGTTCAATTAAGAAACCTGCAGGTCTTAC AGATTTAAAGTTTGAGAATAGGTCGAGGAAGTTTCTTCCCCGAATCCTCTAATCATTCGC TTTACCTCATAAAACTATCGTAAATAAGTTGCTGCTATCCTGAGGGAAATTTCGGAGGGA ACCAGCTACTAGATGGTTCGATTAGTCTTTCGCCCCTATACCCAAGTTTGACGATCGATT TGCACGTCAGAATCGCTACGAGCTTCCACCAGAGTTTCCCCTGGCTTCACCCTACTCAGG CATAGTTCACCATCTTTCGGGTACCAACATATGTGCTCAAACTCAAATCTTTCACCACGA AGGTTCATGATCGGTCGATAGTGCCACGCCGCAGACCGAAGCCCGCAACATAGACGATAA TCACCTTTCCCGATGTGCCGCGAATAGCGATAGGTGTCTTCTGGGCACCCAACATCATAC AATTGCAACGCACTCCGCTGCCTGTCAAGTGCTGGCGGTGGAGAGTAGGCTGACTTGTAA TTTCAAATATTGGGAAAGATAAATCCTTTGTAGACGACTTAACTACAGAACGGGGTGTTG TAAGCATGAGAGT Figure 3.3. DNA sequence of the predicted new gene model of
PHYSODRAFT_291312 after aligning the original gene model from FungiDB with
RNA-Seq data from the same database. Red letters represent exonic regions, including untranslated regions (UTRs) and coding sequences (CDSs), and black letters represent introns.
Figure 3.4. Alignment of the PHYSODRAFT_356433 gene model (shown here as PHYSODRAFT_356433-t26_1) with RNA-Seq data. (a) P. sojae transcription log scale graph from FungiDB. (b) Predicted gene model in FungiDB.
Protein Features of PHYSODRAFT_291312
Data from FungiDB (http://fungidb.org/fungidb/) showed that the coding gene of PHYSODRAFT_291312 is located on the forward strand of the P. sojae genomic sequence JH159173. Its exact position on the genome is 167015 - 178157 (+). The predicted coding region of PHYSODRAFT_291312 is 213 bp long. The deduced length of PHYSODRAFT_291312 is 70 amino acids, and the protein has no signal peptide or transmembrane domains (Figure 3.5).
Figure 3.5. Features of the PHYSODRAFT_291312 protein retrieved from FungiDB.
Additional analysis of PHYSODRAFT_291312 using the ExPASy Compute pI/Mw tool (https://web.expasy.org/compute_pi/) (Gasteiger et al., 2005) showed that it is a protein with a molecular weight of 8,288.62 Da and an isoelectric point of 11.34.
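The molecular weight reported above is, in principle, the sum of the average masses of the amino acid residues plus one water molecule. The sketch below illustrates this calculation with standard average residue masses; it is not the ExPASy implementation, the function name is hypothetical, and the example peptide is arbitrary because the PHYSODRAFT_291312 sequence is not reproduced in this section.

```python
# Illustrative sketch: average molecular weight of an unmodified peptide.
# Standard average residue masses (Da); not the ExPASy implementation.

AVERAGE_RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886,
    "C": 103.1388, "E": 129.1155, "Q": 128.1307, "G": 57.0519,
    "H": 137.1411, "I": 113.1594, "L": 113.1594, "K": 128.1741,
    "M": 131.1926, "F": 147.1766, "P": 97.1167, "S": 87.0782,
    "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER_MASS = 18.01524  # one water is added for the free N- and C-termini


def molecular_weight(sequence: str) -> float:
    """Return the average molecular weight (Da) of a protein sequence."""
    sequence = sequence.strip().upper()
    if not sequence:
        raise ValueError("empty sequence")
    try:
        residues = sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence)
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]}") from None
    return residues + WATER_MASS


# Arbitrary example peptide (not a sequence from this study).
print(round(molecular_weight("MKRLA"), 2))
```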
PHYSODRAFT_291312 against Protein Databases
A BLAST search of the PHYSODRAFT_291312 amino acid sequence against protein databases using the NCBI blastp and PSI-BLAST tools gave the highest-scoring hit to hypothetical protein PHMEG_00035810 in Phytophthora megakarya, with one amino acid difference.
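A single amino acid difference can be expressed as a percent identity over the aligned length. The text reports only the one-residue difference; the alignment length used below (70 residues, the deduced protein length) and the placeholder sequences are assumptions for illustration.

```python
# Illustrative sketch: percent identity of two pre-aligned sequences.
# The 70-residue alignment length and placeholder sequences are assumptions.


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned positions with identical, non-gap residues."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


# Placeholder sequences with exactly one mismatch over 70 positions.
query = "A" * 70
subject = "A" * 69 + "G"
print(round(percent_identity(query, subject), 2))
```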
PHYSODRAFT_291312: Globular or Membrane
Whether the two possible protein candidates are globular or membrane proteins was assessed by predicting the possible existence of transmembrane domains using several different software tools. Kyte-Doolittle Hydropathy Plot, DAS, HMMTOP, TMHMM, and TMpred predicted no transmembrane domains in PHYSODRAFT_291312 (Figure 3.6).
Figure 3.6. Prediction of potential transmembrane domains in
PHYSODRAFT_291312. (a) Prediction using the ExPASy ProtScale Kyte-Doolittle hydropathy
plot. The x-axis represents the positions of the amino acid residues and the y-axis represents the hydrophobicity score of each residue. The window size was set to 19. All residue scores are lower than +1.6, suggesting that
PHYSODRAFT_291312 may not have transmembrane domains. (b) Prediction using DAS.
The x-axis represents the positions of the amino acid residues and the y-axis represents the hydrophobicity score of each residue. The scores are lower than the loose hit value (1.7), suggesting that PHYSODRAFT_291312 may not have transmembrane domains. (c) Prediction using TMHMM. The x-axis represents the positions of the amino acid residues and the y-axis represents the probabilities of the locations of the residues in a protein molecule (interior, surface, transmembrane). The probability of being located on the surface is the highest for all the amino acids, so the prediction is that PHYSODRAFT_291312 may be a water-soluble globular protein. (d) Prediction using
TMpred. The x-axis represents the positions of the amino acid residues and the y-axis represents the TMpred score of each residue. The scores of all the amino acids are lower than 500, implying that PHYSODRAFT_291312 may not be a transmembrane protein.
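The sliding-window calculation behind a Kyte-Doolittle plot such as panel (a) can be sketched as follows. This is a minimal illustration, assuming an unweighted window mean; the window size of 19 and the +1.6 cutoff are taken from the caption above, while the function names and both test sequences are hypothetical and are not the PHYSODRAFT sequences.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids
# (Kyte & Doolittle, 1982).
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Return the mean hydropathy of every full window along the sequence."""
    scores = [KD_SCALE[residue] for residue in sequence.upper()]
    if len(scores) < window:
        return []
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def windows_above_threshold(sequence, window=19, threshold=1.6):
    """Count the windows whose mean hydropathy exceeds the cutoff."""
    return sum(1 for score in hydropathy_profile(sequence, window) if score > threshold)


# Hypothetical toy sequences, for illustration only.
hydrophobic_toy = "MKT" + "L" * 20 + "RKE"
polar_toy = "MKTDEESRKNQGSDEKRTNQSDEKRG"

print(windows_above_threshold(hydrophobic_toy))
print(windows_above_threshold(polar_toy))
```

A count of zero corresponds to a profile that never crosses the cutoff, as reported for PHYSODRAFT_291312; how many windows above the cutoff are needed to call a transmembrane segment remains a judgment for the analyst.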
Domain and Protein Family Prediction of PHYSODRAFT_291312
The two hypothetical protein candidates were analyzed for possible domain structures and
protein family memberships in order to predict their functions. Pfam, CDART, InterPro, NCBI
CD-Search, and SMART failed to find any domain structure for PHYSODRAFT_291312.
Secondary Structure of PHYSODRAFT_291312
The secondary structure of PHYSODRAFT_291312 was predicted using several different tools (Figure 3.7). Jpred 4, SOPMA, GOR, and NetSurfP predicted α-helix structures in
PHYSODRAFT_291312. Although the results differ between tools, the predicted helices raise the possibility that PHYSODRAFT_291312 may have HLH or HTH domains.
Figure 3.7. Secondary structure prediction of PHYSODRAFT_291312. (a) Secondary structure predicted by GOR. “c” is random coil; “e” is extended strand; “h” is α-helix. (b)
Secondary structure predicted by Jpred 4. Letter “H” represents helical; “-” represents other
types of structures. (c) Secondary structure predicted by SOPMA. “h” is α-helix; “e” is
extended strand; “c” is random coil; “t” is beta-turn.
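Counting helices in a per-residue prediction string of the kind shown in Figure 3.7 can be sketched as below. This is only an illustration: the minimum run length of four residues is an assumption introduced here (roughly one helical turn), the function name is hypothetical, and the toy string is not the actual output for PHYSODRAFT_291312.

```python
import re


def helix_segments(prediction, helix_chars="hH", min_length=4):
    """Return 1-based inclusive (start, end) positions of helical runs.

    `prediction` is a per-residue state string such as GOR/SOPMA output
    ("h" = helix) or Jpred 4 output ("H" = helix). Runs shorter than
    `min_length` (an assumed cutoff) are ignored.
    """
    pattern = re.compile("[" + re.escape(helix_chars) + "]+")
    segments = []
    for match in pattern.finditer(prediction):
        if match.end() - match.start() >= min_length:
            segments.append((match.start() + 1, match.end()))
    return segments


# Hypothetical prediction string, for illustration only.
toy_prediction = "c" * 4 + "h" * 9 + "ccc" + "tt" + "cc" + "h" * 11 + "cc" + "ee"

segments = helix_segments(toy_prediction)
print(len(segments), segments)
```

Two or more separated helical runs are what the text treats as compatible with an HTH or HLH arrangement; the sketch only counts runs and says nothing about the loop or turn between them.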
Tertiary Structure and Functional Analysis of PHYSODRAFT_291312
(PS)2, ESG, and PFP failed to find any significant template for PHYSODRAFT_291312.
Phyre2 analysis (Kelley et al., 2015) indicated that the tertiary structure of
PHYSODRAFT_291312 resembles interleukin-37 with a confidence level of 59.6%
(Figure 3.8). RaptorX predicted the structural model shown in Figure 3.9.
Figure 3.8. The structural model of PHYSODRAFT_291312 predicted by the Phyre2 web server. The directions of the arrows and the change of rainbow colors (from blue to red) show the direction of the polypeptide: N-terminus to C-terminus.
Figure 3.9. Tertiary structure of PHYSODRAFT_291312 predicted by RaptorX.
Figure 3.10. SignalP signal peptide prediction for PHYSODRAFT_291312. The positions of the amino acid residues are shown on the x-axis, while the y-axis indicates the values of the three scores. The C-score is the cleavage site score, which is high at the cleavage site. Residues with lower S-scores may be part of the mature protein, while residues with higher S-scores may be part of a signal peptide. The Y-score is highest at sites where the C-score is significantly higher and the S-score decreases abruptly, implying the potential existence of a cleavage site.
Signal Peptide and Subcellular Localization Prediction of PHYSODRAFT_291312
SignalP failed to detect any signal peptide for PHYSODRAFT_291312 (Figure 3.10), while the CoSiDe Combined Signal Peptide Predictor predicted a signal peptide whose cleavage site is after the 22nd amino acid residue (Figure 3.11). iPSORT also failed to detect any signal peptide and predicted that the protein may localize to mitochondria. Phobius predicted
PHYSODRAFT_291312 to be a non-cytoplasmic protein (Figure 3.12).
TargetP and CELLO subcellular localization prediction servers predicted that
PHYSODRAFT_291312 is a mitochondrial protein, while WoLF PSORT assigned the highest probability to nuclear localization.
Figure 3.11. Analysis of PHYSODRAFT_291312 by the CoSiDe Combined Signal
Peptide Predictor, which predicted the best cleavage site at the 23rd amino acid residue. The x-axis represents the positions of the amino acid residues, while the y-axis represents the scores, meaning the probability of cleavage.
Figure 3.12. Subcellular localization prediction of PHYSODRAFT_291312 using
Phobius. Among the four subcellular localization probabilities, that of being a non-cytoplasmic protein is the highest, as indicated by the blue curve.
Protein Features of PHYSODRAFT_356433
Data from FungiDB (http://fungidb.org/fungidb/) indicated that the gene encoding
PHYSODRAFT_356433 is located at positions 114601 – 117100 on the reverse strand of P. sojae genomic sequence JH159173. The predicted transcript is 2500 bp long, has no intron, and encodes 101 amino acids. According to FungiDB, PHYSODRAFT_356433 has neither functional domains nor a signal peptide, but has a transmembrane domain
(Figure 3.13). Calculation with the ExPASy Compute pI/Mw tool indicated that the molecular weight of PHYSODRAFT_356433 is 11,737.68 Da, and its isoelectric point is 9.64.
Figure 3.13. Features of the PHYSODRAFT_356433 protein retrieved from FungiDB.
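The molecular weights reported here come from the ExPASy Compute pI/Mw tool. The underlying arithmetic (sum of average residue masses plus one water) can be approximated with the sketch below, which is not the ExPASy implementation and may differ from it in the last decimals. The function names and the toy peptide are hypothetical; the two protein masses and the 40 kDa limit are the values quoted in this chapter.

```python
# Average residue masses in daltons (amino acid minus one water molecule).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01524  # one water is added back for the free N- and C-termini

# Size limit for free passage through the nuclear envelope, as cited in the Discussion.
DIFFUSION_LIMIT_DA = 40_000.0


def molecular_weight(sequence):
    """Approximate average molecular weight (Da) of an unmodified polypeptide."""
    return sum(AVERAGE_RESIDUE_MASS[residue] for residue in sequence.upper()) + WATER_MASS


def below_diffusion_limit(mass_da, limit_da=DIFFUSION_LIMIT_DA):
    """True if the mass is under the size limit for free nuclear entry."""
    return mass_da < limit_da


# Hypothetical tripeptide, used only to show the calculation.
print(round(molecular_weight("GAS"), 2))

# Masses reported in this chapter for the two prey candidates.
reported_masses = {"PHYSODRAFT_291312": 8288.62, "PHYSODRAFT_356433": 11737.68}
for name, mass in reported_masses.items():
    print(name, below_diffusion_limit(mass))
```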
PHYSODRAFT_356433: Globular or Membrane
The Kyte-Doolittle hydropathy plot predicted
no transmembrane domains in PHYSODRAFT_356433. In contrast, DAS, HMMTOP,
TMHMM, and TMpred each predicted one transmembrane domain (Figure 3.14).
Figure 3.14. Prediction of potential transmembrane domains in
PHYSODRAFT_356433. (a) Prediction using the ExPASy ProtScale Kyte-Doolittle hydropathy
plot. The x-axis represents the positions of the amino acid residues and the y-axis represents the hydrophobicity score of each residue. The window size was set to 19. Very few residues scored higher than +1.6, suggesting that
PHYSODRAFT_356433 may not have transmembrane domains. (b) Prediction using DAS.
The x-axis represents the positions of the amino acid residues and the y-axis represents the hydrophobicity score of each residue. The scores of the residues between positions 19 and 46 are above the loose hit value (1.7), so the prediction is that PHYSODRAFT_356433 may have a transmembrane domain. (c) Prediction using TMHMM. The x-axis represents the positions of the amino acid residues and the y-axis represents the probabilities of the locations of the residues in a protein molecule (interior, surface, transmembrane). The residues at positions 30-52 showed the highest probability of being a transmembrane segment, so
PHYSODRAFT_356433 may have a transmembrane region. (d) Prediction using TMpred.
The x-axis represents the positions of the amino acid residues and the y-axis represents the TMpred score of each residue. The scores of the residues at positions 29-45 are above 500, implying the possible existence of a transmembrane segment.
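The tool-by-tool calls reported for the two candidates can be summarized with a simple tally, sketched below. The calls and segment positions are those stated in the text and in the Figure 3.14 caption; the majority-vote rule, the function names, and the idea of intersecting the reported segments are illustrative assumptions, not part of the original analysis.

```python
from collections import Counter

# True = the tool predicted a transmembrane domain (as reported in the text).
TM_CALLS = {
    "PHYSODRAFT_291312": {
        "Kyte-Doolittle": False, "DAS": False, "HMMTOP": False,
        "TMHMM": False, "TMpred": False,
    },
    "PHYSODRAFT_356433": {
        "Kyte-Doolittle": False, "DAS": True, "HMMTOP": True,
        "TMHMM": True, "TMpred": True,
    },
}

# Segment boundaries reported in the Figure 3.14 caption (1-based, inclusive).
SEGMENTS_356433 = {"DAS": (19, 46), "TMHMM": (30, 52), "TMpred": (29, 45)}


def majority_call(calls):
    """Return (verdict, votes_for, votes_against) under a simple majority rule."""
    votes = Counter(calls.values())
    votes_for, votes_against = votes[True], votes[False]
    if votes_for == votes_against:
        return ("undecided", votes_for, votes_against)
    verdict = "transmembrane" if votes_for > votes_against else "no transmembrane domain"
    return (verdict, votes_for, votes_against)


def shared_core(segments):
    """Return the region covered by every reported segment, or None if they do not overlap."""
    start = max(s for s, _ in segments.values())
    end = min(e for _, e in segments.values())
    return (start, end) if start <= end else None


for protein, calls in TM_CALLS.items():
    print(protein, majority_call(calls))
print("shared core:", shared_core(SEGMENTS_356433))
```

A tally like this only makes the disagreement between tools explicit; it does not weigh the tools by accuracy, so the verdict should be read as a summary rather than a prediction in its own right.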
Analysis of PHYSODRAFT_356433 against Protein Databases
BLAST analysis of PHYSODRAFT_356433 against the protein databases using
NCBI blastp and PSI-BLAST returned hypothetical protein
PHMEG_00034225 of P. megakarya as the highest-scoring hit, with two amino acid differences. Domain analysis of
PHYSODRAFT_356433 by Pfam, CDART, InterPro, NCBI CD-Search, and SMART
did not reveal any domains or protein family memberships.
Secondary structure prediction of PHYSODRAFT_356433 by NetSurfP, JPred 4, GOR, SOPMA,
and PORTER indicated that the putative protein contains α-helix structure (Figure 3.15).
Figure 3.15. Secondary structure prediction of PHYSODRAFT_356433. (a)
Secondary structure predicted by GOR. “c” is random coil; “e” is extended strand; “h” is
α-helix. (b) Secondary structure predicted by Jpred 4. Letter “H” represents helical; “E”
represents extended, “-” represents other types of structures. (c) Secondary structure
predicted by SOPMA. “h” represents α-helix; “e” represents extended strand; “c” represents
random coil; “t” represents β-turn.
Tertiary Structure and Functional Prediction of PHYSODRAFT_356433
(PS)2 and PFP failed to predict a tertiary structure for PHYSODRAFT_356433, while ESG predicted that PHYSODRAFT_356433 may be a membrane protein involved in transport. Phyre2 predicted that the tertiary structure of
PHYSODRAFT_356433 is similar to that of influenza B virus nucleoprotein (with 20.7% confidence), which is an RNA-binding protein (Figure 3.16). RaptorX predicted the tertiary structure of PHYSODRAFT_356433 as shown in Figure 3.17.
Figure 3.16. Structural model of PHYSODRAFT_356433 predicted by the Phyre2 web server, represented as a ribbon diagram. The change of rainbow colors (from blue to red) shows the direction of the polypeptide: N-terminus to C-terminus.
Figure 3.17. Prediction of PHYSODRAFT_356433 tertiary structure by RaptorX.
Figure 3.18. SignalP signal peptide prediction showed that PHYSODRAFT_356433
contains no signal peptide because none of the three scores indicated a cleavage site. The C-score
is the cleavage site score, which is high at the cleavage site. Residues with
lower S-scores may be part of the mature protein, while residues
with higher S-scores may be part of a signal peptide. The Y-score is highest at sites where the C-score is significantly higher and the S-score decreases abruptly, implying the potential existence of a cleavage site.
Signal Peptide and Subcellular Localization Analyses of PHYSODRAFT_356433
The programs CoSiDe, iPSORT, and SignalP (Figure 3.18) predicted that
PHYSODRAFT_356433 does not have any signal peptide. CELLO predicted that
PHYSODRAFT_356433 may be an extracellular or nuclear protein. TargetP predicted that
PHYSODRAFT_356433 is neither a mitochondrial nor a secretory protein. WoLF PSORT
predicted that PHYSODRAFT_356433 is a nuclear protein. Phobius failed to predict the
subcellular localization of PHYSODRAFT_356433 (Figure 3.19).
Figure 3.19. Prediction of subcellular localization of PHYSODRAFT_356433 by
Phobius. The amino acids between positions 27 and 47 showed the highest probability of being a transmembrane region.
In conclusion, the majority of the prediction tools indicated that PHYSODRAFT_291312 has no transmembrane domains. Some software tools predicted that it may have more than one α-helix, which raises the possibility of HTH or HLH
domains, the characteristic structures of TFs. Signal peptide and subcellular
localization predictors gave varying results. Taken together,
PHYSODRAFT_291312 may be a water-soluble globular protein, and possibly a TF. Four
secondary structure prediction tools indicated that PHYSODRAFT_291312 contains α-helix structure. Phyre2 predicted that PHYSODRAFT_356433 is similar to an RNA-binding protein, but with low confidence. As with
PHYSODRAFT_291312, the signal peptide and subcellular localization predictions for PHYSODRAFT_356433 varied between tools.
Taken together, PHYSODRAFT_356433 may be a TF.
Aim 2) Confirmation of the Co-expression of the Bait and the Potential Prey Protein Genes using Existing Transcriptome Data Available in FungiDB
Co-expression Analysis of Bait and Prey Protein Genes
The expression profiles of the bait gene Ps1365 and the possible interactive prey genes
PHYSODRAFT_291312 and PHYSODRAFT_356433 were available via the FungiDB
database. All three genes showed their highest transcription levels during the mycelium
stage. The lowest transcription level of Ps1365 occurred during the infection stage, while the
lowest transcription of PHYSODRAFT_291312 and PHYSODRAFT_356433 occurred
during the cyst stage. During the mycelium, cyst, and infection stages, the
transcription levels of the two potential prey genes were significantly higher than that of
Ps1365 (Figure 3.20). Apart from these differences, the three genes showed a similar expression pattern during the mycelium and cyst stages.
Figure 3.20. Transcription levels of three P. sojae protein genes, Ps1365
(PHYSODRAFT_342624), PHYSODRAFT_291312, and PHYSODRAFT_356433, during three developmental stages of P. sojae (by Tyler 2014, FungiDB: fungidb.org). These transcriptome data were retrieved from FungiDB. In all three stages, the transcription levels of the two prey genes (PHYSODRAFT_291312 and PHYSODRAFT_356433) are significantly higher than that of the bait gene (PHYSODRAFT_342624). Except in the infection stage, the expression patterns of the three genes are basically the same in the other two developmental stages.
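The similarity of the expression patterns can be expressed as a rank comparison, sketched below. No expression values are used: the stage orderings are only what the text states (each gene is highest in mycelium; Ps1365 is lowest during infection; both prey genes are lowest in the cyst stage), with the middle stage inferred on the assumption of no ties. The function names are hypothetical, and with only three stages the coefficient is descriptive rather than a statistical test.

```python
STAGES = ("mycelium", "cyst", "infection")

# Stages ordered from highest to lowest transcription, as described in the text.
ORDER_HIGH_TO_LOW = {
    "Ps1365": ("mycelium", "cyst", "infection"),
    "PHYSODRAFT_291312": ("mycelium", "infection", "cyst"),
    "PHYSODRAFT_356433": ("mycelium", "infection", "cyst"),
}


def stage_ranks(order_high_to_low):
    """Map each stage to its rank (1 = highest transcription)."""
    return {stage: position + 1 for position, stage in enumerate(order_high_to_low)}


def spearman_rho(gene_a, gene_b):
    """Spearman rank correlation of two genes across the three stages (no ties)."""
    ranks_a = stage_ranks(ORDER_HIGH_TO_LOW[gene_a])
    ranks_b = stage_ranks(ORDER_HIGH_TO_LOW[gene_b])
    n = len(STAGES)
    d_squared = sum((ranks_a[stage] - ranks_b[stage]) ** 2 for stage in STAGES)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


print(spearman_rho("Ps1365", "PHYSODRAFT_291312"))
print(spearman_rho("PHYSODRAFT_291312", "PHYSODRAFT_356433"))
```

A fuller co-expression analysis would use the actual expression values from FungiDB and more conditions than three stages.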
Discussion
In this research, two potential interactive protein partners for the bait protein, Ps1365,
were identified through yeast two-hybrid screening and subsequent bioinformatic analyses.
Yeast two-hybrid screening has shown its feasibility in many previous studies. For example,
yeast two-hybrid assay has been used successfully to identify the interactive proteins of
parasitic Toxoplasma gondii protein SAG2 in human cells (Lai & Lau, 2017). The authors
used the same system (Matchmaker Gold Yeast Two-Hybrid System) as used in this study.
SAG2 was used as bait and a human cDNA library was used as prey. Eighteen clones were
initially identified as harboring potential interactive partners of SAG2, and sequencing
narrowed these to thirteen candidate preys. These results were validated using small-scale, one-to-one Y2H with both SAG2-inserted and empty bait vectors, which showed that only one prey is a true interactive partner of SAG2. They identified the protein as a human zinc-finger protein (HZF). Their further examination of the interaction using
β-galactosidase and coimmunoprecipitation assays confirmed that HZF is a true interactor of SAG2. A second study by Xin et al. (2017) also used the Matchmaker Gold
Yeast Two-Hybrid System to screen for the interactive proteins of the bovine muscle protein
CMYA1. In addition to the full-length CMYA1, they also used the
Xin-repeats (16 aa) alone as bait in their Y2H assay. Twenty-seven putative proteins were identified as interacting proteins; some interacted only with CMYA1 and others only with the Xin-repeats. Only three proteins interacted with both CMYA1 and the Xin-repeats.
One-to-one Y2H assays were used to verify the interactions of these three proteins with
both CMYA1 and the Xin-repeats, and the results confirmed their interactions. Two universal
ribosomal proteins that are highly abundant were also identified. Based on these results, the authors
speculated that the Xin-repeats may play a role in translation via interactions between the
ribosome and the cytoskeleton.
The combination of Y2H and bioinformatic analysis showed that two putative protein
candidates, PHYSODRAFT_291312 and PHYSODRAFT_356433, are potential
interactive partners of Ps1365. PHYSODRAFT_291312 was predicted not to contain any
transmembrane domain and may be a globular protein with one or two α-helices, which
implies the possibility of being a transcription factor. The molecular weight of the
PHYSODRAFT_291312 protein is 8,288 Da, which is lower than 40 kDa, the upper limit for
molecules to pass freely through the nuclear envelope (Zanta et al., 1999; Kalderon et al., 1984).
Thus, the PHYSODRAFT_291312 protein may be able to enter the nucleus freely without
a nuclear localization signal. These predictions partly support our hypothesis that
PHYSODRAFT_291312 is a transcription factor or at least a nuclear protein that interacts
with Ps1365. Signal peptide and subcellular localization prediction tools provided varying
results, which may reflect the limitations of current tools in
predicting protein localization (Min, 2010). Co-expression analysis of Ps1365 and
PHYSODRAFT_291312 using existing transcriptome data available in FungiDB showed that
the two proteins are co-expressed in mycelium and during infection. The expression levels of Ps1365 and PHYSODRAFT_291312 differ, but two interacting proteins need not be present in an exact one-to-one ratio, or in any fixed ratio, to interact with each other. They may also have different half-lives and/or interact with several other proteins. In addition, the
PHYSODRAFT_291312 protein may be stored as a reserve for future interaction with
Ps1365, or PHYSODRAFT_291312 may perform functions other than interacting with
Ps1365. Taken together, these results support the possibility that these two proteins form a complex to perform a certain function together.
Secondary structure prediction showed that PHYSODRAFT_356433 may have more than one α-helix, which implies the possible existence of HTH and HLH domains, the characteristic structures of transcription factors or at least of nucleic acid-binding proteins
(Jones, 2004; Brennan & Matthews, 1989). Results from Phyre2 analysis showed that
PHYSODRAFT_356433 may be an RNA-binding protein, but with a low confidence level of
20.7%. Together, our data support the assumption that PHYSODRAFT_356433 may be a nucleic acid-binding protein, which is a necessary attribute of transcription factors. The molecular weight of the deduced PHYSODRAFT_356433 protein is 11,737.68 Da, which is lower than 40 kDa; thus it may be able to enter the nucleus freely without an NLS to interact with other proteins, including Ps1365 (Zanta et al., 1999; Kalderon et al., 1984). Existing transcriptome data available in FungiDB showed similar expression patterns for Ps1365 and
PHYSODRAFT_356433 in mycelium and cyst, which supports the hypothesis that Ps1365 interacts with PHYSODRAFT_356433.
Validation of the interactions of Ps1365 with PHYSODRAFT_291312, and with
PHYSODRAFT_356433 requires at least an individual one-to-one Y2H assay of each pair.
Other methods that could be used to confirm protein-protein interactions in P. sojae include
co-immunoprecipitation (co-IP), pull-down assays and crosslinking analysis.
Two ribosomal RNAs were also identified as interactors of Ps1365. The presence
of ribosomal RNAs among the Ps1365 interactors could be explained in two ways. First, it may
reflect an innate weakness of Y2H, namely false positives (Vidalain, Boxem, Ge, Li & Vidal, 2004).
Second, the hybrid yeast cells may have highly expressed the plasmids that contain
ribosomal RNA genes, thus allowing rRNA to be recovered as an “interactive protein”. The yeast two-hybrid assay is known to produce some false positives (Lai & Lau, 2017; Xin et al., 2017).
In some systems, Y2H has failed to show genuine interactions between proteins. For example, in the work of Strausak et al. (2003), Y2H did not show the interaction between Atox1 and
MBS5/6, while surface plasmon resonance (SPR) analysis successfully showed the interaction. They attributed this weakness of Y2H to its nature as an indirect method of measuring protein-protein interactions, because detection depends on the formation of an active transcription factor in yeast. Therefore, a Y2H assay may fail to detect many interactions or yield false positive results. According to the authors, factors such as salt concentration and pH may interfere with the formation of an intact transcription factor and thus result in false positives and/or false negatives. Therefore, they preferred a real-time SPR method. In addition, Chua et al. (2012) failed to show the interaction between PfAha1 and PfHsp90 by Y2H. They also mentioned that Y2H is not able to indicate the interactions between PfHsp90 and many putative co-chaperones (Chua, Low,
Lehming, & Sim, 2012). These previous studies show that Y2H is prone to false negatives, which may explain why so few candidate proteins were obtained in this study.
Because some potential interactive proteins of Ps1365 may have escaped detection in this Y2H analysis, the P. sojae protein library could also be screened using methods such as protein probing or phage display (Phizicky and Fields, 1995). Weak and transient interactions that may have been missed could be analyzed by crosslinking protein interaction
analysis as well as label transfer protein interaction analysis (Golemis, 2002; Phizicky and
Fields, 1995).
Possible functions of the bait protein (Ps1365) and the two potential prey proteins may
be deduced from their transcription patterns. The transcriptome data from FungiDB indicated
that the highest transcription of the three genes occurs during the mycelium (mature) stage, and
this pattern implies that the three proteins may be essential for vegetative
development as well as the invasion of hyphae into host tissues, and even cells. In other words, the
three proteins may regulate the expression of proteins that perform non-reproductive growth and developmental functions. Secondary structure predictions of the three proteins hinted at the possible existence of HLH or HTH domains. Because the majority of the known HLH and
HTH proteins perform functions not directly related to reproductive cell formation (Jones, 2004;
Norton, 2000; Rosinski & Atchley, 1999), the predicted secondary structures of the three proteins are consistent with their presumptive functions.
The findings of this research may have practical value in controlling the deadly soybean pathogen P. sojae. Because transcription is one of the most important processes for all living organisms, and transcription factors are the key players in eukaryotic transcriptional regulation, a target species could be controlled by controlling its transcription factors. Bao et al. (2019) found that symptoms such as stunting in infected potatoes result from RNA-directed silencing of the
potato transcription factor StTCP23 by the pathogen Potato spindle tuber viroid (PSTVd),
and their findings established that control of transcription factors can alter, even manipulate,
the target organism. In this study, three potential transcription factors from P. sojae were
theoretically predicted. These three potential transcription factors may act as biological targets
for future studies on controlling P. sojae. Such studies may involve methods including
Host-Induced Gene Silencing (HIGS) (Baulcombe, 2015; Goulin et al., 2019; Qi et al., 2019) and
small-molecule inhibition (Berg, 2008; Fontaine et al., 2017) to manipulate the three potential
P. sojae TFs and destroy this disastrous and stubborn pathogen.
In conclusion, in this study we obtained two candidate prey proteins as well as two
rRNAs by Y2H. Initial analysis indicated that Ps1365 may interact with the two candidate
prey proteins PHYSODRAFT_291312 and PHYSODRAFT_356433. One-to-one Y2H assays as well as co-immunoprecipitation or pull-down assays and western blotting could be applied to confirm these interactions.
References
Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z., Miller, W., & Lipman,
D. J. (1997). Gapped BLAST and PSI-BLAST: A new generation of protein database
search programs. Nucleic Acids Research, 25(17), 3389-3402.
doi:10.1093/nar/25.17.3389
Altschul, S. F., Wootton, J. C., Gertz, E. M., Agarwala, R., Morgulis, A., Schäffer, A. A., &
Yu, Y. (2005). Protein database searches using compositionally adjusted substitution
matrices. The FEBS Journal, 272(20), 5101-5109.
doi:10.1111/j.1742-4658.2005.04945.x
Bannai, H., Tamada, Y., Maruyama, O., Nakai, K., & Miyano, S. (2002). Extensive feature
detection of N-terminal protein sorting signals. Bioinformatics, 18(2), 298-305.
doi:10.1093/bioinformatics/18.2.298
Bannai, H., Tamada, Y., Maruyama, O., Nakai, K., & Miyano, S. (2001). Views:
Fundamental building blocks in the process of knowledge discovery. Paper presented
at the Fourteenth International Florida Artificial Intelligence Research Society
Conference,
Bao, S., Owens, R. A., Sun, Q., Song, H., Liu, Y., Eamens, A. L., . . . Zhang, R. (2019).
Silencing of transcription factor encoding gene StTCP23 by small RNAs derived from
the virulence modulating region of potato spindle tuber viroid is associated with
symptom development in potato. PLoS Pathogens, 15(12), e1008110.
doi:10.1371/journal.ppat.1008110
Baulcombe, D. C. (2015). VIGS, HIGS and FIGS: Small RNA silencing in the interactions of
viruses or filamentous organisms with their plant hosts. Current Opinion in Plant
Biology, 26, 141-146. doi:10.1016/j.pbi.2015.06.007
Berg, T. (2008). Inhibition of transcription factors with small organic molecules. Current
Opinion in Chemical Biology, 12(4), 464-471. doi:10.1016/j.cbpa.2008.07.023
Brennan, R. G., & Matthews, B. W. (1989). The helix-turn-helix DNA binding motif. The
Journal of Biological Chemistry, 264(4), 1903-1906.
Chitale, M., Hawkins, T., Park, C., & Kihara, D. (2009). ESG: Extended similarity group
method for automated protein function prediction. Bioinformatics, 25(14), 1739-1745.
doi:10.1093/bioinformatics/btp309
Chua, C. S., Low, H., Lehming, N., & Sim, T. S. (2012). Molecular analysis of Plasmodium
falciparum co-chaperone Aha1 supports its interaction with and regulation of Hsp90 in
the malaria parasite. International Journal of Biochemistry and Cell Biology, 44(1),
233-245. doi:10.1016/j.biocel.2011.10.021
Cserzö, M., Wallin, E., Simon, I., von Heijne, G., & Elofsson, A. (1997). Prediction of
transmembrane alpha-helices in prokaryotic membrane proteins: The dense alignment
surface method. Protein Engineering, 10(6), 673-676. doi:10.1093/protein/10.6.673
Drozdetskiy, A., Cole, C., Procter, J., & Barton, G. J. (2015). JPred4: A protein secondary
structure prediction server. Nucleic Acids Research, 43(W1), W389-W394.
doi:10.1093/nar/gkv332
Emanuelsson, O., Nielsen, H., Brunak, S., & von Heijne, G. (2000). Predicting subcellular
localization of proteins based on their N-terminal amino acid sequence. Journal of
Molecular Biology, 300(4), 1005-1016. doi:10.1006/jmbi.2000.3903
Finn, R. D., Coggill, P., Eberhardt, R. Y., Eddy, S. R., Mistry, J., Mitchell, A. L., . . .
Bateman, A. (2016). The Pfam protein families database: Towards a more sustainable
future. Nucleic Acids Research, 44(D1), D279-D285. doi:10.1093/nar/gkv1344
Fontaine, F., Overman, J., Moustaqil, M., Mamidyala, S., Salim, A., Narasimhan, K., . . .
Francois, M. (2017). Small-molecule inhibitors of the SOX18 transcription factor. Cell
Chemical Biology, 24(3), 346-359. doi:10.1016/j.chembiol.2017.01.003
Frank, K. (2013). Sequence and structure searches for biological molecules - development
and applications of bioinformatics methods in molecule biology (doctoral dissertation).
University of Salzburg, Salzburg, Austria
Tusnády, G. E., & Simon, I. (2001). The HMMTOP transmembrane topology
prediction server. Bioinformatics, 17(9), 849-850.
doi:10.1093/bioinformatics/17.9.849
Garnier, J., Gibrat, J. F., & Robson, B. (1996). GOR method for predicting protein secondary
structure from amino acid sequence. Methods in Enzymology, 266, 540-553.
Gasteiger, E., Hoogland, C., Gattiker, A., Duvaud, S., Wilkins, M., Appel, R., & Bairoch, A.
(2005). Protein identification and analysis tools on the ExPASy server. The
proteomics protocols handbook (pp. 571-607). Totowa, NJ: Humana Press. doi:571
Geer, L. Y., Domrachev, M., Lipman, D. J., & Bryant, S. H. (2002). CDART: Protein
homology by domain architecture. Genome Research, 12(10), 1619-1623.
doi:10.1101/gr.278202
George, R. A., Lin, K., & Heringa, J. (2005). Scooby-domain: Prediction of globular domains
in protein sequence. Nucleic Acids Research, 33(Web Server issue), W160-W163.
doi:10.1093/nar/gki381
Geourjon, C., & Deléage, G. (1995). SOPMA: Significant improvements in protein
secondary structure prediction by consensus prediction from multiple alignments.
Computer Applications in the Biosciences : CABIOS, 11(6), 681-684.
Pollastri, G., & McLysaght, A. (2005). PORTER: A new, accurate server for protein
secondary structure prediction. Bioinformatics, 21(8), 1719-1720.
doi:10.1093/bioinformatics/bti203
Golemis, E. (2002). Protein‐Protein interactions: A molecular cloning manual. Cold Spring
Harbor (New York): Cold Spring Harbor Laboratory Press.
Goulin, E. H., Galdeano, D. M., Granato, L. M., Matsumura, E. E., Dalio, R. J. D., &
Machado, M. A. (2019). RNA interference and CRISPR: Promising approaches to
better understand and control citrus pathogens. Microbiological Research, 226, 1-9.
doi:10.1016/j.micres.2019.03.006
Hawkins, T., Chitale, M., Luban, S., & Kihara, D. (2009). PFP: Automated prediction of gene
ontology functional annotations with confidence scores using protein sequence data.
Proteins, 74(3), 566-582. doi:10.1002/prot.22172
Hofmann, K., & Stoffel, W. (1993). TMBASE - A database of membrane spanning protein
segments [Abstract]. Biol. Chem. Hoppe-Seyler 374,166
Horton, P., Park, K., Obayashi, T., Fujita, N., Harada, H., Adams-Collier, C. J., & Nakai, K.
(2007). WoLF PSORT: Protein localization predictor. Nucleic Acids Research,
35(Web Server issue), W585-W587. doi:10.1093/nar/gkm259
Howe, K. L., Contreras-Moreira, B., De Silva, N., Maslen, G., Akanni, W., Allen, J., . . .
Flicek, P. (2020). Ensembl Genomes 2020 - enabling non-vertebrate genomic
research. Nucleic Acids Research, 48(D1), D689-D695. doi:10.1093/nar/gkz890
Huang, T., Hwang, J., Chen, C., Chu, C., Lee, C., & Chen, C. (2015). (PS)2: Protein structure
prediction server version 3.0. Nucleic Acids Research, 43(W1), W338-W342.
doi:10.1093/nar/gkv454
Jones, P., Binns, D., Chang, H., Fraser, M., Li, W., McAnulla, C., . . . Hunter, S. (2014).
InterProScan 5: Genome-scale protein function classification. Bioinformatics, 30(9),
1236-1240. doi:10.1093/bioinformatics/btu031
Jones, S. (2004). An overview of the basic helix-loop-helix proteins. Genome Biology, 5(6),
226.
Kalderon, D., Roberts, B. L., Richardson, W. D., & Smith, A. E. (1984). A short amino acid
sequence able to specify nuclear location. Cell, 39(3), 499-509.
doi:10.1016/0092-8674(84)90457-4
Källberg, M., Wang, H., Wang, S., Peng, J., Wang, Z., Lu, H., & Xu, J. (2012).
Template-based protein structure modeling using the RaptorX web server. Nature
Protocols, 7(8), 1511-1522. doi:10.1038/nprot.2012.085
Kaur, I., Kocher, G., & Gupta, V. (2012). Molecular cloning and nucleotide sequence of the
gene for an alkaline protease from Bacillus circulans MTCC 7906. Indian Journal of
Microbiology, 52(4), 630-637. doi:10.1007/s12088-012-0297-4
Kelley, L. A., Mezulis, S., Yates, C. M., Wass, M. N., & Sternberg, M. J. E. (2015). The
Phyre2 web portal for protein modeling, prediction and analysis. Nature Protocols,
10(6), 845-858. doi:10.1038/nprot.2015.053
Kersey, P. J., Allen, J. E., Allot, A., Barba, M., Boddu, S., Bolt, B. J., . . . Yates, A. (2018).
Ensembl Genomes 2018: An integrated omics infrastructure for non-vertebrate species.
Nucleic Acids Research, 46(D1), D802-D808. doi:10.1093/nar/gkx1011
Krogh, A., Larsson, B., von Heijne, G., & Sonnhammer, E. L. L. (2001). Predicting
transmembrane protein topology with a hidden markov model: Application to
complete genomes. Journal of Molecular Biology, 305(3), 567-580.
doi:10.1006/jmbi.2000.4315
Kyte, J., & Doolittle, R. F. (1982). A simple method for displaying the hydropathic character
of a protein. Journal of Molecular Biology, 157(1), 105-132.
doi:10.1016/0022-2836(82)90515-0
Lai, M., & Lau, Y. (2017). Screening and identification of host proteins interacting with
Toxoplasma gondii SAG2 by yeast two-hybrid assay. Parasites & Vectors, 10(1),
456-458. doi:10.1186/s13071-017-2387-y
Letunic, I., & Bork, P. (2018). 20 years of the SMART protein domain annotation resource.
Nucleic Acids Research, 46(D1), D493-D496. doi:10.1093/nar/gkx922
Marchler-Bauer, A., & Bryant, S. H. (2004). CD-search: Protein domain annotations on the
fly. Nucleic Acids Research, 32(Web Server issue), W327-W331.
doi:10.1093/nar/gkh454
Marchler-Bauer, A., Bo, Y., Han, L., He, J., Lanczycki, C. J., Lu, S., . . . Bryant, S. H. (2017).
CDD/SPARCLE: Functional classification of proteins via subfamily domain
architectures. Nucleic Acids Research, 45(D1), D200-D203. doi:10.1093/nar/gkw1129
Marchler-Bauer, A., Derbyshire, M. K., Gonzales, N. R., Lu, S., Chitsaz, F., Geer, L. Y., . . .
Bryant, S. H. (2015). CDD: NCBI's conserved domain database. Nucleic Acids
Research, 43(Database issue), D222-D226. doi:10.1093/nar/gku1221
Marchler-Bauer, A., Lu, S., Anderson, J. B., Chitsaz, F., Derbyshire, M. K., DeWeese-Scott,
C., . . . Bryant, S. H. (2011). CDD: A conserved domain database for the functional
annotation of proteins. Nucleic Acids Research, 39(Database issue), D225-D229.
doi:10.1093/nar/gkq1189
Min, X. J. (2010). Evaluation of computational methods for secreted protein prediction in
different eukaryotes. Journal of Proteomics & Bioinformatics, 3(4), 143-147.
doi:10.4172/jpb.1000133
Mitchell, A. L., Attwood, T. K., Babbitt, P. C., Blum, M., Bork, P., Bridge, A., . . . Finn, R. D.
(2019). InterPro in 2019: Improving coverage, classification and access to protein
sequence annotations. Nucleic Acids Research, 47(D1), D351-D360.
doi:10.1093/nar/gky1100
Norton, J. D. (2000). ID helix-loop-helix proteins in cell growth, differentiation and
tumorigenesis. Journal of Cell Science, 113(Pt 22), 3897-3905.
Pang, C. N. I., Lin, K., Wouters, M. A., Heringa, J., & George, R. A. (2008). Identifying
foldable regions in protein sequence from the hydrophobic signal. Nucleic Acids
Research, 36(2), 578-588. doi:10.1093/nar/gkm1070
Peng, J., & Xu, J. (2011). RaptorX: Exploiting structure information for protein alignment by
statistical inference. Proteins: Structure, Function, and Bioinformatics, 79(S10),
161-171. doi:10.1002/prot.23175
Petersen, B., Lundegaard, C., & Petersen, T. N. (2010). NetTurnP – neural network
prediction of beta-turns by use of evolutionary information and predicted protein
sequence features. PLoS One, 5(11), e15079. doi:10.1371/journal.pone.0015079
Petersen, B., Petersen, T. N., Andersen, P., Nielsen, M., & Lundegaard, C. (2009). A generic
method for assignment of reliability scores applied to solvent accessibility predictions.
BMC Structural Biology, 9(1), 51. doi:10.1186/1472-6807-9-51
Petersen, T. N., Brunak, S., von Heijne, G., & Nielsen, H. (2011). SignalP 4.0:
Discriminating signal peptides from transmembrane regions. Nature Methods, 8(10),
785-786. doi:10.1038/nmeth.1701
Phizicky, E. M., & Fields, S. (1995). Protein-protein interactions: Methods for detection and
analysis. Microbiological Reviews, 59(1), 94-123.
Pollastri, G., & McLysaght, A. (2005). PORTER: A new, accurate server for protein
secondary structure prediction. Bioinformatics (Oxford, England), 21(8), 1719-1720.
doi:10.1093/bioinformatics/bti203
Qi, T., Guo, J., Peng, H., Liu, P., Kang, Z., & Guo, J. (2019). Host-induced gene silencing: A
powerful strategy to control diseases of wheat and barley. International Journal of
Molecular Sciences, 20(1), 206. doi:10.3390/ijms20010206
Rosinski, J. A., & Atchley, W. R. (1999). Molecular evolution of helix–turn–helix proteins.
Journal of Molecular Evolution, 49(3), 301-309. doi:10.1007/PL00006552
Rutter, B. D. (2012). Catch of the day: a yeast one-hybrid assay identifies a novel
DNA-binding domain in Phytophthora sojae (Master's thesis, Bowling Green State
University).
Schultz, J., Milpetz, F., Bork, P., & Ponting, C. P. (1998). SMART, a simple modular
architecture research tool: Identification of signaling domains. Proceedings of the
National Academy of Sciences of the United States of America, 95(11), 5857-5864.
doi:10.1073/pnas.95.11.5857
Sen, T. Z., Jernigan, R. L., Garnier, J., & Kloczkowski, A. (2005). GOR V server for protein
secondary structure prediction. Bioinformatics (Oxford, England), 21(11), 2787-2788.
doi:10.1093/bioinformatics/bti408
Sonnhammer, E. L., von Heijne, G., & Krogh, A. (1998). A hidden Markov model for
predicting transmembrane helices in protein sequences. Proceedings. International
Conference on Intelligent Systems for Molecular Biology, 6, 175-182.
Stajich, J. E., Harris, T., Brunk, B. P., Brestelli, J., Fischer, S., Harb, O. S., . . . Roos, D. S.
(2012). FungiDB: An integrated functional genomics database for fungi. Nucleic Acids
Research, 40(D1), D675-D681. doi:10.1093/nar/gkr918
Strausak, D., Howie, M. K., Firth, S. D., Schlicksupp, A., Pipkorn, R., Multhaup, G., &
Mercer, J. F. B. (2003). Kinetic analysis of the interaction of the copper chaperone
Atox1 with the metal binding sites of the Menkes protein. Journal of Biological
Chemistry, 278(23), 20821-20827. doi:10.1074/jbc.M212437200
Tian, M., Huitema, E., Cunha, L. d., Torto-Alalibo, T., & Kamoun, S. (2004). A Kazal-like
extracellular serine protease inhibitor from Phytophthora infestans targets the tomato
pathogenesis-related protease P69B. Journal of Biological Chemistry, 279(25),
26370-26377. doi:10.1074/jbc.M400941200
Tusnády, G. E., & Simon, I. (1998). Principles governing amino acid composition of integral
membrane proteins: Application to topology prediction. Journal of Molecular Biology,
283(2), 489-506. doi:10.1006/jmbi.1998.2107
Tusnády, G. E., & Simon, I. (2001). The HMMTOP transmembrane topology prediction
server. Bioinformatics (Oxford, England), 17(9), 849-850.
doi:10.1093/bioinformatics/17.9.849
Tyler, B. M., Tripathy, S., Zhang, X., Dehal, P., Jiang, R. H. Y., Aerts, A., . . . Boore, J. L.
(2006). Phytophthora genome sequences uncover evolutionary origins and
mechanisms of pathogenesis. Science, 313(5791), 1261-1266.
doi:10.1126/science.1128796
Vidalain, P., Boxem, M., Ge, H., Li, S., & Vidal, M. (2004). Increasing specificity in
high-throughput yeast two-hybrid experiments. Methods, 32(4), 363-370.
doi:10.1016/j.ymeth.2003.10.001
Xin, X., Wang, T., Liu, X., Sui, G., Jin, C., Yue, Y., . . . Guo, H. (2017). A yeast two-hybrid
assay reveals CMYA1 interacting proteins. Comptes Rendus - Biologies, 340(6-7),
314-323. doi:10.1016/j.crvi.2017.06.003
Ye, W., Wang, X., Tao, K., Lu, Y., Dai, T., Dong, S., . . . Wang, Y. (2011). Digital gene
expression profiling of the Phytophthora sojae transcriptome. Molecular
Plant-Microbe Interactions: MPMI, 24(12), 1530-1539.
doi:10.1094/MPMI-05-11-0106
Yu, C., Chen, Y., Lu, C., & Hwang, J. (2006). Prediction of protein subcellular localization.
Proteins: Structure, Function, and Bioinformatics, 64(3), 643-651.
doi:10.1002/prot.21018
Yu, C., Lin, C., & Hwang, J. (2004). Predicting subcellular localization of proteins for
Gram-negative bacteria by support vector machines based on n-peptide
compositions. Protein Science, 13(5), 1402-1406. doi:10.1110/ps.03479604
Zanta, M. A., Belguise-Valladier, P., & Behr, J. (1999). Gene delivery: A single nuclear
localization signal peptide is sufficient to carry DNA to the cell nucleus. Proceedings
of the National Academy of Sciences of the United States of America, 96(1), 91-96.
doi:10.1073/pnas.96.1.91
Zhang, Z., Schwartz, S., Wagner, L., & Miller, W. (2000). A greedy algorithm for aligning
DNA sequences. Journal of Computational Biology: A Journal of Computational
Molecular Cell Biology, 7(1-2), 203-214.
APPENDIX A. CONSENSUS SEQUENCES OF THE FOUR PREY CONTIGS OBTAINED FROM SEQUENCHER
Contig[0001] CTCAAAGATTAAGCCATGCATGTCTAAGTATAAACACTTTTGTACTGTGA AACTGCGAATGGCTCATTATATCAGTTATAGTCTACTCGATAGTACCTTA CTACTTGGATACCCGTAGTAATTCTAGAGCTAATACATGCATAAATACCC AACTGCTTGTCGGGCGGGTAGCATTTATTAGATTGAAACCAATGCAGTCT TCGGGCTGGTATTGTGTTGAGTCATAATAACTGTGCGGATCGCGCTTTTG CGCGATAAATCGATTGAGTTTCTGCCCTATCAGCTTTGGATGGTAGGATA TGGGCCTACCATGGCATTAACGGGTAACGGGGAATTAGGGTTTGATTCCG GAGAGGGAGCCTTAGAAACGGCTACCACATCCAAGGAAGGCAGCAGGCGCGTAA ATTACCCAATCCTGACACAGGGAGGTAGTGACAATAAATAACAATG CTCTGGCTCTTCGAGTCGGGCAATTGGAATGAGAACAATTTAAATCCCTT AACGAGGATCAATTGGAGGGCAAGTCTGGTGCCAGCAGCCGCGGTAATTC CAGCTCCAATAGCGTATATTAAAGTTGTTGCAGTTAAAAAGCTCGTAGTT GGATTTCTGTTTTGGATGTCCGGTCCGCTCCCTCTGGGAGTGCGTACTTA TGGATGTTCGAGGCATTTTT:TGTGAGGCTGCCTTTCTGCCATTAAGTTG GTGGGTTGGTGGGCTTGCATCGTTTACTGTGAAA:A:AATTAGAGTGTTT AAAGCAGGCGTTTGCTCATTTGAATACATTAGCATGGAATAATAAGATAC GGCCTTGGTGGTCTATTTTGTTGGTTTGCACACCAGGGTAATGATTAATA GGGACAGTTGGGGGTATTCATATTTCAGCGTCAGAGGTGAAATTCTTGGA TCGCTGAAAGATGAGCTTAGGCGAAAGCATTTACCAAGGATGTTTTCATT AATCAAAAAAAAAAAA
Contig[0003] GTAAGCTCGTCTGGCGATGCGGGATGAACCGAACGCAAAGTTAAAGTGCC AAAATGCCCGCTCACCTAGATACCATAAAAGGTGTTGGTTCATTTAGACA GCAGGACGGTGGCCATGGAAGTCGGAATCCGCTAAGGAGTGTGTAACAAC TCACCTGCCGAATGAACTAGCCCTGAAAATGGATGGCGCTAAAGCGGGCT ACTTATACTTTGCCGTTAGCGCGATGGCATATGTAATGTACGCGTTAATG AGTAGGAGGGCGTGAGGGTCGTGAAGAAGCTCGTGGCGTGAGCCTGAGTGAAAC GGCCTTTAGTGCAGATCTTGGTGGTAGTAGCAAATATTCAAATGCG AACTTTGAAGACTGAAGTGGAGAAAGGTTCCGTGTGATCAGTAATTGGAC ACGGGTGAGTCGATCCTAAGAGATAGGATAATTCCGTTTGAAAGCTGGGC TGTTTGGGGCGTGGGGTTCGTCCTCGCGTTTTGTATGCAGCCTCGTGTAT CGAAAGGGAATCAGGTTAATATTCCTGAACCGAGACGTGGATAATGTGTG GCAACACAACCGACCCAGGAGACGCTAGCGTGTGCCCAGGGAAGAGTTCT CTTTTCTTTTTAACAAGCATGTCCTGCCTTGGAAACGGATTATCCGGCGA TAAGGAGGACGCCTTGGCAAAGCACTTTACTTTTTGAAGTGTCCTGTGCG CCCACGACAGCCCTTGAAAATCCTGG:CTAGAGATTTATTCGCACCCTCG TTCGTACTCATAACCGCATCAGGTCTCCAAGGTTAGCAGCCTCTAGTTGA TAGAAGAATGTAGATAA:GGGAA:GT:CGGCAAAATAGATCCGTAACTTC GGAAAAAAAAAAAAAAA
Contig[0007] GAGGAAAAGAAACTAACAAGGATTCCCCTAGTAACGGCGAGTGAAGCGGG AAGAGCTCAAGCTTAAAATCTCCGTGCAAGTTTTGCGCGGCGAATTGTAG TCTATAGAGGCGTGGTCAGCGTGGGCGCTTGGGGCAAGTTCCTTGGAGGA GGACAGCATGGAGGGTGATACTCCCGTTCATCCCTGAGTGGCTCGTGCGT ACGACCCGTGTTCTTTGAGTCGCGTTGTTTGGGAATGCAGCGCAAAGTAG GTGGTAAATTCCATCTAAAGCTAAATATTGGTGCGAGACCGATAGCGAAC AAGTACCGTGAGGGAAAGATGAAAAGAACTTTAAAAAAAAAAAA
Contig[0074] GACGGTGTTGACACAATGTGATTTCTGCCCAGTGCTCTGAATGTCAAAGT GAAGAGATTCAACCAAGCGCGGGTAAACGGCGGGAGTAACTATGACTCTC TTAAGGTAGCCAAATGCCTCGTCATCTAACTGTGACGCGCATGAATGGAT TAATGAGATTCCCACTGTCCCTATCTACCGTCTAGCGAACCCACAGCCAA GGGAACGGGCTTGGAATAATCAGCGGGGAAAGAAGACCCTGTTGAGCTTG ACTCTAGTCTGACGTTGTGAAATGACTTGGGAGGTGTAGAATAAGTGGGA GGCTCTGCCGTCCGTGAAATACCACTACTCTCAATGTTGTTTTACTTATA CCATGATGTAGCATCTGGGTGGCCTCGCGGTCGCTCAGTTCTAGTTGTAA TCTCGGGCTTTTGTCTGGGTGAGCTACGTGGAAGACATCGTCAGGTGGGG AGTTTGGCTGGGGCGGCACATCTGTTAAATGATAACACAGGTGTCCTAAG GTGAGCTCAATGAGAACAGAAATCTCATGTAGAACAAAAGGGTAAAAGCT CACTTGATTTTGATTTTCAGTATGAATACAAACCGTGAAAGCGTGGCCTA TCGATCCTTTAGTTCTTTAGAATTTTAAGCTAGAGGTGTCAGAAAAGTTA CCACAGGGATAACTGGCTTGTGGCAGCCAAGCGTCCATAGCGACGTTGCT TTTTGATTCTTCGATGTCGGCTCTTCCTATCATTGCGAAGTAGAACTCGC CAATTGTTGGATTGTTCACCCACTAATAGGGAACGTGAGCTGGGTTTAGA CCGTCGTGAGACAGGTTAGTTTTACCCTACTGATGAGTTCGTTGTCTAAA CAGTAATCCAACCCAGTACGAGAGGAACCGTTGGTTCAGATAATTGGTAA CTGCGGTTAGCTGAAAAGCTAGTGCCGCCAAGCTACCATCTGTAGGATTA TGGCTGAAC:CCTCTAAGTCAGAATCCATGCTGGAATAGACGATATCACC TTTCCGA:TGTGCCGCGAATAGCGATAG:GTGTCTTTCTGGGACCCANCA TCATA:CAA:TGCAACGCACTCCGCTGCC:GTCAGTG:CTGGCG:NGAA: TTAGGCTG:AC:TT:GTAATTYCAAAAATA:TGTGGGGAANN