
Genes

Review

Beyond Trees: Regulons and Regulatory Motif Characterization

Xuhua Xia 1,2

1 Department of Biology, University of Ottawa, Ottawa, ON K1N 6N5, Canada; [email protected]
2 Ottawa Institute of Systems Biology, Ottawa, ON K1H 8M5, Canada

Received: 24 May 2020; Accepted: 24 August 2020; Published: 25 August 2020

Abstract: Trees and their seeds regulate their germination, growth, and reproduction in response to environmental stimuli. These stimuli, through signal transduction, trigger transcription factors that alter the expression of various genes, leading to the unfolding of the genetic program. A regulon is conceptually defined as a set of target genes regulated by a transcription factor that physically binds to regulatory motifs to accomplish a specific biological function, such as the CO-FT regulon for flowering timing and fall growth cessation in trees. Only with a clear characterization of regulatory motifs can candidate target genes be experimentally validated, but motif characterization represents the weakest feature of regulon research, especially in tree genetics. I review here relevant experimental and bioinformatics approaches in characterizing transcription factors and their binding sites, outline problems in tree regulon research, and demonstrate how transcription factor databases can be effectively used to aid the characterization of tree regulons.

Keywords: gene expression; transcription factor; regulon; regulatory motifs; comparative genomics; Gibbs sampler

1. Introduction

Research in tree genetics, especially in gene regulation networks, has benefitted much from looking beyond trees. The finding of the animal c-MYB family of genes led to the discovery of the R2R3-MYB family of transcription factors in Arabidopsis thaliana [1,2], and the understanding of how A. thaliana responds to cold through gene regulation sheds light on how tree species such as Malus domestica respond to cold [3].
I present a conceptual framework here to highlight some shortcomings in tree regulon studies and to facilitate better experimental designs in the future.

The term regulon was originally proposed [4] as an extension to operon, in the context of a repressor controlling the expression of multiple genes that are not located within the same operon. For example, the eight genes involved in the biosynthesis of arginine are scattered in five separate genomic locations in the genome of Escherichia coli, but are controlled by the same arginine repressor [4]. Since its early usage, “regulon” has been associated with specific biological processes, e.g., the argR (the E. coli gene encoding the arginine repressor) regulon for arginine biosynthesis, or the CO-FT regulon for flowering timing and seasonal growth in Arabidopsis thaliana [5] and in trees [6].

The adoption of Arabidopsis thaliana as a model for research in plant molecular biology has resulted in the discovery and characterization of regulons not only in Arabidopsis thaliana and related plants, but also in various tree species. Some regulons function similarly, such as the CBF regulon in plant response to low temperature [3,7,8], which is gated by a circadian clock in both Arabidopsis thaliana [9,10] and peach trees [11]. However, some other regulons appear to function differently. For example, the CO-FT regulon controls genes involved in seasonal growth and flowering time in Arabidopsis thaliana [5], but flowering and growth cessation appear to be two different processes in Populus species. While flowering time is still under the control of the CO-FT regulon [6], seasonal growth cessation is regulated by an additional set of genes [12].

Genes 2020, 11, 995; doi:10.3390/genes11090995; www.mdpi.com/journal/genes

Many different experimental and bioinformatics methods have been used to discover and characterize regulons, but the large-scale approach of combining ChIP-seq and RNA-seq technology has been the most frequently used in recent years [13,14], evolving from the microarray method that had been used successfully in the characterization of the CESA regulon for cellulose biosynthesis [15,16]. New derivatives from ChIP-seq, such as ChIP-exo [17,18], ChEC-seq [19], and Cut&Run [20], as well as new technologies such as DamID-seq [21], have been developed in recent years. I evaluate these experimental methods and bioinformatics algorithms after a formal definition of regulon and regulon network.

2. A Formal Definition of Transcription Regulon and Regulon Network

A conceptually ideal transcription regulon should have four features: (1) a transcription factor, (2) one or more target genes whose expression is regulated by the transcription factor, (3) a set of conserved transcription factor binding sites (TFBS) on the target genes, and (4) specific biological functions accomplished by altering the expression of the target genes. All regulons I mention in this review are transcription regulons.

The importance of feature 3 above is illustrated in Figure 1. Suppose transcription factor A (tfA) can bind to TFBS X and regulate three target genes that share TFBS X: T1, T2, and the gene tfB encoding transcription factor B. The tfB protein can in turn bind to TFBS Y to control the expression of target genes T4 and T5, which share TFBS Y in their regulatory region. Criterion 3 implies that the tfA-regulon includes target genes T1, T2, and tfB, but not T4 and T5, whose TFBS Y is bound by tfB but not by tfA. In contrast, the term “coregulated genes” would typically include T1, T2, tfB, T4, and T5 as coregulated directly by tfA or indirectly by tfA through tfB.

Figure 1. Illustration of a regulon network of two interacting regulons. (A) tfA-regulon with target genes T1, T2, and tfB, and tfB-regulon with target genes T4 and T5. (B) Legends of graphic elements. TFBS: transcription factor binding site. (C) A simplified representation of the two regulons. (D) A partial regulon network [3] for coping with cold stress in apple (Malus domestica), in which TFBSs remain poorly characterized.

In rare cases, two different TFs represent nearly identical paralogues and feature the same DNA-binding affinity. Examples include MSN2 on chromosome 13 and MSN4 on chromosome 11 of the yeast Saccharomyces cerevisiae [22], and MdMYB124 (MYB88, transcription factor MYB51) on chromosome 13 and MdMYB88 (LOC103402919, transcription factor MYB88) on chromosome 16 of the apple [3] shown in Figure 1D.
They should be designated as the MSN2/MSN4 regulon and the MdMYB124/MdMYB88 regulon, respectively. Such redundancy may enhance regulon reliability in responding to
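
The distinction drawn in Section 2 between a regulon and a set of coregulated genes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary `DIRECT_TARGETS` and the functions `regulon` and `coregulated` are hypothetical names introduced here, and the data simply encode the tfA/tfB example of Figure 1A (tfA binds TFBS X on T1, T2, and tfB; tfB binds TFBS Y on T4 and T5).

```python
from collections import deque

# Hypothetical encoding of the Figure 1A example: each transcription factor
# maps to the genes that carry its binding site (tfA binds TFBS X, tfB binds
# TFBS Y). The gene names come from the text; the structure is an assumption.
DIRECT_TARGETS = {
    "tfA": {"T1", "T2", "tfB"},
    "tfB": {"T4", "T5"},
}


def regulon(tf, direct_targets=DIRECT_TARGETS):
    """Return the genes whose TFBS is physically bound by tf (criterion 3)."""
    return set(direct_targets.get(tf, set()))


def coregulated(tf, direct_targets=DIRECT_TARGETS):
    """Return genes regulated by tf directly or through downstream TFs."""
    seen = set()
    queue = deque([tf])
    while queue:
        current = queue.popleft()
        for gene in direct_targets.get(current, ()):
            if gene not in seen:
                seen.add(gene)
                queue.append(gene)  # a target may itself encode a TF
    return seen


print(sorted(regulon("tfA")))      # ['T1', 'T2', 'tfB']
print(sorted(coregulated("tfA")))  # ['T1', 'T2', 'T4', 'T5', 'tfB']
```

The regulon is read off the direct binding relation alone, whereas the coregulated set is the transitive closure of that relation. This is why T4 and T5 appear in the second result but not the first.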