The R2R3-MYB Gene Family in Banana (Musa acuminata): Genome-Wide Identification, Classification and Expression Patterns
bioRxiv preprint doi: https://doi.org/10.1101/2020.02.03.932046; this version posted July 29, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Boas Pucker1, Ashutosh Pandey1,2, Bernd Weisshaar1, Ralf Stracke1,*

1 Bielefeld University, Faculty of Biology, Genetics and Genomics of Plants, Bielefeld, Germany
2 National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, India

* Corresponding author
E-mail: [email protected]

Abstract
The R2R3-MYB genes comprise one of the largest transcription factor gene families in plants, playing regulatory roles in plant-specific developmental processes, defense responses and metabolite accumulation. To date, MYB family genes have not been comprehensively identified in the major staple fruit crop banana. In this study, we present a comprehensive, genome-wide analysis of the MYB genes from Musa acuminata DH-Pahang (A genome). A total of 285 R2R3-MYB genes as well as genes encoding three other classes of MYB proteins containing multiple MYB repeats were identified and characterised with respect to structure and chromosomal organisation. Organ- and development-specific expression patterns were determined from RNA-seq data. For 280 M. acuminata MYB genes for which expression was found in at least one of the analysed samples, a variety of expression patterns were detected. The M. acuminata R2R3-MYB genes were functionally categorised, leading to the identification of seven clades containing only M. acuminata R2R3-MYBs. The encoded proteins may have specialised functions that were acquired or expanded in Musa during genome evolution. This functional classification and expression analysis of the MYB gene family in banana establishes a solid foundation for future comprehensive functional analysis of MaMYBs and can be utilized in banana improvement programmes.

Keywords: Musa acuminata, Zingiberales, R2R3-MYB, transcription factor, gene family

Introduction
Banana (Musa spp.), including dessert and cooking types, is a staple fruit crop for a large part of the world population, especially in developing countries. The crop is grown in more than 100 countries throughout the tropics and sub-tropics, mainly in the African, Asia-Pacific, and Latin American and Caribbean regions [1]. Bananas provide an excellent source of energy and are rich in certain minerals and in vitamins A, C and B6. Furthermore, this perennial, monocotyledonous plant provides an important source of fibre, sugar, starch and cellulose (used for paper, textiles). Bananas have also been considered as a useful tool to deliver edible vaccines [2]. Certain agronomic traits, such as stress and pest resistance as well as fruit quality, are thus of considerable interest. Banana improvement through breeding has been challenging for various reasons. Therefore, genetic engineering-based optimisations hold great promise for crop improvement. For this purpose, candidate gene targets need to be identified. The release of a high quality banana genome sequence [3] provides a useful resource to understand functional genomics of important agronomic traits and to identify candidate genes to be utilized in banana improvement programmes.

Almost all biological processes in eukaryotic cells or organisms are influenced by transcriptional control of gene expression. Thus, the regulatory level is a good starting point for genetic engineering. Regulatory proteins are involved in transcriptional control, alone or complexed with other proteins, by activating or repressing (or both) the recruitment of RNA polymerase to promoters of specific genes [4]. These proteins are called transcription factors. As expected from their substantial regulatory complexity, transcription factors are numerous and diverse [5]. By binding to specific DNA sequence motifs and regulating gene expression, transcription factors control various regulatory and signaling networks involved in the development, growth and stress response of an organism.

One of the most widely distributed transcription factor families in eukaryotes is the MYB (myeloblastosis) protein family. In the plant kingdom, MYB proteins constitute one of the largest transcription factor families. MYB proteins are defined by a highly conserved MYB DNA-binding domain, mostly located at the N-terminus of the protein. The MYB domain generally consists of up to four imperfect amino acid sequence repeats (R) of about 50-53 amino acids, each forming three alpha-helices [summarised in 6]. The second and third helices of each repeat build a helix-turn-helix (HTH) structure with three regularly spaced tryptophan (or hydrophobic) residues, forming a hydrophobic core [7].
The third helix of each repeat was identified as the DNA recognition helix that makes direct contact with DNA [8]. During DNA contact, two MYB repeats are closely packed in the major groove, so that the two recognition helices bind cooperatively to the specific DNA recognition sequence motif.

In contrast to vertebrate genomes, which only encode MYB transcription factors with three repeats, plants have different MYB domain organisations, comprising one to four repeats [6, 9]. R2R3-MYBs, which are MYB proteins with two repeats (named according to repeat numbering in vertebrate MYBs), are particularly expanded in plant genomes. Copy numbers range from 45 unique R2R3-MYBs in Ginkgo biloba [10] to 360 in Mexican cotton (Gossypium hirsutum) [11]. The expansion of the R2R3-MYB family was coupled with a widening of the functional diversity of R2R3-MYBs, which are considered to regulate mainly plant-specific processes including secondary metabolism, stress responses and development [6]. As expected, R2R3-MYBs are involved in regulating several biological traits, for example wine quality, fruit color, cotton fibre length, pollinator preferences and nodulation in legumes.

In this study, we have used genomic resources to systematically identify members of the M. acuminata (A genome) R2R3-MYB gene family. We used knowledge from other plant species, including the model plant A. thaliana, leading to a functional classification of the banana R2R3-MYB genes based on the MYB phylogeny. Furthermore, RNA-seq data were used to analyse expression in different M. acuminata organs and developmental stages and to compare expression patterns of closely grouped co-orthologs. The identification and functional characterization of the R2R3-MYB gene family from banana will provide an insight into the regulatory aspects of different biochemical and physiological processes, such as those operating during fruit ripening and in response to various environmental stresses. Our findings offer a first step towards further investigations on the biological and molecular functions of MYB transcription factors and the selection of genes responsible for economically important traits in Musa, which can be utilized in banana improvement programmes.

Material and Methods
Search for MYB protein coding genes in the M. acuminata genome
A consensus R2R3-MYB DNA binding domain sequence [12] (S1 Table) was used as protein query in tBLASTn [13] searches on the M. acuminata DH-Pahang genome sequence (version 2) (https://banana-genome-hub.southgreen.fr/sites/banana-genome-hub.southgreen.fr/files/data/fasta/version2/musa_acuminata_v2_pseudochromosome.fna) in an initial search for MYB protein coding genes. To confirm the obtained amino acid sequences, the putative MYB sequences were manually analysed for the presence of an intact MYB domain. All M. acuminata MYB candidates from the initial BLAST were inspected to ensure that the putative gene models encode two or more (multiple) MYB repeats. The identified gene models were analysed to map them individually to unique loci in the genome, and redundant sequences were discarded from the data set to obtain unique MaMYB genes. The identified MaMYB genes were matched to the automatically annotated genes from the Banana Genome Hub database [14]. The open reading frames of the identified MaMYB genes were manually inspected and, if available, verified by mapping of RNA-seq data to the genomic sequence.
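
The step of collapsing redundant hits into unique loci can be illustrated with a minimal Python sketch. It is not the procedure used in this study, where candidates were inspected manually; it only shows one plausible way to group tBLASTn hits by genomic position. It assumes the hits are available in the standard 12-column BLAST tabular format (-outfmt 6). The function names and the max_gap value of 2000 bp are hypothetical choices made for illustration, and strand is ignored for simplicity.

```python
from collections import defaultdict


def parse_outfmt6(lines):
    """Yield (subject_id, start, end) from BLAST tabular (-outfmt 6) lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Columns 9 and 10 (index 8 and 9) hold sstart and send.
        sstart, send = int(fields[8]), int(fields[9])
        # Minus-strand hits are reported with sstart > send; normalise them.
        yield fields[1], min(sstart, send), max(sstart, send)


def merge_hits_to_loci(lines, max_gap=2000):
    """Merge hits on the same sequence that overlap or lie within max_gap bp.

    max_gap is a hypothetical tolerance for repeats separated by introns.
    Returns a list of (subject_id, locus_start, locus_end) tuples.
    """
    by_subject = defaultdict(list)
    for subject, start, end in parse_outfmt6(lines):
        by_subject[subject].append((start, end))

    loci = []
    for subject in sorted(by_subject):
        intervals = sorted(by_subject[subject])
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end + max_gap:
                # Close enough to the current locus: extend it.
                cur_end = max(cur_end, end)
            else:
                loci.append((subject, cur_start, cur_end))
                cur_start, cur_end = start, end
        loci.append((subject, cur_start, cur_end))
    return loci
```

Calling merge_hits_to_loci on the lines of a tabular BLAST report returns one coordinate range per candidate locus. Each range would still need the manual checks described above, for example confirming that the gene model at that locus encodes two or more intact MYB repeats.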
The resulting MaMYB annotation is provided in the supporting information: Multi-FASTA files with MaMYB CDS sequences (S1 File) and MaMYB peptide sequences (S2