<<

Chen et al. Horticulture Research (2019) 6:112 Horticulture Research https://doi.org/10.1038/s41438-019-0195-6 www.nature.com/hortres

REVIEW ARTICLE Open Access Genome sequences of horticultural : past, present, and future Fei Chen1, Yunfeng Song2, Xiaojiang Li2,JunhaoChen 3, Lan Mo3, Xingtan Zhang2,ZhenguoLin 4 and Liangsheng Zhang5

Abstract Horticultural plants play various and critical roles for humans by providing fruits, vegetables, materials for beverages, and herbal medicines and by acting as ornamentals. They have also shaped human art, culture, and environments and thereby have influenced the lifestyles of humans. With the advent of sequencing technologies, there has been a dramatic increase in the number of sequenced genomes of horticultural in the past decade. The genomes of horticultural plants are highly diverse and complex, often with a high degree of heterozygosity and a high ploidy due to their long and complex history of evolution and domestication. Here we summarize the advances in the genome sequencing of horticultural plants, the reconstruction of pan-genomes, and the development of horticultural genome databases. We also discuss past, present, and future studies related to genome sequencing, data storage, data quality, data sharing, and data visualization to provide practical guidance for genomic studies of horticultural plants. Finally, we propose a horticultural plant genome project as well as the roadmap and technical details toward three goals of the project. 1234567890():,; 1234567890():,; 1234567890():,; 1234567890():,; Introduction Horticultural plants are distributed among a wide Horticultural plants mostly comprise vegetable-produ- variety of taxonomic plant spectra, which include a large cing, fruit-bearing, ornamental, and beverage-producing number of flowering plants and a few early-divergent land plants and herbal medicinal plants. These plants have plants. The sizes of their genomes vary greatly. For played important economic and social roles in the human example, the vegetable garlic (Allium sativum) has a lives and health by providing basic food needs, beautifying diploid genome (2n = 16) with an estimated genome size urban and rural landscapes, and improving personal of >30 Gb1, and onion (Allium cepa) has a similar genome esthetics. For example, the Food and Agriculture Orga- size2. In addition, most horticultural plants are domes- nization of the United Nations reported that, while ticated, and their genome sequences have experienced worldwide cereal food together is valued at 125 points strong artificial selection. For example, grape was found to (normalized value), vegetables and fruits together are have been cultivated (via viticulture) for >6000 years3; valued at 137 points (http://faostat.fao.org). Horticultural citrus, >4000 years4. In addition, some horticultural plants plants also contribute to ecological balance by improving are intermediates of domesticated and wild plants, such as our biological environment by providing oxygen and medicinal plants including ginseng (Panax ginseng), noto balancing urban temperatures. ginseng (Panax notoginseng), and Artemisia (Artemisia annua). Many domesticated horticultural plants have high levels of genetic diversity and heterozygosity, such as Correspondence: Fei Chen ([email protected])or sunflower (10% of bases differ between homologous Liangsheng Zhang ([email protected]) chromosomes)5, grape (7%)6, and potato (4.8%)7. 1College of Horticulture, Nanjing Agricultural University, Nanjing 210095, 2College of Crop Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China Full list of author information is available at the end of the article.

© The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a linktotheCreativeCommons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. Chen et al. Horticulture Research (2019) 6:112 Page 2 of 23

De novo sequencing of horticultural plant and the reference genome of rose has provided insights genomes into the floral color and scent pathways25. As of December 31, 2018, the genomes of 181 horti- The Solanaceae family consists of ~2700 species (http:// cultural species have been sequenced (Table 1). These www.theplantlist.org) that include a number of vegetable, include 4 beverage, 47 fruit, 44 medicinal, 44 ornamental, medicinal, and ornamental species. The genomes of sev- and 42 vegetable plants (Fig. 1a). In terms of taxonomic eral important Solanaceae vegetable species have been distribution, these plants include 175 angiosperms, 2 sequenced, such as tomato (Solanum lycopersicum, Sola- gymnosperms, 3 lycophytes, and 1 moss (Fig. 1b). As num pimpinellifolium)28,29, potato (Solanum tuber- shown in Fig. 1c, the number of sequenced genomes of osum)30, pepper (Capsicum annuum, Capsicum – horticultural plants completed each year has significantly baccatum, Capsicum chinense)31 33, and eggplant (Sola- increased from 1 in 2007 to 40 in 2018. Although most of num melongena)34. Solanaceae ornamentals include ivy the horticultural plants are angiosperms, the genome morning glory (Ipomoea nil)35, ornamental tobacco sequencing of non-angiosperm species has also demon- (Nicotiana sylvestris)36, and petunia (Petunia axillaris, strated steady growth (Fig. 1c). Vegetables and fruits have Petunia inflate)37. Although these genomes have helped been a focus of plant research in the past few years. to understand the evolution of Solanaceae plants, addi- However, only two vegetables and seven fruits had their tional Solanaceae horticultural genomes need to be genomes sequenced in 2018 (Fig. 1d). This is probably sequenced. These include the sequences of the medicinal because many economically important vegetables and plants Datura arborea, Datura metel, and Datura innoxia fruits were already sequenced prior to 2018. and the ornamentals Petunia spp., Nicotiana spp., Lycium Some angiosperms have a significant role in the econ- spp., Solanum spp., Cestrum spp., Calibrachoa spp., and omy8. The 181 horticultural plants with sequenced gen- Solandra spp. These available genome sequences have omes are distributed in 30 of the 64 angiosperm orders. helped to decipher the evolution and genomic basis of Among these 30 orders, 7 (, , Cucurbitales, metabolites such as vitamin C (or ascorbic acid)38 in Brassicales, Sapindales, Solanales, and Laminales) have tomato and alkaloids in tobacoo39. >10 species whose genomes have been sequenced The family, consisting of ~19,000 known spe- (Fig. 1e), suggesting their vital importance to humans. cies, is the third largest angiosperm family by number of Most of the genome-sequenced plants fall into the species richness, followed by the Orchidaceae and Aster- family, which is a medium-sized family with aceae families. Although only dozens of Fabaceae genomes approximately 4800 species (http://www.theplantlist.org), have been sequenced8, many of them are from horti- including many popular fruit-bearing and ornamental cultural species. The genome-decoded Fabaceae vegetable plants. The genome-decoded fruit-producing species plants include pigeon pea (Cajanus cajan)40, and include breadnut (Artocarpus camansi)9, ficus (Ficus its relative ( arietinum, Cicer reticulatum)41,42,soy- carica)10, jujube (Ziziphus jujuba)11, strawberry and its bean (Glycine max)43, barrelclover (Medicago trunca- close relatives (Fragaria × ananassa, Fragaria iinumae, tula)44, common bean (Phaseolus vulgaris)45, faba bean Fragaria nipponica, Fragaria nubicola, Fragaria orienta- (Vicia faba)46, adzuki bean (Vigna angularis)47, and mung – lis, Fragaria vesca)12 14, ( domestica)15, bean (Vigna radiata)48. The genome-sequenced Fabaceae morus (Morus notabilis)16, sweet cherry (Prunus ornamentals include eastern redbud (Cercis canadensis)49, avium)17, peach (Prunus persica)18, Chinese (Pyrus narrowleaf lupin (Lupinus angustifolius)50, and mimosa bretschneideri)19, European pear (Pyrus communis)20, and (Mimosa pudica). The Fabaceae medicinal plants with black raspberry (Rubus occidentalis)21. The genome- sequenced genomes include Chinese uralensis (Glycyr- decoded ornamentals include mei (Prunus mume)22, rhiza uralensis)51 and red clover (Trifolium pratense)52. sakura (Prunus yedoensis)23, and rose and its close rela- Legumes are considered a valuable source of food in the tives (Rosa × damascene, Rosa chinensis, Rosa multiflora, future53; thus the sequencing of their genomes would be – and Rosa roxburghii)24 26. However, the genomes of many valuable. Determining the genomic basis of important fruit-bearing Rosales plants, such as Crataegus legume–rhizobium interactions would help not only to pinnatifida, Malus prunifolia, Eriobotrya japonica, solve a classic fundamental problem in biology but also to Armeniaca vulgaris, and Prunus salicina, and of Rosales improve nitrogen utilization in horticultural plants. ornamentals, such as Photinia serrulata, Spiraea thun- The Brassicaceae family is a medium-sized family with bergii, Cotoneaster multiflorus, and Rubus japonicas, have ~4000 species, including many horticultural plant species. not yet been sequenced. The available genome sequences The Brassicaceae vegetable plants with sequenced gen- of Rosales species have largely improved our under- omes include Zhacai (Brassica juncea)54, cabbage (Bras- standing of the biology of fruits and flowers. For example, sica oleracea)55, napa cabbage (Brassica rapa)56, Capsella the high-quality apple genome sequence showed that a (Capsella bursa-pastoris and Capsella rubella)57,58, radish single allele is responsible for red fruit peal coloration27, (Raphanus sativus)59, and field pennycress (Thlaspi Chen et al. Horticulture Research (2019) 6:112 Page 3 of 23

Table 1 List of genome-sequenced horticultural plant species and their close relatives

Species Common name Type DB-url

Zoysia japonica Japanese Angiosperm/ Ornamental zoysia.kazusa.or.jp lawn grass Alismatales/Poaceae Zoysia matrella Manila grass Angiosperm/ Ornamental zoysia.kazusa.or.jp Alismatales/Poaceae Zoysia pacifica Mascarene grass Angiosperm/ Ornamental zoysia.kazusa.or.jp Alismatales/Poaceae Cocos nucifera Coconut palm Angiosperm/ Fruit gigadb.org Arecales/Arecaceae Phoenix Date palm Angiosperm/ Fruit drdb.big.ac.cn dactylifera Arecales/Arecaceae Asparagus Garden asparagus Angiosperm/ Vegetable phytozome.jgi.doe.gov officinalis Asparagales/ Asparagaceae Dendrobium N.A. Angiosperm/ Medicinal herbalplant.ynau.edu.cn catenatum Asparagales/ Orchidaceae Gastrodia elata Tianma Angiosperm/ Medicinal herbalplant.ynau.edu.cn Asparagales/ Orchidaceae Phalaenopsis Aphrodite's Angiosperm/ Ornamental genomevolution.org; chibba.agtec.uga.edu/duplication; orchidstra2.abrc. aphrodite phalaenopsis Asparagales/ sinica.edu.tw Orchidaceae Phalaenopsis Horse Angiosperm/ Ornamental genomevolution.org; chibba.agtec.uga.edu/duplication; orchidstra2.abrc. equestris phalaenopsis Asparagales/ sinica.edu.tw Orchidaceae Dioscorea White Guinea yam Angiosperm/ Vegetable genomevolution.org/CoGe; plants.ensembl.org rotundata Dioscoreales/ Dioscoreaceae Ananas comosus Pineapple Angiosperm/Poales/ Fruit phytozome.jgi.doe.gov; genomevolution.org/CoGe; pineapple.angiosperms. Bromeliaceae org/pineapple/html/index.html Echinochloa Cockspur grass Angiosperm/Poales/ Medicinal horticulture.eplant.org crus-galli Poaceae Lolium perenne Perennial ryegrass Angiosperm/Poales/ Ornamental pgsb.helmholtz-muenchen.de Poaceae Zizania latifolia Jiaobai Angiosperm/Poales/ Vegetable plants.ensembl.org Poaceae Musa acuminata Wild banana Angiosperm/ Ornamental chibba.agtec.uga.edu/duplication; plants.ensembl.org/; phytozome.jgi.doe. Zingiberales/ gov; banana-genome-hub.southgreen.fr Musaceae Musa balbisiana Wild banana Angiosperm/ Ornamental banana-genome-hub.southgreen.fr Zingiberales/ Musaceae Musa itinerans Yunnan banana Angiosperm/ Fruit banana-genome-hub.southgreen.fr Zingiberales/ Musaceae Chen et al. Horticulture Research (2019) 6:112 Page 4 of 23

Table 1 (continued)

Species Common name Taxonomy Type DB-url

Ensete Ethiopian banana Angiosperm/ Medicinal horticulture.eplant.org ventricosum Zingiberales/ Musaceae Liriodendron Chinese tulip tree Angiosperm/ Ornamental www.hardwoodgenomics.org chinense Magnoliales/ Magnoliaceae Manihot Cassava Angiosperm/ Vegetable genomevolution.org/CoGe; bioinformatics.psb.ugent.be/plaza; plants. esculenta Malpighiales/ ensembl.org; www.plantgdb.org/; phytozome.jgi.doe.gov Euphorbiaceae Rhizophora Tall-stilt mangrove Angiosperm/ Medicinal genomevolution.org/coge apiculata Malpighiales/ Rhizophoraceae Begonia Shrub Begonia Angiosperm/ Ornamental fuchsioides Cucurbitales/ Begoniaceae Cucumis melo Muskmelon Angiosperm/ Fruit cucurbitgenomics.org/; bioinformatics.psb.ugent.be/plaza Cucurbitales/ Cucurbitaceae Citrullus lanatus Watermelon Angiosperm/ Fruit www.coolseasonFoodlegume.org; bioinformatics.psb.ugent.be/plaza; Cucurbitales/ chibba.agtec.uga.edu/duplication; cucurbitgenomics.org Cucurbitaceae Siraitia Monk fruit Angiosperm/ Medicinal herbalplant.ynau.edu.cn grosvenorii Cucurbitales/ Cucurbitaceae Cucumis sativus Cucumber Angiosperm/ Vegetable genomevolution.org/CoGe; bioinformatics.psb.ugent.be/plaza; phytozome. Cucurbitales/ jgi.doe.gov; chibba.agtec.uga.edu/duplication; plants.ensembl.org; www. Cucurbitaceae plantgdb.org; cucurbitgenomics.org Cucurbita Silver-seed gourd Angiosperm/ Vegetable cucurbitgenomics.org argyrosperma Cucurbitales/ Cucurbitaceae Cucurbita Winter squash Angiosperm/ Vegetable cucurbitgenomics.org maxima Cucurbitales/ Cucurbitaceae Cucurbita Pumpkin Angiosperm/ Vegetable cucurbitgenomics.org moschata Cucurbitales/ Cucurbitaceae Cucurbita pepo Summer squash Angiosperm/ Vegetable cucurbitgenomics.org Cucurbitales/ Cucurbitaceae Lagenaria Bottle gourd Angiosperm/ Vegetable genomevolution.org; cucurbitgenomics.org siceraria Cucurbitales/ Cucurbitaceae Momordica Bitter melon Angiosperm/ Vegetable charantia Cucurbitales/ Cucurbitaceae Chen et al. Horticulture Research (2019) 6:112 Page 5 of 23

Table 1 (continued)

Species Common name Taxonomy Type DB-url

Glycyrrhiza Chinese liquorice Angiosperm/Fabales/ Medicinal ngs-data-archive.psc.riken.jp uralensis Fabaceae Trifolium Red clover Angiosperm/Fabales/ Medicinal http://www.cacaogenomedb.org; bioinformatics.psb.ugent.be/plaza; plants. pratense Fabaceae ensembl.org; phytozome.jgi.doe.gov Cercis canadensis Eastern redbud Angiosperm/Fabales/ Ornamental genomevolution.orgauth.iplantc.org Fabaceae Lupinus Narrow- Angiosperm/Fabales/ Ornamental plants.ensembl.org angustifolius leaved lupine Fabaceae Mimosa pudica Sensitive plant Angiosperm/Fabales/ Ornamental www.medicagogenome.org Fabaceae Cajanus cajan Pigeon pea Angiosperm/Fabales/ Vegetable brassicadb.org/brad; genomevolution.org/CoGe; bioinformatics.psb.ugent. Fabaceae be/plaza; chibba.agtec.uga.edu/duplication Cicer arietinum Chick pea Angiosperm/Fabales/ Vegetable genomevolution.org/CoGe; bioinformatics.psb.ugent.be/plaza; chibba.agtec. Fabaceae uga.edu/duplication; phytozome.jgi.doe.gov Cicer reticulatum Chick pea Angiosperm/Fabales/ Vegetable www.coolseasonfoodlegume.org Fabaceae Glycine max Soybean Angiosperm/Fabales/ Vegetable genomevolution.org/CoGe; bioinformatics.psb.ugent.be/plaza; phytozome. Fabaceae jgi.doe.gov; chibba.agtec.uga.edu/duplication/; plants.ensembl.org; www. plantgdb.org Medicago Barrelclover Angiosperm/Fabales/ Vegetable phytozome.jgi.doe.gov; bioinformatics.psb.ugent.be/plaza; chibba.agtec.uga. truncatula Fabaceae edu/duplication; /plant/plantsdb.jsp; plants.ensembl.org; www.plantgdb.org Phaseolus Common bean Angiosperm/Fabales/ Vegetable genomevolution.org/CoGe; chibba.agtec.uga.edu/duplication; plants. vulgaris Fabaceae ensembl.org; phytozome.jgi.doe.gov Vicia faba Fava bean Angiosperm/Fabales/ Vegetable Fabaceae Vigna angularis Adzuki bean Angiosperm/Fabales/ Vegetable plants.ensembl.org Fabaceae Vigna radiata Mungbean Angiosperm/Fabales/ Vegetable plants.ensembl.org Fabaceae Casuarina Australian Angiosperm/Fagales/ Ornamental hardwoodgenomics.org equisetifolia pine tree Casuarinaceae Castanea Chinese chestnut Angiosperm/Fagales/ Fruit genomevolution.org/CoGe mollissima Fagaceae Juglans Chinese walnut Angiosperm/Fagales/ Fruit www.hardwoodgenomics.org cathayensis Juglandaceae Juglans hindsii Northern Angiosperm/Fagales/ Fruit www.hardwoodgenomics.org California walnut Juglandaceae rootstock Juglans Texas black walnut Angiosperm/Fagales/ Fruit www.hardwoodgenomics.org microcarpa Juglandaceae Juglans nigra Eastern Angiosperm/Fagales/ Fruit www.hardwoodgenomics.org black walnut Juglandaceae rootstock Juglans regia Common walnut Angiosperm/Fagales/ Fruit www.hardwoodgenomics.org Juglandaceae Juglans sigillata Iron walnut Fruit www.hardwoodgenomics.org Chen et al. Horticulture Research (2019) 6:112 Page 6 of 23

Table 1 (continued)

Species Common name Taxonomy Type DB-url

Angiosperm/Fagales/ Juglandaceae Morella rubra Red bayberry Angiosperm/Fagales/ Fruit Myricaceae Nelumbo Sacred lotus Angiosperm/ Ornamental bioinformatics.psb.ugent.be/plaza; chibba.agtec.uga.edu/duplication nucifera Proteales/ Nelumbonaceae Macadamia Macadamia nut Angiosperm/ Fruit www.hardwoodgenomics.org integrifolia Proteales/Proteaceae Macleaya Plume poppy Angiosperm/ Medicinal herbalplant.ynau.edu.cn cordata Ranunculales/ Papaveraceae Papaver Opium poppy Angiosperm/ Medicinal genomevolution.orgauth.iplantc.org somniferum Ranunculales/ Papaveraceae Eschscholzia California poppy Angiosperm/ Ornamental eschscholzia.kazusa.or.jp californica Ranunculales/ Papaveraceae Aquilegia Colorado blue Angiosperm/ Medicinal genome.jgi.doe.gov; genomevolution.org/CoGe; phytozome.jgi.doe.gov coerulea columbine Ranunculales/ Ranunculaceae Cannabis sativa Hemp Angiosperm/Rosales/ Medicinal genome.ccbr.utoronto.ca Cannabaceae Parasponia Caoye Angiosperm/Rosales/ Medicinal www.bioinformatics.nl/parasponia andersonii shanhuangma Cannabaceae Trema orientalis Indian Angiosperm/Rosales/ Medicinal www.bioinformatics.nl/parasponia charcoal tree Cannabaceae Artocarpus Breadnut Angiosperm/Rosales/ Fruit sites.northwestern.edu/zerega-lab/research/artocarpus-genomics camansi Moraceae Ficus carica Common fig Angiosperm/Rosales/ Fruit Moraceae Ziziphus jujuba Jujube Angiosperm/Rosales/ Fruit genomevolution.org/CoGe; bioinformatics.psb.ugent.be/plaza Rhamnaceae Fragaria iinumae Nogo strawberry Angiosperm/Rosales/ Fruit strawberry-garden.kazusa.or.jp Rosaceae Fragaria Japanese Angiosperm/Rosales/ Fruit strawberry-garden.kazusa.or.jp nipponica strawberry Rosaceae Fragaria Tibet strawberry Angiosperm/Rosales/ Fruit strawberry-garden.kazusa.or.jp nubicola Rosaceae Fragaria Eastern strawberry Angiosperm/Rosales/ Fruit strawberry-garden.kazusa.or.jp orientalis Rosaceae Fragaria vesca Woodland Angiosperm/Rosales/ Fruit strawberry-garden.kazusa.or.jp; genomevolution.org/CoGe;bioinformatics. strawberry Rosaceae psb.ugent.be/plaza; chibba.agtec.uga.edu/duplication; phytozome.jgi.doe. gov Chen et al. Horticulture Research (2019) 6:112 Page 7 of 23

Table 1 (continued)

Species Common name Taxonomy Type DB-url

Fragaria × Strawberry Angiosperm/Rosales/ Fruit strawberry-garden.kazusa.or.jp ananassa Rosaceae Malus domestica Apple Angiosperm/Rosales/ Fruit genomevolution.org/CoGe; bioinformatics.psb.ugent.be/plaza; phytozome. Rosaceae jgi.doe.gov; www.rosaceae.org Morus notabilis Mulberry Angiosperm/Rosales/ Fruit morus.swu.edu.cn Rosaceae Prunus avium Sweet cherry Angiosperm/Rosales/ Fruit www.rosaceae.org Rosaceae Prunus persica Peach Angiosperm/Rosales/ Fruit genomevolution.org/CoGe; bioinformatics.psb.ugent.be/plaza; chibba.agtec. Rosaceae uga.edu/duplication; plants.ensembl.org; phytozome.jgi.doe.gov Pyrus Chinese pear Angiosperm/Rosales/ Fruit bioinformatics.psb.ugent.be/plaza; chibba.agtec.uga.edu/duplication bretschneideri Rosaceae Pyrus communis European pear Angiosperm/Rosales/ Fruit www.rosaceae.org Rosaceae Rubus Black raspberry Angiosperm/Rosales/ Fruit www.rosaceae.org occidentalis Rosaceae Prunus mume Mei Angiosperm/Rosales/ Ornamental genomevolution.org/CoGe; chibba.agtec.uga.edu/duplication Rosaceae Prunus yedoensis Yoshino cherry Angiosperm/Rosales/ Ornamental www.rosaceae.org Rosaceae Rosa × Damask rose Angiosperm/Rosales/ Ornamental gigadb.org; www.rosaceae.org damascena Rosaceae Rosa chinensis Chinese rose Angiosperm/Rosales/ Ornamental www.rosaceae.org Rosaceae Rosa multiflora Many- Angiosperm/Rosales/ Ornamental www.rosaceae.org flowered rose Rosaceae Rosa roxburghii Chestnut rose Angiosperm/Rosales/ Ornamental www.rosaceae.org Rosaceae Daucus carota Carrot Angiosperm/Apiales/ Vegetable bioinformatics.psb.ugent.be/plaza; plants.ensembl.org; phytozome.jgi.doe. Apiaceae gov Panax ginseng Asian ginseng Angiosperm/Apiales/ Medicinal herbalplant.ynau.edu.cn Araliaceae Panax Sanchi ginseng Angiosperm/Apiales/ Medicinal herbalplant.ynau.edu.cn notoginseng Araliaceae Artemisia annua Sweet wormwood Angiosperm/ Medicinal herbalplant.ynau.edu.cn Asterales/Asteraceae Conyza Horseweed Angiosperm/ Medicinal genomevolution.org/CoGe canadensis Asterales/Asteraceae Erigeron Chinese fleabane Angiosperm/ Medicinal www.ncbi.nlm.nih.gov/genome/?term=Eleusine+coracana breviscapus Asterales/Asteraceae Chrysanthemum Juhuanao Angiosperm/ Vegetable genomevolution.org/CoGe nankingense Asterales/Asteraceae Cynara Cardoon Angiosperm/ Vegetable www.artichokegenome.unito.it cardunculus Asterales/Asteraceae Chen et al. Horticulture Research (2019) 6:112 Page 8 of 23

Table 1 (continued)

Species Common name Taxonomy Type DB-url

Lactuca sativa Lettuce Angiosperm/ Vegetable phytozome.jgi.doe.gov Asterales/Asteraceae Eutrema Shan yu cai Angiosperm/ Medicinal yunnanense Brassicales/ Brassicaceae Lepidium meyenii Maca Angiosperm/ Medicinal maca.eplant.org Brassicales/ Brassicaceae Brassica juncea Zhacai Angiosperm/ Vegetable brassicadb.org Brassicales/ Brassicaceae Brassica oleracea Cabbage Angiosperm/ Vegetable brassicadb.org; genomevolution.org/CoGe; bioinformatics.psb.ugent.be/ Brassicales/ plaza; chibba.agtec.uga.edu/duplication; plants.ensembl.org Brassicaceae Brassica rapa Chinese cabbage Angiosperm/ Vegetable plants.ensembl.org; genomevolution.org/CoGe; bioinformatics.psb.ugent. Brassicales/ be/plaza; phytozome.jgi.doe.gov; chibba.agtec.uga.edu/duplication; plants. Brassicaceae ensembl.org; www.plantgdb.org Capsella bursa- Shepherd’s purse Angiosperm/ Vegetable genome.ccbr.utoronto.ca/cgi-bin/hgGateway pastoris Brassicales/ Brassicaceae Capsella rubella Red Angiosperm/ Vegetable genomevolution.org/CoGe; bioinformatics.psb.ugent.be/plaza; chibba.agtec. shepherd’s purse Brassicales/ uga.edu/duplication; phytozome.jgi.doe.gov Brassicaceae Raphanus sativus Radish Angiosperm/ Vegetable radish.kazusa.or.jp Brassicales/ Brassicaceae Thlaspi arvense Field pennycress Angiosperm/ Vegetable pennycress.umn.edu Brassicales/ Brassicaceae Carica papaya Papaya Angiosperm/ Fruit genomevolution.org/CoGe; bioinformatics.psb.ugent.be/plaza; phytozome. Brassicales/Caricaceae jgi.doe.gov; chibba.agtec.uga.edu/duplication; www.plantgdb.org Tarenaya Spider flower Angiosperm/ Ornamental genomevolution.org/CoGe; bioinformatics.psb.ugent.be/plaza hassleriana Brassicales/ Cleomaceae Moringa oleifera Moringa Angiosperm/ Vegetable bioinformatics.psb.ugent.be/plaza Brassicales/ Moringaceae Amaranthus Prince’s feather Angiosperm/ Ornamental phytozome.jgi.doe.gov; genomevolution.org/CoGe; bioinformatics.psb. hypochondriacus Caryophyllales/ ugent.be/plaza Amaranthaceae Beta vulgaris Sugar beet Angiosperm/ Vegetable bioinformatics.psb.ugent.be/plaza; chibba.agtec.uga.edu/duplication; plants. Caryophyllales/ ensembl.org Amaranthaceae Spinacia oleracea Spinach Vegetable spinachbase.org Chen et al. Horticulture Research (2019) 6:112 Page 9 of 23

Table 1 (continued)

Species Common name Taxonomy Type DB-url

Angiosperm/ Caryophyllales/ Amaranthaceae Carnegiea Saguaro cactus Angiosperm/ Ornamental phytozome.jgi.doe.gov gigantea Caryophyllales/ Cactaceae Dianthus Carnation Angiosperm/ Ornamental carnation.kazusa.or.jp caryophyllus Caryophyllales/ Caryophyllaceae Casuarina glauca Swamp oak Angiosperm/ Ornamental Caryophyllales/ Casuarinaceae Drosera capensis Cape sundew Angiosperm/ Ornamental Caryophyllales/ Droseraceae Camptotheca Happy tree Angiosperm/ Ornamental www.plantkingdomgdb.com; genomevolution.org/CoGe acuminata Cornales/Nyssaceae Actinidia Kiwifruit Angiosperm/Ericales/ Fruit bdg.hfut.edu.cn/kir; genomevolution.org/coge chinensis Actinidiaceae Diospyros lotus Date-plum Angiosperm/Ericales/ Fruit gigadb.org Ebenaceae Vaccinium Blueberry Angiosperm/Ericales/ Fruit www.vaccinium.org corymbosum Ericaceae Vaccinium American Angiosperm/Ericales/ Fruit gigadb.org macrocarpon cranberry Ericaceae Rhododendron Tree Angiosperm/Ericales/ Ornamental delavayi rhododendron Ericaceae Primula vulgaris Common primrose Angiosperm/Ericales/ Medicinal phytozome.jgi.doe.gov Primulaceae Primula veris Cowslip Angiosperm/Ericales/ Ornamental plantgenie.org Primulaceae Camellia sinensis Tea tree Angiosperm/Ericales/ Beverage tpia.teaplant.org Theaceae Eucommia Hardy rubber tree Angiosperm/ Medicinal ulmoides Garryales/ Eucommiaceae Calotropis Crown flower Angiosperm/ Medicinal gigantea Gentianales/ Apocynaceae Catharanthus Madagascar Angiosperm/ Medicinal genomevolution.org/CoGe roseus periwinkle Gentianales/ Apocynaceae Coffea arabica Arabian coffee Angiosperm/ Beverage www.coffee-genome.org; phytozome.jgi.doe.gov Gentianales/ Rubiaceae Chen et al. Horticulture Research (2019) 6:112 Page 10 of 23

Table 1 (continued)

Species Common name Taxonomy Type DB-url

Coffea Robusta Coffee Angiosperm/ Beverage genomevolution.org/CoGe; www.coffee-genome.org; bioinformatics.psb. canephora Gentianales/ ugent.be/plaza Rubiaceae Andrographis Green chireta Angiosperm/ Medicinal paniculata Lamiales/ Acanthaceae Handroanthus Pink trumpet tree Angiosperm/ Ornamental www.hardwoodgenomics.org impetiginosus Lamiales/ Bignoniaceae Boea N.A. Angiosperm/ Ornamental genomevolution.org hygrometrica Lamiales/ Gesneriaceae Mentha Horse mint Angiosperm/ Medicinal phytozome.jgi.doe.gov longifolia Lamiales/Lamiaceae Ocimum Holy basil Angiosperm/ Medicinal caps.ncbs.res.in/Ote sanctum Lamiales/Lamiaceae Scutellaria Baikal skullcap Angiosperm/ Medicinal baicalensis Lamiales/Lamiaceae Lavandula Lavender Angiosperm/ Ornamental angustifolia Lamiales/Lamiaceae Salvia splendens Scarlet sage Angiosperm/ Ornamental gigadb.org Lamiales/Lamiaceae Osmanthus Sweet osmanthus Angiosperm/ Medicinal sweetolive.eplant.org fragrans Lamiales/Oleaceae Fraxinus excelsior European ash Angiosperm/ Ornamental www.hardwoodgenomics.org Lamiales/Oleaceae Mimulus Seep Angiosperm/ Ornamental phytozome.jgi.doe.gov; www.plantgdb.org guttatus monkeyflower Lamiales/Phrymaceae Theobroma Cacao Angiosperm/ Beverage bioinformatics.psb.ugent.be/plaza/; chibba.agtec.uga.edu/duplication; cacao Malvales/Malvaceae plants.ensembl.org; phytozome.jgi.doe.gov Durio zibethinus Durian Angiosperm/ Fruit Malvales/Malvaceae Corchorus Chang shuo Angiosperm/ Medicinal bioinformatics.psb.ugent.be/plaza olitorius huang ma Malvales/Malvaceae Bombax ceiba Red silk- Angiosperm/ Ornamental cotton tree Malvales/Malvaceae Hibiscus syriacus Rose of Sharon Angiosperm/ Ornamental Malvales/Malvaceae Aquilaria Agarwood Angiosperm/ Medicinal agallocha Malvales/ Thymelaeaceae Santalum album Indian sandalwood Angiosperm/ Medicinal Santalales/ Santalaceae Chen et al. Horticulture Research (2019) 6:112 Page 11 of 23

Table 1 (continued)

Species Common name Taxonomy Type DB-url

Citrus clementina Clementine citrus Angiosperm/ Fruit genomevolution.org/CoGe; bioinformatics.psb.ugent.be/plaza; phytozome. Sapindales/Rutaceae jgi.doe.gov Citrus grandis Pummelo Angiosperm/ Fruit www.citrusgenomedb.org Sapindales/Rutaceae Citrus Ichang papeda Angiosperm/ Fruit www.citrusgenomedb.org ichangensis Sapindales/Rutaceae Citrus paradisi × Citrumelo Angiosperm/ Fruit Poncirus trifoliata Sapindales/Rutaceae Citrus reticulata Mandarin orange Angiosperm/ Fruit www.citrusgenomedb.org Sapindales/Rutaceae Citrus sinensis Sweet orange Angiosperm/ Fruit www.citrusgenomedb.org Sapindales/Rutaceae Citrus unshiu Cold hardy Angiosperm/ Fruit www.citrusgenomedb.org mandarin Sapindales/Rutaceae Atalantia Jiu bing le Angiosperm/ Medicinal www.citrusgenomedb.org buxifolia Sapindales/Rutaceae Citrus medica Citron Angiosperm/ Medicinal www.citrusgenomedb.org Sapindales/Rutaceae Dimocarpus Longan Angiosperm/ Fruit gigadb.org longan Sapindales/ Sapindaceae Rhodiola Tibetan Rhodiola Angiosperm/ Medicinal gigadb.org crenulata Saxifragales/ Crassulaceae Kalanchoe Lavender-scallops Angiosperm/ Ornamental phytozome.jgi.doe.gov fedtschenkoi Saxifragales/ Crassulaceae Cuscuta australis Australian dodder Angiosperm/ Medicinal Solanales/ Convolvulaceae Cuscuta Prairie dodder Angiosperm/ Medicinal plabipd.de/project_cuscuta2/start.ep campestris Solanales/ Convolvulaceae Ipomoea nil Japanese Angiosperm/ Ornamental viewer.shigen.info/asagao morning glory Solanales/ Convolvulaceae Nicotiana Flowering tobacco Angiosperm/ Ornamental solgenomics.net sylvestris Solanales/Solanaceae Petunia axillaris N.A. Angiosperm/ Ornamental genome.jgi.doe.gov; bioinformatics.psb.ugent.be/plaza; solgenomics.net Solanales/Solanaceae Petunia inflata N.A. Angiosperm/ Ornamental solgenomics.net Solanales/Solanaceae Solanum Wild tomato Angiosperm/ Vegetable pennellii Solanales/Solanaceae relative Chen et al. Horticulture Research (2019) 6:112 Page 12 of 23

Table 1 (continued)

Species Common name Taxonomy Type DB-url

Capsicum Spanish pepper Angiosperm/ Vegetable bioinformatics.psb.ugent.be/plaza; chibba.agtec.uga.edu/duplication; annuum Solanales/Solanaceae solgenomics.net Capsicum Berry-like pepper Angiosperm/ Vegetable genomevolution.org/CoGe baccatum Solanales/Solanaceae Capsicum Bonnet pepper Angiosperm/ Vegetable www.pepperpan.org:8012 chinense Solanales/Solanaceae Solanum Tomato Angiosperm/ Vegetable bioinformatics.psb.ugent.be/plaza; phytozome.jgi.doe.gov; chibba.agtec.uga. lycopersicum Solanales/Solanaceae edu/duplication;pgsb.helmholtz-muenchen.de; plants.ensembl.org; www. plantgdb.org; solgenomics.net Solanum Eggplant Angiosperm/ Vegetable solgenomics.net; genomevolution.org/CoGe melongena Solanales/Solanaceae Solanum Currant tomato Angiosperm/ Vegetable solgenomics.net pimpinellifolium Solanales/Solanaceae Solanum Potato Angiosperm/ Vegetable genomevolution.org/CoGe; bioinformatics.psb.ugent.be/plaza; chibba.agtec. tuberosum Solanales/Solanaceae uga.edu/duplication; plants.ensembl.org; www.plantgdb.org; phytozome.jgi. doe.gov; solgenomics.net Vitis vinifera Grape Angiosperm/Vitales/ Fruit phytozome.jgi.doe.gov; genomevolution.org/CoGe; bioinformatics.psb. Vitaceae ugent.be/plaza; chibba.agtec.uga.edu/duplication; plants.ensembl.org; www.plantgdb.org Punica granatum Pomegranate Angiosperm/ Fruit www.hardwoodgenomics.org Myrtales/Lythraceae Marchantia Umbrella liverwort Bryophyta/ Medicinal bioinformatics.psb.ugent.be/plaza; phytozome.jgi.doe.gov polymorpha Marchantiales/ Marchantiaceae Ginkgo biloba Ginkgo tree Gymnosperms/ Medicinal gigadb.org/site/index Ginkgoales/ Ginkgoaceae Gnetum Jointfir Gymnosperms/ Medicinal www.datadryad.org/resource/doi:10.5061/dryad.0vm37; genomevolution. montanum Gnetales/Gnetaceae org/coge Selaginella Resuscitation moss Lycophyta/ Medicinal plantgdb.org/SmGDB/ lepidophylla Selaginellales/ Selaginellaceae Selaginella Spikemoss Lycophyta/ Medicinal phytozome.jgi.doe.gov; genomevolution.org/CoGe; chibba.agtec.uga.edu/ moellendorffii Selaginellales/ duplication; bioinformatics.psb.ugent.be/plaza; plants.ensembl.org; www. Selaginellaceae plantgdb.org/ Selaginella Little club moss Lycophyta/ Medicinal www.ncbi.nlm.nih.gov/assembly/GCA_003024785.1; genomevolution.org/ tamariscina Selaginellales/ coge Selaginellaceae

N.A. not available arvense)60. The genomes of the Brassicaceae medicinal events within the Brassicaceae64,65. These genomes would plants Eutrema yunnanense61 and maca (Lepidium also shed light on the evolution of the hypocotyl, as has meyenii)62 have also been sequenced. With these genome been reported in maca62 and radish59. Within the Bras- sequences at hand, the genomic features of common sicaceae family, we could foresee a growing demand for ancestors and the subsequent evolution of the Brassica- the genome sequencing of horticultural Brassicaceae ceae can be clarified, such as the intron evolution within plants, both for evolutionary research and for decoding the Brassicaceae63, and gene and genome duplication the molecular basis of economically important traits. Chen et al. Horticulture Research (2019) 6:112 Page 13 of 23

Fig. 1 Statistics of genome-sequenced horticultural plant species. a Distribution of genome-sequenced horticultural plants. b Botanical distribution of genome-sequenced horticultural plants. c Annual increase in the genome-sequenced horticultural plants by botanical taxonomy. d Annual increase in the genome-sequenced horticultural plants by horticultural category. e The reported 181 horticultural plant species fall into 30 angiosperm orders. f List of the released but not reported horticultural plant species Chen et al. Horticulture Research (2019) 6:112 Page 14 of 23

Table 2 Pan-genome information of horticultural plants

Pan-genome Covered population Year of Horticultural Tool Pan-genome release category database

Glycine soja 7 2014 Wild relatives of N.A. N.A. vegetable Brassica 9 cultivars 2016 Vegetable Gbrowse, BLAST, http:// oleracea Gbrowse brassicagenome.net/ Capsicum spp. 383 cultivars, including 355 C. annuum, four C. baccatum,11 2018 Vegetable Search, Jbrowse http://www. C. chinense,13C. frutescens pepperpan.org:8012/ Helianthus 493 accessions 2018 Ornamental N.A. www. annuus sunflowergenome. org Solanum 725 accessions 2019 Vegetable N.A. N.A. lycopersicum

N.A. not available

The Cucurbitaceae family includes >3700 species (Citrofortunella microcarpa), lime (Citrus spp. hybrids), belonging to 134 genera (www.theplantlist.org). Within kumquat (Citrus japonica), and grapefruit (Citrus × this family, the genome-decoded vegetable plants include paradisi) will require genome sequencing. silver-seed gourd (Cucurbita argyrosperma)66, winter squash (Cucurbita maxima)67, pumpkin (Cucurbita Genome resequencing and the pan-genome of moschata)67, summer squash (Cucurbita pepo)68, bottle horticultural plants gourd (Lagenaria siceraria)69, and bitter melon (Momor- A single reference genome sequence is not sufficient for dica charantia)70. The genome-decoded fruit species identifying the best candidate genes for molecular breeding include muskmelon (Cucumis melo)71 and watermelon or for understanding the genomic background of a popu- (Citrullus lanatus)72. The only genome-decoded medic- lation due to the prevalence of genomic structural varia- inal plant is monk fruit (Siraitia grosvenorii)73,74. Via tions. Compared to the construction of a reference genome, analysis of these available genome sequences, it was found genome resequencing usually requires less sequencing that a tetraploid-inducing event occurred in the last coverage. It is feasible to obtain a high-quality resequenced common ancestor of the Cucurbitaceae species75. These genome via mapping to a reference genome. A pan-genome genome sequences can also help to better understand the is the summary of genomes of a species obtained by com- domestication history76 and fruit development77. paring a large number of resequenced genomes of a species Increasing numbers of the wild relatives of these eco- or, occasionally, a . A pan-genome can help to nomically important crop species, as well as those of understand the size of a core genome (defined as the con- thousands of plant cultivars, will be sequenced in the near served part among the related genomes), the size of a pan- future, providing additional details and surprises. genome, and the amount and nature of variations within a The Rutaceae or citrus family consists of 158 genera and species or a genus, which improve our understanding of the 6686 species (www.theplatlist.org). The Rutaceae fruit- evolution of a species/genu, as well as of agronomic traits. bearing plants with sequenced genomes include clem- Currently, a growing number of pan-genomes among entine (Citrus clementina)78, pomelo (Citrus grandis)79, horticultural plants have been constructed (Table 2). Ichang papeda (Citrus ichangensis)79, citrumelo (Citrus Soybean is an economically important vegetable crop; in paradisi × Poncirus trifoliate)80, mandarin orange (Citrus addition to being a source of human protein, it is an reticulata)81, sweet orange (Citrus sinensis)82, and cold- important source of vegetable oil. Glycine soja is the hardy mandarin (Citrus unshiu)83. The Rutaceae medic- closest wild relative to cultivated soybean (Glycine max). inal plants with sequenced genomes include jiu bing le The G. soja pan-genome was the first horticultural pan- (Atalantia buxifolia)79 and citron (Citrus medica)79. Via genome released, which occurred in 2014 and consisted of analysis of these genome sequences, the evolutionary seven wild accessions85 (Table 2). This pan-genome origin and evolutionary changes in the Citrus genus revealed that, when more genomes were added, the during domestication were mapped84. In the future, the number of shared genes decreased, and in contrast, the genome sequences of Rutaceae fruit-bearing plants number of total genes increased when more genomes including lemon (Citrus limon), calamansi were added. In addition, this pan-genome confirmed that Chen et al. Horticulture Research (2019) 6:112 Page 15 of 23

a single reference genome does not adequately represent domestication and quality improvement were identified88. the genomic and genetic diversity of a species. Because the Thus a pan-genome not only improves our understanding reference genome of G. soja was not previously available, of crop evolution but also is useful for the discovery of those researchers assembled all seven genomes with the novel genes and breeding. de novo assembly method, but this method was not adopted by subsequent researchers. Data storage and visualization Assembly of the B. oleracea pan-genome86 is another In addition to comprehensive plant-centric databases early trial in the genomic research of horticultural plants such as Phytozome (https://phytozome.jgi.doe.gov) and (Table 2). It is relatively small, created using nine mor- EnsemblPlants (http://plants.ensembl.org), 27 horti- phologically diverse varieties (covering two cabbage, one cultural plant-specific genome databases have been con- broccoli, one brussels sprout, one kohlrabi, two cauli- structed (Table 3). Among these, 22 provide data for flowers, and one kale plant) and a wild relative, Brassica downloading. Some databases are freely accessible to all macrocarpa. Through the analyses of this pan-genome, users, while others provide only limited access to specific we observed that 20% of genes are absent in some data or users. For example, the Genome Database for (s), and there are presence–absence variations (PAVs), Rosaceae89 requires user registration and a login to access including those related to major agronomic traits. This is the breeding data. a pioneering study that provided assembled pan-genome Visualization of genomic data of horticultural plants is contigs, pan-genome annotations, and the GBrowse tool, challenging due to the heterogeneous nature of the dif- – available at http://brassicagenome.net. ferent types of data. GBrowse90 and JBrowse91 93 are Pepper plants are important vegetable plants with dis- powerful tools that provide a visualization of various tinct fruit morphologies. The pepper pan-genome has levels of genomic features. The availability of genomic been generated for the pepper genus Capsicum87. This analysis tools also varies greatly among databases. pan-genome consists of 5 species and 383 cultivars, all of BLAST-related tools such as NCBI-BLAST94 and viro- which have 15 chromosomes. In addition to the com- BLAST95 are provided by some databases for homologous parison of PAVs among this large amount of pepper sequence searches and sequence comparisons. Gene cultivars, the pan-genome is also useful in linking the query tools can help to obtain details of genes such as association between important agronomic traits and cor- their sequence, annotation, and expression. HMMER96 responding genes. These valuable pan-genome data and searches allow the inference and extraction of gene JBrowse and other search tools are available (www. families from genomes in the database. Syntenic tools pepperpan.org:8012). allow the identification and visualization of genome-wide Sunflower plants provide seed that can be used for syntenic relationships across genomes. The BioCyc tools cooking oil and serve as popular ornamentals. The sun- (https://biocyc.org) allow users to navigate individual flower pan-genome was created by sequencing 493 pathways or the whole metabolic map of a genome for accessions, including cultivars, landraces, and wild rela- functional analyses97. tives5. A total of 61,205 genes have been identified within The Genome Database for Rosaceae (GDR), which was the gene set of the sunflower pan-genome. Via the aid of developed by the main bioinformatics laboratory at this pan-genome, the understanding of the evolutionary Washington State University89, is well known among the history of sunflower species has significantly improved, Rosaceae research community and even the plant research and genes linked to biotic stress resistance have been community. It covers the genome sequences of 18 Rosa- identified5. Although pan-genome data can be found in ceae species (Fragaria vesca, F. ananassa, F. iinumae, F. the sunflower genome database (www.sunflowergenome. nipponica, F. nubicola, F. orientalis, Malus domestica, org), no publicly accessible tool has been built to date Potentilla micrantha, Prunus avium, Prunus domestica, (accessed March 31, 2019). Prunus dulcis, Prunus persica, Prunus yedoensis, Pyrus Reference genome sequences are necessary to identify bretschneideri, Pyrus communis, Rosa chinensis, Rosa genes and to understand evolutionary trajectory. How- multiflora, and Rubus occidentalis), which are categorized ever, a pan-genome can help to uncover additional details. into seven genera: Fragaria, Malus, Potentilla, Prunus, For example, relying on the tomato genome sequence, Pyrus, Rosa, and Rubus. To facilitate online analyses, a researchers mapped only several genes and pathways series of tools are provided, including genomic tools controlling fruit ripening28. These flesh- and flavor- (BLAST+, JBrowse, Primer3, Sequence Retrieval, Map- related genes are the best targets in breeding. Moreover, Viewer, Synteny Viewer), metabolomic tools (GDRcyc, genome sequences allow comprehensive and systematic Pathway Inspector), and breeding tools (Breeding infor- analyses of fruit biology. Furthermore, via the sequencing mation Management System (BMS), Breeders Toolbox). of a tomato population and analysis of its pan-genome The same team at Washington State University also consisting of 725 accessions, the genes selected during developed a series of horticultural plant-themed Chen et al. Horticulture Research (2019) 6:112 Page 16 of 23

Table 3 List of horticultural plant-centric genome Table 3 (continued) databases Database name Covered Tools Database name Covered Tools species species Nicotiana Genome Browser Herbal Medicine Omics Database Calotropis BLAST tabacum (JBrowse) gigantea Petunia axillaris Comparative Map Viewer (herbalplant.ynau.edu.cn) Catharanthus GBrowse Petunia inflata CAPS Designer roseus Solanum solQTL: QTL Mapping Rhodiola rosea pimpinellifolium Gastrodia elata Solanum In Silico PCR Eucommia lycopersicum ulmoides Solanum Tomato Expression Camptotheca tuberosum Atlas (TEA) acuminata Solanum phureja Tomato Expression Ginkgo biloba Database (TED) Dioscorea Capsicum SolCyc Biochemical rotundata annuum Pathways Panax ginseng Petunia axillaris Coffee Interactomic Data Punica granatum Petunia x hybrida SGN Ontology Browser Boea Petunia Breeders Toolbox hygrometrica integrifolia var inflata Jatropha curcas FTP Site Glycyrrhiza uralensis Download Gene Sequences Cannabis sativa Clones, Arrays, Unigenes Macleaya and BACs cordata Unigene Converter Mentha longifoli Citrus Genome Database (CGD) Citrus clementina Breeding Information Erigeron Management System breviscapus (www.citrusgenomedb.org) Citrus BLAST+ Panax ichangensis notoginseng Citrus sinensis Breeders Toolbox Moringa oleifera Citrus reticulata GDRCyc Lepidium meyenii Citrus maxima JBrowse Dendrobium officinale Citrus medica MapViewer Salvia Poncirus trifoliata Pathway Inspector miltiorrhiza Atalantia Primer3 Genome Database for Fragaria vesca Breeding Information buxifolia Rosaceae (GDR) Management System Sequence Retrieval (www.rosaceae.org) Fragaria x BLAST+ Synteny Viewer ananassa Cool Season Food Legume Cicer arietinum JBrowse Malus x Breeders Toolbox Database (CSFL) domestica (www.coolseasonfoodlegume. Cicer reticulatum PathwayCyc Prunus GDRCyc org) armeniaca Vicia faba Breeding Information Prunus avium JBrowse Management System Prunus cerasus MapViewer Pisum sativum MapViewer Prunus dulcis Pathway Inspector Lens culinaris Synteny Viewer Prunus persica Primer3 BLAST+ Prunus serotina Sequence Retrieval Cucurbit Genomics Database Cucumis sativus BLAST Pyrus communis Synteny Viewer (CuGenDB) Rubus (cucurbitgenomics.org) Cucumis melo JBrowse occidentalis Citrullus lanatus Batch Query Sol Genomics Network Solanum BLAST Cucurbita Synteny Viewer pennellii maxima (solgenomics.net) Solanum VIGS Tool Cucurbita CMAP lycopersicoides moschata Nicotiana Alignment Analyzer Cucurbita pepo CucurbitCyc attenuata Lagenaria Pathway enrichment Nicotiana Tree Browser siceraria benthamiana GO enrichment Chen et al. Horticulture Research (2019) 6:112 Page 17 of 23

Table 3 (continued) Table 3 (continued)

Database name Covered Tools Database name Covered Tools species species

Gene classification (genome.ccbr.utoronto.ca/cgi- GBrowser Banana Genome Hub Musa acuminata BLAST bin/hgGateway) DH-Pahang Design primer (banana-genome-hub. Musa acuminata JBrowse Carnation DB Dianthus BLAST southgreen.fr) Banksii caryophyllus Musa acuminata GBrowser (carnation.kazusa.or.jp) Zebrina HopBase Humulus lupulus BLAST Musa acuminata Generic Maps (hopbase.cgrb.oregonstate.edu) Gbrowse Calcutta 4 Medicago truncatula Genome Medicago JBrowse Musa Gene Family Database (MTGD) truncatula balbisiana PKW (www.medicagogenome.org) BLAST Musa Itinerans Chromosome viewer Web Services Musa Transcriptomic Search schizocarpa CMap (LegumeInfo.org) Design primer GO Analysis Ontology Browser InterPro Annotations Dotplot Tea Plant Information Camellia sinensis BLAST Archive (TPIA) Brassica database (BRAD) Brassica rapa BLAST (tpia.teaplant.org/) Gbrowse (brassicadb.org) Brassica juncea Gbrowse Pathway Brassica napus Markers and Maps Correlation Analysis Brassica oleracea Gene families Function Erichment Glucosinolate genes Batch Retrival Anthocyanin genes Mulberry Genome Database Morus notabilis Transposable Element Resistance genes (MorusDB) Analysis Flower genes (morus.swu.edu.cn/morusdb) Horizontal Gene Transfer Transcription factors Analysis Auxin genes Ortholog and Paralog Phenotypes Group Analysis People/Labs BLAST Pepper Pangenome Browser Capsicum Generic genome WEGO (PepperPan) annuum browser HMMER (www.pepperpan.org:8012) Capsicum Browse GO baccatum Search GO Capsicum Find Motifs frutescens Pear Genome Project Pyrus Download Cofffee Genome Hub (CGH) Coffea Advanced Search bretschneideri canephora (peargenome.njau.edu.cn) (www.coffee-genome.org/ Coffea arabica Chromosome Viewer coffeacanephora) Radish Genome database Raphanus sativus BLAST Gene annotation (www.radish-genome.org/) Gbrowse Gene Expression Expression Gene Families CsiDB Citrus sinensis Gene Search Genetic Map (citrus.hzau.edu.cn) BLAST Primer Blaster GBrowser Primer Designer PPI SNPs Pathway Blast Mint Genomics Resource Mentha longifolia BLAST JBrowse (langelabtools.wsu.edu/mgr/ Gbrowse organism/Mentha/longifolia) GBrowser Pathway Viggs Vigna marina Gbrowse subsp. oblonga CeleryDB Apium BLAST graveolens (viggs.dna.affrc.go.jp) Vigna angularis BLAST (apiaceae.njau.edu.cn) GBrowser Vigna angularis BLAT (Willd.) Transcription factors Vigna vexillata CarrotDB Daucus carota BLAST Cannabis genome Cannabis sativa BLAST (apiaceae.njau.edu.cn/) Gbrowse project (CCBR) Transcription factors Chen et al. Horticulture Research (2019) 6:112 Page 18 of 23

Table 3 (continued) Table 3 (continued)

Database name Covered Tools Database name Covered Tools species species

Germplasm Resources (genome.ccbr.utoronto.ca/cgi- GBrowser Collection bin/hgGateway) Banana Genome Hub Musa acuminata BLAST Design primer DH-Pahang Carnation DB Dianthus BLAST (banana-genome-hub. Musa acuminata JBrowse caryophyllus southgreen.fr) Banksii (carnation.kazusa.or.jp) Musa acuminata GBrowser HopBase Humulus lupulus BLAST Zebrina (hopbase.cgrb.oregonstate.edu) Gbrowse Musa acuminata Generic Maps Calcutta 4 Medicago truncatula Genome Medicago JBrowse Database (MTGD) truncatula Musa Gene Family balbisiana PKW (www.medicagogenome.org) BLAST Musa Itinerans Chromosome viewer Web Services Musa Transcriptomic Search CMap (LegumeInfo.org) schizocarpa GO Analysis Design primer InterPro Annotations Ontology Browser Tea Plant Information Camellia sinensis BLAST Dotplot Archive (TPIA) Brassica database (BRAD) Brassica rapa BLAST (tpia.teaplant.org/) Gbrowse (brassicadb.org) Brassica juncea Gbrowse Pathway Brassica napus Markers and Maps Correlation Analysis Brassica oleracea Gene families Function Erichment Glucosinolate genes Batch Retrival Anthocyanin genes Mulberry Genome Database Morus notabilis Transposable Element (MorusDB) Analysis Resistance genes (morus.swu.edu.cn/morusdb) Horizontal Gene Transfer Flower genes Analysis Transcription factors Ortholog and Paralog Auxin genes Group Analysis Phenotypes BLAST People/Labs WEGO Pepper Pangenome Browser Capsicum Generic genome HMMER (PepperPan) annuum browser Browse GO (www.pepperpan.org:8012) Capsicum Search GO baccatum Find Motifs Capsicum frutescens Pear Genome Project Pyrus Download bretschneideri Cofffee Genome Hub (CGH) Coffea Advanced Search canephora (peargenome.njau.edu.cn) (www.coffee-genome.org/ Coffea arabica Chromosome Viewer Radish Genome database Raphanus sativus BLAST coffeacanephora) (www.radish-genome.org/) Gbrowse Gene annotation Expression Gene Expression CsiDB Citrus sinensis Gene Search Gene Families (citrus.hzau.edu.cn) BLAST Genetic Map GBrowser Primer Blaster PPI Primer Designer Pathway SNPs Mint Genomics Resource Mentha longifolia BLAST Blast (langelabtools.wsu.edu/mgr/ Gbrowse JBrowse organism/Mentha/longifolia) GBrowser Pathway Viggs Vigna marina Gbrowse CeleryDB Apium BLAST subsp. oblonga graveolens (viggs.dna.affrc.go.jp) Vigna angularis BLAST (apiaceae.njau.edu.cn) GBrowser Vigna angularis BLAT Transcription factors (Willd.) CarrotDB Daucus carota BLAST Vigna vexillata (apiaceae.njau.edu.cn/) Gbrowse Cannabis genome Cannabis sativa BLAST Transcription factors project (CCBR) Germplasm Resources Collection Chen et al. Horticulture Research (2019) 6:112 Page 19 of 23

databases, including the Citrus Genome Database, Cool- visualization of genomic data, and syntenic tools are Season Food Legume Crop Database resources, and useful for comparative analyses. Genome Database for Vaccinium (GRIN). All these The Herbal Medicine Omics Database103 includes databases share a similar data process standard and have genomic, transcriptomic, pathway, and metabolomic data built-in bioinformatics tools. for medicinal plants, although the medicinal properties of The Sol Genomics Network (SGN)98, a database of some plants are recognized only in some parts of the Solanaceae genomic and phenotypic data and tools, was world. In this database, hundreds of medicinal plants are developed by Mueller’s team from the Boyce Thompson included. However, the database currently provides only Institute for Plant Research and Cornell University. The the BLAST and GBrowse tools for the visualization of SGN includes 11 genomes: those of Solanum lycopersi- omics data. Other collected omic data can be downloaded cum, S. lycopersicoides, S. pimpinellifolium, S. tuberosum, but cannot be analyzed or visualized online. S. pennellii, Capsicum annuum, Nicotiana attenuata, N. There are other tool-specific databases that can be very benthamiana, N. tabacum, Petunia axillaris, and P. useful for the visualization and online analyses of horti- inflata. These species are categorized into four econom- cultural plant genome sequences. The Plant Genome ically important genera: Solanum, Capsicum, Nicotiana, Duplication Database (PGDD)104 offers online analyses of and Petunia. For online analyses of genomic sequences, gene synteny and visualization of different results, such as BLAST, Alignment Analyzer, Tree Browser, and VIGS dot plots (macrosynteny) and local genomic comparison tools are available. For mapping of various data, JBrowse, plots (microsynteny). The built-in Map-View tool allows Comparative Map Viewer, CAPS Designer, and solQTL mapping of a given sequence to the genomes of 47 species are provided. Some tools have been developed for com- from the PGDD (data accessed on March 31, 2019). The mon molecular wet laboratory experiments, including In- Plant Duplicate Gene Database105 is a collection of 141 Silico PCR, the Tomato Expression Atlas, and the Tomato plant species and offers online analysis and visualization Expression Database. Systems biology tools such as Sol- of duplicated genes in select species. Cyc Biochemical Pathways99, Coffee Interactome Data, and the SGN Ontology Browser are provided. The Bree- Discussions and future perspectives ders Toolbox was developed for breeders. The same team The horticultural plant genome project also developed a series of horticultural plant-themed It is challenging to determine the exact number of databases, including the YamBase (https://yambase.org), species or cultivars that exist for horticultural plants. In CassavaBase (https://cassavabase.org), and MusaBase terms of fruit-bearing plants, at least 91 species are eco- (https://musabase.org) databases. All these databases nomically important and produce fruit that are consumed adhere to the release of genomic data before publication (https://simple.wikipedia.org/wiki/List_of_fruits). More (the Toronto Agreement)100. than 200 vegetable plants are consumed (https://simple. The Cucurbit Genomics Database (CuGenDB)101 cur- wikipedia.org/wiki/List_of_vegetables). The exact number rently hosts eight high-quality genome sequences corre- of ornamentals is also unclear, as novel cultivars are sponding to those of cucumber (Cucumis sativus), water produced each year. However, it has been estimated that melon (Citrullus lanatus), winter squash (Cucurbita there are >6000 ornamental cultivars (https://www.rhs. maxima), pumpkin (Cucurbita moschata), summer org.uk/plants/pdfs/agm-lists/agm-ornamentals-(1).pdf), squash (Cucurbita pepo), muskmelon (Cucumis melo), and many cultivars are created and disappear each year. bottle gourd (Lagenaria siceraria), and silver-seed gourd Up to December 2018, genome sequences had been (Cucurbita argyrosperma). The search and batch query decoded for only 181 species, accounting for only a small system allow searching for sequences and annotations. To proportion of the total horticultural plant species. Hence, display genomic details, the JBrowse, BLAST, Gene there is a strong need to sequence additional genomes for Ontology (GO), Synteny Viewer, CAMP, and expression more horticultural plants that would be valuable for viewer tools are available. To display metabolic pathways, comparative genomics, to better understand their evolu- CucurbitCyc and Pathway enrichment tools are available. tionary history, and to possibly make genetic modifica- The Brassica Database (BARD)102, a database of tions to better utilize these plant species. important Brassica species, covers the vegetable species Here we propose a horticultural plant genome project Brassica rapa and B. oleracea, as well as the model plant (HPGP) with three goals (Fig. 2). The first goal of the Arabidopsis and Brassicaceae close relatives. In addition HPGP is to generate reference genome sequences for all to its genomic data, the BRAD database hosts a curated horticultural plants, after which pan-genomes and core list of genes involved with anthocyanins, resistance, auxin, collections would be generated as genetic banks for hor- flowering, and glucosinolates and a full list of gene ticultural plants. Two recently developed genome families that are of considerable importance in Brassica assembly methods could be applied to decode highly – research. BLAST and JBrowse tools were built for ploidy71 and highly heterozygous106 108 horticultural Chen et al. Horticulture Research (2019) 6:112 Page 20 of 23

Highly ploidy genome assembly Highly heterozygous genome assembly diploid genome hexaploid genome hybridization For each parent: 1, NGS 2, long read sequencing sequencing Goal 1 reads assembly graph Genome sequencing: haplotype 1 haplotype 1 # List of hort. crops haplotype 2 F1 For F1: and close wild relatives; haplotype 3 long read sequencing # reference genome; haplotype 2 haplotype 4 haplotype 5 F1 assembly # pan genome haplotype 6 mapping # core collection assembly graph draft parental genome

bubble haplotype haplotype and phasing parental Consensus phasing

Types of structural variation 1. Deletion 2. Novel sequence insertion 3. Mobile-sequence insertion Goal 2 Ref. Ref. Ref. Variation detection: # identify & compare variations; # Identify mechanistic signatures

4. Tandem duplication 5. Interspersed duplication 6. Inversion 7. translocation Ref. Ref. Ref. Ref.

QTL GWAS

materials DNA parents markers AAC AC C AA AAC C X Goal 3 materials Link with phenotypes *** Trait *** # Quantitative traits; F1 values Trait # Discrete traits values

F2 DNA population DNA markers Variety/ A C cultivar variants population

Fig. 2 The proposed roadmap to the horticultural plant genome project (HPGP). The first goal of HPGP is to generate all reference genome sequences for horticultural plants, after which pan-genomes and core collections will be generated as a gene bank for horticultural plants. Two recently developed methods would be applied to decode the highly ploidy and highly heterozygous horticultural genomes. The second goal is to detect the various genomic variations within a pan-genome. In addition, the mechanistic signatures leading to the variations would be explored. The third goal is to link the phenotypes with the genomic regions. Two methods would be applied: the quantitative trait locus (QTL) method to correlate genomic variations with a quantitative trait and the genome-wide association study (GWAS) method to associate genomic variation with many genomic variations from different individuals ***p < 0.001 genomes. The second goal is to identify the various Project and the 1000-Plant Genome Project will accelerate genomic variations within a pan-genome. In addition, the the genome sequencing process of horticultural plants. mechanistic signatures leading to the variations would be The timeline for obtaining the genome sequences of all explored. The third goal is to link the phenotypes to the horticultural plants at both draft and reference scales genomic regions. Two methods would be applied: quan- (goal one of the HPGP) would be short—within 3–5 years titative trait locus methods to correlate genomic varia- —because the cost for sequencing is dropping rapidly. tions with a quantitative trait and genome-wide However, collecting and sequencing the population defi- association study methods to associate genomic variation nitely requires worldwide collaborations and would take with many genomic variations from different indivi- >10 years. The second goal is to analyze the genomic duals109,110. The good news is that the Earth Genome variations to identify the mechanistic signatures within a Chen et al. Horticulture Research (2019) 6:112 Page 21 of 23

population, which is also time consuming and would be of these concerns make it hard to study these traits in gradually achieved. The third goal is an advanced step that model plants or via regular tools. Uniting the various omic occurs after or concurrently with the second goal. data and the development of novel tools for horticultural Although these last two goals appear to be enormous plants are needed. Moreover, aside from the compre- challenges, we are confident in the ability to achieve most hensive plant databases and the 27 horticultural plant- of these two goals in model horticultural plants such as specific databases mentioned above, there is still an the tomato, cucumber, and strawberry in the increasing need to find and compare an increased amount coming years. of data for horticultural plants. However, horticultural In addition, the quality of assembly and annotation of biologists usually need to frequently deal with breeders; existing reference genomes of horticultural plants need to thus the need to create a comprehensive horticultural be further improved. Although a few tools such as database to meet the interests of basic biologists and BUSCO111 and CEGMA112 have been widely used to breeders is largely required. Such a database should cover evaluate the quality of genome annotations, a good stan- as many horticultural plant genomes as possible and dard is still not available for the systematic evaluation of should provide an integrated set of bioinformatics tools. the quality of genome assemblies. As a result, the quality We believe that, in the future, the need for such a com- of the genome assemblies is very uneven and is sometimes prehensive database of all horticultural plants will satisfy related to the complexity or heterozygosity of the taxa. additional horticulture researchers and breeders. This situation is changing as sequencing platforms are Given the advancement of sequencing technologies and being upgraded. For example, since the first apple genome reduced costs, the genome sequencing data of horti- sequence was released in 2010 based on next-generation cultural plants are accumulating rapidly. The storage, sequencing technology15, an improved version produced analyses, and sharing of large collections of genome by next-generation sequencing (NGS) and PacBio tech- sequencing data are becoming even more laborious and nologies was released in 2016113. The third improved time consuming. The integrative analysis of various omic version of the apple genome, which was obtained using a data, such as genomic, transcriptomic, metabolomic, combination of NGS, PacBio, and Bionano technologies, phenomic, and breeding data, have become a major was released in 2017114. The fourth improved version was challenge for many horticultural biologists and requires released in 2019, based on the utilization of NGS, PacBio, coordinated efforts of scientists from different fields. For and Hi-C technologies27. In the future, the quality of the data processing and visualization, we recommend using reference genome should reach certain minimal standards BioMart tools, which could be easily built into a database. upon which the community can agree, similar to the For database construction, we suggest following the proposal for bacteria and archaea115, thereby leading to template of the Tripal series (www.tripal.infor)8. Finally, more accurate pan-genome analyses and biotechnology. we believe that, with a fostered collaboration of the hor- Storage and access of genomic data constitute another ticultural community, the HPGP and subsequent knowl- problem concerning horticultural biologists and bioin- edge and experiences will greatly benefit biology formatics scientists. For access to genome sequences and researchers and breeders. raw sequencing data, a number of public databases are fi usually the rst choice of researchers due to the nature of Acknowledgements their stability, low cost, and ease of access. The well- This work was supported by the National Natural Science Foundation of China known public databases include the NCBI (https://ncbi. (31801898), the Natural Science Foundation of Fujian Province, China (Kjd18033A), open funds of the State Key Laboratory of Crop Genetics and nlm.nih.gov), EMBL (www.embl.org), CNGB (www.cngb. Germplasm Enhancement (ZW201909), the State Key Laboratory of Tree org), BIGD (bigd.big.ac.cn), DDBJ (www.ddbj.nig.ac.jp), Genetics and Breeding (TGB2018004), and the Outstanding Youth Program of GigaDB (gigadb.org), Dryad (www.datadryad.org), and Fujian Agriculture and Forestry University. Phytozome (https://phytozome.jgi.doe.gov) databases. To share these data with worldwide researchers, we encou- Author details rage the release of data before publication, as was sug- 1College of Horticulture, Nanjing Agricultural University, Nanjing 210095, China. 2 gested by the Toronto Agreement in 2009100. College of Crop Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China. 3State Key Laboratory of Subtropical Silviculture, School of Forestry and Biotechnology, Zhejiang A&F University, Hangzhou 311300, China. The need for a horticultural plant-centric database 4Department of Biology, Saint Louis University, St. Louis, MO 63103, USA. 5 Unlike agricultural plants, horticultural plants share Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology and Quality Science and Processing Technology in Special Starch, Key Laboratory of multiple features. For example, plant growth requires Ministry of Education for Genetics & Breeding and Multiple Utilization of Crops, controlled conditions with specific equipment or facilities; College of Crop Science, Fuzhou, China plants generally need grafting, postharvest treatment, and a long juvenile phase; and plants usually undergo asexual Conflict of interest reproduction and have unique specialized metabolism. All The authors declare that they have no conflict of interest. Chen et al. Horticulture Research (2019) 6:112 Page 22 of 23

Received: 10 April 2019 Revised: 27 July 2019 Accepted: 10 August 2019 27. Zhang, L. et al. A high-quality apple genome assembly reveals the associa- tion of a retrotransposon and red fruit colour. Nat. Commun. 10, 1494 (2019). 28. Sato, S. et al. The tomato genome sequence provides insights into fleshy fruit evolution. Nature 485,635–641 (2012). 29. Razali, R. et al. The genome sequence of the wild tomato Solanum pimpi- References nellifolium provides insights into salinity tolerance. Front. Plant Sci. 9,1402 1. Egea, L. A., Merida-Garcia, R., Kilian, A., Hernandez, P. & Dorado, G. Assessment (2018). of genetic diversity and structure of large garlic (Allium sativum)germplasm 30. Xu, X. et al. Genome sequence and analysis of the tuber crop potato. Nature “ ” bank, by diversity arrays technology genotyping-by-sequencing platform 475,189–194 (2011). (DArTseq). Front. Genet. 8, 98 (2017). 31. Qin,C.etal.Whole-genomesequencing of cultivated and wild peppers 2. Peska, V., Mandakova, T., Ihradska, V. & Fajkus, J. Comparative dissection of provides insights into Capsicum domestication and specialization. Proc. Natl three giant genomes: Allium cepa, Allium sativum,andAllium ursinum. Int. J. Acad.Sci.USA111, 5135–5140 (2014). Mol. Sci. 20, E733 (2019). 32. Kim, S. et al. Genome sequence of the hot pepper provides insights into 3. Li, H. et al. The wolds of wine: old, new and ancient. Wine Econ. Pol. 7, the evolution of pungency in Capsicum species. Nat. Genet. 46, 270–278 – 178 182 (2018). (2014). 4. Zheng, Z., Chen, J. & Deng, X. Historical perspectives, management, and 33. Ahn, Y. K. et al. Whole genome resequencing of Capsicum baccatum and current research of citrus HLB in Guangdong Province of China, where the Capsicum annuum to discover single nucleotide polymorphism related to disease has been endemic for over a hundred years. Phytopathology 108, Powdery Mildew resistance. Sci. Rep. 8, 5188 (2018). – 1224 1236 (2018). 34. Hirakawa, H. et al. Draft genome sequence of eggplant (Solanum melongena fl 5. Hubner, S. et al. Sun ower pan-genome analysis shows that hybridization L.): the representative solanum species indigenous to the old world. DNA Res. – altered gene content and disease resistance. Nat. Plants 5,54 62 (2019). 21,649–660 (2014). 6. Jaillon, O. et al. The grapevine genome sequence suggests ancestral hex- 35. Hoshino, A. et al. Genome sequence and analysis of the Japanese morning – aploidization in major angiosperm phyla. Nature 449,463 467 (2007). glory Ipomoea nil. Nat. Commun. 7, 13295 (2016). 7. Leisner, C. P. et al. Genome sequence of M6, a diploid inbred clone of the 36. Sierro, N. et al. Reference genomes and transcriptomes of Nicotiana sylvestris high-glycoalkaloid-producing tuber-bearing potato species Solanum cha- and Nicotiana tomentosiformis. Genome Biol. 14, R60 (2013). – coense, reveals residual heterozygosity. Plant J. 94, 562 570 (2018). 37. Bombarely, A. et al. Insight into the evolution of the Solanaceae from the 8. Chen, F. et al. The sequenced angiosperm genomes and genome databases. parental genomes of Petunia hybrida. Nat. Plants 2, 16074 (2016). Front. Plant Sci. 9, 418 (2018). 38. Ruggieri,V.,Bostan,H.,Barone,A.,Frusciante,L.&Chiusano,M.L.Integrated 9. Gardner, E. M., Johnson, M. G., Ragone, D., Wickett, N. J. & Zerega, N. J. Low- bioinformatics to decipher the ascorbic acid metabolic network in tomato. coverage, whole-genome sequencing of Artocarpus camansi (Moraceae) for Plant Mol. Biol. 91,397–412 (2016). phylogenetic marker development and gene discovery. Appl Plant Sci. 4, 39. Sierro,N.etal.Thetobaccogenome sequence and its comparison with apps.1600017 (2016). those of tomato and potato. Nat. Commun. 5, 3833 (2014). fi 10. Mori, K. et al. Identi cation of RAN1 orthologue associated with sex deter- 40. Varshney, R. K. et al. Draft genome sequence of pigeonpea (Cajanus cajan), fi mination through whole genome sequencing analysis in g(Ficus carica L.). an orphan legume crop of resource-poor farmers. Nat. Biotechnol. 30,83–89 Sci. Rep. 7, 41124 (2017). (2012). 11. Liu, M. J. et al. The complex jujube genome provides insights into fruit tree 41. Varshney, R. K. et al. Draft genome sequence of chickpea (Cicer arietinum) biology. Nat. Commun. 5, 5315 (2014). provides a resource for trait improvement. Nat. Biotechnol. 31,240–246 12. Edger, P. P. et al. Origin and evolution of the octoploid strawberry genome. (2013). – Nat. Genet. 51,541 547 (2019). 42. Gupta, S. et al. Draft genome sequence of Cicer reticulatum L., the wild 13. Shulaev,V.etal.Thegenomeofwoodlandstrawberry(Fragaria vesca). Nat. progenitor of chickpea provides a resource for agronomic trait improvement. – Genet. 43,109 116 (2011). DNA Res. 24,1–10 (2017). 14. Hirakawa, H. et al. Dissection of the octoploid strawberry genome by deep 43. Schmutz,J.etal.Genomesequence of the palaeopolyploid soybean. Nature – sequencing of the genomes of Fragaria species. DNA Res. 21,169181 463,178–183 (2010). (2014). 44. Young, N. D. et al. The Medicago genome provides insight into the evolution 15. Velasco, R. et al. The genome of the domesticated apple (Malus x domestica of rhizobial symbioses. Nature 480,520–524 (2011). – Borkh.). Nat. Genet. 42, 833 839 (2010). 45. Schmutz, J. et al. A reference genome for common bean and genome-wide 16. He, N. et al. Draft genome sequence of the mulberry tree Morus notabilis. analysis of dual domestications. Nat. Genet. 46,707–713 (2014). Nat. Commun. 4, 2445 (2013). 46. Cooper, J. W. et al. Enhancing faba bean (Vicia faba L.) genome resources. J. 17. Shirasawa, K. et al. The genome sequence of sweet cherry (Prunus avium)for Exp. Bot. 68, 1941–1953 (2017). – use in genomics-assisted breeding. DNA Res. 24, 499 508 (2017). 47. Kang, Y. J. et al. Draft genome sequence of adzuki bean, Vigna angularis. Sci. 18. Verde, I. et al. The high-quality draft genome of peach (Prunus persica) Rep. 5, 8069 (2015). fi identi es unique patterns of genetic diversity, domestication and genome 48. Kang, Y. J. et al. Genome sequence of mungbean and insights into evolution – evolution. Nat. Genet. 45, 487 494 (2013). within Vigna species. Nat. Commun. 5, 5443 (2014). 19. Wu, J. et al. The genome of the pear (Pyrus bretschneideri Rehd.). Genome Res. 49. Griesmann,M.etal.Phylogenomicsreveals multiple losses of nitrogen-fixing – 23,396 408 (2013). root nodule symbiosis. Science 361, eaat1743 (2018). 20. Chagne, D. et al. The draft genome sequence of European pear (Pyrus 50. Hane, J. K. et al. A comprehensive draft genome sequence for lupin (Lupinus ‘ ’ communis L. Bartlett ). PLoS ONE 9, e92644 (2014). angustifolius), an emerging health food: insights into plant-microbe interac- 21. VanBuren, R. et al. A near complete, chromosome-scale assembly of the tions and legume evolution. Plant Biotechnol. J. 15, 318–330 (2017). black raspberry (Rubus occidentalis) genome. Gigascience 7, giy094 (2018). 51. Mochida, K. et al. Draft genome assembly and annotation of Glycyrrhiza 22. Zhang, Q. X. et al. The genome of Prunus mume. Nat. Commun. 3,1318 uralensis,amedicinallegume.Plant J. 89,181–194 (2017). (2012). 52. De Vega, J. J. et al. Red clover (Trifolium pratense L.) draft genome provides a 23. Baek, S. et al. Draft genome sequence of wild Prunus yedoensis reveals platform for trait improvement. Sci. Rep. 5, 17394 (2015). fi fl massive inter-speci c hybridization between sympatric owering cherries. 53. Cullis, C. & Kunert, K. J. Unlocking the potential of orphan legumes. J. Exp. Bot. Genome Biol. 19, 127 (2018). 68,1895–1903 (2017). fl 24. Nakamura, N. et al. Genome structure of Rosa multi ora, a wild ancestor of 54. Yang, J. et al. The genome sequence of allopolyploid Brassica juncea and – cultivated roses. DNA Res. 25,113 121 (2018). analysis of differential homoeolog gene expression influencing selection. 25.Raymond,O.etal.TheRosa genome provides new insights into the Nat. Genet. 48,1225–1232 (2016). – domestication of modern roses. Nat. Genet. 50, 772 777 (2018). 55. Liu,S.Y.etal.TheBrassica oleracea genome reveals the asymmetrical evo- 26. Lu, M., An, H. M. & Li, L. L. Genome survey sequencing for the characterization lution of polyploid genomes. Nat. Commun. 5, 3930 (2014). of the genetic background of Rosa roxburghii tratt and leaf ascorbate 56. Wang, X. W. et al. The genome of the mesopolyploid crop species Brassica metabolism genes. PLoS ONE 11, e0147530 (2016). rapa. Nat. Genet. 43,1035–1039 (2011). Chen et al. Horticulture Research (2019) 6:112 Page 23 of 23

57. Kasianov, A. S. et al. High-quality genome assembly of Capsella bursa-pastoris 84. Wu, G. A. et al. Genomics of the origin and evolution of Citrus. Nature 554, reveals asymmetry of regulatory elements at early stages ofpolyploid gen- 311–316 (2018). ome evolution. Plant J. 91,278–291 (2017). 85. Li, Y. H. et al. De novo assembly of soybean wild relatives for pan-genome 58. Slotte, T. et al. The Capsella rubella genome and the genomic consequences analysis of diversity and agronomic traits. Nat. Biotechnol. 32, 1045–1052 (2014). ofrapidmatingsystemevolution.Nat. Genet. 45, 831–835 (2013). 86. Golicz, A. A. et al. The pangenome of an agronomically important crop plant 59. Kitashiba, H. et al. Draft sequences of the radish (Raphanus sativus L.) gen- Brassica oleracea. Nat. Commun. 7, 13390 (2016). ome. DNA Res. 21, 481–490 (2014). 87. Ou, L. J. et al. Pan-genome of cultivated pepper (Capsicum) and its use in 60. Dorn, K. M., Fankhauser, J. D., Wyse, D. L. & Marks, M. D. A draft genome of gene presence-absence variation analyses. New Phytol. 220,360–363 (2018). field pennycress (Thlaspi arvense) provides tools for the domestication of a 88. Gao, L. et al. The tomato pan-genome uncovers new genes and a rare allele new winter biofuel crop. DNA Res. 22,121–131 (2015). regulating fruit flavor. Nat. Genet. 51, 1044–1051 (2019). 61. Guo, X. et al. The genomes of two Eutrema species provide insight into plant 89. Jung, S. et al. The Genome Database for Rosaceae (GDR): year 10 update. adaptation to high altitudes. DNA Res. https://doi.org/10.1093/dnares/dsy003 Nucleic Acids Res. 42,D1237–D1244 (2014). (2018). 90.Stein,L.D.UsingGBrowse2.0tovisualize and share next-generation 62. Zhang, J. et al. Genome of plant maca (Lepidium meyenii) illuminates sequence data. Brief Bioinform. 14,162–171 (2013). genomic basis for high-altitude adaptation in the central Andes. Mol. Plant 9, 91. Westesson, O., Skinner, M. & Holmes, I. Visualizing next-generation sequen- 1066–1077 (2016). cing data with JBrowse. Brief Bioinform. 14,172–177 (2013). 63. Milia, G., Camiolo, S., Avesani, L. & Porceddu, A. The dynamic loss and gain of 92. Hofmeister, B. T. & Schmitz, R. J. Enhanced JBrowse plugins for epigenomics introns during the evolution of the Brassicaceae. Plant J. 82,915–924 (2015). data visualization. BMC Bioinformatics 19, 159 (2018). 64. Singh, S., Das, S. & Geeta, R. A segmental duplication in the common 93. Buels, R. et al. JBrowse: a dynamic web platform for genome visualization ancestor of Brassicaceae is responsible for the origin of the paralogs KCS6- and analysis. Genome Biol. 17,66(2016). KCS5, which are not shared with other angiosperms. Mol. Phylogenet. Evol. 94. Johnson, M. et al. NCBI BLAST: a better web interface. Nucleic Acids Res. 36, 126,331–345 (2018). W5–W9 (2008). 65.Murat,F.etal.UnderstandingBrassicaceae evolution through ancestral 95. Deng, W., Nickle, D. C., Learn, G. H., Maust, B. & Mullins, J. I. ViroBLAST: a stand- genome reconstruction. Genome Biol. 16, 262 (2015). alone BLAST web server for flexible queries of multiple databases and user’s 66. Barrera-Redondo, J. et al. The genome of Cucurbita argyrosperma (silver-seed datasets. Bioinformatics 23,2334–2336 (2007). gourd) reveals faster rates of protein-coding gene and long noncoding RNA 96. Potter, S. C. et al. HMMER web server: 2018 update. Nucleic Acids Res. 46, turnover and neofunctionalization within Cucurbita. Mol. Plant 12,506–520 W200–W204 (2018). (2019). 97. Caspi,R.etal.TheMetaCycdatabase of metabolic pathways and enzymes. 67. Sun, H. et al. Karyotype stability and unbiased fractionation in the paleo- Nucleic Acids Res. 46,D633–D639 (2018). allotetraploid cucurbita genomes. Mol. Plant 10, 1293–1306 (2017). 98. Fernandez-Pozo, N. et al. The Sol Genomics Network (SGN)-from genotype 68. Montero-Pau, J. et al. De novo assembly of the zucchini genome reveals a to phenotype to breeding. Nucleic Acids Res. 43,D1036–D1041 (2015). whole-genome duplication associated with the origin of the Cucurbita 99. Foerster, H. et al. SolCyc: a database hub at the Sol Genomics Network (SGN) genus. Plant Biotechnol. J. 16,1161–1171 (2018). for the manual curation of metabolic networks in Solanum and Nicotiana 69. Wu, S. et al. The bottle gourd genome provides insights into Cucurbitaceae specific databases. Database 2018,bay035(2018). evolution and facilitates mapping of a Papaya ring-spot virus resistance locus. 100. Toronto International Data Release Workshop Authors et al.Prepublication Plant J. 92,963–975 (2017). data sharing. Nature 461,168–170 (2009). 70. Urasaki, N. et al. Draft genome sequence of bitter gourd (Momordica char- 101. Zheng, Y. et al. Cucurbit Genomics Database (CuGenDB): a central portal for antia), a vegetable and medicinal plant in tropical and subtropical regions. comparative and functional genomics of cucurbit crops. Nucleic Acids Res. 47, DNA Res. 24,51–58 (2017). D1128–D1136 (2019). 71. Garcia-Mas, J. et al. The genome of melon (Cucumis melo L.). Proc. Natl Acad. 102. Cheng, F. et al. BRAD, the genetics and genomics database for Brassica Sci. USA 109, 11872–11877 (2012). plants. BMC Plant Biol. 11, 136 (2011). 72. Guo, S. et al. The draft genome of watermelon (Citrullus lanatus) and rese- 103. Wang, X. et al. HMOD: an omics database for herbal medicine plants. Mol. quencing of 20 diverse accessions. Nat. Genet. 45,51–58 (2013). Plant 11,757–759 (2018). 73. Itkin, M. et al. The biosynthetic pathway of the nonsugar, high-intensity 104. Lee, T. H., Tang, H. B., Wang, X. Y. & Paterson, A. H. PGDD: a database of gene sweetener mogroside V from Siraitia grosvenorii. Proc. Natl Acad. Sci. USA 113, and genome duplication in plants. Nucleic Acids Res. 41,D1152–D1158 E7619–E7628 (2016). (2013). 74. Xia, M. et al. Improved de novo genome assembly and analysis of the 105. Qiao, X. et al. Gene duplication and evolution in recurring polyploidization- Chinese cucurbit Siraitia grosvenorii, also known as monk fruit or luo-han- diploidization cycles in plants. Genome Biol. 20,38(2019). guo. Gigascience 7, giy067 (2018). 106. Tang, H. B. Disentangling a polyploid genome. Nat. Plants 3,688–689 (2017). 75. Wang, J. et al. An Overlooked paleotetraploidization in Cucurbitaceae. Mol. 107. Zhu, T. et al. Sequencing a Juglans regia×J. microcarpa hybrid yields high- Biol. Evol. 35,16–26 (2018). quality genome assemblies of parental species. Hortic. Res. 6,55(2019). 76. Yang, L. M. et al. Chromosome rearrangements during domestication of 108. Wu, G. A. & Gmitter, F. G. Novel assembly strategy cracks open the mysteries cucumber as revealed by high-density genetic mapping and draft genome of walnut genome evolution. Hortic. Res. 6,57(2019). assembly. Plant J. 71, 895–906 (2012). 109. Smedley, D. et al. The BioMart community portal: an innovative alternative to 77. Shang, Y. et al. Biosynthesis, regulation, and domestication of bitterness in large, centralized data repositories. Nucleic Acids Res. 43,W589–W598 (2015). cucumber. Science 346,1084–1088 (2014). 110. Yano, K. et al. Genome-wide association study using whole-genome 78. Wu, G. A. et al. Sequencing of diverse mandarin, pummelo and orange sequencing rapidly identifies new genes influencing agronomic traits in rice. genomes reveals complex history of admixture during citrus domestication. Nat. Genet. 48,927–934 (2016). Nat. Biotechnol. 32,656–662 (2014). 111. Simao, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. 79. Wang, X. et al. Genomic analyses of primitive, wild and cultivated citrus BUSCO: assessing genome assembly and annotation completeness with provide insights into asexual reproduction. Nat. Genet. 49,765–772 (2017). single-copy orthologs. Bioinformatics 31,3210–3212 (2015). 80. Zhang, Y., Barthe, G., Grosser, J. W. & Wang, N. Transcriptome analysis of root 112. Parra, G., Bradnam, K. & Korf, I. CEGMA: a pipeline to accurately annotate core response to citrus blight based on the newly assembled Swingle citrumelo genes in eukaryotic genomes. Bioinformatics 23,1061–1067 (2007). draft genome. BMC Genomics 17, 485 (2016). 113. Li, X. et al. Improved hybrid de novo genome assembly of domesticated 81. Wang, L. et al. Genome of wild mandarin and domestication history of apple (Malus x domestica). Gigascience 5,35(2016). mandarin. Mol. Plant 11,1024–1037 (2018). 114. Daccord, N. et al. High-quality de novo assembly of the apple genome and 82. Xu, Q. et al. The draft genome of sweet orange (Citrus sinensis). Nat. Genet. 45, methylome dynamics of early fruit development. Nat. Genet. 49,1099–1106 59–66 (2013). (2017). 83. Shimizu, T. et al. Draft sequencing of the heterozygous diploid genome of 115. Bowers, R. M. et al. Minimum information about a single amplified genome satsuma (Citrus unshiu Marc.) using a hybrid assembly approach. Front. Genet. (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and 8, 180 (2017). archaea. Nat. Biotechnol. 35,725–731 (2017).