Genome
The genetics of Cannabis – genomic variations of key synthases and their effect on cannabinoids content
Journal: Genome
Manuscript ID gen-2020-0087.R1
Manuscript Type: Mini Review
Date Submitted by the Author: 17-Sep-2020
Complete List of Authors: Singh, Aparna; University of Lethbridge, biological sciences
Bilichak, Andriy; Morden Research and Development Centre
Kovalchuk, Igor; University of Lethbridge
Keyword: Cannabis sativa L., hemp, marijuana, THCAS, CBDAS
Is the invited manuscript for consideration in a Special Issue? : Genome Biology
© The Author(s) or their Institution(s) Page 1 of 43 Genome
The genetics of Cannabis – genomic variations of key synthases and their effect on cannabinoids content

Aparna Singh1, Andriy Bilichak2 and Igor Kovalchuk1*

1 – Department of Biological Sciences, University of Lethbridge, Lethbridge, AB T1K 3M4, Canada, 2 – Morden Research and Development Center, Agriculture and Agri-Food Canada, Morden, MB R6M 1Y5, Canada
* Corresponding author: [email protected]

Abstract
Despite being a controversial crop, Cannabis sativa L. has a long history of cultivation throughout the world. Following recent legalisation in Canada, it is emerging as an important plant for both medicinal and recreational purposes. Recent progress in genome sequencing of both cannabis and hemp varieties allows for systematic analysis of genes coding for enzymes involved in the cannabinoid biosynthesis pathway. Single nucleotide polymorphisms in the coding regions of cannabinoid synthases play an important role in determining plant chemotype. A deep understanding of how these variants affect enzyme activity and accumulation of cannabinoids will allow breeding of novel cultivars with desirable cannabinoid profiles. Here we present a short overview of the major cannabinoid synthases and present data on the analysis of their genetic variants and their effect on cannabinoid content using several in-house sequenced Cannabis cultivars.

Keywords: Cannabis sativa L., hemp, marijuana, THCAS, CBDAS

Introduction
Cannabis sativa L. (including marijuana and hemp) is a herbaceous plant belonging to the Cannabaceae family (Vavilov and Freier 1951, Brizicky 1966). Being one of the major sources of medicine, oil and fibre, it has been extensively cultivated in many countries (Camp 1936, Godwin 1967, Quimby, Doorenbos et al. 1973, Schultes, Klein et al. 1974, Kriese, Schumann et al. 2004, Laverty, Stout et al. 2019). Since ancient times, the Cannabis plant has been valued for its medicinal properties and used for treating pain, nausea, depression, glaucoma, asthma, insomnia, etc. (Mechoulam, Lander et al. 1976, Duke and Wain 1981). Although the therapeutic properties of cannabinoids have been extensively studied, the role of phytocannabinoids within plants is poorly understood. Cannabis is diploid and its karyotype consists of nine pairs of autosomes and a pair of sex chromosomes (2n = 18+XX for female or XY for male) (Flemming, Muntendam et al. 2007, Divashuk, Alexandrov et al. 2014, Vyskot and Hobza 2015). The haploid genome size of female and male plants is approximately 818 Mb and 843 Mb, respectively (Sakamoto, Akiyama et al. 1998).
The medicinal properties of Cannabis are owed to the presence of terpenophenolic compounds known as cannabinoids. They can modulate the human endocannabinoid system and are useful in various physiopathological processes (Izzo, Borrelli et al. 2009). They are named cannabinoids because of their typical C21 terpenophenolic structure (Hillig 2004, Brenneisen 2007, De Meijer 2014). To date, more than 120 cannabinoids (a class of metabolites specific to the Cannabis plant), including cannabidiol (CBD), tetrahydrocannabinol (THC), cannabichromene (CBC), cannabigerol (CBG) and their propyl homologs CBDV, THCV, CBCV and CBGV, have been identified, including those that occur in the plant and their derivatives (ElSohly 2007, Radwan, ElSohly et al. 2009, de Meijer and Pertwee 2014). Two of the most common ones are THC and CBD, whose levels vary among cultivars. The acidic forms of these cannabinoids, THCA and CBDA, are present in major quantities inside the plant (De Meijer, Hammond et al. 2009, Swift, Wong et al. 2013). Δ9-tetrahydrocannabinol (THC) is the main psychoactive cannabinoid responsible for therapeutic and hallucinogenic effects and has therefore been extensively studied (Brenneisen, Egli et al. 1996, Long, Malone et al. 2005, Sirikantaramas, Taura et al. 2007). CBG was the first compound isolated from C. sativa in a pure form and is considered an intermediate precursor to most of the phytocannabinoids. Recently, seven more CBG-type cannabinoids have been isolated from the buds of mature female C. sativa plants (Appendino, Giana et al. 2008, Flores-Sanchez and Verpoorte 2008, Radwan, Ross et al. 2008, Radwan, ElSohly et al. 2009, Pollastro, Taglialatela-Scafati et al. 2011).
In this review, we discuss sequence variations in the synthase enzymes involved in the biosynthesis of cannabinoids, the causes of their occurrence, and their effect on cannabinoid content leading to chemotype diversity. This review also covers the significance of these variations in distinguishing Cannabis varieties and in the establishment of novel cultivars with unique chemotypes that have the potential to meet future pharmaceutical demands.

Historical perspective of use of Cannabis-derived products
Cannabis is one of the earliest plants cultivated by mankind and is native to western, central and eastern Asia (Li 1974, Small 2015). It was used traditionally as a herbal medicine in ancient times by Chinese, Tibetan and Indian civilizations (Mechoulam and Parker 2013). Cannabis has also been associated with religious practices in Southern Asia, especially in India, where written records of its holy use were found (Hasan 1974). The earliest evidence of its cultivation comes from China, as early as 4000 B.C., where it was grown to obtain fibre, medicine and food for humans and cattle (Small and Cronquist 1976, Jiang, Li et al. 2006). Several reports support shamanistic uses of Cannabis, suggesting that the ancient Chinese were well aware of its psychotropic properties (Touw 1981, Farag and Kayser 2017).
Hemp, a type of Cannabis sativa, is presumed to be one of the oldest sources of fibre and has been valued for its strength and durability; hence it was used for manufacturing ropes and clothes in earlier times (Allegret 2013). Nowadays, it is grown primarily for industrial purposes to obtain derivatives such as oil, fibre and food. Around 2000 B.C., hemp was introduced as a source of fibre to Egypt, Europe and western Asia (Schultes 1979). Hemp seeds were among the five most important grains in ancient China, where they were considered a staple food until the tenth century (Cheatham, Johnston et al. 2009). In the modern world, hemp is also grown for its medicinal and nutritional value (Farag and Kayser 2017).
Apart from food, the Chinese used plant extracts and seeds of Cannabis to treat various illnesses including constipation, malaria, rheumatic pain and female reproductive system disorders. They also used different parts of the plant, such as roots and foliage, as medicine for various treatments (Wang and Wei 2012). Historically, hemp seeds were used for the treatment of jaundice, sores, pain, skin diseases, blood-related illnesses and constipation (Callaway 2004). A popular beverage in Scandinavia known as “Maltos-Cannabis” was used in the early twentieth century for treating anemia, asthenia, emaciation and pulmonary diseases (Dahl and Frank 2011). Considerable evidence of Cannabis use as a medicine and recreational drug in different forms (bhang, ganja and charas) has also been reported from ancient India, approximately 1000 years ago. It is considered a sacred plant in the Hindu religion and was used in several religious rituals and ceremonies (Hasan 1974). It has also been actively used as an analgesic, tranquilizer, anticonvulsant, anti-inflammatory, aphrodisiac, antispasmodic and antibiotic. Cannabis was also used in Tibet for religious, medicinal and meditation purposes (Touw 1981). In Africa, Cannabis has been known since the fifteenth century and is used for the treatment of snake bite and diseases like malaria, asthma, fever and dysentery. In South America, Cannabis use for recreational and medicinal purposes presumably started during the seventeenth and eighteenth centuries (Zuardi 2006, Rubin 2011). Cannabis was extensively grown for use as fibre in Europe and north Asia, whereas in Africa and Southern Asia it has mostly been used as a recreational, medicinal and cultural drug. Because of its classification as a narcotic, very limited research on Cannabis and its effects on the human body was conducted previously. Even today, Cannabis is one of the major illicitly cultivated plants in the world. Only recently has it been made legal in several parts of the globe, and great advances have been made in understanding how cannabinoids affect the human brain and nervous system and in developing new Cannabis-based therapeutic products.

Taxonomical Classification
Historically, vernacular taxonomy differentiated three Cannabis groups: C. sativa (high CBD-containing plant), C. indica (high THC-containing plant) and C. ruderalis (wild-type, equal levels of THC and CBD) (McPartland 2018). DNA barcode analysis provides evidence for separation of the first two taxa at the subspecies level into C. sativa subsp. sativa and C. sativa subsp. indica. At the same time, historical records reveal that field botanists were not scrupulous when differentiating these subspecies. Furthermore, “Sativa” and “Indica” varieties were extensively interbred throughout the domestication process, making their distinction impossible (McPartland 2018). The third type, C. ruderalis, is not a popular variety and is adapted to the extreme environments of the Indian Himalayan ranges, Siberia and Eastern Europe. The plant is small and bushy with low THC and high CBD content, which is often not enough to produce any psychological effects, and hence it is not widely used.

Biosynthesis of Cannabinoids
Cannabigerolic acid (CBGA) is one of the main precursors in the cannabinoid biosynthesis pathway. CBGA is formed by the condensation of geranyl diphosphate (GPP) and olivetolic acid (OLA). GPP originates from the non-mevalonate pathway occurring in the plastid, known as the 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway. OLA is derived from hexanoic acid, which is first converted to hexanoyl-CoA by the action of hexanoyl-CoA synthetase (Stout, Boubakir et al. 2012). Hexanoyl-CoA is then converted to OLA using three molecules of malonyl-CoA, in a reaction catalyzed by a polyketide synthase (PKS) and an olivetolic acid cyclase (OAC) (Gagne, Stout et al. 2012). Geranylpyrophosphate:olivetolate geranyltransferase, also known as prenyltransferase or cannabigerolic acid synthase (CBGAS), catalyzes the alkylation reaction between OLA and GPP to form CBGA (Fellermeier and Zenk 1998). Downstream in the pathway, three oxidocyclases are responsible for establishing structural diversity among cannabinoids: tetrahydrocannabinolic acid synthase (THCAS), cannabidiolic acid synthase (CBDAS) and cannabichromenic acid synthase (CBCAS). These enzymes catalyze stereoselective cyclization of CBGA to THCA, CBDA and CBCA, respectively (Figure 1C) (Sirikantaramas, Morimoto et al. 2004, Sirikantaramas, Taura et al. 2005, Taura, Sirikantaramas et al. 2007, Degenhardt, Stehle et al. 2017). In addition to these pentyl-alkyl-cannabinoids, which are dominant in plants, propyl-alkyl-cannabinoids such as Δ9-tetrahydrocannabivarinic acid (THCVA) and cannabidivarinic acid (CBDVA) have also been reported from Cannabis plants of some geographical regions (Baker, Fowler et al. 1980, Hillig and Mahlberg 2004) and are synthesized from cannabigerovarinic acid (CBGVA) (Taura, Sirikantaramas et al. 2007, Flores-Sanchez and Verpoorte 2008). All cannabinoids are synthesized in their carboxylic acid form inside the plant and can be converted to their neutral forms via thermal decarboxylation (Dussy, Hamberg et al. 2005, Happyana, Agnolet et al. 2013, Happyana and Kayser 2013).

Synthases involved in the Cannabinoids Biosynthesis Pathway
The three synthases responsible for biosynthesis of the major cannabinoids are THCAS, CBDAS and CBCAS. THCAS catalyzes stereoselective oxidative cyclization of CBGA into THCA (the acidic precursor of THC) using molecular oxygen. THCA eventually undergoes non-enzymatic decarboxylation to form the psychoactive agent, Δ9-THC. THCAS has a 1635-nucleotide open reading frame encoding a 545 amino acid (AA) polypeptide chain with a 28 AA signal peptide and belongs to the p-cresol methyl-hydroxylase superfamily. Like THCAS, CBDAS is a single-exon gene and encodes a protein with 516 AA, including a 28 AA-long signal peptide. Both synthases contain two major domains: a FAD-binding domain and a berberine bridge enzyme (BBE)-like domain (Figure 1A and B). The FAD coenzyme binds to the His114 and Cys176 residues present in domain I. Mutation of the active site residues (His292 and Tyr417) results in decreased enzymatic activity (Shoyama, Tamada et al. 2012).
CBDAS catalyzes oxidative cyclization of CBGA to form CBDA, which can be further decarboxylated to cannabidiol (CBD) (Taura, Morimoto et al. 1996, Taura, Dono et al. 2007). Like THCAS, CBDAS possesses the His114 and Cys176 flavin-binding site. Its FAD-binding site contains the amino acid sequence Arg‐Ser‐Gly‐Gly‐His. Mutation of the His114 residue results in loss of CBDAS activity (Taura, Sirikantaramas et al. 2007).
Cannabichromenic acid synthase (CBCAS) catalyzes stereoselective cyclization of CBGA to CBCA. CBCAS does not require molecular oxygen for the oxidocyclization of CBGA (Morimoto, Komatsu et al. 1998).
All three synthases involved in cannabinoid synthesis share high sequence similarity. For example, at the amino acid level, THCAS and CBDAS are 84% identical, whereas THCAS and CBCAS are approximately 96% identical (Taura, Sirikantaramas et al. 2007, Laverty, Stout et al. 2019). The structural and biochemical properties of THCAS and CBDAS are also similar. Both are soluble enzymes with a 28 amino acid long signal peptide and possess a FAD domain. The high level of sequence identity suggests that both synthases evolved from a common ancestor (Taura, Sirikantaramas et al. 2007). It was proposed that THCAS evolved from CBDAS by gene duplication (Taura, Sirikantaramas et al. 2007, Shoyama, Tamada et al. 2012, Onofri, de Meijer et al. 2015).
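To make concrete what an identity figure such as 84% expresses, the following minimal sketch computes percent identity over two sequences that have already been aligned. The function name, the gap convention (columns where both sequences carry a gap are skipped) and the toy peptides are assumptions made for illustration; they are not the THCAS or CBDAS sequences, and this is not necessarily how the cited studies calculated identity.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns that carry identical residues.

    Assumes both strings come from the same pairwise alignment (equal
    length). Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue  # column carries no information
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


# Toy example (hypothetical peptides, not real synthase sequences)
toy_a = "MNCSAFSFWF-VCK"
toy_b = "MKCSTFSFWFAVCK"
print(round(percent_identity(toy_a, toy_b), 1))
```

Other identity definitions exist (for example, dividing by the length of the shorter sequence), so values from different tools are only comparable when the denominator is the same.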
The reaction mechanisms of the two enzymes are similar, as both require molecular oxygen for CBGA oxidation and produce hydrogen peroxide. A domain present in these two enzymes shares a striking similarity with the berberine bridge enzyme (BBE) domain. BBE is a crucial enzyme of the alkaloid biosynthesis pathway of Eschscholzia californica (Onofri, de Meijer et al. 2015). The enzymatic reactions catalyzed by both enzymes start with the transfer of a hydride ion from the substrate CBGA to the isoalloxazine ring of the FAD coenzyme. Interestingly, only a small difference in amino acid sequence between the two enzymes is solely responsible for determining their product specificity (Taura, Sirikantaramas et al. 2007). Biochemically, THCAS and CBDAS are monomeric, with a native protein mass of 74 kDa and similar pI, Vmax and Km for the CBGA substrate.
Phylogenetic analysis of the amino acid sequences of THCAS and CBDAS from in-house sequenced cultivars demonstrated a lower level of divergence for THCAS than for CBDAS among cultivars (Figures 2A and B). Among 29 analyzed cultivars, we detected only 8 and 18 unique sequences for THCAS and CBDAS, respectively. This agrees with a previous study suggesting a recent evolution of THCAS from the CBDAS group (Onofri, de Meijer et al. 2015). Alternatively, artificial selection of plants with the highest level of THCA and the most active THCAS apparently reduced the number of THCAS variants in the Cannabis population. In the case of CBDAS, whereas most of the drug-type cultivars clustered together, hemp varieties such as Finola and CFX2 were placed separately on the tree (Figure 2, Maximum Likelihood, 100 bootstraps).

Cannabis Chemotypes
A detailed screening of germplasm collections is necessary for successful breeding of Cannabis cultivars with the desired level and ratio of cannabinoids, both for pharmaceutical application and for setting up breeding strategies (Welling, Liu et al. 2016). High performance liquid chromatography (HPLC) offers an unequivocal way to analyze the cannabinoid profile of an examined plant and to assign its chemotype (De Backer, Debrus et al. 2009). It can be further supported by functional DNA markers associated with the genes coding for THCAS and CBDAS. Chemotaxonomically, Cannabis sativa is divided into three chemotypes based on the THC:CBD ratio (Small and Beckstead 1973). The THC:CBD ratio is generally used to distinguish high- and low-THC plants. The first study to discriminate between fibre- and drug-type plants using the THC:CBD ratio was performed by Fetterman et al. in 1971 (Fetterman, Keith et al. 1971, Turner and Elsohly 1979). Chemotype I, also known as drug-type, has a THC:CBD ratio of more than 1 and a THC content higher than 0.3% of total dry weight. Chemotype II is an intermediate type and has a THC:CBD ratio of around 1. Chemotype III, also known as fibre-type, typically has low THC content; it is therefore non-psychoactive, with a THC:CBD ratio of less than 1. Two more chemotypes were added later: chemotype IV, with CBG (>0.30%) and CBD (<0.50%) content, and chemotype V, with undetectable levels of cannabinoids (Fournier, Richez-Dumanois et al. 1987, Mandolino and Carboni 2004).
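The thresholds above can be turned into a simple decision rule. The sketch below is only an illustration under stated assumptions: the text says "around 1" and "undetectable" without numbers, so the intermediate band (0.5 to 2.0), the detection limit (0.01%) and the extra requirement that CBG exceed THC for chemotype IV are hypothetical choices, as are all names in the code.

```python
from dataclasses import dataclass


@dataclass
class CannabinoidProfile:
    """Total cannabinoid equivalents as percent of dry weight."""
    thc: float
    cbd: float
    cbg: float = 0.0


def assign_chemotype(p: CannabinoidProfile,
                     detection_limit: float = 0.01,
                     intermediate_band: tuple = (0.5, 2.0)) -> str:
    """Assign chemotype I-V from the thresholds quoted in the text.

    detection_limit, intermediate_band and the 'CBG exceeds THC' test
    are illustrative assumptions, not values from the cited studies.
    """
    if max(p.thc, p.cbd, p.cbg) < detection_limit:
        return "V"   # no detectable cannabinoids
    if p.cbg > 0.30 and p.cbd < 0.50 and p.cbg > p.thc:
        return "IV"  # CBG-predominant
    ratio = float("inf") if p.cbd < detection_limit else p.thc / p.cbd
    low, high = intermediate_band
    if low <= ratio <= high:
        return "II"  # THC:CBD ratio around 1
    if ratio > high:
        # Drug-type additionally requires THC above 0.3% of dry weight
        return "I" if p.thc > 0.3 else "unclassified"
    return "III"     # ratio clearly below 1, fibre-type


print(assign_chemotype(CannabinoidProfile(thc=18.0, cbd=0.1, cbg=0.5)))
print(assign_chemotype(CannabinoidProfile(thc=5.0, cbd=6.0, cbg=0.2)))
```

Widening or narrowing the intermediate band shifts borderline plants between chemotypes II and III, so any real use would need the band to be fixed in advance.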
We analyzed cannabinoid content in flowers of individual plants representing 31 cultivars using HPLC [2] and assigned the chemotype based on the ratio of total THC equivalent to total CBD equivalent (Table 1). Overall, the total level of all cannabinoids did not exceed 21% of the flower’s dry weight (Figure 3C). Depending on the cultivar, the percentage of CBGA varied from 0 to 1.1% (e.g., NF). Similarly, the level of CBN, an oxidized metabolite of THC [5], ranged from 0 to 0.19%. Overall, among the 31 cultivars examined, 26 were chemotype I, 3 were chemotype II and 2 were chemotype III.
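The text does not define "total THC equivalent". A common convention, assumed here rather than taken from the manuscript, is to add the neutral cannabinoid to the acidic form scaled by the ratio of their molar masses (about 0.877 for THC/THCA, and the same value for CBD/CBDA), which accounts for the CO2 lost on decarboxylation. A minimal sketch with hypothetical input values:

```python
# Molar masses in g/mol: THC and CBD are C21H30O2, THCA and CBDA are C22H30O4
NEUTRAL_MOLAR_MASS = 314.46
ACID_MOLAR_MASS = 358.47
DECARBOXYLATION_FACTOR = NEUTRAL_MOLAR_MASS / ACID_MOLAR_MASS  # about 0.877


def total_equivalent(neutral_pct: float, acid_pct: float) -> float:
    """Total neutral-cannabinoid equivalent (% dry weight), assuming
    complete decarboxylation of the acidic form."""
    return neutral_pct + DECARBOXYLATION_FACTOR * acid_pct


# Hypothetical HPLC values for one flower sample (% dry weight)
total_thc = total_equivalent(neutral_pct=1.0, acid_pct=20.0)
total_cbd = total_equivalent(neutral_pct=0.05, acid_pct=0.4)
print(f"total THC = {total_thc:.2f}%, total CBD = {total_cbd:.2f}%")
```

Because the same factor applies to both cannabinoids, the THC:CBD ratio is affected by this conversion only when the proportion of acidic to neutral form differs between the two.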
Interestingly, some genotyping studies have revealed that hemp and marijuana are significantly different at the genome level (Sawler, Stout et al. 2015), and it has also been shown that both environmental and genetic factors are responsible for such chemotype diversity in Cannabis (Bócsa, Máthé et al. 1997, de Meijer, Bagatta et al. 2003, Hillig 2005). Environmental factors such as the amount and quality of light received by the plant, nutrients and temperature have been shown to modulate cannabinoid accumulation. At the same time, it has been reported that the CBD/THC ratio remains constant irrespective of plant development and ambient conditions (Pacifico, Miselli et al. 2008). Therefore, analysis of the cannabinoid profile in leaves of developing plants allows the chemotype of a plant to be deduced before maturity. To examine accumulation of cannabinoids throughout plant development, we crossed hemp variety X59 to cannabis cultivar HC. The segregating population contained two chemotypes, II and III, with a consistent ratio of CBD to THC in leaves and flowers at different stages of plant development regardless of the chemotype, as measured by HPLC (e.g., CBD/THC = 1.29, 1.53 and 1.23 for 4-week-old plants, 7-week-old plants and flowers of X59HC9a#12, respectively) (Figure 3A). Therefore, the chemotype of potential mother plants can be determined as early as in 4-week-old plants. We also examined accumulation of CBGA throughout plant development; however, we did not detect a gradual increase in the level of this cannabinoid in the leaves and flowers of the progeny of X59 x HC crosses, although the inflorescence accumulated a higher level of this cannabinoid in most of the plants. At the same time, we observed a modest positive correlation between accumulation of total CBD equivalent and CBGA in the progeny (r = 0.69) (Figure 3B).
Genetics of Cannabis
Genomic studies of several new synthase variants have been carried out recently (Weiblen, Wenger et al. 2015, Grassa, Wenger et al. 2018, Laverty, Stout et al. 2019, Gao, Wang et al. 2020). Initially, CBDAS and THCAS were identified as codominant alleles at a single locus, where BT/BT and BD/BD homozygous plants are THC and CBD dominant, respectively. However, recent advances in genomic studies have suggested the involvement of multiple linked loci. Weiblen et al. proposed this based on several factors, such as the presence of diverse THCA and CBDA synthase sequences in test samples, expression patterns and loci positions on the chromosome map (Weiblen, Wenger et al. 2015). Onofri et al. suggested that THCA/CBDA variation is due to sequence variations at the BT and/or BD loci (Onofri, de Meijer et al. 2015). However, Grassa et al. reported that divergence at the CBDAS loci is mainly responsible for determining the THCA/CBDA ratio of cultivars, resulting in cannabinoid profile differences between marijuana and hemp (Grassa, Wenger et al. 2018). Interestingly, variation in gene copy number of THCAS and CBDAS also contributes to varied cannabinoid content in cultivars and is responsible for phytochemical diversity, which helps the plant in adaptation (Vergara, Huscher et al. 2019).
To identify duplications of cannabinoid synthases in in-house sequenced cultivars, the genBlastA program was used (She, Chu et al. 2009). As input, we generated scaffolds of genomic sequences of target cultivars and used the gene sequences of THCAS, CBDAS and CBCAS. The E-value threshold was set to 0.00001 and the minimum percentage of query gene coverage in the output to 80%. Overall, the number of THCAS gene duplicates varied among cultivars from one to four (e.g., BC Kush and Skywalker, respectively, Supplementary data File S2) with no correlation to the level of total THC equivalent (r = 0.01). Similar numbers were observed for CBDAS and CBCAS, with the only exception being Zambiah, which had five duplicates of CBDAS. We cannot exclude the possibility that some of the synthase copies are pseudogenes (Kojoma, Seki et al. 2006), since we did not examine their sequences in detail. Therefore, the lack of observed correlation between synthase copy number and accumulation of cannabinoids needs to be examined further.
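For readers who want to reproduce this kind of comparison on their own data, a correlation coefficient such as the r reported above can be computed as sketched below. We assume Pearson's r here, since the text does not state which coefficient was used, and the copy numbers and THC values are hypothetical placeholders rather than the data in Supplementary data File S2.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with >= 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical values: THCAS copy number and total THC equivalent (% dry weight)
copy_number = [1, 2, 3, 4, 2, 1]
total_thc = [17.5, 12.0, 18.2, 15.1, 19.0, 14.3]
print(f"r = {pearson_r(copy_number, total_thc):.2f}")
```

With copy numbers restricted to a few integer values, a rank-based coefficient may be a reasonable alternative; the choice does not change the caveat that pseudogene copies can mask a real relationship.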
Mutation analysis of enzymes shows how substitutions of some targeted amino acids could affect cannabinoid production in vitro. Notably, glycosylation sites are not essential for optimum THCAS enzyme activity, whereas an increase in disulphide bonds improved CBDAS enzyme activity (Zirpel, Kayser et al. 2018).

Variation in Cannabis Synthases
A large number of Cannabis strains have been developed over the centuries through breeding and selection. Modern breeders have used different types of DNA marker tools, such as random amplified polymorphic DNA (RAPD), restriction fragment length polymorphism (RFLP), amplified fragment length polymorphism (AFLP), inter simple sequence repeat amplification (ISSR), expressed sequence tag simple sequence repeats (EST-SSRs), single nucleotide polymorphisms (SNPs) and short tandem repeats (STRs), to assess genetic diversity among C. sativa accessions (Gillan, Cole et al. 1995, de Meijer, Bagatta et al. 2003, Miller, Shutler et al. 2003, Hu, Guo et al. 2012, Shirley, Allgeier et al. 2013, Gao, Xin et al. 2014, Sawler, Stout et al. 2015, De Meijer and Hammond 2016). These DNA marker tools also enabled discrimination between drug and non-drug type Cannabis cultivars (Rotherham and Harbison 2011). Two sets of DNA sequence characterized amplified region (SCAR) markers have been used for prediction of chemotype at the early stages of plant development: dominant D589 and co-dominant B1080/B1192 (Pacifico, Miselli et al. 2006, Staginnus, Zörntlein et al. 2014). Whereas the D589 marker can provide information only on the presence of active THCAS, defined as the BT allele, the B1080/B1192 marker can be used to assess both synthases, THCAS and CBDAS, indicated as the BT and BD alleles, respectively. Both markers are mapped at or near the protein domains (Figure 1).
Comparison of HPLC data for in-house sequenced cultivars with the presence of DNA SCAR markers revealed 100% correlation only for the D589 marker, whereas B1080/B1192 gave inconclusive results, which was consistent with a previous study (Brenneisen 2007). Although all analyzed cultivars carried the BD marker, we detected a single nucleotide polymorphism at position 583 (C -> T) in the CBDAS coding sequence in a number of cultivars (e.g., BC Kush, Black Jack, etc.), resulting in a premature stop codon that renders the synthase inactive. At the same time, several cultivars carried both the BD marker and a complete ORF but had a very low level of total CBD equivalent (less than 1%, e.g., Bon Homme, Canadian Cheese, etc.). The opposite result was observed as well: absence of active CBDAS with a relatively high level of CBD equivalent (e.g., Jungle Wreck and Zambiah). This suggests that further analysis of the CBDAS gene sequence is required to detect critical SNPs, in both coding and promoter regions, responsible for accumulation of CBD in plants.
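The reasoning behind the position 583 observation can be checked with simple codon arithmetic: since 582 = 194 × 3, nucleotide 583 is the first base of codon 195, and a C -> T change at a first codon position creates a stop codon only when the codon is CAA, CAG or CGA. The sketch below illustrates this check on a toy sequence; the function names and the toy coding sequence are hypothetical, and the real CBDAS ORF is not reproduced here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_at(cds: str, nt_position: int):
    """Return (1-based codon number, codon, 0-based offset within codon)
    for a 1-based nucleotide position in a coding sequence."""
    if nt_position < 1:
        raise ValueError("nucleotide positions are 1-based")
    index = nt_position - 1
    codon_start = index - index % 3
    codon = cds[codon_start:codon_start + 3].upper()
    if len(codon) != 3:
        raise ValueError("position is outside a complete codon")
    return codon_start // 3 + 1, codon, index % 3


def introduces_stop(cds: str, nt_position: int, alt_base: str) -> bool:
    """True if substituting alt_base at nt_position turns a sense codon
    into a stop codon (a nonsense mutation)."""
    _, codon, offset = codon_at(cds, nt_position)
    mutated = codon[:offset] + alt_base.upper() + codon[offset + 1:]
    return codon not in STOP_CODONS and mutated in STOP_CODONS


# Toy coding sequence (hypothetical, not the CBDAS ORF): ATG CAA GGT TAA
toy_cds = "ATGCAAGGTTAA"
print(codon_at(toy_cds, 4))              # position 4 is the first base of codon 2
print(introduces_stop(toy_cds, 4, "T"))  # tests the change CAA -> TAA
```

A check of this kind only flags nonsense mutations; missense changes near the FAD-binding or catalytic residues would need a separate comparison at the protein level.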
Several studies of SNPs in synthase genes have helped to determine genetic differences associated with hemp versus drug-type Cannabis and their effect on enzyme activity (Borna, Salami et al. 2017, Cascini, Farcomeni et al. 2019). For instance, Onofri et al. (2015) genotyped inbred lines from different geographical backgrounds and revealed a single SNP at position 706 in the THCAS gene causing an amino acid change from glutamic acid to glutamine, resulting in a strain that accumulated CBG(V)A as the major cannabinoid and produced a low amount of THCA, whereas active THCAS was found in a cultivar in which a single nucleotide change did not translate into a different amino acid. Notably, in the case of CBDAS, SNPs did not alter the enzyme’s activity. Despite amino acid substitutions, CBDA content remained high, suggesting that mutations occurring near the FAD-binding site or catalytic site are mostly responsible for altered enzyme activity (Onofri, de Meijer et al. 2015). Rotherham et al. developed an SNP assay system to differentiate drug and non-drug type Cannabis for commercial production of fibre and seed oil cultivars and for analysis of confiscated samples. The assay was capable of characterizing active THCAS from homozygous drug-type plants, and inactive THCAS from both heterozygous drug-type and homozygous non-drug type Cannabis varieties (Rotherham and Harbison 2011). In addition, SNP variations among marijuana-type (Purple Kush and Chemdawg) and hemp-type cultivars (Finola and ‘USO-31’) supported their phylogenetic separation (van Bakel, Stout et al. 2011). In fact, SNPs were found to be responsible for establishing different genetic clusters among Cannabis populations (Soorni, Fatahi et al. 2017).

Characterization of cis-elements in the CBDAS and THCAS promoters
Cis-elements in promoters are known to be at the core of the regulation of gene expression during plant development and in response to environmental stimuli (Hernandez-Garcia and Finer 2014). Manipulation of THCAS and CBDAS gene expression can potentially result in an increased yield of target cannabinoids or an altered ratio of CBD to THC. To analyze upstream cis-regulatory elements, we extracted 768 and 1,000 bp 5' regions representing promoter sequences for THCAS and CBDAS, respectively, from the Purple Kush and Finola genomes (PK scaffold 19603:6668-7668 and FN scaffold 14546436:3508-4508 for THCAS and CBDAS, respectively) (Laverty, Stout et al. 2019). Cis-regulatory elements were downloaded from (Korkuc, Schippers et al. 2014), and promoter sequences were scanned for the presence of motifs against 325 and 1496 promoters for CBDAS and THCAS, respectively, using the Find Individual Motif Occurrences (FIMO) program (Grant, Bailey et al. 2011). In total, 16 and 14 significantly enriched (p<0.0001) motifs were identified for the THCAS and CBDAS promoters, respectively (Supplementary data File S1). Ten cis-regulatory elements were common between the promoters, and a number of motifs are associated with regulation of gene expression under abiotic stresses such as dehydration, cold and light (e.g., ATHB6, AtMYB2, RAV1-A, GATA, Ibox, etc.). We also identified the circadian clock-responsive motif Evening Element in the promoter of the CBDAS gene, suggesting possible circadian regulation of its expression. Further examination of THCAS and CBDAS gene expression under different stress conditions will reveal the role of the identified cis-regulatory elements and will potentially allow accumulation of the corresponding cannabinoids to be increased.
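FIMO scores position weight matrices and reports p-values, which is beyond a short example. As a much simpler stand-in, and not the method used above, the sketch below searches a promoter for exact occurrences of a consensus motif on both strands. The Evening Element consensus AAAATATCT is the commonly cited one; the function names and the toy promoter fragment are hypothetical and are not taken from the Finola genome.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif(promoter: str, motif: str):
    """Return sorted (0-based start, strand) hits of an exact motif,
    searching the forward strand and the reverse-complement strand."""
    promoter = promoter.upper()
    motif = motif.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = promoter.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = promoter.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


EVENING_ELEMENT = "AAAATATCT"  # commonly cited consensus for the Evening Element

# Toy promoter fragment (hypothetical sequence for illustration only)
toy_promoter = "GGCAAAATATCTTTGACCAGATATTTTCC"
print(find_motif(toy_promoter, EVENING_ELEMENT))
```

Exact matching misses degenerate sites and gives no measure of significance, so it is suitable only as a quick sanity check before a matrix-based scan.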
387
388
17 © The Author(s) or their Institution(s) Genome Page 18 of 43
Conclusion and Future Prospects
Sequence variations provide insight into the complexity of cannabinoid biosynthesis in the plant, which results in chemotype diversity and in altered gene expression and enzymatic activity. Therefore, deeper analysis of the regulatory mechanisms, as well as of sequence variants of the synthases, is needed to develop novel cultivars with diverse cannabinoid profiles. To this end, synthetic biosynthesis pathways for cannabinoid production can be established in heterologous hosts (e.g., yeast, tobacco) (Sirikantaramas, Taura et al. 2005, Luo, Reiter et al. 2019), both for pharmaceutical applications and for dissection of the regulatory elements involved in synthase activity.

LIST OF FIGURE CAPTIONS
Figure 1. Schematic representation of THCAS (A) and CBDAS (B) domains and annealing regions of genotyping primers D589 and B1080/B1192. (C) Biosynthesis pathway of major cannabinoids. The schematic view is derived and modified from the pathway reviewed in (Degenhardt, Stehle et al. 2017). Chemical structures were generated with ChemDraw. GPP, geranyl diphosphate; OLA, olivetolic acid; CBGA, cannabigerolic acid; THCA, Δ9-tetrahydrocannabinolic acid; THC, Δ9-tetrahydrocannabinol; CBCA, cannabichromenic acid; CBC, cannabichromene; CBDA, cannabidiolic acid; CBD, cannabidiol; CBGAS, cannabigerolic acid synthase; THCAS, tetrahydrocannabinolic acid synthase; CBCAS, cannabichromenic acid synthase; CBDAS, cannabidiolic acid synthase.

Figure 2. Phylogenetic tree of amino acid sequences of (A) THCAS and (B) CBDAS from the examined Cannabis cultivars. The phylogenetic tree was built using Geneious Prime 2020.2.3 software. Amino acid sequences were aligned using MUSCLE, and the consensus tree was inferred using the maximum likelihood method with 100 bootstrap replicates. Values on the branches indicate bootstrap proportions.

Figure 3. HPLC analysis of in-house cannabis cultivars. Chemotypes were deduced from the ratio of total equivalent THC to total equivalent CBD (A) HPLC profile of total THC, CBD (B) and CBGA throughout plant development in the progeny of X59 x HC crosses. WOP, weeks-old plant. (C) Percentage of pentyl-cannabinoids in individual plants of Cannabis sativa L. representing 31 cultivars. Cannabinoid levels were measured in dry inflorescences using HPLC.
433 References
435 Allegret, S. (2013). "The history of hemp." Hemp: industrial production and uses: 4-26.
436 Appendino, G., A. Giana, S. Gibbons, M. Maffei, G. Gnavi, G. Grassi and O. Sterner (2008). "A polar
437 cannabinoid from Cannabis sativa var. Carma." Natural Product Communications 3(12):
438 1934578X0800301207.
439 Baker, P. B., R. Fowler, K. R. Bagon and T. A. Gough (1980). "Determination of the distribution of
440 cannabinoids in cannabis resin using high performance liquid chromatography." Journal of analytical
441 toxicology 4(3): 145-152.
442 Bócsa, I., P. Máthé and L. Hangyel (1997). "Effect of nitrogen on tetrahydrocannabinol (THC) content in
443 hemp (Cannabis sativa L.) leaves at different positions." J Int Hemp Assoc 4(2): 78-79.
444 Borna, T., S. A. Salami and M. Shokrpour (2017). "High resolution melting curve analysis revealed SNPs in
445 major cannabinoid genes associated with drug and non-drug types of cannabis." Biotechnology &
446 Biotechnological Equipment 31(4): 839-845.
447 Brenneisen, R. (2007). Chemistry and analysis of phytocannabinoids and other Cannabis constituents.
448 Marijuana and the Cannabinoids, Springer: 17-49.
449 Brenneisen, R., A. Egli, M. Elsohly, V. Henn and Y. Spiess (1996). "The effect of orally and rectally
450 administered delta 9-tetrahydrocannabinol on spasticity: a pilot study with 2 patients." International
451 journal of clinical pharmacology and therapeutics 34(10): 446-452.
452 Brizicky, G. K. (1966). Cultivated Plants and Their Wild Relatives. Taxonomy, Geography, Cytogenetics,
453 Ecology, Origin, Utilization, JSTOR.
454 Callaway, J. (2004). "Hempseed as a nutritional resource: An overview." Euphytica 140(1-2): 65-72.
455 Camp, W. (1936). "The antiquity of hemp as an economic plant." J. NY Bot. Gard. 37: 110-114.
456 Cascini, F., A. Farcomeni, D. Migliorini, L. Baldassarri, I. Boschi, S. Martello, S. Amaducci, L. Lucini and J.
457 Bernardi (2019). "Highly Predictive Genetic Markers Distinguish Drug-Type from Fiber-Type Cannabis
458 sativa L." Plants 8(11): 496.
459 Cheatham, S., M. Johnston and L. Marshall (2009). "The useful wild plants of Texas, the Southeastern
460 and Southwestern United States, the Southern Plains, and Northern Mexico, vol 3. Useful Wild Plants."
461 Inc, Austin (Treatment of Cannabis: pp 13–126).
462 Dahl, H. V. and V. A. Frank (2011). "Medical marijuana–exploring the concept in relation to small scale
463 cannabis growers in Denmark." World wide weed–Global trends in cannabis cultivation and its control:
464 116-141.
465 De Backer, B., B. Debrus, P. Lebrun, L. Theunis, N. Dubois, L. Decock, A. Verstraete, P. Hubert and C.
466 Charlier (2009). "Innovative development and validation of an HPLC/DAD method for the qualitative and
467 quantitative determination of major cannabinoids in cannabis plant material." J Chromatogr B Analyt
468 Technol Biomed Life Sci 877(32): 4115-4124.
469 De Meijer, E. and K. Hammond (2016). "The inheritance of chemical phenotype in Cannabis sativa L.(V):
470 regulation of the propyl-/pentyl cannabinoid ratio, completion of a genetic model." Euphytica 210(2):
471 291-307.
472 De Meijer, E., K. Hammond and A. Sutton (2009). "The inheritance of chemical phenotype in
473 Cannabis sativa L.(IV): cannabinoid-free plants." Euphytica 168(1): 95-112.
474 de Meijer, E. and R. Pertwee (2014). "Handbook of Cannabis. Handbooks in Psychopharmacology."
475 De Meijer, E. P. (2014). "The chemical phenotypes (chemotypes) of Cannabis." Handbook of Cannabis:
476 89-110.
477 de Meijer, E. P., M. Bagatta, A. Carboni, P. Crucitti, V. C. Moliterni, P. Ranalli and G. Mandolino (2003).
478 "The inheritance of chemical phenotype in Cannabis sativa L." Genetics 163(1): 335-346.
479 Degenhardt, F., F. Stehle and O. Kayser (2017). The biosynthesis of cannabinoids. Handbook of Cannabis
480 and related pathologies, Elsevier: 13-23.
481 Divashuk, M. G., O. S. Alexandrov, O. V. Razumova, I. V. Kirov and G. I. Karlov (2014). "Molecular
482 cytogenetic characterization of the dioecious Cannabis sativa with an XY chromosome sex determination
483 system." PloS one 9(1).
484 Duke, J. and K. Wain (1981). "Medicinal plants of the world. Computer index with more than 85000
485 entries." Handbook of Medicinal Herbs (Ed. Duke JA), CRC press, Boca Raton, Florida: 96.
486 Dussy, F. E., C. Hamberg, M. Luginbühl, T. Schwerzmann and T. A. Briellmann (2005). "Isolation of Δ9-
487 THCA-A from hemp and analytical aspects concerning the determination of Δ9-THC in cannabis
488 products." Forensic science international 149(1): 3-10.
489 ElSohly, M. A. (2007). Marijuana and the Cannabinoids, Springer Science & Business Media.
490 Farag, S. and O. Kayser (2017). The cannabis plant: botanical aspects. Handbook of Cannabis and Related
491 Pathologies, Elsevier: 3-12.
492 Fellermeier, M. and M. H. Zenk (1998). "Prenylation of olivetolate by a hemp transferase yields
493 cannabigerolic acid, the precursor of tetrahydrocannabinol." FEBS Letters 427(2): 283-285.
494 Fetterman, P. S., E. S. Keith, C. W. Waller, O. Guerrero, N. J. Doorenbos and M. W. Quimby (1971).
495 "Mississippi-grown Cannabis sativa L.: Preliminary observation on chemical definition of phenotype and
496 variations in tetrahydrocannabinol content versus age, sex, and plant part." Journal of Pharmaceutical
497 Sciences 60(8): 1246-1249.
498 Flemming, T., R. Muntendam, C. Steup and O. Kayser (2007). Chemistry and biological activity of
499 tetrahydrocannabinol and its derivatives. Bioactive Heterocycles IV, Springer: 1-42.
500 Flores-Sanchez, I. J. and R. Verpoorte (2008). "Secondary metabolism in cannabis." Phytochemistry
501 reviews 7(3): 615-639.
502 Fournier, G., C. Richez-Dumanois, J. Duvezin, J.-P. Mathieu and M. Paris (1987). "Identification of a new
503 chemotype in Cannabis sativa: cannabigerol-dominant plants, biogenetic and agronomic prospects."
504 Planta Medica 53(03): 277-280.
505 Gagne, S. J., J. M. Stout, E. Liu, Z. Boubakir, S. M. Clark and J. E. Page (2012). "Identification of olivetolic
506 acid cyclase from Cannabis sativa reveals a unique catalytic route to plant polyketides." Proceedings of
507 the National Academy of Sciences 109(31): 12811-12816.
508 Gao, C., P. Xin, C. Cheng, Q. Tang, P. Chen, C. Wang, G. Zang and L. Zhao (2014). "Diversity analysis in
509 Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence
510 repeat markers." PloS one 9(10).
511 Gao, S., B. Wang, S. Xie, X. Xu, J. Zhang, L. Pei, Y. Yu, W. Yang and Y. Zhang (2020). "A high-quality
512 reference genome of wild Cannabis sativa." Horticulture Research 7(1): 73.
513 Gillan, R., M. Cole, A. Linacre, J. Thorpe and N. Watson (1995). "Comparison of Cannabis sativa by
514 random amplification of polymorphic DNA (RAPD) and HPLC of cannabinoids: a preliminary study."
515 Science & justice: journal of the Forensic Science Society 35(3): 169-177.
516 Godwin, H. (1967). "The ancient cultivation of hemp." Antiquity 41(161): 42-49.
517 Grant, C. E., T. L. Bailey and W. S. Noble (2011). "FIMO: scanning for occurrences of a given motif."
518 Bioinformatics 27(7): 1017-1018.
519 Grassa, C. J., J. P. Wenger, C. Dabney, S. G. Poplawski, S. T. Motley, T. P. Michael, C. Schwartz and G. D.
520 Weiblen (2018). "A complete Cannabis chromosome assembly and adaptive admixture for elevated
521 cannabidiol (CBD) content." BioRxiv: 458083.
522 Happyana, N., S. Agnolet, R. Muntendam, A. Van Dam, B. Schneider and O. Kayser (2013). "Analysis of
523 cannabinoids in laser-microdissected trichomes of medicinal Cannabis sativa using LCMS and cryogenic
524 NMR." Phytochemistry 87: 51-59.
525 Happyana, N. and O. Kayser (2013). "Monitoring metabolites production and cannabinoids analysis in
526 medicinal Cannabis trichomes during flowering period by 1H NMR-based metabolomics." Planta Medica
527 79(13): SL44.
528 Hasan, K. A. (1974). "Social aspects of the use of cannabis in India." Cannabis and culture: 235-246.
529 Hernandez-Garcia, C. M. and J. J. Finer (2014). "Identification and validation of promoters and cis-acting
530 regulatory elements." Plant Sci 217-218: 109-119.
531 Hillig, K. W. (2004). "A chemotaxonomic analysis of terpenoid variation in Cannabis." Biochemical
532 systematics and ecology 32(10): 875-891.
533 Hillig, K. W. (2005). "Genetic evidence for speciation in Cannabis (Cannabaceae)." Genetic Resources and
534 Crop Evolution 52(2): 161-180.
535 Hillig, K. W. and P. G. Mahlberg (2004). "A chemotaxonomic analysis of cannabinoid variation in
536 Cannabis (Cannabaceae)." American journal of botany 91(6): 966-975.
537 Hu, Z.-G., H.-Y. Guo, X.-L. Hu, X. Chen, X.-Y. Liu, M.-B. Guo, Q.-Y. Zhang, Y.-P. Xu, L.-F. Guo and M.
538 Yang (2012). "Genetic diversity research of hemp (Cannabis sativa L) cultivar based on AFLP analysis."
539 Journal of Plant Genetic Resources 13(4): 555-561.
540 Izzo, A. A., F. Borrelli, R. Capasso, V. Di Marzo and R. Mechoulam (2009). "Non-psychotropic plant
541 cannabinoids: new therapeutic opportunities from an ancient herb." Trends in pharmacological sciences
542 30(10): 515-527.
543 Jiang, H.-E., X. Li, Y.-X. Zhao, D. K. Ferguson, F. Hueber, S. Bera, Y.-F. Wang, L.-C. Zhao, C.-J. Liu and C.-S.
544 Li (2006). "A new insight into Cannabis sativa (Cannabaceae) utilization from 2500-year-old Yanghai
545 Tombs, Xinjiang, China." Journal of ethnopharmacology 108(3): 414-422.
546 Jin, J., F. Tian, D. C. Yang, Y. Q. Meng, L. Kong, J. Luo and G. Gao (2017). "PlantTFDB 4.0: toward a central
547 hub for transcription factors and regulatory interactions in plants." Nucleic Acids Res 45(D1): D1040-
548 D1045.
549 Kojoma, M., H. Seki, S. Yoshida and T. Muranaka (2006). "DNA polymorphisms in the
550 tetrahydrocannabinolic acid (THCA) synthase gene in "drug-type" and "fiber-type" Cannabis sativa L."
551 Forensic Sci Int 159(2-3): 132-140.
552 Korkuc, P., J. H. Schippers and D. Walther (2014). "Characterization and identification of cis-regulatory
553 elements in Arabidopsis based on single-nucleotide polymorphism information." Plant Physiol 164(1):
554 181-200.
555 Kriese, U., E. Schumann, W. Weber, M. Beyer and L. Brühl (2004). "Oil content, tocopherol composition
556 and fatty acid patterns of the seeds of 51 Cannabis sativa L. genotypes." Euphytica 137(3): 339-351.
560 Laverty, K. U., J. M. Stout, M. J. Sullivan, H. Shah, N. Gill, L. Holbrook, G. Deikus, R. Sebra, T. R. Hughes, J.
561 E. Page and H. van Bakel (2019). "A physical and genetic map of Cannabis sativa identifies extensive
562 rearrangements at the THC/CBD acid synthase loci." Genome Res 29(1): 146-156.
563 Li, H.-L. (1974). "The origin and use of Cannabis in eastern Asia linguistic-cultural implications." Economic
564 Botany 28(3): 293-301.
565 Ling, Y., Z. Du, Z. Zhang and Z. Su (2010). "ProFITS of maize: a database of protein families involved in the
566 transduction of signalling in the maize genome." BMC Genomics 11: 580.
567 Long, L. E., D. T. Malone and D. A. Taylor (2005). "The pharmacological actions of cannabidiol." Drugs of
568 the Future 30(7): 747.
569 Luo, X., M. A. Reiter, L. d’Espaux, J. Wong, C. M. Denby, A. Lechner, Y. Zhang, A. T. Grzybowski, S. Harth,
570 W. Lin, H. Lee, C. Yu, J. Shin, K. Deng, V. T. Benites, G. Wang, E. E. K. Baidoo, Y. Chen, I. Dev, C. J. Petzold
571 and J. D. Keasling (2019). "Complete biosynthesis of cannabinoids and their unnatural analogues in
572 yeast." Nature 567(7746): 123-126.
573 Mandolino, G. and A. Carboni (2004). "Potential of marker-assisted selection in hemp genetic
574 improvement." Euphytica 140(1-2): 107-120.
575 McPartland, J. M. (2018). "Cannabis Systematics at the Levels of Family, Genus, and Species." Cannabis
576 and Cannabinoid Research 3(1): 203-212.
577 Mechoulam, R., N. Lander, S. Dikstein, E. Carlini and M. Blumenthal (1976). On the therapeutic
578 possibilities of some cannabinoids. The Therapeutic potential of marihuana, Springer: 35-45.
579 Mechoulam, R. and L. A. Parker (2013). "The endocannabinoid system and the brain." Annual review of
580 psychology 64: 21-47.
581 Miller, H. C., G. Shutler, S. Abrams, J. Hanniman, S. Neylon, C. Ladd, T. Palmbach and H. C. Lee (2003). "A
582 simple DNA extraction method for marijuana samples used in amplified fragment length polymorphism
583 (AFLP) analysis." Journal of forensic sciences 48(2): 343-347.
584 Morimoto, S., K. Komatsu, F. Taura and Y. Shoyama (1998). "Purification and characterization of
585 cannabichromenic acid synthase from Cannabis sativa." Phytochemistry 49(6): 1525-1529.
589 Onofri, C., E. P. M. de Meijer and G. Mandolino (2015). "Sequence heterogeneity of cannabidiolic- and
590 tetrahydrocannabinolic acid-synthase in Cannabis sativa L. and its relationship with chemical
591 phenotype." Phytochemistry 116: 57-68.
592 Pacifico, D., F. Miselli, A. Carboni, A. Moschella and G. Mandolino (2008). "Time course of cannabinoid
593 accumulation and chemotype development during the growth of Cannabis sativa L." Euphytica 160(2):
594 231-240.
595 Pacifico, D., F. Miselli, M. Micheler, A. Carboni, P. Ranalli and G. Mandolino (2006). "Genetics and
596 Marker-assisted Selection of the Chemotype in Cannabis sativa L." Molecular Breeding 17(3): 257-268.
597 Pollastro, F., O. Taglialatela-Scafati, M. Allara, E. Munoz, V. Di Marzo, L. De Petrocellis and G. Appendino
598 (2011). "Bioactive prenylogous cannabinoid from fiber hemp (Cannabis sativa)." Journal of natural
599 products 74(9): 2019-2022.
600 Quimby, M. W., N. J. Doorenbos, C. E. Turner and A. Masoud (1973). "Mississippi-Grown Marihuana:
601 Cannabis sativa Cultivation and Observed Morphological Variations." Economic botany: 117-127.
602 Radwan, M. M., M. A. ElSohly, D. Slade, S. A. Ahmed, I. A. Khan and S. A. Ross (2009). "Biologically active
603 cannabinoids from high-potency Cannabis sativa." Journal of natural products 72(5): 906-911.
604 Radwan, M. M., S. A. Ross, D. Slade, S. A. Ahmed, F. Zulfiqar and M. A. ElSohly (2008). "Isolation and
605 characterization of new cannabis constituents from a high potency variety." Planta medica 74(03): 267-
606 272.
607 Rotherham, D. and S. Harbison (2011). "Differentiation of drug and non-drug Cannabis using a single
608 nucleotide polymorphism (SNP) assay." Forensic science international 207(1-3): 193-197.
609 Rubin, V. (2011). Cannabis and culture, Walter de Gruyter.
610 Sakamoto, K., Y. Akiyama, K. Fukui, H. Kamada and S. Satoh (1998). "Characterization; genome sizes and
611 morphology of sex chromosomes in hemp (Cannabis sativa L.)." Cytologia 63(4): 459-464.
612 Sawler, J., J. M. Stout, K. M. Gardner, D. Hudson, J. Vidmar, L. Butler, J. E. Page and S. Myles (2015). "The
613 genetic structure of marijuana and hemp." PloS one 10(8).
614 Schultes, R. E. (1979). The Species Problem in Cannabis—Science and Semantics, by Ernest Small,
615 published by Corpus, Toronto, Canada, 2 volumes; soft cover, price $28.(Vol. 1, soft cover, $10.95; hard
616 cover, $16.95. Vol. 2, soft cover, $9.95; hard cover, $14.95.), Elsevier.
617 Schultes, R. E., W. M. Klein, T. Plowman and T. E. Lockwood (1974). "Cannabis: an example of taxonomic
618 neglect." Botanical Museum Leaflets, Harvard University 23(9): 337-367.
619 She, R., J. S. Chu, K. Wang, J. Pei and N. Chen (2009). "GenBlastA: enabling BLAST to identify homologous
620 gene sequences." Genome Res 19(1): 143-149.
621 Shirley, N., L. Allgeier, T. LaNier and H. M. Coyle (2013). "Analysis of the NMI01 Marker for a Population
622 Database of Cannabis Seeds." Journal of Forensic Sciences 58(s1): S176-S182.
623 Shoyama, Y., T. Tamada, K. Kurihara, A. Takeuchi, F. Taura, S. Arai, M. Blaber, Y. Shoyama, S. Morimoto
624 and R. Kuroki (2012). "Structure and function of Δ1-tetrahydrocannabinolic acid (THCA) synthase, the
625 enzyme controlling the psychoactivity of Cannabis sativa." Journal of molecular biology 423(1): 96-105.
626 Sirikantaramas, S., S. Morimoto, Y. Shoyama, Y. Ishikawa, Y. Wada, Y. Shoyama and F. Taura (2004). "The
627 gene controlling marijuana psychoactivity molecular cloning and heterologous expression of Δ1-
628 tetrahydrocannabinolic acid synthase from Cannabis sativa L." Journal of Biological Chemistry 279(38):
629 39767-39774.
630 Sirikantaramas, S., F. Taura, S. Morimoto and Y. Shoyama (2007). "Recent advances in Cannabis sativa
631 research: biosynthetic studies and its potential in biotechnology." Current pharmaceutical biotechnology
632 8(4): 237-243.
633 Sirikantaramas, S., F. Taura, Y. Tanaka, Y. Ishikawa, S. Morimoto and Y. Shoyama (2005).
634 "Tetrahydrocannabinolic acid synthase, the enzyme controlling marijuana psychoactivity, is secreted
635 into the storage cavity of the glandular trichomes." Plant and Cell Physiology 46(9): 1578-1582.
639 Small, E. (2015). "Evolution and Classification of Cannabis sativa (Marijuana, Hemp) in Relation to
640 Human Utilization." The Botanical Review 81(3): 189-294.
641 Small, E. and H. Beckstead (1973). "Common cannabinoid phenotypes in 350 stocks of Cannabis."
642 Lloydia.
643 Small, E. and A. Cronquist (1976). "A practical and natural taxonomy for Cannabis." Taxon: 405-435.
644 Soorni, A., R. Fatahi, D. C. Haak, S. A. Salami and A. Bombarely (2017). "Assessment of Genetic Diversity
645 and Population Structure in Iranian Cannabis Germplasm." Scientific Reports 7(1): 15668.
646 Staginnus, C., S. Zörntlein and E. de Meijer (2014). "A PCR marker linked to a THCA synthase
647 polymorphism is a reliable tool to discriminate potentially THC-rich plants of Cannabis sativa L." J
648 Forensic Sci 59(4): 919-926.
649 Stout, J. M., Z. Boubakir, S. J. Ambrose, R. W. Purves and J. E. Page (2012). "The hexanoyl-CoA precursor
650 for cannabinoid biosynthesis is formed by an acyl-activating enzyme in Cannabis sativa trichomes." The
651 Plant Journal 71(3): 353-365.
652 Swift, W., A. Wong, K. M. Li, J. C. Arnold and I. S. McGregor (2013). "Analysis of cannabis seizures in
653 NSW, Australia: cannabis potency and cannabinoid profile." PloS one 8(7).
654 Taura, F., E. Dono, S. Sirikantaramas, K. Yoshimura, Y. Shoyama and S. Morimoto (2007). "Production of
655 Δ1-tetrahydrocannabinolic acid by the biosynthetic enzyme secreted from transgenic Pichia pastoris."
656 Biochemical and biophysical research communications 361(3): 675-680.
657 Taura, F., S. Morimoto and Y. Shoyama (1996). "Purification and characterization of cannabidiolic-acid
658 synthase from Cannabis sativa L. Biochemical analysis of a novel enzyme that catalyzes the
659 oxidocyclization of cannabigerolic acid to cannabidiolic acid." Journal of Biological Chemistry 271(29):
660 17411-17416.
661 Taura, F., S. Sirikantaramas, Y. Shoyama, K. Yoshikai, Y. Shoyama and S. Morimoto (2007).
662 "Cannabidiolic-acid synthase, the chemotype-determining enzyme in the fiber-type Cannabis sativa."
663 FEBS letters 581(16): 2929-2934.
664 Touw, M. (1981). "The religious and medicinal uses of Cannabis in China, India and Tibet." Journal of
665 psychoactive drugs 13(1): 23-34.
666 Turner, C. E. and M. A. Elsohly (1979). "Constituents of Cannabis sativa L. XVI. A possible decomposition
667 pathway of Δ9-tetrahydrocannabinol to cannabinol." Journal of heterocyclic chemistry 16(8): 1667-1668.
668 van Bakel, H., J. M. Stout, A. G. Cote, C. M. Tallon, A. G. Sharpe, T. R. Hughes and J. E. Page (2011). "The
669 draft genome and transcriptome of Cannabis sativa." Genome Biology 12(10): R102.
672 Vavilov, N. I. and F. Freier (1951). "Studies on the origin of cultivated plants." Studies on the origin of
673 cultivated plants.
674 Vergara, D., E. L. Huscher, K. G. Keepers, R. M. Givens, C. G. Cizek, A. Torres, R. Gaudino and N. C. Kane
675 (2019). "Gene copy number is associated with phytochemistry in Cannabis sativa." AoB PLANTS 11(6).
676 Vyskot, B. and R. Hobza (2015). "The genomics of plant sex chromosomes." Plant Science 236: 126-135.
677 Wang, H. and Y. Wei (2012). "Survey on the germplasm resources of Cannabis sativa L." Medicinal Plant
678 3(7): 11-14.
679 Weiblen, G. D., J. P. Wenger, K. J. Craft, M. A. ElSohly, Z. Mehmedic, E. L. Treiber and M. D. Marks (2015).
680 "Gene duplication and divergence affecting drug content in Cannabis sativa." New Phytologist 208(4):
681 1241-1250.
682 Welling, M. T., L. Liu, T. Shapter, C. A. Raymond and G. J. King (2016). "Characterisation of cannabinoid
683 composition in a diverse Cannabis sativa L. germplasm collection." Euphytica 208(3): 463-475.
684 Zirpel, B., O. Kayser and F. Stehle (2018). "Elucidation of structure-function relationship of THCA and
685 CBDA synthase from Cannabis sativa L." Journal of biotechnology 284: 17-26.
686 Zuardi, A. W. (2006). "History of cannabis as a medicine: a review." Brazilian Journal of Psychiatry 28(2):
687 153-157.
LIST OF TABLES
Table 1. Genotyping of examined Cannabis sativa L. cultivars.

Cultivar                Chemotype identified by HPLC   B1080/B1192 marker phenotype   D589 marker phenotype   Complete CBDAS ORF
BC Kush                 I                              BT/BD                          BT present              -
Black Jack              I                              BT/BD                          BT present              -
Bon Homme               I                              BT/BD                          BT present              +
Brasil KC               I                              BT/BD                          BT present              -
Canadian Cheese         I                              BT/BD                          BT present              +
Candy                   I                              BT/BD                          BT present              -
CBD Chemdog             I                              BT/BD                          BT present              +
CBD God Bud             I                              BD                             BT present              -
CBD Haze                I                              BT/BD                          BT present              -
CBD Rene                I                              BT/BD                          BT present              -
Chemdaws                I                              BT/BD                          BT present              +
Cherry                  I                              BD                             BT present              -
Crystal Limit           I                              BT/BD                          BT present              -
Doctor G                I                              BT/BD                          BT present              -
Girl Scout              I                              BT/BD                          BT present              +
Haze                    I                              BD                             BT present              -
Malawi Gold             I                              BT/BD                          BT present              +
NF                      I                              BT/BD                          BT present              -
Pink Rush               I                              BT/BD                          BT present              +
Pink Rush x Head Band   I                              BT/BD                          BT present              +
PP2 x Maui              I                              BD                             BT present              -
RKA                     I                              BT/BD                          BT present              +
RKE                     I                              BT/BD                          BT present              +
Skywalker               I                              BT/BD                          BT present              -
Trainwreck              I                              BT/BD                          BT present              -
White Grapefruit        I                              BT/BD                          BT present              +
Jungle Wreck            II                             BT/BD                          BT present              -
RIO                     II                             BT/BD                          BT present              +
Zambiah                 II                             BT/BD                          BT present              -
CFX2                    III                            BD                             BT absent               +
Finola                  III                            BD                             BT absent               +

LIST OF FIGURES
Figure 1.
Figure 2.
Figure 3. (panels A, B and C)