The compatibility of trypsin inhibitor cyclotides with plant-based recombinant expression systems

Bronwyn Smithies B. BiotechMedRes (Hons)

A thesis submitted for the degree of Doctor of Philosophy at The University of Queensland in 2019

The Institute for Molecular Bioscience

Abstract

The compatibility of trypsin inhibitor cyclotides with plant-based recombinant expression systems Bronwyn Smithies, The University of Queensland, 2019

Plant-based production of modern pharmaceutical products could provide cheaper, greener production alternatives to laboratory-based methods. This is particularly relevant when it comes to producing peptides, where the cost of chemical synthesis can be prohibitive on a large scale and where environmentally damaging reagents are required. Cyclotides are a class of plant-derived cyclic peptides that are amenable to chemical synthesis for small-scale laboratory-based studies, but clinical trials and applications employing peptides require scale-up. Recombinant expression is an attractive alternative to chemical synthesis for scaling up cyclotide production because cyclotides are naturally ribosomally synthesised and consist of natural amino acids. In planta expression has already been demonstrated for some cyclotides, particularly in relation to studying their biosynthesis. This thesis focuses primarily on one subclass of cyclotides, the trypsin inhibitors, whose in planta expression is yet to be fully established. The trypsin inhibitor cyclotides have been re-engineered to develop promising drug leads for chronic myeloid leukemia, cardiovascular disease and inflammation, but the expression of these valuable lead compounds in recombinant plant systems is largely unexplored. Key challenges for plant-based production of trypsin inhibitor cyclotides include compatibility with biosynthetic enzymes and optimising the folding and accumulation conditions in suitable biofactory host plants. This poses a challenge for large-scale plant-based production strategies that must consider the requirement for accessory biosynthetic elements. Ultimately, the compatibility of trypsin inhibitor cyclotides with plant-based production will rely on their ability to be cyclised by enzymes present in the plant, and their accumulation in productive plant tissues. This thesis addresses aspects of recombinant expression of trypsin inhibitor cyclotides over the course of six chapters.
Firstly, this thesis details the current status of cyclotide research with a focus on re-engineered cyclotides for pharmaceutical purposes, and the current options for cyclotide synthesis or biosynthesis. Primary research findings are then presented in Chapters 2 to 5. Chapter 2 focuses on identifying promising genetic constructs for trypsin inhibitor expression in seeds and characterising the challenges for leaf-based expression. In Chapter 3, the investigation moves to the in vitro space to explore the compatibility between trypsin inhibitor cyclotides and plant-derived enzymes capable of backbone cyclisation. Modified trypsin inhibitor cyclotides are redesigned to be compatible with cyclising enzymes in vitro to enable semi-enzymatic production from either synthetic or recombinant substrates. In the fourth chapter, the transient leaf-based expression system of Nicotiana benthamiana is enhanced by the co-expression of a master regulator of seed development to promote a seed-like environment within the leaf tissue, resulting in greater accumulation of cyclotides. To conclude the primary research, Chapter 5 is dedicated to the documentation of a plant collection trip undertaken in the remote Kimberley region of Western Australia in May 2018. Finally, Chapter 6 presents a discussion of the results, including insights into the future directions and applications for the results presented throughout the thesis. In conclusion, the trypsin inhibitor cyclotide scaffold has been re-engineered to be compatible with both therapeutic engineering applications and enzyme-mediated cyclisation, and the foundations for in planta expression have been laid with consideration for the most widely used plant biofactory systems. An important step going forward will be the elucidation of intracellular transport pathways, specifically for trypsin inhibitor cyclotides and their cyclising enzyme partners. Additional discovery efforts are encouraged, particularly to identify and characterise more plant-derived enzymes compatible with head-to-tail cyclisation of the cyclotide backbone. Although challenging, the development of plant-based expression for trypsin inhibitor cyclotides should be pursued to provide a cheap, rapid and scalable production platform to support the development of this class of cyclotides that hold so much therapeutic potential.

Declaration by author

This thesis is composed of my original work, and contains no material previously published or written by another person except where due reference has been made in the text. I have clearly stated the contribution by others to jointly-authored works that I have included in my thesis. I have clearly stated the contribution of others to my thesis as a whole, including statistical assistance, survey design, data analysis, significant technical procedures, professional editorial advice, financial support and any other original research work used or reported in my thesis. The content of my thesis is the result of work I have carried out since the commencement of my higher degree by research candidature and does not include a substantial part of work that has been submitted to qualify for the award of any other degree or diploma in any university or other tertiary institution. I have clearly stated which parts of my thesis, if any, have been submitted to qualify for another award. I acknowledge that an electronic copy of my thesis must be lodged with the University Library and, subject to the policy and procedures of The University of Queensland, the thesis be made available for research and study in accordance with the Copyright Act 1968 unless a period of embargo has been approved by the Dean of the Graduate School. I acknowledge that copyright of all material contained in my thesis resides with the copyright holder(s) of that material. Where appropriate I have obtained copyright permission from the copyright holder to reproduce material in this thesis and have sought permission from co-authors for any jointly authored works included in the thesis.

Publications included in this thesis

1. Smithies, B. J., Huang Y-H., Jackson M. A., Yap K., Gilding E. K., Harris K. S., Anderson M. A. and Craik D. J. (2020) Circular permutation of the native enzyme-mediated cyclization position in cyclotides, ACS Chemical Biology 15, pp. 962-969. This work is reported in Chapter 3 of this thesis.

Submitted manuscripts included in this thesis

No manuscripts submitted for publication.

Other publications during candidature

Peer-reviewed articles

1. Qu H., Smithies B. J., Durek T., and Craik D. J. (2017). Synthesis and engineering applications of cyclotides, Australian Journal of Chemistry 70, p. 152.

Conference abstracts

1. Smithies B. J., Huang Y-H., Gilding E., Jackson M. and Craik D. J. Using plants as biofactories for therapeutic peptides. ComBio, October 2016, Brisbane, Australia. Poster ‘teaser’ oral presentation and poster presentation.

2. Smithies B. J., Jackson M., Huang Y-H., Gilding E., and Craik D. J. Using plants as biofactories for therapeutic cyclic peptides. EMBL Australia Postgraduate Symposium, November 2016. Adelaide, Australia. Poster presentation.

3. Smithies B. J., Jackson M., Huang Y-H., Gilding E., and Craik D. J. Closing the ring: Reorganising cyclic structure to enable expression of promising peptide pharmaceuticals in plants. 3rd Congress of the International Society for Plant Molecular Farming, June 2018. Helsinki, Finland. Poster presentation.

4. Smithies B. J., Jackson M., Huang Y-H., Gilding E., and Craik D. J. Closing the ring: a new cyclisation position for the cyclic peptide MCoTI-II. 4th International Conference on Circular Peptides and Proteins, November 2018. Kawasaki, Japan. ‘Hotspot’ talk and poster presentation.

Contributions by others to the thesis

Dr Mark Jackson and Dr Edward Gilding helped conceptualise, design and analyse the results of experiments in Chapters 2 to 5. Dr Jackson designed some of the expression constructs, provided guidance and suggestions for further construct design, and gave instruction on molecular cloning and plant expression techniques used in Chapters 2 and 4. Dr Gilding designed and provided the pDONR221 expression vector used in Chapters 2 and 4, gave instruction on molecular cloning and plant expression techniques for Chapters 2 and 4, and developed the CRISPR-Cas9 edited plants used in Chapter 4. Dr Jackson and Dr Yen-Hua Huang provided the original concept for the alternative cyclisation experiments in Chapters 2, 3 and 4. Dr Huang gave instruction on peptide purification and NMR techniques and analysed NMR data for Chapter 3. Kuok Yap conducted E. coli expression, provided purified AEPs, and assisted with co-elution studies for Chapter 3. Dr Gilding organised the itinerary, car hire, collection permits and compiled the target taxa list for the field trip to the Kimberleys (Chapter 5). Dr Gilding, Dr Jackson, Dr Quentin Kaas and Prof. David Craik assisted in collecting plant specimens and providing photographic records for Chapter 5. Dr Jackson, Dr Gilding, Professor Marilyn Anderson and Prof. Craik provided critical revision during preparation of this thesis.

Statement of parts of the thesis submitted to qualify for the award of another degree

No works submitted towards another degree have been included in this thesis.

Research involving human or animal subjects

No animal or human subjects were involved in this research.

Acknowledgments

During my PhD I have been exceptionally lucky to receive professional and personal support from those around me. My colleagues, friends and family have truly shaped this experience and I wish to convey my sincerest gratitude to them. In particular, I would like to express my thanks to the following people: First and foremost I wish to thank my advisory team led by Prof. David Craik and including Dr Mark Jackson, Dr Edward Gilding and Prof. Marilyn Anderson. David, you have been highly encouraging from our first conversation and I thank you for your continued support, your generosity and your genuine interest in my thoughts and opinions. You continue to support the incorporation of plant science in your research group and I am very grateful that you took me in in this regard. Mark, thank you for your constant and unwavering support, for your guidance in designing my experiments, for the endless time you have spared to pass on your knowledge and answer my questions, for sharing your sixth sense for genetic construct design, and for being a caring and honest mentor. Ed, thank you for supporting my ideas, for helping me troubleshoot my projects, for always teaching me in a thoughtful manner, for developing my capacity to supervise others and for sharing your enthusiasm for the world of plants. Marilyn, I would like to thank you for being open to joining my advisory team and for hosting me in your group on several occasions over the past few years. I appreciate your thoughtful feedback, your honesty and your mentorship. I also wish to thank the following people for their assistance in carrying out my technical work. Kuok Yap, thank you for your daily support and for having a solution to every method I have ever needed to troubleshoot. Thank you for helping me with bacterial expression, purification and mass spectrometry.
Dr Crystal Huang, thank you for your input into the design of my experiments and for your guidance with peptide purification, mass spectrometry and NMR. To Dr Peta Harvey, thank you for your patience while instructing me in NMR. I would also like to thank Dr Aaron Poth for advice about mass spectrometry, Dr Quentin Kaas for discussions around peptide design, and Olivier Cheneval for help with peptide synthesis. I would like to convey my appreciation to the IMB for support through the IMBRAA scholarship, for fostering a wonderful community and for providing professional development opportunities. I thank Dr Amanda Carozzi for her constant support through the PhD process and for always supporting student endeavours. I also thank the senior management and professional staff who provided assistance for student activities and community events that helped motivate me throughout my PhD. Thank you to Dr Annie Kan for keeping everyone in good spirits, for your help with settling in to the group, and for always lending an ear. To Robyn Craik, thank you for your fierce support, for your words of kindness and for being so generous with your time, especially when it has come to proofreading my thesis. To my colleagues who I am lucky enough to call my friends, thank you for making this experience an enjoyable one. Haiou and Georgie, I’m so glad that I have had you by my side throughout this PhD endeavour. You have been my mentors, my friends and my daily support. Sarah, Greg, and Guillaume, thank you for allowing me to join the most amazing team and for some of my best memories in Brisbane. To Emma, I’m ever grateful for your words of wisdom, your company when things have been tough, and your unwavering friendship. To two people who have been by my side throughout this whole experience, Alex and Mark. I can’t imagine living with anyone else these past four years. You truly are my Brisbane family. Finally, to my family, Mark, Keren and Julian. Thank you for your continued love and support and for letting me know that you are proud of me. I am truly grateful to know that I have your support in all my endeavours.

Financial support

This research was supported by an Australian Government Research Training Program Scholarship, the Institute for Molecular Bioscience Research Advancement Award, and the Institute for Molecular Bioscience Write-Up Award.

Keywords

Cyclotide, recombinant production, Arabidopsis thaliana, Nicotiana benthamiana, plant molecular farming, biofactory, asparaginyl endopeptidase

Australian and New Zealand Standard Research Classifications (ANZSRC)

ANZSRC code: 060702, Plant Cell and Molecular Biology, 50%
ANZSRC code: 030406, Proteins and Peptides, 30%
ANZSRC code: 060101, Analytical Biochemistry, 20%

Fields of Research (FoR) Classification

FoR code: 0601, Biochemistry and Cell Biology, 50%
FoR code: 0607, Plant Biology, 30%
FoR code: 0304, Medicinal and Biomolecular Chemistry, 20%

Contents

Abstract
Contents
List of figures
List of tables
List of abbreviations

1 Introduction
1.1 Overview
1.2 Cyclotides in nature
1.2.1 Discovery of cyclotides
1.2.2 The defining features of cyclotides
1.2.3 Structure-function relationships and natural bioactivities
1.3 Engineering the cyclotide scaffold
1.3.1 Peptides as therapeutics
1.3.2 Cyclotide grafting
1.4 Cyclotide production
1.4.1 Cyclotide synthesis
1.4.2 Native cyclotide biosynthesis
1.4.3 Comparing native and synthetic cyclotide production
1.5 Plants as production systems
1.5.1 Summary
1.5.2 Scope and outline of the thesis
References

2 Recombinant expression of MCoTI-II: molecular construct design
2.1 Overview
2.2 Methods
2.2.1 Genetic constructs and cloning
2.2.2 Plant growth and maintenance
2.2.3 Plant transformation
2.2.4 Extraction and analysis of peptide content in plant tissue
2.3 Results
2.3.1 A first look at MCoTI-II expression in planta using three precursor peptides
2.3.2 Expression of a grafted MCoTI-II analogue
2.3.3 Expression of cyclisation variants of MCoTI-II in Arabidopsis thaliana seeds
2.3.4 MCoTI-II expression in leaves
2.3.5 Expression of MCoTI-II with alternative C-terminal propeptides for AEP compatibility
2.3.6 Linear [K6V]MCoTI-II, but not native MCoTI-II, is detectable in Nicotiana benthamiana from no-CTPP constructs
2.3.7 Summary of expression profiles
2.4 Discussion
2.5 Conclusion
References

3 Circular permutation of the native enzyme-mediated cyclization position in cyclotides
3.1 Authorship contribution
3.2 Overview
3.3 Abstract
3.4 Introduction
3.5 Methods
3.5.1 Peptide synthesis, purification and oxidation.
3.5.2 NMR spectroscopy.
3.5.3 Recombinant production of [G1,G33]L5_Abl06ALEG peptide in E. coli.
3.5.4 Recombinant production of OaAEP1b in E. coli.
3.5.5 AEP cyclization assays.
3.5.6 MALDI-MS.
3.5.7 Co-elution of chemically and AEP-cyclized [G1,G33]L5_Abl06.
3.5.8 Evaluation of Abl kinase inhibition using LC/MS.
3.6 Results and discussion
3.6.1 Choosing an alternative cyclization point in MCoTI-II.
3.6.2 MCoTI-II can be cyclized efficiently in loop 5 with the addition of two residues.
3.6.3 Small residue additions may affect AEP-binding and/or amide bond formation.
3.6.4 A loop-6-grafted cyclotide is compatible with AEP-mediated cyclization in loop 5.
3.6.5 In vitro AEP-mediated cyclization of a recombinant cyclotide precursor.
3.6.6 One-pot cyclization and folding of grafted L5 cyclotide.
3.7 Conclusion
References

4 Manipulation of the plant cell environment to enhance recombinant expression of cyclotides
4.1 Overview
4.2 Methods
4.2.1 Generation of LEC2 constructs
4.2.2 FP1 promoter construct
4.2.3 Constructs containing the gene of interest
4.2.4 Sequencing to confirm LEC2 binding site in the Oleosin promoter
4.2.5 Transient expression in N. benthamiana
4.2.6 Peptide extraction from N. benthamiana leaves
4.2.7 Cyclotide content analysis using MALDI-MS
4.2.8 Transient expression in CRISPR-Cas9 AEP-knockout lines of N. benthamiana
4.3 Results
4.3.1 The effect of LEC2 on kalata B1 expression
4.3.2 Comparing the three vectors; pEAQ-HT-Dest1, pOH123 and pFH123
4.3.3 Co-expression of MCoTI-II and LEC2
4.3.4 Background Oleosin promoter activity is probably not due to mobile auxin activity induced elsewhere on the plant
4.3.5 The Oleosin promoter used in this study contains the RY elements necessary for LEC2 transcription factor binding
4.3.6 Expression of cyclotides in AEP-knockout N. benthamiana
4.3.7 Summary of cyclotides detected and not detected in N. benthamiana
4.4 Discussion
4.5 Conclusions
References

5 Cyclotide discovery efforts in the Kimberley region of Western Australia
5.1 Overview
5.2 Methods
5.2.1 Itinerary
5.2.2 Retrieving historical plant collection data
5.2.3 Plant collection
5.2.4 Record keeping
5.2.5 Other planning: Permits, equipment, provisions
5.2.6 Safety
5.3 Results
5.3.1 Specimens collected
5.3.2 Collection sites
5.4 Discussion
5.4.1 Elusive species and future collections
5.4.2 Working in a remote area
5.4.3 Record keeping and suggestions
5.5 Conclusion
References

6 Outlook and Future Directions
6.1 Overview
6.2 MCoTI-II expression is suited to seeds
6.3 The loop 5 cyclisation point opens up the possibility for in planta biosynthesis of grafted cyclotides
6.4 Transient expression in Nicotiana benthamiana can be optimised
6.5 Ongoing discovery efforts for cyclotides and cyclising enzymes are encouraged
6.6 Suggested experiments for the continuation of this work
6.6.1 Suggestions for seed-based expression
6.6.2 Suggestions for leaf-based expression
6.7 Concluding remarks
References

A Supplementary information

List of figures

1.1 Plant families and cyclotide structures.
1.2 The cyclotide grafting concept.
1.3 Cyclotide scaffold frequency by publication.
1.4 Synthetic cyclotide production.
1.5 Cyclotide biosynthesis and genetic components.

2.1 Cloning vectors pDONR221, pOH123 and pEAQ-HT-Dest1.
2.2 Expression of MCoTI-II in Arabidopsis thaliana seeds.
2.3 Expression of MCoAbl09 in Arabidopsis thaliana seeds.
2.4 Expression of an MCoTI-II loop 5 variant in Arabidopsis thaliana seeds.
2.5 Expression of MCoTI-II loop 5 analogues in Arabidopsis thaliana seeds.
2.6 Expression of MCoTI-II and [K6V]MCoTI-II in Nicotiana benthamiana leaves with three AEPs.
2.7 Modification of the C-terminal propeptide (CTPP) of [K6V]MCoTI-II for expression with two AEPs in Nicotiana benthamiana.
2.8 Linear MCoTI-II does not accumulate in Nicotiana benthamiana tissues.

3.1 Cyclization of MCoTI-II analogues, L5ALEG and L5GLP.
3.2 Optimization of the loop 5 cyclization point of oxidized MCoTI-II.
3.3 Cyclization of grafted cyclotide [G1,G33]L5_Abl06ALEG.
3.4 Co-elution of chemically and enzymatically cyclised peptides.

4.1 Design of pFH123.
4.2 Experimental design to assess the effect of LEC2 expression on cyclotide accumulation in Nicotiana benthamiana.
4.3 The effect of LEC2 on the expression of kalata B1 when driven by three expression cassettes.
4.4 The effect of LEC2 on expression of kalata B1 in the absence of OaAEP1b.
4.5 General trends in the use of three different vectors and three different precursors for kalata-B1 expression in the presence and absence of LEC2.
4.6 The experimental design for LEC2 and Oak_MCoTI-II co-expression.
4.7 The effect of LEC2 on the expression of MCoTI-II driven by three expression cassettes.
4.8 The expression of LEC2 and non-LEC2 treatments on separate plants.
4.9 The effect of younger or older infiltrated leaves on kalata B1 expression
4.10 Analysis of the -200 bp region of the Oleosin promoter and pOH123
4.11 The detection of kalata B1 in AEP knockout lines of N. benthamiana.
4.12 Relative quantification of kalata B1 from OakOpt+UTR precursor in AEP knockout N. benthamiana lines.

5.1 The driving route taken for the Kimberley Field Trip.
5.2 An example of Field Guides generated by the Atlas of Living Australia
5.3 Plant collection kit.
5.4 First aid resources for the Kimberley Field Trip.
5.5 Crocodile safety.
5.6 Map of plant collection sites.
5.7 Collection site 1: Roadside 60 km from Broome.
5.8 Collection site 2: Windjana Savannah.
5.9 Collection site 3: Same Creek.
5.10 Collection site 4: Bell Gorge.
5.11 Collection site 5: Silent Grove.
5.12 Collection site 6: Billy Goat Crossing.
5.13 Collection site 7: Mt Elizabeth track.
5.14 Collection sites 8–10: Hann River Crossing, Mistake Creek, and Gibb River.
5.15 Collection site 12: Unnamed Roadside.
5.16 Collection site 13: Wolfe Creek.
5.17 Collection site 14: Willare.
5.18 Collection site 15: Roadside 63 km from Broome.

A.1 The comparison of Hα chemical shifts of [G1,G33]L5_Abl06 with the native scaffold MCoTI-II.
A.2 The TOCSY and NOESY spectra of [G1,G33]L5_Abl06.
A.3 Cyclisation of [G1,G33]L5_Abl06 in the presence of reducing agent.
A.4 LC-MS traces for peptides used in Chapter 3.
A.5 1D NMR for peptides used in cyclisation assays in Chapter 3.
A.6 Abl kinase inhibition of MTAbl06.
A.7 Oxidative folding of cyclic [G1,G33]L5_Abl06
A.8 Kimberley trip May 2018: Original collection sheet
A.9 Kimberley trip: Meal plan and shopping list

List of tables

1.1 Cyclotide grafting studies
1.2 Publications describing recombinant cyclotide expression in plants.

2.1 Expression summary: Arabidopsis thaliana seed
2.2 Expression summary: Nicotiana benthamiana leaf
2.3 Expression summary: Lactuca sativa leaf

4.1 Genes of interest and vectors used in Chapter 4
4.2 Positive detection of cyclotides and linear products of GOIs from different vectors used in this study.

5.1 Kimberley trip: Itinerary
5.2 Specimens collected in the Kimberley region of Western Australia

A.1 Calculated masses of peptides used in Chapter 3
A.2 Kimberley trip: Targeted taxa

List of abbreviations

AEP         asparaginyl endopeptidase
ALA         Atlas of Living Australia
AVH         Australasian Virtual Herbarium
CCK         cyclic cystine knot
CPMV        cowpea mosaic virus
CRISPR      clustered regularly interspaced short palindromic repeats
CtAEP       Clitoria ternatea asparaginyl endopeptidase
CTPP        C-terminal propeptide
DTT         dithiothreitol
GFP         green fluorescent protein
GOI         gene of interest
kB1         kalata B1
MALDI-TOF   matrix assisted laser desorption ionisation, time of flight
McAEP       Momordica cochinchinensis asparaginyl endopeptidase
MCoTI       Momordica cochinchinensis trypsin inhibitor
MES         2-(N-morpholino)ethanesulfonic acid
MS          mass spectrometry
nd          no data
NTPR        N-terminal pro-region
NTR         N-terminal repeat
OaAEP       Oldenlandia affinis asparaginyl endopeptidase
OD          optical density
PPI         potato protease inhibitor
PxAEP       Petunia hybrida asparaginyl endopeptidase
SFTI        sunflower trypsin inhibitor
SPPS        solid phase peptide synthesis
TI          trypsin inhibitor
UTR         untranslated region

Chapter 1

Introduction

1.1 Overview

Cyclotides are naturally occurring cyclic peptides from plants that have many features that make them fascinating to study. They are unique to the plant kingdom, as far as we know, and although the number of naturally occurring cyclotides is predicted to be in the tens of thousands, they are found so far in just five plant families. The core structure of cyclotides is conserved, whereas their peptide sequence varies immensely. The conservation of the cyclotide structure not only defines this class of cyclic peptides, but underpins one of their most useful features: stability. At the same time, the natural variation in the sequence has led to the intriguing possibility that almost any amino acid sequence could be designed while keeping the underlying structure intact, thereby coupling the inherent stability of the cyclotide structure with any number of bioactivities. The combination of the variation in detail and conservation of structure has opened up the area of cyclotide engineering, in particular ‘molecular grafting’, which will be referred to here as ‘grafting’. Analogous to the horticultural use of the word, grafting here means the replacement of some part of the cyclotide sequence with another sequence to introduce a new feature to the cyclotide. Through the discovery, characterisation, and grafting of cyclotides, their usefulness as bioactive molecules has been exploited and expanded to create a suite of pharmaceutical candidates for cancer, HIV, inflammatory conditions and more. The extensive effort that has been applied to the discovery, characterisation, and engineering of this fascinating class of molecules is here followed by an exploration into options for their production. This thesis aims to explore plant-based recombinant expression of cyclotides as an alternative to chemical synthesis for production purposes.

1.2 Cyclotides in nature

1.2.1 Discovery of cyclotides

Cyclotides (Craik et al., 1999) are cyclic peptides that occur naturally in five families within the plant kingdom: the Rubiaceae (Koehbach et al., 2015), Violaceae (Craik et al., 1999), Cucurbitaceae (Heitz et al., 2001; Hernandez et al., 2000), Fabaceae (Poth et al., 2011; Nguyen et al., 2011), and Solanaceae (Poth et al., 2012). The prototypical cyclotide, kalata B1, was discovered in an African plant, Oldenlandia affinis, during the 1970s, although its molecular structure was unknown at the time. Kalata B1 was identified as the active ingredient in a medicinal tea brewed from the plant that was used to accelerate childbirth in Zaire, now known as the Democratic Republic of the Congo (Gran, 1970). That this peptide could retain bioactivity after being boiled and ingested hinted that it must conform to an exceptionally stable molecular structure that is resistant to both heat and digestive enzymes. In the 1970s, the term cyclotide was still 30 years away from being defined. In the meantime, more cyclic peptides were identified from O. affinis (Gran, 1973a), other members of the Rubiaceae (Witherup et al., 1994), and from the Violaceae family (Schöpke et al., 1993; Claeson et al., 1998; Göransson et al., 1999). The circular nature of kalata B1 was first demonstrated in the mid-1990s (Saether et al., 1995), and was followed by the discovery of similar circular peptides from the Violaceae family. The term cyclotide was coined in 1999 (Craik et al., 1999). Now, over 400 wild-type cyclotides have been catalogued in the cyclic peptide database, CyBase (Wang et al., 2008). Cyclotides have been found in multiple plant tissues including shoots, flowers, leaves, roots and seeds, and some plants contain multiple cyclotides. This is epitomised by the legume Clitoria ternatea, which contains 74 different cyclotides (Oguis et al., 2019) distributed differentially throughout the plant (Gilding et al., 2016). It is predicted that there could be tens of thousands of unique cyclotides in existence that are yet to be identified in the Rubiaceae family alone (Gruber et al., 2008). Some suggest that this is a conservative estimate (Koehbach et al., 2015). This natural prevalence provides an abundance of potentially valuable products for the agrichemical and pharmaceutical industries, either by harnessing the natural bioactivities of cyclotides or by using native cyclotides as a template for drug design.

1.2.2 The defining features of cyclotides

Cyclotides are interesting because unlike most proteins, they are topologically circular. That is, their N- and C- termini are covalently linked as opposed to being free. The term ‘cyclotide’ derives from ‘cyclo peptide’ (Craik et al., 1999) which refers to their cyclic backbone and their small size, which ranges between 28 and 37 amino acids. Although they are named in essence for their circular backbone, this is just one of the defining features of cyclotides. The difference between cyclotides and other cyclic peptides lies within the structure of the cyclotide core, which contains three disulfide bonds in a conserved arrangement. This number of disulfide bonds, and the order in which they are arranged, is a defining feature of cyclotides (Craik et al., 1999). Specifically, the disulfide bonds are arranged as follows: CysI-CysIV, and CysII-CysV cross the backbone in parallel, and CysIII-CysVI threads between them to form a knot-like structure. This cystine knot structure can be found in non-cyclic proteins called knottins (Le Nguyen et al., 1990), but in cyclotides it is termed the cyclic cystine knot (CCK)(Craik et al., 1999). The CCK differentiates cyclotides from other disulfide-containing cyclic peptides such as theta defensins, which have three disulfide bonds that are arranged in parallel rather than a knotted formation. Within the cyclotide family there are three subclasses: the Möbius, bracelet and trypsin inhibitor cyclotides, shown in Figure 1.1. The Möbius cyclotides contain a cis- in loop 5 (Craik et al., 1999) that creates the characteristic twist in the backbone ribbon, whereas bracelet cyclotides do not have this feature. Trypsin inhibitor (TI) cyclotides have less sequence similarity to both Möbius and bracelets, and have been identified only in the Cucurbitaceae family (Figure 1.1) (Hernandez et al., 2000; Mylne et al., 2012; Mahatmanto et al., 2015; Du et al., 2019). 
TI cyclotides appear to be cyclic versions of almost identical linear TIs from the same plants that are expressed from within the same multi-domain precursor genes (Mylne et al., 2012). The three-dimensional shape of the cyclic MCoTI-II overlays almost perfectly with the linear TI-5 (Mylne et al., 2012), suggesting that cyclisation in TIs is unlikely to affect the form, and therefore the function, of the peptide, but rather evolved as a mechanism to protect the N- and C-termini from proteolytic degradation. The unique features of TI cyclotides will be explored in later sections, but the stability conferred by cyclisation is common to all cyclotides.

Figure 1.1: Plant families and cyclotide structures. Cyclotides are found within the five plant families shown here. They are ubiquitous in the Violaceae family, but are relatively uncommon in the other four families. Structurally, cyclotides fall into three classes: the Möbius, bracelet and trypsin inhibitor classes. The Rubiaceae, Violaceae and Fabaceae families contain both Möbius and bracelet cyclotides, the Solanaceae contains bracelets only, and the trypsin inhibitor class is exclusive to the Cucurbitaceae. Structural features: all cyclotides contain six cysteine residues, labelled here in Roman numerals. Disulfide bonds are formed between cysteines I-IV, II-V and III-VI, shown here as yellow sticks. The six regions between cysteines are called loops and are numbered in Arabic numerals.
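The cysteine numbering, loop numbering and disulfide connectivity described above can be made concrete with a short script. This is a minimal sketch rather than anything used in the thesis: the kalata B1 sequence is quoted from the public literature as an assumption and should be checked against CyBase, and all function names are illustrative.

```python
# Minimal sketch: label the six cysteines, the six loops and the
# cystine-knot pairing of a cyclotide, given its linear sequence.
# ASSUMPTION: kalata B1 sequence as commonly reported; verify in CyBase.
KALATA_B1 = "GLPVCGETCVGGTCNTPGCTCSWPVCTRN"

ROMAN = ["I", "II", "III", "IV", "V", "VI"]
# Cystine knot connectivity stated in the text: I-IV, II-V, III-VI.
KNOT = [("I", "IV"), ("II", "V"), ("III", "VI")]


def cysteines(seq):
    """Map Roman-numeral labels to 1-based cysteine positions."""
    positions = [i + 1 for i, aa in enumerate(seq) if aa == "C"]
    if len(positions) != 6:
        raise ValueError("a cyclotide is expected to contain six cysteines")
    return dict(zip(ROMAN, positions))


def loops(seq):
    """Return loops 1-6, i.e. the residues between consecutive cysteines.

    Loop 6 spans the cyclisation point, so it joins the residues after
    CysVI to the residues before CysI.
    """
    pos = [p - 1 for p in cysteines(seq).values()]  # 0-based indices
    out = {n + 1: seq[pos[n] + 1:pos[n + 1]] for n in range(5)}
    out[6] = seq[pos[5] + 1:] + seq[:pos[0]]
    return out


def disulfides(seq):
    """Return the three disulfide bonds as pairs of 1-based positions."""
    cys = cysteines(seq)
    return [(cys[a], cys[b]) for a, b in KNOT]


if __name__ == "__main__":
    for number, residues in loops(KALATA_B1).items():
        print(f"loop {number}: {residues}")
    print("disulfides:", disulfides(KALATA_B1))
```

Treating loop 6 as a wrap-around of the linear sequence reflects the fact that the two ends of the gene-encoded domain are joined in the mature peptide.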

The stability of cyclotides is probably their most frequently cited feature and makes them appealing for pharmaceutical and agrichemical applications. The first indication that cyclotides were resistant to thermal and enzymatic degradation came from the initial observations of the uterotonic tea consumed by Congolese women (Gran, 1970), where the effects of the tea were potent even after boiling and ingestion. The stability of the cyclotide structure has been attributed primarily to the CCK, the hydrogen bonding network, cyclisation, and secondary structures such as the β turn in loop 5 (Čemažar et al., 2006; Wang et al., 2009). The conservation of these structures has resulted in the defining stability of cyclotides, but the sequence variation in the other parts of the cyclotide backbone has resulted in diversity of function among this group of peptides.


1.2.3 Structure-function relationships and natural bioactivities

Cyclotide sequences vary greatly in the inter-cystine regions, termed ‘loops’, while leaving the CCK motif intact. This is an important feature of cyclotides because it leads to their diversity, providing a vast range of bioactivities with the same underlying stability. In terms of structure-function relationships, scanning mutagenesis studies using kalata B1 have identified several residues important for insecticidal, anthelmintic and hemolytic activity (Simonsen et al., 2008). These residues are located in loops 1, 2 and 3 of the cyclotide, but map to the same region of the tertiary structure, which is therefore known as the bioactive patch or face. The residues of the bioactive face that are located in loop 1 are the most highly conserved among other Möbius class cyclotides. Another region important for activity and membrane interactions is the hydrophobic patch, identified via lysine scanning of kalata B1 (Huang et al., 2010). The mode of action of a cyclotide depends largely on its subclass: Möbius cyclotides act via membrane interactions, whereas trypsin inhibitor cyclotides cross the cell membrane to act via direct protein-protein interactions (Veer et al., 2019). Membrane binding was first observed for Möbius cyclotides using surface plasmon resonance (Kamimori et al., 2005) and NMR (Shenkarev et al., 2006), and cellular uptake relies on interactions with phosphatidylethanolamine (PE) phospholipids (Henriques et al., 2015). Trypsin inhibitor cyclotides lack the residues required for PE binding and appear to enter cells via macropinocytosis (Contreras et al., 2011). Instead of interacting with PE, loop 1 of trypsin inhibitor cyclotides is required for binding to the active site of trypsin (Daly et al., 2013). Cyclotides were initially discovered based on their bioactivity (Gran, 1970; Gran, 1973b) and this has been one motivation for ongoing discovery efforts.
The role of cyclotides in planta is probably pesticidal, as demonstrated in particular by their insecticidal activity (Jennings et al., 2001; Jennings et al., 2005) and exemplified by the launch of the first cyclotide-containing pesticide in 2018, which is made from Clitoria ternatea extracts (InnovateAg, 2018). Pharmaceutically relevant bioactivities have also been demonstrated for natural cyclotides, including anti-HIV, antimicrobial, immunosuppressive and anti-cancer activity (Oguis et al., 2015), and although these features have not yet been exploited in the clinic, they have provided inspiration for the use of cyclotides as scaffolds for drug design.

1.3 Engineering the cyclotide scaffold

1.3.1 Peptides as therapeutics

Most drugs on the market today are small molecules (Fotouhi, 2015), which means that they can interact with extracellular and intracellular targets by quickly crossing the cell membrane. Although they dominate the market, small molecules can be problematic in terms of off-target effects. On the other hand, larger proteins or ‘biologics’ have had success as pharmaceuticals, having higher specificity to their targets. They are less likely to succumb to resistance from their targets, as their interaction with a substrate covers a larger area than that of small molecules. Despite being generally unsuitable for oral delivery, large protein biologics have been successful in the pharmaceutical world, with well-known examples including Humira® (adalimumab) for rheumatoid arthritis and Herceptin® (trastuzumab) for HER2-positive breast cancer treatment. The success of biologics has relied not only on their specificity, but on advances made in genome sequencing and proteomics which have contributed to discovery, rational drug design, and screening, albeit to a lesser extent than the high-throughput screening available for discovery of small molecules. Peptide drugs currently make up just 2% of the global pharmaceutical market, primarily because of their poor bioavailability (Fotouhi, 2015). However, peptides can fill a niche in the pharmaceutical market because they can combine the specificity of biologics with the cell-permeability of small molecules. Peptides can be derived from larger proteins, retaining their activity but reducing their size so that cell penetration is much more achievable. They retain the selectivity of larger proteins and are potentially less toxic than some small molecules (Craik et al., 2013). Some successful peptide drugs on the market today include Linzess® (linaclotide) for the treatment of irritable bowel syndrome and Prialt® (ziconotide) for chronic pain management. These two peptides have three disulfide bonds like cyclotides, but they are not backbone-cyclised. To realise the potential of peptide drugs on a broader scale, their stability and bioavailability need to be improved. Molecular grafting into scaffolds such as cyclotides is being explored as a means of achieving this (Craik et al., 2013).

1.3.2 Cyclotide grafting

The diversity of amino acid sequences among naturally occurring cyclotides, with the exception of cysteine residues, was an early indication that cyclotides could tolerate intentional amino acid substitutions, and that this could be used to change or enhance their properties. The idea of cyclotide grafting is to insert a peptide sequence that has a desired activity into the cyclotide backbone, or to replace a segment of the backbone with it, as shown in Figure 1.2. This insertion stabilises the peptide because its ends are no longer exposed to degradation, and because it is now part of the ultra-stable CCK scaffold. The grafting concept was first investigated in experiments looking at key differences in folding between Möbius and bracelet cyclotides. The residues KNK from cycloviolacin O1, a bracelet cyclotide, were grafted into kalata B1 to demonstrate that the CCK structure would hold after the substitution (Clark et al., 2006). The field quickly moved on to grafting therapeutic sequences that are not of cyclotide origin. Now, 25 studies have used cyclotides as scaffolds for drug design (see Table 1.1), with most of these publications describing multiple grafted analogues. The largest grafted sequence published to date is 21 amino acids in length, grafted into loop 6 of MCoTI-II (D’Souza et al., 2016). This graft, MCOG2, had activity against K562 cancer cells in vitro via antagonism of SET, an oncoprotein otherwise implicated in inhibition of some tumour suppressors. Loop 6 of native MCoTI-II contains 8 amino acids, but MCOG2 took that number to 28 (21 plus 7 residues retained from the original loop 6 sequence), demonstrating the remarkable ability of MCoTI-II to take on extra residues, more than tripling the length of the original loop 6, while retaining the CCK structure and activity of the epitope. At the other extreme, truncation of MCoTI-II has also been demonstrated by removing four residues, ‘SDGG’, from the native loop 6 sequence. This truncation, along with a single substitution in loop 1, was designed to reduce steric interference and resulted in a 9 nM tryptase inhibitor, 60 times more potent than the non-truncated parent cyclotide (Thongyoo et al., 2009). These two studies represent the two extremes of loop length manipulation, but epitopes of almost all sizes in between have also been characterised (Table 1.1).

Figure 1.2: The cyclotide grafting concept. A peptide epitope (green) replaces part of the cyclotide backbone to create a grafted cyclotide.
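At the sequence level, grafting amounts to replacing a segment of one loop with an epitope. The sketch below illustrates this and reproduces the loop 6 length arithmetic quoted for MCOG2. It is an illustration only: the strings are placeholders rather than real peptide sequences, the position of the replaced residue is chosen arbitrarily, and the function name is hypothetical.

```python
# Minimal sketch of loop grafting as string replacement.
# ASSUMPTION: the sequences below are placeholders, not real peptides;
# only their lengths follow the MCOG2 example in the text.


def graft(loop, epitope, start, end):
    """Replace loop[start:end] with the epitope and return the new loop."""
    if not 0 <= start <= end <= len(loop):
        raise ValueError("replaced segment must lie inside the loop")
    return loop[:start] + epitope + loop[end:]


native_loop6 = "n" * 8   # placeholder for the 8 native loop 6 residues
epitope = "e" * 21       # placeholder for the 21-residue grafted sequence

# Replacing a single native residue keeps 7 and adds 21, giving 28.
# Which residue is replaced is chosen arbitrarily here.
grafted_loop6 = graft(native_loop6, epitope, start=4, end=5)

print(len(grafted_loop6))                       # 28
print(len(grafted_loop6) / len(native_loop6))   # 3.5
```

Setting `start` equal to `end` models a pure insertion, and passing an empty epitope models a truncation such as the removal of ‘SDGG’.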

The flexibility of the cyclotide grafting scaffold has not only been demonstrated by the length of the inserts, but also the number of grafted sequences in one scaffold. Sixteen instances of multiple grafts have been published to date, including nine double grafts (Chan et al., 2016; Huang et al., 2015b; Swedberg et al., 2016; Wang et al., 2014a), five triple grafts (Gao et al., 2010; Maaß et al., 2015; Sommerhoff et al., 2010) and one quadruple graft if single base substitutions are considered (Getz et al., 2013). Multiple grafts can include copies of the same epitope, or a combination of different epitopes. Multiple copies of the same epitope can be used to enhance activity, for example, in the case of the abltide epitopes grafted into MCoTI-II (Huang et al., 2015b). The aim of the study was to target BCR-ABL, the fusion protein implicated in chronic myeloid leukaemia, by designing inhibitors based on the substrate of ABL. Having two copies of an abltide-derived sequence grafted into MCoTI-II proved to be more efficient at killing K562 cancer cells than just one copy (Huang et al., 2015b).

Table 1.1: Cyclotide grafting studies. Columns: Cyclotide (Möbius class: kalata B1; trypsin inhibitor class: MCoTI-I, MCoTI-II), Biological activity, Application, Context (in vitro, in vivo or in silico), Loop (a), Graft length (b), Reference.
a Loop of the cyclotide in which the graft sequence is inserted.
b Number of amino acids in the grafted sequence.
Biological activities listed include: β-tryptase inhibitor; β-tryptase and HLE inhibitor; CXCR4 antagonist & HIV inhibitor; Hdm2/Hdmx antagonist; FMDV 3C protease inhibitor; HIV gp120 inhibitor; trypsin & matriptase inhibitor; BCR-ABL kinase inhibitor; SET antagonist; angiotensin receptor activator; FXIIa and FXa inhibitors; KLK4 inhibitor; VEGF-A antagonist; Dengue NS2B-NS3 protease inhibitor; melanocortin 4 receptor agonist; neuropilin-1 and -2 antagonist.
References listed: Lesniak et al., 2017; Poth et al., 2019; Aboye et al., 2012; Ji et al., 2013; Thongyoo et al., 2008; Jagadish et al., 2015; Sangphukieo et al., 2015; Sommerhoff et al., 2010; Thongyoo et al., 2009; Chan et al., 2011; Maaß et al., 2015; Quimbar et al., 2013; Huang et al., 2015b; Conibear et al., 2016; Chan et al., 2015; D’Souza et al., 2016; Chan et al., 2016; Aboye et al., 2016; Swedberg et al., 2016; Swedberg et al., 2018; Gunasekera et al., 2008; Gao et al., 2010; Eliasen et al., 2012; Wong et al., 2012; Getz et al., 2013; Wang et al., 2014b; Thell et al., 2016.


The usefulness of the TI grafting scaffold in particular is evident when looking through the published cyclotide grafting studies to date. Of the 25 articles describing novel grafted cyclotides in the last 11 years, 18 have made use of TI cyclotide scaffolds, as displayed in Figure 1.3. Fourteen publications use MCoTI-II specifically, and four use MCoTI-I. The remaining seven studies utilise kalata B1 from the Möbius cyclotide family. Bracelet cyclotides have not yet been adopted for grafting studies due to difficulties folding them in vitro (Qu et al., 2017).

Figure 1.3: Cyclotide scaffold frequency by publication. a. The 25 cyclotide grafting publications to date can be grouped into those using the trypsin inhibitor (red) and Möbius (blue) cyclotide scaffolds. Of those using the trypsin inhibitor cyclotide scaffold, most use MCoTI-II and four use MCoTI-I. b. The MCoTI-II scaffold indicating the inter-cystine loops. c. The number of publications using each loop of the MCoTI-II scaffold for grafting. Some publications use multiple loops in the same or multiple analogues.

In general, the MCoTI peptides display less non-specific toxicity to human cells than kalata B1 (Greenwood et al., 2007; Cascales et al., 2011), which is one reason why they are preferred for therapeutic grafting studies. The toxicity of MCoTIs has typically been studied alongside investigations into their cell-penetration activity. Both MCoTI-II and kalata B1 can penetrate cells, albeit by different mechanisms. Kalata B1 appears to enter cells via direct membrane interaction, whereas MCoTI-II enters via energy-dependent macropinocytosis (Cascales et al., 2011), which might explain the difference in general toxicity. MCoTI-II has also been the focus of a study to improve its cell penetration efficiency by synthesising a graft based on CTP512 (Huang et al., 2015a), which was in turn designed based on the protein transduction domain (PTD) of HIV-1 trans-activating factor (TAT) (Kim et al., 2006). When the sequence YGRARRR was grafted into loop 1 of MCoTI-II, internalisation of the cyclotide into cells was equivalent to that of TAT, which is considered to be the gold standard for cell penetrating peptides (CPPs) in the field. The preference for MCoTI peptides in grafting and cell internalisation studies was an important factor in choosing this scaffold as the primary cyclotide for investigation in this thesis.

1.4 Cyclotide production

1.4.1 Cyclotide synthesis

The design and testing of grafted cyclotides for pharmaceutical applications has largely been possible due to their suitability to chemical synthesis. This involves three stages: synthesis of the amino acid chain, followed by cyclisation and, finally, oxidative folding of the cyclotide to achieve the final product (Qu et al., 2017). Figure 1.4 shows the amino acid chain being assembled using solid phase peptide synthesis (SPPS). This involves adding each amino acid in the chain one-by-one until the full peptide is complete, starting with the C-terminal amino acid, which is attached to a bead, i.e. the solid phase. During the sequential addition of each amino acid, the α-amines of incoming amino acids are temporarily protected to ensure that coupling occurs only at the growing peptide chain. Protecting groups commonly used in cyclotide synthesis are Fmoc (fluorenylmethyloxycarbonyl) and Boc (tert-butyloxycarbonyl). Almost any amino acid sequence can be designed and then synthesised in this way, but a yield penalty is introduced with each amino acid addition. As cyclotides are generally under 40 amino acids in length, they are suitable for synthesis in this way, and so SPPS has been an invaluable tool for cyclotide research over the past 20+ years. After synthesis, the peptide chain is cyclised either using native chemical ligation (NCL) (Tam et al., 1999; Thongyoo et al., 2008) or peptide bond formation with side groups protected (Cheneval et al., 2014). NCL requires an N-terminal cysteine, so theoretically there are six variations of the cyclotide backbone that can be designed according to which cysteine will be at the N-terminus. If desired, the backbone can also be cyclised by joining any N-terminal residue with the C-terminal residue using a two-step chemical reaction, provided that the side chains of the amino acids remain protected during the reaction (Cheneval et al., 2014). The third step in producing the final cyclotide product is oxidative folding.
This step usually requires the most optimisation, and has been particularly problematic for bracelet cyclotides. Cyclotides from the bracelet subclass are the most abundant cyclotides known to occur in nature, but are the least represented in cyclotide engineering literature. Only two bracelet cyclotides, cycloviolacin O1 (Gunasekera et al., 2009) and cycloviolacin O2 (Aboye et al., 2008; Zheng et al., 2012), have been studied extensively regarding their correct folding in vitro. For this reason, Möbius and trypsin inhibitor cyclotides have been the focus of cyclotide grafting research. While in vitro synthesis of cyclotides has allowed grafting studies to be carried out, it might not be the most suitable method for producing grafted cyclotides in large quantities for commercial production due to environmental concerns. A recent review by Isidro-Llobet and colleagues suggests that the chemical waste generated by synthesis could be in the multi-tonne range per kilogram of peptide produced (Isidro-Llobet et al., 2019), and concerns have been raised since the early 2000s around the safety, regulatory and environmental aspects of the use of solvents in particular, such as diethyl ether, and hazardous acids such as TFA and HF (Andersson et al., 2000). Working with small volumes for laboratory-scale experiments is feasible, but scaling up for clinical trials or commercial production would require confronting these concerns. Looking back to cyclotide origins in planta and understanding their biosynthesis has provided inspiration for the greener cyclotide production options explored throughout this thesis.

Figure 1.4: Synthetic cyclotide production. SPPS (solid phase peptide synthesis) is used to assemble the peptide chain. Each chemically-protected amino acid is added sequentially to the growing peptide chain via the coupling steps. In between each coupling, the peptide chain is de-protected to remove the protecting group, Fmoc in this example. The linear chain is then cleaved from the solid phase (orange) and cyclised, followed by oxidative folding to produce the mature cyclotide.
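Two quantitative points from the synthesis description can be illustrated with a short script: the yield penalty that compounds with every coupling in SPPS, and the six possible linear precursors for NCL, one for each cysteine placed at the N-terminus. This is a minimal sketch under stated assumptions: the per-coupling efficiencies are illustrative values rather than measured ones, losses from deprotection, cleavage, cyclisation and folding are ignored, the kalata B1 sequence is quoted from the literature, and the function names are hypothetical.

```python
# Minimal sketch: compounding SPPS yield and NCL precursor permutations.
# ASSUMPTION: kalata B1 sequence as commonly reported; verify in CyBase.
KALATA_B1 = "GLPVCGETCVGGTCNTPGCTCSWPVCTRN"


def crude_chain_yield(n_residues, per_coupling_yield):
    """Fraction of chains that reach full length after SPPS.

    The first residue is pre-loaded on the resin, so a chain of n
    residues needs n - 1 couplings, each assumed equally efficient.
    """
    if n_residues < 1 or not 0.0 <= per_coupling_yield <= 1.0:
        raise ValueError("need n_residues >= 1 and a yield between 0 and 1")
    return per_coupling_yield ** (n_residues - 1)


def ncl_precursors(cyclic_seq):
    """Return every linear permutation of a cyclic sequence that starts
    with cysteine, as required for native chemical ligation."""
    return [
        cyclic_seq[i:] + cyclic_seq[:i]
        for i, residue in enumerate(cyclic_seq)
        if residue == "C"
    ]


if __name__ == "__main__":
    # Illustrative efficiencies only.
    for efficiency in (0.99, 0.98, 0.95):
        fraction = crude_chain_yield(len(KALATA_B1), efficiency)
        print(f"{efficiency:.2f} per coupling -> {fraction:.2f} full length")
    for precursor in ncl_precursors(KALATA_B1):
        print(precursor)
```

Because the yield is a power of the chain length, small differences in coupling efficiency matter more as peptides get longer, which is consistent with cyclotides of under 40 residues remaining practical targets for SPPS.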

1.4.2 Native cyclotide biosynthesis

Cyclotides are genetically encoded, ribosomally-synthesised peptides that undergo post-translational modification in the plant cell vacuole to produce the final cyclic product. Figure 1.5 shows the overall pathway for cyclotide biosynthesis in planta. Ribosomal synthesis occurs at the endoplasmic reticulum, where formation of disulfide bonds and folding of the cyclotide is presumed to occur (Jennings et al., 2001). The remaining processing, including excision of the mature cyclotide sequence with cyclisation, is predicted to occur in the vacuole (Conlan et al., 2011b; Conlan et al., 2012). The N-terminal segment of a cyclotide precursor typically consists of an endoplasmic reticulum (ER) signal peptide and an N-terminal pro-region. The mature cyclotide domain then follows, and is sometimes repeated several times (Jennings et al., 2001; Mahatmanto et al., 2015). If the mature cyclotide coding sequence is repeated, each repeat is also preceded by a portion of the N-terminal pro-region, termed the N-terminal repeat (NTR) (Jennings et al., 2001). The cyclotide domain is followed by a C-terminal propeptide (CTPP), which is important for cyclisation (Gillon et al., 2008; Harris et al., 2015).

Figure 1.5: Cyclotide biosynthesis and genetic components. a. The linear peptide chain is assembled on the ribosome and then moves through the endomembrane system to the vacuole. Oxidative folding is believed to occur during endomembrane transport, followed by cyclisation by AEPs in the vacuole. b. The generic cyclotide precursor gene contains an ER signal peptide, an N-terminal pro-region (NTPR), the cyclotide domain and a C-terminal propeptide (CTPP). Specific precursor genes can encode a single cyclotide, such as the OAK1 gene from Oldenlandia affinis that encodes kalata B1 (kB1). They can also encode multiple cyclotides, such as the TIPTOP2 gene from Momordica cochinchinensis which contains five cyclotide domains and one acyclotide (linear) domain, TI-V.
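The generic precursor architecture in Figure 1.5b can be written down as a small data structure. The sketch below is illustrative only: the class and field names are hypothetical, no real sequences are included, and the names and order of the five cyclotide domains in TIPTOP2 are placeholders because only their number is given here.

```python
# Minimal sketch of the precursor layout: ER signal, NTPR, one or more
# domains (each optionally preceded by an NTR), then the CTPP.
# ASSUMPTION: names are illustrative and sequences are left empty.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Domain:
    name: str
    cyclic: bool = True        # False for a linear (acyclotide) domain
    has_ntr: bool = False      # True if preceded by an N-terminal repeat


@dataclass
class Precursor:
    gene: str
    species: str
    domains: List[Domain] = field(default_factory=list)
    er_signal: str = ""        # sequence placeholder
    ntpr: str = ""             # sequence placeholder
    ctpp: str = ""             # sequence placeholder

    def cyclotide_count(self) -> int:
        """Number of domains expected to yield a cyclic product."""
        return sum(1 for d in self.domains if d.cyclic)


oak1 = Precursor("OAK1", "Oldenlandia affinis", [Domain("kB1")])

# Five cyclotide domains plus the linear TI-V; names and order of the
# cyclotide domains are placeholders.
tiptop2 = Precursor(
    "TIPTOP2",
    "Momordica cochinchinensis",
    [Domain(f"cyclotide_{i}", has_ntr=i > 1) for i in range(1, 6)]
    + [Domain("TI-V", cyclic=False, has_ntr=True)],
)

print(oak1.cyclotide_count(), tiptop2.cyclotide_count())   # 1 5
```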


Compared with our knowledge of the protein synthesis and trafficking pathways, very little is known about gene regulation of native cyclotides, except for several cis-regulatory elements from C. ternatea identified in a recent doctoral dissertation (Oguis, 2019). Although the regulatory elements for cyclotide expression are largely unidentified, several studies have demonstrated differential expression levels under certain circumstances, which may imply that their expression is regulated rather than ubiquitous. In the same thesis that identified several cis-regulatory elements in C. ternatea, it was reported that cyclotide expression was upregulated after exogenous application of three plant defence hormones: jasmonate, ethylene and salicylic acid (Oguis, 2019). Expression of cyclotides in Viola species differs between seasons in Sweden (Trabi et al., 2004b), and between geographic locations in India (Narayani et al., 2017). Moreover, differential expression can be seen within different tissues of the same plant (Gilding et al., 2016; Trabi et al., 2004a). The genetic regulation of cyclotide expression in relation to their differential expression represents a gap in cyclotide knowledge that has only just begun to be explored (Oguis, 2019), and so this thesis will focus instead on manipulating cyclotide expression in the context of coding sequences and post-translational modification.

AEP-mediated cyclisation

Unlike chemical synthesis, where the cyclotide backbone can be cyclised at almost any position, plant-produced cyclotides are cyclised at a precise position in loop 6. Accordingly, the first and last amino acids encoded in the mature cyclotide domain are both located in what will be loop 6 of the mature cyclotide. Asparaginyl endopeptidases (AEPs) (EC no. 3.4.22.34) were identified as the enzymes involved in cyclisation in planta and recognise an aspartate or asparagine residue followed by a specific C-terminal sequence (Gillon et al., 2008; Mylne et al., 2012; Mylne et al., 2011; Saska et al., 2007). These enzymes are sometimes referred to as cyclases in the context of cyclotide research, but this ability is something of an anomaly as AEPs are primarily peptidases, and are found in both plants and animals. AEPs are also referred to by several different names, including vacuolar processing enzymes (VPEs) (Hara-Nishimura et al., 1991), and legumains, due to early work on their discovery and protease activity in legume seeds (Kembhavi et al., 1993). Here they will be referred to as AEPs to be consistent with the use of the term throughout cyclotide literature. AEPs are cysteine proteases that are primarily involved in maturation of seed storage proteins as well as programmed plant cell death, and are similar to mammalian caspases with an equivalent function (Hatsugai et al., 2015; Hatsugai et al., 2006). However, it appears that some AEPs have also evolved to mediate peptide bond formation, which can result in the joining of the cyclotide backbone. Our understanding of this anomaly in AEP function has advanced in recent years, giving us a better understanding of the process of AEP-mediated cyclisation at a molecular level as well as of the structural features of cyclase-type AEPs. In 2017, Yang and colleagues demonstrated the striking influence of the ‘gatekeeper’ residue, Cys247, of the AEP OaAEP1 from the cyclotide-producing plant O. affinis (Yang et al., 2017). A Cys247Ala mutant of the enzyme was a more efficient ligase, cyclising a linear substrate 160 times faster than wild-type OaAEP1. The gatekeeper residue is suggested to act as a nucleophile filter that prevents water from permeating the active site and facilitating hydrolysis. Excluding water allows the N-terminus of the incoming peptide to act as the nucleophile, resulting in peptide bond formation. Not long after this report, Zauner and colleagues found that a non-hydrophobic residue at the P2’ position of a peptide substrate encouraged hydrolysis, whereas a hydrophobic residue encouraged cyclisation (Zauner et al., 2018). Their work utilised recombinantly produced AtLEGγ, an AEP from Arabidopsis thaliana, which is not a cyclotide-producing plant.

Expanding the model to include multiple residues that can influence AEP-cyclisation preference, a region termed the marker-of-ligase-activity (MLA) was shown to influence ligase/hydrolase preference (Jackson et al., 2018). The region was defined using mutation analysis of polymorphic regions of AEP ligases and was demonstrated to be predictive of cyclase activity when compared to sequences that were truncated or enriched with hydrophobic residues. Other regions or residues that correlate with AEP cyclase activity include the substrate pocket and polyproline loop (Jackson et al., 2018), and the LAD1 (gatekeeper region) and LAD2 (binding pocket) regions (Hemu et al., 2019). With a better understanding of AEP-dynamics and our ability to produce recombinant AEP for use in vitro, AEPs are becoming an attractive tool for peptide ligation and cyclisation.
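The substrate-side observations in this section can be summarised as a toy rule: AEP processing requires an Asn or Asp at the C-terminus of the mature domain, and the nature of the P2’ residue influences whether cyclisation or hydrolysis is encouraged. The sketch below encodes only those two statements and is not a predictive model. Its assumptions are that P2’ is taken as the second residue of the CTPP, that the hydrophobic residue set is one common convention, that the example tails are hypothetical, and that enzyme-side determinants such as the gatekeeper residue and the MLA region are ignored.

```python
# Toy rule of thumb for the substrate side of AEP processing.
# ASSUMPTIONS: P2' = second CTPP residue; hydrophobic set is one common
# convention; kalata B1 sequence as commonly reported; tails hypothetical.
KALATA_B1 = "GLPVCGETCVGGTCNTPGCTCSWPVCTRN"
HYDROPHOBIC = set("AVILMFWY")


def aep_tendency(mature_domain, ctpp):
    """Classify a precursor segment using the two substrate features
    described in the text. Returns a short descriptive string."""
    p1 = mature_domain[-1]          # residue recognised by the AEP
    if p1 not in ("N", "D"):
        return "no Asn/Asp at P1"
    if len(ctpp) < 2:
        return "P2' undefined"
    p2_prime = ctpp[1]              # second residue after the cleavage site
    if p2_prime in HYDROPHOBIC:
        return "cyclisation encouraged"
    return "hydrolysis encouraged"


if __name__ == "__main__":
    print(aep_tendency(KALATA_B1, "GLP"))   # hydrophobic P2' (Leu)
    print(aep_tendency(KALATA_B1, "GSP"))   # non-hydrophobic P2' (Ser)
    print(aep_tendency("GLPVCK", "GLP"))    # no Asn/Asp at P1
```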

1.4.3 Comparing native and synthetic cyclotide production

The physical separation of the steps of cyclotide synthesis, folding and cyclisation in the plant cell leads to a major difference between biosynthesis and chemical synthesis. Higher yields are achieved in vitro if the cyclotide is cyclised first, followed by oxidative folding (Daly et al., 1999). In plants, the cyclotide is folded in the ER before cyclisation in the vacuole (Jennings et al., 2001). We don’t know whether this is the optimal sequence of events for plants or if it is a result of the placement of organelles in the cell, and this reflects a lack of understanding of cyclotide processing and folding during endomembrane transport. Our inability to translate this process in vitro in some cases, such as folding of synthetic bracelet cyclotide precursors, demonstrates this lack of knowledge.

Recombinant expression of cyclotides in plants was utilised in early studies looking at their biosynthesis and to uncover the role of AEPs in cyclisation (Conlan et al., 2011a; Gillon et al., 2008; Saska et al., 2007), but this thesis explores the use of plants for cyclotide production rather than just research. The molecular mechanisms required to synthesise, cyclise and fold cyclotides correctly are already present within plant cells, and AEPs from one plant can be expressed in another to optimise expression (Poon et al., 2018; Jackson et al., 2018; Jackson et al., 2019). Therefore, it seems prudent to suggest production of grafted cyclotides on a commercial scale could be achieved by producing them recombinantly in plants. Heterologous expression in other organisms is certainly not ruled out, as bacterial and mammalian cell culture of other proteins is well established, but AEP activity for cyclisation has not been established for these systems.

1.5 Plants as production systems

Because proteins can be gene-encoded, they can be produced recombinantly, a process which has been used to the advantage of the pharmaceutical industry and has led to a rise in biologics as therapeutic molecules. Ideally, a commercial production system is cheap, efficient, scalable, safe and widely accessible. Bacteria, yeast, insect and mammalian cells are commonly used to produce recombinant proteins, because they grow quickly and can be grown as a homogeneous culture. It is easy to monitor and control their growth conditions and they can respond to changes in these conditions quickly due to their short generation time. They are generally regarded as cheap and efficient. However, it has been suggested that the production of pharmaceuticals in plants might extend the benefits of cell culture, with the added advantages that plants can be grown in less-specialised facilities and are highly versatile. Plant molecular farming to produce valuable products has seen slow but steady progress over the past 20-30 years. The idea of using plants to produce recombinant pharmaceuticals has been around since the mid-1980s when the first recombinant proteins were expressed in plants, including human growth hormone (Barta et al., 1986), IFNαD (De Zoeten et al., 1989), antibodies (Hiatt et al., 1989) and human serum albumin (Sijmons et al., 1990). The first vaccine produced in plants, for hepatitis B, was published in the early 1990s (Mason et al., 1992). However, compared to the uptake of bacterial and other cell-based production systems, it took a relatively long time for the first plant-made pharmaceutical to achieve FDA approval in 2012 (Elelyso™, Protalix Biotherapeutics). The slow progress has been attributed to the lack of a standardised production platform and therefore lack of good manufacturing practice (GMP)-compliant facilities (Stoger et al., 2014).
This is reflected in the fact that taliglucerase alfa, the first plant-made recombinant pharmaceutical, is made in cell suspension cultures in bioreactors, a technology already established for other types of cell cultures. The general consensus from recent reviews in the field is one of optimism for the near future of plant molecular farming. This is based on an increased number of production systems complying with GMP, establishment of post-translational modification engineering in plant cells (Stoger et al., 2014), insights into which sectors of the market should be focused upon (Paul et al., 2013; Paul et al., 2015), ongoing discussions around regulatory frameworks (Drake et al., 2017) and investigations into side-stream revenue from plant-made valuable products (Buyel, 2019). Plants as production systems for recombinant proteins offer several advantages over other types of recombinant protein production, including scalability and versatility, which will be outlined below. To be able to scale up production according to demand for a product is an important economic factor in producing any pharmaceutical. Demand could change with fluctuations in the occurrence of a disease or the development of resistance to current drugs. Plants are relatively easy to scale up, as the only resources required are more space and basic horticultural supplies, as opposed to building specialised bioreactors. With regulations around the environmental escape of genetically modified plants, the most expensive component of scale-up would probably be the building of containment glasshouses. Recent examples of proteins involved in scale-up studies include human serum albumin produced in rice (Chen et al., 2013), coagulation factor X produced in transgenic lettuce chloroplasts for the suppression of immune tolerance induction in haemophilia (Herzog et al., 2017), and a virus-like-protein as a vaccine for human papilloma virus produced in tobacco chloroplasts (Zahin et al., 2016). Clinical trials have recently been successful for a plant-made influenza vaccine (Pillet et al., 2019). Importantly, plants are also suitable for small-scale production, which is sometimes overlooked as an advantage of the technology, but is important in the case of rare medicines for which demand is low and which therefore do not reap the benefits of economies of scale. The cost of valuable therapeutics is often subsidised or affordable in some countries, but access to medicines is a fundamental human right for all people (The United Nations, 1948). Therefore, economical production costs for pharmaceuticals are desired across all levels of affluence.

Using plants as production systems for cyclotides was first explored using native producers, i.e. non-transgenic production of cyclotides in their native host plant. This was carried out using O. affinis, for which tissue and cell culture protocols were developed with the intention of creating plant-based cyclotide platforms (Seydel and Dörnenburg, 2006; Seydel et al., 2007). However, it was the use of transient expression in N. benthamiana that allowed the identification of AEPs as the facilitators of cyclisation and opened up the possibility of using plants for recombinant cyclotide production. The first wave of recombinant cyclotide expression in plants served as a tool to study biosynthetic pathways in plant cells. This was followed by the introduction of specific AEPs through co-expression with cyclotide genes, incorporating a full biosynthetic pathway into non-cyclotide-producing plants (Poon et al., 2018). Although cyclic products were detected in early studies, these relied on the activity of endogenous, non-specialised AEPs. It was not until 2018, with the co-expression of cyclase-type AEPs, that cyclic products became predominant (Poon et al., 2018). Table 1.2 summarises the reports of recombinant cyclotide expression in planta.

Some common trends are clear from the literature on cyclotide expression in plants. Exploring AEP compatibility has been the main focus of cyclotide expression in planta to date. A common theme is the use of the constitutive CaMV 35S promoter, combined in more recent studies with the p19 suppressor of gene silencing using the pEAQ-HT-Dest1 vector for transient expression (Sainsbury et al., 2009). This, along with the analysis of leaf tissue only in almost all publications to date, highlights a lack of exploration of recombinant expression in other tissues, such as seeds or roots, using either tissue-specific or constitutive promoters. Cyclotides have been identified in many plant tissues, suggesting that matching a cyclotide with its preferred tissue type could be worth exploring. So far, only one study has been published describing the recombinant production of MCoTI-II in planta (Mylne et al., 2012).

An important distinction to make is between the use of recombinant expression for research purposes and for production purposes, with the latter explored only relatively recently through co-expression of known cyclase AEPs alongside cyclotide constructs (Poon et al., 2018). It is worth noting that another class of cyclic peptides, the SFTIs from Helianthus annuus, are also cyclised by AEPs and have been investigated with the goal of moving towards commercial production of AEP-dependent cystine-rich cyclic peptides in plants (Jackson et al., 2019).


Table 1.2: Publications describing recombinant cyclotide expression in plants.

Purpose of study      Host plant      Mode       Tissue  Cyclotide expressed       Dominant product  % Cyclic product  Reference
Biosynthesis          N. benthamiana  Transient  Leaf    kalata B1                 Linear            ∼5%               Saska et al., 2007
Biosynthesis          N. benthamiana  Transient  Leaf    kalata B1                 Linear            nd                Gillon et al., 2008
                      N. tabacum      Stable     Leaf    kalata B1                 Linear            nd
                      A. thaliana     Stable     Leaf                              Linear            nd
Biosynthesis          N. benthamiana  Transient  Leaf    kalata B1                 Linear            nd                Conlan et al., 2011a
Biosynthesis          N. benthamiana  Transient  Leaf    kalata B1 & 22 analogues  Linear            10.30%            Conlan et al., 2012
Biosynthesis          A. thaliana     Stable     Seeds   MCoTI-I                   Cyclic            ∼45%              Mylne et al., 2012
                                                         MCoTI-II                                    ∼70%
                                                         MCoTI-IV                                    ∼55%
Cyclotide production  N. benthamiana  Transient  Leaf    kalata B1                 Cyclic            89%               Poon et al., 2018
                                                         kalata B2                 Cyclic            91%
                                                         kalata B3                 Cyclic            87%
                                                         CterM                     Cyclic            90%
                                                         KNK-kB1                   Cyclic            86%
                                                         DK-kB1                    Cyclic            78%
                                                         MOG3                      Cyclic            51%
                                                         kB1(T20K)                 Cyclic            61%
                      P. vulgaris     Transient  Leaf    kalata B1                 Cyclic            91%
                      L. sativa       Transient  Leaf    kalata B1                 Cyclic            83%
                      N. tabacum      Stable     Leaf    kalata B1                 Cyclic            97%
                      B. napus        Stable     Leaf    kalata B1                 Cyclic            100%
Biosynthesis          P. hybrida      Transient  Leaf    kalata B1                 Cyclic            nd                Jackson et al., 2018
                      N. benthamiana  Transient  Leaf                              Cyclic            ∼90%
Biosynthesis          N. benthamiana  Transient  Leaf    kalata B1                 Cyclic            nd                Rehm et al., 2019

1.5.1 Summary

Until now, most pharmaceuticals produced in plants have been large biologics such as antibodies or enzymes, whereas this project explores using plants to produce cyclotides. The motivation is that cyclotides originate from plants, so the molecular machinery required for their biosynthesis is compatible with plant-based systems. Limitations to producing cyclotides in plants include our limited understanding of the enzymes that perform cyclisation in planta, particularly whether they will be compatible with grafted cyclotides; our limited knowledge of which cyclotide genes can be expressed in which plant tissues; and how the plant cell environment enables or hinders cyclotide accumulation in different circumstances. This thesis aims to explore these topics, with a focus on the production of the cyclotide MCoTI-II and its grafted analogues.

1.5.2 Scope and outline of the thesis

The broad aim of this thesis is to explore the compatibility of MCoTI-II and grafted analogues of this cyclotide with plant-based production. This is explored across six chapters. The first chapter has introduced cyclotides and detailed their synthesis and biosynthesis to contextualise MCoTI-II production in planta. Chapter 2 begins with the design and testing of expression constructs for MCoTI-II and related cyclotides in planta, followed by Chapter 3, which describes the redesign of MCoTI-II to accommodate both AEP-mediated cyclisation and grafting. Chapter 4 then describes the manipulation of the leaf expression environment by co-expression of enhancing factors. Chapter 5 reports on a field trip to collect plant specimens of interest to cyclotide research. Chapter 6 then considers the findings of each research chapter together and outlines suggestions for future investigation. A general overview of each chapter is as follows:

Chapter 1 has introduced cyclotides and their potential for pharmaceutical applications, their synthesis and biosynthesis, and the prospect of producing cyclotides in planta. The popularity of MCoTI-II for molecular grafting studies is juxtaposed with the lack of investigation into in planta recombinant production of this cyclotide and its analogues. The aim to explore MCoTI-II expression in planta is proposed.

Chapter 2 explores MCoTI-II expression in several plant systems, including Arabidopsis thaliana, Nicotiana benthamiana and Lactuca sativa, and introduces a range of genetic constructs for cyclotide expression. The focus here is on manipulating the expression constructs, including elements of the precursor peptide and the mature cyclotide. This chapter determines which precursor proteins are beneficial for expression of MCoTI-II and analogues in A. thaliana seeds, and highlights the difficulty in expressing MCoTI-II in leaf tissue.

Chapter 3 investigates the compatibility between grafted cyclotides and AEPs to ensure AEP-mediated cyclisation is a viable tool for production. MCoTI-II is redesigned to be cyclised in loop 5 rather than loop 6, so that the cyclisation point can be retained when grafted sequences are introduced into loop 6, as is the case with most grafting studies. This chapter focuses on in vitro applications, and provides the groundwork for future applications in semi-enzymatic or fully recombinant production by establishing cyclotide-AEP compatibility.

Chapter 4 focuses on improving cyclotide accumulation in leaf tissue by changing the cellular environment. This is achieved by co-expression of the transcription factor LEC2 in N. benthamiana leaves, which induces a seed-like environment that potentially favours protein storage in the leaf. This chapter identifies LEC2 co-expression as a simple means for enhancing cyclotide accumulation in certain cases, and suggests that LEC2 could be a valuable tool for studying differences between cyclotide accumulation in seeds versus leaf tissue in the future. The knockdown of endogenous AEPs and alternative pathways are also discussed.

Chapter 5 outlines the discovery efforts undertaken during a field trip to the Kimberley region in Western Australia in 2018. The aim was to collect plant samples for testing for cyclotide content and/or the presence of proteases of interest. This included looking for plants that might contain robust and diverse AEPs for peptide cyclisation. This chapter serves as a record of the collection trip, including collection data, planning and suggestions for future discovery efforts. It also serves as a guide for those interested in cyclotide discovery who are less familiar with the logistics of plant collecting.

Chapter 6 provides insights and considerations to take forward based on the results presented in this thesis. The feasibility of production of MCoTI-II and grafted analogues is discussed, and the ways in which the research here can be taken forward are proposed.

References

Aboye, T. L., Clark, R. J., Craik, D. J., and Göransson, U. (2008). Ultra stable peptide scaffolds for protein engineering - synthesis and folding of the circular cystine knotted cyclotide cycloviolacin O2. ChemBioChem 9, pp. 103–113.
Aboye, T. L., Ha, H., Majumder, S., Christ, F., Debyser, Z., Shekhtman, A., Neamati, N., and Camarero, J. A. (2012). Design of a novel cyclotide-based CXCR4 antagonist with anti-human immunodeficiency virus (HIV)-1 activity. Journal of Medicinal Chemistry 55, pp. 10729–10734.
Aboye, T., Meeks, C., Majumder, S., Shekhtman, A., Rodgers, K., and Camarero, J. (2016). Design of a MCoTI-based cyclotide with angiotensin (1-7)-like activity. Molecules 21, p. 152.
Andersson, L., Blomberg, L., Flegel, M., Lepsa, L., Nilsson, B., and Verlander, M. (2000). Large-scale synthesis of peptides. Biopolymers Peptide Science 55, pp. 227–250.
Barta, A., Sommergruber, K., Thompson, D., Hartmuth, K., Matzke, M. A., and Matzke, A. J. M. (1986). The expression of a nopaline synthase - human growth-hormone chimeric gene in transformed tobacco and sunflower callus-tissue. Plant Molecular Biology 6, pp. 347–357.
Buyel, J. F. (2019). Plant molecular farming – Integration and exploitation of side streams to achieve sustainable biomanufacturing. Frontiers in Plant Science 9, pp. 1–17.
Cascales, L., Henriques, S. T., Kerr, M. C., Huang, Y. H., Sweet, M. J., Daly, N. L., and Craik, D. J. (2011). Identification and characterization of a new family of cell-penetrating peptides. Cyclic cell-penetrating peptides. Journal of Biological Chemistry 286, pp. 36932–36943.
Čemažar, M. and Craik, D. J. (2006). Factors influencing the stability of cyclotides: Proteins with a circular backbone and cystine knot motif. International Journal of Peptide Research and Therapeutics 12, pp. 253–260.
Chan, L. Y., Gunasekera, S., Henriques, S. T., Worth, N. F., Le, S. J., Clark, R. J., Campbell, J. H., Craik, D. J., and Daly, N. L. (2011). Engineering pro-angiogenic peptides using stable, disulfide-rich cyclic scaffolds. Blood 118, pp. 6709–6717.
Chan, L. Y., Craik, D. J., and Daly, N. L. (2015). Cyclic thrombospondin-1 mimetics: grafting of a thrombospondin sequence into circular disulfide-rich frameworks to inhibit endothelial cell migration. Bioscience Reports 35, e00270.
Chan, L. Y., Craik, D. J., and Daly, N. L. (2016). Dual-targeting anti-angiogenic cyclic peptides as potential drug leads for cancer therapy. Scientific Reports 6, p. 35347.
Chen, Z., He, Y., Shi, B., and Yang, D. (2013). Human serum albumin from recombinant DNA technology: Challenges and strategies. Biochimica et Biophysica Acta 1830, pp. 5515–5525.
Cheneval, O., Schroeder, C. I., Durek, T., Walsh, P., Huang, Y.-H., Liras, S., Price, D. A., and Craik, D. J. (2014). Fmoc-based synthesis of disulfide-rich cyclic peptides. The Journal of Organic Chemistry 79, pp. 5538–5544.
Claeson, P., Göransson, U., Johansson, S., Luijendijk, T., and Bohlin, L. (1998). Fractionation protocol for the isolation of polypeptides from plant biomass. Journal of Natural Products 61, pp. 77–81.


Clark, R. J., Daly, N. L., and Craik, D. J. (2006). Structural plasticity of the cyclic-cystine-knot framework: Implications for biological activity and drug design. Biochemical Journal 394, pp. 85–93.
Conibear, A. C., Chaousis, S., Durek, T., Rosengren, K. J., Craik, D. J., and Schroeder, C. I. (2016). Approaches to the stabilization of bioactive epitopes by grafting and peptide cyclization. Biopolymers 106, pp. 89–100.
Conlan, B. F., Colgrave, M. L., Gillon, A. D., Guarino, R., Craik, D. J., and Anderson, M. A. (2012). Insights into processing and cyclization events associated with biosynthesis of the cyclic peptide kalata B1. Journal of Biological Chemistry 287, pp. 28037–28046.
Conlan, B. F., Gillon, A. D., Barbeta, B. L., and Anderson, M. A. (2011a). Subcellular targeting and biosynthesis of cyclotides in plant cells. American Journal of Botany 98, pp. 2018–2026.
Conlan, B. F. and Anderson, M. A. (2011b). Circular micro-proteins and mechanisms of cyclization. Current Pharmaceutical Design 17, pp. 4318–4328.
Contreras, J., Elnagar, A. Y. O., Hamm-Alvarez, S. F., and Camarero, J. A. (2011). Cellular uptake of cyclotide MCoTI-I follows multiple endocytic pathways. Journal of Controlled Release 155, pp. 134–143.
Craik, D. J., Daly, N. L., Bond, T., and Waine, C. (1999). Plant cyclotides: A unique family of cyclic and knotted proteins that defines the cyclic cystine knot structural motif. Journal of Molecular Biology 294, pp. 1327–1336.
Craik, D. J., Fairlie, D. P., Liras, S., and Price, D. (2013). The future of peptide-based drugs. Chemical Biology and Drug Design 81, pp. 136–147.
D’Souza, C., Henriques, S. T., Wang, C. K., Cheneval, O., Chan, L. Y., Bokil, N. J., Sweet, M. J., and Craik, D. J. (2016). Using the MCoTI-II cyclotide scaffold to design a stable cyclic peptide antagonist of SET, a protein overexpressed in human cancer. Biochemistry 55, pp. 396–405.
Daly, N. L., Love, S., Alewood, P. F., and Craik, D. J. (1999). Chemical synthesis and folding pathways of large cyclic polypeptides: Studies of the cystine knot polypeptide kalata B1. Biochemistry 38, pp. 10606–10614.
Daly, N. L., Thorstholm, L., Greenwood, K. P., King, G. J., Rosengren, K. J., Heras, B., Martin, J. L., and Craik, D. J. (2013). Structural insights into the role of the cyclic backbone in a squash trypsin inhibitor. Journal of Biological Chemistry 288, pp. 36141–36148.
De Zoeten, G., Penswick, J., Horisberger, M., Ahl, P., Schultze, M., and Hohn, T. (1989). The expression, localization, and effect of a human interferon in plants. Virology 172, pp. 213–222.
Drake, P. M. W., Szeto, T. H., Paul, M. J., Teh, A. Y.-H., and Ma, J. K.-C. (2017). Recombinant biologic products versus nutraceuticals from plants - a regulatory choice? British Journal of Clinical Pharmacology 83, pp. 82–87.
Du, J., Chan, L. Y., Poth, A. G., and Craik, D. J. (2019). Discovery and characterization of cyclic and acyclic trypsin inhibitors from Momordica dioica. Journal of Natural Products 82, pp. 293–300.


Eliasen, R., Daly, N. L., Wulff, B. S., Andresen, T. L., Conde-Frieboes, K. W., and Craik, D. J. (2012). Design, synthesis, structural and functional characterization of novel melanocortin agonists based on the cyclotide kalata B1. Journal of Biological Chemistry 287, pp. 40493–40501.
Fotouhi, N. (2015). Peptide therapeutics. Peptide Chemistry and Drug Design. 1st ed. John Wiley & Sons, Inc. Chap. 1, pp. 1–10.
Gao, Y., Cui, T., and Lam, Y. (2010). Synthesis and disulfide bond connectivity-activity studies of a kalata B1-inspired cyclopeptide against dengue NS2B-NS3 protease. Bioorganic and Medicinal Chemistry 18, pp. 1331–1336.
Getz, J. A., Cheneval, O., Craik, D. J., and Daugherty, P. S. (2013). Design of a cyclotide antagonist of neuropilin-1 and -2 that potently inhibits endothelial cell migration. ACS Chemical Biology 8, pp. 1147–1154.
Gilding, E. K., Jackson, M. A., Poth, A. G., Henriques, S. T., Prentis, P. J., Mahatmanto, T., and Craik, D. J. (2016). Gene coevolution and regulation lock cyclic plant defence peptides to their targets. New Phytologist 210, pp. 717–730.
Gillon, A. D., Saska, I., Jennings, C. V., Guarino, R. F., Craik, D. J., and Anderson, M. A. (2008). Biosynthesis of circular proteins in plants. Plant Journal 53, pp. 505–515.
Göransson, U., Luijendijk, T., Johansson, S., Bohlin, L., and Claeson, P. (1999). Seven novel macrocyclic polypeptides from Viola arvensis. Journal of Natural Products 62, pp. 283–286.
Gran, L. (1970). An oxytocic principle found in Oldenlandia affinis DC. Meddelelser fra Norsk Farmaceutisk Selskap 12, pp. 173–180.
Gran, L. (1973a). Isolation of oxytocic peptides from Oldenlandia affinis by solvent extraction of tetraphenylborate complexes and chromatography on sephadex LH-20. Lloydia 36, pp. 207–208.
Gran, L. (1973b). Oxytocic principles of Oldenlandia affinis. Lloydia 36, pp. 174–178.
Greenwood, K. P., Daly, N. L., Brown, D. L., Stow, J. L., and Craik, D. J. (2007). The cyclic cystine knot miniprotein MCoTI-II is internalized into cells by macropinocytosis. International Journal of Biochemistry and Cell Biology 39, pp. 2252–2264.
Gruber, C. W., Elliott, A. G., Ireland, D. C., Delprete, P. G., Dessein, S., Göransson, U., Trabi, M., Wang, C. K., Kinghorn, A. B., Robbrecht, E., and Craik, D. J. (2008). Distribution and evolution of circular miniproteins in flowering plants. Plant Cell 20, pp. 2471–2483.
Gunasekera, S., Daly, N., Clark, R., and Craik, D. J. (2009). Dissecting the oxidative folding of circular cystine knot miniproteins: Development of hybrid cyclotide scaffolds for protein engineering. Antioxidants & Redox Signaling 11, pp. 971–980.
Gunasekera, S., Foley, F. M., Clark, R. J., Sando, L., Fabri, L. J., Craik, D. J., and Daly, N. L. (2008). Engineering stabilized vascular endothelial growth factor-A antagonists: Synthesis, structural characterization, and bioactivity of grafted analogues of cyclotides. Journal of Medicinal Chemistry 51, pp. 7697–7704.
Hara-Nishimura, I., Inoue, K., and Nishimura, M. (1991). A unique vacuolar processing enzyme responsible for conversion of several proprotein precursors into the mature forms. FEBS Letters 294, pp. 89–93.


Harris, K. S., Durek, T., Kaas, Q., Poth, A. G., Gilding, E. K., Conlan, B. F., Saska, I., Daly, N. L., Weerden, N. L. van der, Craik, D. J., and Anderson, M. A. (2015). Efficient backbone cyclization of linear peptides by a recombinant asparaginyl endopeptidase. Nature Communications 6, p. 10199.
Hatsugai, N., Kuroyanagi, M., Nishimura, M., and Hara-Nishimura, I. (2006). A cellular suicide strategy of plants: vacuole-mediated cell death. Apoptosis 11, pp. 905–911.
Hatsugai, N., Yamada, K., Goto-Yamada, S., and Hara-Nishimura, I. (2015). Vacuolar processing enzyme in plant programmed cell death. Frontiers in Plant Science 6, p. 234.
Heitz, A., Hernandez, J. F., Gagnon, J., Hong, T. T., Pham, T. T., Nguyen, T. M., Le-Nguyen, D., and Chiche, L. (2001). Solution structure of the squash trypsin inhibitor MCoTI-II. A new family for cyclic knottins. Biochemistry 40, pp. 7973–7983.
Hemu, X., El Sahili, A., Hu, S., Wong, K., Chen, Y., Wong, Y. H., Zhang, X., Serra, A., Goh, B. C., Darwis, D. A., Chen, M. W., Sze, S. K., Liu, C.-F., Lescar, J., and Tam, J. P. (2019). Structural determinants for peptide-bond formation by asparaginyl ligases. Proceedings of the National Academy of Sciences of the United States of America 116, pp. 11737–11746.
Henriques, S. T., Huang, Y.-H., Chaousis, S., Sani, M.-A., Poth, A. G., Separovic, F., and Craik, D. J. (2015). The prototypic cyclotide kalata B1 has a unique mechanism of entering cells. Chemistry and Biology 22, pp. 1087–1097.
Hernandez, J. F., Gagnon, J., Chiche, L., Nguyen, T. M., Andrieu, J. P., Heitz, A., Trinh Hong, T., Pham, T. T., and Le Nguyen, D. (2000). Squash trypsin inhibitors from Momordica cochinchinensis exhibit an atypical macrocyclic structure. Biochemistry 39, pp. 5722–5730.
Herzog, R. W., Nichols, T. C., Su, J., Zhang, B., Sherman, A., Merricks, E. P., Raymer, R., Perrin, G. Q., Häger, M., Wiinberg, B., and Daniell, H. (2017). Oral tolerance induction in hemophilia B dogs fed with transplastomic lettuce. Molecular Therapy 25, pp. 1–11.
Hiatt, A., Cafferkey, R., and Bowdish, K. (1989). Production of antibodies in transgenic plants. Nature 342, pp. 76–78.
Huang, Y. H., Colgrave, M. L., Clark, R. J., Kotze, A. C., and Craik, D. J. (2010). Lysine-scanning mutagenesis reveals a previously unidentified amendable face of the cyclotide kalata B1 for the optimisation of nematocidal activity. Journal of Biological Chemistry 285, pp. 10797–10805.
Huang, Y.-H., Chaousis, S., Cheneval, O., Craik, D. J., and Henriques, S. T. (2015a). Optimization of the cyclotide framework to improve cell penetration properties. Frontiers in Pharmacology 6, p. 17.
Huang, Y.-H., Henriques, S. T., Wang, C. K., Thorstholm, L., Daly, N. L., Kaas, Q., and Craik, D. J. (2015b). Design of substrate-based BCR-ABL kinase inhibitors using the cyclotide scaffold. Scientific Reports 5, p. 12974.
InnovateAg (2018). Sero-X overview. URL: https://innovate-ag.com.au/sero-x/ (visited on 10/18/2019).
Isidro-Llobet, A., Kenworthy, M. N., Mukherjee, S., Kopach, M. E., Wegner, K., Gallou, F., Smith, A. G., and Roschangar, F. (2019). Sustainability challenges in peptide synthesis and purification: From R&D to production. The Journal of Organic Chemistry 84, pp. 4615–4628.


Jackson, M. A., Gilding, E. K., Shafee, T., Harris, K. S., Kaas, Q., Poon, S., Yap, K., Jia, H., Guarino, R., Chan, L. Y., Durek, T., Anderson, M. A., and Craik, D. J. (2018). Molecular basis for the production of cyclic peptides by plant asparaginyl endopeptidases. Nature Communications 9, p. 2411.
Jackson, M. A., Yap, K., Poth, A. G., Gilding, E. K., Swedberg, J. E., Poon, S., Qu, H., Durek, T., Harris, K., Anderson, M. A., and Craik, D. J. (2019). Rapid and scalable plant-based production of a potent plasmin inhibitor peptide. Frontiers in Plant Science 10, p. 602.
Jagadish, K., Gould, A., Borra, R., Majumder, S., Mushtaq, Z., Shekhtman, A., and Camarero, J. A. (2015). Recombinant expression and phenotypic screening of a bioactive cyclotide against α-synuclein-induced cytotoxicity in baker’s yeast. Angewandte Chemie-International Edition 54, pp. 8390–8394.
Jennings, C., Rosengren, J., Daly, N., Plan, M., Stevens, J., Scanlon, M., Waine, C., Norman, D., Anderson, M., and Craik, D. (2005). Isolation, solution structure, and insecticidal activity of kalata B2, a circular protein with a twist: Do Möbius strips exist in nature? Biochemistry 44, pp. 851–860.
Jennings, C., West, J., Waine, C., Craik, D., and Anderson, M. (2001). Biosynthesis and insecticidal properties of plant cyclotides: the cyclic knotted proteins from Oldenlandia affinis. Proceedings of the National Academy of Sciences of the United States of America 98, pp. 10614–10619.
Ji, Y., Majumder, S., Millard, M., Borra, R., Bi, T., Elnagar, A. Y., Neamati, N., Shekhtman, A., and Camarero, J. A. (2013). In vivo activation of the p53 tumor suppressor pathway by an engineered cyclotide. Journal of the American Chemical Society 135, pp. 11623–11633.
Kamimori, H., Hall, K., Craik, D. J., and Aguilar, M. I. (2005). Studies on the membrane interactions of the cyclotides kalata B1 and kalata B6 on model membrane systems by surface plasmon resonance. Analytical Biochemistry 337, pp. 149–153.
Kembhavi, A., Buttle, D., Knight, C., and Barrett, A. (1993). The two cysteine endopeptidases of legume seeds: purification and characterization by use of specific fluorometric assays. Archives of Biochemistry and Biophysics 303, pp. 208–213.
Kim, D., Jeon, C., Kim, J. H., Kim, M. S., Yoon, C. H., Choi, I. S., Kim, S. H., and Bae, Y. S. (2006). Cytoplasmic transduction peptide (CTP): New approach for the delivery of biomolecules into cytoplasm in vitro and in vivo. Experimental Cell Research 312, pp. 1277–1288.
Koehbach, J. and Gruber, C. W. (2015). Chapter Three - Cyclotides in the Rubiaceae. Advances in Botanical Research. Ed. by D. J. Craik. Vol. 76. Academic Press, pp. 51–78.
Le Nguyen, D., Heitz, A., Chiche, L., Castro, B., Boigegrain, R., Favel, A., and Coletti-Previero, M. (1990). Molecular recognition between proteases and new bioactive microproteins with a knotted structure. Biochimie 72, pp. 431–435.
Lesniak, W. G., Aboye, T., Chatterjee, S., Camarero, J. A., and Nimmagadda, S. (2017). In vivo evaluation of an engineered cyclotide as specific CXCR4 imaging reagent. Chemistry - A European Journal 23, pp. 14469–14475.
Maaß, F., Wustehube-Lausch, J., Dickgiesser, S., Valldorf, B., Reinwarth, M., Schmoldt, H. U., Daneschdar, M., Avrutina, O., Sahin, U., and Kolmar, H. (2015). Cystine-knot peptides targeting cancer-relevant human cytotoxic T lymphocyte-associated antigen 4 (CTLA-4). Journal of Peptide Science 21, pp. 651–660.
Mahatmanto, T., Mylne, J. S., Poth, A. G., Swedberg, J. E., Kaas, Q., Schaefer, H., and Craik, D. J. (2015). The evolution of Momordica cyclic peptides. Molecular Biology and Evolution 32, pp. 392–405.
Mason, H. S., Lam, D. M., and Arntzen, C. J. (1992). Expression of hepatitis B surface antigen in transgenic plants. Proceedings of the National Academy of Sciences of the United States of America 89, pp. 11745–11749.
Mylne, J. S., Chan, L. Y., Chanson, A. H., Daly, N. L., Schaefer, H., Bailey, T. L., Nguyencong, P., Cascales, L., and Craik, D. J. (2012). Cyclic peptides arising by evolutionary parallelism via asparaginyl-endopeptidase–mediated biosynthesis. The Plant Cell 24, pp. 2765–2778.
Mylne, J. S., Colgrave, M. L., Daly, N. L., Chanson, A. H., Elliott, A. G., McCallum, E. J., Jones, A., and Craik, D. J. (2011). Albumins and their processing machinery are hijacked for cyclic peptides in sunflower. Nature Chemical Biology 7, pp. 257–259.
Narayani, M., Chadha, A., and Srivastava, S. (2017). Cyclotides from the Indian medicinal plant Viola odorata (Banafsha): Identification and characterization. Journal of Natural Products 80, pp. 1972–1980.
Nguyen, G. K. T., Zhang, S., Ngan, T. K. N., Phuong, Q. T. N., Chiu, M. S., Hardjojo, A., and Tam, J. P. (2011). Discovery and characterization of novel cyclotides originated from chimeric precursors consisting of albumin-1 chain a and cyclotide domains in the Fabaceae family. Journal of Biological Chemistry 286, pp. 24275–24287.
Oguis, G. K., Gilding, E. K., Jackson, M. A., and Craik, D. J. (2019). Butterfly pea (Clitoria ternatea), a cyclotide-bearing plant with applications in agriculture and medicine. Frontiers in Plant Science 10, p. 645.
Oguis, G. K. (2019). Clitoria ternatea (butterfly pea) cyclotides: Insights on functional diversity, regulation and biotechnological applications. PhD thesis. The University of Queensland, Brisbane.
Oguis, G. K., Kan, M.-W., and Craik, D. J. (2015). Chapter Seven - Natural functions and structure–activity relationships of cyclotides. Advances in Botanical Research. Ed. by D. J. Craik. Vol. 76. Academic Press, pp. 187–226.
Paul, M., Teh, A., Twyman, R., and Ma, J. (2013). Target product selection - Where can molecular pharming make the difference? Current Pharmaceutical Design 19, pp. 5478–5485.
Paul, M. J., Thangaraj, H., and Ma, J. K.-C. (2015). Commercialization of new biotechnology: a systematic review of 16 commercial case studies in a novel manufacturing sector. Plant Biotechnology Journal 13, pp. 1209–1220.
Pillet, S., Couillard, J., Trépanier, S., Poulin, J.-F., Yassine-Diab, B., Guy, B., Ward, B. J., and Landry, N. (2019). Immunogenicity and safety of a quadrivalent plant-derived virus like particle influenza vaccine candidate - two randomized phase II clinical trials in 18 to 49 and ≥50 years old adults. PLOS ONE 14, pp. 1–26.


Poon, S., Harris, K. S., Jackson, M. A., McCorkelle, O. C., Gilding, E. K., Durek, T., Weerden, N. L. van der, Craik, D. J., and Anderson, M. A. (2018). Co-expression of a cyclizing asparaginyl endopeptidase enables efficient production of cyclic peptides in planta. Journal of Experimental Botany 69, pp. 633–641.
Poth, A. G., Colgrave, M. L., Philip, R., Kerenga, B., Daly, N. L., Anderson, M. A., and Craik, D. J. (2011). Discovery of cyclotides in the Fabaceae plant family provides new insights into the cyclization, evolution, and distribution of circular proteins. ACS Chemical Biology 6, pp. 345–355.
Poth, A. G., Mylne, J. S., Grassl, J., Lyons, R. E., Millar, A. H., Colgrave, M. L., and Craik, D. J. (2012). Cyclotides associate with leaf vasculature and are the products of a novel precursor in Petunia (Solanaceae). Journal of Biological Chemistry 287, pp. 27033–27046.
Poth, A. G., Huang, Y.-H., Le, T. T., Kan, M.-W., and Craik, D. J. (2019). Pharmacokinetic characterization of kalata B1 and related therapeutics built on the cyclotide scaffold. International Journal of Pharmaceutics 565, pp. 437–446.
Qu, H., Smithies, B. J., Durek, T., and Craik, D. J. (2017). Synthesis and protein engineering applications of cyclotides. Australian Journal of Chemistry 70, p. 152.
Quimbar, P., Malik, U., Sommerhoff, C. P., Kaas, Q., Chan, L. Y., Huang, Y.-H., Grundhuber, M., Dunse, K., Craik, D. J., Anderson, M. A., and Daly, N. L. (2013). High-affinity cyclic peptide matriptase inhibitors. Journal of Biological Chemistry 288, pp. 13885–13896.
Rehm, F. B. H., Jackson, M. A., De Geyter, E., Yap, K., Gilding, E. K., Durek, T., and Craik, D. J. (2019). Papain-like cysteine proteases prepare plant cyclic peptide precursors for cyclization. Proceedings of the National Academy of Sciences, pp. 7831–7936.
Saether, O., Craik, D. J., Campbell, I. D., Sletten, K., Juul, J., and Norman, D. G. (1995). Elucidation of the primary and three-dimensional structure of the uterotonic polypeptide kalata B1. Biochemistry 34, pp. 4147–4158.
Sainsbury, F., Thuenemann, E. C., and Lomonossoff, G. P. (2009). pEAQ: versatile expression vectors for easy and quick transient expression of heterologous proteins in plants. Plant Biotechnology Journal 7, pp. 682–693.
Sangphukieo, A., Nawae, W., Laomettachit, T., Supasitthimethee, U., and Ruengjitchatchawalya, M. (2015). Computational design of hypothetical new peptides based on a cyclotide scaffold as HIV gp120 inhibitor. PLoS ONE 10, e0139562.
Saska, I., Gillon, A. D., Hatsugai, N., Dietzgen, R. G., Hara-Nishimura, I., Anderson, M. A., and Craik, D. J. (2007). An asparaginyl endopeptidase mediates in vivo protein backbone cyclization. Journal of Biological Chemistry 282, pp. 29721–29728.
Schöpke, T., Hasan Agha, M. I., Kraft, R., Otto, A., and Hiller, K. (1993). Hämolytisch aktive komponenten aus Viola tricolor L. und Viola arvensis Murray. Scientia Pharmaceutica 61, pp. 145–153.
Seydel, P., Gruber, C. W., Craik, D. J., and Dörnenburg, H. (2007). Formation of cyclotides and variations in cyclotide expression in Oldenlandia affinis suspension cultures. Applied Microbiology and Biotechnology 77, pp. 275–284.

25 CHAPTER 1. INTRODUCTION

Seydel, P. and Dörnenburg, H. (2006). Establishment of in vitro plants, cell and tissue cultures from Oldenlandia affinis for the production of cyclic peptides. Plant Cell, Tissue and Organ Culture 85, pp. 247–255. Shenkarev, Z. O., Nadezhdin, K. D., Sobol, V. A., Sobol, A. G., Skjeldal, L., and Arseniev, A. S. (2006). Conformation and mode of membrane interaction in cyclotides - Spatial structure of kalata B1 bound to a dodecylphosphocholine micelle. FEBS Journal 273, pp. 2658–2672. Sijmons, P. C., Dekker, B. M. M., Schrammeijer, B., Verwoerd, T. C., Elzen, P. J. M. van den, and Hoekema, A. (1990). Production of correctly processed human serum albumin in transgenic plants. Bio/Technology 8, pp. 217–221. Simonsen, S. M., Sando, L., Rosengren, K. J., Wang, C. K., Colgrave, M. L., Daly, N. L., and Craik, D. J. (2008). scanning mutagenesis of the prototypic cyclotide reveals a cluster of residues essential for bioactivity. Journal of Biological Chemistry 283, pp. 9805–9813. Sommerhoff, C. P., Avrutina, O., Schmoldt, H.-U., Gabrijelcic-Geiger, D., Diederichsen, U., and Kolmar, H. (2010). Engineered cystine knot miniproteins as potent inhibitors of human mast cell tryptase β. Journal of Molecular Biology 395, pp. 167–175. Stoger, E., Fischer, R., Moloney, M., and Ma, J. K.-C. (2014). Plant molecular pharming for the treatment of chronic and infectious diseases. Annual Review of Plant Biology 65, pp. 743–768. Swedberg, J. E., Ghani, H. A., Harris, J. M., Veer, S. J. de, and Craik, D. J. (2018). Potent, selective, and cell-penetrating inhibitors of kallikrein-related peptidase 4 based on the cyclic peptide MCoTI-II. ACS Medicinal Chemistry Letters 9, pp. 1258–1262. Swedberg, J. E., Mahatmanto, T., Abdul Ghani, H., De Veer, S. J., Schroeder, C. I., Harris, J. M., and Craik, D. J. (2016). Substrate-guided design of selective FXIIa inhibitors based on the plant- derived Momordica cochinchinensis trypsin inhibitor-II (MCoTI-II) scaffold. Journal of Medicinal Chemistry 59, pp. 
7287–7292. Tam, J. P., Lu, Y.-A., and Yu, Q. (1999). Thia zip reaction for synthesis of large cyclic peptides: Mechanisms and applications. Journal of the American Chemical Society 121, pp. 4316–4324. The United Nations (1948). Universal declaration of human rights. URL: https://www.un.org/en/ universal-declaration-human-rights/ (visited on 01/18/2020). Thell, K., Hellinger, R., Sahin, E., Michenthaler, P., Gold-Binder, M., Haider, T., Kuttke, M., Li- utkeviciˇ ut¯ e,˙ Z., Göransson, U., Gründemann, C., Schabbauer, G., and Gruber, C. W. (2016). Oral activity of a nature-derived cyclic peptide for the treatment of multiple sclerosis. Proceedings of the National Academy of Sciences 113, pp. 3960–3965. Thongyoo, P., Bonomelli, C., Leatherbarrow, R. J., and Tate, E. W. (2009). Potent inhibitors of β- tryptase and human leukocyte elastase based on the MCoTI-II scaffold. Journal of Medicinal Chemistry 52, pp. 6197–6200. Thongyoo, P., Roqué-Rosell, N., Leatherbarrow, R. J., and Tate, E. W. (2008). Chemical and biomimetic total syntheses of natural and engineered MCoTI cyclotides. Organic & Biomolecular Chemistry 6, p. 1462.

26 REFERENCES

Trabi, M. and Craik, D. J. (2004a). Tissue-specific expression of head-to-tail cyclized miniproteins in Violaceae and structure determination of the root cyclotide Viola hederacea root cyclotide1. Plant Cell 16, pp. 2204–2216. Trabi, M., Svangard, E., Herrmann, A., Göransson, U., Claeson, P., Craik, D. J., and Bohlin, L. (2004b). Variations in cyclotide expression in Viola species. Journal of Natural Products 67, pp. 806–810. Veer, S. J. de, Kan, M.-W., and Craik, D. J. (2019). Cyclotides: From structure to function. Chemical Reviews 119, pp. 12375–12421. Wang, C. K., Hu, S. H., Martin, J. L., Sjogren, T., Hajdu, J., Bohlin, L., Claeson, P., Göransson, U., Rosengren, K. J., Tang, J., Tan, N. H., and Craik, D. J. (2009). Combined X-ray and NMR analysis of the stability of the cyclotide cystine knot fold that underpins its insecticidal activity and potential use as a drug scaffold. Journal of Biological Chemistry 284, pp. 10672–10683. Wang, C. K., Kaas, Q., Chiche, L., and Craik, D. J. (2008). CyBase: A database of cyclic protein sequences and structures, with applications in protein discovery and engineering. Nucleic Acids Research 36, pp. D206–D210. Wang, C. K., King, G. J., Northfield, S. E., Ojeda, P. G., and Craik, D. J. (2014a). Racemic and quasi-racemic x-ray structures of cyclic disulfide-rich peptide drug scaffolds. Angewandte Chemie- International Edition 53, pp. 11236–11241. Wang, C. K., Gruber, C. W., Cemažar,˘ M., Siatskas, C., Tagore, P., Payne, N., Sun, G., Wang, S., Bernard, C. C., and Craik, D. J. (2014b). Molecular grafting onto a stable framework yields novel cyclic peptides for the treatment of multiple sclerosis. ACS Chemical Biology 9, pp. 156–163. Witherup, K. M., Bogusky, M. J., Anderson, P. S., Ramjit, H., Ransom, R. W., Wood, T., and Sardana, M. (1994). Cyclopsychotride A, a biologically active, 31-residue cyclic peptide isolated from Psychotria longipes. Journal of Natural Products 57, pp. 1619–1625. Wong, C. T. T., Rowlands, D. 
K., Wong, C. H., Lo, T. W. C., Nguyen, G. K. T., Li, H. Y., and Tam, J. P. (2012). Orally active peptidic bradykinin B-1 receptor antagonists engineered from a cyclotide scaffold for inflammatory pain treatment. Angewandte Chemie-International Edition 51, pp. 5620– 5624. Yang, R., Wong, Y. H., Nguyen, G. K., Tam, J. P., Lescar, J., and Wu, B. (2017). Engineering a catalytically efficient recombinant protein ligase. Journal of the American Chemical Society 139, pp. 5351–5358. Zahin, M., Joh, J., Khanal, S., Husk, A., Mason, H., Warzecha, H., Ghim, S.-J., Miller, D. M., Matoba, N., and Jenson, A. B. (2016). Scalable production of HPV16 L1 protein and VLPs from tobacco leaves. PLOS ONE 11, e0160995. Zauner, F. B., Elsasser, B., Dall, E., Cabrele, C., and Brandstetter, H. (2018). Structural analyses of Arabidopsis thaliana legumain γ reveal the differential recognition and processing of and ligation substrates. Journal of Biological Chemistry 8, pp. 8934–8946. Zheng, J. S., Tang, S., Guo, Y., Chang, H. N., and Liu, L. (2012). Synthesis of cyclic peptides and cyclic proteins via ligation of peptide hydrazides. ChemBioChem 13, pp. 542–546.


Chapter 2

Recombinant expression of MCoTI-II: molecular construct design

2.1 Overview

The seeds of the tropical fruit Momordica cochinchinensis produce the cyclotide MCoTI-II (Hernandez et al., 2000) which has been extensively studied regarding its structure (Felizmenio-Quimio et al., 2001; Heitz et al., 2001), molecular dynamics (Puttamadappa et al., 2010; Daly et al., 2013), cell penetration (Huang et al., 2015a; Contreras et al., 2011) and potential for drug design (Camarero et al., 2019). These studies have relied heavily on chemical synthesis as a rapid and efficient alternative to obtaining the cyclotide directly from the native source and to generate new amino acid sequence analogues. The ability to produce MCoTI-II and analogues in a recombinant system would provide a more environmentally friendly alternative to chemical synthesis and benefit laboratories without synthesis expertise wanting to study this promising cyclotide. As outlined in the introduction to this thesis, using plants for recombinant production of cyclotides offers the specific advantage of accommodating the full biosynthetic pathway, including cyclisation by AEPs to produce the mature cyclic product (Poon et al., 2018).

Expression of MCoTI-II in plants is a relatively new undertaking despite its popularity for laboratory studies. Until now, much of the published in planta work for cyclic peptide production has focused on the prototypical cyclotide kalata B1 (Poon et al., 2018), as well as a smaller AEP-compatible cyclic peptide from sunflower, SFTI-1 (Jackson et al., 2018). Kalata B1 has been used for most cyclotide biosynthesis studies (Saska et al., 2007; Gillon et al., 2008; Conlan et al., 2011; Conlan et al., 2012; Jackson et al., 2018; Rehm et al., 2019), and to demonstrate in planta expression of predominantly cyclic product when transiently co-expressed with AEPs in N. benthamiana leaves (Poon et al., 2018). The only study describing MCoTI-II expression in planta was published in 2012 (Mylne et al., 2012). To assess whether an AEP was required for biosynthesis of MCoTI cyclotides, the native precursor, TIPTOP2, was expressed in A. thaliana seeds with and without endogenous AEP expression. Predominantly cyclic MCoTI-II was detected in the wild-type system, along with related cyclotides encoded in the same multi-domain gene. No cyclic or linear MCoTI-II was detected in the aep null mutant (Mylne et al., 2012). Thus, MCoTI-II can be expressed in A. thaliana seeds and is dependent on AEP cyclisation, but it has not been the focus of optimisation in the way that kalata B1 has.

Although kalata B1 has been expressed in planta routinely to study cyclotide biosynthesis and production, differences between cyclotide precursor genes and propeptide processing mean that expression of other cyclotides will not necessarily follow the same narrative. Cyclotide precursor peptides contain several elements that contribute to intracellular localisation and, importantly, compatibility with cyclising AEPs. These elements vary between cyclotide precursors, creating a complex range of options for optimising the precursor protein for successful maturation. In naturally-occurring cyclotides, it is probable that the propeptide has co-evolved alongside a cyclising AEP in the same plant, resulting in specific sequences for compatibility between the two. For example, the propeptide for the prototypic cyclotide, kalata B1, contains a C-terminal propeptide (CTPP) consisting of the amino acids GLPSLAA, of which the ‘GL’ has been demonstrated to be necessary for cyclising activity by the endogenous AEP, OaAEP1b (Harris et al., 2015). For the cyclic peptide SFTI-1, the precursor peptide is encoded by the Paws1 gene, which encodes SFTI-1 as well as the small and large subunits of a seed albumin (Mylne et al., 2011). This information was incorporated into the design of genetic constructs used in this chapter.

The cyclotide precursor can be altered to maximise the chances of compatibility with an AEP in a transgenic system, but it might not be possible to co-express a cyclotide precursor with the desired AEP partner in all cases. This could occur for new designer cyclotides where a compatible AEP is not predetermined, when the native AEP partner is unknown, or when the native cyclotide-AEP pair is not compatible within a foreign plant system or tissue type. To increase the chances of cyclotide-AEP compatibility in these cases, a cyclotide could be expressed from another cyclotide precursor by replacing the mature cyclotide coding sequence with the cyclotide of choice. Kalata B1 has been successfully expressed from different precursor genes, including the CterM gene from Clitoria ternatea (Poon et al., 2018), and SFTI-1 has been expressed from the kalata B1 gene, Oak1 (Jackson et al., 2019), although with its native CTPP ‘GLDN’. Based on these findings it is predicted that the choice of precursor peptide might also affect MCoTI-II expression.

To explore cyclisation efficiency it is now common practice to co-express different cyclising AEPs with cyclotide constructs in N. benthamiana (Poon et al., 2018; Jackson et al., 2018; Rehm et al., 2019). This has been used as a tool to improve cyclotide expression (Poon et al., 2018) as well as to improve our understanding of cyclising AEPs compared to those preferring hydrolysis (Jackson et al., 2018). However, the successful pairing of cyclotide and AEP can differ on a case-by-case basis. In one study, three grafted analogues of kalata B1 were expressed more successfully alongside OaAEP1b in N. benthamiana, whereas a fourth analogue was expressed more efficiently alongside OaAEP3 (Poon et al., 2018). Another study showed that kalata B1 is expressed more efficiently in N. benthamiana when co-expressed with CtAEP1 than with three other AEPs, including its native AEP partner OaAEP1b. The preference for this AEP diminished when the C-terminal residue was mutated from N to D (Jackson et al., 2019).

A complex picture of cyclotide expression emerges when comparing different cyclotide precursors and considering how, on a case-by-case basis, these influence expression in planta. To reproduce MCoTI-II biosynthesis in vitro in this chapter, changes were made to the precursor peptide, the CTPP, the choice of AEP to co-infiltrate, and the plant species and tissue in which expression of MCoTI-II was attempted. In summary, the aim of this chapter was to determine whether MCoTI-II and analogues can be recombinantly expressed in plants and whether this could be optimised using different precursor construct designs in different plant tissues.

2.2 Methods

2.2.1 Genetic constructs and cloning

The coding sequences for MCoTI-II and [K6V]MCoTI-II were cloned from existing plasmids in-house. Grafted MCoTI-II sequences and novel loop 5 variants of MCoTI-II were synthesised by Integrated DNA Technologies Inc. (IDT) as gBlock® DNA fragments. These were PCR amplified using primers to add attB sites for Gateway® cloning. The purified PCR products were recombined with the entry vector, pDONR221 (Figure 2.1), and cloned using TOP10 E. coli chemically competent cells.

Figure 2.1: Cloning vectors pDONR221, pOH123 and pEAQ-HT-Dest1. TOP: The pDONR221™ (Thermo Fisher-Invitrogen) SpecR entry vector contains spectinomycin resistance (blue) and attP sites for Gateway® cloning (yellow). BOTTOM LEFT: The pOH123 vector is derived from pMDC123 (TAIR accession = Vector:1009003750) and contains the Oleosin promoter and Hsp7 terminator (pink) to control gene expression. Gateway® attR cloning sites are indicated in yellow. BOTTOM RIGHT: The pEAQ-HT-Dest1 vector was developed by Sainsbury and colleagues (2009) and contains the p19 suppressor of gene silencing (blue) and Cowpea Mosaic virus (CPMV) 5’ UTR as well as attR Gateway® cloning sites (yellow). Sequence maps were created using SnapGene®.


Selection using LB agar with 100 µg/mL spectinomycin was carried out to identify individual E. coli colonies, which were miniprepped and sequenced by Sanger sequencing at the Australian Genome Research Facility (AGRF), Brisbane, Australia. Single base pair mutations (K-V, N-D) were introduced into MCoTI-II analogues using primers for site-directed mutagenesis. Plasmids with confirmed gene sequences were recombined with the pOH123 vector for expression in A. thaliana, directed by the Oleosin promoter (Figure 2.1). These were cloned in TOP10 E. coli cells followed by kanamycin selection, miniprep, and sequencing as described for pDONR221, before transformation of the plasmids into the A. tumefaciens strain EHA105. A. tumefaciens cells were transformed via electroporation to introduce the plasmid and selected for on 50 µg/mL rifampicin/kanamycin LB agar. For expression in N. benthamiana, genes of interest were cloned into the destination vector pEAQ-HT-Dest1 (Sainsbury et al., 2009) (Figure 2.1) instead of pOH123 as described above, and were transformed into the A. tumefaciens strain LBA4404. Antibiotic selection and electroporation procedures were the same as for the EHA105 strain. Genetic constructs for the AEPs OaAEP1b, PxAEP3b, CtAEP1, McAEP1, McAEP2 and GFP used for co-expression in N. benthamiana, and the linear kalata B6 (kB6) internal standard, were obtained from in-house plasmids.

2.2.2 Plant growth and maintenance

Arabidopsis thaliana

A. thaliana seeds of the Columbia ecotype were stratified for four days at 4°C, either on damp soil or suspended in 0.01% agarose. The latter were sown onto damp soil after stratification using a pipette to distribute the suspended seeds. Seedlings were germinated and maintained in a 16:8 light:dark photoperiod in a temperature-controlled growth room at 24°C. Cool white fluorescent lighting was used for illumination. All plants were grown in small plastic pots sitting in plastic trays. Care was taken not to over-water young seedlings by adding water only when there was none left in the trays, but before the soil completely dried out. This produced healthy-looking plants and prevented algal growth on the top layer of soil. Mature plants were watered every 2–3 days.

Nicotiana benthamiana

N. benthamiana seeds were sown onto hydrated Jiffy pellets (42mm Jiffy-7® Pellet) pre-placed in plastic trays with several centimetres of water. Plants were grown in a 16:8 light:dark photoperiod in temperature controlled conditions at 26°C under LED lighting (AP67 spectrum, Valoya Ltd, Helsinki, Finland). Plants were watered as necessary, 3–4 times per week, by adding several cm of water to the trays. Plants were fertilised using standard hydroponic fertilisers at 2 and 4 weeks post germination.

Lactuca sativa

Heirloom seeds for Cos, Oakleaf Green and Black-seeded lettuce varieties were sown onto hydrated Jiffy pellets (42mm Jiffy-7® Pellet) and placed in plastic trays. Germination was carried out at 24°C after which plants were moved to a 26°C growth room. Plants were watered regularly and fertilised in the same time frame as N. benthamiana.

2.2.3 Plant transformation

Stable transformation of Arabidopsis thaliana seeds

A. thaliana plants were transformed at around four weeks post-germination when the flowers were present but mostly unopened. The floral dip method was used essentially as described by Weigel and Glazebrook (2002) but with a simplified infiltration buffer consisting of 5% sucrose and 50 µL/L Silwet L-77 (Lehle Seeds, Texas, USA). Inflorescences were submerged in A. tumefaciens strain EHA105 containing the constructs of interest within the pOH123 vector and agitated with a plastic Pasteur pipette for two minutes. Dipped plants were laid flat in trays and covered with plastic wrap for 24 hours away from direct lighting before being placed back under normal growth conditions for two to three weeks. Watering was continued until seed pods were fully mature and browning, at which point watering was ceased and the siliques allowed to dry on the plant before harvesting. The first generation of potential transgenic seeds were sown on soil as above and sprayed with Basta® non-selective herbicide (glufosinate-ammonium) daily post-germination to select for positive transformants (T1 seeds). Selected lines were transferred to individual pots after the herbicide-susceptible seedlings around them had withered (four to five days). These selected lines were grown to maturity, left to dry, and then the T2 seeds harvested for analysis.

Transient expression in Nicotiana benthamiana

N. benthamiana plants were agroinfiltrated at four weeks when several mature leaves were present but before flowering. pEAQ-HT-Dest1 plasmid-containing A. tumefaciens strain LBA4404 were grown to stationary phase at 28°C in LB medium supplemented with 1 mM MgSO4 to avoid aggregation (Van Zyl et al., 2016). Cells were then centrifuged and re-suspended in MES infiltration buffer (10 mM MES salts, 10 mM MgCl2, 100 mM acetosyringone). These suspensions were brought to an OD600 of 1 and allowed to incubate at room temperature for three hours. For co-infiltrations, A. tumefaciens containing each construct were combined at equal ratios, including a GFP-encoding plasmid for no-AEP controls to maintain final concentration levels for all strains. Agroinfiltration was carried out by hand using a needle-free 1 mL syringe to infiltrate A. tumefaciens cultures into the interstitial space of the plant leaf via the abaxial surface. Infiltrated areas were outlined in permanent marker for reference. Plants were grown as normal for six days until harvesting infiltrated tissue by excision of the infiltrated leaf tissue.
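The adjustment to OD600 1 and the equal-ratio mixing described above can be sketched as a short calculation. This is an illustrative sketch only: the function names, the example OD reading and the volumes are hypothetical and are not taken from the thesis, and the calculation assumes that OD600 scales linearly with cell density over the range used.

```python
def dilute_to_target_od(measured_od: float, target_od: float, final_volume_ml: float):
    """Return (suspension_ml, buffer_ml) needed to reach target_od.

    Uses C1 * V1 = C2 * V2, assuming OD600 is proportional to cell density.
    """
    if measured_od < target_od:
        raise ValueError("suspension is already below the target OD600")
    suspension_ml = target_od * final_volume_ml / measured_od
    return suspension_ml, final_volume_ml - suspension_ml


def equal_ratio_mix(strains, total_volume_ml: float):
    """Split a total infiltration volume equally between the listed strains."""
    share = total_volume_ml / len(strains)
    return {name: share for name in strains}


# Hypothetical example: a resuspended culture reading OD600 2.5,
# brought to OD600 1 in 10 mL of MES infiltration buffer.
suspension_ml, buffer_ml = dilute_to_target_od(2.5, 1.0, 10.0)

# A precursor strain mixed 1:1 with an AEP strain, and with a GFP strain
# for the no-AEP control so each strain is present at the same final level.
with_aep = equal_ratio_mix(["Oak1_MCoTI-II", "OaAEP1b"], 2.0)
no_aep_control = equal_ratio_mix(["Oak1_MCoTI-II", "GFP"], 2.0)
```

Replacing the AEP strain with a GFP strain in the control keeps the precursor strain at the same final density in both mixtures, which is the purpose of the GFP plasmid described above.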

Transient expression in Lactuca sativa

L. sativa plants were agroinfiltrated at four weeks using the hand infiltration method described for N. benthamiana as well as vacuum infiltration. For the latter, A. tumefaciens cultures were grown and re-suspended in MES buffer as described for N. benthamiana but scaled up to 1 L. Lettuce plants were inverted and immersed in the culture, using a skewer through the Jiffy pellet to rest on top of the beaker and hold the plant in place. A vacuum chamber was used to lower the pressure to 100 mbar followed by gradual release. This process was repeated once before removing the plant from the chamber and placing in normal growth conditions.

2.2.4 Extraction and analysis of peptide content in plant tissue

Extraction methods

For extraction from A. thaliana seeds, samples were frozen in liquid nitrogen and ground to a fine powder using a Geno/Grinder® (SPEX®) at 1500 rpm for 5 minutes. N. benthamiana and L. sativa tissue was frozen in liquid nitrogen and lyophilised before grinding in the same machine at 1000 rpm for 1 minute. For all plant tissues, 50% acetonitrile, 1% formic acid was used as the extraction buffer (Mahatmanto et al., 2014). The buffer was added to the ground material at a ratio of 20 µL per mg and samples agitated for at least two hours at room temperature. Samples were centrifuged to remove debris and the supernatant used for analysis using MALDI-TOF MS.

MALDI-TOF MS

Plant extracts were de-salted and concentrated using C18 ZipTips® (Merck Millipore). The samples were then mixed 1:1 with α-cyano-4-hydroxycinnamic acid and analysed using matrix-assisted laser desorption ionisation - time of flight (MALDI-TOF) mass spectrometry (AB SCIEX 4700 MALDI TOF/TOF™ system). Trace images and peak data were collected using AB SCIEX Data Explorer software.

Reduction of peptide extracts

To assess the number of cysteine residues in the peptides that were identified by MALDI-MS, the extracts were chemically reduced and re-analysed. Plant extracts containing the putative MCoAbl09 grafted cyclotide were lyophilised and then resuspended in 100 µL sodium bicarbonate buffer (100 mM NH4HCO3, pH 8.0). Dithiothreitol (DTT) was added to 50 µL of the extract to a final concentration of 10 mM. N2 gas was used to expel oxygen from the tube and then the sample was incubated at 60°C for 30 minutes. A mass shift of +1 Da after reduction of the extract was taken to indicate the presence of a cysteine in the samples.
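The relationship between the observed mass shift and the number of cysteines can be written out as a small calculation. The sketch below assumes that every cysteine is part of a disulfide bond before reduction, so that each cysteine gains one hydrogen atom on reduction; the function names are hypothetical and chosen for illustration.

```python
HYDROGEN_MONO = 1.007825  # monoisotopic mass of a hydrogen atom, in Da


def expected_reduction_shift(n_cysteines: int) -> float:
    """Mass gained (Da) when all disulfide-bonded cysteines are reduced."""
    if n_cysteines % 2 != 0:
        raise ValueError("disulfide-bonded cysteines must come in pairs")
    return n_cysteines * HYDROGEN_MONO


def cysteines_from_shift(observed_shift_da: float) -> int:
    """Estimate the number of reduced cysteines from an observed mass shift."""
    return round(observed_shift_da / HYDROGEN_MONO)


# A cyclotide with three disulfide bonds (six cysteines) gains about 6 Da.
shift_for_cyclotide = expected_reduction_shift(6)
estimated_cysteines = cysteines_from_shift(6.0)
```

A shift of about +6 Da is therefore the signature expected for a fully oxidised cystine knot peptide.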

2.3 Results

2.3.1 A first look at MCoTI-II expression in planta using three precursor peptides

To begin investigating plant-based expression of MCoTI-II, three different cyclic peptide precursors were chosen for expression in A. thaliana seeds. A truncated version of the original MCoTI-II precursor, TIPTOP2 from Momordica cochinchinensis, was designed to accommodate just one cyclotide domain instead of the original five. This precursor, named TI2Vac, retained the ER signal, vacuolar targeting sequence, and the native C-terminal propeptide. Two precursor genes from other species were also included. These genes were Paws1, which normally encodes the 14 amino acid cyclic peptide, SFTI-1 (Mylne et al., 2011), and the Henn2 precursor from Hybanthus enneaspermus (Figure 2.2). During a series of experiments to alter the tryptase and human leukocyte elastase inhibitory activity of MCoTI-II, the lysine in loop 2 was modified to various amino acids including R, R, Q, A or V (Thongyoo et al., 2009). The K-V substitution was shown to have 21 ± 7 nM activity against human leukocyte elastase, and was included here for comparison to MCoTI-II expression. This cyclotide was named [K6V]MCoTI-II and was expressed from the TI2Vac and Paws1 precursors (Figure 2.2). Using MALDI-TOF mass spectrometry, MCoTI-II and [K6V]MCoTI-II masses were detected in A. thaliana seeds from all constructs as shown in the representative spectra in Figure 2.2. The cyclic mass was predominant, but linear masses were also detected at +18 Da from the cyclic peak. Here, the linear mass refers to the mature cyclotide sequence that has been folded and liberated from the precursor but not cyclised. In these experiments, cyclisation relied on endogenous AEPs in A. thaliana, which does not normally express cyclotides. Therefore, the detection of some linear product was expected and aligned with results obtained by Mylne and colleagues (2012). Both the signal intensity and the frequency of detection were considered when assessing the success of the precursor peptides for MCoTI-II and [K6V]MCoTI-II expression. 
The cyclic mass of both cyclotides was detected in higher resolution and more frequently from the Paws1 precursor gene compared to TI2Vac and Henn2. As illustrated in the spectra in Figure 2.2, both constructs expressed from within the Paws1 precursor are clearly detectable, whereas the spectra of the TI2Vac and Henn2 precursors were relatively noisy. The table at the bottom of Figure 2.2 highlights the number of selected lines that were positive for MCoTI-II or [K6V]MCoTI-II masses. The TI2Vac precursor had the lowest hit rate followed by Henn2 and then Paws1. From the Paws1 precursor, cyclic MCoTI-II was detected 63% of the time and [K6V]MCoTI-II was detected 100% of the time, albeit from fewer lines (three in total compared to 19 for Paws1_MCoTI-II).
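The +18 Da difference between the linear and cyclic peaks follows from the loss of one water molecule on backbone cyclisation, and can be checked with a short monoisotopic mass calculation. The sketch below is illustrative only: the MCoTI-II sequence is written from the commonly reported sequence and should be treated as an assumption (the thesis gives the sequences in Figure 2.2), and the function name is hypothetical.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565    # Da, added for the free N- and C-termini of a linear peptide
PROTON = 1.007276    # Da, for the singly charged [M+H]+ ion
HYDROGEN = 1.007825  # Da, two hydrogens are lost per disulfide bond


def mh_plus(sequence: str, cyclic: bool, disulfides: int = 3) -> float:
    """Monoisotopic [M+H]+ of an oxidised peptide, cyclic or linear."""
    mass = sum(RESIDUE_MONO[aa] for aa in sequence)
    mass -= 2 * HYDROGEN * disulfides
    if not cyclic:
        mass += WATER
    return mass + PROTON


# Assumed MCoTI-II sequence (not taken from the text of this chapter).
MCOTI_II = "GGVCPKILKKCRRDSDCPGACICRGNGYCGSGSD"

cyclic_mh = mh_plus(MCOTI_II, cyclic=True)
linear_mh = mh_plus(MCOTI_II, cyclic=False)
print(f"cyclic {cyclic_mh:.2f}, linear {linear_mh:.2f}, "
      f"difference {linear_mh - cyclic_mh:.2f}")
```

Whatever sequence is supplied, the linear and cyclic values differ by the mass of one water molecule, which is why a peak at +18 Da from the cyclic mass is assigned to the folded but uncyclised peptide.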


Figure 2.2: Expression of MCoTI-II in Arabidopsis thaliana seeds. Genetic constructs for MCoTI-II are illustrated at the top of the figure showing where the cyclotide domain was inserted into the TI2Vac, Paws1 and Henn2 precursor genes. The sequences of the two cyclotides are given underneath along with their calculated monoisotopic mass ([M+H]+). The spectra displayed are MALDI-MS traces showing the peaks detected from A. thaliana seed extracts. They are representative of the positive lines for each construct. The bottom table details the number of lines in which a positive detection was made along with a conversion to percentage detection.

2.3.2 Expression of a grafted MCoTI-II analogue

The grafted MCoTI-II cyclotide, MTAbl09 (Huang et al., 2015b), contains a small peptide epitope designed to inhibit BCR-ABL, the fusion protein implicated in chronic myeloid leukaemia (Lugo et al., 1990). This peptide was chosen for expression in A. thaliana seeds because it was the most active analogue containing the grafted epitope in loop 1 of the cyclotide (Huang et al., 2015b), as opposed to loop 6 grafts that interrupt the native cyclisation position. The recombinant version is referred to as MCoAbl09 throughout this chapter. This was the first attempt at expressing a grafted MCoTI-II-derived cyclotide with therapeutic potential in planta.

As shown by the MALDI-MS spectrum in Figure 2.3, the cyclic and linear forms of the grafted peptide were readily detected in transgenic A. thaliana seed extract when expressed from the TI2Vac precursor, and only just detectable from the Paws1 precursor. To confirm the presence of six cysteine residues, a small aliquot was taken from the TI2Vac-derived cyclotide extract and reduced using 10 mM DTT. This resulted in a mass shift of 6 Da to give a peak at 3953.32 m/z. This corresponds to the reduction of six cysteine residues and further implies that the detected mass is indeed a cyclotide. The poor signal-to-noise ratio of the reduced sample is probably a result of using a smaller aliquot of the extract which had been lyophilised and resuspended in sodium bicarbonate buffer before reduction.

Figure 2.3: Expression of MCoAbl09 in Arabidopsis thaliana seeds. The Abltide epitope replacing loop 1 of MCoTI-II is shown in red lettering. The calculated monoisotopic mass (Da; [M+H]+) is listed for cyclic and linear versions of MCoAbl09. The cyclotide was expressed in A. thaliana from the precursors Paws1 and TI2Vac. A representative MALDI-MS trace is presented to illustrate the difference in peak signals from each precursor. The TI2Vac precursor produced clear peaks for cyclic and linear MCoAbl09 and for the internal standard at 3192 m/z. A portion of this extract was chemically reduced using 10 mM DTT resulting in a mass shift of +6 Da as indicated at 3593.32 m/z.


2.3.3 Expression of cyclisation variants of MCoTI-II in Arabidopsis thaliana seeds

The expression of MCoAbl09 confirmed that a grafted MCoTI-II could be expressed in planta, but this loop 1 grafted cyclotide was not the most efficient inhibitor of Abl kinase in the initial study. Instead, a version with Abltide grafted into loop 6 was more effective (Huang et al., 2015b). However, the loop 6 graft replaces the native loop 6 sequence, which normally contains the motif required for cyclisation of the peptide by compatible AEPs. To potentially accommodate a grafted sequence while maintaining AEP compatibility, an alternative recognition site was putatively identified in loop 5 of native MCoTI-II. Working with the knowledge that AEPs recognise either D or N residues, the MCoTI-II sequence was modified so that the N in loop 5 would be located at the C-terminus of the peptide, followed by the native CTPP ‘ALEG’. This peptide was named L51 for ‘loop 5 variant 1’, and cloned into the Paws1 precursor for expression in A. thaliana seeds. This precursor was chosen based on previous evidence that it was more suited to expression of the native MCoTI-II sequence than the TI2Vac precursor (Figure 2.2). Although rearranged, the overall MCoTI-II sequence was only modified slightly here (Figures 2.4 and 2.5), making Paws1 the logical precursor choice. Analysis of T2 seed extracts showed a mass matching that calculated for the linear L51 peptide as indicated in Figure 2.4. A low signal-to-noise ratio made it difficult to identify whether the cyclic mass was present, although an unlabelled peak to the left of the linear one might correspond to the cyclic mass.
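The rearrangement step behind the loop 5 variant amounts to a circular permutation of the cyclic sequence: the ring is opened after the chosen residue so that this residue becomes the C-terminus. The sketch below illustrates only that step, under stated assumptions: the MCoTI-II sequence is written from the commonly reported sequence rather than from this chapter, the function name is hypothetical, and the additional D substitution at the original loop 6 cyclisation site (Figure 2.4) is not modelled.

```python
def rotate_to_c_terminus(cyclic_sequence: str, index: int) -> str:
    """Open a cyclic sequence so that the residue at `index` is C-terminal.

    `index` is zero-based and refers to the sequence as written.
    """
    if not 0 <= index < len(cyclic_sequence):
        raise IndexError("index lies outside the sequence")
    return cyclic_sequence[index + 1:] + cyclic_sequence[:index + 1]


# Assumed MCoTI-II sequence, written starting from the native N-terminus.
MCOTI_II = "GGVCPKILKKCRRDSDCPGACICRGNGYCGSGSD"

# The sequence as written contains a single N, located in loop 5
# between the fifth and sixth cysteines.
n_index = MCOTI_II.index("N")
loop5_permutant = rotate_to_c_terminus(MCOTI_II, n_index)
print(loop5_permutant)
```

The permuted sequence contains the same residues in the same cyclic order, so a correctly cyclised product would have the same mass as cyclic MCoTI-II before any further substitutions are introduced.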

Figure 2.4: Expression of an MCoTI-II loop 5 variant in Arabidopsis thaliana seeds. The MCoTI-II native sequence was rearranged so that the N in loop 5 would be at the C-terminus of the peptide. The original site of cyclisation in loop 6 was replaced with a D, highlighted in blue, to discourage possible processing there by AEPs. The peptide was expressed in A. thaliana seeds from the Paws1 precursor gene. A representative MALDI-MS trace is shown highlighting a peak corresponding to the linear mass of L51.

Several modifications were made to the L51 peptide to see whether expression of a loop 5 cyclisation variant could be improved in planta. The simplest modification was to change the N in loop 5 to a D to mimic the original cyclisation point in loop 6 that ends with this residue. The loop 5 variant with a C-terminal D instead of N was designated L52 as shown in Figure 2.5. The peptide L53 was the same as L52 but with the K to V substitution that was used in the initial MCoTI-II expression experiments (Figure 2.2). Two G residues were inserted into loop 5 to create L54. The rationale for this was to lengthen the regions either side of the cyclisation point to better represent the length of loop 6. An analogue of this but with the aforementioned K to V substitution was also included as L55. To re-engineer the peptide further, a final analogue, L56, was made where the entire loop 5 sequence of MCoTI-II was replaced with the loop 6 sequence. The sequences and calculated monoisotopic masses are presented in Figure 2.5 along with the rate at which they were detected in T2 A. thaliana seed extracts. Considering the cyclic masses, cyclic L55 was detected most frequently, in four out of nine lines selected for, and this was followed by L54: the same sequence without the K-V substitution in loop 1. However, the clarity of the MALDI-MS for L55 was the greatest as shown by the representative traces in Figure 2.5. The L56 sequence with the native cyclisation loop relocated to loop 5 was detected as a linear mass in one of the 12 lines selected but the signal was only just above the noise level.

Figure 2.5: Expression of MCoTI-II loop 5 analogues in Arabidopsis thaliana seeds. Each of the peptides shown were expressed in A. thaliana seed. The modifications to each peptide are highlighted in blue lettering and the number of positive lines identified is shown to the right of each peptide mass. A representative spectrum is shown for each cyclotide that was detectable. Cyclic and linear masses were detected for L54 and L55 (top and middle spectra) which had extra G residues either side of their cyclisation point in loop 5. The linear mass for the L56 peptide was just detectable (bottom spectrum).

Because of the relatively low signals for all peptides, absolute quantification was not carried out, but the detection of cyclic L54 and L55 was taken as a positive indication that the peptides are compatible with in planta expression, particularly as expression relied on endogenous AEPs rather than those known to be capable of efficient cyclisation. The introduction of cyclising AEPs to the system would hopefully increase the accumulation of mature, cyclic products, but several AEPs would need to be screened for their compatibility with the peptides. Therefore, the rapid transient expression system of N. benthamiana was explored as an alternative to seed-based expression for determining AEP-cyclotide compatibility.

2.3.4 MCoTI-II expression in leaves

Several AEPs have been classified as ligase-type AEPs capable of cyclotide cyclisation, but their compatibility with MCoTI-II has not been characterised in planta. Having established that expression of MCoTI-II and analogues is possible in A. thaliana seeds without the co-expression of cyclising AEPs, the logical progression was to introduce AEPs to the system. However, it was deemed necessary to explore a faster way of assessing cyclotide-AEP compatibility in planta than obtaining A. thaliana lines for every desired AEP. Until this point, MCoTI-II and analogues were expressed mostly from the seed-derived TI2Vac and Paws1 precursors and from the Oleosin promoter for seed-specific expression. For N. benthamiana leaf assays, the Oak1 gene from O. affinis that normally encodes the prototypic cyclotide, kalata B1, was used as a precursor that has been experimentally verified for this system (Poon et al., 2018; Jackson et al., 2018; Jackson et al., 2019; Rehm et al., 2019), but with the kalata B1 coding sequence replaced with MCoTI-II and [K6V]MCoTI-II. These constructs were co-expressed with three different AEPs in N. benthamiana leaves. OaAEP1b is an AEP from O. affinis and has been used for in planta maturation of cyclotides (Harris et al., 2015; Poon et al., 2018; Jackson et al., 2018). CtAEP1, also known as butelase (Nguyen et al., 2014), has also been well-characterised and utilised for in planta cyclisation of peptides (Poon et al., 2018; Jackson et al., 2019). The third AEP, PxAEP3b, has also been studied in N. benthamiana (Jackson et al., 2018). Upon co-expression of the MCoTI-II constructs with the three AEPs, some linear peptide signals were detected for [K6V]MCoTI-II, but no signals were detected for native MCoTI-II, as shown by the MALDI traces in Figure 2.6. Although the linear [K6V]MCoTI-II signals were detected, the levels did not appear to change substantially when comparing AEPs.
The no-AEP control actually contained a small peak that could be attributed to cyclic [K6V]MCoTI-II but the signal was relatively weak. The co-expression of AEPs in this case did not improve cyclic product accumulation as expected, but the value of having a transient leaf expression system is such that optimisation was pursued to see whether changes to the CTPP could improve AEP compatibility.

Figure 2.6: Expression of MCoTI-II and [K6V]MCoTI-II in Nicotiana benthamiana leaves with three AEPs. Both cyclotides were transiently expressed in N. benthamiana with either OaAEP1b, CtAEP1 or PxAEP3b. No cyclic or linear peptide masses for MCoTI-II were detected via MALDI-MS analysis of leaf extracts. Linear [K6V]MCoTI-II was detected but signals were too low to observe any difference between treatments and the GFP (no AEP) control.

2.3.5 Expression of MCoTI-II with alternative C-terminal propeptides for AEP compatibility

The C-terminal propeptide (CTPP) is the portion of the cyclotide precursor that is important for AEP recognition and is cleaved during the cyclisation process. The CTPP can differ between cyclotides but it almost exclusively requires a non-polar residue, usually leucine, at the second position. For this study, the CTPPs GLP, ALEG, EII and HII were used, and the C-terminal residue of the cyclotide sequence immediately before the CTPP was alternated between N and D. This gave eight peptide ‘tails’: DGLP, NGLP, DALEG, NALEG, DEII, NEII, DHII and NHII, which are indicated at the bottom of Figure 2.7. Two truncated constructs were also included that did not contain a CTPP so that cyclisation would be hindered to test expression of the linear peptide only. This was used to give an indication of the stability of the linear peptide in the leaf cell. All the peptide tail variants were applied to [K6V]MCoTI-II as this was detectable in the previous experiment (Figure 2.6).
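The eight tails are the full combination of two C-terminal residues with four CTPPs. As a small illustration, the set can be enumerated as below; the variable names are illustrative only and do not come from the original construct design.

```python
from itertools import product

# CTPPs and C-terminal cyclotide residues named in the text.
CTPPS = ["GLP", "ALEG", "EII", "HII"]
C_TERMINAL_RESIDUES = ["D", "N"]

# Each tail is the C-terminal residue followed by the CTPP.
tails = [residue + ctpp for ctpp, residue in product(CTPPS, C_TERMINAL_RESIDUES)]
print(tails)
```

Enumerating the design space in this way makes it easy to confirm that no combination has been left out of a construct list.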

Figure 2.7: Modification of the C-terminal propeptide (CTPP) of [K6V]MCoTI-II for expression with two AEPs in Nicotiana benthamiana. The [K6V]MCoTI-II cyclotide domain was expressed from the Oak1 precursor with a series of CTPPs indicated as the variable region. The bar graph shows relative quantification of cyclic (solid coloured bars) and linear (striped bars) [K6V]MCoTI-II from N. benthamiana leaf extracts compared to an internal standard peptide added during extraction. Detection was carried out using MALDI-MS. Each bar represents one technical replicate containing two biological replicates combined in one extract. The CTPP is indicated beneath each data set at the bottom of the graph. GFP was included as a negative AEP control.
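Relative quantification against an internal standard is commonly expressed as the ratio of the analyte peak intensity to the standard peak intensity within the same spectrum. The sketch below assumes that simple ratio; the exact calculation used for Figure 2.7 is not stated here, and the intensity values are hypothetical.

```python
def relative_signal(analyte_intensity: float, standard_intensity: float) -> float:
    """Analyte peak intensity expressed relative to the internal standard."""
    if standard_intensity <= 0:
        raise ValueError("internal standard peak was not detected")
    return analyte_intensity / standard_intensity


# Hypothetical peak intensities (arbitrary units) from one spectrum.
spectrum = {"cyclic": 120.0, "linear": 480.0, "standard": 2400.0}

for form in ("cyclic", "linear"):
    ratio = relative_signal(spectrum[form], spectrum["standard"])
    print(form, round(ratio, 3))
```

Normalising within each spectrum in this way reduces the effect of spot-to-spot variation in MALDI signal, although it does not give absolute amounts.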

Of the cases where [K6V]MCoTI-II was detected, the linear peptide was the predominant mass as shown by the striped bars in Figure 2.7. However, the cyclic mass was detected in seven of the 20 treatments. This was an improvement on the previous experiment where the cyclic mass could not be confidently identified when the same peptide was expressed from the Oak1 precursor (Figure 2.6). The Oak1 precursor initially included the CTPP GLPSLAA which was shortened here to GLP, but again the precursor did not result in detectable cyclic peptide (Figure 2.7). However, other CTPPs did result in detectable cyclic [K6V]MCoTI-II. The native CTPP from the TIPTOP2 precursor for MCoTI-II, ALEG, resulted in cyclic peptide and, interestingly, the activity of the enzyme depended on the N or D preceding the CTPP. Cyclic peptide from DALEG was only detected when co-expressed with OaAEP1b whereas cyclic peptide from NALEG was only detected with CtAEP1.

Both NEII and NHII CTPPs resulted in cyclic peptide detection (Figure 2.7). However, for the NHII tail, the cyclic peptide was also detected at similar relative levels in the no-AEP control. This suggested that co-expression of AEPs did not result in cyclotide levels above the capability of endogenous AEPs.

The low signals overall were an indication that co-expression with AEPs is not the only factor required to successfully express MCoTI-II-derived cyclotides. Nonetheless, the detection of [K6V]MCoTI-II in leaves is a positive step towards expression in this system. It demonstrates that the genetic material is translated and that accumulation of the nascent peptide should be the focus of future optimisation. The detection of kalata B1 in the positive control extracts at almost 10-fold higher levels than the MCoTI-II analogues suggests that this system requires some optimisation before it will be ideal for expression of the MCoTI class of cyclotides.

2.3.6 Linear [K6V]MCoTI-II, but not native MCoTI-II, is detectable in Nicotiana benthamiana from no-CTPP constructs.

To investigate the lack of native MCoTI-II expression in N. benthamiana leaves, MCoTI-II and [K6V]MCoTI-II constructs were designed without a CTPP, i.e. truncated at the C-terminus (MCoTI-IItrunc and [K6V]MCoTI-IItrunc). Without the ability to be cyclised, only the linear peptide would be able to accumulate in the plant. For MCoTI-IItrunc, no linear peptide was detected (Figure 2.8), indicating that accumulation of this peptide is probably interrupted before AEP cyclisation can take place. Linear [K6V]MCoTI-IItrunc peptide was detected and no difference was observed between AEP treatments, as expected based on the lack of an AEP recognition motif.

Figure 2.8: Linear MCoTI-II does not accumulate in Nicotiana benthamiana tissues. Truncated MCoTI-II and [K6V]MCoTI-II constructs were designed without a C-terminal propeptide (CTPP) and expressed in N. benthamiana with OaAEP1b and CtAEP1. The bar graph shows the relative expression as calculated against an internal standard peptide added to each extract. Each bar represents one technical replicate containing two biological replicates combined during extraction.

2.3.7 Summary of expression profiles

Many genetic constructs were tested throughout this chapter to explore the feasibility of expressing MCoTI-II and related peptides in planta. The tables below summarise the constructs tested in the simplest terms of a positive or negative detection in plant extracts. A positive detection was considered when at least one sample contained the expected calculated mass. This summary does not take into account the signal strength or frequency of detection but provides an overall assessment of what did and did not work to get a better understanding of where improvements could be made. Table 2.1 summarises the constructs designed for expression in A. thaliana seeds, including the precursor peptides from which they were expressed. MCoTI-II, [K6V]MCoTI-II and grafted MCoAbl09 were successfully detected, as well as some of the L5 variants shown earlier in this chapter. A double-grafted Abltide analogue, MCoAbl13 (Huang et al., 2015b), was also expressed from the Paws1 precursor in one experiment, but the peptide was not detectable in any lines tested. This double-graft included an epitope in loop 6, next to the native cyclisation position, and this might have interrupted AEP-mediated cyclisation. MCoCTP, another grafted peptide based on MCoTI-II, contained a cell-internalisation sequence that has been shown to improve internalisation of the peptide (Huang et al., 2015a). This peptide was not detected when expressed from either TI2Vac or Paws1 precursors. This could be due to the concentration of R residues in the grafted sequence rendering the peptide highly charged and therefore a target for proteolysis.
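The positive/negative call described above amounts to asking whether any sample contains a peak at the expected mass. A minimal sketch of that rule follows. The mass tolerance is an assumption introduced for illustration, since no matching window is given here, and the peak lists are hypothetical.

```python
def matches(observed_mz: float, expected_mz: float, tolerance_ppm: float = 100.0) -> bool:
    """True if an observed peak lies within tolerance_ppm of the expected m/z."""
    return abs(observed_mz - expected_mz) / expected_mz * 1e6 <= tolerance_ppm


def positive_detection(samples, expected_mz: float, tolerance_ppm: float = 100.0) -> bool:
    """A construct is scored positive when at least one sample
    contains a peak matching the expected calculated mass."""
    return any(
        matches(peak, expected_mz, tolerance_ppm)
        for peaks in samples
        for peak in peaks
    )


# Hypothetical peak lists (m/z) for three samples of one construct.
samples = [[1045.5, 2210.9], [3000.1, 3421.7], [1500.2]]
print(positive_detection(samples, expected_mz=3000.0))
print(positive_detection(samples, expected_mz=3600.0))
```

As the text notes, a call of this kind ignores signal strength and detection frequency, so it is best read alongside the per-line counts reported in the figures.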

Table 2.1: Expression summary: Arabidopsis thaliana seed

Cyclotide        Precursor    Cyclic    Linear
MCoTI-II         TI2Vac
                 Paws1
                 Henn2
[K6V]MCoTI-II    TI2Vac
                 Paws1
MCoAbl09         TI2Vac
                 Paws1
MCoAbl13         Paws1
MCoCTP           TI2Vac
                 Paws1
L51              TI2Vac
L52              Paws1
L53              Paws1
L54              Paws1
L55              Paws1
L56              Paws1

The expression of MCoTI-II and related peptides in N. benthamiana is summarised in Table 2.2 including attempts at expressing the grafted cyclotides MCoAbl09 and McFCQR in the presence of OaAEP1b as well as M. cochinchinensis AEPs, McAEP1 and McAEP2. MCoAbl09 was successfully expressed in A. thaliana seed as shown in Figure 2.3, but it was not detected in N. benthamiana, even with co-expression of AEPs which were not included in the A. thaliana seed expression experiments. McFCQR is another grafted MCoTI-II cyclotide containing the epitope previously grafted into the cyclic peptide SFTI-1. SFTI-FCQR inhibits human kallikrein-related peptidase 4 which activates complex pathways implicated in prostate cancer progression (Swedberg et al., 2009). SFTI-FCQR has been expressed in N. benthamiana alongside OaAEP1b (Poon et al., 2018), but the MCoTI-II version here was not detected when co-expressed with the same AEP or the AEPs from M. cochinchinensis. To see whether leaf-based expression could be possible in another plant species, lettuce was used for transient expression of [K6V]MCoTI-II. Attempts to express [K6V]MCoTI-II in Lactuca sativa are summarised in Table 2.3. Kalata B1 has been expressed in the Cos variety (Poon et al., 2018) so this variety was included as well as Oakleaf green and Black-seeded varieties. Expression of the positive control, kalata B1, was successful but [K6V]MCoTI-II was not detected in any of the three lettuce varieties and optimisation in this system was not pursued.

Table 2.2: Expression summary: Nicotiana benthamiana leaf. For each cyclotide and CTPP, detection of the cyclic and linear forms is reported under GFP (no AEP), OaAEP1b, CtAEP1, PxAEP3b, McAEP1 and McAEP2.

Table 2.3: Expression summary: Lactuca sativa leaf. For each cyclotide and CTPP, detection of the cyclic and linear forms is reported under GFP (no AEP) and OaAEP1b for the Cos, Oakleaf green and Black-seeded varieties.

Cyclotides listed across Tables 2.2 and 2.3: MCoTI-II, [K6V]MCoTI-II, MCoAbl09, McFCQR and kalata B1. CTPPs listed: GLPSLAA, DALEG, NALEG, DGLP, NGLP, DEII, NEII and NHII. n.d. = no data.

2.4 Discussion

Until now, the characterisation and optimisation of MCoTI-II expression in planta has not been explored. This chapter describes some of the first attempts at MCoTI-II expression in plant systems and focuses on manipulating the genetic constructs that steer biosynthesis in both seed and leaf tissue. Cyclic MCoTI-II and [K6V]MCoTI-II masses were detected in extracts of A. thaliana seed expressing the cyclotides from the Paws1 gene from H. annuus. The grafted cyclotide, MCoAbl09, was readily expressed in A. thaliana seed, demonstrating plant-based expression of a grafted MCoTI-II analogue for the first time. The need to accommodate grafted sequences in loop 6 and maintain AEP compatibility for cyclisation was recognised, and so the cyclisation point was relocated to loop 5, producing weak signals for the cyclic product. These results were considered promising given the lack of cyclising AEPs present in A. thaliana seed. To screen for compatible AEPs, a faster expression system was tested using transient expression in N. benthamiana. This proved difficult, and highlighted the differences between plant tissues for expression of these seed-derived cyclotide analogues.

Expression of MCoTI-II and [K6V]MCoTI-II from the Paws1 precursor resulted in clear MALDI-MS signals with minimal background noise, and the peptides were detected in more A. thaliana lines than from the TI2Vac and Henn2 precursors. It was somewhat unexpected that the TI2Vac precursor was not successful in producing a clear MALDI signal because this precursor is a truncated version of the native MCoTI-II-encoding gene, TIPTOP2, which is also a seed-derived gene and has been used to express MCoTI-II in A. thaliana (Mylne et al., 2012). Mylne and colleagues (2012) suggest that TIPTOP2, from which TI2Vac is derived, is potentially more efficient for cyclic peptide expression than Paws1 in A. thaliana based on their comparison between the two native peptides (Mylne et al., 2012; Mylne et al., 2011), but the findings here show that this is not the case when mixing and matching the coding sequences with other cyclotide domains. Cyclic MCoTI-II and [K6V]MCoTI-II were detected clearly from the Paws1 precursor here but the native gene produced only just-detectable signals for SFTI-1 in the initial study (Mylne et al., 2011). SFTI-1 itself is not incompatible with in planta expression as it has been expressed from other precursors in leaf tissue (Poon et al., 2018; Jackson et al., 2018). Interestingly, the grafted cyclotide MCoAbl09 was easily detected when expressed from the TI2Vac precursor, indicating that the precursor is suitable in some cases. MCoAbl09 was later expressed from Paws1, with the hypothesis that expression would be improved, but the peptide was only just detectable in two of the 17 lines tested. Taken together, these results suggest that the expression of MCoTI-II and close analogues is more efficient from the Paws1 precursor than from TI2Vac, but that the latter might be more suited to analogues that differ more substantially from the native sequence, such as MCoAbl09. More biological replicates with cyclotides incorporated into both precursors may help to uncover whether the effect is due to the precursor or the cyclotide itself. These results highlight that there is probably no universal cyclotide precursor that is efficient for all cyclotide expression. This is not surprising given the great natural variation in cyclotides and their precursors.

Grafted kalata B1 has been expressed in N. benthamiana (Poon et al., 2018) as well as grafted SFTI-1 (Poon et al., 2018; Jackson et al., 2019), but the expression of MCoAbl09 in A. thaliana seeds detailed in this chapter is the first example of a grafted MCoTI-II analogue expressed in planta. This is important because MCoTI-II is the most frequently used scaffold for grafting studies (Craik et al., 2017), so developing a plant-based expression platform for this scaffold should have the most impact for production of designer cyclotides for therapeutic benefit. However, for this to be realised, the incorporation of grafts into loop 6 of MCoTI-II needs to be accommodated. This is the most flexible region of the peptide, and is most often used for grafting (Craik et al., 2017). Changing the amino acid sequence in this region almost always replaces the cyclisation site for biosynthesis, so alternative cyclisation options for in planta maturation of the cyclic product were explored.

The detection of cyclic masses for L54 and L55 was a positive sign that the new cyclisation point in loop 5 could be compatible with an in planta system. Relocation of the cyclisation position is an exciting prospect as it would open up plant-based expression for a large proportion of grafted cyclotides. It was not surprising that the signals detected were relatively low because compatible AEPs were not co-expressed in the A. thaliana seed system. In addition, the L5 peptides might have been identified as erroneous structures in the seed and degraded. The data from A. thaliana seeds highlighted the need to be able to test AEP compatibility in a high-throughput expression system, such as that carried out by Poon and colleagues (Poon et al., 2018), to find out whether AEP-mediated cyclisation could improve cyclotide accumulation. Unfortunately, transient expression in N. benthamiana did not result in discernible differences between MCoTI-II and AEP co-expression combinations. MCoTI-II was generally undetectable, and [K6V]MCoTI-II was detectable at low signal intensities. The low expression meant that meaningful comparisons between treatments were not possible. Trends might become apparent with the inclusion of more biological replicates, but this was not pursued due to the poor detection of cyclotides from all treatments and the expectation that any trends would be minor if present. The co-expression of AEPs did not improve detection of cyclotide masses in the leaf extracts, suggesting that AEP compatibility might not be the issue with this system. More likely, the synthesis and folding of the cyclotide precursor is not compatible with leaf-based expression, or the proteases in the leaf might destroy precursors before they can be cyclised and protected against such metabolism. Low transcript levels could also contribute to expression difficulties and could be assessed using RT-PCR. A fundamental knowledge of the relationship between transcript and peptide levels would be useful for future cyclotide expression efforts.

The detection of linear [K6V]MCoTI-II from [K6V]MCoTI-IItrunc, but not MCoTI-II from MCoTI-IItrunc, reflected the similar trend of detection of linear [K6V]MCoTI-II but not MCoTI-II in the first N. benthamiana tests using the Oak1 precursor with three AEPs. The detection of linear [K6V]MCoTI-II from the truncated precursor that cannot be cyclised indicates that this precursor is more stable than native MCoTI-II. This was also evident to some degree in the A. thaliana seed experiments when comparing L54 (without K-V substitution) and L55 (with K-V substitution), where the latter produced clearer spectra than the former. This raises the question as to whether cyclotide accumulation relies more heavily on precursor stability than AEP compatibility. In N. benthamiana experiments, it is possible that the MCoTI-II precursor is degraded before it comes into contact with the AEP, either because it is degraded early in the pathway before the expected AEP contact, or because the spatio-temporal location of the AEPs in the vacuolar targeting pathway is different in N. benthamiana leaves than in seeds. The latter would explain the general lack of cyclic product detection in leaves compared to seeds seen throughout this chapter. There is some evidence that intracellular transport to the vacuole involves complex pre-vacuolar sorting in A. thaliana seed, and this could explain the difference between seed and leaf expression seen here. One study in particular showed that pre-vacuolar vesicles containing processing enzymes and storage proteins are segregated until fusing into a multivesicular body, upon which proteolytic processing is carried out (Otegui et al., 2006), and the authors hypothesise that activation of processing enzymes occurs as the multivesicular body lumen gradually acidifies. As MCoTI-II is detected in A. thaliana seeds it probably travels through the same endomembrane system, and perhaps separation from endogenous AEPs is a means by which cyclotide precursors are preserved until finally being processed when the conditions are favourable in a multivesicular body or protein storage vacuole. Whether AEP and cyclotide transport is decoupled in this way in N. benthamiana leaves has not been determined, and it would be interesting to find out where and when MCoTI-II might come into contact with endogenous or recombinant AEPs during transport to the leaf cell vacuole in future studies.

2.5 Conclusion

This chapter focused on the manipulation of genetic elements of the MCoTI-II precursor to optimise expression in planta. MCoTI-II and analogues were successfully expressed from the Paws1 precursor from H. annuus, which was then carried forward for expression of other constructs. The grafted cyclotide, MCoAbl09, was expressed in A. thaliana seed as the first example of a grafted trypsin inhibitor cyclotide produced in planta. With the aim of incorporating loop-6 grafted cyclotides, the cyclisation position was putatively relocated to loop 5 in a series of MCoTI-II analogues and two of these cyclic peptides were detectable in A. thaliana seed. The need for a rapid in planta expression platform for assessing cyclotide-AEP compatibility was recognised, and N. benthamiana leaves were trialled in this regard. Expression in seeds might be a suitable option for MCoTI-II production in the future, but without a rapid screening solution to test AEP compatibility the optimisation of seed-based production would be time-consuming. Alternatives for MCoTI-II and AEP compatibility screening include testing in vitro with purified precursors and AEPs or optimising the leaf expression environment. The following chapters will explore these two options in further detail.

References

Camarero, J. A. and Campbell, M. J. (2019). The potential of the cyclotide scaffold for drug development. Biomedicines 7, p. 31.
Conlan, B. F., Colgrave, M. L., Gillon, A. D., Guarino, R., Craik, D. J., and Anderson, M. A. (2012). Insights into processing and cyclization events associated with biosynthesis of the cyclic peptide kalata B1. Journal of Biological Chemistry 287, pp. 28037–28046.
Conlan, B. F., Gillon, A. D., Barbeta, B. L., and Anderson, M. A. (2011). Subcellular targeting and biosynthesis of cyclotides in plant cells. American Journal of Botany 98, pp. 2018–2026.
Contreras, J., Elnagar, A. Y. O., Hamm-Alvarez, S. F., and Camarero, J. A. (2011). Cellular uptake of cyclotide MCoTI-I follows multiple endocytic pathways. Journal of Controlled Release 155, pp. 134–143.
Craik, D. J. and Du, J. (2017). Cyclotides as drug design scaffolds. Current Opinion in Chemical Biology 38, pp. 8–16.
Daly, N. L., Thorstholm, L., Greenwood, K. P., King, G. J., Rosengren, K. J., Heras, B., Martin, J. L., and Craik, D. J. (2013). Structural insights into the role of the cyclic backbone in a squash trypsin inhibitor. Journal of Biological Chemistry 288, pp. 36141–36148.
Felizmenio-Quimio, M. E., Daly, N. L., and Craik, D. J. (2001). Circular proteins in plants: Solution structure of a novel macrocyclic trypsin inhibitor from Momordica cochinchinensis. Journal of Biological Chemistry 276, pp. 22875–22882.
Gillon, A. D., Saska, I., Jennings, C. V., Guarino, R. F., Craik, D. J., and Anderson, M. A. (2008). Biosynthesis of circular proteins in plants. Plant Journal 53, pp. 505–515.
Harris, K. S., Durek, T., Kaas, Q., Poth, A. G., Gilding, E. K., Conlan, B. F., Saska, I., Daly, N. L., Weerden, N. L. van der, Craik, D. J., and Anderson, M. A. (2015). Efficient backbone cyclization of linear peptides by a recombinant asparaginyl endopeptidase. Nature Communications 6, p. 10199.
Heitz, A., Hernandez, J. F., Gagnon, J., Hong, T. T., Pham, T. T., Nguyen, T. M., Le-Nguyen, D., and Chiche, L. (2001). Solution structure of the squash trypsin inhibitor MCoTI-II. A new family for cyclic knottins. Biochemistry 40, pp. 7973–7983.
Hernandez, J. F., Gagnon, J., Chiche, L., Nguyen, T. M., Andrieu, J. P., Heitz, A., Trinh Hong, T., Pham, T. T., and Le Nguyen, D. (2000). Squash trypsin inhibitors from Momordica cochinchinensis exhibit an atypical macrocyclic structure. Biochemistry 39, pp. 5722–5730.
Huang, Y.-H., Chaousis, S., Cheneval, O., Craik, D. J., and Henriques, S. T. (2015a). Optimization of the cyclotide framework to improve cell penetration properties. Frontiers in Pharmacology 6, p. 17.
Huang, Y.-H., Henriques, S. T., Wang, C. K., Thorstholm, L., Daly, N. L., Kaas, Q., and Craik, D. J. (2015b). Design of substrate-based BCR-ABL kinase inhibitors using the cyclotide scaffold. Scientific Reports 5, p. 12974.
Jackson, M. A., Gilding, E. K., Shafee, T., Harris, K. S., Kaas, Q., Poon, S., Yap, K., Jia, H., Guarino, R., Chan, L. Y., Durek, T., Anderson, M. A., and Craik, D. J. (2018). Molecular basis for the production of cyclic peptides by plant asparaginyl endopeptidases. Nature Communications 9, p. 2411.
Jackson, M. A., Yap, K., Poth, A. G., Gilding, E. K., Swedberg, J. E., Poon, S., Qu, H., Durek, T., Harris, K., Anderson, M. A., and Craik, D. J. (2019). Rapid and scalable plant-based production of a potent plasmin inhibitor peptide. Frontiers in Plant Science 10, p. 602.
Lugo, T., Pendergast, A., Muller, A., and Witte, O. (1990). Tyrosine kinase activity and transformation potency of BCR-ABL oncogene products. Science 247, pp. 1079–1082.
Mahatmanto, T., Poth, A. G., Mylne, J. S., and Craik, D. J. (2014). A comparative study of extraction methods reveals preferred solvents for cystine knot peptide isolation from Momordica cochinchinensis seeds. Fitoterapia 95, pp. 22–33.
Mylne, J. S., Chan, L. Y., Chanson, A. H., Daly, N. L., Schaefer, H., Bailey, T. L., Nguyencong, P., Cascales, L., and Craik, D. J. (2012). Cyclic peptides arising by evolutionary parallelism via asparaginyl-endopeptidase–mediated biosynthesis. The Plant Cell 24, pp. 2765–2778.
Mylne, J. S., Colgrave, M. L., Daly, N. L., Chanson, A. H., Elliott, A. G., McCallum, E. J., Jones, A., and Craik, D. J. (2011). Albumins and their processing machinery are hijacked for cyclic peptides in sunflower. Nature Chemical Biology 7, pp. 257–259.
Nguyen, G. K. T., Wang, S., Qiu, Y., Hemu, X., Lian, Y., and Tam, J. P. (2014). Butelase 1 is an Asx-specific ligase enabling peptide macrocyclization and synthesis. Nature Chemical Biology 10, pp. 732–738.
Otegui, M. S., Herder, R., Schulze, J., Jung, R., and Staehelin, L. A. (2006). The proteolytic processing of seed storage proteins in Arabidopsis embryo cells starts in the multivesicular bodies. The Plant Cell Online 18, pp. 2567–2581.
Poon, S., Harris, K. S., Jackson, M. A., McCorkelle, O. C., Gilding, E. K., Durek, T., Weerden, N. L. van der, Craik, D. J., and Anderson, M. A. (2018). Co-expression of a cyclizing asparaginyl endopeptidase enables efficient production of cyclic peptides in planta. Journal of Experimental Botany 69, pp. 633–641.
Puttamadappa, S. S., Jagadish, K., Shekhtman, A., and Camarero, J. A. (2010). Backbone dynamics of cyclotide MCoTI-I free and complexed with trypsin. Angew Chem Int Ed 49, pp. 7030–7034.
Rehm, F. B. H., Jackson, M. A., De Geyter, E., Yap, K., Gilding, E. K., Durek, T., and Craik, D. J. (2019). Papain-like cysteine proteases prepare plant cyclic peptide precursors for cyclization. Proceedings of the National Academy of Sciences, pp. 7831–7836.
Sainsbury, F., Thuenemann, E. C., and Lomonossoff, G. P. (2009). pEAQ: versatile expression vectors for easy and quick transient expression of heterologous proteins in plants. Plant Biotechnology Journal 7, pp. 682–693.
Saska, I., Gillon, A. D., Hatsugai, N., Dietzgen, R. G., Hara-Nishimura, I., Anderson, M. A., and Craik, D. J. (2007). An asparaginyl endopeptidase mediates in vivo protein backbone cyclization. Journal of Biological Chemistry 282, pp. 29721–29728.
Swedberg, J. E., Nigon, L. V., Reid, J. C., Veer, S. J. de, Walpole, C. M., Stephens, C. R., Walsh, T. P., Takayama, T. K., Hooper, J. D., Clements, J. A., et al. (2009). Substrate-guided design of a potent and selective kallikrein-related peptidase inhibitor for kallikrein 4. Chemistry & Biology 16, pp. 633–643.
Thongyoo, P., Bonomelli, C., Leatherbarrow, R. J., and Tate, E. W. (2009). Potent inhibitors of β-tryptase and human leukocyte elastase based on the MCoTI-II scaffold. Journal of Medicinal Chemistry 52, pp. 6197–6200.
Van Zyl, A. R., Meyers, A. E., and Rybicki, E. P. (2016). Transient Bluetongue virus serotype 8 capsid protein expression in Nicotiana benthamiana. Biotechnology Reports 9, pp. 15–24.
Weigel, D. and Glazebrook, J. (2002). Chapter 5: How to transform Arabidopsis. Arabidopsis: A Laboratory Manual. Ed. by S. Curtis. Cold Spring Harbor Laboratory Press, pp. 129–130.

Chapter 3

Circular permutation of the native enzyme-mediated cyclization position in cyclotides

3.1 Authorship contribution

Smithies, B. J., Huang Y-H., Jackson M. A., Yap K., Gilding E. K., Harris K. S., Anderson M. A. and Craik D. J. (2020) Circular permutation of the native enzyme-mediated cyclization position in cyclotides, ACS Chemical Biology 15, pp. 962-969.

I contributed to the conception and design of the experiments (60%), analysis and interpretation of the data presented (60%), as well as writing (85%) and editing (15%) the manuscript.

3.2 Overview

This chapter continues the thesis by exploring AEP-cyclotide compatibility within the context of in vitro experimentation. As suggested in the concluding remarks of the previous chapter, this is a means to explore AEP-cyclotide compatibility before establishment in a plant expression system. The work presented here details this in vitro approach with a particular focus on the development of an alternative cyclisation position in loop 5 of MCoTI-II. This builds on the L5 peptides expressed in A. thaliana seeds in Chapter 2 by introducing a cyclising AEP, OaAEP1b, to the L5 cyclotide precursor under controlled conditions. The following work is presented as a reproduction of the published manuscript. Supplementary figures referred to in the chapter are included in the Appendix section of this thesis.

3.3 Abstract

Cyclotides are a class of cyclic disulfide-rich peptides found in plants that have been adopted as a molecular scaffold for pharmaceutical applications due to their inherent stability and ability to penetrate cell membranes. For research purposes, they are usually produced and cyclized synthetically, but there are concerns around the cost and environmental impact of large-scale chemical synthesis. One strategy to improve this is to combine a recombinant production system with native enzyme-mediated cyclization. Asparaginyl endopeptidases (AEPs) are enzymes that can act as peptide ligases in certain plants to facilitate cyclotide maturation. One of these ligases, OaAEP1b, originates from the cyclotide-producing plant, Oldenlandia affinis, and can be produced recombinantly for use in vitro as an alternative to chemical cyclization of recombinant substrates. However, not all engineered cyclotides are compatible with AEP-mediated cyclization because new pharmaceutical epitopes often replace the most flexible region of the peptide, where the native cyclization site is located. Here we redesign a popular cyclotide grafting scaffold, MCoTI-II, to incorporate an AEP cyclization site located away from the usual grafting region. We demonstrate the incorporation of a foreign peptide sequence in the most flexible region of MCoTI-II while maintaining AEP compatibility, where the two were previously mutually exclusive. We anticipate that our AEP-compatible scaffold, based on the most popular cyclotide for pharmaceutical applications, will be useful in designing bioactive cyclotides that are compatible with AEP-mediated cyclization and will therefore open up the possibility of larger scale enzyme-mediated production of recombinant or synthetic cyclotides alike.

3.4 Introduction

Cyclotides are plant-produced cyclic peptides (Craik et al., 1999; Göransson et al., 2012) that adopt structures amenable to engineering for pharmaceutical applications (Veer et al., 2019; Camarero et al., 2019). They are characterized by their cyclic backbones that are threaded by a triple disulfide-bonded cystine knot (Craik et al., 1999). This structure gives cyclotides resistance to thermal, enzymatic and chemical degradation (Colgrave et al., 2004). Aside from six conserved cysteine residues, natural sequence variation exists in the inter-cysteine loops and underpins a multitude of bioactivities that include insecticidal, anti-HIV, protease inhibitory, and cytotoxic activity (Huang et al., 2019; Ojeda et al., 2019). This extensive sequence diversity has been the inspiration for engineering cyclotides with novel bioactivities, a concept known as molecular grafting. These grafted epitopes are often imbued with enhanced stability and efficacy (Wang et al., 2018) and have ranged in size from 1 to 21 residues (Craik et al., 2017). Because grafted cyclotides are small (<50 residues), their production on a laboratory scale is often performed using solid phase peptide synthesis before backbone cyclization using native chemical ligation (Thongyoo et al., 2008; Clark et al., 2010) or chemically-protected peptide bond formation (Cheneval et al., 2014). Although appropriate for generating experimental quantities, such synthetic approaches may be costly, hazardous, and environmentally taxing when applied to large-scale production. The cost of raw materials and the burden of multi-ton waste per kilogram of product are most concerning (Isidro-Llobet et al., 2019). Thus, the development of a recombinant-based alternative will be important for large-scale production for clinical trials or commercial use, and might also benefit laboratories without peptide synthesis experience or access to synthesis facilities.
Recombinant production of cyclotides has been explored in several biological systems, including bacteria, yeast and plants, and each system has its strengths and limitations. Cell-based systems such as these are the industrial standard for recombinant therapeutic production and have been investigated for cyclotide biosynthesis, although the machinery to effect backbone cyclization is not naturally present within the cells. One promising approach to overcome this is the intein-based protein splicing system, which uses split intein domains flanking the cyclotide precursor to facilitate cyclization via splicing. This approach has been applied successfully to both prokaryotic (Jagadish et al., 2017; Camarero et al., 2007) and eukaryotic (Jagadish et al., 2015) systems. Cyclotide-intein fusion proteins have also been used to facilitate expressed protein ligation (Kimura et al., 2006), thereby combining E. coli expression with in vitro cyclization. While the residues that flank the intein/extein boundaries can be restrictive, intein-based approaches have been applied to several cystine-rich backbone cyclized peptides (Bi et al., 2018; Li et al., 2016) and could be explored in future as a means of larger scale production.

An alternative to production in traditional cell culture methods is to utilize plants for recombinant protein expression. This approach has proven useful in cases where glycosylation is important (Schiermeyer, 2020), such as taliglucerase alfa, which is produced in carrot cells (Protalix Biotherapeutics; Shaaltiel et al., 2007), or where scale-up and flexibility are required, such as vaccine production. Large-scale production of influenza vaccines for clinical trials is currently underway in tobacco plants (Pillet et al., 2019), and numerous nonpharmaceutical proteins are produced in plants for use in research and cosmetic applications (Tschofen et al., 2016). An added benefit for cyclotide production in plants in particular is the presence of essential post-translational machinery in cyclotide-producing plants (Saska et al., 2007; Jackson et al., 2018) that can be transiently expressed in other species for plant-based production (Poon et al., 2018). Using plants as a complete system for cyclic peptide production should be a great step forward for safe, efficient cyclotide production. Still, plant vacuoles, which are thought to accumulate mature cyclotides (Conlan et al., 2011; Gillon et al., 2008; Slazak et al., 2018), are acidic and highly proteolytic (Vitale et al., 2005), and consequently not all designer cyclotides may remain stable upon biosynthesis (Jackson et al., 2019).

Semi-enzymatic approaches to peptide production have been the subject of recent reviews (Thompson et al., 2019; Nuijens et al., 2019) and may be applicable in circumstances where in vivo biosynthesis produces low yields or is not conveniently accessible. In the cyclotide context, peptide precursors are first produced synthetically or recombinantly before enzymatic cyclization in vitro. Enzymes that are capable of backbone cyclization include bacterial sortases (Tsukiji et al., 2009), trypsin (Thongyoo et al., 2007; Thongyoo et al., 2008), omniligase (Schmidt et al., 2017; Schmidt et al., 2019) and plant AEPs (Jackson et al., 2018; Nguyen et al., 2014; Harris et al., 2015; Harris et al., 2019). Each of these enzymes has advantages and limitations. For example, sortases are easy to produce recombinantly, but they have a strict requirement for residue conservation flanking the cyclization site, leaving several extra residues in the final cyclized product.
Trypsin has been used to efficiently cyclize MCoTI-I, MCoTI-II and analogues (Thongyoo et al., 2007; Thongyoo et al., 2008), but it lacks the adaptability to cyclize some analogues of MCoTI as well as other classes of cyclotides. Omniligase-1 has been utilized for efficient backbone cyclization of cyclotide precursors including MCoTI-II and kalata B1, as well as the cyclic theta-defensin RTD-1 (Schmidt et al., 2017; Schmidt et al., 2019). However, a C-terminal ester must be introduced to the peptide precursor for it to be compatible with the enzyme. Some plant AEPs have now been established as efficient ligases (Nguyen et al., 2014; Harris et al., 2019) and their native targets are thought to be cyclotides, making them intrinsically compatible with cyclotide precursors containing their native AEP recognition sequence. Recombinant production of AEPs (Harris et al., 2015) has made it possible to utilize these enzymes in vitro. Moreover, the suite of AEPs available for both in vitro and in planta cyclic peptide maturation is expanding due to recent advancements in our ability to predict AEP ligase activity from peptide sequence data (Jackson et al., 2018; Harris et al., 2019; Hemu et al., 2019; Zauner et al., 2018).

One key consideration for semi-enzymatic production of grafted cyclotides is that it is desirable for many of the grafted epitopes to be strategically positioned within the most flexible region of the peptide, loop 6 (Craik et al., 2017), necessarily replacing the native Asn/Asp residue required for AEP-mediated cyclization. This precludes the utilization of AEPs for the cyclization of these cyclotides in vitro, and limits the potential for complete biosynthesis in heterologous systems such as plants. For AEPs to target cyclotides with grafts in loop 6, the AEP-mediated cyclization point must be switched to an alternative loop.

In this study we redesign MCoTI-II, the most frequently grafted cyclotide (Craik et al., 2017), by introducing and optimising an alternative cyclization site in loop 5 of the native cyclotide. A series of synthetic MCoTI-II analogues, named ‘L5’ for loop 5 variant, were generated and assessed for their compatibility with a native plant AEP, OaAEP1b. Subsequently, we used the new L5-based scaffold for generating a grafted cyclotide with therapeutic potential for the treatment of chronic myeloid leukaemia (Huang et al., 2015). Finally, we show the usefulness of this system by combining the heterologous production of the non-cyclic precursor peptides in E. coli with enzyme-mediated cyclization, demonstrating proof of concept for semi-enzymatic production of grafted cyclotides.

3.5 Methods

3.5.1 Peptide synthesis, purification and oxidation.

All peptide precursors were synthesized using Fmoc-based solid-phase peptide synthesis as described previously (Cheneval et al., 2014). In brief, peptide precursors were assembled on 2-chlorotrityl chloride resin stepwise from C- to N-terminus using an automatic synthesizer (Symphony, Gyros Protein Technologies). For the chemically cyclized peptide, [G1,G33]L5_Abl06, the peptide precursor was cleaved off the resin with 1% trifluoroacetic acid (TFA) in dichloromethane. The partially protected peptide precursor was then cyclized in solution, deprotected and isolated using reverse-phase high-performance liquid chromatography (RP-HPLC) on a preparative C18 Phenomenex column.

Linear peptides L5ALEG, L5GLP, [G1]L5ALEG, [G33]L5ALEG and [G1,G33]L5ALEG were assembled as described above, cleaved and deprotected in TFA/triisopropylsilane/H2O (95:2.5:2.5), and purified using RP-HPLC. Direct oxidative folding of both cyclic and linear L5 peptides was carried out in 0.1 M ammonium bicarbonate buffer (pH 8.0) for 18 hours and the oxidized peptides were purified using RP-HPLC to 95% purity. Oxidation yields were 5–10% based on the weight of the crude and oxidized products and were not optimised further. The peptide mass and purity were confirmed using LC/MS, followed by 1D NMR to confirm the presence of a single correctly folded product.

3.5.2 NMR spectroscopy.

For structural characterization, peptides were dissolved in H2O/D2O (10:1, v/v) at 1 mM concentration. The one-dimensional spectrum and two-dimensional total correlation spectroscopy (TOCSY) and nuclear Overhauser effect spectroscopy (NOESY) data sets were acquired on a Bruker Avance-600 MHz NMR spectrometer at 298 K as described previously (Huang et al., 2015). Spectra were analysed and assigned using Sparky (NMRFAM, University of Wisconsin-Madison).

60 3.5. METHODS

3.5.3 Recombinant production of [G1,G33]L5_Abl06ALEG peptide in E. coli.

The [G1,G33]L5_Abl06ALEG sequence was reverse translated to design a genetic construct coding for the linear peptide, including the four amino acid C-terminal propeptide ‘ALEG’. Double-stranded DNA was synthesized as gBlocks gene fragments (IDT, Singapore) and cloned using Gibson Assembly. E. coli colonies were selected using standard blue-white screening and positive transformants were confirmed via Sanger sequencing (AGRF Brisbane). Constructed plasmids of pET15+GOI were transformed into BL21(DE3) cells for expression. Transformed cells were cultured at 37°C in Luria broth (LB) with 100 µg/mL ampicillin and induced when the OD600 reached 0.5–0.6. For expression, the temperature was reduced to 18°C and isopropyl β-D-1-thiogalactopyranoside (IPTG) was added to a final concentration of 0.4 mM. After 16 hours, cells were collected via centrifugation and lysed using a cell disruptor, and fusion proteins were captured on Ni-NTA resin. Captured fusion proteins were eluted with elution buffer (20 mM Tris-HCl, 0.3 M NaCl, 300 mM imidazole, pH 8.0). Eluted fusion proteins were then cleaved using tobacco etch virus (TEV) protease at a molar ratio of 25:1 fusion protein to TEV protease to liberate the cyclotide precursor peptides. Cleavage reactions were then acidified using TFA to a final concentration of 1% and loaded on a preparative C18 column for further purification.

3.5.4 Recombinant production of OaAEP1b in E. coli.

OaAEP1b was produced in E. coli and purified as described previously (Rehm et al., 2019). OaAEP1b concentration was determined by measuring the absorbance at 280 nm using a NanoDrop spectrophotometer.

3.5.5 AEP cyclization assays.

Peptide (20 µM) was incubated with 200 nM or 500 nM OaAEP1b as stated in the text. Reactions were carried out in sodium acetate buffer (50 mM NaOAc, 50 mM NaCl, 1 mM EDTA) at pH 5 and room temperature for up to 24 hours. A 10 µL aliquot of the reaction was removed at 12 and/or 24 hours and quenched with 1 µL of 5% TFA. Quenched samples were desalted using C18 ZipTips (Millipore) and eluted in 80% acetonitrile, 1% formic acid. All AEP cyclization assays were carried out on the same day with the same batch of OaAEP1b enzyme, with the exception of the cyclization assay for the [G1,G33]L5_Abl06ALEG peptide from E. coli, which was carried out on a separate occasion.
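
The stoichiometry implied by these assay conditions can be worked through with a few lines of arithmetic. The following is an illustrative sketch only: the function names are hypothetical, and it assumes that the stated concentrations are final concentrations in the reaction and that volumes are additive on quenching.

```python
def molar_excess(substrate_uM: float, enzyme_nM: float) -> float:
    """Substrate-to-enzyme molar ratio (substrate in µM, enzyme in nM)."""
    return substrate_uM * 1000.0 / enzyme_nM


def final_percent(stock_percent: float, stock_uL: float, sample_uL: float) -> float:
    """Final concentration (%) after adding a stock to a sample.

    Assumes the two volumes are simply additive.
    """
    return stock_percent * stock_uL / (stock_uL + sample_uL)


# 20 µM peptide with either enzyme concentration used in the assays
for enzyme_nM in (200.0, 500.0):
    ratio = molar_excess(20.0, enzyme_nM)
    print(f"{enzyme_nM:.0f} nM OaAEP1b: {ratio:.0f}:1 peptide to enzyme")

# 10 µL of reaction quenched with 1 µL of 5% TFA
print(f"TFA after quenching: {final_percent(5.0, 1.0, 10.0):.2f}%")
```

Under these assumptions the peptide is in 100-fold or 40-fold molar excess over the enzyme, and the quenched sample contains a little under 0.5% TFA.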

3.5.6 MALDI-MS.

Desalted samples were mixed 1:1 with α-cyano-4-hydroxycinnamic acid matrix (10 mg mL−1 in 80% acetonitrile, 1% formic acid) and data were collected using MALDI-MS (AB SCIEX 5800 MALDI TOF/TOF™ system). Relative quantification was calculated using the peak cluster area detected for the calculated cyclic, linear and precursor peptide masses.
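
The relative quantification described here amounts to expressing each species' peak cluster area as a percentage of the summed area, then averaging across replicates. A minimal sketch of that calculation is given below; the function names and the peak areas are hypothetical and are not data from this study, and SEM is taken as the sample standard deviation divided by the square root of n.

```python
from math import sqrt
from statistics import mean, stdev


def relative_percent(areas):
    """Express each species' peak cluster area as a percentage of the total."""
    total = sum(areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {species: 100.0 * area / total for species, area in areas.items()}


def mean_and_sem(values):
    """Return the mean and the standard error of the mean (sample SD / sqrt(n))."""
    n = len(values)
    if n < 2:
        raise ValueError("SEM needs at least two replicates")
    return mean(values), stdev(values) / sqrt(n)


# Hypothetical peak cluster areas (arbitrary units) for four technical replicates
replicates = [
    {"precursor": 260.0, "linear": 90.0, "cyclic": 650.0},
    {"precursor": 250.0, "linear": 110.0, "cyclic": 640.0},
    {"precursor": 270.0, "linear": 100.0, "cyclic": 630.0},
    {"precursor": 240.0, "linear": 100.0, "cyclic": 660.0},
]

for species in ("precursor", "linear", "cyclic"):
    percents = [relative_percent(rep)[species] for rep in replicates]
    avg, sem = mean_and_sem(percents)
    print(f"{species}: {avg:.1f} ± {sem:.1f}%")
```

Because the percentages are relative to the summed signal of the three species, they describe the distribution of MS signal and should not be read as absolute yields.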


3.5.7 Co-elution of chemically and AEP-cyclized [G1,G33]L5_Abl06.

Cyclic [G1,G33]L5_Abl06 was prepared using OaAEP1b-mediated cyclization, or chemically as described above, followed by oxidation in folding solution containing ammonium bicarbonate, glutathione and isopropanol (0.1 M NH4HCO3, 5 mM reduced glutathione, 1 mM oxidized glutathione, 20% v/v isopropanol) for 18 h. The two peptides were analysed separately and together using a TripleTOF 5600+ mass spectrometer system (AB SCIEX) and peaks were visualized via extracted ion chromatography between 1024.64 and 1024.68 m/z.
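
An extracted ion chromatogram of this kind can be thought of as the summed intensity of all signals inside the chosen m/z window, plotted against retention time. The sketch below illustrates the idea on made-up scans; the data structure, function name and numbers are hypothetical and do not reflect the instrument software used in this work.

```python
from typing import Iterable, List, Tuple

# One scan = (retention time in min, [(m/z, intensity), ...])
Scan = Tuple[float, List[Tuple[float, float]]]


def extracted_ion_chromatogram(
    scans: Iterable[Scan], mz_low: float, mz_high: float
) -> List[Tuple[float, float]]:
    """Sum the intensity of all peaks inside [mz_low, mz_high] for every scan."""
    trace = []
    for retention_time, peaks in scans:
        intensity = sum(i for mz, i in peaks if mz_low <= mz <= mz_high)
        trace.append((retention_time, intensity))
    return trace


# Hypothetical scans, for illustration only
scans = [
    (10.0, [(1024.10, 50.0), (1024.66, 5.0)]),
    (10.5, [(1024.65, 800.0), (1024.67, 150.0), (1030.20, 40.0)]),
    (11.0, [(1024.66, 20.0)]),
]

xic = extracted_ion_chromatogram(scans, 1024.64, 1024.68)
apex_rt, apex_intensity = max(xic, key=lambda point: point[1])
print(f"XIC apex at {apex_rt} min")
```

In a co-elution experiment, a single apex in the trace of the mixed sample is what would be expected if both preparations contain the same species.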

3.5.8 Evaluation of Abl kinase inhibition using LC/MS.

The kinase inhibitory activity of [G1,G33]L5_Abl06 against human active Abl kinase was evaluated and compared to that of MTAbl06. The phosphorylation of the optimal substrate of Abl kinase, abltide, was quantified using liquid chromatography-mass spectrometry (LC/MS, Shimadzu LCMS-2020) in the presence of MTAbl06 or [G1,G33]L5_Abl06 as described previously (Huang et al., 2015). Briefly, abltide (EAIYAAPFAKKK) and Abl kinase (0.25 U/mg, Abl protein, Merck Millipore) were mixed in kinase buffer (NEBuffer, New England Biolabs) at the desired concentration and then spiked with MTAbl06 or [G1,G33]L5_Abl06, individually. The reaction was initiated by the addition of 500 µM ATP (Sigma-Aldrich) and incubated at 37°C for 30 min. The samples were collected and acidified with 1% TFA before injection onto a C18 analytical column (Jupiter 300 5 µm 300 Å, 150 × 2.0 mm, Phenomenex). The percentage of phosphorylation of abltide was calculated by comparing the peak area of phosphorylated abltide (1417 Da) with the peak area of abltide (1336.6 Da) from LC at 0 h. The experiments were repeated on two independent days.
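
As a rough illustration of the calculation described above, the percentage of phosphorylation can be expressed as the peak area of phosphorylated abltide relative to the peak area of abltide at 0 h. The sketch below uses a hypothetical function name and made-up peak areas; it also checks that the mass difference between the two reported species is close to that of a single phosphate group (HPO3, about 79.97 Da).

```python
PHOSPHATE_SHIFT_DA = 79.966  # monoisotopic mass added by one HPO3 group


def percent_phosphorylation(phospho_area: float, abltide_area_t0: float) -> float:
    """Peak area of phosphorylated abltide as a percentage of abltide at 0 h."""
    if abltide_area_t0 <= 0:
        raise ValueError("reference peak area must be positive")
    return 100.0 * phospho_area / abltide_area_t0


# Masses reported in the text for phosphorylated and unmodified abltide
observed_shift = 1417.0 - 1336.6
print(f"observed shift: {observed_shift:.1f} Da, one phosphate: {PHOSPHATE_SHIFT_DA:.1f} Da")

# Hypothetical peak areas (arbitrary units), for illustration only
print(f"{percent_phosphorylation(2500.0, 10000.0):.1f}% phosphorylated")
```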

3.6 Results and discussion

AEP-mediated cyclization of cyclotides utilizes the native ligation site in loop 6 of linear precursors, but this site is often replaced with other residues when engineering cyclotides for drug design applications. To address this conflict, we identified an Asn in loop 5 (Asn26) of native MCoTI-II and investigated its use as an alternative AEP-mediated cyclization point to create our new L5 scaffolds. Normally, this internal Asn residue is not processed by AEPs, so to improve enzyme recognition we shifted Asn26 to the C-terminus and added a necessary C-terminal propeptide (CTPP) for AEP recognition. This study is the first to attempt to relocate the cyclization point specifically for plant-derived AEP-mediated cyclization, and utilizes MCoTI-II as it is the most promising grafting scaffold for future cyclotide-based drug design.

3.6.1 Choosing an alternative cyclization point in MCoTI-II.

Native MCoTI-II is cyclized in loop 6 via the recognition of Asp34 in the precursor sequence, GSGSDALEG (Mylne et al., 2012), where ALEG is the CTPP that is removed upon cyclization. Because AEPs recognize both Asp and Asn residues, another potential cyclization site was identified in loop 5 of MCoTI-II, i.e. RGNGY. To test whether cyclization could occur at this site, two linear versions of MCoTI-II were synthesized with the loop 5 Asn as the intended cleavage site (P1) followed by a CTPP (peptides L5ALEG and L5GLP). The first analogue terminated with the CTPP ‘ALEG’ found on the native MCoTI-II precursor (Mylne et al., 2012), while the second terminated with ‘GLP’, which naturally flanks the cyclotide kalata B1 produced by Oldenlandia affinis (Gillon et al., 2008). The latter was selected to reflect the favoured recognition site of the O. affinis AEP enzyme OaAEP1b (Harris et al., 2015; Harris et al., 2019). Figure 3.1 shows the precursor sequences for both reengineered L5 peptides. To test their cyclization potential, oxidized L5 precursor peptides and recombinantly-produced OaAEP1b were incubated for 24 hours. MALDI-MS was used to quantify MS signals for unprocessed, linear (CTPP removed via hydrolysis) and backbone cyclized peptide. Over the course of the 24 h incubation with OaAEP1b, a portion of both L5ALEG and L5GLP was cyclized, as determined by the MALDI-MS peak data shown in Figure 3.1. The percentage of cyclic product was calculated as 64 ± 0.8% and 51 ± 7% for L5ALEG and L5GLP, respectively. For both L5 peptide substrates, the non-cyclized portion of the peptide at 24 h was predominantly unprocessed full-length precursor with the CTPP present, whilst only a small percentage of signal (around 10%) was for linear peptide with the CTPP removed. The peptide kB1EII was included as a positive control and was fully processed to cyclic kB1, demonstrating that the enzyme was active. For peptide masses ([M+H]+) calculated for this study, see Table A.1.


Figure 3.1: Cyclization of MCoTI-II analogues, L5ALEG and L5GLP. (a) Top: Sequence of MCoTI-II indicating native and proposed cyclization positions. Disulfide bonds are depicted as yellow sticks. Bottom: Precursor sequences of L5 analogues. C-terminal propeptide (CTPP) indicated in grey. Cysteine residues are numbered with Roman numerals. (b) Representative MALDI-MS showing mass shifts of 20 µM peptide over 24 h incubation with 500 nM OaAEP1b (total protein concentration). Precursor peptides were oxidized prior to cyclization. Additional peaks are probably an artefact of MALDI-MS typical of MCoTI-II derived peptides. See Figures A.4–A.5 for indicators of peptide quality. (c) Relative quantification of precursor, linear, and cyclic peptide masses determined using peak cluster area from MALDI-MS data. Error bars show SEM, n = 4 technical replicates, except for kB1EII at 12 and 24 h where n = 2. Oxidized kB1EII was included as a positive control for OaAEP1b cyclization activity (sequence: GLPVCGETCVGGTCNTPGCTCSWPVCTRNEII). All reactions were conducted in sodium acetate buffer at pH 5. The average observed monoisotopic mass (Da; [M+H]+) for each peptide at 24 h is: L5ALEG precursor = 3840.79, linear = 3470.47, cyclic = 3452.44; L5GLP precursor = 3738.45, linear = 3470.18, cyclic = 3452.17; kB1EII precursor = 3265.16, linear = 2909.72, cyclic = 2891.78. No MALDI-MS peaks were detected that would correspond with peptides processed at the original Asp site in loop 6 (data not shown).
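
The mass shifts listed in the caption can be sanity-checked against simple expectations: hydrolytic removal of the CTPP should lower the mass by the summed residue masses of the propeptide, and backbone cyclization should lower the linear mass by a further one water (about 18.01 Da). The sketch below is illustrative only. It uses standard monoisotopic residue masses, takes the observed values from the caption above, and uses hypothetical names throughout; it reports differences without attempting to explain any residual deviation.

```python
# Standard monoisotopic residue masses (Da) for the residues in the two CTPPs
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "L": 113.08406,
    "E": 129.04259,
    "P": 97.05276,
}
WATER = 18.01056  # mass lost on forming the new backbone peptide bond


def ctpp_loss(ctpp: str) -> float:
    """Expected mass decrease when the CTPP is removed by hydrolysis."""
    return sum(RESIDUE_MASS[residue] for residue in ctpp)


# Observed [M+H]+ values at 24 h, copied from the Figure 3.1 caption
observed = {
    "L5ALEG": {"ctpp": "ALEG", "precursor": 3840.79, "linear": 3470.47, "cyclic": 3452.44},
    "L5GLP": {"ctpp": "GLP", "precursor": 3738.45, "linear": 3470.18, "cyclic": 3452.17},
}

for name, masses in observed.items():
    propeptide_shift = masses["precursor"] - masses["linear"]
    cyclization_shift = masses["linear"] - masses["cyclic"]
    print(
        f"{name}: CTPP loss observed {propeptide_shift:.2f} Da "
        f"(expected {ctpp_loss(masses['ctpp']):.2f}); "
        f"cyclization observed {cyclization_shift:.2f} Da (expected {WATER:.2f})"
    )
```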


The relatively low abundance of processed linear product in the samples suggests that OaAEP1b was functioning as a preferential cyclase rather than a hydrolase, as has been observed for the enzyme with other peptide substrates (Jackson et al., 2018; Harris et al., 2015; Harris et al., 2019). The highest relative yield of cyclic product (64%) was generated from the L5ALEG peptide and this was therefore selected as the most promising candidate for loop-5 cyclization and taken forward for further optimization.

3.6.2 MCoTI-II can be cyclized efficiently in loop 5 with the addition of two glycine residues.

To determine if extending the size of the loop sequence on the L5 precursor would facilitate cyclization more efficiently, we prepared three new L5 peptide analogues whose sequences are shown in Figure 3.2. The peptide [G1,G33]L5ALEG contained an extra glycine at the N- and C-terminal regions of loop 5 (i.e. P4 and P1’). This extended the length of this loop to seven residues, which matches the length of the cyclization loop of the native OaAEP1b substrate, kalata B1. An additional two precursors were also synthesized to test the individual effect of each additional Gly residue on cyclization efficiency: [G1]L5ALEG and [G33]L5ALEG. For all three peptide precursors the CTPP ‘ALEG’ was retained as it was processed more efficiently than ‘GLP’ in the initial L5 peptide test (Figure 3.1).

The addition of Gly to both the N- and C-terminal regions, [G1,G33]L5ALEG, resulted in efficient cyclization by OaAEP1b in under 12 hours, as illustrated by MALDI-MS in Figure 3.2b and relative peak cluster area in Figure 3.2c. At this time point, essentially all precursor peptide had been processed, with relative cyclic product amounting to 95 ± 0.6% of the resulting MS signal and the remaining 5 ± 0.3% being linear product (CTPP removed). This contrasted with the analogues [G1]L5ALEG and [G33]L5ALEG, where the addition of only one glycine at either side of the new cyclization loop did not improve the percentage yield of cyclic product dramatically. [G1]L5ALEG was processed similarly to the original L5ALEG and L5GLP peptides, with a slightly higher percentage of cyclic product at 70 ± 0.4% after 24 hours, while [G33]L5ALEG produced only 50 ± 1.3% of cyclic product. Unlike L5ALEG, the [G1]L5ALEG and [G33]L5ALEG substrates were also processed more readily to linear forms, with 24 ± 0.3% and 50 ± 1% of linear product present, respectively. Whether the increase in linear product is due to preferential hydrolysis of the precursor or re-processing of the cyclic product is uncertain; however, the latter seems unlikely, as we do not observe any reduction in the percentage of cyclic product between 12 and 24 hours (Figure 3.2c).


Figure 3.2: Optimization of the loop 5 cyclization point of oxidized MCoTI-II. (a) Precursor peptide sequences for L5 analogues containing two extra Gly residues (circled). Cysteine residues are indicated in Roman numerals. (b) Representative MALDI-MS showing mass shifts of 20 µM peptide over a 24 h incubation with 500 nM OaAEP1b (total protein concentration). Precursor peptides were oxidized prior to cyclization. (c) Relative quantification of precursor, linear and cyclic peptide masses using peak cluster area from MALDI-MS. Error bars show SEM, n = 4 technical replicates. All reactions were conducted in sodium acetate buffer at pH 5. The average observed monoisotopic mass (Da; [M+H]+) for each peptide at 24 h, unless stated otherwise, is: [G1,G33]L5ALEG precursor = 3954.53 (0 h), linear = 3584.34, cyclic = 3566.30; [G1]L5ALEG precursor = 3897.59, linear = 3527.38, cyclic = 3509.36; [G33]L5ALEG precursor = 3897.60 (0 h), linear = 3527.35, cyclic = 3509.34. No MALDI-MS peaks were detected that would correspond with peptides processed at the original Asp site in loop 6 (data not shown).


3.6.3 Small residue additions may affect AEP-binding and/or amide bond formation.

Our results demonstrate the importance of the additional Gly residues at the P1’ and P4 positions for the compatibility of the peptides with OaAEP1b. According to models of AEP-substrate binding, changes to the C-terminus of the cyclotide precursor could affect formation of the key thioester intermediate with the AEP, while the N-terminal modification could affect cyclization via nucleophilic attack (Zauner et al., 2018; Yang et al., 2017).

Considering each Gly addition separately, the insertion of Gly in the C-terminal region of loop 5 ([G33]L5ALEG) appeared to increase the rate of processing (hydrolysis or ligation) of the precursor compared to the parent peptide (L5ALEG), as evidenced by the absence of precursor remaining after 12 hours of incubation. This could suggest that C-terminal modification was able to enhance precursor-AEP binding via formation of the enzyme-acyl intermediate. However, this insertion also resulted in an increased proportion of linear product, suggesting that the N-terminus was ill-equipped to keep up with the increased rate of enzyme-acyl intermediate formation, leading to increased hydrolysis.

When Gly was inserted at the N-terminus of the target peptide ([G1]L5ALEG), a relative increase in cyclic product was observed compared to L5ALEG (70 ± 0.4% versus 64 ± 0.08%). This suggests that the nucleophilic attack required for resolution of the enzyme-acyl intermediate, thought to be affected primarily by the incoming N-terminus, may have been able to happen more quickly. However, an increase in linear product was also observed compared to L5ALEG (24 ± 0.3% versus 14 ± 1%). Interestingly, when the G1 and G33 insertions were combined, a high rate of processing was obtained as well as a higher proportion of cyclic product than was achievable when each insertion was introduced separately. This suggests that these insertions have a combined effect that is different to their individual effects and highlights the need to optimize the sequence at both the C- and N-termini of the target peptide. Since 95 ± 0.3% of cyclic product could be generated from the [G1,G33]L5 scaffold (compared to 70 ± 0.4% and 50 ± 1.3% when the insertions were incorporated alone), the combined insertion was identified as the most promising scaffold for the incorporation of bioactive epitopes via molecular grafting.

3.6.4 A loop-6-grafted cyclotide is compatible with AEP-mediated cyclization in loop 5.

To test whether the new [G1,G33]L5ALEG scaffold could accommodate a grafted sequence in loop 6 and remain compatible with OaAEP1b-mediated cyclization, a published grafted sequence was incorporated into the scaffold to make [G1,G33]L5_Abl06ALEG, shown in Figure 3.3. The epitope chosen was from the MCoTI-II analogue MTAbl06, which inhibits Abl kinase, the catalytic domain of the BCR-ABL fusion protein that has been implicated in chronic myeloid leukaemia (Huang et al., 2015). In contrast to the parent scaffold, [G1,G33]L5ALEG, a single isomer of oxidized [G1,G33]L5_Abl06ALEG was difficult to isolate, so purified, reduced [G1,G33]L5_Abl06ALEG was used for cyclization by OaAEP1b. Similar to the parent scaffold, incubation of the [G1,G33]L5_Abl06ALEG peptide with OaAEP1b resulted in a high relative percentage (88 ± 4%) of correctly cyclized peptide with minimal unprocessed precursor remaining (Figure 3.3). Additionally, the oxidized mass was observed in all MALDI-MS analyses, suggesting that the peptide had oxidized in the reaction buffer.

Figure 3.3: Cyclization of grafted cyclotide [G1,G33]L5_Abl06ALEG. (a) [G1,G33]L5_Abl06ALEG amino acid sequence showing extra Gly residues (circled) and the grafted epitope in loop 6 (rounded rectangle). Cyclotide loops are numbered and cysteine residues are labelled with Roman numerals. (b) Representative MALDI-MS showing the shift from 20 µM precursor [G1,G33]L5_Abl06ALEG to cyclic product over 24 h incubation with 500 nM OaAEP1b. (c) Relative quantification of MALDI-MS cluster peak area after 12 and 24 h incubation of 500 nM OaAEP1b (total protein concentration) with 20 µM synthetic [G1,G33]L5_Abl06ALEG precursor (left) and after a 24 h incubation of 200 nM OaAEP1b with 20 µM precursor produced recombinantly in E. coli (right). For the synthetic precursor, error bars show SEM and n = 4 technical replicates. Due to availability of peptide and AEP, bars represent one technical replicate for the recombinant peptide. All reactions were conducted in sodium acetate buffer at pH 5.

3.6.5 In vitro AEP-mediated cyclization of a recombinant cyclotide precursor.

Ideally, an enzyme-mediated approach to cyclization should be applicable to both synthetic and recombinantly produced precursor peptides so that it can be applied in laboratories with different peptide-generating expertise. One advantage of AEPs as native cyclization enzymes is that no chemical modification of peptide precursors is required for activity, meaning that peptides isolated from in vivo expression systems should be compatible with AEP-mediated cyclization in vitro. Here, we demonstrate this by coupling recombinant production of [G1,G33]L5_Abl06ALEG with in vitro cyclization. [G1,G33]L5_Abl06ALEG was expressed as a GST-fusion protein in E. coli with an N-terminal TEV cleavable sequence to liberate the cyclotide precursor. The E. coli–derived peptide precursor in reduced form was incubated with OaAEP1b for 24 hours and the resultant peptide mass corresponded to the cyclic, oxidized form, in agreement with the synthetic [G1,G33]L5_Abl06ALEG results, as shown in Figure 3.3. Cyclization of the same peptide precursor from a synthetic or recombinant source demonstrates that AEP-mediated cyclization via the L5 cyclization position is flexible and can be employed in different scenarios depending on the peptide bio/synthesis method of choice.

3.6.6 One-pot cyclization and folding of grafted L5 cyclotide.

To determine whether the oxidized [G1,G33]L5_Abl06ALEG cyclic product in Figure 3.3 was the correctly folded isomer, a synthetic, chemically-cyclized version was made for comparison. The chemically cyclized [G1,G33]L5_Abl06 cyclotide was purified, its secondary structure was characterized using NMR (Figures A.1 and A.2), and it was then used as a co-elution partner for the AEP-cyclized peptide. As shown in Figure 3.4, both peptides co-eluted in LC-MS, resulting in a single peak which was detected using extracted ion chromatography (XIC) for the cyclic, oxidized mass, indicating that both peptides were folded in the same conformation. Thus, AEP-mediated cyclization of [G1,G33]L5_Abl06ALEG not only produced the cyclic product but also allowed correct oxidative folding to occur in a simple one-pot reaction to produce a single isomer. Furthermore, when the [G1,G33]L5_Abl06ALEG peptide was maintained in reduced form by incorporation of 1 mM TCEP into the reaction buffer, OaAEP1b produced reduced, cyclic product, indicating that OaAEP1b could cyclize both oxidized and reduced substrates (Figure A.3). This finding is consistent with previous reports that disulfide bond formation is not necessary for cyclization of other cyclotides (Harris et al., 2015).

In addition, the cyclic [G1,G33]L5_Abl06ALEG peptide retained Abl kinase inhibitory activity comparable to that of its predecessor MTAbl06 (Huang et al., 2015) (43.7% vs 36.8%), as shown in Figure A.6. This result suggests that the additional Gly residues in loop 5 of the grafted peptide did not affect the in vitro activity of this peptide.

Figure 3.4: Co-elution of chemically and enzymatically cyclized peptides. Extracted ion chromatograms (XIC, 1024.64 to 1024.68 Da, left panels) and corresponding +TOF MS (right panels), showing a single species for cyclic, oxidized [G1,G33]L5_Abl06 from either: synthesis followed by chemical cyclization (top), synthesis followed by OaAEP1b-mediated cyclization over 24 hours (middle), or both top and middle peptides co-eluted (bottom).
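The extracted ion chromatograms in Figure 3.4 rest on a simple operation: for every scan, only ions inside a narrow mass window are kept and their intensities are summed. The Python sketch below illustrates that idea using the 1024.64 to 1024.68 window quoted in the caption. The scan layout, the function name and the example values are hypothetical and invented purely for illustration; they are not the instrument software or the data behind the figure.

```python
from typing import Iterable, List, Tuple

# One scan = (retention time in minutes, list of (m/z, intensity) pairs).
# This layout is an assumption made for the sketch.
Scan = Tuple[float, List[Tuple[float, float]]]


def extract_ion_chromatogram(
    scans: Iterable[Scan], mz_low: float, mz_high: float
) -> List[Tuple[float, float]]:
    """Sum the intensity of all ions inside [mz_low, mz_high] for each scan."""
    trace = []
    for retention_time, peaks in scans:
        summed = sum(
            intensity for mz, intensity in peaks if mz_low <= mz <= mz_high
        )
        trace.append((retention_time, summed))
    return trace


# Hypothetical scans, invented only to exercise the function.
example_scans = [
    (10.0, [(1024.66, 50.0), (900.10, 400.0)]),
    (10.1, [(1024.65, 800.0), (1024.90, 30.0)]),
    (10.2, [(1024.67, 60.0)]),
]

xic = extract_ion_chromatogram(example_scans, 1024.64, 1024.68)
# Retention time of the most intense point in the trace.
apex_time = max(xic, key=lambda point: point[1])[0]
print(xic, apex_time)
```

A single apex in such a trace for the mixed sample is what the co-elution argument relies on; two resolved apexes would instead point to different isomers.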

3.7 Conclusion

The MCoTI-II cyclotide scaffold is the most frequently used template for epitope grafting, but bioactive epitopes often replace the native cyclization region in loop 6. Here, an AEP-compatible grafting scaffold was designed that retains loop 6 as a preferential insertion site, but opens up loop 5 for AEP-mediated backbone cyclization. We show that this new parent scaffold, [G1,G33]L5ALEG, and a grafted therapeutic candidate for chronic myeloid leukaemia treatment were both compatible with efficient AEP-mediated cyclization in vitro. In conclusion, [G1,G33]L5ALEG is a new scaffold for loop 6-grafted cyclotides that enables coupling of recombinant production and enzyme-mediated peptide backbone cyclization. With recombinant MCoTI-II detected in planta in Chapter 2 and AEP-cyclotide compatibility confirmed in vitro in the current chapter, the next logical step was to co-express AEPs and cyclotides in planta. As discussed earlier, the transient expression system in N. benthamiana is an ideal system for screening cyclotide-AEP compatibility in planta, but it needs optimising for MCoTI-II and related cyclotides. The next chapter aims to optimise the transient expression system in N. benthamiana for this purpose.

References

Bi, T., Li, Y., Shekhtman, A., and Camarero, J. A. (2018). In-cell production of a genetically-encoded library based on the θ-defensin RTD-1 using a bacterial expression system. Bioorganic & Medicinal Chemistry 26, pp. 1212–1219.
Camarero, J. A. and Campbell, M. J. (2019). The potential of the cyclotide scaffold for drug development. Biomedicines 7, p. 31.
Camarero, J. A., Kimura, R. H., Woo, Y.-H., Shekhtman, A., and Cantor, J. (2007). Biosynthesis of a fully functional cyclotide inside living bacterial cells. ChemBioChem 8, pp. 1363–1366.
Cheneval, O., Schroeder, C. I., Durek, T., Walsh, P., Huang, Y.-H., Liras, S., Price, D. A., and Craik, D. J. (2014). Fmoc-based synthesis of disulfide-rich cyclic peptides. The Journal of Organic Chemistry 79, pp. 5538–5544.
Clark, R. J. and Craik, D. J. (2010). Native chemical ligation applied to the synthesis and bioengineering of circular peptides and proteins. Biopolymers: Peptide Science 94, pp. 414–422.
Colgrave, M. L. and Craik, D. J. (2004). Thermal, chemical, and enzymatic stability of the cyclotide kalata B1: the importance of the cyclic cystine knot. Biochemistry 43, pp. 5965–5975.
Conlan, B. F., Gillon, A. D., Barbeta, B. L., and Anderson, M. A. (2011). Subcellular targeting and biosynthesis of cyclotides in plant cells. American Journal of Botany 98, pp. 2018–2026.
Craik, D. J., Daly, N. L., Bond, T., and Waine, C. (1999). Plant cyclotides: A unique family of cyclic and knotted proteins that defines the cyclic cystine knot structural motif. Journal of Molecular Biology 294, pp. 1327–1336.
Craik, D. J. and Du, J. (2017). Cyclotides as drug design scaffolds. Current Opinion in Chemical Biology 38, pp. 8–16.
Gillon, A. D., Saska, I., Jennings, C. V., Guarino, R. F., Craik, D. J., and Anderson, M. A. (2008). Biosynthesis of circular proteins in plants. Plant Journal 53, pp. 505–515.
Göransson, U., Burman, R., Gunasekera, S., Strömstedt, A. A., and Rosengren, K. J. (2012). Circular proteins from plants and fungi. Journal of Biological Chemistry 287, pp. 27001–27006.
Harris, K. S., Durek, T., Kaas, Q., Poth, A. G., Gilding, E. K., Conlan, B. F., Saska, I., Daly, N. L., Weerden, N. L. van der, Craik, D. J., and Anderson, M. A. (2015). Efficient backbone cyclization of linear peptides by a recombinant asparaginyl endopeptidase. Nature Communications 6, p. 10199.
Harris, K. S., Guarino, R. F., Dissanayake, R. S., Quimbar, P., McCorkelle, O. C., Poon, S., Kaas, Q., Durek, T., Gilding, E. K., Jackson, M. A., Craik, D. J., Weerden, N. L. van der, Anders, R. F., and Anderson, M. A. (2019). A suite of kinetically superior AEP ligases can cyclise an intrinsically disordered protein. Scientific Reports 9, p. 10820.
Hemu, X., El Sahili, A., Hu, S., Wong, K., Chen, Y., Wong, Y. H., Zhang, X., Serra, A., Goh, B. C., Darwis, D. A., Chen, M. W., Sze, S. K., Liu, C.-F., Lescar, J., and Tam, J. P. (2019). Structural determinants for peptide-bond formation by asparaginyl ligases. Proceedings of the National Academy of Sciences of the United States of America 116, pp. 11737–11746.
Huang, Y.-H., Du, Q., and Craik, D. J. (2019). Cyclotides: disulfide-rich peptide toxins in plants. Toxicon 172, pp. 33–44.
Huang, Y.-H., Henriques, S. T., Wang, C. K., Thorstholm, L., Daly, N. L., Kaas, Q., and Craik, D. J. (2015). Design of substrate-based BCR-ABL kinase inhibitors using the cyclotide scaffold. Scientific Reports 5, p. 12974.
Isidro-Llobet, A., Kenworthy, M. N., Mukherjee, S., Kopach, M. E., Wegner, K., Gallou, F., Smith, A. G., and Roschangar, F. (2019). Sustainability challenges in peptide synthesis and purification: From R&D to production. The Journal of Organic Chemistry 84, pp. 4615–4628.
Jackson, M. A., Gilding, E. K., Shafee, T., Harris, K. S., Kaas, Q., Poon, S., Yap, K., Jia, H., Guarino, R., Chan, L. Y., Durek, T., Anderson, M. A., and Craik, D. J. (2018). Molecular basis for the production of cyclic peptides by plant asparaginyl endopeptidases. Nature Communications 9, p. 2411.
Jackson, M. A., Yap, K., Poth, A. G., Gilding, E. K., Swedberg, J. E., Poon, S., Qu, H., Durek, T., Harris, K., Anderson, M. A., and Craik, D. J. (2019). Rapid and scalable plant-based production of a potent plasmin inhibitor peptide. Frontiers in Plant Science 10, p. 602.
Jagadish, K., Gould, A., Borra, R., Majumder, S., Mushtaq, Z., Shekhtman, A., and Camarero, J. A. (2015). Recombinant expression and phenotypic screening of a bioactive cyclotide against α-synuclein-induced cytotoxicity in baker’s yeast. Angewandte Chemie International Edition 54, pp. 8390–8394.
Jagadish, K. and Camarero, J. A. (2017). Recombinant expression of cyclotides using split inteins. Methods in Molecular Biology (Clifton, N.J.). Ed. by H. D. Mootz. Vol. 1114. Methods in Molecular Biology. New York, NY: Springer New York, pp. 41–55.
Kimura, R. H., Tran, A.-T., and Camarero, J. A. (2006). Biosynthesis of the cyclotide kalata B1 by using protein splicing. Angewandte Chemie International Edition 118, pp. 987–990.
Li, Y., Aboye, T., Breindel, L., Shekhtman, A., and Camarero, J. A. (2016). Efficient recombinant expression of SFTI-1 in bacterial cells using intein-mediated protein trans-splicing. Peptide Science 106, pp. 818–824.
Mylne, J. S., Chan, L. Y., Chanson, A. H., Daly, N. L., Schaefer, H., Bailey, T. L., Nguyencong, P., Cascales, L., and Craik, D. J. (2012). Cyclic peptides arising by evolutionary parallelism via asparaginyl-endopeptidase–mediated biosynthesis. The Plant Cell 24, pp. 2765–2778.
Nguyen, G. K. T., Wang, S., Qiu, Y., Hemu, X., Lian, Y., and Tam, J. P. (2014). Butelase 1 is an Asx-specific ligase enabling peptide macrocyclization and synthesis. Nature Chemical Biology 10, pp. 732–738.
Nuijens, T., Toplak, A., Schmidt, M., Ricci, A., and Cabri, W. (2019). Natural occurring and engineered enzymes for peptide ligation and cyclization. Frontiers in Chemistry 7, p. 829.
Ojeda, P. G., Cardoso, M. H., and Franco, O. L. (2019). Pharmaceutical applications of cyclotides. Drug Discovery Today 24, pp. 2152–2161.
Pillet, S., Couillard, J., Trépanier, S., Poulin, J.-F., Yassine-Diab, B., Guy, B., Ward, B. J., and Landry, N. (2019). Immunogenicity and safety of a quadrivalent plant-derived virus like particle influenza vaccine candidate—two randomized phase II clinical trials in 18 to 49 and ≥50 years old adults. PLOS ONE 14, pp. 1–26.
Poon, S., Harris, K. S., Jackson, M. A., McCorkelle, O. C., Gilding, E. K., Durek, T., Weerden, N. L. van der, Craik, D. J., and Anderson, M. A. (2018). Co-expression of a cyclizing asparaginyl endopeptidase enables efficient production of cyclic peptides in planta. Journal of Experimental Botany 69, pp. 633–641.
Rehm, F. B. H., Jackson, M. A., De Geyter, E., Yap, K., Gilding, E. K., Durek, T., and Craik, D. J. (2019). Papain-like cysteine proteases prepare plant cyclic peptide precursors for cyclization. Proceedings of the National Academy of Sciences, pp. 7831–7936.
Saska, I., Gillon, A. D., Hatsugai, N., Dietzgen, R. G., Hara-Nishimura, I., Anderson, M. A., and Craik, D. J. (2007). An asparaginyl endopeptidase mediates in vivo protein backbone cyclization. Journal of Biological Chemistry 282, pp. 29721–29728.
Schiermeyer, A. (2020). Optimizing product quality in molecular farming. Current Opinion in Biotechnology 61, pp. 15–20.
Schmidt, M., Huang, Y.-H., Texeira de Oliveira, E. F., Toplak, A., Wijma, H. J., Janssen, D. B., Maarseveen, J. H. van, Craik, D. J., and Nuijens, T. (2019). Efficient enzymatic cyclization of disulfide-rich peptides by using peptide ligases. ChemBioChem 20, pp. 1524–1529.
Schmidt, M., Toplak, A., Quaedflieg, P. J. L. M., Ippel, H., Richelle, G. J. J., Hackeng, T. M., Maarseveen, J. H. van, and Nuijens, T. (2017). Omniligase-1: A powerful tool for peptide head-to-tail cyclization. Advanced Synthesis & Catalysis 359, pp. 2050–2055.
Shaaltiel, Y., Bartfeld, D., Hashmueli, S., Baum, G., Brill-Almon, E., Galili, G., Dym, O., Boldin-Adamsky, S. A., Silman, I., Sussman, J. L., Futerman, A. H., and Aviezer, D. (2007). Production of glucocerebrosidase with terminal mannose glycans for enzyme replacement therapy of Gaucher’s disease using a plant cell system. Plant Biotechnology Journal 5, pp. 579–590.
Slazak, B., Kapusta, M., Strömstedt, A. A., Słomka, A., Krychowiak, M., Shariatgorji, M., Andrén, P. E., Bohdanowicz, J., Kuta, E., and Göransson, U. (2018). How does the sweet violet (Viola odorata L.) fight pathogens and pests – Cyclotides as a comprehensive plant host defense system. Frontiers in Plant Science 9, p. 1296.
Thompson, R. E. and Muir, T. W. (2019). Chemoenzymatic semisynthesis of proteins. Chemical Reviews.
Thongyoo, P., Jaulent, A. M., Tate, E. W., and Leatherbarrow, R. J. (2007). Immobilized protease-assisted synthesis of engineered cysteine-knot microproteins. ChemBioChem 8, pp. 1107–1109.
Thongyoo, P., Roqué-Rosell, N., Leatherbarrow, R. J., and Tate, E. W. (2008). Chemical and biomimetic total syntheses of natural and engineered MCoTI cyclotides. Organic & Biomolecular Chemistry 6, p. 1462.
Tschofen, M., Knopp, D., Hood, E., and Stöger, E. (2016). Plant molecular farming: Much more than medicines. Annual Review of Analytical Chemistry 9, pp. 271–294.
Tsukiji, S. and Nagamune, T. (2009). Sortase-mediated ligation: A gift from Gram-positive bacteria to protein engineering. ChemBioChem 10, pp. 787–798.
Veer, S. J. de, Kan, M.-W., and Craik, D. J. (2019). Cyclotides: From structure to function. Chemical Reviews 119, pp. 12375–12421.
Vitale, A. and Pedrazzini, E. (2005). Recombinant pharmaceuticals from plants: The plant endomembrane system as bioreactor. Molecular Interventions 5, p. 216.
Wang, C. K. and Craik, D. J. (2018). Designing macrocyclic disulfide-rich peptides for biotechnological applications. Nature Chemical Biology 14, pp. 417–427.
Yang, R., Wong, Y. H., Nguyen, G. K., Tam, J. P., Lescar, J., and Wu, B. (2017). Engineering a catalytically efficient recombinant protein ligase. Journal of the American Chemical Society 139, pp. 5351–5358.
Zauner, F. B., Elsasser, B., Dall, E., Cabrele, C., and Brandstetter, H. (2018). Structural analyses of Arabidopsis thaliana legumain γ reveal the differential recognition and processing of proteolysis and ligation substrates. Journal of Biological Chemistry 8, pp. 8934–8946.

Chapter 4

Manipulation of the plant cell environment to enhance recombinant expression of cyclotides

4.1 Overview

With the development of an alternative cyclisation position for MCoTI-II in the previous chapter, and the publication of new grafted cyclotides every year, it is becoming increasingly important to develop a fast screening method to assess their compatibility with plant-based production. In Chapter 2, MCoTI-II and analogues were recombinantly produced in Arabidopsis thaliana seeds. Accumulation of recombinant proteins and peptides in seeds is convenient for storage and processing (Boothe et al., 2010), but stable transformation of seeds is time-consuming and tedious for screening multiple constructs such as cyclotide-AEP combinations. Therefore, the use of rapid transient expression in a leaf-based expression system is worth investigating. Expression of MCoTI-II and analogues in leaf tissue has proven difficult, with only weak MALDI-MS signals for cyclotides detected (Chapter 2). Earlier in this thesis, attempts at optimising leaf-based expression were focused on modifying construct components including the precursor gene, the CTPP, and individual residues in the mature cyclotide. This was followed by testing cyclotide-AEP compatibility in vitro in Chapter 3. Here, the focus shifts to changing the cellular and subcellular environment of the leaf to explore whether expression of MCoTI-II and other cyclotides can be improved.

Transient expression of recombinant products in Nicotiana benthamiana has become an attractive system because the plant is susceptible to infection with viral vectors for transgene expression, is easy to cultivate, and production is separate from the food chain (Bally et al., 2018). N. benthamiana has been used to produce therapeutics and other valuable products such as a wide range of antibodies, vitronectin, human growth factors and cancer treatments (Bally et al., 2018; Powell, 2015) and has recently been employed for expression of recombinant cyclotides (Poon et al., 2018).
Because of its exceptional ability to act as a host for recombinant protein production, N. benthamiana has also been used to reproduce complex multi-step biosynthesis pathways via co-expression of a suite of enzymes required for step-by-step synthesis of DHA (docosahexaenoic acid) (Petrie et al., 2010). In this same study, a single transcription factor, LEC2, was co-expressed to induce a seed-like environment in N. benthamiana leaves that facilitated seed-specific promoter activity to drive transgene expression. LEC2 is a B3 domain transcription factor that induces seed maturation via activation of maturation genes in the auxin pathway (Stone et al., 2008). It functions as an early inducer of a seed-like environment, and its expression is detected earlier than other seed-maturation genes such as LEC1 and FUS3 when expressed ectopically in A. thaliana (Stone et al., 2008), making it an ideal candidate for induction of a seed-like environment by a single gene. As well as enabling seed-specific promoter activity for DHA production (Petrie et al., 2010), LEC2 expression has also been used to enhance transgene expression for biofuel production in N. tabacum (Andrianov et al., 2010), and to increase endogenous fatty acids in N. benthamiana stems (Nookaraju et al., 2014) and A. thaliana leaves (Santos Mendoza et al., 2005; Slocombe et al., 2009; Kim et al., 2015).

Given its ability to induce a seed-like environment that is suitable for protein and fatty acid accumulation, it was hypothesised that co-expression of LEC2 could be used to promote accumulation of recombinant cyclotides in N. benthamiana leaves. To test this, the cyclotide kalata B1 was chosen for co-expression with LEC2 as an example of a cyclotide that is readily expressed in N. benthamiana. Given that MCoTI-II was detected in seeds, but not leaves (Chapter 2), its expression might depend on a seed-like environment, and so it was also chosen to test in conjunction with LEC2 in this chapter. The promoters Oleosin and FP1 were used to drive cyclotide expression to assess whether seed-specific promoter activity could be enhanced in the LEC2-induced environment. It was unclear exactly how LEC2 might affect cyclotide production, so the multi-factorial design included co-infiltration with and without LEC2, the construct of interest, and the cyclase, OaAEP1b. This study is the first to examine the effect of LEC2 on cyclotide production in planta.

4.2 Methods

4.2.1 Generation of LEC2 gene constructs

The A. thaliana LEC2 gene sequence (GenBank: AF400124.1) was codon optimised for N. benthamiana using the IDT Codon Optimisation Tool (Integrated DNA Technologies, Inc.) and synthesised as a gBlock® (IDT). The fragment was amplified using primers to add attB sites in preparation for Gateway® cloning. The PCR products were gel purified and recombined with the pDONR221 entry vector, followed by cloning in TOP10 E. coli chemically competent cells. Individual colonies were selected, miniprepped and sequenced at the Australian Genome Research Facility (AGRF), Brisbane, Australia. Confirmed LEC2 DNA was recombined into the destination vector pEAQ-HT-Dest1 (Sainsbury et al., 2009) for transformation into A. tumefaciens strain LBA4404 and infiltration into N. benthamiana leaves.

4.2.2 FP1 promoter construct

A truncated version of the FP1 napin promoter from Brassica napus (GenBank: J02798.1) was designed based on the -309 fragment from Stålberg and colleagues (1993), spanning bases -309 to +44 relative to the CAP binding site. FP1 (-309+44) was synthesised as a gBlock® (IDT) containing AscI restriction enzyme cleavage sites flanking the truncated promoter sequence. The gBlock fragment was A-tailed and cloned into the vector pGEM®-T Easy, transformed into TOP10 E. coli, miniprepped and sequence verified via Sanger sequencing (AGRF, Brisbane). Restriction enzyme cloning using the AscI enzyme (NEB) was used to insert FP1 into the vector pOH123 in place of the Oleosin promoter inserted previously (Figure 2.1). This created the new vector, pFH123, detailed in Figure 4.1. The new vector was transformed into TOP10 E. coli. Colony PCR was used to identify successful transformants and to confirm correct directional insertion of the promoter. Sequences were verified via Sanger sequencing (AGRF, Brisbane). The vector pFH123 was used as a destination vector for Gateway® cloning of the GOIs used in this study.


Truncated FP1 promoter
GATAGCTTGGCGCGCCTCTCATCCCCTTTTAAACCAACTTAGTAAACGTT 50
TTTTTTTTTAATTTTATGAAGTTAAGTTTTTACCTTGTTTTTAAAAAGAA 100
TCGTTCATAAGATGCCATGCCAGAACATTAGCTACACGTTACACATAGCA 150
TGCAGCCGCGGAGAATTGTTTTTCTTCGCCACTTGTCACTCCCTTCAAAC 200
ACCTAAGAGCTTCTCTCTCACAGCACACACATACAATCACATGCGTGCAT 250
GCATTATTACACGTGATCGCCATGCAAATCTCCTTTATAGCCTATAAATT 300
AACTCATCCGCTTCACTCTTTACTCAAACCAAAACTCATCAATACAAACA 350
AGATTAAAAACATACACGAGGCGCGCCTCGAGGGGG 386

Figure 4.1: Design of pFH123. The truncated napin promoter, FP1, was designed and annotated based on Stålberg et al., 1993. The promoter contains bases -309 to +44 relative to the CAP binding site (black box indicated on promoter map). pFH123 is a pMDC123-derived vector (TAIR accession = Vector:1009003750) containing the FP1 promoter and Hsp7 terminator (pink) to control expression of the gene of interest. Cloning is carried out using the Gateway® method via the att sites indicated in yellow. Kanamycin and glufosinate ammonium (bar) resistance genes for selection are indicated in blue. Sequence maps were created using SnapGene®.
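The gBlock design can be sanity-checked directly from the sequence printed in Figure 4.1. The Python sketch below locates the AscI recognition sequence (GGCGCGCC) and measures the stretch between the two flanking sites. The sequence is transcribed from the figure; the helper name `find_sites` is hypothetical, and the check covers recognition sites only, not cut positions or insert orientation.

```python
ASCI_SITE = "GGCGCGCC"  # AscI recognition sequence (its own reverse complement)

# gBlock sequence transcribed from Figure 4.1, 50 bases per line.
FP1_GBLOCK = (
    "GATAGCTTGGCGCGCCTCTCATCCCCTTTTAAACCAACTTAGTAAACGTT"
    "TTTTTTTTTAATTTTATGAAGTTAAGTTTTTACCTTGTTTTTAAAAAGAA"
    "TCGTTCATAAGATGCCATGCCAGAACATTAGCTACACGTTACACATAGCA"
    "TGCAGCCGCGGAGAATTGTTTTTCTTCGCCACTTGTCACTCCCTTCAAAC"
    "ACCTAAGAGCTTCTCTCTCACAGCACACACATACAATCACATGCGTGCAT"
    "GCATTATTACACGTGATCGCCATGCAAATCTCCTTTATAGCCTATAAATT"
    "AACTCATCCGCTTCACTCTTTACTCAAACCAAAACTCATCAATACAAACA"
    "AGATTAAAAACATACACGAGGCGCGCCTCGAGGGGG"
)


def find_sites(sequence: str, site: str) -> list:
    """Return the 0-based start index of every occurrence of `site`."""
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        start = sequence.find(site, start + 1)
    return positions


sites = find_sites(FP1_GBLOCK, ASCI_SITE)
# Bases lying between the end of the first site and the start of the last site.
between_sites = FP1_GBLOCK[sites[0] + len(ASCI_SITE):sites[-1]]
print(len(FP1_GBLOCK), sites, len(between_sites))
```

With the sequence as transcribed, the region between the two AscI sites is 353 bases long, which is the length expected for a fragment running from -309 to +44 if the numbering has no position zero.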


4.2.3 Constructs containing the genes of interest

To test the effect of seed-specific promoters, Gateway® cloning was used to insert each gene of interest (GOI) into the destination vectors pOH123 (developed previously in-house) and pFH123 under the regulation of the Oleosin and FP1 promoters respectively, as well as into the pEAQ-HT-Dest1 vector under the control of the CaMV 35S promoter, the standard vector used in our N. benthamiana assays. All constructs were sequence verified (AGRF, Brisbane) and transformed into electrocompetent A. tumefaciens LBA4404 cells for agroinfiltration into N. benthamiana. For a list of GOIs used in this study, see Table 4.1.

Table 4.1: Genes of interest and vectors used in Chapter 4

GOI expression construct | Description | Cyclotide product | Monoisotopic mass [M+H]+, cyclic | Monoisotopic mass [M+H]+, linear
OakOpt+UTR | Optimised Oak1 gene that encodes kalata B1. Readily expressed transiently in N. benthamiana when co-infiltrated with the AEP, OaAEP1b. | kalata B1 | 2891.21 | 2909.21
PPI_kB1 | Coding sequence for kalata B1 preceded by potato protease inhibitor (PPI) delta vacuole targeting sequence. | kalata B1 | 2891.21 | 2909.21
NaD1_kB1 | NaD1 gene from Nicotiana alata with the coding sequence for the NaD1 defensin replaced with kalata B1. | kalata B1 | 2891.21 | 2909.21
Oak_MCoTI-II | Oak1 gene with the coding sequence for kalata B1 replaced with MCoTI-II, and with the native MCoTI-II CTPP ‘ALEG’. | MCoTI-II | 3451.56 | 3469.56
Oak_L5[G1, G33]_Abl06 | Oak1 gene with the coding sequence for kalata B1 replaced with L5[G1, G33]_Abl06 from Chapter 2. | L5[G1, G33]_Abl06 | 4095.94 | 4113.94
Paws1_L5[G1, G33]_Abl06 | Paws1 gene from Helianthus annuus with the coding sequence for SFTI-1 replaced with L5[G1, G33]_Abl06 from Chapter 2. | L5[G1, G33]_Abl06 | 4095.94 | 4113.94

Vector | Description | Promoter type | Reference
pEAQ | pEAQ-HT-Dest1 containing the constitutive 35S promoter, the CPMV hypertrans expression system, and the p19 suppressor of gene silencing | Constitutive | Sainsbury et al., 2009
pOH123 | pMDC123-derived vector containing the A. thaliana Oleosin promoter and Hsp7 terminator | Seed-specific | See Figure 2.1
pFH123 | pMDC123-derived vector containing the truncated B. napus napin promoter FP1, and Hsp7 terminator | Seed-specific | See Figure 4.1
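The masses in Table 4.1 are what allow a detected peak to be assigned to the cyclic or linear form of each cyclotide; in every row the linear value is 18.00 higher than the cyclic value, corresponding to one water. The Python sketch below shows one way such an assignment could be written. The function name `assign_peak` and the 0.5 Da tolerance are assumptions made for illustration only and are not taken from the analysis performed in this chapter.

```python
from typing import Optional, Tuple

# Monoisotopic [M+H]+ values copied from Table 4.1.
EXPECTED_MASSES = {
    "kalata B1": {"cyclic": 2891.21, "linear": 2909.21},
    "MCoTI-II": {"cyclic": 3451.56, "linear": 3469.56},
    "L5[G1, G33]_Abl06": {"cyclic": 4095.94, "linear": 4113.94},
}


def assign_peak(
    observed_mz: float, tolerance_da: float = 0.5
) -> Optional[Tuple[str, str]]:
    """Return (cyclotide, form) for the closest expected mass within tolerance.

    The default tolerance is a hypothetical placeholder, not an instrument
    specification. Returns None when nothing matches.
    """
    best = None
    for name, forms in EXPECTED_MASSES.items():
        for form, expected in forms.items():
            error = abs(observed_mz - expected)
            if error <= tolerance_da and (best is None or error < best[0]):
                best = (error, name, form)
    return None if best is None else (best[1], best[2])


print(assign_peak(2891.30))
print(assign_peak(3469.50))
print(assign_peak(3000.00))
```

Because the cyclic and linear forms are 18 Da apart, any tolerance well below that difference keeps the two assignments unambiguous.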

4.2.4 Sequencing to confirm LEC2 binding site in the Oleosin promoter

The vector pOH123 was previously assembled in our laboratory by inserting an Oleosin promoter and Hsp7 terminator into the pMDC100 vector (TAIR accession: 1009003749). pOH123 was sent to AGRF Brisbane for sequencing using a reverse primer for a gene of interest previously cloned into the plasmid. The sequence was aligned with the A. thaliana gene for Oleosin (GenBank: X62353.1) using the LALIGN tool (https://embnet.vital-it.ch/software/LALIGN_form.html) with default alignment settings.


4.2.5 Transient expression in N. benthamiana

Individual colonies of A. tumefaciens carrying the expression vectors were grown as 5 mL starter cultures in LB broth supplemented with 50 µg/mL rifampicin, 50 µg/mL kanamycin and 1 mM MgSO4 overnight at 28°C with shaking. Cultures were then scaled up to the desired volume in LB containing 1 mM MgSO4 and 10 µM acetosyringone with no antibiotics. Larger cultures were grown to stationary phase before centrifugation and resuspension in MES infiltration buffer (10 mM MES salts, 10 mM MgCl2, 100 mM acetosyringone), where they were left at room temperature for three hours before infiltration. Cultures were adjusted to an OD600 of 1 and mixed at a 1:1:1 ratio of GOI:AEP:LEC2. For treatments without AEP or without LEC2, empty A. tumefaciens LBA4404 cells at the same OD600 were substituted so that each expression vector remained at a final OD600 of 0.33. Nicotiana benthamiana seedlings were grown in Jiffy pellets (42 mm Jiffy-7® Pellet) for 4–5 weeks in controlled conditions: 26°C and 16:8 photoperiod. Leaves were infiltrated by hand with the A. tumefaciens co-infiltration mixes using a needle-less syringe, and infiltrated areas were outlined in black marker. Plants were harvested 6 days post infiltration: infiltrated regions of the leaves were excised, snap-frozen in liquid nitrogen and lyophilised overnight before peptide extraction.
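The mixing scheme can be made explicit with a short calculation. The Python sketch below shows how three cultures at OD600 1 combine to give each vector a final OD600 of 0.33, and how empty LBA4404 cells fill the place of an omitted AEP or LEC2 culture. The function names and the 3 mL total volume are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def final_od600(stock_od600: float, parts: int) -> float:
    """OD600 contributed by one culture after mixing `parts` equal volumes."""
    return stock_od600 / parts


def coinfiltration_mix(
    components: List[str], total_volume_ml: float, slots: int = 3
) -> Dict[str, float]:
    """Volume (mL) of each culture in a mix with a fixed number of equal slots.

    Slots not taken by a listed component are filled with empty LBA4404 cells,
    so every expression vector stays at the same final density.
    """
    if len(components) > slots:
        raise ValueError("more components than slots in the mix")
    volume_each = total_volume_ml / slots
    mix = {name: volume_each for name in components}
    empty_slots = slots - len(components)
    if empty_slots:
        mix["empty LBA4404"] = volume_each * empty_slots
    return mix


print(round(final_od600(1.0, 3), 2))
print(coinfiltration_mix(["GOI", "AEP", "LEC2"], 3.0))
print(coinfiltration_mix(["GOI", "AEP"], 3.0))  # no-LEC2 treatment
print(coinfiltration_mix(["GOI"], 3.0))         # control without AEP or LEC2
```

Keeping the total bacterial load constant in this way means that differences between treatments are less likely to reflect differences in Agrobacterium density.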

4.2.6 Peptide extraction from N. benthamiana leaves

Dry tissue was weighed and then ground using a Geno/Grinder® (SPEX®) at 1000 rpm for 1 minute before adding extraction solvent (50% acetonitrile, 1% formic acid) at a ratio of 20 µL per mg dry leaf weight. Extractions were vortexed for two hours before centrifugation to remove sedimented leaf material.

4.2.7 Cyclotide content analysis using MALDI-MS

Leaf extracts were diluted 1:10 in 75% acetonitrile containing peptide standard for relative quantification. Diluted extract was mixed 1:1 with matrix (α-Cyano-4-hydroxycinnamic acid) and analysed using MALDI-MS (AB SCIEX 5800 MALDI TOF/TOF™ system). Cluster peak areas for detected masses were used for relative quantification to the internal peptide standard.
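The arithmetic behind Sections 4.2.6 and 4.2.7 is simple enough to sketch in Python: solvent volume scales with dry weight, and each cyclotide signal is expressed as a ratio to the internal standard. The function names and the example numbers are hypothetical, and the ratio shown is the plainest reading of "relative quantification"; any further normalisation applied in the actual analysis is not represented here.

```python
SOLVENT_UL_PER_MG = 20.0  # Section 4.2.6: 20 µL solvent per mg dry leaf weight


def extraction_volume_ul(dry_weight_mg: float) -> float:
    """Volume of extraction solvent (µL) for a given dry leaf weight (mg)."""
    if dry_weight_mg <= 0:
        raise ValueError("dry weight must be positive")
    return dry_weight_mg * SOLVENT_UL_PER_MG


def relative_abundance(analyte_area: float, standard_area: float) -> float:
    """Cluster peak area of the analyte relative to the internal standard."""
    if standard_area <= 0:
        raise ValueError("internal standard peak not detected")
    return analyte_area / standard_area


# Hypothetical values, for illustration only.
print(extraction_volume_ul(12.5))          # µL of solvent for 12.5 mg of tissue
print(relative_abundance(1500.0, 3000.0))  # analyte signal relative to standard
```

Because the standard is added at the same concentration to every diluted extract, ratios computed this way can be compared across samples even when absolute MALDI-MS intensities vary between spots.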

4.2.8 Transient expression in CRISPR-Cas9 AEP-knockout lines of N. benthamiana

An AEP-null line of N. benthamiana was developed in-house by Dr. Edward Gilding using CRISPR-Cas9 to disrupt four endogenous AEPs in the plant. CRISPR-Cas9 edited plants were grown as described for N. benthamiana above and infiltrated accordingly at six weeks. Leaf tissue was harvested and analysed as above.

4.3 Results

To assess the effect of LEC2 expression on recombinant cyclotide production in N. benthamiana, cyclotide-encoding genes were expressed alongside LEC2 and OaAEP1b in various combinations. The expression constructs were made by combining cyclotides and precursor genes (Table 4.1, Figure 4.2) and were inserted into three expression vectors. The pEAQ-HT-Dest1 vector (Sainsbury et al., 2009) was included as an established vector for use in N. benthamiana and as a positive control for the expression of OakOpt+UTR. Two vectors containing seed-specific promoters were also used to drive cyclotide expression. The Oleosin promoter in the vector pOH123 was previously prepared in-house, and the FP1 promoter was replicated from a prior study of LEC2 expression in N. benthamiana (Petrie et al., 2010), based on the most active truncated version of the B. napus napin promoter (Stålberg et al., 1993).

Figure 4.2: Experimental design to assess the effect of LEC2 expression on cyclotide accumulation in Nicotiana benthamiana. Kalata B1 was cloned into three precursor genes: OakOpt+UTR, PPI_kB1 and NaD1_kB1. Each of these precursors was cloned into the expression vectors pEAQ-HT-Dest1, pOH123 and pFH123. Co-expression in N. benthamiana included a 1:1:1 mixture of A. tumefaciens containing LEC2 (green), cyclotide (blue) and OaAEP1b (orange).


4.3.1 The effect of LEC2 on kalata B1 expression

Co-expression with LEC2 and OaAEP1b increases cyclic kalata B1

The cyclotide kalata B1 was transiently produced in N. benthamiana from three different precursor proteins and in each of the three vectors: pEAQ-HT-Dest1, pOH123 and pFH123. These constructs were co-infiltrated into N. benthamiana leaves with both LEC2 and OaAEP1b, LEC2 only, OaAEP1b only, or neither LEC2 nor OaAEP1b (control). The accumulation of cyclotides was detected using MALDI-MS of plant extracts. It was predicted that little linear product would be detected because of the presence of the cyclising AEP, and that LEC2 would enhance cyclotide accumulation. To see whether cyclic product could be expressed in N. benthamiana leaves, the two treatments including OaAEP1b co-expression (LEC2+AEP and AEP-only) were analysed first. As shown in Figure 4.3, co-expression of LEC2 with the three kalata B1 precursor genes resulted in a relative increase in cyclic product in all cases, except when the precursors were expressed from the pEAQ-HT-Dest1 vector. The difference between kalata B1 detection with and without LEC2 co-expression was statistically significant when OakOpt+UTR and NaD1_kB1 were expressed from the seed-specific promoters, whereas for PPI_kB1 this trend was only statistically significant in the case of the FP1 promoter. The increase in cyclic kalata B1 was not accompanied by an increase in linear kalata B1. For the alternative precursor frameworks of PPI and NaD1, kalata B1 yields were best when expression was driven by the seed-specific promoters, but only when co-infiltrated with LEC2.


Figure 4.3: The effect of LEC2 on the expression of kalata B1 when driven by three expression cassettes. Each bar graph shows relative quantification of kalata B1 masses identified using MALDI-MS to analyse N. benthamiana extracts harvested 6 days post infiltration with agrobacteria carrying the plasmid and gene of interest as indicated. Relative quantification was calculated against an internal standard of known molecular weight and concentration added to the samples post-extraction. AEP = OaAEP1b. Error bars represent SEM. n=4 biological replicates. Hollow circles represent individual replicate values. Two-way ANOVA was carried out between treatments on each graph (cyclic versus cyclic, and linear versus linear); statistically significant differences are indicated. P ≤ 0.05 (*), P ≤ 0.01 (**).
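The error bars in Figure 4.3 are standard errors of the mean across four biological replicates. As a reminder of how that statistic is obtained, the Python sketch below computes the mean and SEM using only the standard library. The replicate values are hypothetical placeholders and are not data from this study; the two-way ANOVA itself is not reproduced here.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_and_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Return the mean and the standard error of the mean (sample SD / sqrt(n))."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed for an SEM")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


# Hypothetical relative-quantification values for n = 4 replicates.
hypothetical_replicates = [0.8, 1.0, 1.2, 1.0]
mean_value, sem_value = mean_and_sem(hypothetical_replicates)
print(mean_value, sem_value)
```

With only four replicates the SEM is sensitive to a single outlying value, which is one reason the individual replicate points are plotted alongside the bars.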

Linear kalata B1 accumulates in a LEC2-induced environment when OaAEP1b is absent

To assess whether LEC2 expression would also affect the accumulation of linear cyclotide peptides, treatments were carried out without co-expression of OaAEP1b. Without a cyclising AEP in the system, predominantly linear cyclotides are produced (Poon et al., 2018), but small linear peptides are prone to degradation. It was therefore hypothesised that a LEC2-induced environment that is less proteolytic might result in an increase in linear peptide accumulation.


A higher amount of linear cyclotide was observed in the presence of LEC2, but once again the exception was for the precursors expressed from the pEAQ-HT-Dest1 vector which did not respond to LEC2 co-expression (Figure 4.4). For constructs expressed from the seed-specific promoters, the trend was not statistically significant in most cases, possibly because the MALDI-MS signals were relatively low, making estimation of peak cluster area less precise. However, the general increase in kalata B1 precursor in the presence of LEC2 is consistent for each vector and might suggest an underlying mechanism for increased cyclic kalata B1 seen in the corresponding treatments with OaAEP1b outlined above (Figure 4.3).

Figure 4.4: The effect of LEC2 on expression of kalata B1 in the absence of OaAEP1b. N. benthamiana leaves were infiltrated with agrobacteria carrying the plasmid and gene of interest as indicated and were harvested 6 days post infiltration. Each bar graph shows relative quantification of kalata B1 masses from leaf extracts identified using MALDI-MS. Relative quantification was calculated against an internal standard of known molecular weight and concentration added to the samples post-extraction. AEP = OaAEP1b. Error bars represent SEM. n=4 biological replicates. Hollow circles represent individual replicate values. Two-way ANOVA was carried out between treatments on each graph (cyclic versus cyclic, and linear versus linear); statistically significant differences are indicated. P ≤ 0.05 (*), P ≤ 0.01 (**).


In summary, the accumulation of cyclic or linear kalata B1 in N. benthamiana leaf tissue increased upon co-expression with LEC2 in all cases where a seed-specific promoter was used to drive cyclotide expression (pOH123, pFH123) but not when cyclotide expression was driven by the pEAQ-HT-Dest1 vector. LEC2 enhanced the accumulation of kalata B1 in either cyclic or linear form, depending on the presence or absence of AEP respectively.

4.3.2 Comparing the three vectors: pEAQ-HT-Dest1, pOH123 and pFH123

To best interpret the effects of the vectors on yield, the data from Figures 4.3 and 4.4 were assembled together in Figure 4.5 with normalised scales. For the OakOpt+UTR gene, expression from the pEAQ-HT-Dest1 vector had a greater effect on kalata B1 expression than LEC2 co-expression, but the vectors containing seed-specific promoters were more beneficial for the PPI-kB1 and NaD1-kB1 precursor genes in a LEC2-induced environment. For the OakOpt+UTR precursor, cyclic product was detected up to eight times more strongly from the pEAQ-HT-Dest1 vector than from the pOH123 vector, and four times more strongly than from the pFH123 vector (Figure 4.5, top row panels). For the other two precursor genes, an influence of the expression vector is less clear, but it is clear that for pOH123 and pFH123 the combination of LEC2 and OaAEP1b co-expression could increase cyclic kalata B1 detection levels, and that expression from the pEAQ-HT-Dest1 vector could not respond in this way.


Figure 4.5: General trends in the use of three different vectors and three different precursors for kalata B1 expression in the presence and absence of LEC2. Each bar graph shows relative quantification of kalata B1 cyclic and linear masses identified using MALDI-MS to analyse N. benthamiana extracts harvested 6 days post infiltration with A. tumefaciens carrying the plasmid and gene of interest as indicated. Relative quantification was calculated against an internal standard of known molecular weight and concentration added to the samples post-extraction. AEP = OaAEP1b. Error bars represent SEM. n=4 biological replicates. The Y axis is equally scaled for each precursor gene (each row of panels).


4.3.3 Co-expression of MCoTI-II and LEC2

For expression of MCoTI-II in N. benthamiana leaves the Oak1 gene was altered to replace the encoded kalata B1 domain with that of MCoTI-II. Using this strategy, the new construct Oak_MCoTI-II should be directed to a cellular compartment compatible with AEP-mediated processing. The CTPP, ALEG, was retained from the native MCoTI-II precursor as previous studies demonstrated that it is compatible with OaAEP1b (Chapter 3). The Oak_MCoTI-II construct was expressed from the pEAQ-HT-Dest1, pOH123 and pFH123 vectors with and without LEC2 and OaAEP1b (Figure 4.6).

Figure 4.6: The experimental design for LEC2 and Oak_MCoTI-II co-expression. MCoTI-II was cloned into the Oak1 precursor gene. Oak_MCoTI-II was cloned into expression vectors pEAQ-HT-Dest1, pOH123 and pFH123. Co-expression in N. benthamiana included a 1:1:1 mixture of A. tumefaciens containing LEC2 (green), cyclotide (red) and OaAEP1b (orange).

Unfortunately, MALDI-MS signals were relatively low irrespective of the vector used for Oak_MCoTI-II expression. As for the Oak1 genes, the introduction of LEC2 expression did not have any effect on expression from the pEAQ-HT-Dest1 vector, but did marginally influence the levels of MCoTI-II produced from Oak_MCoTI-II using the seed-specific promoters, as shown in Figure 4.7. For the vectors containing a seed-specific promoter driving cyclotide expression (Figure 4.7, middle and right panels), the combination of LEC2 and OaAEP1b co-expression was the only treatment in which cyclic or linear MCoTI-II was detected, with the exception of two individual cases of linear MCoTI-II being detected in either AEP-only or LEC2-only treatments for pOH123. This suggests a positive effect of the LEC2-induced environment on expression of MCoTI-II, although the signals only allowed determination of presence or absence at levels near the practical detection limit of the material and equipment used.


Figure 4.7: The effect of LEC2 on the expression of MCoTI-II driven by three expression cassettes. Each bar graph shows relative quantification of MCoTI-II cyclic and linear masses identified using MALDI-MS to analyse N. benthamiana extracts harvested 6 days post infiltration with agrobacteria carrying the plasmid and gene of interest as indicated. Relative quantification was calculated against an internal standard of known molecular weight and concentration added to the samples post-extraction. AEP = OaAEP1b. Error bars represent SEM. n = 4 biological replicates. The Y axes are scaled equally between each graph until 0.01 relative peak area.

4.3.4 Background Oleosin promoter activity is probably not due to mobile auxin activity induced elsewhere on the plant

The Oleosin promoter was used in conjunction with LEC2 co-expression with the hypothesis that, as a seed-specific promoter, it would only be active in leaves in a LEC2-induced environment. However, some cyclotide expression was detected in infiltrations without LEC2 expression. This was the case for both kalata B1 (Figures 4.3 and 4.4) and Oak_MCoTI-II (Figure 4.7). The Oleosin gene is highly seed-specific (TRAVA, 2020; Klepikova et al., 2016) and no cases of leaky promoter activity were identified in literature searches. One possible explanation for this unexpected promoter activity was auxin induction and transport from other leaves on the same plant that were infiltrated with LEC2, as initial experiments were conducted with multiple treatments per plant. As auxin is a mobile plant hormone, it was hypothesised that LEC2 expression in one leaf spot could lead to auxin biosynthesis and subsequent transport around the plant, inducing embryogenesis elsewhere and thus enabling some Oleosin promoter activity. To test this hypothesis, LEC2-negative treatments were infiltrated on separate plants from LEC2 treatments, and the plants were kept in separate trays. The Oleosin-driven, LEC2-negative treatments still produced detectable peptides (Figure 4.8, left panel), suggesting that LEC2 expression elsewhere in the plant probably does not affect Oleosin promoter activity. The general pattern of expression was similar to that of the equivalent treatments randomly distributed across plants (Figure 4.8, right panel). Note that the Y axis scales differ between the panels because a different peptide standard was used for relative quantification, but the ratios of cyclic to linear cyclotide follow the same trend.


Figure 4.8: The expression of LEC2 and non-LEC2 treatments on separate plants. The pOH123:OakOpt+UTR plasmid was co-infiltrated with both LEC2 and OaAEP1b, OaAEP1b only, LEC2 only or on its own, with each treatment infiltrated on to separate plants (left) or randomly distributed (right). N. benthamiana leaf tissue was harvested 6 days post infiltration and extracts were assessed using MALDI-MS. Bars represent relative peak area compared to the internal standard peptide included in the samples post-extraction. Note that different internal standard peptides were used for each panel resulting in varied Y axis ranges. Error bars represent SEM, hollow circles represent individual biological replicates. n=6 (left panel) and 4 (right panel). AEP = OaAEP1b

Figure 4.9: The effect of younger or older infiltrated leaves on kalata B1 expression. The pOH123:OakOpt+UTR plasmid was co-infiltrated with both LEC2 and OaAEP1b, OaAEP1b only, LEC2 only or on its own, with each treatment infiltrated on to separate plants, and with three replicates of either younger (youngest mature leaf) or older (second-youngest mature leaf) leaves infiltrated. Bars represent mean relative peak area for peptide masses detected in leaf extracts from infiltrated leaves 6 days post infiltration. Error bars = SEM, n=3 biological replicates. Hollow circles represent each replicate value.

In the same experiment, each treatment was infiltrated into two leaves on the same plant. This was replicated three times (three different plants) to look at the effect of older and younger leaves on kalata B1 expression. The data from Figure 4.8 were separated according to younger or older leaves and analysed in Figure 4.9. According to 2-way ANOVA, no significant difference was detected in the relative amount of product between younger and older leaves for the same treatment. There does appear to be a higher relative amount of linear product accumulating in the LEC2-only treatment in younger leaves, but this is probably due to an outlier in the data which might not represent a typical response (see younger leaf: LEC2 treatment in Figure 4.9). The wider distribution of the individual replicate values for the younger leaves might reflect the growth of leaf cells over the six-day transient expression period.

4.3.5 The Oleosin promoter used in this study contains the RY elements necessary for LEC2 transcription factor binding

The pOH123 vector was adopted for use in this study as a convenient in-house resource containing a seed-specific promoter. Coincidentally, the LEC2 transcription factor is known to be a direct activator of the Oleosin promoter in A. thaliana (Che et al., 2009), introducing the possibility of a direct interaction in addition to the general promotion of a seed-like environment that was the primary aim of the study. As the vector was constructed some years prior to the current study, it was sequenced to confirm the origin and length of the Oleosin promoter used. The pOH123 vector sequence containing the Oleosin promoter aligned perfectly with the -200 base pair region of the A. thaliana Oleosin gene (Figure 4.10), indicating that the promoter was of A. thaliana origin and that the sequence was not altered before incorporation into the plasmid pOH123. The RY elements described by Che and colleagues (2009) were identified in both the native sequence and the pOH123 sequence (Figure 4.10), indicating that the Oleosin promoter used here contains the binding motif necessary for the LEC2 transcription factor.

Figure 4.10: Analysis of the -200 bp region of the Oleosin promoter and pOH123. The A. thaliana Oleosin promoter region (Genbank: X62353.1) was aligned to the pOH123 sequence used in this study using the LALIGN tool (Pearson, 2019). RY elements necessary for LEC2 transcription factor binding (Che et al., 2009) are shown in boxes. Mismatched nucleotides are indicated by an asterisk.
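Locating RY elements in a promoter sequence can be sketched as a simple motif scan on both strands. The sketch below assumes the commonly cited RY core motif CATGCA; the exact motif definition used by Che and colleagues is not restated in this chapter, and the example sequence is hypothetical rather than the pOH123 or X62353.1 sequence.

```python
RY_CORE = "CATGCA"  # assumed RY core motif, not taken from this chapter


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif(seq, motif=RY_CORE):
    """Return sorted (0-based position, strand) hits for a motif on both strands."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            # Step by one base so that overlapping matches are also reported.
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical promoter fragment with one hit on each strand.
example = "aaCATGCAttTGCATGcc"
print(find_motif(example))
```

Running the same scan on two aligned sequences and comparing the hit positions would be one way to check that the elements are conserved between a native promoter and a cloned copy.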


4.3.6 Expression of cyclotides in AEP-knockout N. benthamiana

The absence of endogenous AEPs in N. benthamiana could be beneficial for cyclotide production by reducing the risk of interfering hydrolase-type AEPs that might degrade cyclotides or their precursors. Early pre-validated N. benthamiana knockout lines were used as a first look at the effect of AEP knockouts, and to try to identify whether interference from endogenous AEPs could explain the lack of MCoTI-II expression in the leaves of this plant. The OakOpt+UTR construct was also included as a positive control for general expression in N. benthamiana. Mature, cyclic kalata B1 was detected in both the CRISPR knockout lines and the wild-type lines of N. benthamiana, although the signal was weak (Figure 4.11) compared with the relatively intense kalata B1 peaks usually seen in wild-type N. benthamiana. This was probably due to the mature stage of the plants when infiltrated (6 weeks post germination), as indicated by the relatively low expression in both gene-edited and wild-type plants at the same stage (Figure 4.12). A trend towards increased cyclic product accumulation in AEP-knockout lines was observed but should be interpreted with caution because of the low signal intensity compared with the usually abundant kalata B1 signals in younger plants.

MCoTI-II and analogues were not detected in N. benthamiana knockout lines with or without LEC2 co-expression

MCoTI-II and L5[G1, G33]_Abl06 were co-infiltrated into AEP-knockout and WT plants either with or without OaAEP1b and LEC2. No peptides or precursor peptides were detected in either wild-type or AEP-knockout plants. The peptides might have been degraded before contact with OaAEP1b or the plants were simply too old for efficient recombinant protein production, as indicated by the low levels of kalata B1 (Figures 4.11, 4.12).


Figure 4.11: The detection of kalata B1 in AEP knockout lines of N. benthamiana. MALDI-MS traces are displayed showing the masses corresponding to cyclic and linear kalata B1 from N. benthamiana extracts, harvested and extracted 6 days post infiltration with pEAQ-HT-Dest1 plasmid harbouring the OakOpt+UTR gene for kalata B1. Cyclic peptide signals highlighted by the navy boxes. Each trace represents one biological replicate. A signal at 2954 m/z represents the internal peptide standard spiked in to the extraction for relative quantification.
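Assigning MALDI-MS peaks to cyclic or linear product can be sketched as a mass comparison, on the assumption that the linear form of a cyclotide domain is heavier than the backbone-cyclised form by one water molecule (about 18.01 Da). The cyclic mass and tolerance used below are illustrative placeholders and not the values used for the analysis in this chapter; only the 2954 m/z internal standard is taken from the text.

```python
WATER_MASS = 18.0106  # monoisotopic mass of H2O in Da


def classify_peak(observed_mz, cyclic_mz, tolerance=0.5):
    """Label a peak as 'cyclic', 'linear' or None by comparison to expected masses.

    The linear product is assumed to be the cyclic product plus one water.
    """
    if abs(observed_mz - cyclic_mz) <= tolerance:
        return "cyclic"
    if abs(observed_mz - (cyclic_mz + WATER_MASS)) <= tolerance:
        return "linear"
    return None


# Placeholder expected m/z for a cyclic peptide (hypothetical value).
expected_cyclic = 2891.0
for peak in (2891.2, 2909.1, 2954.0):
    print(peak, classify_peak(peak, expected_cyclic))
```

In this sketch the internal standard at 2954 m/z is left unassigned, as it matches neither expected mass within the tolerance.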

Figure 4.12: Relative quantification of kalata B1 from OakOpt+UTR precursor in AEP knockout N. benthamiana lines. N. benthamiana leaf tissue infiltrated with OakOpt+UTR gene for kalata B1 was harvested 6 days post infiltration and extracted in 50% ACN, 1% FA. After removal of solid material via centrifugation, a 2954 Da peptide was spiked in to the extracts for relative quantification. Bars represent average relative peak area compared to the 2954 Da peptide standard. Error bars are standard error of the mean, based on three biological replicates.


4.3.7 Summary of cyclotides detected and not detected in N. benthamiana

In this chapter, each GOI was co-infiltrated with both LEC2 and OaAEP1b, OaAEP1b only, LEC2 only, or neither LEC2 nor OaAEP1b, and cyclotides were detected from four of the seven GOI constructs in some form in N. benthamiana leaf assays (Table 4.2). The grafted L5[G1, G33]_Abl06 cyclotide was not detected when expressed from either the Oak1 precursor (Oak_L5[G1, G33]_Abl06) or the Paws1 precursor (Paws1_L5[G1, G33]_Abl06) under any conditions tested. MCoTI-II analogues that were demonstrated to be compatible with OaAEP1b in vitro in Chapter 3 of this thesis were not detected, suggesting that AEP-compatibility might not be the primary issue for accumulation of cyclic product in N. benthamiana.

Table 4.2: Positive detection of cyclotides and linear products of GOIs from different vectors used in this study.

Treatments: AEP+LEC2, AEP, LEC2, Control (each scored for Cyclic and Linear product)
GOI: Vector
OakOpt+UTR: pEAQ, OH123, FH123
NaD1-kB1: pEAQ, OH123, FH123
kB1-PPI: pEAQ, OH123, FH123
Oak_MCoTI-II: pEAQ, OH123, FH123
Oak_L5[G1, G33]_Abl06: pEAQ, OH123, FH123
Paws1_L5[G1, G33]_Abl06: pEAQ, OH123, FH123

pEAQ refers to the pEAQ-HT-Dest1 vector containing the constitutive 35S promoter, pOH123 contains the Oleosin promoter and pFH123 contains the truncated napin promoter, FP1, from B. napus. A checkmark indicates that the expected peptide mass was detected in one or more biological replicates infiltrated with agrobacteria harbouring the plasmid and GOI. A cross mark indicates that no peptide of expected mass was detected in any biological replicates carried out throughout the experiments in this chapter.
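The scoring rule stated in the table footnote (a checkmark if the expected mass was seen in one or more biological replicates, a cross if it was seen in none) can be written as a short sketch. The treatment names follow the table, but the replicate observations below are hypothetical and do not reproduce the results of Table 4.2.

```python
def detected(replicate_observations):
    """Return True if the expected mass was seen in at least one replicate."""
    return any(replicate_observations)


def score(observations):
    """Map each (treatment, product) key to 'check' or 'cross'."""
    return {
        key: "check" if detected(replicates) else "cross"
        for key, replicates in observations.items()
    }


# Hypothetical presence/absence calls for four biological replicates.
example = {
    ("AEP+LEC2", "cyclic"): [False, True, False, False],
    ("AEP+LEC2", "linear"): [True, True, False, True],
    ("Control", "cyclic"): [False, False, False, False],
}
print(score(example))
```

A single positive replicate is enough for a checkmark under this rule, so the table records presence or absence rather than reproducibility or abundance.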

4.4 Discussion

Expression of recombinant LEC2 in N. benthamiana leaves has been used as a tool for metabolic engineering via the activation of seed-specific promoters (Petrie et al., 2010). It was hypothesised that this technique could be applied not only to the activation of seed-specific promoters, but also to the accumulation of recombinant cyclotides if a protein storage environment were induced in leaves. Here, LEC2 expression affected cyclotide production in leaf tissue when seed-specific promoters were driving kalata B1 expression. LEC2 also enabled the detection of cyclic MCoTI-II for the first time in leaf tissue, albeit at very low signal intensity. An increase in cyclic kalata B1 was observed when Oak1 was co-expressed with LEC2 and OaAEP1b, and an increase in linear kalata B1 was observed when OaAEP1b was absent. This increase could be a result of stabilised linear and/or cyclic product facilitated by LEC2 expression. LEC2 can influence the macromolecule composition of target tissues, which might be a key factor in promoting linear cyclotide accumulation via an increased capacity of the leaf to store proteins. Transgenic N. benthamiana plants with constitutive LEC2 expression had oil bodies present in roots and ovules that were not present in wild-type plants (Stone et al., 2008), and a transgenic LEC2 fusion protein has been shown to accumulate very long chain fatty acids in A. thaliana leaves (Santos Mendoza et al., 2005). More specifically, it appears as though LEC2 causes a shift in carbon partitioning away from sugars and carbohydrates and towards oils and protein. Angeles-Núñez and Tiessen showed a 50% decrease in seed fatty acids and 38% lower protein content in seeds of lec2 mutants of A. thaliana (Angeles-Núñez and Tiessen, 2011). In the context of the research presented here, a preference for protein storage caused by LEC2 could result in the persistence of linear cyclotide precursors in the cell, which in turn are cyclised by the co-expressed AEP.
Another possibility is that cyclic products are stored more efficiently after AEP-mediated cyclisation, reducing any loss of the final product to endogenous protease activity or re-processing by OaAEP1b. Overall, the accumulation of linear cyclotide precursor in LEC2-induced, OaAEP1b-deficient treatments indicates that a protein-storage preference, or a reduction in protease activity, has been induced in N. benthamiana leaves, resulting in higher accumulation of recombinant cyclotide precursor. To evaluate the effects of LEC2 expression, a histological approach could be taken to stain leaf sections for lipids, proteins and starch, building on experiments carried out by Stone and colleagues (2008). In addition, protease profiling could be carried out to test the hypothesis that proteases interfere with cyclotide maturation or accumulation. There is currently great interest in improving the protease profile for expression of recombinant proteins in N. benthamiana in general, but at least 1243 putative proteases have been identified in the plant (Jutras et al., 2020). Identifying and silencing unwanted proteases is a formidable task, so it would be interesting to find out whether expression of LEC2 reduces the number of proteases in N. benthamiana leaves overall. This could complement other approaches for reducing unwanted proteolysis during recombinant expression in plants, such as co-expression of protease inhibitors or subcellular localisation (Mandal et al., 2016).

An interesting result of this study is the lack of response to LEC2 by the CaMV 35S-driven constructs expressed from the pEAQ-HT-Dest1 vector, which was used here as a positive control for expression of OakOpt+UTR in N. benthamiana. When looking at the OakOpt+UTR expression from the three vectors used in this study, it appeared that the pEAQ-HT-Dest1 vector was simply far more efficient than the seed-specific vectors, as seen by the much higher relative amount of both cyclic and linear product, which might have indicated that cyclotide expression was already at maximum levels that could not be enhanced by LEC2. However, when looking at the production of kalata B1 from other precursor proteins, PPI_kB1 and NaD1_kB1, the pEAQ-HT-Dest1 vector did not enable more efficient expression of kalata B1, with the overall trend appearing to favour the vectors containing seed-specific promoters. Moreover, the pEAQ-HT-Dest1 vectors containing PPI_kB1 and NaD1_kB1 did not respond to co-expression of LEC2, with or without OaAEP1b co-expression. The enhancement of kalata B1 expression by a LEC2-induced environment therefore appears to depend on expression of the cyclotide from a seed-specific promoter, or on the absence of an element present in the pEAQ-HT-Dest1 vector. The expression of kalata B1 from a native precursor, albeit optimised, outweighed the advantage of using LEC2, but expression from other precursors could benefit from LEC2 co-expression. The case-by-case advantage of LEC2 co-expression, precursor gene or vector of choice demonstrates the usefulness of rapid transient expression in N. benthamiana for identifying the best combinations of constructs to optimise cyclotide production, and reiterates the value of transient expression assays, as opposed to lengthy stable transformation in systems such as A. thaliana seeds, for screening purposes.
It is difficult to identify a single cause for the different responses of pEAQ-HT-Dest1 and the two seed-specific promoter vectors, pOH123 and pFH123, partly because the vectors are of different origin. pOH123 and pFH123 are based on the vector pMDC100 (TAIR accession: 1009003749) and are identical except for the promoter sequences, but pEAQ-HT-Dest1 is unrelated and contains elements such as the p19 suppressor of gene silencing and the cowpea mosaic virus hypertranslational (CPMV-HT) expression cassette (Sainsbury et al., 2009). Because of these differences, it is not possible to directly compare expression driven by the constitutive 35S promoter of pEAQ-HT-Dest1 with that driven by the seed-specific promoters, and this was not the primary aim of these initial LEC2 experiments. However, it would be interesting to dissect whether it is the constitutive promoter that nullifies the effect of LEC2, because another possible route for LEC2 influence on promoter activity is a direct interaction with seed-specific promoters. LEC2 can directly bind to and activate the Oleosin promoter via two neighbouring RY elements (Che et al., 2009) that are included in the Oleosin promoter used in this study (Figure 4.10). This raises the possibility that LEC2 directly increases cyclotide production via promoter activation, rather than, or in addition to, creating an environment that favours protein accumulation. This could explain the absence of a LEC2 effect on the 35S-driven cyclotide production. LEC2 also had an effect when cyclotides were expressed from the FP1 napin promoter. This was expected based on work by Petrie and colleagues (2010), who used the FP1 promoter to drive transgene expression in leaves in the presence of LEC2. Although no direct interaction between LEC2 and FP1 has been reported, a closely related transcription factor, ABI3, is implicated in activation of the napin promoter via elements present in the truncated version, FP1 (Ezcurra et al., 2000). ABI3 is implicated as a downstream product of LEC2 expression (Stone et al., 2008), suggesting a possible indirect mechanism for LEC2 induction of FP1-driven expression.

The current model for LEC2 activity involves activation of the auxin pathway and seed-specific factors to induce an environment similar to a maturing seed. In A. thaliana, LEC2 RNA is present in the seeds and siliques and absent in all other tissues (Klepikova et al., 2016; TRAVA, 2019), so expression in other tissues using transgenic techniques is considered ectopic. In A. thaliana, lec2 null mutations cause serious embryo defects, including inappropriate vivipary with cotyledons partially transformed into leaves (Meinke et al., 1994). Ectopic expression of LEC2 activates genes involved in auxin biosynthesis (Stone et al., 2008; Wójcikowska et al., 2013) and auxin is known to induce embryogenesis in somatic tissues (Gaj, 2004), leading to the hypothesis that LEC2 expression induces embryogenesis via upregulation of, or increased sensitivity to, auxin. Future experiments to validate a role for auxin in the LEC2-induced increase in cyclotide accumulation could involve comparing LEC2 expression with direct auxin application. LEC2 is implicated in induction of indolic compounds including indole-3-acetic acid (IAA), and can compensate for 2,4-D in somatic embryogenesis in explants (Wójcikowska et al., 2013). Either of these auxins would be a suitable starting point for comparison to LEC2 expression in inducing a protein storage environment that favours cyclotide accumulation.

Some cyclotide masses were detected from seed-specific promoters without LEC2 expression, which could indicate some leaky promoter activity or unexpected promoter activation.
The influence of LEC2 expression on other leaves of the same plant, possibly via mobile auxin, was largely ruled out as a mechanism for unwanted promoter activation (Figure 4.8). Although other studies use the Oleosin promoter as a negative control in leaf tissue, detection of peptides here was carried out using highly sensitive MALDI-MS, which could detect very small amounts of peptide from very weak promoter activity.

The detection of low levels of MCoTI-II peptide mass in this chapter suggests that transcription is initiated, but that something else is preventing peptide accumulation. Although mRNA levels were not assessed in this study, there is some suggestion that recombinant protein accumulation in a more general sense is more dependent on the leaf environment than on transcript levels, as revealed by studies of DsRed accumulation in N. tabacum (Jansing and Buyel, 2019). This supports the approach of manipulating the leaf cell environment to enhance cyclotide accumulation. It is clear that LEC2 expression alone is not sufficient to induce satisfactory levels of MCoTI-II accumulation, but it is a starting point for future work in promoting seed-specific protein accumulation in leaves for production or screening purposes. As mentioned in Chapter 2, understanding the protein trafficking pathways for MCoTI-II in A. thaliana seeds could hold the key to understanding why its expression in leaf tissue has so far proven difficult. Expression of MCoTI-II in leaf tissue might be enhanced if seed-like prevacuolar sorting pathways could be mimicked, and this could be coupled with LEC2 expression for maximum effect.

To simplify studying the interaction between cyclotides and cyclising AEPs in N. benthamiana tissue, removal of endogenous AEPs would be beneficial in reducing the chances of unwanted proteolysis (Benchabane et al., 2008). Here, unused AEP-knockout N. benthamiana plants were infiltrated with cyclotide constructs to assess cyclotide accumulation, in particular that of MCoTI-II. If AEP knockout had resulted in higher MCoTI-II levels than in wild-type plants, this would have suggested a role for endogenous AEPs in MCoTI-II degradation and a means to explain the lack of MCoTI-II in N. benthamiana plants until now. However, MCoTI-II was not detected. This could suggest that endogenous AEPs are not involved in premature cyclotide proteolysis, and that a different factor is preventing MCoTI-II accumulation. The main caveat here is that interpretation of the results is limited by the poor quality of the plants used. The CRISPR-edited plants were 5–6 weeks old when infiltrated, which almost certainly impacted their ability to produce recombinant peptides, as seen by the low detection of kalata B1. Repeating the study with younger plants would be ideal. Even if MCoTI-II accumulation is not enhanced by the knockout of endogenous AEPs, such a study would still provide valuable information by eliminating AEP interference as a possible reason for the difficulty in expressing the cyclotide. In this case, other interfering factors such as protein folding, trafficking in pre-vacuolar compartments, charged residues or proteolysis by other enzymes could be considered.

4.5 Conclusions

Manipulation of the N. benthamiana leaf cell environment was investigated here as a means to increase cyclotide accumulation. This was achieved by co-expression of the transcription factor LEC2, a master regulator of seed maturation and a modulator of seed-like properties in N. benthamiana leaves when expressed recombinantly (Petrie et al., 2010). LEC2 co-expression was a useful tool for expression of kalata B1 from non-native precursor proteins when driven by seed-specific promoters. Expression from seed-specific promoters can be as efficient as, or better than, expression from pEAQ-HT-Dest1 when LEC2 and OaAEP1b are co-expressed with the precursor. It was possible to detect MCoTI-II when co-expressed with LEC2, which had been notoriously difficult in previous attempts at leaf-based transient expression. For future expression of cyclotides in N. benthamiana leaves it is recommended that LEC2 expression be incorporated into the system to enhance cyclotide detection. In addition, LEC2 could be used to compare expression in seed-like leaf tissue with that of A. thaliana to better understand cyclotide trafficking through the endomembrane system in seeds and in leaves. This could be applied to target cyclotides to specific pathways and enhance production.

So far the thesis has described various aspects of compatibility between cyclotides and plant-based production systems, including genetic constructs, cyclising enzymes, and the plant cell environment. All of these studies rely on the current knowledge of cyclotide biosynthesis in planta, which is expanding with the discovery of new cyclotide-containing plant species and the identification of new ligase-type plant enzymes such as AEPs. The next chapter of the thesis will describe and document the discovery efforts undertaken in the Kimberley region of Western Australia in May 2018 to observe and collect plant specimens of interest to cyclotide research.

References

Andrianov, V., Borisjuk, N., Pogrebnyak, N., Brinker, A., Dixon, J., Spitsin, S., Flynn, J., Matyszczuk, P., Andryszak, K., Laurelli, M., Golovkin, M., and Koprowski, H. (2010). Tobacco as a pro- duction platform for biofuel: Overexpression of Arabidopsis DGAT and LEC2 genes increases accumulation and shifts the composition of lipids in green biomass. Plant Biotechnology Journal 8, pp. 277–287. Angeles-Núñez, J. G. and Tiessen, A. (2011). Mutation of the transcription factor LEAFY COTYLE- DON 2 alters the chemical composition of Arabidopsis seeds, decreasing oil and protein content, while maintaining high levels of starch and sucrose in mature seeds. Journal of Plant Physiology 168, pp. 1891–1900. Bally, J., Jung, H., Mortimer, C., Naim, F., Philips, J. G., Hellens, R., Bombarely, A., Goodin, M. M., and Waterhouse, P. M. (2018). The rise and rise of Nicotiana benthamiana : A plant for all reasons. Annual Review of Phytopathology 56, pp. 405–426. Benchabane, M., Goulet, C., Rivard, D., Faye, L., Gomord, V., and Michaud, D. (2008). Preventing unintended proteolysis in plant protein biofactories. Plant Biotechnology Journal 6, pp. 633–648. Boothe, J., Nykiforuk, C., Shen, Y., Zaplachinski, S., Szarka, S., Kuhlman, P., Murray, E., Morck, D., and Moloney, M. M. (2010). Seed-based expression systems for plant molecular farming. Plant Biotechnology Journal 8, pp. 588–606. Che, N., Yang, Y., Li, Y., Wang, L., Huang, P., Gao, Y., and An, C. (2009). Efficient LEC2 activation of OLEOSIN expression requires two neighboring RY elements on its promoter. Science in China Series C: Life Sciences 52, pp. 854–863. Ezcurra, I., Wycliffe, P., Nehlin, L., Ellerström, M., and Rask, L. (2000). Transactivation of the Brassica napus napin promoter by ABI3 requires interaction of the conserved B2 and B3 domains of ABI3 with different cis-elements: B2 mediates activation through an ABRE, whereas B3 interacts with an RY/G-box. The Plant Journal 24, pp. 57–66. Gaj, M. D. (2004). 
Factors influencing somatic embryogenesis induction and plant regeneration with particular reference to Arabidopsis thaliana (L.) Heynh. Plant Growth Regulation 43, pp. 27–47. Jansing, J. and Buyel, J. F. (2019). The correlation between DsRed mRNA levels and transient DsRed protein expression in plants depends on leaf age and the 5’ untranslated region. Biotechnology Journal 14, p. 1800075. Jutras, P. V., Dodds, I., and Hoorn, R. A. van der (2020). Proteases of Nicotiana benthamiana: An emerging battle for molecular farming. Current Opinion in Biotechnology 61, pp. 60–65. Kim, H. U., Lee, K.-R., Jung, S.-J., Shin, H. A., Go, Y. S., Suh, M.-C., and Kim, J. B. (2015). Senescence-inducible LEC2 enhances triacylglycerol accumulation in leaves without negatively affecting plant growth. Plant Biotechnology Journal 13, pp. 1346–1359. Klepikova, A. V., Kasianov, A. S., Gerasimov, E. S., Logacheva, M. D., and Penin, A. A. (2016). A high resolution map of the Arabidopsis thaliana developmental transcriptome based on RNA-seq profiling. The Plant Journal 88, pp. 1058–1070.


Mandal, M. K., Ahvari, H., Schillberg, S., and Schiermeyer, A. (2016). Tackling unwanted proteolysis in plant production hosts used for molecular farming. Frontiers in Plant Science 7, p. 267.

Meinke, D. W., Franzmann, L. H., Nickle, T. C., and Yeung, E. C. (1994). Leafy cotyledon mutants of Arabidopsis. The Plant Cell 6, pp. 1049–1064.

Nookaraju, A., Pandey, S. K., Fujino, T., Kim, J. Y., Suh, M. C., and Joshi, C. P. (2014). Enhanced accumulation of fatty acids and triacylglycerols in transgenic tobacco stems for enhanced bioenergy production. Plant Cell Reports 33, pp. 1041–1052.

Pearson, W. (2019). LALIGN tool. URL: https://embnet.vital-it.ch/software/LALIGN_form.html (visited on 11/13/2019).

Petrie, J. R., Shrestha, P., Liu, Q., Mansour, M. P., Wood, C. C., Zhou, X.-R., Nichols, P. D., Green, A. G., and Singh, S. P. (2010). Rapid expression of transgenes driven by seed-specific constructs in leaf tissue: DHA production. Plant Methods 6, p. 8.

Poon, S., Harris, K. S., Jackson, M. A., McCorkelle, O. C., Gilding, E. K., Durek, T., Weerden, N. L. van der, Craik, D. J., and Anderson, M. A. (2018). Co-expression of a cyclizing asparaginyl endopeptidase enables efficient production of cyclic peptides in planta. Journal of Experimental Botany 69, pp. 633–641.

Powell, J. D. (2015). From pandemic preparedness to biofuel production: Tobacco finds its biotechnology niche in North America. Agriculture 5, pp. 901–917.

Sainsbury, F., Thuenemann, E. C., and Lomonossoff, G. P. (2009). pEAQ: versatile expression vectors for easy and quick transient expression of heterologous proteins in plants. Plant Biotechnology Journal 7, pp. 682–693.

Santos Mendoza, M., Dubreucq, B., Miquel, M., Caboche, M., and Lepiniec, L. (2005). LEAFY COTYLEDON 2 activation is sufficient to trigger the accumulation of oil and seed specific mRNAs in Arabidopsis leaves. FEBS Letters 579, pp. 4666–4670.

Slocombe, S. P., Cornah, J., Pinfield-Wells, H., Soady, K., Zhang, Q., Gilday, A., Dyer, J. M., and Graham, I. A. (2009). Oil accumulation in leaves directed by modification of fatty acid breakdown and lipid synthesis pathways. Plant Biotechnology Journal 7, pp. 694–703.

Stålberg, K., Ellerström, M., Josefsson, L.-G., and Rask, L. (1993). Deletion analysis of a 2S seed storage protein promoter of Brassica napus in transgenic tobacco. Plant Molecular Biology 23, pp. 671–683.

Stone, S. L., Braybrook, S. A., Paula, S. L., Kwong, L. W., Meuser, J., Pelletier, J., Hsieh, T.-F., Fischer, R. L., Goldberg, R. B., and Harada, J. J. (2008). Arabidopsis LEAFY COTYLEDON2 induces maturation traits and auxin activity: Implications for somatic embryogenesis. Proceedings of the National Academy of Sciences 105, pp. 3151–3156.

TRAVA (2019). TRAVA: TRAnscriptome Variation Analysis. Relative read counts. Gene ID - AT1G28300. URL: http://travadb.org/browse/DeSeq/AT1G28300/0-100/AvNorm/Color=RCount/ (visited on 11/05/2019).


TRAVA (2020). TRAVA: TRAnscriptome Variation Analysis. Relative read counts. Gene ID - AT1G48990. URL: http://travadb.org/browse/DeSeq/AT1G48990/0-100/AvNorm/Color=RCount/ (visited on 01/15/2020).

Wójcikowska, B., Jaskóła, K., Gąsiorek, P., Meus, M., Nowak, K., and Gaj, M. D. (2013). LEAFY COTYLEDON2 (LEC2) promotes embryogenic induction in somatic tissues of Arabidopsis, via YUCCA-mediated auxin biosynthesis. Planta 238, pp. 425–440.

Chapter 5

Cyclotide discovery efforts in the Kimberley region of Western Australia

5.1 Overview

A plant collection trip to the Kimberley region of Western Australia was undertaken in May 2018. The objectives of this trip were to observe and collect plant specimens that might contain cyclic peptides and peptide ligases such as AEPs. So far, multi-day plant collection trips in Australia with a focus on cyclotide-containing species have covered Far North Queensland, the Simpson Desert, and the Victorian Alps (Craik, 2013), in addition to numerous day trips and individual specimen collections. The Kimberley region is a rich natural heritage area with known collection records from all five cyclotide-containing plant families (ALA, 2018i; ALA, 2018h; ALA, 2018d; ALA, 2018f; ALA, 2018c) as well as other families that contain small cyclic peptides called orbitides (ALA, 2018b) and potentially interesting proteases (ALA, 2018e; ALA, 2018a).

Discovery is an important part of cyclotide research given that only around 300 of the predicted 50,000 naturally occurring cyclotides have been sequenced so far (Gruber et al., 2008). There are still many unanswered questions about cyclotides, including whether they exist in other plant families, why some plants express many different cyclotides, and whether they have functions other than plant host defence (Weidmann et al., 2016). Identifying new cyclotides could help answer these questions, as well as provide templates for drug and agrochemical design. This is probably why cyclotide discovery is the largest category of cyclotide research (Kan et al., 2018).

Apart from cyclotides, other cyclic peptides of interest include orbitides, small cyclic peptides of five to twelve amino acids with no disulfide bonds that are found in at least nine plant families (Annonaceae, Caryophyllaceae, Euphorbiaceae, Lamiaceae, Linaceae, Phytolaccaceae, Rutaceae, Schizandraceae, and Verbenaceae) (Arnison et al., 2013). Orbitides themselves have bioactive properties, and the plants that make these peptides also contain peptide ligases that are involved in orbitide maturation, such as the PCY1 enzyme from Saponaria vaccaria (Barber et al., 2013). For this reason, orbitide-containing plants are particularly relevant to our interest in finding novel enzymes capable of peptide ligation and were included in our discovery efforts.

Throughout this thesis, the use of AEPs for peptide ligation has been highlighted. In Chapter 3 the cyclotide MCoTI-II was re-engineered so that it would be compatible with AEP-mediated cyclisation in vitro, and in Chapter 4, co-expression of AEPs alongside LEC2 in planta was used for biosynthesis of mature cyclotides. With recent advances in our understanding of what makes some AEPs prefer ligation over hydrolysis (Jackson et al., 2018), it is becoming easier to predict which of these enzymes, which are present in all plants, might be useful for bioengineering purposes. Native AEPs from plants have been a valuable source of peptide ligases and have also been the basis for engineering new efficient ligases (Yang et al., 2017; Jackson et al., 2018). Cyclisation of peptides by AEPs is not limited to the AEPs that co-occur with them in their native plant producer, particularly for in vitro applications. For example, SFTI from H. annuus was efficiently cyclised in vitro by an AEP from jackbean (Canavalia ensiformis) but not by its native AEP partner, HaAEP, even though the latter is probably responsible for SFTI cyclisation in planta (Bernath-Levin et al., 2015). SFTI is also cyclised by AEPs from A. thaliana, which is not a cyclic peptide producing plant (Bernath-Levin et al., 2015; Zauner et al., 2018).


The OaAEPs from O. affinis are active on a number of substrates (Harris et al., 2015; Harris et al., 2019; Jackson et al., 2018; Poon et al., 2018). Because AEP ligase activity is not limited to the cyclic peptides from its native parent plant, identifying uncharacterised AEPs or other ligases from new plant species could expand the toolbox for enzyme-mediated ligation and cyclisation, so this was a key objective for the Kimberley collection trip.

Because of the remote location of the Kimberley region, substantial planning was required to organise logistics and identify target taxa before our departure. This chapter aims to serve as an official record of the trip and a guide for future cyclotide discovery endeavours, and to collate and detail the collection data. The field trip was conducted by Professor David Craik, Dr Edward Gilding, Dr Mark Jackson, Dr Quentin Kaas and myself between 13 and 21 May 2018.

5.2 Methods

5.2.1 Itinerary

The Kimberley region falls within the Australian monsoon tropics, receiving over 85% of its rainfall in the austral summer months of November to April (Bowman et al., 2010). May was selected as an appropriate month to collect plant specimens in the Kimberley region because it is around this time of year that the Gibb River Road becomes accessible after the rainy season. It was important to travel as soon as possible after the rainy season, before the lack of rainfall reduced the chance of finding target plant species. The itinerary outlines the key dates, locations and activities of the trip (Table 5.1). The Gibb River Road runs through the Kimberley region from near the township of Derby through to Kununurra near the Northern Territory border (stops B–F in Figure 5.1).

Table 5.1: Kimberley trip: Itinerary

Date | Location | Key activities
Sunday 13-May-18 | Brisbane – Perth – Broome | Use transit time to visit Kings Park Botanic Gardens and observe target taxa. Arrive Broome. Last good shower.
Monday 14-May-18 | Broome – Windjana Gorge | Pick up 4WD and acquire provisions. Plant collection.
Tuesday 15-May-18 | Windjana Gorge – Silent Grove campsite | Plant collection
Wednesday 16-May-18 | Silent Grove – Mount Elizabeth | Plant collection
Thursday 17-May-18 | Mt Elizabeth – Home Valley Station | Plant collection
Friday 18-May-18 | Home Valley Station – Kununurra | Plant collection
Saturday 19-May-18 | Kununurra – Wolfe Creek | DC, QK depart for Brisbane. EG, MJ and BS on to Wolfe Creek. Plant collection.
Sunday 20-May-18 | Wolfe Creek – Willare | Plant collection
Monday 21-May-18 | Willare – Broome – Darwin | Clean and return 4WD. Pack up samples.
Tuesday 22-May-18 | Darwin – Brisbane | Store samples at Institute for Molecular Bioscience, UQ


Figure 5.1: The driving route taken for the Kimberley Field Trip. Blue letters indicate camp sites as listed underneath the map. This map was generated using the My Maps feature of Google Maps and is accessible at https://bit.ly/35pe2Bo to view higher resolution and closer details.

5.2.2 Retrieving historical plant collection data

A list of target taxa was generated by Dr Gilding based on species that are known to contain cyclotides or orbitides, or that could have interesting protease profiles, and is provided in the appendix to this thesis (Table A.1). Historical plant collection data were obtained to predict possible collection sites and habitats for target taxa. This information was gathered from the Australasian Virtual Herbarium (AVH) (AVH, 2019) and the Atlas of Living Australia (ALA) (ALA, 2019). Field guides were generated from the ALA by searching for the target genus within the Kimberley area, selecting the download option, and choosing the Species Field Guide. All guides contain embedded links to the original data search. The field guides were sorted according to genus, printed and placed in plastic folders to take to the field. Handwritten notes were added where appropriate to aid in plant and habitat identification. An example is shown in Figure 5.2.


Figure 5.2: An example of Field Guides generated by the Atlas of Living Australia for Clitoria ternatea and Nicotiana benthamiana. Hard copies of field guides were collated for all target taxa to bring on the collection trip.

5.2.3 Plant collection

Seeds, leaves and shoots were collected by removing a small amount of the tissue of interest and storing it in either paper envelopes, zip lock bags or microcentrifuge tubes. A collection kit was assembled by Dr Edward Gilding for this purpose including scissors, pens, tubes, zip lock bags, envelopes and paper clips as shown in Figure 5.3. Vegetative tissue was stored in the camper refrigerator.

5.2.4 Record keeping

Visual records

Photographs were taken at each collection site by E. Gilding and B. Smithies using mobile phones with location tagging switched on. Exchangeable image file format (EXIF) data was extracted to obtain GPS coordinates for these photos and matched to the collection records to create a comprehensive collection record of each site sampled. Other members of the party also provided photographs for observational records of collection sites and specimens. Particular thanks go to all members of the party for providing photographs to assist with documenting this trip.
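The conversion from EXIF GPS tags to the coordinates used in the collection record can be sketched as follows. This is a minimal illustration and not the procedure actually used for the trip records: it assumes the tags have already been read from the image file into a dictionary, and the example values are hypothetical, chosen to correspond to the Same Creek coordinates. EXIF stores latitude and longitude as (degrees, minutes, seconds) rationals together with a separate hemisphere reference letter.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert EXIF-style (degrees, minutes, seconds) rationals to signed decimal degrees.

    `dms` holds three (numerator, denominator) pairs and `ref` is the hemisphere
    letter ("N", "S", "E" or "W"). South and west are returned as negative values.
    """
    degrees, minutes, seconds = (Fraction(num, den) for num, den in dms)
    value = degrees + minutes / 60 + seconds / 3600
    return float(-value if ref in ("S", "W") else value)


def format_dms(dms, ref):
    """Format the rationals in the degree-minute-second style of the collection record.

    Seconds are rounded to the nearest whole number; carrying 60 seconds into the
    minutes is not handled in this sketch.
    """
    degrees, minutes, seconds = (Fraction(num, den) for num, den in dms)
    return f"{int(degrees)}°{int(minutes)}’{round(seconds)}\"{ref}"


# Hypothetical tag values for one photograph (not read from a real image file).
example_tags = {
    "GPSLatitude": ((17, 1), (10, 1), (30, 1)),
    "GPSLatitudeRef": "S",
    "GPSLongitude": ((125, 1), (17, 1), (35, 1)),
    "GPSLongitudeRef": "E",
}

latitude = dms_to_decimal(example_tags["GPSLatitude"], example_tags["GPSLatitudeRef"])
longitude = dms_to_decimal(example_tags["GPSLongitude"], example_tags["GPSLongitudeRef"])
label = " ".join(
    format_dms(example_tags[key], example_tags[key + "Ref"])
    for key in ("GPSLatitude", "GPSLongitude")
)
print(label, latitude, longitude)
```

The signed decimal form is convenient for mapping tools, while the degree-minute-second form matches the notation used in the collection record.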


Figure 5.3: Plant collection kit.

Written records

A handwritten record of collections was kept by E. Gilding (see Appendix Figure A.8) including unique specimen ID, type of tissue collected, and additional notes. A journal was kept by B. Smithies to document the trip in a more general sense and was used here to confirm details such as camp sites, weather and notes on the general surroundings.

5.2.5 Other planning: Permits, equipment, provisions

Plant collection permits

Dr Edward Gilding organised permits for collection of plant material in the Kimberley region. This included applying to the relevant authorities, in this case the Department of Parks and Wildlife of the Government of Western Australia. The permit obtained was a Regulation 4 Authority for the collection of flora and fauna in the Western Kimberley region under the Conservation and Land Management Regulations (CALM), 2002. As required, copies of the relevant permits were carried at all times during the trip.

Equipment and provisions

A four-wheel drive vehicle and camper trailer were hired from Crikey Camper Hire Broome. The hire car also came with essential equipment including a water storage tank, first aid kit, satellite phone, UHF radio and spare tyres. Those wanting some privacy brought their own tents. Cold nights were expected and so individuals were encouraged to bring sleeping bags and warm clothes. Other items that were useful for the trip included torches, insect repellent, sunscreen, hats, sleeping mat, mattress pump, good walking shoes and hand sanitiser wipes. To ensure adequate sustenance, a meal plan and shopping list were prepared for the trip. This meant that shopping for provisions in Broome was efficient and that enough food was purchased. This was especially important for the Gibb River Road portion of the trip, which has only a few small stores along the route. For future reference, a copy of the meal plan and shopping list is included in Appendix Figure A.9.

5.2.6 Safety

First aid

Due to the remote location of the Kimberley region and the lack of mobile phone reception or internet along the Gibb River Rd portion of the journey, it was important to identify potential risks to safety and be familiar with relevant first aid procedures. A folder containing current first aid advice from St John Ambulance Australia regarding snake bite and cardio-pulmonary resuscitation (CPR) was taken on the trip, and attendees were encouraged to download the St John First Aid mobile phone app (St John Ambulance Australia Incorporated, 2018) (Figure 5.4). A first aid kit and satellite phone were provided with the four-wheel drive hire.

Figure 5.4: First aid resources for the Kimberley Field Trip. (Left) Screenshot from First Aid mobile phone app (St John Ambulance Australia Incorporated, 2018) showing list of illness and injury treatment information available. (Right) Pictures of hard copy first aid sheets printed and inserted into yellow first aid folder that was kept easily accessible in the vehicle during trip.

Crocodile awareness

Two species of crocodile are found in the Kimberley region: the freshwater crocodile (Crocodylus johnstoni) and the estuarine or saltwater crocodile (Crocodylus porosus). Freshwater crocodiles are generally regarded as harmless unless provoked, but saltwater crocodiles are territorial and aggressive. Despite the name, saltwater crocodiles can be found up to 200 km inland in fresh water. Before departure, the Pentecost River was identified as the only crocodile-risk zone in our itinerary.


Care was taken here to keep well away from the water. Particular attention was also paid to signage along the Gibb River Rd (Figure 5.5).

Figure 5.5: Crocodile safety. (Top) Crocodile information sign near the Drysdale River/Kalumburu turn-off, Gibb River Road, Western Australia. (Bottom) The Pentecost River was identified as a crocodile risk zone along our route. Photographs were taken from a safe distance before closing windows and crossing the river by car.

5.3 Results

5.3.1 Specimens collected

During the nine-day collection trip in the Kimberley region in May 2018, a total of 53 plant specimens were collected from ten plant families. These included both vegetative tissue and seeds where possible. A written record of specimens was kept by Dr Gilding, in which individual samples were given a unique number (DC00XX). This written record was transcribed to an electronic version here and GPS coordinates for each collection site were added (Table 5.2). For scans of the original records see Appendix Figure A.8.

Specimens from the cyclotide-containing Rubiaceae and Fabaceae families were among the most frequently sampled. Observations were made for other cyclotide-containing families, the Cucurbitaceae and Solanaceae, but no Violaceae species were observed or collected. Specimens from cyclotide-containing plant families were collected at 13 of the 15 sites sampled, although not all of these specimens are anticipated to contain cyclotides. All specimens from the Rubiaceae family were passed on to a colleague for assessment but no cyclotides have been identified.

Clitoria ternatea contains 74 recorded cyclotide sequences (Oguis et al., 2019) and we were lucky to come across a grove of this species growing by the side of the road at Same Creek (17°10’30"S, 125°17’35"E). The plants were in various stages of flowering and setting seed, and nine seed accessions were collected along with many photographs of the stunning blue flowers (Figure 5.9). These seeds were brought back to our laboratory in Brisbane and added to our collection of C. ternatea seeds from around the world.

Three Caryophyllaceae specimens were collected during the trip, including seed, and analysis of these samples for orbitides and peptide ligases is yet to be carried out, pending growth of more plant material from the seeds collected. The three specimens appeared to have very different growth habits. The first, collected at Windjana Gorge savannah (Figure 5.8, bottom left), had a short sprawling growth habit and had already dried out considerably before our collection. The second, collected at Mt Elizabeth (Figure 5.13, left, middle row), was around knee height and growing upright amongst grasses of similar height. We suspect this species to be Polycarpaea longiflora. The third specimen, collected from a roadside stop past Kununurra (Figure 5.15, bottom left), was very small with a bunched growth habit. Considering their varied appearance, the three samples are probably different species, and to our knowledge they have not been assessed for orbitide content.

Species of interest for their protease profile included the carnivorous plants from the Byblidaceae and Droseraceae families. These plants exude sticky substances to trap insects (Fukushima et al., 2008). Because of their carnivorous nature, it is possible that they contain interesting proteases that are worth investigating as part of our peptide protease and ligase discovery efforts. Two specimens from each of the Byblidaceae and Droseraceae families were collected during the trip and analysis of their protease content is yet to be carried out.

109 CHAPTER 5. CYCLOTIDE DISCOVERY EFFORTS pan berley Tiny red peas Plant Billy goat springs 16°53’45"S 125°47’26"E Table 5.2: Specimens collected in the Kimberley region of Western Australia DroseraceaeRubiaceaeRubiaceae PlantRubiaceaeCaryophyllaceaea PlantFabaceae Plant 60km Plant from and Broome seed Plant and Windjana seed Savannah 60km from Broome Windjana Savannah 17°45’51"S Plant Windjana and 122°53’28"E Savannah 17°24’43"S seed 17°45’51"S Petiolaris 124°56’39"E Same complex 17°24’43"S Creek 122°53’28"E 124°56’39"E 17°24’43"S 124°56’39"E Orbitides? 17°10’30"S 125°17’35"E Small single-seeded legumes. Suspect Euphorbiaceae PlantFabaceaeLentibulariacaeaCaryophyllaceae Billy goat springs Plant Plant Plant and and seed seed 16°53’45"S Mt Mt Elizabeth Elizabeth road 125°47’26"E Silent grove ck near spring edge in sand 16°25’25"S 126°6’15"E 16°25’25"S 17°4’7"S Orbitides? 126°6’15"E Common across 125°14’49"E Kimberley in creek near campsite Byblidaceae Plant and seed 60km from Broome 17°45’51"S 122°53’28"E Bright purple/blue flower. 
Sandy FabaceaeFabaceaeFabaceaeFabaceaeFabaceae SeedFabaceae SeedFabaceae SeedFabaceae SeedFabaceae Same Creek SeedRubiaceae Same Creek SeedRubiaceae Same Creek SeedRubiaceae Same Creek SeedPossible Rubiaceae Same Creek Seed 17°10’30"SRubiaceae Same Plant Creek and seed 125°17’35"E 17°10’30"SRubiaceae Plant and Same Plant seed Creek and Bell A seed 125°17’35"E Gorge 17°10’30"SSuspect Euphorbiaceae Same Silent Plant Creek grove and Bell B seed 125°17’35"E Gorge 17°10’30"SFabaceae Plant Same Creek Bell C 125°17’35"E Gorge 17°10’30"S Plant and D seed 125°17’35"E 17°10’30"S Plant and Silent E seed 125°17’35"E grove 17°10’30"S Silent 16°59’40"S Silent grove F 125°17’35"E grove 17°10’30"S 17°4’7"S 125°12’15"E 16°59’40"S G Plant 125°17’35"E 17°10’30"S and seed In 125°12’15"E 125°14’49"E 16°59’40"S wet H cracks 125°17’35"E at Billy river goat springs In Along 125°12’15"E rock creek I faces at silent grove 17°4’7"S camp In rock faces 17°4’7"S 17°4’7"S 125°14’49"E 16°53’45"S 125°14’49"E Along 125°14’49"E creek 125°47’26"E at silent grove camp along At creek camp at common 25 camp across weed waysides in Kim- Drosera sp. Spermacoce sp. Spermacoce sp. Spermacoce sp. n.d. n.d. n.d. Chickweed n.d. Utricularia sp. Polycarpaea longiflora Byblis liniflora Clitoria ternatea Clitoria ternatea Clitoria ternatea Clitoria ternatea Clitoria ternatea Clitoria ternatea Clitoria ternatea Clitoria ternatea Clitoria ternatea Oldenlandia sp. Oldenlandia sp. Oldenlandia sp. n.d. Spermacoce sp. aff. Hedyotis n.d. n.d. 
14.05.18 DC0031 14.05.18 DC0032 15.05.18 DC003315.05.18 Astragalus-like DC0034 15.05.18 DC0035 15.05.18 DC0036 15.05.18 DC0037 Fabaceae Plant and seed Windjana Savannah 17°24’43"S 124°56’39"E 16.05.18 DC0055 16.05.18 DC0056 16.05.18 DC0057 16.05.18 DC0058 17.05.18 DC0059 Date14.05.18 ID DC0030 Species Family Sample type Site name Latitude Longitude Notes by Dr Edward Gilding 15.05.18 DC0038 15.05.18 DC0039 15.05.18 DC0040 15.05.18 DC0041 15.05.18 DC0042 15.05.18 DC0043 15.05.18 DC0044 15.05.18 DC0045 15.05.18 DC0046 15.05.18 DC0047 15.05.18 DC0048 15.05.18 DC0049 16.05.18 DC0050 16.05.18 DC0051 16.05.18 DC0052 16.05.18 DC0053 16.05.18 DC0054

110 5.3. RESULTS creek creek gibb river rd gibb river rd gibb river rd road Magenta flowers resembling Lewisia DroseraceaeRubiaceaeRubiaceae PlantRubiaceaeRubiaceaeRubiaceae PlantRubiaceae Mt Plant Elizabeth and track seedRubiaceae Plant and Mistake seed CreekRubiaceae Plant and Hann seed River Mistake Crossing Creek 16°25’55"S Plant and Hann seed RiverRubiaceae Crossing 126°6’53"E Plant Hann River 16°30’52"S Crossing Plant Around 16°27’15"S and seed 4WD 126°21’22"ERubiaceae 16°30’52"S track Plant 126°24’15"E to and Mt seed In 16°27’15"S Elizabeth 126°21’22"E gorge 16°30’52"S sand, is at plentiful In sand Gibb 126°24’15"E In River 126°21’22"E HannRubiaceae sand, Creek River is Crossing plentiful Plant and Bluish seed In bracts,Portulacaceae sand, roadside is plentiful Gibb River crossing 16°30’52"S Plant and seed 16°24’42"S 16°25’25"S 126°21’22"E Gibb 126°28’32"E RiverRubiaceae 126°6’15"E crossing Roadside at 16°24’42"S creek Plant Plant and and Along seed seedCaryophyllaceae intersection 126°28’32"E Mt at Elizabeth creeks 4WD Lake track Lake and Nabberu up NabberuApocynaceae 1km creek 1km Along 16°24’42"S intersection at creeksRubiaceae 126°28’32"E and Seed headsRubiaceae Along intersection Plant 16°2’54"S 16°2’54"S and at seed creeks and Plant Unnamed King 126°42’32"E and 126°42’32"E roadside river seed 1km 1km from Wolfe Creek Lake Nabbaru from at Plant road and seed Lake 16°15’5"S Plant Nabbaru and Willare seed roadhouse at 128°18’38"E 63km from Broome Short tiny n/a plants 19°10’22"S 17°43’36"S 127°47’30"E 17°47’15"S 123°39’16"E xerophyte. 
122°49’28"E inflorescense looks not odd seen n/a roadside Road to Kununurra RubiaceaeEriocaulacaeaeFabaceaeByblidaceaeRubiaceae Plant Plant and seed Mt Elizabeth road, creek Plant Plant and and seed 16°25’55"S seed Mt Elizabeth road, Mt 126°6’53"E creek Mt Elizabeth Elizabeth track Plant track and seed 16°25’55"S In sand Mt Elizabeth 126°6’53"E track 16°25’55"S 16°25’55"S In shallow water 126°6’53"E 126°6’53"E 16°25’55"S Around Around 4WD 4WD track 126°6’53"E track to to gorge gorge Around 4WD track to gorge at Drosera burmannii Spermacoce sp. Oldenlandia sp. Oldenlandia sp. Oldenlandia sp. Oldenlandia sp. n.d. Oldenlandia sp. Oldenlandia sp. Oldenlandia sp. Oldenlandia sp. Oldenlandia sp. n.d. Oldenlandia sp. Polycarpaea Marsdenia angustata Oldenlandia aff corymbosa Oldenlandia sp. Oldenlandia sp. Eriocaulon sp. Crotalaria sp. Byblis liniflora Oldenlandia sp. 17.05.18 DC0065 17.05.18 DC0066 17.05.18 DC0067 17.05.18 DC0068 17.05.18 DC0069 17.05.18 DC0070 17.05.18 DC0071 17.05.18 DC0072 17.05.18 DC0073 17.05.18 DC0074 17.05.18 DC0075 17.05.18 DC0076 17.05.18 DC0077 19.05.18 DC0078 19.05.18 DC0079 20.05.18 DC0080 21.05.18 DC0081 21.05.18 DC0082 17.05.18 DC0060 17.05.18 DC0061 17.05.18 DC0062 17.05.18 DC0063 17.05.18 DC0064
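As a consistency check on the specimen count, the identifiers can be tallied by collection date. The sketch below is illustrative only: it assumes the date and ID ranges as read from the Date and ID columns of Table 5.2 (DC0030 to DC0082), and the variable names are hypothetical.

```python
# Collection date -> (first ID number, last ID number), as read from Table 5.2.
ID_RANGES = {
    "14.05.18": (30, 32),
    "15.05.18": (33, 49),
    "16.05.18": (50, 58),
    "17.05.18": (59, 77),
    "19.05.18": (78, 79),
    "20.05.18": (80, 80),
    "21.05.18": (81, 82),
}

# Number of specimens logged on each date.
counts = {date: last - first + 1 for date, (first, last) in ID_RANGES.items()}

# Full list of identifiers in the DC00XX format used in the written record.
specimen_ids = [
    f"DC{number:04d}"
    for first, last in ID_RANGES.values()
    for number in range(first, last + 1)
]

total = sum(counts.values())
for date, count in counts.items():
    print(date, count)
print("Total specimens:", total)
```

Under these assumptions the daily counts sum to the 53 specimens reported for the trip.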


5.3.2 Collection sites

To collect and observe plant samples in the Kimberley region we aimed to stop at roadside locations or short walking tracks with water nearby that would support plant growth. This meant looking out for creeks along the roadside. Sometimes creeks were too dry to yield any observations of target species, but a number of creeks still contained some water and plants could be found nearby. In total, collections were made at 15 sites along the Gibb River Road, Great Northern Highway and Wolfe Creek as shown in Figure 5.6. One collection site was omitted as no photographs or GPS coordinates were recorded. This was the site named King River in the collection record in Table 5.2. Each collection site in the figure below is given the name as recorded in the collection sheet.

A summary of each collection site, including imagery of the surroundings and selected observed species, is provided in Figures 5.7 through 5.18. Not all observed specimens were collected, but photographs at each site were collated as a visual record of observed species. These were distributed to each member of the party and deposited in a network drive accessible to all members of the research group for future reference. Selected images are presented here along with species, genus and/or family name where determined. No pictures were taken at Lake Nabberu but GPS coordinates were estimated based on notes in the record sheet. The following figures aim to serve as a pictorial record of the collection sites as of May 2018 for future reference.


Figure 5.6: Map of plant collection sites. Each collection site is represented by an orange leaf symbol and numbered in order of site visited. The blue line represents the driving route, and blue letters represent camping stops. An interactive map including GPS coordinates for each site is available at https://bit.ly/35pe2Bo. The map was made with Google Maps/My Maps (Google Maps, 2019) using GPS coordinates recorded in the EXIF data from a representative photo taken at each site. No photos were taken at site 11 ‘Lake Nabberu + 1km’, so longitude and latitude were estimated by measuring 1 km from Lake Nabberu with the measure tool in My Maps.


Figure 5.7: Collection site 1: Roadside 60 km from Broome. The environment here consisted of grassy plains with sparse eucalyptus trees and sandy soil. Several Byblis and Drosera species were observed and collected at this site within several metres of the roadside.


Figure 5.8: Collection site 2: Windjana Savannah. Spermacoce and Caryophyllaceae species were found within approximately ten minutes walking distance from the Windjana Gorge campsite towards the gorge itself. The habitat consisted of dry open grassland with small shrubs and trees scattered throughout.


Figure 5.9: Collection site 3: Same Creek. Dense groves of Clitoria ternatea were observed growing along the tree-line on both sides of the road at this location as shown in the bottom right-hand picture. Creeks ran on either side of the road.


Figure 5.10: Collection site 4: Bell Gorge. Oldenlandia species were observed and collected at this rocky oasis. Collections were made near the water pools at the top of the gorge and from a rocky surface near the bottom of the waterfall.


Figure 5.11: Collection site 5: Silent Grove. A short walk along a track leading from the campsite to a nearby creek yielded observations of Rubiaceae species, bladderworts and spectacular water lilies. Dense vegetation and rocky ground dominated this area.

Figure 5.12: Collection site 6: Billy Goat Crossing. This creek yielded a collection of Fabaceae species. Eucalyptus and grevillea trees were common here.


Figure 5.13: Collection site 7: Mt Elizabeth track. This four-wheel drive track began as dry grassland near a dry creek bed. Polycarpaea, Crotalaria and Cucurbitaceae species were observed amongst the grasses here. Further along the track some small creeks were located with some shallow water remaining. This yielded observations and collections of Byblis, Drosera and Eriocaulon species.


Figure 5.14: Collection sites 8–10: Hann River Crossing, Mistake Creek, and Gibb River. These three stops by creek beds along the roadside yielded collections mostly of Rubiaceae specimens and were within 20 km of each other. Photo records at these sites were limited. The Hann River (pictured) contained water, but water was not found at Mistake Creek or near Gibb River.
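The statement that these three stops lie within 20 km of each other can be checked against the coordinates recorded in Table 5.2. The following is a minimal sketch using the haversine great-circle formula; it assumes a spherical Earth with a mean radius of 6371 km and the pairing of site names with coordinates as read from the table, and the function names are illustrative only.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees, minutes and seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Coordinates as read from Table 5.2 (site-to-coordinate pairing is an assumption).
SITES = {
    "Hann River Crossing": (to_decimal(16, 30, 52, "S"), to_decimal(126, 21, 22, "E")),
    "Mistake Creek": (to_decimal(16, 27, 15, "S"), to_decimal(126, 24, 15, "E")),
    "Gibb River crossing": (to_decimal(16, 24, 42, "S"), to_decimal(126, 28, 32, "E")),
}

# Straight-line distance for every pair of sites.
distances = {
    (site_a, site_b): haversine_km(*SITES[site_a], *SITES[site_b])
    for site_a, site_b in combinations(SITES, 2)
}

for (site_a, site_b), km in distances.items():
    print(f"{site_a} to {site_b}: {km:.1f} km")
```

Under these assumptions the largest separation, between Hann River Crossing and the Gibb River crossing, comes to roughly 17 km in a straight line; distances by road will be somewhat longer.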


Figure 5.15: Collection site 12: Unnamed Roadside. This rest stop past Kununurra was surrounded by grassland and sandy soils. A small Caryophyllaceae species was found growing in an exposed region of rocky ground.


Figure 5.16: Collection site 13: Wolfe Creek. A detour to Wolfe Creek crater provided an opportunity to observe many Gossypium specimens in various stages of flowering or with fibre-covered seed pods (pictured bottom right). Cucurbitaceae plants with small fruit were also common around the campsite. A single specimen of Apocynaceae was collected and later identified as Marsdenia angustata.


Figure 5.17: Collection site 14: Willare. One specimen of Oldenlandia was observed growing from a crack in the brickwork near the Willare Roadhouse. Some seed was collected from this plant.


Figure 5.18: Collection site 15: Roadside 63 km from Broome. The final stop near some water on the side of the road yielded some observations of Byblis and Oldenlandia specimens.

5.4 Discussion

Over the course of nine days, we collected 53 plant specimens from 15 different locations and these were carefully catalogued and brought back to our laboratory in Brisbane. Our collections and observations covered nine of the twelve target families we aimed to collect and have provided an abundance of samples for ongoing analysis. Given the number of samples collected and the high proportion of target families represented in our observations, this collection trip should be considered successful according to our aims of collecting species that could contain cyclic peptides and peptide ligases. In particular, the collection of nine C. ternatea seed accessions and 25 Rubiaceae specimens adds valuable resources to our current collections of these specimens. The collection of many samples of interest for their protease profiles, including Byblis and Drosera species, will enable us to look into the potential peptide ligases from these families for the first time.

5.4.1 Elusive species and future collections

Of the families that were not observed or collected, the most notable was the Violaceae and, especially, the Hybanthus genus. All members of the team were on particular lookout for this family, as all members of the Violaceae screened to date contain cyclotides (Göransson et al., 2015), including those tested in the Hybanthus genus (Trabi et al., 2009). Although H. enneaspermus and H. aurantiacus occur in the region (ALA, 2018i), we were unable to locate them. This was probably due to the dry environment, which had already dried considerably since the end of the rainy season approximately one month earlier. Although not observed on this trip, one accession of H. enneaspermus has since been collected by Dr Gilding from a location near Gladstone in Queensland, and is currently being analysed for cyclotide content and bioactivity in our laboratory. This site was by the edge of a stream, supporting our hypothesis that the Kimberley region was too dry to enable us to happen across any specimens, although the species may also simply be occasional and localised in the Kimberley, as it seems to be elsewhere. Future trips to the Kimberley have the potential to yield observations of other species of interest that were not located on this trip. The passionfruit family, Passifloraceae, is of interest as a closely related family to the cyclotide-abundant Violaceae in the order Malpighiales. Records indicate that specimens of this family are found in higher abundance to the north of the Gibb River Road (ALA, 2018g), an area largely unexplored on this trip but reasonably accessible by four-wheel drive. The Rubiaceae specimens collected on this trip were predominantly Oldenlandia and Spermacoce species, but species in other genera such as Gardenia and Psychotria appear to be more prevalent to the north of the Gibb River Road (ALA, 2018h). The rare violet, Hybanthus bennettiae, also occurs in the region (ALA, 2020).
The area further north of the Gibb River Road towards the coast might be more abundant in many target species, including those not sampled on this trip, although venturing into this area would probably require more time on the road and therefore more planning in terms of bringing supplies and being ready for remote conditions. The prevalence of saltwater crocodiles could also become a safety risk closer to the coast. However, some of this area is relatively accessible, for example towards the Drysdale River and Kalumburu communities via the Kalumburu Road. The continuing discovery of new plant species in the Kimberley region, including in genera of interest to our pursuits (Lowrie et al., 2008; Jobson et al., 2015; Stewart et al., 2015), suggests that this area will be exciting for future plant collection in relation to cyclic peptides and peptide ligases.

5.4.2 Working in a remote area

Working in a remote area for a multi-day collection trip brings both challenges and rewarding results, but one must be prepared for the unexpected and gather as much information as possible whilst on the trip, as it could be difficult to return. This means being highly organised and prepared for both the scientific objectives, i.e. collection of plant material, and the logistical and safety components. In terms of plant collection, the most useful aspects of planning were the collection of GPS data with photos, having someone with extensive plant collection experience on the trip, and the preparation of customised field guides for our target taxa. For GPS data collection, members of the team made sure that the location tagging feature on their mobile phones was enabled and that their GPS was switched on. This meant that GPS data could be extracted from all photos that were taken with mobile phones on the trip. Although mobile phone network reception was not available along the Gibb River Road, GPS satellite data could still be collected, providing a valuable tool for data collection and proving the usefulness of modern mobile phones in the field even when they cannot be used for communication. In terms of plant identification, having an experienced botanical enthusiast on the trip, Dr Edward Gilding, was a great advantage and the collection would not have been successful without his expertise. Although all members of the party were proficient in plant identification, Dr Gilding demonstrated exceptional intuition in identifying plants and habitats that contained our target taxa. This meant stopping numerous times on the side of the road in areas that would have been overlooked by others, for example, the fruitful collection site at Same Creek where C. ternatea was abundant. Dr Gilding’s extensive knowledge and intuition were possibly the most valuable resource available on the trip in terms of plant collection. 
Aside from Dr Gilding’s exceptional skills, Professor Craik has prior experience leading collection trips for cyclotide-containing plants and is highly skilled at identifying species from the cyclotide-containing plant families. Dr Jackson has experience with plant collection trips and has excellent photography skills that were employed to capture high-quality photographs of selected species on the trip. This was the first plant collection trip for Dr Kaas, but his exceptional attention to detail and keen eye meant that he identified many target taxa in the field and his skills matched those of team members with prior experience. He also provided particularly useful records of the human side of the trip by taking candid photographs at collection stops and of collection equipment. Having an experienced, professional and enthusiastic team was key to the success of the trip.
Field guides were generated to assist with plant identification on the trip (Figure 5.2). These were the only physical resource for plant identification in the field and so were particularly valuable. The Australasian Virtual Herbarium and Atlas of Living Australia websites were particularly helpful in generating field guides for each of our target taxa (ALA, 2019; AVH, 2019). Knowledge of the habitat and morphological features of the target species was the most useful feature of these field guides. The GPS coordinates for previous sample collections were used as a general guide but were not followed for locating individual specimens, as older GPS data might not reflect current plant locations, particularly given the ephemeral growth patterns of some of our target species. Still, we collected GPS data for our own collections and hope that this might be useful for the general location of species in or around the areas recorded. For the purpose of planning field trips, GPS coordinates and detailed collection information can generally be found from online plant collection databases or local herbaria. Some helpful resources include the AVH used for this trip (https://avh.chah.org.au/), JSTOR global plants database (https://plants.jstor.org/), Kew Gardens data and digital resources (Kew Gardens, 2019) and the Global Biodiversity Information Facility (https://www.gbif.org/). Local, state and national herbaria are an excellent resource for local plant collection trips.
Aside from the preparation required for the plant collection, extensive planning was required for the logistical side of the trip. This included applying for plant collection permits, booking four-wheel drive hire, checking the status of the Gibb River Road, preparing meal plans and shopping lists and being aware of safety considerations. Overall, the field trip ran very smoothly and this was probably due to careful planning and organisation before departing Brisbane. The methods section in this chapter should serve as a guide to those seeking advice on how to navigate these logistical aspects of a field trip. Although we were afforded the luxury of a modern vehicle with camper, camping kitchen and plenty of storage space for water, food and other supplies, a reasonable level of tolerance was required for the general necessities of being in the outback. 
This included being out of contact with friends and family, dealing with heat, dust and cold nights, accepting exposure to insects and other animals and accepting the possibility that there would be few bathroom facilities. All members of the party were comfortable with these arrangements and therefore the trip was an enjoyable one. Practical skills were certainly beneficial to the success of the trip. These included off-road driving, record-keeping, vehicle maintenance, camping experience, and first aid, although the latter was thankfully not required. Much of the Gibb River Road is easily driveable dirt road or gravel, but some river crossings and rougher terrain were encountered, especially during detours off the main road. Being comfortable driving a four-wheel drive vehicle as well as towing a camper helped the trip run smoothly. In one instance, a sudden burst tyre meant that the car needed to be brought to a stop and the tyre changed for the spare. As the driver, I was able to bring the car to as smooth a stop as possible and the others changed the completely blown-out tyre, although this required some digging into the gravel road to gain clearance for the new tyre to be installed. Happily, everyone on the trip was confident in changing a tyre and more than enough hands were free to provide assistance. Camping experience was beneficial to those on the trip as three of the party slept in their own tents and two slept in the camper. This meant general confidence with setting up a camp site was important, as was having basic cooking and campfire skills. In Australia, particular care needs to be taken to avoid leaving food scraps around that might attract unwanted animal visitors, and wild animals should never be fed, to reduce the risk of their becoming a nuisance and relying on food from humans.

It is also important not to leave any rubbish behind when camping, and so all rubbish was collected and disposed of thoughtfully after leaving the national parks.
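
The use of GPS data described in this section, both the coordinates embedded in mobile phone photographs and the coordinates of earlier herbarium records, can be illustrated with a short sketch. Photo metadata typically stores position as degrees, minutes and seconds with a hemisphere letter, so the sketch converts this form to signed decimal degrees and then estimates the great-circle distance to a previously recorded location using the haversine formula. This is an illustration only and not the procedure used on the trip; the function names and all coordinates are hypothetical placeholders.

```python
import math


def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus a hemisphere letter to signed decimal degrees."""
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # Southern and western hemispheres are negative by convention.
    return -value if ref in ("S", "W") else value


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in kilometres between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


# Hypothetical values for illustration only (not collection data from this trip).
photo_lat = dms_to_decimal(17, 30, 0.0, "S")
photo_lon = dms_to_decimal(125, 15, 0.0, "E")
record_lat, record_lon = -17.45, 125.30  # placeholder for an older database record

print(photo_lat, photo_lon)
print(round(haversine_km(photo_lat, photo_lon, record_lat, record_lon), 1), "km from the older record")
```

A distance estimate of this kind fits the way older records were treated on the trip, as a general guide to an area rather than an exact location for an individual plant.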

5.4.3 Record keeping and suggestions

Record keeping was important for documenting plant collections and for providing an official record of the trip. All members of the party provided photographs for this purpose, and those from Dr Kaas were particularly helpful in documenting daily activities including collecting, driving, cooking and socialising. I kept a diary for the purpose of documenting day-to-day highlights and notes of interest. These records are important for any field trip as they provide a record for those who attended to look back on, and for those who might attend similar trips in the future to find out what to expect. Much of this information was used to put together this official record of the trip. While putting together this record, it was possible to reflect on some improvements that could be made for future plant collection trips. One oversight was that although photos with GPS coordinates were taken of the plants in the field, it would have been beneficial to record which plants were actually sampled versus observed only. For example, some sites had many Oldenlandia species, and photos might not have represented those that were actually sampled. Although photos probably corresponded to those samples in most cases, it is impossible to have an official record of this. Placing an identifier next to plants in photographs would aid with this and could have the dual purpose of providing a scale for the size of specimens, which was also neglected here. For more detailed recording of the environment from which samples were taken, a standardised description could be included in the future. For example, the Kimberley region consists of over 100 smaller regions that have been carefully surveyed and classified (Fletcher et al., 2019), which could be cross-referenced to collections in the future. 
Our aim was primarily to find specimens for peptide and protein analysis rather than to survey the local flora extensively, but records of the habitat and environmental aspects might be a useful resource for future expeditions. A rudimentary record of the habitats sampled is represented in some notes and photographs taken at each site, but some standardised record of the environment is encouraged for future trips. On a lighter note, some other useful lessons were learned during the trip, including but not limited to the following: one should check for roosting fowl before setting up tents underneath mango trees; out-of-date marshmallows are not worth the outback price tag; and snakes are just as scared of Prof. Craik as he is of them (thankfully).
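
The record-keeping improvements suggested in this section could be captured in a simple standardised record. The sketch below is one possible layout under stated assumptions rather than a format that was used on this trip: each photograph is tied to its GPS coordinates, to a flag stating whether the plant was sampled or only observed, to a note of whether an identifier or scale marker is visible, and to an optional land system classification. All field names and example values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FieldRecord:
    """One photographed plant; every field name here is an illustrative assumption."""
    record_id: str                     # identifier placed next to the plant in the photo
    site: str
    taxon: str
    latitude: float                    # decimal degrees, negative in the southern hemisphere
    longitude: float
    photo_file: str
    sampled: bool                      # True if material was collected, False if observed only
    scale_marker_in_photo: bool = False
    habitat_note: str = ""
    land_system: Optional[str] = None  # e.g. a class from a regional land systems survey


def sampled_records(records: List[FieldRecord]) -> List[FieldRecord]:
    """Return only the records for plants that were actually sampled."""
    return [r for r in records if r.sampled]


# Placeholder entries for illustration only (not data from this trip).
records = [
    FieldRecord("EX-001", "Example site A", "Oldenlandia sp.", -17.50, 125.25,
                "IMG_0001.jpg", sampled=True, scale_marker_in_photo=True,
                habitat_note="exposed sandy soil"),
    FieldRecord("EX-002", "Example site A", "Oldenlandia sp.", -17.50, 125.25,
                "IMG_0002.jpg", sampled=False),
]

for record in sampled_records(records):
    print(record.record_id, record.taxon, record.photo_file)
```

Storing the sampled flag with each photograph would remove the ambiguity noted above for sites where several similar specimens were photographed.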

5.5 Conclusion

The collection trip to the Kimberley region aimed to identify and observe plant species relevant to our research regarding cyclic peptides and ligase enzymes, and from a collection perspective we were successful in this endeavour. Over 50 samples were collected and have provided our laboratory with extensive research opportunities that will continue for years to come. Successful aspects of the planning included the generation of field guides for target taxa and switching on location tagging on mobile phones. Improvements could be made in the record keeping to match individual specimens to GPS coordinates and to provide greater detail on environmental conditions and habitat. However, detailed planning and careful record keeping enabled the documentation of this trip for future reference, in particular for those who wish to visit and collect from the same sites and for general preparation for field trips in remote areas.

References

ALA (2019). Atlas of Living Australia. URL: http://www.ala.org.au (visited on 11/25/2019).
ALA (2018a). Atlas of Living Australia field guide: Byblidaceae. URL: http://avh.ala.org.au/occurrences/search?reasonTypeId=4&dwcHeaders=true&qa=none&email=bjsmithies%40gmail.com&sourceTypeId=2&fileType=csv&file=records-2018-05-01&q=lsid%3Ahttp%3A%2F%2Fid.biodiversity.org.au%2Fnode%2Fapni%2F2894220&qc=data_hub_uid:dh9&wkt=MULTIPOLYGON(((121.6196823120117+-13.478667241088319,121.6196823120117+-18.826583195480186,129.13433074951172+-18.826583195480186,129.13433074951172+-13.478667241088319,121.6196823120117+-13.478667241088319)))&&pageSize=0&&facet=true& (visited on 05/01/2018).
ALA (2018b). Atlas of Living Australia field guide: Caryophyllaceae. URL: http://avh.ala.org.au/occurrences/search?reasonTypeId=4&dwcHeaders=true&qa=none&email=bjsmithies%40gmail.com&sourceTypeId=2&fileType=csv&file=records-2018-05-01&q=lsid%3Ahttp%3A%2F%2Fid.biodiversity.org.au%2Fnode%2Fapni%2F8446964&qc=data_hub_uid:dh9&wkt=MULTIPOLYGON(((120.87570190429688+-13.325484885597936,120.87570190429688+-19.808054128088575,129.75265502929688+-19.808054128088575,129.75265502929688+-13.325484885597936,120.87570190429688+-13.325484885597936)))&&pageSize=0&&facet=true& (visited on 05/01/2018).
ALA (2018c). Atlas of Living Australia field guide: Clitoria. URL: http://avh.ala.org.au/occurrences/search?reasonTypeId=4&dwcHeaders=true&qa=none&sourceTypeId=2&fileType=csv&file=records-2018-05-01&email=e.gilding%40imb.uq.edu.au&q=lsid%3Ahttp%3A%2F%2Fid.biodiversity.org.au%2Fnode%2Fapni%2F2890598&qc=data_hub_uid:dh9&wkt=MULTIPOLYGON(((121.05285644531249+-13.496472765758952,121.05285644531249+-18.81271785640776,129.1387939453125+-18.81271785640776,129.1387939453125+-13.496472765758952,121.05285644531249+-13.496472765758952)))&&pageSize=0&&facet=true& (visited on 05/01/2018).
ALA (2018d). Atlas of Living Australia field guide: Cucurbitaceae. URL: http://avh.ala.org.au/occurrences/search?reasonTypeId=4&dwcHeaders=true&qa=none&email=bjsmithies%40gmail.com&sourceTypeId=2&fileType=csv&file=records-2018-05-01&q=lsid%3Ahttp%3A%2F%2Fid.biodiversity.org.au%2Fnode%2Fapni%2F8749521&qc=data_hub_uid:dh9&wkt=MULTIPOLYGON(((121.55479431152344+-13.368243250897287,121.55479431152344+-18.60460138845525,129.37705993652344+-18.60460138845525,129.37705993652344+-13.368243250897287,121.55479431152344+-13.368243250897287)))&&pageSize=0&&facet=true& (visited on 05/01/2018).
ALA (2018e). Atlas of Living Australia field guide: Droseraceae. URL: http://avh.ala.org.au/occurrences/search?reasonTypeId=4&dwcHeaders=true&qa=none&email=bjsmithies%40gmail.com&sourceTypeId=2&fileType=csv&file=records-2018-05-01&q=lsid%3Ahttp%3A%2F%2Fid.biodiversity.org.au%2Fnode%2Fapni%2F7173788&qc=data_hub_uid:dh9&wkt=MULTIPOLYGON(((120.87570190429688+-13.581920900545844,120.87570190429688+-19.476950206488414,129.40109252929688+-19.476950206488414,129.40109252929688+-13.581920900545844,120.87570190429688+-13.581920900545844)))&&pageSize=0&&facet=true& (visited on 05/01/2018).
ALA (2020). Atlas of Living Australia field guide: Hybanthus bennettiae. URL: https://avh.ala.org.au/occurrences/search?file=records-2020-02-06&reasonTypeId=3&dwcHeaders=true&fileType=csv&qa=none&sourceTypeId=2&email=bjsmithies%40gmail.com&q=lsid%3Ahttp%3A%2F%2Fid.biodiversity.org.au%2Fname%2Fapni%2F4919849&qc=data_hub_uid%3Adh9&&pageSize=0&&facet=true& (visited on 02/06/2020).
ALA (2018f). Atlas of Living Australia field guide: Nicotiana benthamiana. URL: http://avh.ala.org.au/occurrences/search?reasonTypeId=4&dwcHeaders=true&qa=none&email=bjsmithies%40gmail.com&sourceTypeId=2&fileType=csv&file=records-2018-05-01&q=lsid%3Ahttp%3A%2F%2Fid.biodiversity.org.au%2Fnode%2Fapni%2F2912121&qc=data_hub_uid:dh9&wkt=MULTIPOLYGON(((121.05148315429686+-12.726084296948184,121.05148315429686+-19.72534224805787,129.57687377929688+-19.72534224805787,129.57687377929688+-12.726084296948184,121.05148315429686+-12.726084296948184)))&&pageSize=0&&facet=true& (visited on 05/01/2018).
ALA (2018g). Atlas of Living Australia field guide: Passifloraceae. URL: http://avh.ala.org.au/occurrences/search?reasonTypeId=4&dwcHeaders=true&qa=none&email=bjsmithies%40gmail.com&sourceTypeId=2&fileType=csv&file=records-2018-05-04&q=lsid%3Ahttp%3A%2F%2Fid.biodiversity.org.au%2Fnode%2Fapni%2F8954559&qc=data_hub_uid:dh9&wkt=MULTIPOLYGON(((121.13937377929688+-13.068776734357694,121.13937377929688+-18.729501999072138,129.57687377929688+-18.729501999072138,129.57687377929688+-13.068776734357694,121.13937377929688+-13.068776734357694)))&&pageSize=0&&facet=true& (visited on 05/04/2018).
ALA (2018h). Atlas of Living Australia field guide: Rubiaceae. URL: http://avh.ala.org.au/occurrences/search?reasonTypeId=4&dwcHeaders=true&qa=none&sourceTypeId=2&fileType=csv&file=records-2018-05-01&email=e.gilding%40imb.uq.edu.au&q=lsid%3Ahttp%3A%2F%2Fid.biodiversity.org.au%2Fnode%2Fapni%2F8807273&qc=data_hub_uid:dh9&wkt=MULTIPOLYGON(((121.70997619628908+-13.560561745081422,121.70997619628908+-18.198043686762652,128.96095275878906+-18.198043686762652,128.96095275878906+-13.560561745081422,121.70997619628908+-13.560561745081422)))&&pageSize=0&&facet=true& (visited on 05/01/2018).
ALA (2018i). Atlas of Living Australia field guide: Violaceae. URL: http://avh.ala.org.au/occurrences/search?reasonTypeId=4&dwcHeaders=true&qa=none&email=bjsmithies%40gmail.com&sourceTypeId=2&fileType=csv&file=records-2018-05-01&q=lsid%3Ahttp%3A%2F%2Fid.biodiversity.org.au%2Fnode%2Fapni%2F8012097&qc=data_hub_uid:dh9&wkt=MULTIPOLYGON(((121.31515502929686+-13.325484885597936,121.31515502929686+-19.476950206488414,128.96163940429688+-19.476950206488414,128.96163940429688+-13.325484885597936,121.31515502929686+-13.325484885597936)))&&pageSize=0&&facet=true& (visited on 05/04/2018).
Arnison, P. G., Bibb, M. J., Bierbaum, G., Bowers, A. A., Bugni, T. S., Bulaj, G., Camarero, J. A., Campopiano, D. J., Challis, G. L., Clardy, J., Cotter, P. D., Craik, D. J., Dawson, M., Dittmann, E., Donadio, S., Dorrestein, P. C., Entian, K.-D., Fischbach, M. A., Garavelli, J. S., Göransson, U., Gruber, C. W., Haft, D. H., Hemscheidt, T. K., Hertweck, C., Hill, C., Horswill, A. R., Jaspars, M., Kelly, W. L., Klinman, J. P., Kuipers, O. P., Link, A. J., Liu, W., Marahiel, M. A., Mitchell, D. A., Moll, G. N., Moore, B. S., Müller, R., Nair, S. K., Nes, I. F., Norris, G. E., Olivera, B. M., Onaka, H., Patchett, M. L., Piel, J., Reaney, M. J. T., Rebuffat, S., Ross, R. P., Sahl, H.-G., Schmidt, E. W., Selsted, M. E., Severinov, K., Shen, B., Sivonen, K., Smith, L., Stein, T., Süssmuth, R. D., Tagg, J. R., Tang, G.-L., Truman, A. W., Vederas, J. C., Walsh, C. T., Walton, J. D., Wenzel, S. C., Willey, J. M., and Donk, W. A. van der (2013). Ribosomally synthesized and post-translationally modified peptide natural products: Overview and recommendations for a universal nomenclature. Natural Product Reports 30, pp. 108–160.
AVH (2019). The Australasian Virtual Herbarium. URL: https://avh.chah.org.au/ (visited on 11/25/2019).
Barber, C. J., Pujara, P. T., Reed, D. W., Chiwocha, S., Zhang, H., and Covello, P. S. (2013). The two-step biosynthesis of cyclic peptides from linear precursors in a member of the plant family Caryophyllaceae involves cyclization by a serine protease-like enzyme. Journal of Biological Chemistry 288, pp. 12500–12510.
Bernath-Levin, K., Nelson, C., Elliott, A. G., Jayasena, A. S., Millar, A. H., Craik, D. J., and Mylne, J. S. (2015). Peptide macrocyclization by a bifunctional endoprotease. Chemistry and Biology 22, pp. 571–582.
Bowman, D. M. J. S., Brown, G. K., Braby, M. F., Brown, J. R., Cook, L. G., Crisp, M. D., Ford, F., Haberle, S., Hughes, J., Isagi, Y., Joseph, L., McBride, J., Nelson, G., and Ladiges, P. Y. (2010). Biogeography of the Australian monsoon tropics. Journal of Biogeography 37, pp. 201–216.
Craik, D. J. (2013). Joseph Rudinger memorial lecture: Discovery and applications of cyclotides. Journal of Peptide Science 19, pp. 393–407.
Fletcher, M. and Ryan, K. (2019). Land systems of the Kimberley region, Western Australia. URL: https://www.agric.wa.gov.au/rangelands/land-systems-kimberley-region-western-australia (visited on 11/27/2019).
Fukushima, K., Nagano, K., and Hoshi, Y. (2008). Somatic chromosome differentiation in three species of the Byblis liniflora complex (Byblidaceae). Chromosome Botany 3, pp. 95–99.
Google Maps (2019). My Maps. URL: https://www.google.com/maps/about/mymaps/ (visited on 11/24/2019).
Göransson, U., Malik, S., and Slazak, B. (2015). Chapter Two - Cyclotides in the Violaceae. Advances in Botanical Research. Ed. by D. J. Craik. Vol. 76. Academic Press, pp. 15–49.
Gruber, C. W., Elliott, A. G., Ireland, D. C., Delprete, P. G., Dessein, S., Göransson, U., Trabi, M., Wang, C. K., Kinghorn, A. B., Robbrecht, E., and Craik, D. J. (2008). Distribution and evolution of circular miniproteins in flowering plants. Plant Cell 20, pp. 2471–2483.
Harris, K. S., Durek, T., Kaas, Q., Poth, A. G., Gilding, E. K., Conlan, B. F., Saska, I., Daly, N. L., Weerden, N. L. van der, Craik, D. J., and Anderson, M. A. (2015). Efficient backbone cyclization of linear peptides by a recombinant asparaginyl endopeptidase. Nature Communications 6, p. 10199.
Harris, K. S., Guarino, R. F., Dissanayake, R. S., Quimbar, P., McCorkelle, O. C., Poon, S., Kaas, Q., Durek, T., Gilding, E. K., Jackson, M. A., Craik, D. J., Weerden, N. L. van der, Anders, R. F., and Anderson, M. A. (2019). A suite of kinetically superior AEP ligases can cyclise an intrinsically disordered protein. Scientific Reports 9, p. 10820.
Jackson, M. A., Gilding, E. K., Shafee, T., Harris, K. S., Kaas, Q., Poon, S., Yap, K., Jia, H., Guarino, R., Chan, L. Y., Durek, T., Anderson, M. A., and Craik, D. J. (2018). Molecular basis for the production of cyclic peptides by plant asparaginyl endopeptidases. Nature Communications 9, p. 2411.
Jobson, R. W. and Baleeiro, P. C. (2015). Two new species of Utricularia (Lentibulariaceae) from the North West region of Western Australia. Telopea 18, pp. 201–208.
Kan, M.-W. and Craik, D. J. (2018). Chapter 14. Trends in cyclotide research. Cyclic Peptides: From Bioorganic Synthesis to Applications. The Royal Society of Chemistry, pp. 302–339.
Kew Gardens (2019). Data and digital resources. URL: https://www.kew.org/science/collections-and-resources/data-and-digital (visited on 11/28/2019).
Lowrie, A. and Conran, J. G. (2008). Byblis guehoi (Byblidaceae), a new species from the Kimberley, Western Australia. Telopea 12, pp. 23–29.
Oguis, G. K., Gilding, E. K., Jackson, M. A., and Craik, D. J. (2019). Butterfly pea (Clitoria ternatea), a cyclotide-bearing plant with applications in agriculture and medicine. Frontiers in Plant Science 10, p. 645.
Poon, S., Harris, K. S., Jackson, M. A., McCorkelle, O. C., Gilding, E. K., Durek, T., Weerden, N. L. van der, Craik, D. J., and Anderson, M. A. (2018). Co-expression of a cyclizing asparaginyl endopeptidase enables efficient production of cyclic peptides in planta. Journal of Experimental Botany 69, pp. 633–641.
St John Ambulance Australia Incorporated (2018). First Aid App. Version 1.1.
Stewart, J. M., Craven, L. A., Brubaker, C., and Wendel, J. F. (2015). Gossypium anapoides (Malvaceae), a new species from Western Australia. Novon: A Journal for Botanical Nomenclature 23, pp. 447–451.
Trabi, M., Mylne, J. S., Sando, L., and Craik, D. J. (2009). Circular proteins from Melicytus (Violaceae) refine the conserved protein and gene architecture of cyclotides. Organic and Biomolecular Chemistry 7, pp. 2378–2388.
Weidmann, J. and Craik, D. J. (2016). Discovery, structure, function, and applications of cyclotides: Circular proteins from plants. Journal of Experimental Botany 67, pp. 4801–4812.

Yang, R., Wong, Y. H., Nguyen, G. K., Tam, J. P., Lescar, J., and Wu, B. (2017). Engineering a catalytically efficient recombinant protein ligase. Journal of the American Chemical Society 139, pp. 5351–5358.
Zauner, F. B., Elsässer, B., Dall, E., Cabrele, C., and Brandstetter, H. (2018). Structural analyses of Arabidopsis thaliana legumain γ reveal the differential recognition and processing of proteolysis and ligation substrates. Journal of Biological Chemistry 293, pp. 8934–8946.

Chapter 6

Outlook and future directions

6.1 Overview

As is clear from the earlier chapters in this thesis, MCoTI-II is a well-studied cyclotide from Momordica cochinchinensis seeds and has therapeutic potential as a scaffold for drug design. This thesis aimed to investigate plant-based production of this promising cyclotide and its derivatives as an alternative to chemical synthesis for research and production. The ultimate goal of this work was to facilitate research and development of MCoTI-II and grafted cyclotides in the broader scientific community by means of cheaper, greener production without the need for chemical synthesis expertise. The background to the thesis was given in Chapter 1, and then the compatibility of MCoTI-II with plant-based production was assessed in Chapters 2–4 by looking at the components of the genetic expression construct, compatibility with AEP-mediated cyclisation, and strategies to optimise leaf-based expression. The ongoing discovery of novel cyclotides and cyclising AEPs of relevance to this research was discussed in the context of a plant collection trip to the Kimberley region of Western Australia in Chapter 5. Overall, the findings here highlight that MCoTI-II can be expressed in plants given the correct cellular environment and that semi-enzymatic production could be an alternative to either complete chemical synthesis or complete biosynthesis in planta.

6.2 MCoTI-II expression is suited to seeds

The recombinant expression experiments in Chapter 2 showed that MCoTI-II and its derivatives could be expressed in Arabidopsis thaliana seeds. These included a single amino acid mutant, [K6V]MCoTI-II, the grafted cyclotide MCoAbl09, and two of a series of loop 5 cyclisation variants. The expression of modified MCoTI-II analogues expands on the expression of native MCoTI-II in A. thaliana seeds by Mylne and colleagues (2012). The work presented in Chapters 2 and 4 also builds on that 2012 study by introducing modified precursor genes instead of the native precursor, TIPTOP2, to enhance cyclotide expression. Using different precursor proteins for cyclotide expression is already common practice in Nicotiana benthamiana leaves (Poon et al., 2018; Jackson et al., 2018; Jackson et al., 2019) and this was applied in Chapter 4, where the PPI and NaD1 precursors produced kalata B1 in a LEC2-dependent manner. In addition to leaf-based expression, the results in Chapter 2 suggest that the alternative precursor approach can be applied in A. thaliana seeds by using the Paws1 and TI2Vac precursors for MCoTI-II expression. The Paws1 gene expressed in A. thaliana for SFTI-1 production (Mylne et al., 2012) was modified to carry the MCoTI-II and [K6V]MCoTI-II sequences and proved more efficient than the truncated TI2Vac precursor in terms of MALDI-MS signal intensity and frequency of detection in almost all cases. The findings presented suggest that A. thaliana seeds are suitable for MCoTI-II expression and that this can be enhanced by using precursors from other cyclic peptides. The success of different genetic precursors could be attributed to differences in the physicochemical properties of the precursor, prevacuolar sorting pathways, or their compatibility with cyclising AEPs. These could come into play at different stages during the journey of the propeptide through the cell.

The physicochemical properties of the precursor could include the overall charge of the propeptide as it moves through progressively more acidic compartments, or its hydrophobicity, and whether this affects protein stability in oily seeds versus proteolytic leaves. Native MCoTI-II has a net charge of +3 at pH 7 but most cyclotides are more neutral in charge. For example, the prototypical cyclotide kalata B1, which is readily expressed in planta (Poon et al., 2018; Jackson et al., 2018), has a net charge of 0. Previous investigations into small, cationic, disulfide-rich peptides called defensins have suggested that their acidic precursor domains may have a role in protecting the mature peptide from targeted degradation during biosynthesis (Shafee et al., 2017). This might suggest a similar role for cyclotide precursors. Using the Henn2 precursor for expression of MCoTI-II in Chapter 2 was an attempt to see whether a negatively charged precursor could balance out a positively charged cyclotide to improve expression, but this was not the case. Simply balancing the charge of the cyclotide and precursor is probably not sufficient to improve expression based on the evidence here, but a larger study could be carried out to test multiple precursors in different plant tissues. Hydrophobicity is another property of the precursor that might affect cyclotide accumulation. A hydrophobic peptide might accumulate better in an oily seed compared to a leaf, for example. MCoTI-II is water soluble, whereas its grafted analogues can be more hydrophobic. In Chapter 3, most of the loop 5 analogues of MCoTI-II were oxidised in simple ammonium bicarbonate buffer with glutathione, but [G1,G33]L5_Abl06 required the addition of 20% isopropanol to this buffer to achieve successful folding of the peptide. This is a simple demonstration of how the hydrophobicity of the peptide can affect peptide maturation. 
The correlation between expression success and the hydrophobicity of cyclotides and their precursors would be an interesting topic for further research. Another feature of the precursor peptide is the C-terminal propeptide (CTPP), which determines compatibility with AEP-mediated cyclisation. In Chapter 2, cyclotides were expressed in the wild-type A. thaliana Columbia ecotype without introduction of known cyclase AEPs. This means that the cyclic products detected were probably cyclised, albeit inefficiently, by endogenous AEPs. A. thaliana contains four AEPs, which are also referred to as vacuolar processing enzymes (VPEs). One of these, β-VPE, is found in the seeds. Cyclic MCoTI-I, MCoTI-II and MCoTI-IV have been detected in A. thaliana seeds expressing the TIPTOP2 gene, and these cyclotides were not detected when the same gene was expressed in an aep null A. thaliana line (Mylne et al., 2012). This suggests that endogenous AEPs can mature trypsin inhibitor cyclotides in the seed, in agreement with the results presented here. In N. benthamiana, co-expression of cyclising AEPs can increase cyclic peptide content compared to no AEP co-expression (Poon et al., 2018; Jackson et al., 2018). Therefore, incorporating foreign AEPs into A. thaliana might also improve cyclic peptide accumulation compared to endogenous AEP activity. AEP-expressing lines of A. thaliana could be developed for agroinfiltration with cyclotide constructs, or homozygous lines could be generated for both AEP expression and cyclotide expression and cross-pollinated. Both of these approaches are straightforward but would require extensive time and resources, especially given the number of cyclotides and AEPs that could be tested and the case-by-case nature of their compatibility. Therefore, a known cyclotide-AEP match would be a good starting point. For example, co-expressing the loop 5 MCoTI-II variant [G1,G33]L5ALEG alongside OaAEP1b would be interesting given that they are compatible, as shown in vitro in Chapter 3. It would also be interesting to test this combination in seeds given that co-expression of the pair did not result in cyclotide detection in N. benthamiana leaves in Chapter 4. Expressing the same cyclotide-AEP pair successfully in seeds and unsuccessfully in leaves could provide a basis for comparing the biosynthetic pathway in the two tissues. Finally, other seed-based platforms could be explored for MCoTI-II expression to move forward from the A. thaliana model. Brassica napus seed would be an ideal candidate given that it is an oilseed crop developed for commercial production and has been used for transgene expression (Parmenter et al., 1995). It is amenable to the floral dip method of Agrobacterium-mediated transformation, which removes the need for sterile selection and regeneration from callus (Li et al., 2010). Other plants that have been used for seed-based production of recombinant proteins and peptides include tobacco, rice, barley, maize and safflower (Shen et al., 2018).
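
As a rough check on the net charges quoted earlier in this section (+3 for MCoTI-II and 0 for kalata B1 at pH 7), the ionisable side chains can simply be counted. The Python sketch below is illustrative only and is not the method used in this thesis. It assumes that Lys and Arg each contribute +1, that Asp and Glu each contribute -1, that cysteines are locked in disulfide bonds, and that a backbone-cyclic peptide has no free termini. The function name and the two sequences are assumptions introduced here; the sequences are written as commonly reported in the literature and should be checked against a curated database before reuse.

```python
# Estimate the net charge of a backbone-cyclic peptide near pH 7 by counting
# ionisable side chains. Assumptions (not taken from the thesis): Lys/Arg = +1,
# Asp/Glu = -1, His treated as neutral, cysteines paired in disulfides, and no
# free N- or C-terminus because the backbone is cyclic.
POSITIVE = set("KR")
NEGATIVE = set("DE")


def net_charge_cyclic(sequence: str) -> int:
    """Return the integer side-chain net charge of a cyclic peptide."""
    seq = sequence.upper()
    positives = sum(residue in POSITIVE for residue in seq)
    negatives = sum(residue in NEGATIVE for residue in seq)
    return positives - negatives


# Sequences as commonly reported; treat them as assumptions and verify them
# against a curated source before relying on them.
MCOTI_II = "SGSDGGVCPKILKKCRRDSDCPGACICRGNGYCG"
KALATA_B1 = "GLPVCGETCVGGTCNTPGCTCSWPVCTRN"

if __name__ == "__main__":
    for name, seq in (("MCoTI-II", MCOTI_II), ("kalata B1", KALATA_B1)):
        print(f"{name}: net charge {net_charge_cyclic(seq)}")
```

The same count could be applied to a precursor domain such as Henn2 to compare precursor and cyclotide charge, although a linear precursor would also need its free termini taken into account.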

6.3 The loop 5 cyclisation point opens up the possibility for in planta biosynthesis of grafted cyclotides

One of the major problems for plant-based expression of MCoTI-II was predicted to be accommodating grafted sequences at the same time as an AEP-compatible cyclisation point. This was addressed in Chapters 2 and 3 by relocating the cyclisation point to loop 5 of MCoTI-II, away from the native cyclisation point in loop 6 that is often replaced by grafted sequences. This was achieved by using the N in loop 5 as the C-terminal residue of the cyclotide, followed by the C-terminal propeptide (CTPP) ‘ALEG’, and cyclising the scaffold in vitro using recombinant OaAEP1b. Both synthetic and grafted
[G1,G33]L5_Abl06ALEG were cyclised in vitro using OaAEP1b, and the next step will be to replicate this in planta. The ability to produce loop 5 cyclised MCoTI-II in A. thaliana seeds was demonstrated in Chapter 2 but without co-expression of AEPs. Attempts were made to express loop 5 variants in N. benthamiana leaf in Chapter 4, but the peptides were not detected. Thus, expression of [G1,G33]L5ALEG and grafted [G1,G33]L5_Abl06ALEG with OaAEP1b should be explored in A. thaliana seeds to establish whether in vitro compatibility between the two correlates with in planta expression. More broadly, correlation between in vitro and in planta cyclotide-AEP combinations would indicate that in vitro studies could be used to assess pairings before expression in seeds. This could be used instead of transient expression in N. benthamiana for rapidly assessing AEP compatibility. Ideally, in vitro assays such as those conducted in Chapter 3 could be used to establish peptide-AEP compatibility and this knowledge could be transferred to plant-based production. It is important to have a rapid screening system in place to keep up with expression of new cyclotides and to test new cyclising AEPs as they are discovered and characterised. Not all loop 6 grafted cyclotides will be compatible with in planta production and the work here provides a semi-enzymatic approach for these situations that still takes advantage of plant AEPs. For example, some grafted MCoTI-II analogues contain D-amino acids (Chan et al., 2016; Chan et al., 2015; Aboye et al., 2012) and other non-proteinogenic amino acids (Aboye et al., 2016) and will require synthetic production. Other cyclotides might not accumulate in plant tissues, and some laboratories do not have plant-production facilities. The ability to cyclise these peptides in loop 5 using AEPs will accommodate their production in vitro in these situations. The in vitro strategy can also be useful for characterising new AEPs with regard to loop 5 cyclisation variants. Using in vitro cyclisation assays to characterise AEPs has already proven useful for identifying efficient AEP ligases for cyclotide maturation (Harris et al., 2015; Harris et al., 2019). The new loop 5 cyclisation point engineered here means that loop 6 grafted cyclotides can now be screened for AEP compatibility in the same way. This is particularly relevant given the recent advances made in characterising cyclase-type AEPs (Jackson et al., 2018; Harris et al., 2019; Hemu et al., 2019) and the ongoing discovery efforts for new cyclases.

6.4 The transient expression system in Nicotiana benthamiana can be optimised for cyclotide expression

As detailed in Chapter 4, expression of kalata B1 in N. benthamiana leaves was enhanced by co-expression of the transcription factor LEC2 in certain cases, and MCoTI-II was detected for the first time in this system. This showed that cyclotide expression can be improved when the constructs are expressed from seed-specific promoters in leaves in the presence of LEC2 and OaAEP1b, and that linear precursors can accumulate in the presence of LEC2 when OaAEP1b is absent. These results suggest that leaf-based expression can be improved beyond the standard expression conditions commonly used for cyclotides in N. benthamiana. MCoTI-II is notoriously difficult to express in N. benthamiana leaves, but its expression in seeds in Chapter 2 and the development of loop 5 cyclisation variants in Chapter 3 are promising and would be complemented by a rapid in planta screening system for cyclotide-AEP pairings. Co-expression of LEC2 did result in detection of cyclic MCoTI-II, confirming that the peptide can be expressed, but the relative signal intensities were very low. Studying the differences between N. benthamiana leaf expression with and without LEC2, and comparing this to seed expression in A. thaliana, would provide valuable insight into the differences between the leaf and seed cellular environments. This knowledge could be applied to leaf systems to adapt them to MCoTI-II expression and facilitate more rapid assays like those attempted in N. benthamiana in Chapters 2 and 4. In terms of studying the biosynthesis of cyclotides, most research has focused on identifying the destination for cyclotides in the cell rather than the journey to get there. The targeting of kalata B1 to the vacuole was first visualised using an Oak1-GFP fusion protein (Conlan et al., 2011). Improvements in fluorescent imaging, especially in N. benthamiana (Bally et al., 2018), mean that live imaging of subcellular compartmentalisation is now more flexible than ever. Photoconvertible subcellular probes designed for prevacuolar and vacuolar imaging that are also compatible with GFP would be particularly relevant for live imaging of cyclotides directed to these compartments (Mathur et al., 2010). Labelling of prevacuolar compartments using photoconvertible proteins such as mEosFP (Mathur et al., 2010) in the presence and absence of LEC2 expression could be used as a tool to compare wild-type leaves and LEC2-induced leaves and to explain the mechanism behind the LEC2-enhanced cyclotide expression seen in Chapter 4. In addition, cyclotides such as MCoTI-II could be retained in the ER using KDEL sequences to confirm whether linear precursors can accumulate there or whether they are already targeted for degradation. A theme throughout this thesis has been the need for a better understanding of the intricacies of cyclotide biosynthesis. This would help answer questions as to where MCoTI-II is hindered during biosynthesis in N. benthamiana leaves, how seed expression differs from leaf expression and whether this is due to differences in prevacuolar sorting. It would also be beneficial to confirm that MCoTI-II is indeed targeted to the vacuole, an assumption based on its compatibility with AEPs that are located there. It is possible that MCoTI-II might meet AEPs in prevacuolar compartments in seeds, for example. The role of regulatory elements in cyclotide expression is largely unexplored, and studying them could assist in this regard. Regulatory elements from other plant species such as Phaseolus vulgaris have been used in a seed-expression context (De Jaeger et al., 2002; Hernández-Velázquez et al., 2015) and could be used to build on findings from Chapters 2 and 4 and uncover more details regarding cyclotide trafficking. Understanding the biosynthesis of MCoTI-II is probably the single most important factor in being able to adapt these cyclotides and their precursors for efficient in planta production.

6.5 Ongoing discovery efforts for cyclotides and cyclising enzymes are encouraged

Discovery of new cyclotides and AEPs could assist in all fields of cyclotide research, including natural bioactivities, drug design, and in planta expression. Chapter 5 of this thesis detailed a field trip to the Kimberleys in May 2018 for this purpose. The specimens collected during the trip covered many of the target taxa, including cyclotide-containing families and species of carnivorous plants for screening for new peptide ligases. Accessions were collected from known cyclotide-containing species such as Clitoria ternatea that can be used to further characterise their cyclotide profiles. The discovery of new cyclotides from similar plant collection trips could provide valuable new tools for agriculture and drug design. Importantly for the work presented in this thesis, new cyclotide discovery could uncover more cyclic peptide precursors that could be utilised for transgenic cyclotide expression. The results in Chapter 2 of this thesis suggest that seed-derived precursors, such as Paws1, are best suited to seed expression, and so the discovery of more of these could expand the toolbox for improving seed-based expression. Furthermore, discovery of cyclotides with MCoTI-II-like properties, such as high pKa values, that are also expressed in leaves could provide suitable precursors for MCoTI-II expression in N. benthamiana leaves.

The discovery of new AEPs will also complement cyclotide expression in planta and semi-enzymatic production in vitro. So far, in planta expression has mainly utilised AEPs from the cyclotide-containing plants Oldenlandia affinis, Clitoria ternatea and Petunia hybrida (Poon et al., 2018; Jackson et al., 2019), but recent advances in understanding the genomic predictors of AEP cyclase activity (Jackson et al., 2019) mean that these AEPs can now be identified more readily from sequenced plant material. In addition, cyclase activity is not limited to AEPs from cyclotide-containing plants (Zauner et al., 2018; Bernath-Levin et al., 2015), so the potential for discovering new AEPs relevant for cyclotide maturation purposes is vast. It is hoped that the detailed planning and outcomes described in Chapter 5 of this thesis will assist others in planning field trips for the collection of plant material for discovery of new cyclotides, their precursor proteins and AEPs. Collected material can then be readily processed using well-established cyclotide discovery techniques (Göransson et al., 2004; Gruber et al., 2008; Mahatmanto et al., 2014). Further investigation of the samples collected on this trip and future collections should be carried out to expand the genetic and enzymatic toolkit for future recombinant cyclotide production. It is hoped that the recording of this field trip has conveyed that field trips require extensive planning, logistics, resources and a positive attitude, but can be a highly rewarding experience.

6.6 Suggested experiments for the continuation of this work

To realise the potential for expressing MCoTI-II and therapeutic derivatives in planta, the following research should be prioritised for either seed-based or leaf-based expression:

6.6.1 Suggestions for seed-based expression

1. Co-express cyclase-type AEPs and MCoTI-II in A. thaliana seeds. This will expand on the expression of MCoTI-II and analogues demonstrated in Chapter 2, which was carried out without AEP co-expression, with the aim of finding whether seed expression can be enhanced using AEPs. Co-expression of cyclotide-AEP combinations that have been confirmed in vitro, such as [G1,G33]L5ALEG or [G1,G33]L5_Abl06ALEG with OaAEP1b, should be carried out first. Other suggestions include using seed-specific AEPs, such as those from Momordica cochinchinensis.

2. Establish whether in vitro data can be used to predict cyclotide-AEP compatibility in planta. Building on the experiment described in point 1, a larger scale assessment of cyclotide-AEP compatibility should be carried out for MCoTI-II peptides both in vitro with recombinant enzymes and in A. thaliana seeds. If the same cyclotide-AEP pairs are compatible in the two systems, in vitro work could be used in the future to determine which combination to co-express in A. thaliana before beginning the relatively lengthy process of establishing transgenic lines.

3. Test MCoTI-II expression in productive seed systems. A. thaliana seeds are not the most suitable candidates for larger scale seed-based production, so other seed systems should be explored. Brassica napus can be transformed using the straightforward floral dip method and is therefore recommended as a candidate for further investigation into seed-based production.

4. Visualise the subcellular localisation of MCoTI-II during biosynthesis in A. thaliana seed. Understanding the subcellular targeting of MCoTI-II, including when it comes into contact with AEPs and whether its final accumulation is indeed in the vacuole, will assist in designing expression constructs for these cyclotides. This could be carried out using immunolabelling, electron microscopy and MALDI-MSI techniques (Otegui et al., 2006; Slazak et al., 2018) to visualise AEPs and cyclotides.

6.6.2 Suggestions for leaf-based expression

1. Identify where MCoTI-II is located on a subcellular level with and without co-expression of LEC2. Using a GFP-fusion with MCoTI-II and live imaging in N. benthamiana leaves would provide an insight into how MCoTI-II progresses and where it is hindered during expression in leaf tissue. This should be compared with a positive control such as kalata B1 that is readily expressed in N. benthamiana leaves. This data could be compared with seed experiments outlined above to see where the cyclotide biosynthetic pathways differ in these tissues, particularly during prevacuolar sorting. Signal peptides and retention signals such as KDELs, along with truncated cyclotide sequences, would be ideal for targeting subcellular compartments for comparison with the native pathway.

2. Combine LEC2 with other treatments such as heat shock. Peptide folding is probably a key determinant of MCoTI-II accumulation in N. benthamiana leaves. A simple heat shock treatment of agroinfiltrated plants could possibly stimulate heat shock proteins to facilitate folding of cyclotide precursors. A recent study showed that 2 days post infiltration is the optimal time to heat shock N. benthamiana plants to enhance transgene expression (Norkunas et al., 2018). This could be combined with LEC2 expression to see if cyclotide accumulation can be enhanced further.

3. Target expression of MCoTI-II to other subcellular compartments. Directing MCoTI-II to subcellular compartments where it is exposed to different protease environments could enhance accumulation of the peptide. There is some evidence that cysteine rich peptides can fold in the chloroplast (Bally et al., 2007), and a tag for apoplast targeting and subsequent purification has recently been developed (Sainsbury et al., 2016). Directing MCoTI-II to either of these compartments could be a means to increase peptide yield. If AEPs could not be co-expressed here, cyclisation could be subsequently carried out in vitro after purification of the linear substrate from the plant.

6.7 Concluding remarks

In conclusion, this thesis has established that MCoTI-II and analogues are compatible with expression in plant tissues. MCoTI-II has been redesigned to be compatible with plant AEPs even when grafted epitopes replace the cyclisation position in loop 6. This work can be used as the basis for optimising MCoTI-II expression in planta for larger scale production of grafted MCoTI-II drug candidates to assist in realising the potential of these promising cyclotides.

References

Aboye, T. L., Ha, H., Majumder, S., Christ, F., Debyser, Z., Shekhtman, A., Neamati, N., and Camarero, J. A. (2012). Design of a novel cyclotide-based CXCR4 antagonist with anti-human immunodeficiency virus (HIV)-1 activity. Journal of Medicinal Chemistry 55, pp. 10729–10734.
Aboye, T., Meeks, C., Majumder, S., Shekhtman, A., Rodgers, K., and Camarero, J. (2016). Design of a MCoTI-based cyclotide with angiotensin (1-7)-like activity. Molecules 21, p. 152.
Bally, J., Jung, H., Mortimer, C., Naim, F., Philips, J. G., Hellens, R., Bombarely, A., Goodin, M. M., and Waterhouse, P. M. (2018). The rise and rise of Nicotiana benthamiana: A plant for all reasons. Annual Review of Phytopathology 56, pp. 405–426.
Bally, J., Paget, E., Droux, M., Job, C., Job, D., and Dubald, M. (2007). Both the stroma and thylakoid lumen of tobacco chloroplasts are competent for the formation of disulphide bonds in recombinant proteins. Plant Biotechnology Journal 6, pp. 46–61.
Bernath-Levin, K., Nelson, C., Elliott, A. G., Jayasena, A. S., Millar, A. H., Craik, D. J., and Mylne, J. S. (2015). Peptide macrocyclization by a bifunctional endoprotease. Chemistry and Biology 22, pp. 571–582.
Chan, L. Y., Craik, D. J., and Daly, N. L. (2015). Cyclic thrombospondin-1 mimetics: grafting of a thrombospondin sequence into circular disulfide-rich frameworks to inhibit endothelial cell migration. Bioscience Reports 35, e00270.
Chan, L. Y., Craik, D. J., and Daly, N. L. (2016). Dual-targeting anti-angiogenic cyclic peptides as potential drug leads for cancer therapy. Scientific Reports 6, p. 35347.
Conlan, B. F., Gillon, A. D., Barbeta, B. L., and Anderson, M. A. (2011). Subcellular targeting and biosynthesis of cyclotides in plant cells. American Journal of Botany 98, pp. 2018–2026.
De Jaeger, G., Scheffer, S., Jacobs, A., Zambre, M., Zobell, O., Goossens, A., Depicker, A., and Angenon, G. (2002). Boosting heterologous protein production in transgenic dicotyledonous seeds using Phaseolus vulgaris regulatory sequences. Nature Biotechnology 20, pp. 1265–1268.
Göransson, U., Svangard, E., Claeson, P., and Bohlin, L. (2004). Novel strategies for isolation and characterization of cyclotides: The discovery of bioactive macrocyclic plant polypeptides in the Violaceae. Current Protein and Peptide Science 5, pp. 317–329.
Gruber, C. W., Elliott, A. G., Ireland, D. C., Delprete, P. G., Dessein, S., Göransson, U., Trabi, M., Wang, C. K., Kinghorn, A. B., Robbrecht, E., and Craik, D. J. (2008). Distribution and evolution of circular miniproteins in flowering plants. Plant Cell 20, pp. 2471–2483.
Harris, K. S., Durek, T., Kaas, Q., Poth, A. G., Gilding, E. K., Conlan, B. F., Saska, I., Daly, N. L., Weerden, N. L. van der, Craik, D. J., and Anderson, M. A. (2015). Efficient backbone cyclization of linear peptides by a recombinant asparaginyl endopeptidase. Nature Communications 6, p. 10199.
Harris, K. S., Guarino, R. F., Dissanayake, R. S., Quimbar, P., McCorkelle, O. C., Poon, S., Kaas, Q., Durek, T., Gilding, E. K., Jackson, M. A., Craik, D. J., Weerden, N. L. van der, Anders, R. F., and Anderson, M. A. (2019). A suite of kinetically superior AEP ligases can cyclise an intrinsically disordered protein. Scientific Reports 9, p. 10820.
Hemu, X., El Sahili, A., Hu, S., Wong, K., Chen, Y., Wong, Y. H., Zhang, X., Serra, A., Goh, B. C., Darwis, D. A., Chen, M. W., Sze, S. K., Liu, C.-F., Lescar, J., and Tam, J. P. (2019). Structural determinants for peptide-bond formation by asparaginyl ligases. Proceedings of the National Academy of Sciences of the United States of America 116, pp. 11737–11746.
Hernández-Velázquez, A., López-Quesada, A., Ceballo-Cámara, Y., Cabrera-Herrera, G., Tiel-González, K., Mirabal-Ortega, L., Pérez-Martínez, M., Pérez-Castillo, R., Rosabal-Ayán, Y., Ramos-González, O., et al. (2015). Tobacco seeds as efficient production platform for a biologically active anti-HBsAg monoclonal antibody. Transgenic Research 24, pp. 897–909.
Jackson, M. A., Gilding, E. K., Shafee, T., Harris, K. S., Kaas, Q., Poon, S., Yap, K., Jia, H., Guarino, R., Chan, L. Y., Durek, T., Anderson, M. A., and Craik, D. J. (2018). Molecular basis for the production of cyclic peptides by plant asparaginyl endopeptidases. Nature Communications 9, p. 2411.
Jackson, M. A., Yap, K., Poth, A. G., Gilding, E. K., Swedberg, J. E., Poon, S., Qu, H., Durek, T., Harris, K., Anderson, M. A., and Craik, D. J. (2019). Rapid and scalable plant-based production of a potent plasmin inhibitor peptide. Frontiers in Plant Science 10, p. 602.
Li, J., Tan, X., Zhu, F., and Guo, J. (2010). A rapid and simple method for Brassica napus floral-dip transformation and selection of transgenic plantlets. International Journal of Biology 2, p. 127.
Mahatmanto, T., Poth, A. G., Mylne, J. S., and Craik, D. J. (2014). A comparative study of extraction methods reveals preferred solvents for cystine knot peptide isolation from Momordica cochinchinensis seeds. Fitoterapia 95, pp. 22–33.
Mathur, J., Radhamony, R., Sinclair, A. M., Donoso, A., Dunn, N., Roach, E., Radford, D., S. Mohammad Mohaghegh, P., Logan, D. C., Kokolic, K., and Mathur, N. (2010). mEosFP-Based green-to-red photoconvertible subcellular probes for plants. Plant Physiology 154, pp. 1573–1587.
Mylne, J. S., Chan, L. Y., Chanson, A. H., Daly, N. L., Schaefer, H., Bailey, T. L., Nguyencong, P., Cascales, L., and Craik, D. J. (2012). Cyclic peptides arising by evolutionary parallelism via asparaginyl-endopeptidase–mediated biosynthesis. The Plant Cell 24, pp. 2765–2778.
Norkunas, K., Harding, R., Dale, J., and Dugdale, B. (2018). Improving agroinfiltration-based transient gene expression in Nicotiana benthamiana. Plant Methods 14, p. 71.
Otegui, M. S., Herder, R., Schulze, J., Jung, R., and Staehelin, L. A. (2006). The proteolytic processing of seed storage proteins in Arabidopsis embryo cells starts in the multivesicular bodies. The Plant Cell Online 18, pp. 2567–2581.
Parmenter, D., Boothe, J. v., Van Rooijen, G., Yeung, E., and Moloney, M. (1995). Production of biologically active hirudin in plant seeds using oleosin partitioning. Plant Molecular Biology 29, pp. 1167–1180.
Poon, S., Harris, K. S., Jackson, M. A., McCorkelle, O. C., Gilding, E. K., Durek, T., Weerden, N. L. van der, Craik, D. J., and Anderson, M. A. (2018). Co-expression of a cyclizing asparaginyl endopeptidase enables efficient production of cyclic peptides in planta. Journal of Experimental Botany 69, pp. 633–641.
Sainsbury, F., Jutras, P. V., Vorster, J., Goulet, M.-C., and Michaud, D. (2016). A chimeric affinity tag for efficient expression and chromatographic purification of heterologous proteins from plants. Frontiers in Plant Science 7, p. 141.
Shafee, T. M., Lay, F. T., Phan, T. K., Anderson, M. A., and Hulett, M. D. (2017). Convergent evolution of defensin sequence, structure and function. Cellular and Molecular Life Sciences 74, pp. 663–682.
Shen, J., Wang, X., and Jiang, L. (2018). Seeds as Bioreactors. Molecular Pharming: Applications, Challenges and Emerging Areas. Ed. by A. R. Kermode. 1st ed. John Wiley & Sons, Inc, pp. 93–118.
Slazak, B., Kapusta, M., Strömstedt, A. A., Słomka, A., Krychowiak, M., Shariatgorji, M., Andrén, P. E., Bohdanowicz, J., Kuta, E., and Göransson, U. (2018). How does the sweet violet (Viola odorata L.) fight pathogens and pests – Cyclotides as a comprehensive plant host defense system. Frontiers in Plant Science 9, p. 1296.
Zauner, F. B., Elsasser, B., Dall, E., Cabrele, C., and Brandstetter, H. (2018). Structural analyses of Arabidopsis thaliana legumain γ reveal the differential recognition and processing of proteolysis and ligation substrates. Journal of Biological Chemistry 293, pp. 8934–8946.

Appendix A

Supplementary information

                       Cyclic              Linear              Precursor
Peptide                oxidised  reduced   oxidised  reduced   oxidised  reduced
L5ALEG                 3451.56   3457.56   3469.56   3475.56   3839.74   3845.74
L5GLP                  3451.56   3457.56   3469.56   3475.56   3736.72   3742.72
[G1,G33]L5ALEG         3565.6    3571.6    3583.6    3589.6    3953.79   3959.79
[G1]L5ALEG             3508.58   3514.58   3526.58   3532.58   3896.76   3902.76
[G33]L5ALEG            3508.58   3514.58   3526.58   3532.58   3896.76   3902.76
[G1,G33]L5_Abl06ALEG   4095.94   4101.94   4113.94   4119.94   4484.12   4490.12
kB1EII                 2891.21   2897.21   2909.21   2915.21   3264.42   3270.42

Table A.1: Calculated masses of peptides used in Chapter 3. Monoisotopic masses are presented in daltons (Da; [M+H]+). Precursor masses represent the cyclotide sequence plus the C-terminal propeptide (CTPP). Linear masses represent removal of the CTPP, but without backbone cyclisation. Cyclic masses represent removal of the CTPP and backbone cyclisation.
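
The relationships between the columns of Table A.1 can be reproduced with a short calculation. The Python sketch below is illustrative only and is not the procedure used to generate the table. It assumes the offsets that the tabulated values themselves show (linear minus cyclic = 18.00 Da, reduced minus oxidised = 6.00 Da; the exact monoisotopic values for H2O and 6 H are about 18.011 and 6.047 Da) and standard monoisotopic residue masses for the CTPP. The function name, the dictionary keys, and the reading of 'GLP' as the CTPP of L5GLP are assumptions introduced here.

```python
# Derive the linear, reduced and precursor masses in Table A.1 from the
# cyclic, oxidised [M+H]+ value. Offsets are taken from the differences seen
# in the table itself (an assumption), not from exact monoisotopic values.
WATER_OFFSET = 18.00      # linear minus cyclic, as tabulated
REDUCTION_OFFSET = 6.00   # reduced minus oxidised (three disulfides), as tabulated

# Standard monoisotopic residue masses (Da) for the CTPP residues used here.
RESIDUE_MASS = {"A": 71.03711, "L": 113.08406, "E": 129.04259,
                "G": 57.02146, "P": 97.05276}


def derive_masses(cyclic_oxidised: float, ctpp: str) -> dict:
    """Return the six Table A.1 masses for one peptide."""
    ctpp_mass = sum(RESIDUE_MASS[residue] for residue in ctpp)
    linear_oxidised = cyclic_oxidised + WATER_OFFSET
    precursor_oxidised = linear_oxidised + ctpp_mass
    return {
        "cyclic_oxidised": cyclic_oxidised,
        "cyclic_reduced": cyclic_oxidised + REDUCTION_OFFSET,
        "linear_oxidised": linear_oxidised,
        "linear_reduced": linear_oxidised + REDUCTION_OFFSET,
        "precursor_oxidised": precursor_oxidised,
        "precursor_reduced": precursor_oxidised + REDUCTION_OFFSET,
    }


if __name__ == "__main__":
    for name, cyclic, ctpp in (("L5ALEG", 3451.56, "ALEG"),
                               ("L5GLP", 3451.56, "GLP")):
        masses = derive_masses(cyclic, ctpp)
        print(name, {key: round(value, 2) for key, value in masses.items()})
```

Under these assumptions the derived values agree with the tabulated ones to within about 0.01 Da, the residual reflecting rounding in the table.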

Figure A.1: Comparison of the Hα chemical shifts of [G1,G33]L5_Abl06 with those of the native scaffold MCoTI-II. A sequence alignment of the two peptides is shown below the panel, with the symbol used for each peptide shown in front of its sequence. Yellow stripes mark the positions of the six cysteines (I to VI), and the segments between the cysteines are numbered loop 1 to 6. Residues of the grafted peptide that differ from MCoTI-II are highlighted in pink. The original cyclisation point in loop 6 of MCoTI-II is indicated by a solid triangle and the new cyclisation point in loop 5 of [G1,G33]L5_Abl06 by an empty triangle.

Figure A.2: The TOCSY and NOESY spectra of [G1,G33]L5_Abl06. The spin systems of all residues (except for Pro-2, Pro-15 and Pro-36) are shown in the TOCSY spectrum (top panel) and the sequential connectivities in the NOESY spectrum (bottom panel).

Figure A.3: Cyclisation of [G1,G33]L5_Abl06 in the presence of reducing agent. MALDI-MS traces showing peptide masses in a standard sodium acetate buffer (left panels) or in the same buffer with 1 mM TCEP (reducing buffer, right panels). Observed peptide m/z are labelled at 24 h without AEP (top panels) and with addition of 200 nM OaAEP1b (bottom panels). Expected monoisotopic masses (Da; [M+H]+) are: 4484.12 (precursor, oxidised), 4490.12 (precursor, reduced), 4113.94 (linear, oxidised), 4119.94 (linear, reduced), 4095.94 (cyclic, oxidised), and 4101.94 (cyclic, reduced).

Figure A.4: LC-MS traces for peptides used in Chapter 3. The LC (top row of panels) and MS (bottom row of panels) are shown for each peptide. LC-MS data were not available for [G33]L5ALEG; readers are referred to Figure A.5 for 1D NMR spectra of this peptide as an indicator of peptide quality.

Figure A.5: 1D NMR for peptides used in cyclisation assays in Chapter 3. Each peptide shows well-dispersed peaks in the amide region of the spectra.

Figure A.6: The percentage of Abl kinase inhibition by MTAbl06 (36.8%) and [G1,G33]L5_Abl06 (43.7%) upon 30 min incubation. The replicates (n = 2) were done independently and the error bars represent SD.

Figure A.7: Oxidative folding of cyclic [G1,G33]L5_Abl06. a. HPLC showing the peptide after oxidative folding. b. Extracted mass of [G1,G33]L5_Abl06 after oxidative folding. The peak that corresponds to the correctly folded product (circled) accounts for 5.6% of the total product in the sample.

Figure A.8: Kimberley trip May 2018: Original collection sheet

Table A.2: Kimberley trip: Targeted taxa

Family           Genus            Species            Sub-species    Plant parts collected     Comment
Aizoaceae        Trianthema       spp.               N/A            Seed, leaves, and shoots  Screen for small cyclic peptides
Apocynaceae      Alstonia         actinophylla       N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Alstonia         linearis           N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Alstonia         spectabilis        N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Alstonia         spectabilis        ophioxyloides  Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Brachystelma     glabriflorum       N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Calotropis       gigantea           N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Calotropis       procera            N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Carissa          spinarum           N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Cascabela        thevetia           N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Cryptostegia     grandiflora        N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Cryptostegia     madagascariensis   N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides

Apocynaceae      Cryptostegia     spp.               N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Cynanchum        spp.               N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Cynanchum        carnosum           N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Cynanchum        christineae        N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Cynanchum        floribundum        N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Cynanchum        pedunculatum       N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Cynanchum        puberulum          N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Gymnanthera      oblonga            N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Marsdenia        geminata           N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Marsdenia        spp.               N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Hoya             australis          rupicola       Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Ichnocarpus      frutescens         N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Marsdenia        angustata          N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Tylophora        cinerascens        N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Marsdenia        glandulifera       N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Marsdenia        hemiptera          N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Parsonsia        velutina           N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Marsdenia        pleiadenia         N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Marsdenia        velutina           N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Marsdenia        viridiflora        N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Marsdenia        viridiflora        tropica        Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Sarcostemma      brevipedicellatum  N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Sarcostemma      esculentum         N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Sarcostemma      viminale           N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Sarcostemma      viminale           australe       Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Sarcostemma      viminale           brunonianum    Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Secamone         elliptica          N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Secamone         spp.               N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Tabernaemontana  orientalis         N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Tylophora        flexuosa           N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Tylophora        spp.               N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Wrightia         pubescens          N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Wrightia         saligna            N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Apocynaceae      Wrightia         spp.               N/A            Seed, leaves, and shoots  Possible cyclic peptide plants and inhibitor peptides
Byblidaceae      Byblis           filifolia          N/A            Seed, leaves, and shoots  Seek for proteases
Byblidaceae      Byblis           liniflora          N/A            Seed, leaves, and shoots  Seek for proteases
Byblidaceae      Byblis           rorida             N/A            Seed, leaves, and shoots  Seek for proteases
Caryophyllaceae  Polycarpaea      corymbosa          N/A            Seed, leaves, and shoots  Possibly contains orbitides
Caryophyllaceae  Polycarpaea      holtzei            N/A            Seed, leaves, and shoots  Possibly contains orbitides
Caryophyllaceae  Polycarpaea      longiflora         N/A            Seed, leaves, and shoots  Possibly contains orbitides
Caryophyllaceae  Polycarpaea      violacea           N/A            Seed, leaves, and shoots  Possibly contains orbitides
Cucurbitaceae    Citrullus        colocynthis        N/A            Seed, leaves, and shoots  Screen for cyclotides and enzyme sequences
Cucurbitaceae    Citrullus        lanatus            N/A            Seed, leaves, and shoots  Screen for cyclotides and enzyme sequences
Cucurbitaceae    Citrullus        spp.               N/A            Seed, leaves, and shoots  Screen for cyclotides and enzyme sequences
Cucurbitaceae    Coccinia         grandis            N/A            Seed, leaves, and shoots  Screen for cyclotides and enzyme sequences
Cucurbitaceae    Cucumis          althaeoides        N/A            Seed, leaves, and shoots  Screen for cyclotides and enzyme sequences
Cucurbitaceae    Cucumis          spp.               N/A            Seed, leaves, and shoots  Screen for cyclotides and enzyme sequences

Cucurbitaceae | Cucumis anguria | N/A | Seed, leaves, and shoots | Screen for cyclotides and enzyme sequences
Cucurbitaceae | Cucumis argenteus | N/A | Seed, leaves, and shoots | Screen for cyclotides and enzyme sequences
Cucurbitaceae | Cucumis maderaspatanus | N/A | Seed, leaves, and shoots | Screen for cyclotides and enzyme sequences
Cucurbitaceae | Cucumis melo | N/A | Seed, leaves, and shoots | Screen for cyclotides and enzyme sequences
Cucurbitaceae | Cucumis picrocarpus | N/A | Seed, leaves, and shoots | Screen for cyclotides and enzyme sequences
Cucurbitaceae | Cucumis umbellatus | N/A | Seed, leaves, and shoots | Screen for cyclotides and enzyme sequences
Cucurbitaceae | Diplocyclos palmatus | N/A | Seed, leaves, and shoots | Screen for cyclotides and enzyme sequences
Cucurbitaceae | Diplocyclos palmatus | affinis | Seed, leaves, and shoots | Screen for cyclotides and enzyme sequences
Cucurbitaceae | Luffa aegyptiaca | N/A | Seed, leaves, and shoots | Screen for cyclotides and enzyme sequences
Cucurbitaceae | Luffa graveolens | N/A | Seed, leaves, and shoots | Screen for cyclotides and enzyme sequences
Cucurbitaceae | Luffa saccata | N/A | Seed, leaves, and shoots | Screen for cyclotides and enzyme sequences
Cucurbitaceae | Momordica spp. | N/A | Seed, leaves, and shoots | Screen for cyclotides and enzyme sequences
Cucurbitaceae | Momordica balsamina | N/A | Seed, leaves, and shoots | Screen for cyclotides and enzyme sequences
Cucurbitaceae | Trichosanthes cucumerina | N/A | Seed, leaves, and shoots | Screen for cyclotides and enzyme sequences
Droseraceae | Drosera spp. | N/A | Seed and leaf | Seek for proteases
Droseraceae | Drosera banksii | N/A | Seed and leaf | Seek for proteases
Droseraceae | Drosera burmanni | N/A | Seed and leaf | Seek for proteases
Droseraceae | Drosera caduca | N/A | Seed and leaf | Seek for proteases
Droseraceae | Drosera derbyensis | N/A | Seed and leaf | Seek for proteases
Droseraceae | Drosera dilatato-petiolaris | N/A | Seed and leaf | Seek for proteases
Droseraceae | Drosera finlaysoniana | N/A | Seed and leaf | Seek for proteases
Droseraceae | Drosera hartmeyerorum | N/A | Seed and leaf | Seek for proteases
Droseraceae | Drosera indica | N/A | Seed and leaf | Seek for proteases
Droseraceae | Drosera kenneallyi | N/A | Seed and leaf | Seek for proteases
Droseraceae | Drosera ordensis | N/A | Seed and leaf | Seek for proteases
Droseraceae | Drosera paradoxa | N/A | Seed and leaf | Seek for proteases
Droseraceae | Drosera petiolaris | N/A | Seed and leaf | Seek for proteases
Droseraceae | Drosera serpens | N/A | Seed and leaf | Seek for proteases
Droseraceae | Drosera subtilis | N/A | Seed and leaf | Seek for proteases
Fabaceae | Clitoria australis | N/A | Seed, leaves, and shoots | Screen for cyclotides and enzyme sequences
Fabaceae | Clitoria ternatea | N/A | Seed, leaves, and shoots | Profile genetic diversity in cyclotides
Fabaceae | Swainsona campylantha | N/A | Seed, leaves, and shoots | Screen for cyclotides and enzyme sequences
Gentianaceae | Canscora diffusa | N/A | Seed, leaves, and shoots | Screen for cyclotides and enzyme sequences
Lentibulariaceae | Utricularia spp. | N/A | Seed and shoots (<0.5g) | Seek for proteases
Lentibulariaceae | Utricularia antennifera | N/A | Seed and shoots (<0.5g) | Seek for proteases
Lentibulariaceae | Utricularia caerulea | N/A | Seed and shoots (<0.5g) | Seek for proteases
Lentibulariaceae | Utricularia chrysantha | N/A | Seed and shoots (<0.5g) | Seek for proteases
Lentibulariaceae | Utricularia georgei | N/A | Seed and shoots (<0.5g) | Seek for proteases
Lentibulariaceae | Utricularia kenneallyi | N/A | Seed and shoots (<0.5g) | Seek for proteases
Lentibulariaceae | Utricularia kimberleyensis | N/A | Seed and shoots (<0.5g) | Seek for proteases
Lentibulariaceae | Utricularia lasiocaulis | N/A | Seed and shoots (<0.5g) | Seek for proteases
Lentibulariaceae | Utricularia singeriana | N/A | Seed and shoots (<0.5g) | Seek for proteases
Lentibulariaceae | Utricularia stellaris | N/A | Seed and shoots (<0.5g) | Seek for proteases
Lentibulariaceae | Utricularia tridactyla | N/A | Seed and shoots (<0.5g) | Seek for proteases
Lentibulariaceae | Utricularia uliginosa | N/A | Seed and shoots (<0.5g) | Seek for proteases
Rubiaceae | Aidia racemosa | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Dentella spp. | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Dentella asperata | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Dentella dioeca | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Dentella minutissima | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Dentella misera | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Dentella pulvinata | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Dentella repens | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Knoxia stricta | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Nauclea orientalis | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Oldenlandia spp. | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Oldenlandia argillacea | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Oldenlandia corymbosa | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Oldenlandia crouchiana | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Oldenlandia delicata | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Oldenlandia galioides | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Oldenlandia kochiae | N/A | Seed, leaves, and shoots | Screen for cyclotides

Rubiaceae | Oldenlandia mitrasacmoides | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Oldenlandia pterospora | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Oldenlandia spermacocoides | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Spermacoce argillacea | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Spermacoce auriculata | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Spermacoce brachystema | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Spermacoce breviflora | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Spermacoce constricta | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Spermacoce dolichosperma | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Spermacoce gibba | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Spermacoce hillii | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Spermacoce laevigata | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Spermacoce leptoloba | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Spermacoce occidentalis | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Spermacoce phaeosperma | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Spermacoce stenophylla | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Synaptantha scleranthoides | N/A | Seed, leaves, and shoots | Screen for cyclotides
Rubiaceae | Timonius timon | N/A | Seed, leaves, and shoots | Screen for cyclotides
Solanaceae | Nicotiana benthamiana | N/A | Seed | Observe seed size and collect large seeded varieties
Violaceae | Hybanthus aurantiacus | N/A | Seed, leaves, and shoots | Profile cyclotides and enzymes
Violaceae | Hybanthus enneaspermus | N/A | Seed, leaves, and shoots | Profile cyclotides and enzymes

Figure A.9: Kimberley trip: Meal plan and shopping list
