Deep time-resolved proteomic and phosphoproteomic

profiling of cigarette smoke-induced chronic

obstructive pulmonary disease

David Skerrett-Byrne BSc Biochem & Mol Bio (Hons)(UCD) MSc Biotech (UU)

25th March 2019

Supervisors: Professor Phil Hansbro, Dr. Matt Dun, Laureate Professor Rodney Scott, Professor Peter Wark, Professor Darryl Knight, Laureate Professor Paul Foster

Discipline of Immunology and Microbiology and Priority Research Centre for Healthy Lungs

School of Biomedical Science and Pharmacy

Faculty of Health and Medicine

University of Newcastle and Hunter Medical Research Institute

Newcastle, NSW, Australia

Submitted in fulfilment of the requirements for the award of a Doctor of Philosophy

Declaration

Statement of Originality

I hereby certify that the work embodied in the thesis is my own work, conducted under normal

supervision. The thesis contains no material which has been accepted, or is being examined,

for the award of any other degree or diploma in any university or other tertiary institution and,

to the best of my knowledge and belief, contains no material previously published or written

by another person, except where due reference has been made. I give consent to the final

version of my thesis being made available worldwide when deposited in the University’s

Digital Repository, subject to the provisions of the Copyright Act 1968 and any approved

embargo.

______David Skerrett-Byrne 25/03/2019

Acknowledgment of Authorship

I hereby certify that the work embodied in this thesis contains scholarly work of which I am

a joint author. I have included as part of the thesis a written declaration endorsed in writing by

my supervisor, attesting to my contribution to the joint scholarly work.

By signing below I confirm that David Skerrett-Byrne contributed heavily (90%) to the experimental design, collection and analysis of data, figures generated and the writing of each chapter. Chapter 1 is currently under review with The European Respiratory Journal and

Chapters 2, 3, 4 are all in preparation for submission.

______Professor Philip Hansbro 25/03/2019

Acknowledgements

“During the course of any proteomics investigation, you will see interesting results, but you will rarely observe significant ones. To generate interesting and significant results, using a robust hypothesis, you must compare fresh apples.” Dun, M.D. 2016

Firstly, my sincerest thanks to my supervisor Prof. Phil Hansbro for taking me on as a PhD

student and giving me the opportunity to explore and develop my research interests in

Australia. I am forever grateful for that. Thank you for your support and patience with our

fruitful proteomic inventory. My thanks to co-supervisors Prof. Peter Wark and Laureate

Prof. Rodney Scott for their knowledge and experimental support during my studies. Thank

you to all co-supervisors, and all past and present Hansbro and Horvat lab members who have

been supportive of me during my thesis. Thank you to Dr. Nicole Hansbro for always being

there to answer my somewhat frantic questions and ensuring my PhD runs seamlessly.

To Matt I owe more than words can express. You have been an incredible mentor to me and

you really have shaped the researcher I have become. You have always known when to challenge me and when to provide me with that wonderful unwavering enthusiasm of yours. Even more than that you have been my friend and you will always hold a special place in my heart. I’m excited for what is ahead and delighted we will still be working together.

To my friends and colleagues, too many to mention, your friendship has meant the world to me.

You have all made me feel so welcome in Australia, created some of my fondest memories, and

supported me in many different ways. You are all great craic! Thank you to Abdul, Alex, Amanda,

Andrew, Anne Chevalier, Anne Gannon, Brett Hill, Brett Nixon, Cal, Cheryl, David, Diane, Hubert,

Jacinta, Jean-Marie, Jon, Mike, Nathan, Nikita, Nikki, Pedro, Phil Dickson, Sam, Severine, Simon,

Tabitha, Yanfang, Zac and everyone who has been part of this journey. Heather, you have been the

best PhD buddy! Thank you for all our conversations, I have learnt so much from you. You are truly an

amazing friend and one of the finest researchers I have ever met. Cal, I miss you around the lab, buddy. You

have always had a way of brightening my day and I leave every interaction with such joy. Richie, thank you for all the laughter, the craic, and being a true friend, mate. To Lizzie, I’m taken aback by your level of support, particularly in these final months of my PhD. I really feel privileged to know you and am a big admirer of your research. Thank you for your mentorship and more importantly your friendship.

To Daniel, Steph, Lorcan and Fergie, I can never express how supportive you have all been to me. You are forever my family and I love you all so very much. You welcomed me into your lives without hesitation and allowed me to share in Lorcan growing up, which has been one of the greatest pleasures of my life. I owe so much of this thesis and happiness to you all.

To my friends back in Ireland, Daryl, Maria, Zainab, Dave, Mikey, Maykla, Sweeney, Pete, and

Katie, thank you all endlessly for your friendship, laughs, support and always thinking of me.

To my family, Paddy, Sharon, Roxy and Mam, thank you for supporting me during this PhD and being so far from home. Mam, I’m sure I don’t say enough but I am endlessly appreciative for everything you have done for me in my life, everything I am, have become and achieved is because of you. Thanks for all the Christmas, birthday and St. Patrick’s Day packages, every one of them has brightened my time here. You have been the biggest influence on my life, I love you and I’m incredibly proud to be your son.

Lastly and by no means least, thank you to my supportive and beautiful partner Anna. Du bist mein Ein und Alles! While our time apart has been very difficult, I’m so proud of everything we both have achieved and the strength of our love. I know we can achieve anything together and cannot wait to start our lives together and explore all that it will bring. Ich liebe dich für immer und ewig

This thesis is dedicated to the two most important people in my life, my Mam und meine

Liebste Anna.

This thesis was written and researched on Awabakal Lands. Wherever we walk in Australia, we walk on Aboriginal land.

Table of Contents

Declaration

Acknowledgements

Synopsis

Publications, conferences and awards arising from this thesis

Abbreviations

CHAPTER 1: Literature review. Use of advanced proteomics and phosphoproteomics to uncover the aetiology of chronic obstructive pulmonary disease

CHAPTER 2: Deep time-resolved proteomic profiling of cigarette smoke-induced chronic obstructive pulmonary disease

CHAPTER 3: Deep time-resolved phosphoproteomic profiling of cigarette smoke-induced chronic obstructive pulmonary disease

CHAPTER 4: Proteomic profiling of OCT-embedded COPD endobronchial biopsies

CHAPTER 5: General Discussion & Future Research Directions

APPENDIX: Publications

Synopsis

Proteomics has become a mature scientific discipline across many fields and a critical

means to understand the dynamic nature of disease states. Strikingly, there are now more than

83,995 citations featuring ‘proteomics’, many of which have arisen in the past 5-10 years.

However, in stark contrast to most fields of research, in chronic obstructive pulmonary disease

(COPD), proteomics remains an under-utilised research tool with only 202 citations featuring

‘COPD proteomics’ recorded to date (<0.25% of all papers). This is in spite of the severity of the disease with COPD listed as the third leading cause of death worldwide.

In this thesis, we have sought to address this fundamental knowledge gap by undertaking a highly comprehensive, comparative and quantitative time-resolved analysis of both the proteome and phosphoproteome of lung tissue using our well-characterised mouse model of cigarette smoke-induced COPD. Moreover, we have developed critical platforms for the analysis of endobronchial biopsies from patients through the analysis of samples from clinical cohorts encompassing healthy controls, healthy smokers, mild COPD and severe COPD patients.

Key highlights from this body of work include the quantification of 7,324 proteins and

27,857 unique phosphopeptides across a 12-week time course of COPD progression in our

mouse model. Strikingly, we have identified a critical window of protein dysregulation that

occurs at the 8-week time point corresponding to the progression phase of the disease.

Importantly, novel proteins implicated in this phase included heterogeneous nuclear

ribonucleoproteins C1/C2 (HNRNPC) and RNA-binding protein Musashi homolog 2 (MSI2),

two key proteins involved in RNA synthesis and binding, and protein S100-A1 (S100A1), a

calcium signalling protein with the propensity to signal through Toll-like receptor 4 (TLR4) and

modulate downstream inflammatory response pathways. In characterising alterations in

phosphorylation associated with the induction and progression stages of COPD, our data have

revealed 139 kinases responsible for the phosphorylation of serine, threonine and tyrosine residues. Moreover, from these data we have identified 14 druggable kinases that may be

targeted to reduce the pathogenesis of COPD.

Finally, through establishing a human COPD proteome we have developed a thorough

inventory of dysregulated proteins that serve to characterise the progression of cigarette

smoke-induced COPD in humans. Within this inventory we have highlighted activating transcription factor 6 (ATF6), X-box binding protein 1 (XBP1) and arachidonate 15-lipoxygenase

(ALOX15) as potential contributors to oxidative stress, the unfolded protein response and advanced ageing-like phenotypic changes that are common to lung tissue following smoke exposure. Taken together, these data provide a large and valuable resource for molecular biologists working in the field of COPD. In this way the data presented in this thesis will aid in biomarker discovery, direct the delineation of COPD classifications and inform therapeutic strategies towards better patient care for individuals experiencing the weight of this disease.

Publications, conferences and awards arising from this thesis

Publications arising from this thesis

Chapter 1:

1. Skerrett-Byrne, D. A., Dun, M. D., Hansbro, P.M. (2019). Recent advancements in proteomics & phosphoproteomics are the key to uncovering the aetiology of chronic obstructive pulmonary disease. Resubmitted 25/03/2019 | European Respiratory Journal

Chapter 2:

2. Skerrett-Byrne, D. A., Murray, H.C., Jamaluddin, M.F.B., Nixon, B., Bromfield, E.G., Wark, P.A.B., Scott, R.J., Dun, M. D., Hansbro, P.M. Deep time-resolved proteomic profiling of cigarette smoke-induced chronic obstructive pulmonary disease. In preparation for submission | The Journal of Allergy and Clinical Immunology

Chapter 3:

3. Skerrett-Byrne, D. A., Murray, H.C., Scott, R.J., Bromfield, E.G., Dun, M. D., Hansbro, P.M. Deep time-resolved phosphoproteomic profiling maps major dysregulated signalling pathways in cigarette smoke-induced chronic obstructive pulmonary disease.

In preparation for submission | The Journal of Allergy and Clinical Immunology

Chapter 4:

4. Skerrett-Byrne, D. A., Murray, H.C., Jamaluddin, M.F.B., Nixon, B., Bromfield, E.G., Wark, P.A.B., Scott, R.J., Dun, M. D., Hansbro, P.M. Proteomic profiling of OCT-embedded COPD endobronchial biopsies.

In preparation for submission | The Journal of Allergy and Clinical Immunology

Additional publications

2019:

1. Dun, M.D., Rigby, C.J., Toop, H.D., Butler, S., Sillar, J., Duchatel, R.J., Germon, Z., Faulkner, S., Chi, M., Mannan, A., Skerrett-Byrne, D.A., Murray, H.C., Kahl, R.G.S., Flanagan, H., Almazi, J.G., Nixon, B., De Iuliis, G., de Bock, C.E., Alvaro, F., Morris, J.C., Enjeti, A.K., Verrills, N.M.

Under review | Leukemia

2018:

1. Nixon, B., Johnston, S.D., De Iuliis G.N., Hart, H.M., Mathe, A., Bernstein, I., Anderson, A.L., Stanger, S.J., Skerrett-Byrne, D.A., Jamaluddin, M.F.B., Almazi, J.G., Bromfield, E., Larsen, M.R., & Dun, M.D. Proteomic profiling of mouse epididymosomes reveals their contributions to post-testicular sperm maturation. Molecular & Cellular Proteomics. 2018 Sep 13. doi: 10.1074/mcp.RA118.000946.

2. Nixon, B., Johnston, S.D., Skerrett-Byrne, D.A., Anderson, A.L., Stanger, S.J., Bromfield, E., Martin, J., Hansbro, P.M., & Dun, M.D. Proteomic profiling of Australian saltwater crocodile (Crocodylus porosus) spermatozoa refutes the tenet that post-testicular maturation is restricted to mammals. Molecular & Cellular Proteomics. 2018 Aug 2. doi: 10.1074/mcp.RA118.000904.

3. Jamaluddin, M.F.B., Ko, Y.A., Kumar, M., Brown, Y., Bajwa, P., Nagendra, P.B., Skerrett-Byrne, D. A., Hondermarck, H., Baker, M.A., Dun, M.D., Scott, R.J., Nahar, P. & Tanwar, P.S. Proteomic profiling of human uterine fibroids reveals upregulation of the extracellular matrix protein periostin. Endocrinology. 2018 Feb 1. doi: 10.1210/en.2017-03018.

2017:

1. Degryse, S., de Bock, C.E., Demeyer, S., Govaerts, I., Bornschein, S., Verbeke, D., Jacobs, K., Binos, S., Skerrett-Byrne, D.A., Murray, H.C., Verrills, N.M., Van Vlierberghe, P., Cools, J. and Dun, M.D. Mutant JAK3 phosphoproteomic profiling predicts synergism between JAK3 inhibitors and MEK/BCL2 inhibitors for the treatment of T-cell acute lymphoblastic leukemia. Leukemia. 2017 Aug 30. doi: 10.1038/leu.2017.276.

National and international conference presentations/contributions:

2017:

1. Skerrett-Byrne, D.A., Jones, B., Scott, R., Dun, M., Hansbro, P.M. Quantitative phosphoproteomic profiling to identify new drug targets to treat chronic obstructive pulmonary disease. Australian Society for Medical Research (ASMR) National Conference 2017 – Sydney, NSW, Australia.

Oral Presentation

2. Skerrett-Byrne, D.A., Jones, B., Scott, R., Dun, M.D., Hansbro, P.M. Quantitative phosphoproteomic profiling to identify new drug targets to treat chronic obstructive pulmonary disease. The 16th Human Proteome Organisation World Congress 2017 – Dublin Convention Centre, Dublin 1, Ireland.

Poster Presentation

3. Skerrett-Byrne, D.A., Murray, H.C., Al Mazi, J.G., Li, X., Chen, Y., Hansbro, P.M., Verrills, N.M., Dun, M.D. Multiplex analysis of signalling pathway activity using a novel high-resolution targeted proteomics approach - Phospho Parallel Reaction Monitoring (PPRM). ASMR Newcastle Satellite Scientific Meeting 2017 - Hunter Medical Research Institute, Newcastle, NSW, Australia

Oral Presentation

2016:

1. Skerrett-Byrne, D.A., Jones, B., Scott, R., Dun, M., Hansbro, P.M. Real time assessment of altered signalling pathways in the pathobiology of Chronic Obstructive Pulmonary Disease. ASMR National Conference 2016 – Gold Coast, Queensland, Australia.

Oral Presentation

2. Skerrett-Byrne, D.A., Jones, B., Scott, R., Dun, M.D., Hansbro, P.M. Comparative and Quantitative Proteomics for the Discovery of Novel Drug Targets to Treat Chronic Obstructive Pulmonary Disease. ASMR Newcastle Satellite Scientific Meeting 2016 - Hunter Medical Research Institute, Newcastle, NSW, Australia.

Oral Presentation

3. Skerrett-Byrne, D.A., Jones, B., Scott, R., Dun, M., Hansbro, P.M. Uncovering the Aetiology of Chronic Obstructive Pulmonary Disease; a Comparative Proteomic Approach. Chemical Proteomics Symposium 2015 – Children’s Medical Research Institute, Sydney, NSW, Australia.

Poster Presentation

4. Skerrett-Byrne, D.A., Jones, B., Scott, R., Dun, M., Hansbro, P.M. Uncovering the Aetiology of Chronic Obstructive Pulmonary Disease; a Comparative Proteomic Approach. ASMR Newcastle Satellite Scientific Meeting 2015 - Hunter Medical Research Institute, Newcastle, NSW, Australia

Poster Presentation

Invited Seminars:

2017:

• Technische Universität München, Munich, Germany | 2017 Seminar title: ‘Quantitative phosphoproteomic profiling to identify new drug targets to treat chronic obstructive pulmonary disease’

• Syddansk Universitet, Odense, Denmark | 2017 Seminar title: ‘Quantitative phosphoproteomic profiling to identify new drug targets to treat chronic obstructive pulmonary disease’

Awards:

1. Jennie Thomas Medical Research Travel Grant (2017) | HMRI

Through this grant I attended the 16th Human Proteome Organisation World Congress in Dublin, Ireland. As part of this trip I visited world-renowned proteomics research groups in Europe, including Prof. Martin Larsen (University of Southern Denmark), Prof. Bernhard Küster (Technische Universität München), Prof. Matthias Mann (Max Planck Institute of Biochemistry), Prof. Ruedi Aebersold, Prof. Bernd Wollscheid & Dr. Ben Collins (ETH Zürich).

2. Hunter Cancer Research Alliance student sponsorship (2017) | HCRA

To attend the 2017 ASMR Newcastle Satellite Scientific Meeting.

3. Australian Society for Medical Research Travel Grant (2016) | ASMR

To attend the 2016 ASMR National Conference in the Gold Coast, Australia.

4. University of Newcastle Postgraduate Research Scholarship (2014-2018) | UON

5. University of Newcastle International Postgraduate Research Scholarship (2014-2018) | UON

Abbreviations

1DE: One-dimensional gel electrophoresis

2DE: Two-dimensional gel electrophoresis

BALF: Bronchoalveolar lavage fluid

COPD: Chronic obstructive pulmonary disease

CS: Cigarette smoke

DAMPs: Damage-associated molecular patterns

DDA: Data-dependent acquisition

DES: Differentially expressed spots

EBC: Exhaled breath condensate

ELF: Epithelial lining fluid

GOLD: Global Initiative for Chronic Obstructive Lung Disease

H&E: Hematoxylin and eosin

HILIC: Hydrophilic interaction liquid chromatography

HNRNPC: Heterogeneous nuclear ribonucleoproteins C1/C2

iTRAQ: Isobaric tags for relative and absolute quantification

LC: Liquid chromatography

MALDI: Matrix-assisted laser desorption/ionisation

MS: Mass spectrometry

MSI2: RNA-binding protein Musashi homolog 2

nLC-MS/MS: Nano liquid chromatography tandem mass spectrometry

OCT: Optimal cutting temperature compound

PRM: Parallel reaction monitoring

PTM: Post-translational modification

S100A1: Protein S100-A1

SIMAC: Sequential elution from immobilised metal affinity chromatography

TiSH: TiO2-SIMAC-HILIC

WT: Wild type

Chapter 1: Literature Review

Use of advanced proteomics and phosphoproteomics to uncover the aetiology of chronic obstructive pulmonary disease

Under review following resubmission: European Respiratory Journal

Authors: David A. Skerrett-Byrne1,2, Matthew D. Dun1,3# and Philip M. Hansbro1,2,4#

Affiliations: 1 School of Biomedical Sciences and Pharmacy, University of Newcastle, Callaghan, NSW, Australia, 2 Hunter Medical Research Institute, VIVA Program, Newcastle, NSW, Australia, 3 Hunter Medical Research Institute, Cancer Research Program, Newcastle, NSW, Australia, 4 Centre of Inflammation, Centenary Institute, and University of Sydney, Sydney, NSW, Australia; # Authors contributed equally

Chapter 1: Overview

This review was constructed to serve as a toolkit or template for future proteomic and phosphoproteomic studies in the field of chronic obstructive pulmonary disease (COPD). To that end we begin by providing the reader with the principles of high-resolution quantitative mass spectrometry (MS)-based proteomics. We define important technical terms with descriptions of ionisation methods, mass analysers, tandem mass spectrometry

(MS/MS), coupled with detailed figures. We highlight several advances in sample preparation, with a focus on phosphoproteomics due to the paucity of studies focusing on the pathogenesis of COPD.

Following this we provide an overview of how proteomics has been utilised to date to investigate the aetiology of COPD, helping to contextualise what has been discussed thus far.

We review an array of clinically relevant biospecimens, their potential unique insight, possible challenges to overcome and relevant studies with a strong MS-based approach. This leads to our final section “Strategies to elucidate complex pathogenesis” where we provide two excellent templates to help advance proteomic studies in the field.

Unlike other fields, such as cancer biology, phosphoproteomics has seldom been utilised to understand the pathogenesis of COPD despite diverse roles for phosphorylation in cell differentiation, migration, proliferation, and death. Our template-based approach in this review details a complex study which characterises the dysregulated pathways driven by phosphorylation in a mutant specific cancer with phosphosite-specific quantification.

Additional phosphoproteomic profiles were generated to establish how these signalling pathways are altered upon combinatorial treatment regimens, providing an incentive to further optimise current treatment regimens. In addition, we discuss the use of these approaches

to identify druggable signalling pathways, through extensive network, pathway and kinome profiling analysis.

The uptake of these technologies coupled with time- and compartment-resolved proteomics and phosphoproteomics will reveal important new information, help tackle the complex and heterogeneous nature of COPD, and ultimately help to identify novel cellular targets for the development of new and effective treatment strategies. This chapter is currently under review following resubmission to the European Respiratory Journal as a review paper.

1.1 Abstract

Chronic obstructive pulmonary disease is currently the third leading cause of death worldwide, with its incidence continuing to rise. At best, current treatments only reduce symptoms, with no effective strategies to reverse the damage or halt disease progression.

Limited understanding of the molecular changes in the lung underpins the lack of development of effective treatments. This is primarily linked to the difficulties in obtaining clinically relevant samples at disease onset for high resolution analysis. Advances in genetic approaches have not yet improved treatment, and there is disjointed concordance between genomic and proteomic research. Consequently, it is timely to invest in recent high resolution advances made in quantitative protein sequencing technologies, in particular mass spectrometry (MS)-based proteomics, to elucidate the intricate signalling networks that drive the pathogenesis of COPD.

In this review, we provide a detailed discussion of the features of MS-based proteomics, and what approaches are best suited to address certain biological questions. We discuss the proteomic studies that have been undertaken on COPD biospecimens, their potential and limitations. Finally, we provide key studies as templates, highlighting how MS-based proteomics and phosphoproteomics can reveal novel insights into the dynamic regulation of changes associated with disease aetiology.

1.2 Introduction

Chronic obstructive pulmonary disease (COPD) is the third leading cause of death globally and

an ever-increasing socioeconomic burden worldwide 1, 2. It is an umbrella term for complex

heterogeneous respiratory disorders characterised by chronic pulmonary inflammation in

response to prolonged exposure to noxious stimuli that promotes progressive thickening and

narrowing of the airway with destruction of the lung parenchyma 3. Atrophy or enlargement of

the alveoli lead to rapid declines in lung function and severe breathing difficulties, reducing

quality of life, and causing death due to asphyxia, immune system dysfunction, sepsis and acute

lung infections 2, 4, 5. The severity of COPD is classified according to the Global Initiative for

Chronic Obstructive Lung Disease (GOLD) guidelines; from mild to severe, GOLD stage 1 to

4 2.

The leading cause of COPD is tobacco smoking, which is associated with >80% of all

diagnoses 2, 6. Despite this overwhelming evidence, smoking rates continue to rise in

developing countries 7, 8, and the prevalence of COPD is projected to increase in these areas as

a result of the aging population 7. Furthermore, air pollution, environmental smoke exposure

and genetic factors, are also known to cause COPD in non-smoking patients, which are

increasing in developing countries (e.g. China, India) 7, 8. These issues highlight the need to

develop improved treatments and management strategies.

Currently, treatment for COPD is limited to long-acting muscarinic antagonists and

glucocorticoids, which only provide symptomatic relief and fail to halt the progression of the

disease 9-11. Most emerging pharmaceutical therapies aim to target proinflammatory cytokines

and chemokines that are known to be associated with inflammation and COPD presentation 12.

Early phase II clinical trials using the cytokine receptor CXCR2 antagonist, MK-7123, which

aims to reduce neutrophil chemotaxis, showed significant improvements in forced expiratory volume in one second (FEV1) in patients with moderate to severe COPD 13. The oxidative microenvironment of lung tissues is significantly altered during smoking and in COPD, and attempts have been made to inhibit the excess production of reactive oxygen species and oxidative stress, known contributors to airspace epithelial injury 14. Modulation of mucus hypersecretion has also been trialled 15. However, there are no effective treatments that halt the progression of COPD, which is due to the lack of understanding of the molecular drivers of pathogenesis.

Genomic and transcriptomic studies are most commonly used to progress our understanding of

COPD pathogenesis. α1-antitrypsin deficiency (AATD) is firmly established as a genetic risk factor for COPD; however, it is present in only 1-2% of all cases 16. Numerous genome-wide association studies have identified loci that may play potential roles in COPD susceptibility, such as hedgehog interacting protein (HHIP), nicotinic cholinergic receptors α-3 and 5, and family with sequence similarity 13, member A (FAM13A) 17, 18. While their mechanisms of involvement are not well understood, studies using heterozygous mice, Hhip (+/-), showed increased sensitivity to the development of age-related emphysema 19, 20, and Fam13a null mice were resistant to smoke-induced emphysema, potentially through the inhibition of β-catenin 20.

Transcriptomic studies have been performed with the aim of understanding the role of oxidative stress in COPD, which revealed genes and pathways responsible for changes in oxidant defences in airway epithelial cells 21. Although these sophisticated genetic and transcriptomic approaches have produced important information on the progression of COPD, they have not yet defined molecular targets that inform the development of new therapies or improve management strategies.

One final layer of complexity not yet addressed is that the human lung has been reported to consist of over forty different cell types, including, but not limited to, ciliated cells, Clara cells, oncocytes, smooth muscle, fibroblasts, plasma cells, stem cells and an array of immune cells 22. Most of this variety has been identified by electron and light microscopy, but recent advancements in single cell technology have enabled the identification of a novel cell type, known as an ionocyte, in the airway epithelium 23. This emphasises the fact that the true complexity of cell types in the lung is potentially still unknown. A recent multi-national collaboration has set about the establishment of The Human Lung Cell Atlas, utilising multi-omic single cell data to create a detailed spatial reference map of each cell type on to the lung tissue architecture in both healthy and diseased conditions 24-28. This atlas will be of unquestionable benefit to the respiratory field to progressively track proteins of interest.

Preliminary datasets are currently available until the full integration of The Human Lung Cell

Atlas has been performed. These include https://lungcellatlas.org 25 and https://theislab.github.io/LungAgingAtlas 29.

New studies focused on the proteins that underpin the dysregulation of lung homeostasis, and on the interplay between diseased cells, the supporting epithelium, the local microenvironment, the variety of cell types and immunity, are a promising approach. High-resolution proteomics is a powerful technique that can potentially elucidate the molecular regulators of COPD pathogenesis, and may facilitate the development of precision treatments, as well as improve diagnostic approaches. Here, we describe the basic principles needed to understand mass spectrometry (MS)-based proteomics, discussing the major techniques that underpin high-resolution quantitative proteomic approaches, and the sophisticated methodology involved, to help guide the reader on how best to utilise them. We review the available relevant biospecimens, their potential, recent studies and limitations. Finally, we highlight how MS-based proteomics

can lead to a revolutionary understanding of the dynamic changes of the pulmonary proteome

and characterisation of COPD-associated signalling pathways through both proteomic and

phosphoproteomic analysis.

1.3 MS-based proteomics

Since the Human Genome Project 30, 31 it has become apparent that to truly understand

the complexity of the genetic code there needs to be a multi-omic approach with a detailed

examination of the dynamic diversity of the proteome, which is the focus of this review. The

exponential complexity of the proteome is emphasised by the fact there are approximately

20,000 protein-coding genes in the human genome (Figure 1.1 A) which are transcribed into

~100,000 transcripts (Figure 1.1 B), and potentially translated into >1,000,000 proteoforms

(Figure 1.1 C). Ultimately the integration of multi-omic datasets is needed as each approach

has limitations. For a comprehensive overview of multi-omic approaches, their integration and

analysis, refer to Hasin et al. 32.
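The scale of this expansion can be made concrete with a few lines of arithmetic. The sketch below is illustrative only: it uses the approximate counts quoted above, and the `max_ptm_states` helper is a hypothetical simplification that assumes each modifiable site is independently either modified or unmodified.

```python
# Approximate, order-of-magnitude counts quoted in the text (not exact values).
GENES = 20_000            # protein-coding genes
TRANSCRIPTS = 100_000     # transcripts
PROTEOFORMS = 1_000_000   # lower bound on proteoforms


def fold_expansion(upstream: int, downstream: int) -> float:
    """Average number of downstream species per upstream species."""
    return downstream / upstream


def max_ptm_states(n_sites: int) -> int:
    """Upper bound on modification states for a protein with n_sites
    independent sites, each either modified or unmodified (assumption)."""
    return 2 ** n_sites


transcripts_per_gene = fold_expansion(GENES, TRANSCRIPTS)
proteoforms_per_gene = fold_expansion(GENES, PROTEOFORMS)
proteoforms_per_transcript = fold_expansion(TRANSCRIPTS, PROTEOFORMS)

print(f"Transcripts per gene (average): {transcripts_per_gene:.0f}")
print(f"Proteoforms per gene (average, lower bound): {proteoforms_per_gene:.0f}")
print(f"Proteoforms per transcript (average, lower bound): {proteoforms_per_transcript:.0f}")
print(f"States for 10 binary modification sites: {max_ptm_states(10)}")
```

Even under these simplified assumptions, a handful of modifiable sites on a single protein generates far more possible states than the per-gene averages suggest, which is why the proteoform count is best read as a lower bound.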

The proteome (Figure 1.1 C) refers to all the proteins expressed in a cell, tissue or any

biological system at any one time but its complexity grows exponentially due to over 1 million

proteoforms co-existing in the human body 33. Proteoform is an emerging term 34 that

encapsulates all possible protein products from a single gene, including genetic variations and

alternative splicing, but the predominant complexity in these proteoforms arises from >200

known post-translational modifications (the PTMome) (Figure 1.1D) 34, 35. Notably neither the

proteome nor the PTMome are static, but are dynamically regulated by a wide variety of

stimuli. Additional functional complexity arises from protein complexes and interactions

(Figure 1.1 E), and analytical complexity from subcellular localisation (Figure 1.1 F) and

tissue-specific abundance. Increasingly, proteomics is helping to distinguish causation from

association by largely dealing with functional proteins within a cell. This is aided by the application of novel tools such as activity-based protein profiling, which is a powerful approach to characterise the active function of enzymes 36, demonstrated by Li et al. in the identification of Aurora kinase B as a potential novel target in small cell lung cancer 37.

Proteomics lagged behind genetic studies for decades until a technological revolution in MS occurred over the past 20 years, referred to as next-generation proteomics. This has allowed for automated sequencing and characterisation of the proteome and PTMome to unprecedented depths with femtomole sensitivity in complex biological samples 38-40, with major advances made by two seminal publications attempting to map the human proteome 41, 42.

In this review we will first discuss the key components of bottom-up MS-based proteomic approaches, which comprise sample preparation, separation, MS, and data analysis 43, 44.

Figure 1.1 Proteome complexity. The human genome comprises approximately 20,000 protein-coding genes (A), which are transcribed to form the transcriptome, made up of >100,000 known transcripts (B). At the protein level, complexity grows exponentially to >1,000,000 proteoforms (C), encompassing >200 known post-translational modifications (D). This is additionally complicated by protein interactions and complexes (E) and subcellular localisation (F). Figure 1.1 was created using BioRender.

1.3.1 Sample preparation; enrichment and fractionation strategies

Unlike genomic and transcriptomic studies, proteomics is restricted by the amount of starting sample available, because proteins, unlike nucleic acids, cannot be amplified. In bottom-up MS proteomics, a protein mixture is commonly digested to peptides enzymatically using trypsin, which cleaves on the C-terminal side of arginine and lysine residues (Figure 1.2). If an unfractionated or unenriched sample is introduced to the MS, a large amount of information can be lost due to ion suppression or overshadowing by more abundant peptides with similar elution times 45. Several strategies (Table 1.1) can be implemented to resolve these issues, ensuring minimal sample loss, reduced sample complexity and in-depth sequencing (Figure 1.2D & E). A relatively popular fractionation technique is gel electrophoresis, encompassing one- and two-dimensional gel electrophoresis (1DE and 2DE). 1DE separates the proteins in a lysate based on their molecular weights (Figure 1.2D), whereas 2DE combines this with separation by the protein’s isoelectric point (Figure 1.2D), the pH at which a protein has no net charge 46. Recent work using time-resolved fluorescence has demonstrated that gel-based proteomics can now detect lowly abundant proteins at the sub-attomolar level 47, which can be very advantageous for scarce clinical biospecimens.
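
As a rough illustration of the digestion step, the following Python sketch cuts a protein sequence after every lysine (K) or arginine (R). It assumes the commonly used simplification that trypsin does not cleave when the next residue is proline, and it allows a configurable number of missed cleavages. The function name `tryptic_digest` and the example sequence are invented for illustration and are not taken from any of the cited tools.

```python
def tryptic_digest(sequence, missed_cleavages=0, min_length=1):
    """Return in silico tryptic peptides of `sequence`, ordered by start position."""
    sequence = sequence.upper()

    # Cleavage sites are the positions directly after K or R,
    # skipped when the following residue is proline (assumed rule).
    sites = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = []
    for start_idx in range(len(sites) - 1):
        # Join up to `missed_cleavages` additional neighbouring fragments.
        last = min(start_idx + missed_cleavages + 1, len(sites) - 1)
        for end_idx in range(start_idx + 1, last + 1):
            peptide = sequence[sites[start_idx]:sites[end_idx]]
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


example = "MKWVTFISLLRPEPTIDEKAAR"  # invented sequence, not a real protein
print(tryptic_digest(example))
print(tryptic_digest(example, missed_cleavages=1))
```

In this example the arginine followed by proline is not cleaved, so the middle peptide stays intact. Allowing one missed cleavage adds the longer peptides that span a single uncut site.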

The most commonly used separation strategy is liquid chromatography (LC) (Table 1.1). Taking advantage of the heterogeneity of peptide physicochemical characteristics, such as pH or hydrophobicity, LC enables a dynamic complex peptide mixture to be resolved into simpler populations 38, 39, 48, which greatly improves proteome coverage (Figure 1.2E). This can be performed either off-line, prior to MS, or on-line, feeding directly into the MS.

Selective enrichment of subsets of the proteome or the PTMome is another way of reducing sample complexity and ensuring deeper coverage. Phosphorylation has emerged as a major research focus due to the critical role it plays in cell signalling pathways in human tissues 49. Phosphorylation occurs in >50% of all human proteins, with recent work suggesting the true number is closer to 90% 41, 50. Pioneering work by the Larsen group 51-53 enabled the selective enrichment of phosphopeptides from a complex biological mixture using a multidimensional fractionation strategy collectively known as TiSH (Figure 1.2F) (Table 1.1). Here, a complex peptide mixture is incubated with a precise ratio of titanium dioxide (TiO2) beads, which have affinity for phosphorylated peptides due to their amphoteric ion-exchange properties 54. This pre-enriches phosphorylated peptides and enables their separation from non-phosphorylated peptides (Figure 1.2F.i). The enriched phosphopeptide mixture is then subjected to sequential elution from immobilised metal affinity chromatography (SIMAC), which separates mono-phosphorylated (singly phosphorylated) peptides, which bind weakly to IMAC, from multiply phosphorylated peptide populations, which bind strongly (Figure 1.2F.ii). Mono-phosphorylated peptides undergo a second round of TiO2 bead enrichment to remove sialylated N-linked glycopeptides (Figure 1.2F.iii) 55. Then, to ensure a greater depth of analysis, mono-phosphorylated peptides are further fractionated based on their hydrophilicity using hydrophilic interaction LC (HILIC) (Figure 1.2F.iv). Each population and fraction is analysed by MS. TiSH is widely used and requires only minimal amounts of input peptide. It was recently employed by Kang et al. to elucidate and characterise the molecular mechanisms involved in glucose-stimulated insulin secretion from pancreatic β-cells by integrated analysis of the proteome, phosphoproteome and sialylated N-linked glycoproteins 56.

An emerging alternative for studying the phosphoproteome is a method termed EasyPhos 57, 58. This high-throughput workflow has reduced sample preparation time to one day, uses only low starting amounts of protein (≤ 200 µg) and is performed entirely in a single 96-well plate (Table 1.1) 58. This is achieved by using a novel all-purpose buffer containing sodium deoxycholate, which allows efficient lysis of cells, is compatible with trypsin digestion 59 and, in combination with isopropanol, is compatible with TiO2 phosphopeptide enrichment. EasyPhos has been used to accurately identify ~20,000 phosphopeptides in a time-resolved map of insulin signalling in the mouse liver 57 and 20,132 phosphopeptides in epidermal growth factor-treated glioblastoma cells 58. There are also a number of alternative approaches to phosphopeptide enrichment, including immunoprecipitation of phosphorylated serines, threonines or tyrosines 60, 61.

Figure 1.2 Upstream methods of a mass spectrometry-based proteomic workflow. Biospecimens collected from COPD patients: induced sputum, exhaled breath condensate (EBC), epithelial lining fluid (ELF), bronchoalveolar lavage fluid (BALF), plasma, serum and lung tissue (A). Protein lysates are extracted from biospecimens and digested with site-specific enzyme(s), such as trypsin, to form a tryptic peptide population (B). Tryptic peptides are introduced to the mass spectrometer (MS) via liquid chromatography, generating a spectral readout (C). Optional strategies to reduce sample complexity include gel electrophoresis, where a protein lysate is separated by molecular weight (1DE) or by molecular weight and the protein’s isoelectric point (2DE) (D); liquid chromatography, where a complex peptide population is deconvoluted into smaller peptide populations for MS analysis (E); and TiSH, for the selective enrichment of the proteome and of mono- and multi-phosphopeptides from a complex biological mixture using a multidimensional fractionation strategy (F). Aspects of Figure 1.2 were created using BioRender.

TABLE 1.1 Enrichment/fractionation techniques commonly employed in proteomic workflows

Technique | Outcome | Description | Notes | Ref.
Liquid chromatography | Fractionation of a peptide population based on any number of chemical or physical properties; e.g. pH, hydrophobicity | Peptides are introduced to the LC system and bind to a column (stationary phase). A solvent (mobile phase) is passed through the column to elute peptides in a sequential manner. | Can be utilised both on- or off-line for MS, the former reducing risk of peptide loss and the latter providing great coverage. | 62-64
TiSH | Mono-phosphorylated peptides; multi-phosphorylated peptides; non-modified peptides; sialylated N-linked glycopeptides | See next rows for breakdown of full method | A quite laborious workflow, however it greatly reduces sample complexity, and provides unparalleled coverage of four distinct peptide populations. | 52, 65, 66
TiO2 | Phosphorylated peptides; proteome/unphosphorylated population | A complex peptide population is subjected to a precise ratio of TiO2 beads, which bear affinity to phosphorylated peptides. This allows for the isolation of phosphorylated peptides. | |
SIMAC | Multi-phosphorylated peptides | Phosphopeptides are sequentially eluted from an immobilised metal affinity chromatography column, allowing enrichment of the multi-phosphorylated peptides. | |
2nd TiO2 | Mono-phosphorylated peptides; sialylated N-linked glycopeptides | A second round of TiO2 separates the mono-phosphorylated peptides from the sialylated N-linked glycopeptides. | |
HILIC | Fractionation of non-modified peptides, sialylated N-linked glycopeptides and mono-phosphorylated peptides | Fractionation of peptide population of interest based on their hydrophilicity | |
EasyPhos | Phosphorylated peptides; proteome | Using an all-purpose buffer, samples rapidly and efficiently undergo lysis, digestion and TiO2 phosphopeptide enrichment with only a single desalting step. | High-throughput rapid quantification of >20,000 distinct phosphopeptides using minimal starting material (≤200 µg) in ~1 day. | 57, 58

A final aspect to consider before undertaking proteomic studies is whether to use label-free or label-based protein quantification. Label-free approaches are more cost-effective but involve handling each sample individually, and there is an ongoing struggle with accuracy and robustness. Quantification algorithms have been developed to deal with this issue 67. In label-based approaches, isotopic labels can be introduced metabolically into living cells using stable isotope labelling of amino acids in cell culture, known as SILAC 68. At the peptide level, chemical tags such as isobaric tags for relative and absolute quantitation (iTRAQ) 69 or tandem mass tags (TMT) 70 can be used. These comprise a set of chemical tags, each possessing an amine-reactive group, which reacts with N-terminal amine groups or lysine side chains, and a reporter ion of differing mass, enabling quantification following peptide fragmentation 71. Once the peptide populations have been chemically tagged, all samples are mixed evenly to allow for the parallel analysis of up to 11 distinct samples, reducing bias, human error and the labour-intensive nature of proteomics 72.
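
To make the reporter ion idea more concrete, the sketch below shows one simple way relative quantification could be derived from reporter ion intensities once the labelled samples have been pooled. It assumes equal protein loading across channels, so each channel is scaled to the same summed intensity before log2 ratios are taken. The channel names, peptide names and intensities are invented, and this is not the algorithm of any specific software package.

```python
import math


def normalise_channels(intensities):
    """Scale every channel to the same summed intensity (equal-loading assumption).

    `intensities` maps peptide -> {channel: reporter ion intensity}.
    """
    channels = sorted({ch for per_peptide in intensities.values() for ch in per_peptide})
    totals = {ch: sum(per_peptide[ch] for per_peptide in intensities.values())
              for ch in channels}
    target = sum(totals.values()) / len(totals)  # mean of the channel totals
    return {
        peptide: {ch: per_peptide[ch] * target / totals[ch] for ch in channels}
        for peptide, per_peptide in intensities.items()
    }


def log2_ratio(normalised, peptide, channel, reference):
    """log2 fold change of `channel` relative to `reference` for one peptide."""
    values = normalised[peptide]
    return math.log2(values[channel] / values[reference])


# Invented reporter ion intensities for three peptides in two channels.
reporter_intensities = {
    "PEPTIDEA": {"control": 100.0, "treated": 400.0},
    "PEPTIDEB": {"control": 100.0, "treated": 200.0},
    "PEPTIDEC": {"control": 100.0, "treated": 200.0},
}

normalised = normalise_channels(reporter_intensities)
for peptide in reporter_intensities:
    print(peptide, round(log2_ratio(normalised, peptide, "treated", "control"), 3))
```

Here the "treated" channel carries more total signal than the control. After normalisation, only the first peptide remains elevated relative to the others, which illustrates why a loading correction is usually applied before ratios are interpreted.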

Following sample preparation, peptides are introduced to the MS. We next discuss the different types of MS acquisition approaches, the basic principles behind them and what information can be obtained from each.

1.3.2 MS acquisition approaches

MS acquisition approaches generally fall into three distinct categories (Figure 1.3): data-dependent acquisition (DDA), data-independent acquisition (DIA) and targeted proteomics. Although they differ, there are some basic principles that apply to all MS-based approaches.

A mass spectrometer is generally composed of three key components (Figure 1.3A): ion source, mass analyser and detector. At the ion source (Figure 1.3A.i), peptides are given an electrical charge to allow movement within the mass analyser. This is commonly achieved by electrospray ionisation 73, whereby a high voltage is applied to the peptide solution as it is sprayed towards the inlet of the mass spectrometer, or by matrix-assisted laser desorption/ionization (MALDI), which utilises a matrix capable of absorbing laser energy and transferring it to the peptides mixed within the matrix 74. These ionised peptides are introduced into the mass analyser of the MS. There are several mass analyser options available, each with unique characteristics affecting mass accuracy and sensitivity through differing approaches to separating ions 75. Mass analyser options include ion traps, Orbitraps, quadrupoles and time-of-flight analysers 75-78. Liquid chromatography (LC) is commonly coupled to two mass analysers in series, an arrangement known as LC tandem mass spectrometry (LC-MS/MS). For a basic overview of LC-MS/MS, we focus on triple quadrupole MS. A quadrupole consists of four parallel rods, which generate an electric field when a voltage is applied. A triple quadrupole is composed, in order, of a mass analyser (Figure 1.3A.ii), a collision cell (Figure 1.3A.iii) and another mass analyser (Figure 1.3A.iv) 38, 79. Upon entering the mass analyser, the ionised peptides, also known as precursor ions, enter the first quadrupole (Q1), which selects ions of particular mass-to-charge (m/z) ratios for fragmentation (Figure 1.3A.ii). This generates an initial spectrum known as the first mass scan (MS1), where precursor ions are graphed based on their relative intensity against elution time (Figure 1.3B). Selected ions enter the collision cell (Q2) at high velocity, where they collide with neutral gas molecules, generating fragment ions (Figure 1.3A.iii). The final quadrupole separates the fragment ions based on their m/z ratio and feeds them into the detector (Figure 1.3A.iv). The detector generates an output in the form of spectra in which the fragment ions are defined by their m/z ratio and intensity, known as the second mass scan (MS2). Aebersold and Mann have comprehensively reviewed the principles of MS-based proteomics 38.
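
Since precursor selection operates on m/z ratios, it may help to see how the m/z of a peptide ion follows from its sequence. The sketch below assumes an unmodified peptide carrying z protons, so that m/z = (M + z × 1.007276) / z, where M is the neutral monoisotopic mass (the sum of residue masses plus one water). The residue masses are standard monoisotopic values. The function names are hypothetical and the peptide "PEPTIDE" is a toy example.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # one water is added for the peptide termini
PROTON = 1.007276   # mass of the charge-carrying proton


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide.upper()) + WATER


def mz(peptide, charge):
    """Mass-to-charge ratio of the peptide carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(peptide) + charge * PROTON) / charge


for z in (1, 2, 3):
    print(f"PEPTIDE {z}+: m/z = {mz('PEPTIDE', z):.4f}")
```

The same peptide therefore appears at different m/z values depending on its charge state, which is why both the m/z ratio and the charge are needed to recover the peptide mass.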

Where these three approaches predominantly differ is in the manipulation of the parameters of the mass analyser, a balance between mass accuracy, resolution and sensitivity 38. When designing a proteomic study, one must prioritise between the number of proteins identified and high-resolution quantification of those identifications (Figure 1.3F). At the left of that linear scale is DDA, the most widely used approach in discovery-based (shotgun) proteomics (Figure 1.3C). Here the researcher is interested in obtaining the most protein identifications possible in complex samples, with relative quantification. DDA analysis enables the de novo identification of thousands of proteins in a single LC-MS/MS run. At the MS1 level, DDA selects the most abundant precursor ions, which are individually fragmented in MS2 38, 48, 80.

Notably, DDA is inherently biased towards highly abundant peptides because it stochastically samples the most abundant co-eluting peptides, which leads to reproducibility challenges and loss of lowly abundant peptides. This is further confounded by the heterogeneous nature of peptide physicochemical characteristics, including composition, charge, hydrophobicity/hydrophilicity and the presence of PTMs. However, as discussed in the sample preparation section, fractionation and enrichment strategies aim to alleviate some of these challenges. The DDA output is analysed by database search algorithms, such as Mascot 81 or Sequest 82, to determine with high probability which peptides are present and their corresponding proteins. Most MS vendors provide software to analyse this type of data; however, a popular free alternative is MaxQuant 83.
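
The abundance bias of DDA can be illustrated with a minimal sketch of "top N" precursor selection. It assumes a survey (MS1) scan represented as a list of (m/z, intensity) pairs and a simple exclusion list standing in for dynamic exclusion. The function name, tolerance and peak values are invented, and real instrument logic is considerably more involved.

```python
def select_top_n(ms1_peaks, n, excluded_mz=(), tolerance=0.01):
    """Pick the `n` most intense precursors that are not on the exclusion list.

    `ms1_peaks` is a list of (mz, intensity) tuples from one survey scan.
    """
    candidates = [
        (mz, intensity)
        for mz, intensity in ms1_peaks
        if not any(abs(mz - excluded) <= tolerance for excluded in excluded_mz)
    ]
    # Most intense precursors are fragmented first.
    candidates.sort(key=lambda peak: peak[1], reverse=True)
    return candidates[:n]


# Invented survey scan: (m/z, intensity).
survey_scan = [
    (400.69, 5.0e6),
    (523.28, 1.2e5),
    (612.33, 8.7e6),
    (733.41, 3.1e4),
    (845.90, 2.4e6),
]

first_cycle = select_top_n(survey_scan, n=2)
second_cycle = select_top_n(
    survey_scan, n=2, excluded_mz=[mz for mz, _ in first_cycle]
)
print(first_cycle)
print(second_cycle)
```

Even after two cycles with exclusion of already-sampled ions, the least intense precursor (733.41) is never selected, mirroring the loss of lowly abundant peptides described above.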

At the other end of this linear scale is targeted proteomics (Figure 1.3F), an alternative acquisition approach in which a researcher has a well-defined set of proteins pertinent to their biological question, such as clinical biomarkers. This set dictates the parameters of the MS1 and MS2 methods, providing precise quantification and unparalleled sensitivity and robustness 84-88. Selected reaction monitoring (SRM) 89, multiple reaction monitoring (MRM) 90 and parallel reaction monitoring (PRM) 85 encompass most targeted proteomic approaches. PRM has emerged as the targeted approach of choice (Figure 1.3E). Peterson et al. established this technique, which uses previously identified parameters of specific peptides (m/z ratio and elution times dictated by the LC gradient) to allow the MS to reproducibly identify the precursor ions at the MS1 level and monitor all fragment ions at the MS2 level. This has revolutionised the targeted proteomic field, allowing up to 600 pre-defined peptides to be quantified in a single LC run 91. Several bioinformatic software packages are freely available to assist in analysing the resulting data, including Qualis-SIS 92, QuaSAR 93 and Skyline 94.

An exciting and ever-evolving approach known as DIA aims to bridge the gap between the identification rate of DDA and the precise quantification of PRM (Figure 1.3E & F). To achieve this, instead of isolating individual precursor ions, DIA methods loop through pre-defined mass ranges, sampling all precursors within each range and, as in PRM, continuously monitoring the fragment ions for precise quantification 95, 96. The weakness of DIA has been the complex nature of the MS2 spectra generated, but there have been some promising technical advances and software packages 97, 98 that provide supervised and unsupervised analysis of these data, including OpenSWATH 99, Skyline 100, Spectronaut 101 and SWATHProphet 102. An emerging DIA method known as sequential window acquisition of all theoretical mass spectra (SWATH-MS) 103 bridges this gap between DDA and PRM and provides quantitative consistency 104. An important and powerful aspect of SWATH-MS is that retrospective targeting at both the MS1 and MS2 levels is possible, unlike in DDA (where it is possible to some extent at the MS1 level) or targeted proteomics. Due to its quantitative robustness and accuracy, SWATH-MS has been utilised in clinical therapeutics studies 105, biomarker identification 106 and basic research 107. Ludwig et al. published a detailed tutorial for using SWATH-MS 108.
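
The idea of looping through pre-defined mass ranges can be sketched as follows. The example assumes fixed-width isolation windows tiled across a precursor m/z range. The range of 400 to 1200 m/z and the 25 m/z width are illustrative values only, and the function names are hypothetical. Every precursor falling inside a window would be co-fragmented when that window is sampled.

```python
def dia_windows(mz_start, mz_end, width, overlap=0.0):
    """Tile the range [mz_start, mz_end] with fixed-width isolation windows."""
    if width <= 0 or overlap < 0 or overlap >= width:
        raise ValueError("need width > 0 and 0 <= overlap < width")
    windows = []
    lower = mz_start
    while lower < mz_end:
        upper = min(lower + width, mz_end)
        windows.append((lower, upper))
        if upper >= mz_end:
            break
        lower = upper - overlap  # next window starts where this one ends
    return windows


def windows_containing(windows, precursor_mz):
    """Indices of all windows whose m/z range includes the precursor."""
    return [i for i, (low, high) in enumerate(windows) if low <= precursor_mz <= high]


windows = dia_windows(400.0, 1200.0, width=25.0)
print(len(windows), windows[0], windows[-1])
print(windows_containing(windows, 612.33))
```

Because the window scheme is independent of which precursors happen to be most intense, the same windows are sampled in every cycle and every run, which is one intuition for the quantitative consistency attributed to DIA.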

We next discuss the available relevant COPD biospecimens, their use, relevant studies and limitations.

1.4 Tissue selection

Inhalational exposure to the noxious stimuli that induce COPD results in complex and

heterogeneous responses and disease 109-111. Complexity and heterogeneity are two very

important factors to consider when designing any proteomic study, the former referring to the

numerous clinical characteristics of COPD while the latter highlights that not all patients

possess all characteristics of the disease at any given time 111.

The next question is what biospecimens to work with. There are several biospecimens

available, each providing different but important information. Lung tissue from a relatively

homogeneous population provides a broad global view of changes, while individual cell types

provide a window into specific changes. Both are important and will provide pieces needed to

understand COPD aetiology. The advances and principles of MS-based proteomics outlined above

have the potential to unravel the role of these biospecimens in COPD pathogenesis. We

discuss the major biospecimens relevant to COPD, their potential uses in proteomics, the most

recent or applicable MS-based proteomic studies (Table 1.2) undertaken to further the

understanding of COPD pathogenesis or to identify potential biomarkers 112, 113, and

challenges and limitations to consider.

Search strategy and selection criteria

We searched PubMed using the search terms “proteomics” and “COPD” combined with the biospecimen of interest, using both its abbreviation and full name (“induced sputum”, “exhaled breath condensate”, “epithelial lining fluid”, “bronchoalveolar lavage fluid”, “plasma”, “serum”, “lung tissue”), focusing on original data publications in English between January 2000 and November 2018. We searched the reference lists of all publications included in this review for additional relevant original publications. We focused on publications that used an MS-based proteomics approach with human samples. We endeavoured to cover different MS acquisition approaches to provide clinically relevant examples pertinent to the MS-based proteomics section.

1.4.1 Induced sputum

Inhalation of hypertonic saline induces the secretion of complex sputum originating from

both proximal and distal airways. It is the most studied COPD clinical biospecimen due to the

ease of producing samples using non-invasive techniques and its applicability to studying

inflammatory mediators 114.

A fractionation technique that can be used prior to MS is two-dimensional difference in

gel electrophoresis (2D-DIGE), which separates complex protein populations based on

molecular size and isoelectric point, with proteins appearing as spots on the gel. Ohlmeier et al. used a

cysteine-specific 2D-DIGE, which targets cysteine/thiol-containing proteins known to have

roles in lung defence and redox modulation of several lung diseases 115, 116. Comparing the

induced sputum from COPD GOLD stage 2 patients, healthy smokers and non-smokers, 38

differentially expressed spots (DES) were found to be exclusive to COPD, suggesting a unique

role in disease 117. These spots were excised and analysed using MALDI-TOF MS, which

resulted in the identification of 15 differentially expressed proteins. These included the

polymeric immunoglobulin receptor, a protein which regulates immune defence and

inflammation 118. It was highly upregulated in COPD patients and smokers suggesting a smoke-

dependent induction. It was validated in a clinical cohort by ELISA, immunohistochemistry

and immunoblotting using induced sputum, lung tissue and plasma.

More recently, researchers hypothesised that an MS-based proteomic approach would enable

the identification of unique alterations related to differing clinical features of COPD. Induced

sputum from COPD patients with varying clinical features, healthy smokers and healthy

controls, were subjected to 2DE coupled to LC-MS/MS. This revealed reduced expression of

the immunoglobulin joining-chain in airway obstruction, increased expression of histones and

defensins related to inflammatory emphysema, and elevated mucin 5AC correlating with

mucus hypersecretion 119.

Sophisticated unbiased approaches have also been employed to study the proteome in

induced sputum. Comparative, quantitative LC-MS/MS coupled to TMT labelling was

used to analyse 240 subjects in four study groups: mild-moderate COPD (GOLD

stage 1/2), current, former and never smokers 120. Thirteen differentially expressed proteins

were identified with high confidence between COPD patients and asymptomatic smokers.

These included metalloproteinase inhibitor-1, apolipoprotein A-I, and BPI fold-containing

family B member-1, which are linked to oxidative stress, inflammation, and altered mucus

production, respectively.

These studies demonstrate that induced sputum may enable the identification of non-

invasively obtained biomarkers, but it is important to note that collection of this biospecimen

is complicated by technical and clinical factors such as varying sample dilutions between

patients, bacterial colonisation, and protein or cellular contamination from the upper airways

114, 121.

1.4.2 Exhaled breath condensate

Exhaled breath condensate (EBC) is another appealing non-invasive sample used for the

identification of biomarkers due to the ease of collection and its suitability for longitudinal

studies 121. Only two MS-based proteomic studies have assessed EBC in COPD. One compared

the EBC from patients with the genetic disorder α1-antitrypsin deficiency (A1AT) with

emphysema to healthy controls 122. This study identified several cytokines and cytokeratins

(Table 1.2) to be upregulated in the A1AT patients. Building on this, Fumagalli et al. profiled similar cohorts with the addition of COPD patients without emphysema and healthy smokers 123. They identified several cytokines, including interleukin-1α and -β and tumour necrosis factor, as well as type I and II cytokeratins and calgranulin A and B, as potential biomarkers to distinguish between groups. Intended as a preliminary qualitative study, it has several limitations, most notably the lack of quantitative data and limited validation.

Due to the paucity of MS-based proteomic studies in COPD, we mined data from a study focused on identifying non-invasive biomarkers for the early detection of lung cancer, which included COPD patients, smoking participants at risk of developing both diseases and healthy controls. We identified 17 unique proteins (Table 1.2) in COPD patients, such as fibrinogen-α and -β chain 124. A Gene Ontology classification 125, 126 of these unique proteins returned molecular functions of and unfolded protein binding, with enrichment for pathways involved in legionellosis 127, the spliceosome and mitogen-activated protein kinase (MAPK) signalling. While not without limitations, the re-analysis of these data provides potential proteins and pathways for further proteomic characterisation and functional studies in appropriate patient cohorts.

EBC is limited by its very dilute and highly variable collections between patients. EBC collections are sourced from the whole lung, providing a relatively low number of proteins, with the additional complication of potential contaminants from the upper airway 121, 128. Future COPD MS-based proteomic studies of EBC may lead to the development of high-throughput diagnostic tests to predict disease onset or markers of progression; however, improvements to sample collection and processing for MS 129 need to occur first.

1.4.3 Epithelial lining fluid

Epithelial lining fluid (ELF) is a thin layer of fluid that is a physical barrier between the

external environment and lung tissue. ELF is sampled using a technique known as

bronchoscopic microprobe sampling, whereby a bronchoscope guides and exposes a protected adsorptive

tip to the airway mucosa to sample the ELF via adsorption 130, 131. ELF comprises a

complex matrix of proteins and peptides that covers the mucosal airway and alveolar surfaces.

Unlike the previous biospecimens discussed ELF does not suffer from dilution or upper airway

contaminations and its analysis provides an insight into specific compartments 131. It does

however risk possible contamination from blood as well as potential tissue damage 131.

Proteomic profiling of this protective barrier could provide insights into how environmental

factors, cigarette smoke in the case of COPD, can alter the protein composition of the ELF and

the resulting pathogenic effects on gas exchange, removal of impurities, etc. 113.

The first proteomic profile of human ELF employed 1DE to separate proteins extracted

from the fluid of COPD patients and healthy controls. The resulting gel was cut into 43 slices

and analysed by LC-MS/MS identifying 269 proteins with varying functions including host

responses to bacterial infection, oxidative stress and inflammation 130. In the same study, a

comparative quantitative analysis of the two cohorts was performed using an iTRAQ labelled

approach coupled to MALDI-TOF/TOF MS. This yielded 299 proteins that differed between

the two cohorts, including the differentially expressed proteins lactotransferrin, HMGB1,

cofilin-1 and α1-antichymotrypsin, all of which were validated by immunohistochemistry.

Subsequently, the same group attempted to differentiate the proteomic profiles of ELF from

young and old individuals susceptible to COPD compared to similar aged healthy controls 132.

ELF was collected at baseline and 24 hours after susceptible subjects had smoked three

cigarettes, and data were compared to their baseline and to unexposed healthy controls. Using

iTRAQ coupled to 2D-LC MALDI-TOF/TOF MS, peroxiredoxin, serpinB3 and aldehyde

dehydrogenase 3A1 were substantially increased following smoke exposure in COPD patients.

These proteins were linked to anti-inflammatory responses 133, metabolising toxic compounds

134, and protection against oxidative stress 135.

1.4.4 Bronchoalveolar lavage fluid

One of the most invasive biospecimens is bronchoalveolar lavage (BAL), as its collection

involves fibre-optic bronchoscopy to instil a sterile saline solution that is then aspirated. One of

the advantages of BALF is that it contains an array of proteins that coat, and are secreted

from, the surface of distal airways and epithelia. Although dilute collections and high salt

concentrations are significant issues resulting in poor recovery, in addition to potential tissue

damage 121, 136, its analysis offers insights into dysregulated proteins derived from lung sources

such as epithelial cells and also external sources due to diffusion from serum across the

alveolar-capillary membrane 137. It is important to note that BAL fluid (BALF) generally refers

to the supernatant remaining after removal of the immune cells from the total BAL sample.

Early investigations collected BAL from lifelong smokers and healthy non-smokers at

baseline with follow-up 6-7 years later when some of the smoking cohort had developed GOLD

stage 2 COPD 138. Proteomic analysis revealed protein signatures associated with the

development of COPD. Using a more advanced mass analyser, an Orbitrap, the BALF

proteomes of stable GOLD stage 2 patients versus healthy controls were investigated, yielding

76 significantly dysregulated proteins with diverse biological processes including proteolysis,

glycolysis, gluconeogenesis and alcohol metabolic processes 139. These proteins have

biomarker and therapeutic target potential.

A recent study used an iTRAQ-based proteomic approach to elucidate which proteins

and pathways are associated with COPD pathogenesis and how patient gender affects the

underlying mechanisms 140. Assessing subjects from the Karolinska COSMIC cohort, a cross-

sectional multi-omic study where clinical groups are stratified by gender 141, 142, researchers

showed that proteome alterations differed significantly between genders with only three shared

proteins. The female COPD patient proteome displayed a significant enrichment of

phagocytosis and lysosomal pathways, which correlated well with the level of obstruction and

emphysema, respectively. This study highlights the power of proteomics and its ability to

disentangle complex disease, in this case showing the importance of gender stratification and

its potential role in disease pathogenesis.

1.4.5 Plasma and serum

The use of blood or plasma is appealing clinically as they provide minimally invasive samples

for analysis, diagnosis and disease monitoring. An additional benefit is that analysis of blood

provides information on multi-organ interactions and systemic responses. However, proteomic

analysis of blood is challenging due to its wide dynamic range and the presence of highly abundant

proteins 143. To overcome this, a plasma depletion method can be used, but this approach suffers from a

lack of reproducibility 144, 145. Alternatively, collection in cell‐free DNA blood collection tubes

aids the identification of low-abundance plasma proteins 146. A very recent study demonstrated

the largest non-depleted plasma proteome with a depth of 1,700 proteins, averaging 1,025

proteins across 175 patients 147. This was achieved using an automated sample preparation

platform with short MS run time, 45 minutes, making it suitable for clinical application.

Pioneering work used surface-enhanced laser desorption/ionization – time of flight

(SELDI-TOF) MS to identify a novel serum biomarker, serum amyloid A, in acute

exacerbations of COPD 148. A later study again used SELDI-TOF MS to examine 125 sera

across four study groups; COPD smokers, asthma, cystic fibrosis and healthy controls. In total

119 differentially expressed spectral peaks were identified, providing unique signatures for

each group. Only one peak was further investigated and identified as haemoglobin subunit-β,

with increased expression in cystic fibrosis, but no validation was performed 149. A more recent

study used 1DE to fractionate plasma samples prior to LC-MS/MS and identified low

abundance plasma biomarkers in 10 severe COPD patients compared to 10 healthy controls 150.

Of 31 differentially expressed proteins identified, a glucose regulated protein of 78 kD,

interleukin-1 receptor accessory protein, soluble CD163 and macrophage stimulating factor-9

were dysregulated in COPD patients, with strong correlations with lung function and

emphysema severity. All four proteins were validated in a separate cohort of 80 subjects by

ELISA and immunoblotting. A 2017 study took a global unbiased approach using iTRAQ to

examine COPD GOLD stage 4 smokers and healthy smokers using high-resolution MS.

Amongst 646 proteins identified, 13 and 28 were upregulated and downregulated, respectively,

in COPD 151. Thyroxine-binding globulin was elevated in COPD, and identified as a promising

candidate plasma biomarker. This protein is interesting as it is linked to acute exacerbations

and the severity of impaired lung function, and recent studies suggest a relationship between

dysfunction in the endocrine system and COPD exacerbations 152, 153.

1.4.6 Lung tissue

Airway and lung tissue biopsies can be obtained using bronchoscopy or from lung

transplant surgeries. It is an important sample to use in proteomic investigations as it provides

a direct snapshot of the altered protein expressions and dysregulated pathways in the diseased

lung tissue itself. However, biopsy collection is invasive and carries increased risks of pneumothorax, bleeding and hypoxia 154, 155.

Ishikawa et al. used 2DE MALDI-TOF MS to identify the modulation of haemoglobin-α and -β monomers and complexes in idiopathic pulmonary fibrosis (IPF) compared to COPD

GOLD stage 4 lungs and healthy controls 156. A larger study of lung tissue from six patient cohorts of mild to moderate COPD (GOLD stage 1/2), severe to very severe (GOLD stage 3/4),

AATD, IPF, smokers and healthy controls using 2D-DIGE separation and MALDI-TOF MS identified 167 DES, yielding 82 proteins 157. Cathepsin D, dihydropyrimidinase-related protein-2, transglutaminase-2 and tripeptidyl-peptidase-1 were exclusively linked to COPD.

Transglutaminase-2 detection correlated with COPD severity with no association with smoking. Future analyses investigating these proteins longitudinally coupled with mechanistic studies may identify promising diagnostic markers or therapeutic targets.

A very recent study used the powerful SWATH-MS approach to characterise the lung’s extracellular matrix (ECM) in both COPD (GOLD stage 4) and severe IPF 158. Using a sequential tissue extraction strategy on distal lung resections that produced three fractions (soluble, sodium dodecyl sulfate (SDS) and ECM-enriched), this study achieved deeper pulmonary proteome coverage with a particular focus on the matrisome 159, 160. This approach resulted in the quantification of 3,369 proteins, of which 279 were assigned to the matrisome.

Late-stage COPD patients were characterised by significant increases in ECM regulators that play pivotal roles in tissue homeostasis, namely matrix metalloproteinases 12 and 28, metalloproteinase inhibitor 3 and serine protease HTRA1. With greater uptake of public repositories for raw MS data, such as ProteomeXchange 161, this human pulmonary proteome will be a valuable resource.

Table 1.2 Summary of key recent proteomic studies in COPD using different clinical biospecimens

Sample Type | Proteomic Method | Study Subjects | Fractionation | Number of Identifications | Significant Dysregulated Proteins | Validation Methods | Publication Year | Ref.
BALF | iTRAQ 4plex; LTQ Orbitrap Velos | Smokers with normal lung function (M:F – 11:14), COPD GOLD stage 1/2 (M:F – 10:8) | Mix-mode chromatography | 1,265 proteins | Female COPD vs smoker: 22; male COPD vs smoker: 24; ARP3, HEXB, ATP5B, ILTA4H | 2D-DIGE proteomics | 2018 | 140
BALF | LTQ-Orbitrap | COPD stage 2 ex-smoker (10), healthy non-smoker (10) | nRPLC & nLC | 423 protein groups; 76 dysregulated | Cathepsin D, galectin 3, ADH1B, transgelin 2, fibrinogen β, SAP, ALDH2, ALDH3A1 | Immunoblot | 2014 | 139
BALF | MALDI-TOF/TOF | COPD stage 2 (7), asymptomatic smokers (22), non-smokers (18) | 2DE | 406 spots; 30 DES; 200 proteins | Secretory IgA | None | 2007 | 138
ELF | iTRAQ 4plex; MALDI-TOF/TOF | Study 1) Young susceptible (5) and non-susceptible (5) to develop COPD; Study 2) COPD (5), older healthy smoker (5), non-smoker (5) | SCX & LC | Young (susceptible vs non-susceptible) 112 proteins; older (COPD vs HS vs NS) 104 proteins | Peroxiredoxin I, uteroglobin, SerpinB3, S100A8, S100A9, aldehyde dehydrogenase 3A1 | ELISA, immunohistochemistry | 2014 | 132
ELF | iTRAQ 8plex; MALDI-TOF/TOF | Duplicates: COPD (4) vs healthy controls (4) | SDS-PAGE & ChipLC | Preliminary) 43 slices; 269 proteins; iTRAQ 1) 138 proteins; iTRAQ 2) 161 proteins | Lactotransferrin, HMGB1, cofilin-1, α1-antichymotrypsin | Immunohistochemistry | 2013 | 130
EBC | Qq-TOF | COPD (46) & lung cancer (48), risk factor smoking (49), healthy control (49) | LC | COPD: 125 (17 unique to COPD) | FGA, FGB, HPR, HSPA1A, HSP72, HSPA1, HSX70, HSPA1B, HSP72, HSPA1L, HSPA6, HSP70B', IGKV3-20, IGHM, TGM3 | None | 2017 | 124
EBC | LTQ Orbitrap | COPD no emphysema (15), A1AT (23), healthy smoker (20), healthy control (25) | RP-HPLC | COPD: 17 proteins; A1AT: 15 proteins; NS & HS: 44 proteins | Type I and II cytokeratins, monomeric & dimeric surfactant protein A, α1-antitrypsin, cytokines, calgranulin A & B | SELDI-TOF & Western blot | 2012 | 123
EBC | ESI-Ion Trap & SELDI-TOF | AATD with pulmonary emphysema (20) & HC (25) | 1DE & 2DE, RP-HPLC & µHPLC | 12 peptides | Interleukin-2 & -15, gamma-interferon, cytokeratins-1, -9 & -10 | ELISA | 2008 | 122
Induced Sputum | Q-TOF | OA + emphysema (10), OA - emphysema (15), MH + normal lung function (11), healthy smoker (13), healthy control (7) | LC | 203 proteins; focus on 50 proteins | Mucin 5AC, Igj polypeptides | None | 2015 | 119
Induced Sputum | TMT-6plex; Orbitrap | COPD stage 1/2 (60), current smokers (60), former smokers (60), never smokers (60) | nLC | Differentially abundant proteins: CS vs NS (107), CS vs FS (110), FS vs NS (1), COPD vs NS (186), COPD vs FS (155) and COPD vs CS (13) | KRT19, C6orf58, TIMP1, BPIFB1, PPIB, TF, AHSG, SERPINC1, AFM, ALB, HRG, APOA1, CNDP1 | None (comparison with transcriptome) | 2015 | 120
Induced Sputum | MALDI-TOF | COPD GOLD stage 2 smokers (7), healthy smokers (7), non-smokers (7) | 2D-DIGE | 38 DES; 15 proteins | ACTG, AMY1, AZGP1, C3, CST1, CST4, GC, IGKC, LPLUNC1, MSMB, PIGR, SERPINA1, TF, TTR | ELISA: PIGR (plasma); IHC: PIGR (LT); immunoblot: PIGR (IS & LT) | 2012 | 117
Lung Tissue | Orbitrap | Control (2), severe IPF (6) and COPD stage 4 (5) | Sequential tissue extraction & SDS-PAGE | 3,369 proteins; 279 proteins assigned to the matrisome | HTRA1, MMP-12, MMP-28, TIMP3 | Immunohistochemistry | 2018 | 158
Lung Tissue | MALDI-TOF/TOF | COPD stage 1/2 (8), 3&4 (8), A1AT (8), smokers (9), IPF (9), healthy controls (9) | 2D-DIGE | 1940 spots; 167 DES; 82 proteins | COPD specific (5): CAPG, CTSD, DPYSL2, TGM2 & TPP1 | ELISA, immunoblot | 2016 | 157
Lung Tissue | MALDI-TOF | COPD stage 4 (4), IPF (4), healthy controls (4) | 2DE | Specifically excised two highly abundant spots in IPF but low in controls and COPD | Haemoglobin-α & -β monomers and complexes | Immunohistochemistry & immunoblot | 2010 | 156
Plasma | iTRAQ 8plex; Orbitrap | COPD stage 4 smokers (4), healthy smokers (4) | LC | 646 proteins; 13 upregulated and 28 downregulated in COPD | Thyroxine-binding globulin, peroxiredoxin-2, galectin-7 | ELISA | 2017 | 151
Plasma | High Capacity Ion Traps | Severe COPD (10), healthy controls (10) | 1DE & LC | 712 proteins; 31 differentially expressed proteins | Unique (7), upregulated (6) and downregulated (7) in COPD. GRP78, IL1AP, sCD163 and MSPT9 | ELISA & immunoblot | 2014 | 150
Plasma | MALDI-TOF; MALDI-TOF/TOF | COPD stage 2 non-smokers (5), stable asthma (21), and healthy controls (17) | 2D-DIGE | 1900 spots; 72 DES; 20 unique differentially expressed proteins | α2-macroglobulin, ceruloplasmin, haptoglobin, hemopexin | ELISA & immunoblot | 2011 | 162
Serum | MALDI-QIT-TOF | COPD (20), healthy smokers (15), healthy controls (18) | SDS-PAGE | Focus on surfactant protein D, excised band of interest at 43 kDa | Significant correlation between fucosylation levels in serum surfactant protein D and the severity of emphysema in smokers. Mechanism unknown. | ELISA, MS & immunoblot | 2015 | 163
Serum | SELDI-TOF; MALDI-TOF/TOF | COPD smokers (16), asthma (29), cystic fibrosis (26), healthy controls (54) | Chip | 137 peaks detected; 119 significantly differentially expressed | Not all peaks were identified or investigated; only haemoglobin subunit-β is identified. | None | 2010 | 149
Serum | SELDI-TOF | Severe acute exacerbations of COPD with elevated C-reactive protein (4) | Chip | Cluster of peaks at 11,548 and 11,699 Da upregulated in the exacerbated samples | Serum amyloid A (11,699) | ELISA | 2008 | 148

1.5 Strategies to elucidate complex pathogenesis

Here, we have highlighted advances in MS technology and enrichment strategies that reduce the complexity of the proteome to gain greater coverage. We have highlighted pivotal studies in which MS technology has been used, with the most recent incorporating more complex methods and cutting-edge proteomic technology. Next, we discuss a pivotal study that provides a template for combining a novel sample preparation method with cutting-edge MS technology to build a time- and compartment-resolved profile of the pulmonary proteome 164.

This study by Schiller et al. aimed to assess how the extracellular niche is dynamically regulated upon lung injury and repair, and sought to comprehensively characterise how this niche is remodelled in different tissue compartments. Using a rapid mouse model of bleomycin-induced pulmonary fibrosis 165, the study assessed proteomic alterations at time points associated with inflammation (day 3), fibrogenesis (day 14), remodelling (day 28) and resolution (day 54). The authors first present the compartment resolution of the pulmonary proteome using lung tissue at day 14 following treatment with either PBS or bleomycin, coupled with a novel method of quantitative detergent solubility profiling (QDSP) that subjects the insoluble protein pellet to a series of buffers with increasing detergent strength. QDSP resolved the pulmonary proteome into four unique fractions, including lung ECM components, the ELF proteome of the airway and alveolar lumen, and the interstitial proteome. This compartment-resolved profile identified 8,366 proteins, of which 171 were core matrisome and 264 matrisome-associated proteins. This characterisation provides a unique insight into how the abundance and localisation of proteins change upon lung injury.

Focusing on the four time points, the authors quantified 6,236 protein groups, with ANOVA revealing that 3,032 proteins were significantly altered in at least one time point, of which 154 were matrisome components. Assessing these proteomic profiles in combination led to the identification of two ECM proteins not previously reported in tissue injury or fibrosis: collagen-XXVIII (Col28a1) and Emilin-2. Tissue sections from day 14 were used to validate both proteins by immunofluorescence. The authors note that Col28a1 is structurally a beaded-filament-forming collagen associated with basement membranes 166, with no known function.

Emilin-2, a glycoprotein 167, has been shown to directly activate the extrinsic apoptosis pathway

168. Both proteins warrant further functional investigation of their roles in tissue repair processes.
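
The time-course comparison described above rests on a per-protein test across the four time points. As a minimal sketch only, and not the authors' pipeline, the one-way ANOVA F statistic for a single protein could be computed as follows. The function name anova_f, the use of day labels as dictionary keys and the log2 intensity values are all hypothetical and chosen for illustration. Converting F to a p-value and correcting for multiple testing across thousands of proteins would require a statistics library and is omitted here.

```python
from statistics import mean


def anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of replicate groups.

    Each group holds the replicate measurements for one time point.
    Assumes at least two groups and non-zero within-group variance.
    """
    k = len(groups)                      # number of time points
    n = sum(len(g) for g in groups)      # total number of measurements
    grand = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of replicates around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical log2 intensities for one protein, three replicates per day.
protein_by_day = {
    "day3": [1.0, 2.0, 3.0],
    "day14": [2.0, 3.0, 4.0],
    "day28": [5.0, 6.0, 7.0],
    "day54": [1.0, 2.0, 3.0],
}
f_stat = anova_f(list(protein_by_day.values()))
```

A large F indicates that abundance differs between at least two time points by more than the replicate scatter would explain, which is the sense in which a protein is "significantly altered in at least one time point".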

These unique profiles provide bioinformatic mining possibilities, such as elucidating upstream temporal dynamics of transcriptional networks and downstream biological functions using

Ingenuity Pathway Analysis (IPA) 169 Knowledge base. In this study, based on the alterations in protein expression of 365 known targets of TGF-β signalling, IPA classified this master upstream regulator as activated at the fibrogenesis stage of the model and resolved to baseline by the remodelling phase. Additionally, these protein signatures can be correlated with human lung function data to identify which proteins are uniquely altered at different clinical disease stages, all with the added dimension of spatial resolution. This paper highlights the depth that proteomics can provide in highly complex tissue environments, and the use of higher-level bioinformatic analysis to identify the activity of master regulators of disease.

Phosphorylation is the driving force of cell signalling, mediating practically all biological functions 170, yet there is a paucity of studies that have investigated the role of phosphorylation in COPD. Cancer research is an excellent example illustrating the informative

power of phosphoproteomics and how it can characterise the complex, transient signalling pathways that drive many cancers at a molecular level. A recent study employed both discovery and targeted proteomics to characterise downstream signalling pathways regulated by Janus kinase 3 (JAK3) mutations in T-cell acute lymphoblastic leukaemia (T-ALL) 171. Using TMT10plex, Degryse et al. performed an unbiased deep phosphoproteomic analysis of triplicate Ba/F3 cells expressing mutant human JAK3 (L857Q) treated with the JAK1/JAK3-selective inhibitors tofacitinib or ruxolitinib, or vehicle. Using the multidimensional TiSH strategy 51 coupled to high-resolution LC-MS/MS,

2,070 unique phosphoproteins comprising ~5,400 unique phosphorylation sites were quantified across all nine samples, with a false discovery rate of 1%. Unsupervised hierarchical clustering along with Ingenuity Pathway Analysis identified five independent clusters and the most significant biological functions and signalling pathways. Focusing on dysregulated unique phosphorylation sites that showed a two-fold change or more following treatment (84 downregulated, 48 upregulated), alterations were identified in pathways regulating apoptosis, cell cycle, epigenetic processes, MAPK and phosphatidylinositol-4,5-bisphosphate 3-kinase/protein kinase-B signalling, RNA metabolism and translation initiation. Follow-up using PRM validated 19 novel phosphorylation sites in two JAK3 mutant Ba/F3 cell lines (L857Q & M511I) and two JAK3 mutant human T-ALL patient samples, all with the same treatment regime as the discovery arm. This highlights how proteomics and phosphoproteomics can create a detailed map of dysregulated molecular mechanisms in a complex disease and how they change with treatment, and how targeted studies can validate the pathways of interest with precise quantification and detailed molecular characterisation. Murray et al. reviewed in detail how to use these approaches to identify druggable signalling pathways 172.
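
The two-fold change filter used to select dysregulated phosphorylation sites can be illustrated with a short sketch. This is an assumption-laden example rather than the published workflow: the function name classify_sites, the site labels and the treated/vehicle ratios are hypothetical, and the cut-off is applied symmetrically on the log2 scale so that a ratio of 2 and a ratio of 0.5 are treated as equally large changes.

```python
import math


def classify_sites(ratios, threshold=2.0):
    """Split phosphosites into up- and downregulated lists by fold change.

    ratios maps a site label to its treated/vehicle intensity ratio
    (must be positive). A site is kept when its change is at least
    `threshold`-fold in either direction.
    """
    cutoff = math.log2(threshold)
    up, down = [], []
    for site, ratio in ratios.items():
        log2_fc = math.log2(ratio)
        if log2_fc >= cutoff:
            up.append(site)
        elif log2_fc <= -cutoff:
            down.append(site)
    return sorted(up), sorted(down)


# Hypothetical site labels and ratios, for illustration only.
example = {"SiteA": 0.25, "SiteB": 2.0, "SiteC": 1.3, "SiteD": 0.5}
up, down = classify_sites(example)
```

In practice such a filter would normally be combined with a statistical test across replicates, since fold change alone does not account for measurement variability.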

1.6 Conclusions

We provide a detailed toolkit for future proteomic and phosphoproteomic studies of COPD.

We discuss the basic principles of proteomics, contextualising what this exponentially complex omics field is and its importance (Figure 1). We outline the key methodological considerations of quantitation (label-free versus label-based), proteome depth (e.g. LC), and phosphorylation enrichment (TiSH and EasyPhos) needed to harness the full potential of MS-based proteomics

(Figure 2). We provide detailed descriptions of how MS-based proteomics works and the major MS acquisition approaches that underpin high-resolution quantitative proteomics

(Figure 3), what they provide and how to analyse them. We review the applicability of clinically relevant COPD biospecimens, discussing their sampling and potential information they provide.

We highlight key MS-based proteomic studies exploring each biospecimen and the challenges faced. We discuss how proteomics and phosphoproteomics can be applied to provide unique insights into complex disease, using two seminal studies as templates for COPD research. Schiller et al. demonstrated the utility of a temporal and compartment-resolved proteomic approach to understand disease processes in a highly complex lung tissue environment. The temporal interrogation of a mouse model enabled >8,000 proteins in the lung and BALF to be progressively tracked from the initial inflammation and fibrogenesis phases through to the tissue repair phases of remodelling and resolution. Degryse et al. demonstrated how phosphorylation enrichment strategies can characterise ~5,400 site-specific phosphorylation events, and how their association with perturbed signalling pathways provides pivotal information for the development of therapeutic strategies. Applied to COPD, these approaches could unravel its dynamic proteomic complexity, identify upstream master kinase and transcriptional regulators, and characterise the dysregulation of signalling pathways driven by phosphorylation.

As Schiller et al. illustrate, a promising solution to the challenge of COPD complexity and heterogeneity is the use of clinically relevant animal models 2, 3, 173-183 that recapitulate hallmark features of human disease. This approach allows the analysis of multiple biospecimens in the context of a multi-organ system. Acknowledging the complexity and heterogeneity of COPD is the first step towards stratification into relatively homogenous populations to better understand the mechanisms driving specific cohort characteristics, which could also define novel COPD phenotypes 109, 184. Research from the Karolinska COSMIC cohort has demonstrated that integrating five to seven omics datasets sourced from five different anatomical locations enabled exact classification of COPD diagnosis 185. As outlined earlier in this review, current knowledge suggests the lung is composed of over forty different cell types. This is an important dimension that will be filled by ever-evolving single cell proteomic techniques 186. Maturity of single cell proteomic technology will not only influence projects such as The Human Cell Atlas 24 in defining the exact number of cell types housed within the lung, but will also provide spatial information on how cells are organised and their relative protein profiles and abundance.

Ultimately, the uptake of these MS-based proteomic approaches has the potential to improve our understanding of the pathogenesis of COPD and, coupled with multi-omics incorporating multiple biospecimens, will help to unravel COPD complexity, define the true molecular subtypes and, in time, bring about much-needed personalised medicine approaches.

To pursue a further understanding of the aetiology of COPD, this thesis hypothesises that a time-resolved comparative and quantitative proteomic approach on a clinically relevant mouse

model will shed light on the dysregulated proteins and pathways driving the pathogenesis of

COPD. To test this hypothesis, there are three key aims of this thesis:

1. To use discovery mass spectrometry technology to progressively track protein abundance

and localisation throughout the induction and progression phases of COPD in an

established clinically relevant mouse model 2, 3, 173-183.

2. To employ the multi-dimensional TiSH phosphoproteomics strategy to establish the first

complete phosphoproteome of COPD in a mouse model with the goal to layer the time-

resolved profile from aim 1 with relevant changes in phosphorylation status.

3. To establish a platform for the analysis of endobronchial biopsies from patients using

both discovery and high-resolution targeted proteomics to allow the translation of

findings from the mouse model.

References

1. Lozano R, Naghavi M, Foreman K, Lim S, Shibuya K, Aboyans V, et al. Global and

regional mortality from 235 causes of death for 20 age groups in 1990 and 2010: a

systematic analysis for the Global Burden of Disease Study 2010. Lancet 2012;

380:2095-128.

2. Hsu AC, Starkey MR, Hanish I, Parsons K, Haw TJ, Howland LJ, et al. Targeting PI3K-

p110alpha Suppresses Influenza Virus Infection in Chronic Obstructive Pulmonary

Disease. Am J Respir Crit Care Med 2015; 191:1012-23.

3. Jones B, Donovan C, Liu G, Gomez HM, Chimankar V, Harrison CL, et al. Animal

models of COPD: What do they tell us? Respirology 2017; 22:21-32.

4. Celli BR, MacNee W. Standards for the diagnosis and treatment of patients with COPD:

a summary of the ATS/ERS position paper. Eur Respir J 2004; 23:932-46.

5. Han MK, Agusti A, Calverley PM, Celli BR, Criner G, Curtis JL, et al. Chronic

obstructive pulmonary disease phenotypes: the future of COPD. Am J Respir Crit Care

Med 2010; 182:598-604.

6. Eisner MD, Anthonisen N, Coultas D, Kuenzli N, Perez-Padilla R, Postma D, et al. An

official American Thoracic Society public policy statement: Novel risk factors and the

global burden of chronic obstructive pulmonary disease. Am J Respir Crit Care Med

2010; 182:693-718.

7. Bilano V, Gilmour S, Moffiet T, d'Espaignet ET, Stevens GA, Commar A, et al. Global

trends and projections for tobacco use, 1990-2025: an analysis of smoking indicators

from the WHO Comprehensive Information Systems for Tobacco Control. Lancet 2015;

385:966-76.

8. Lopez AD, Shibuya K, Rao C, Mathers CD, Hansell AL, Held LS, et al. Chronic

obstructive pulmonary disease: current burden and future projections. Eur Respir J 2006;

27:397-412.

9. Barnes PJ. Corticosteroid resistance in patients with asthma and chronic obstructive

pulmonary disease. J Allergy Clin Immunol 2013; 131:636-45.

10. Spina D. Pharmacology of novel treatments for COPD: are fixed dose combination

LABA/LAMA synergistic? Eur Clin Respir J 2015; 2.

11. Cazzola M, Page C. Long-acting bronchodilators in COPD: where are we now and where

are we going? Breathe 2014; 10:110-20.

12. Hallstrand TS, Hackett TL, Altemeier WA, Matute-Bello G, Hansbro PM, Knight DA.

Airway epithelial regulation of pulmonary immune homeostasis and inflammation. Clin

Immunol 2014; 151:1-15.

13. Rennard SI, Dale DC, Donohue JF, Kanniess F, Magnussen H, Sutherland ER, et al.

CXCR2 Antagonist MK-7123. A Phase 2 Proof-of-Concept Trial for Chronic Obstructive

Pulmonary Disease. Am J Respir Crit Care Med 2015; 191:1001-11.

14. MacNee W. Oxidants/antioxidants and COPD. Chest 2000; 117:303s-17s.

15. Lakshmi SP. Emerging pharmaceutical therapies for COPD. 2017; 12:2141-56.

16. DeMeo DL, Silverman EK. Alpha1-antitrypsin deficiency. 2: genetic aspects of alpha(1)-

antitrypsin deficiency: phenotypes and genetic modifiers of emphysema risk. Thorax

2004; 59:259-64.

17. Hobbs BD, Hersh CP. Integrative genomics of chronic obstructive pulmonary disease.

Biochem Biophys Res Commun 2014; 452:276-86.

18. Pillai SG, Ge D, Zhu G, Kong X, Shianna KV, Need AC, et al. A genome-wide

association study in chronic obstructive pulmonary disease (COPD): identification of two

major susceptibility loci. PLoS Genet 2009; 5:e1000421.

19. Lao T, Jiang Z, Yun J, Qiu W, Guo F, Huang C, et al. Hhip haploinsufficiency sensitizes

mice to age-related emphysema. Proc Natl Acad Sci U S A 2016; 113:E4681-7.

20. Jiang Z, Lao T, Qiu W, Polverino F, Gupta K, Guo F, et al. A Chronic Obstructive

Pulmonary Disease Susceptibility Gene, FAM13A, Regulates Protein Stability of beta-

Catenin. Am J Respir Crit Care Med 2016; 194:185-97.

21. Pierrou S, Broberg P, O'Donnell RA, Pawlowski K, Virtala R, Lindqvist E, et al.

Expression of genes involved in oxidative stress responses in airway epithelial cells of

smokers with chronic obstructive pulmonary disease. Am J Respir Crit Care Med 2007;

175:577-86.

22. Franks TJ, Colby TV, Travis WD, Tuder RM, Reynolds HY, Brody AR, et al. Resident

cellular components of the human lung: current knowledge and goals for research on cell

phenotyping and function. Proceedings of the American Thoracic Society 2008; 5:763-6.

23. Plasschaert LW, Žilionis R, Choo-Wing R, Savova V, Knehr J, Roma G, et al. A single-

cell atlas of the airway epithelium reveals the CFTR-rich pulmonary ionocyte. Nature

2018; 560:377.

24. Schiller HB, Montoro DT, Simon LM, Rawlins EL, Meyer KB, Strunz M, et al. The

Human Lung Cell Atlas: A High-Resolution Reference Map of the Human Lung in

Health and Disease. Am J Respir Cell Mol Biol 2019; 61:31-41.

25. Braga FAV, Kar G, Berg M, Carpaij OA, Polanski K, Simon LM, et al. A cellular census

of healthy lung and asthmatic airway wall identifies novel cell states in health and

disease. bioRxiv 2019:527408.

26. Ordovas-Montanes J, Dwyer DF, Nyquist SK, Buchheit KM, Vukovic M, Deb C, et al.

Allergic inflammatory memory in human respiratory epithelial progenitor cells. Nature

2018; 560:649-54.

27. Reyfman PA, Walter JM, Joshi N, Anekalla KR, McQuattie-Pimentel AC, Chiu S, et al.

Single-Cell Transcriptomic Analysis of Human Lung Provides Insights into the

Pathobiology of Pulmonary Fibrosis. Am J Respir Crit Care Med 2019; 199:1517-36.

28. Xu Y, Mizuno T, Sridharan A, Du Y, Guo M, Tang J, et al. Single-cell RNA sequencing

identifies diverse roles of epithelial cells in idiopathic pulmonary fibrosis. JCI Insight

2016; 1:e90558.

29. Angelidis I, Simon LM, Fernandez IE, Strunz M, Mayr CH, Greiffo FR, et al. An atlas of

the aging lung mapped by single cell transcriptomics and deep tissue proteomics. Nature

communications 2019; 10:963.

30. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, et al. The sequence of

the human genome. Science 2001; 291:1304-51.

31. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, et al. Initial

sequencing and analysis of the human genome. Nature 2001; 409:860-921.

32. Hasin Y, Seldin M, Lusis A. Multi-omics approaches to disease. Genome Biol 2017;

18:83.

33. Munoz J, Heck AJ. From the human genome to the human proteome. Angew Chem Int

Ed Engl 2014; 53:10864-6.

34. Smith LM, Kelleher NL, The Consortium for Top Down P, Linial M, Goodlett D,

Langridge-Smith P, et al. Proteoform: a single term describing protein complexity.

Nature Methods 2013; 10:186.

35. Walsh CT, Garneau-Tsodikova S, Gatto GJ, Jr. Protein posttranslational modifications:

the chemistry of proteome diversifications. Angew Chem Int Ed Engl 2005; 44:7342-72.

36. Fu J, Wu M, Liu X. Proteomic approaches beyond expression profiling and PTM

analysis. Anal Bioanal Chem 2018; 410:4051-60.

37. Li J, Fang B, Kinose F, Bai Y, Kim JY, Chen YA, et al. Target Identification in Small

Cell Lung Cancer via Integrated Phenotypic Screening and Activity-Based Protein

Profiling. Mol Cancer Ther 2016; 15:334-42.

38. Aebersold R, Mann M. Mass spectrometry-based proteomics. Nature 2003; 422:198-207.

39. Issaq HJ, Conrads TP, Janini GM, Veenstra TD. Methods for fractionation, separation

and profiling of proteins and peptides. Electrophoresis 2002; 23:3048-61.

40. Smith LM, Kelleher NL. Proteoform: a single term describing protein complexity. Nat

Methods 2013; 10:186-7.

41. Wilhelm M, Schlegl J, Hahne H, Gholami AM, Lieberenz M, Savitski MM, et al. Mass-

spectrometry-based draft of the human proteome. Nature 2014; 509:582-7.

42. Kim MS, Pinto SM, Getnet D, Nirujogi RS, Manda SS, Chaerkady R, et al. A draft map

of the human proteome. Nature 2014; 509:575-81.

43. Bantscheff M, Lemeer S, Savitski MM, Kuster B. Quantitative mass spectrometry in

proteomics: critical review update from 2007 to the present. Anal Bioanal Chem 2012;

404:939-65.

44. Domon B, Aebersold R. Options and considerations when selecting a quantitative

proteomics strategy. Nat Biotechnol 2010; 28:710-21.

45. Annesley TM. Ion suppression in mass spectrometry. Clin Chem 2003; 49:1041-4.

46. Rabilloud T, Lelong C. Two-dimensional gel electrophoresis in proteomics: a tutorial. J

Proteomics 2011; 74:1829-41.

47. Sandberg A, Buschmann V, Kapusta P, Erdmann R, Wheelock ÅM. Use of time-resolved

fluorescence to improve sensitivity and dynamic range of gel-based proteomics.

Analytical chemistry 2016; 88:3067-74.

48. Bateman NW, Goulding SP, Shulman NJ, Gadok AK, Szumlinski KK, MacCoss MJ, et

al. Maximizing peptide identification events in proteomic workflows using data-

dependent acquisition (DDA). Mol Cell Proteomics 2014; 13:329-38.

49. Choudhary C, Mann M. Decoding signalling networks by mass spectrometry-based

proteomics. Nat Rev Mol Cell Biol 2010; 11:427-39.

50. Sharma K, D'Souza RC, Tyanova S, Schaab C, Wisniewski JR, Cox J, et al. Ultradeep

human phosphoproteome reveals a distinct regulatory nature of Tyr and Ser/Thr-based

signaling. Cell Rep 2014; 8:1583-94.

51. Engholm-Keller K, Birck P, Storling J, Pociot F, Mandrup-Poulsen T, Larsen MR. TiSH-

-a robust and sensitive global phosphoproteomics strategy employing a combination of

TiO2, SIMAC, and HILIC. J Proteomics 2012; 75:5749-61.

52. Thingholm TE, Jensen ON, Robinson PJ, Larsen MR. SIMAC (sequential elution from

IMAC), a phosphoproteomics strategy for the rapid separation of monophosphorylated

from multiply phosphorylated peptides. Mol Cell Proteomics 2008; 7:661-71.

53. Engholm-Keller K, Larsen MR. Improving the Phosphoproteome Coverage for Limited

Sample Amounts Using TiO2-SIMAC-HILIC (TiSH) Phosphopeptide Enrichment and

Fractionation. Methods Mol Biol 2016; 1355:161-77.

54. Pinkse MW, Uitto PM, Hilhorst MJ, Ooms B, Heck AJ. Selective isolation at the

femtomole level of phosphopeptides from proteolytic digests using 2D-NanoLC-ESI-

MS/MS and titanium oxide precolumns. Anal Chem 2004; 76:3935-43.

55. Palmisano G, Lendal SE, Engholm-Keller K, Leth-Larsen R, Parker BL, Larsen MR.

Selective enrichment of sialic acid-containing glycopeptides using titanium dioxide

chromatography with analysis by HILIC and mass spectrometry. Nat Protoc 2010;

5:1974-82.

56. Kang T, Jensen P, Huang H, Christensen GL, Billestrup N, Larsen MR. Characterization

of the molecular mechanisms underlying Glucose Stimulated Insulin Secretion from

isolated pancreatic beta-cells using PTMomics. Mol Cell Proteomics 2017.

57. Humphrey SJ, Azimifar SB, Mann M. High-throughput phosphoproteomics reveals in

vivo insulin signaling dynamics. Nat Biotechnol 2015; 33:990-5.

58. Humphrey SJ, Karayel O, James DE, Mann M. High-throughput and high-sensitivity

phosphoproteomics with the EasyPhos platform. Nat Protoc 2018; 13:1897-916.

59. Masuda T, Tomita M, Ishihama Y. Phase transfer surfactant-aided trypsin digestion for

membrane proteome analysis. J Proteome Res 2008; 7:731-40.

60. Di Palma S, Zoumaro-Djayoon A, Peng M, Post H, Preisinger C, Munoz J, et al. Finding

the same needles in the haystack? A comparison of phosphotyrosine peptides enriched by

immuno-affinity precipitation and metal-based affinity chromatography. J Proteomics

2013; 91:331-7.

61. Gronborg M, Kristiansen TZ, Stensballe A, Andersen JS, Ohara O, Mann M, et al. A

mass spectrometry-based proteomic approach for identification of serine/threonine-

phosphorylated proteins by enrichment with phospho-specific antibodies: identification

of a novel protein, Frigg, as a protein kinase A substrate. Mol Cell Proteomics 2002;

1:517-27.

62. Pitt JJ. Principles and Applications of Liquid Chromatography-Mass Spectrometry in

Clinical Biochemistry. Clin Biochem Rev 2009; 30:19-34.

63. Mitulovic G, Mechtler K. HPLC techniques for proteomics analysis--a short overview of

latest developments. Brief Funct Genomic Proteomic 2006; 5:249-60.

64. Mitulović G. New HPLC Techniques for Proteomics Analysis: A Short Overview of

Latest Developments. Journal of Liquid Chromatography & Related Technologies 2015;

38:390-403.

65. Bodenmiller B, Mueller LN, Mueller M, Domon B, Aebersold R. Reproducible isolation

of distinct, overlapping segments of the phosphoproteome. Nat Methods 2007; 4:231-7.

66. Jensen SS, Larsen MR. Evaluation of the impact of some experimental procedures on

different phosphopeptide enrichment techniques. Rapid Commun Mass Spectrom 2007;

21:3635-45.

67. Cox J, Hein MY, Luber CA, Paron I, Nagaraj N, Mann M. Accurate proteome-wide

label-free quantification by delayed normalization and maximal peptide ratio extraction,

termed MaxLFQ. Mol Cell Proteomics 2014; 13:2513-26.

68. Ong SE, Blagoev B, Kratchmarova I, Kristensen DB, Steen H, Pandey A, et al. Stable

isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach

to expression proteomics. Mol Cell Proteomics 2002; 1:376-86.

69. Wiese S, Reidegeld KA, Meyer HE, Warscheid B. Protein labeling by iTRAQ: a new tool

for quantitative mass spectrometry in proteome research. Proteomics 2007; 7:340-50.

70. Thompson A, Schafer J, Kuhn K, Kienle S, Schwarz J, Schmidt G, et al. Tandem mass

tags: a novel quantification strategy for comparative analysis of complex protein mixtures

by MS/MS. Anal Chem 2003; 75:1895-904.

71. Li Z, Adams RM, Chourey K, Hurst GB, Hettich RL, Pan C. Systematic comparison of

label-free, metabolic labeling, and isobaric chemical labeling for quantitative proteomics

on LTQ Orbitrap Velos. J Proteome Res 2012; 11:1582-90.

72. Megger DA, Pott LL, Ahrens M, Padden J, Bracht T, Kuhlmann K, et al. Comparison of

label-free and label-based strategies for proteome analysis of hepatoma cell lines.

Biochim Biophys Acta 2014; 1844:967-76.

73. Fenn JB, Mann M, Meng CK, Wong SF, Whitehouse CM. Electrospray ionization for

mass spectrometry of large biomolecules. Science 1989; 246:64-71.

74. Karas M, Hillenkamp F. Laser desorption ionization of proteins with molecular masses

exceeding 10,000 daltons. Anal Chem 1988; 60:2299-301.

75. Yates JR, Ruse CI, Nakorchevsky A. Proteomics by mass spectrometry: approaches,

advances, and applications. Annu Rev Biomed Eng 2009; 11:49-79.

76. Eliuk S, Makarov A. Evolution of Orbitrap Mass Spectrometry Instrumentation. Annu

Rev Anal Chem (Palo Alto Calif) 2015; 8:61-80.

77. Hager JW, Le Blanc JC. High-performance liquid chromatography-tandem mass

spectrometry with a new quadrupole/linear ion trap instrument. J Chromatogr A 2003;

1020:3-9.

47

78. Morris HR, Paxton T, Dell A, Langhorne J, Berg M, Bordoli RS, et al. High sensitivity

collisionally-activated decomposition tandem mass spectrometry on a novel

quadrupole/orthogonal-acceleration time-of-flight mass spectrometer. Rapid Commun

Mass Spectrom 1996; 10:889-96.

79. Aebersold R, Goodlett DR. Mass spectrometry in proteomics. Chem Rev 2001; 101:269-

95.

80. Mann M, Hendrickson RC, Pandey A. Analysis of proteins and proteomes by mass

spectrometry. Annu Rev Biochem 2001; 70:437-73.

81. Perkins DN, Pappin DJ, Creasy DM, Cottrell JS. Probability-based protein identification

by searching sequence databases using mass spectrometry data. Electrophoresis 1999;

20:3551-67.

82. Eng JK, McCormack AL, Yates JR. An approach to correlate tandem mass spectral data

of peptides with amino acid sequences in a protein database. J Am Soc Mass Spectrom

1994; 5:976-89.

83. Cox J, Mann M. MaxQuant enables high peptide identification rates, individualized

p.p.b.-range mass accuracies and proteome-wide protein quantification. Nature

Biotechnology 2008; 26:1367.

84. Bourmaud A, Gallien S, Domon B. Parallel reaction monitoring using quadrupole-

Orbitrap mass spectrometer: Principle and applications. Proteomics 2016; 16:2146-59.

85. Peterson AC, Russell JD, Bailey DJ, Westphall MS, Coon JJ. Parallel reaction

monitoring for high resolution and high mass accuracy quantitative, targeted proteomics.

Mol Cell Proteomics 2012; 11:1475-88.

48

86. Abbatiello SE, Schilling B, Mani DR, Zimmerman LJ, Hall SC, MacLean B, et al. Large-

Scale Interlaboratory Study to Develop, Analytically Validate and Apply Highly

Multiplexed, Quantitative Peptide Assays to Measure Cancer-Relevant Proteins in

Plasma. Mol Cell Proteomics 2015; 14:2357-74.

87. Addona TA, Abbatiello SE, Schilling B, Skates SJ, Mani DR, Bunk DM, et al. Multi-site

assessment of the precision and reproducibility of multiple reaction monitoring-based

measurements of proteins in plasma. Nat Biotechnol 2009; 27:633-41.

88. Kuhn E, Whiteaker JR, Mani DR, Jackson AM, Zhao L, Pope ME, et al. Interlaboratory

evaluation of automated, multiplexed peptide immunoaffinity enrichment coupled to

multiple reaction monitoring mass spectrometry for quantifying proteins in plasma. Mol

Cell Proteomics 2012; 11:M111.013854.

89. Lange V, Picotti P, Domon B, Aebersold R. Selected reaction monitoring for quantitative

proteomics: a tutorial. Mol Syst Biol 2008; 4:222.

90. Anderson L, Hunter CL. Quantitative mass spectrometric multiple reaction monitoring

assays for major plasma proteins. Mol Cell Proteomics 2006; 5:573-88.

91. Gallien S, Kim SY, Domon B. Large-Scale Targeted Proteomics Using Internal Standard

Triggered-Parallel Reaction Monitoring (IS-PRM). Mol Cell Proteomics 2015; 14:1630-

44.

92. Mohammed Y, Percy AJ, Chambers AG, Borchers CH. Qualis-SIS: automated standard

curve generation and quality assessment for multiplexed targeted quantitative proteomic

experiments with labeled standards. J Proteome Res 2015; 14:1137-46.

49

93. Mani DR, Abbatiello SE, Carr SA. Statistical characterization of multiple-reaction

monitoring mass spectrometry (MRM-MS) assays for quantitative proteomics. BMC

Bioinformatics 2012; 13 Suppl 16:S9.

94. MacLean B, Tomazela DM, Shulman N, Chambers M, Finney GL, Frewen B, et al.

Skyline: an open source document editor for creating and analyzing targeted proteomics

experiments. Bioinformatics 2010; 26:966-8.

95. Venable JD, Dong MQ, Wohlschlegel J, Dillin A, Yates JR. Automated approach for

quantitative analysis of complex peptide mixtures from tandem mass spectra. Nat

Methods 2004; 1:39-45.

96. Sajic T, Liu Y, Aebersold R. Using data-independent, high-resolution mass spectrometry

in protein biomarker research: perspectives and clinical applications. Proteomics Clin

Appl 2015; 9:307-21.

97. Bilbao A, Varesio E, Luban J, Strambio-De-Castillia C, Hopfgartner G, Muller M, et al.

Processing strategies and software solutions for data-independent acquisition in mass

spectrometry. Proteomics 2015; 15:964-80.

98. Hu A, Noble WS, Wolf-Yadlin A. Technical advances in proteomics: new developments

in data-independent acquisition. F1000Res 2016; 5.

99. Rost HL, Rosenberger G, Navarro P, Gillet L, Miladinovic SM, Schubert OT, et al.

OpenSWATH enables automated, targeted analysis of data-independent acquisition MS

data. Nat Biotechnol 2014; 32:219-23.

100. Egertson JD, MacLean B, Johnson R, Xuan Y, MacCoss MJ. Multiplexed peptide

analysis using data-independent acquisition and Skyline. Nat Protoc 2015; 10:887-903.

50

101. Bruderer R, Bernhardt OM, Gandhi T, Miladinovic SM, Cheng LY, Messner S, et al.

Extending the limits of quantitative proteome profiling with data-independent acquisition

and application to acetaminophen-treated three-dimensional liver microtissues. Mol Cell

Proteomics 2015; 14:1400-10.

102. Keller A, Bader SL, Shteynberg D, Hood L, Moritz RL. Automated Validation of Results

and Removal of Fragment Ion Interferences in Targeted Analysis of Data-independent

Acquisition Mass Spectrometry (MS) using SWATHProphet. Mol Cell Proteomics 2015;

14:1411-8.

103. Gillet LC, Navarro P, Tate S, Rost H, Selevsek N, Reiter L, et al. Targeted data

extraction of the MS/MS spectra generated by data-independent acquisition: a new

concept for consistent and accurate proteome analysis. Mol Cell Proteomics 2012;

11:O111.016717.

104. Collins BC, Hunter CL, Liu Y, Schilling B, Rosenberger G, Bader SL, et al. Multi-

laboratory assessment of reproducibility, qualitative and quantitative performance of

SWATH-mass spectrometry. Nature Communications 2017; 8:291.

105. Litichevskiy L, Peckner R, Abelin JG, Asiedu JK, Creech AL, Davis JF, et al. A Library

of Phosphoproteomic and Chromatin Signatures for Characterizing Cellular Responses to

Drug Perturbations. Cell Syst 2018; 6:424-43.e7.

106. Ortea I, Rodriguez-Ariza A, Chicano-Galvez E, Arenas Vacas MS, Jurado Gamez B.

Discovery of potential protein biomarkers of lung adenocarcinoma in bronchoalveolar

lavage fluid by SWATH MS data-independent acquisition and targeted data extraction. J

Proteomics 2016; 138:106-14.

51

107. Parker BL, Yang G, Humphrey SJ, Chaudhuri R, Ma X, Peterman S, et al. Targeted

phosphoproteomics of insulin signaling using data-independent acquisition mass

spectrometry. Sci Signal 2015; 8:rs6.

108. Ludwig C, Gillet L, Rosenberger G, Amon S, Collins BC, Aebersold R. Data-

independent acquisition-based SWATH-MS for quantitative proteomics: a tutorial. Mol

Syst Biol 2018; 14:e8126.

109. Vestbo J, Hurd SS, Agusti AG, Jones PW, Vogelmeier C, Anzueto A, et al. Global

strategy for the diagnosis, management, and prevention of chronic obstructive pulmonary

disease: GOLD executive summary. Am J Respir Crit Care Med 2013; 187:347-65.

110. Agusti A, Calverley PM, Celli B, Coxson HO, Edwards LD, Lomas DA, et al.

Characterisation of COPD heterogeneity in the ECLIPSE cohort. Respir Res 2010;

11:122.

111. Agusti A. The path to personalised medicine in COPD. Thorax 2014; 69:857-64.

112. Ghosh N, Dutta M, Singh B, Banerjee R, Bhattacharyya P, Chaudhury K.

Transcriptomics, proteomics and metabolomics driven biomarker discovery in COPD: an

update. Expert Rev Mol Diagn 2016; 16:897-913.

113. Terracciano R, Pelaia G, Preiano M, Savino R. Asthma and COPD proteomics: current

approaches and future directions. Proteomics Clin Appl 2015; 9:203-20.

114. Casaburi R, Celli B, Crapo J, Criner G, Croxton T, Gaw A, et al. The COPD Biomarker

Qualification Consortium (CBQC). Copd 2013; 10:367-77.

115. Kinnula VL, Vuorinen K, Ilumets H, Rytila P, Myllarniemi M. Thiol proteins, redox

modulation and parenchymal lung disease. Curr Med Chem 2007; 14:213-22.

52

116. Kinnula VL, Paakko P, Soini Y. Antioxidant enzymes and redox regulating thiol proteins

in malignancies of human lung. FEBS Lett 2004; 569:1-6.

117. Ohlmeier S, Mazur W, Linja-Aho A, Louhelainen N, Ronty M, Toljamo T, et al. Sputum

proteomics identifies elevated PIGR levels in smokers and mild-to-moderate COPD. J

Proteome Res 2012; 11:599-608.

118. Wines BD, Hogarth PM. IgA receptors in health and disease. Tissue Antigens 2006;

68:103-14.

119. Baraniuk JN, Casado B, Pannell LK, McGarvey PB, Boschetto P, Luisetti M, et al.

Protein networks in induced sputum from smokers and COPD patients. Int J Chron

Obstruct Pulmon Dis 2015; 10:1957-75.

120. Titz B, Sewer A, Schneider T, Elamin A, Martin F, Dijon S, et al. Alterations in the

sputum proteome and transcriptome in smokers and early-stage COPD subjects. J

Proteomics 2015; 128:306-20.

121. Wheelock CE, Goss VM, Balgoma D, Nicholas B, Brandsma J, Skipp PJ, et al.

Application of 'omics technologies to biomarker discovery in inflammatory lung diseases.

Eur Respir J 2013; 42:802-25.

122. Fumagalli M, Dolcini L, Sala A, Stolk J, Fregonese L, Ferrari F, et al. Proteomic analysis

of exhaled breath condensate from single patients with pulmonary emphysema associated

to alpha1-antitrypsin deficiency. J Proteomics 2008; 71:211-21.

123. Fumagalli M, Ferrari F, Luisetti M, Stolk J, Hiemstra PS, Capuano D, et al. Profiling the

proteome of exhaled breath condensate in healthy smokers and COPD patients by LC-

MS/MS. Int J Mol Sci 2012; 13:13894-910.

53

124. Lopez-Sanchez LM, Jurado-Gamez B, Feu-Collado N, Valverde A, Canas A, Fernandez-

Rueda JL, et al. Exhaled breath condensate biomarkers for the early diagnosis of lung

cancer using proteomics. Am J Physiol Lung Cell Mol Physiol 2017; 313:L664-l76.

125. Huang da W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large

gene lists using DAVID bioinformatics resources. Nat Protoc 2009; 4:44-57.

126. Huang da W, Sherman BT, Lempicki RA. Bioinformatics enrichment tools: paths toward

the comprehensive functional analysis of large gene lists. Nucleic Acids Res 2009; 37:1-

13.

127. Diederen BMW, van der Valk PDLPM, Kluytmans JAWJ, Peeters MF, Hendrix R. The

role of atypical respiratory pathogens in exacerbations of chronic obstructive pulmonary

disease. European Respiratory Journal 2007; 30:240-4.

128. Mutlu GM, Garey KW, Robbins RA, Danziger LH, Rubinstein I. Collection and analysis

of exhaled breath condensate in humans. Am J Respir Crit Care Med 2001; 164:731-7.

129. Bloemen K, Hooyberghs J, Desager K, Witters E, Schoeters G. Non-invasive biomarker

sampling and analysis of the exhaled breath proteome. Proteomics Clin Appl 2009;

3:498-504.

130. Franciosi L, Govorukhina N, Fusetti F, Poolman B, Lodewijk ME, Timens W, et al.

Proteomic analysis of human epithelial lining fluid by microfluidics-based nanoLC-

MS/MS: a feasibility study. Electrophoresis 2013; 34:2683-94.

131. Franciosi L, Govorukhina N, Ten Hacken N, Postma D, Bischoff R. Proteomics of

epithelial lining fluid obtained by bronchoscopic microprobe sampling. Methods Mol

Biol 2011; 790:17-28.

54

132. Franciosi L, Postma DS, van den Berge M, Govorukhina N, Horvatovich PL, Fusetti F, et

al. Susceptibility to COPD: differential proteomic profiling after acute smoking. PLoS

One 2014; 9:e102037.

133. Lunardi F, Villano G, Perissinotto E, Agostini C, Rea F, Gnoato M, et al. Overexpression

of SERPIN B3 promotes epithelial proliferation and lung fibrosis in mice. Lab Invest

2011; 91:945-54.

134. van der Toorn M, Smit-de Vries MP, Slebos DJ, de Bruin HG, Abello N, van Oosterhout

AJ, et al. Cigarette smoke irreversibly modifies glutathione in airway epithelial cells. Am

J Physiol Lung Cell Mol Physiol 2007; 293:L1156-62.

135. Kwon HS, Bae YJ, Moon KA, Lee YS, Lee T, Lee KY, et al. Hyperoxidized

peroxiredoxins in peripheral blood mononuclear cells of asthma patients is associated

with asthma severity. Life Sci 2012; 90:502-8.

136. Govender P, Dunn MJ, Donnelly SC. Proteomics and the lung: Analysis of

bronchoalveolar lavage fluid. Proteomics Clin Appl 2009; 3:1044-51.

137. Wattiez R, Falmagne P. Proteomics of bronchoalveolar lavage fluid. J Chromatogr B

Analyt Technol Biomed Life Sci 2005; 815:169-78.

138. Plymoth A, Lofdahl CG, Ekberg-Jansson A, Dahlback M, Broberg P, Foster M, et al.

Protein expression patterns associated with progression of chronic obstructive pulmonary

disease in bronchoalveolar lavage of smokers. Clin Chem 2007; 53:636-44.

139. Tu C, Mammen MJ, Li J, Shen X, Jiang X, Hu Q, et al. Large-scale, ion-current-based

proteomics investigation of bronchoalveolar lavage fluid in chronic obstructive

pulmonary disease patients. J Proteome Res 2014; 13:627-39.

55

140. Yang M, Kohler M, Heyder T, Forsslund H, Garberg HK, Karimi R, et al. Proteomic

profiling of lung immune cells reveals dysregulation of phagocytotic pathways in female-

dominated molecular COPD phenotype. Respir Res 2018; 19:39.

141. Kohler M, Sandberg A, Kjellqvist S, Thomas A, Karimi R, Nyren S, et al. Gender

differences in the bronchoalveolar lavage cell proteome of patients with chronic

obstructive pulmonary disease. J Allergy Clin Immunol 2013; 131:743-51.

142. Balgoma D, Yang M, Sjodin M, Snowden S, Karimi R, Levanen B, et al. Linoleic acid-

derived lipid mediators increase in a female-dominated subphenotype of COPD. Eur

Respir J 2016; 47:1645-56.

143. Anderson NL, Anderson NG. The Human Plasma Proteome. History, Character, and

Diagnostic Prospects 2002; 1:845-67.

144. Tu C, Rudnick PA, Martinez MY, Cheek KL, Stein SE, Slebos RJ, et al. Depletion of

abundant plasma proteins and limitations of plasma proteomics. J Proteome Res 2010;

9:4982-91.

145. Bellei E, Bergamini S, Monari E, Fantoni LI, Cuoghi A, Ozben T, et al. High-abundance

proteins depletion for serum proteomic analysis: concomitant removal of non-targeted

proteins. Amino Acids 2011; 40:145-56.

146. Almazi JG, Pockney P, Gedye C, Smith ND, Hondermarck H, Verrills NM, et al. Cell-

Free DNA Blood Collection Tubes Are Appropriate for Clinical Proteomics: A

Demonstration in Colorectal Cancer. Proteomics Clin Appl 2018; 12:e1700121.

147. Wewer Albrechtsen NJ, Geyer PE, Doll S, Treit PV, Bojsen-Moller KN, Martinussen C,

et al. Plasma Proteome Profiling Reveals Dynamics of Inflammatory and Lipid

56

Homeostasis Markers after Roux-En-Y Gastric Bypass Surgery. Cell Syst 2018; 7:601-

12.e3.

148. Bozinovski S, Hutchinson A, Thompson M, Macgregor L, Black J, Giannakis E, et al.

Serum amyloid a is a biomarker of acute exacerbations of chronic obstructive pulmonary

disease. Am J Respir Crit Care Med 2008; 177:269-78.

149. Gomes-Alves P, Imrie M, Gray RD, Nogueira P, Ciordia S, Pacheco P, et al. SELDI-TOF

biomarker signatures for cystic fibrosis, asthma and chronic obstructive pulmonary

disease. Clin Biochem 2010; 43:168-77.

150. Merali S, Barrero CA, Bowler RP, Chen DE, Criner G, Braverman A, et al. Analysis of

the plasma proteome in COPD: Novel low abundance proteins reflect the severity of lung

remodeling. Copd 2014; 11:177-89.

151. Diao W, Shen N, Du Y, Sun X, Liu B, Xu M, et al. Identification of thyroxine-binding

globulin as a candidate plasma marker of chronic obstructive pulmonary disease. Int J

Chron Obstruct Pulmon Dis 2017; 12:1549-64.

152. Sarinc Ulasli S, Bozbas SS, Ozen ZE, Ozyurek BA, Ulubay G. Effect of thyroid function

on COPD exacerbation frequency: a preliminary study. Multidiscip Respir Med 2013;

8:64.

153. Terzano C, Romani S, Paone G, Conti V, Oriolo F. COPD and thyroid dysfunctions.

Lung 2014; 192:103-9.

154. Pereira W, Jr., Kovnat DM, Snider GL. A prospective cooperative study of complications

following flexible fiberoptic bronchoscopy. Chest 1978; 73:813-6.

57

155. Neuman Y, Koslow M, Matveychuk A, Bar-Sef A, Guber A, Shitrit D. Increased

hypoxemia in patients with COPD and pulmonary hypertension undergoing

bronchoscopy with biopsy. Int J Chron Obstruct Pulmon Dis 2015; 10:2627-32.

156. Ishikawa N, Ohlmeier S, Salmenkivi K, Myllarniemi M, Rahman I, Mazur W, et al.

Hemoglobin alpha and beta are ubiquitous in the human lung, decline in idiopathic

pulmonary fibrosis but not in COPD. Respir Res 2010; 11:123.

157. Ohlmeier S, Nieminen P, Gao J, Kanerva T, Ronty M, Toljamo T, et al. Lung tissue

proteomics identifies elevated transglutaminase 2 levels in stable chronic obstructive

pulmonary disease. Am J Physiol Lung Cell Mol Physiol 2016; 310:L1155-65.

158. Ahrman E, Hallgren O, Malmstrom L, Hedstrom U, Malmstrom A, Bjermer L, et al.

Quantitative proteomic characterization of the lung extracellular matrix in chronic

obstructive pulmonary disease and idiopathic pulmonary fibrosis. J Proteomics 2018;

189:23-33.

159. Naba A, Clauser KR, Ding H, Whittaker CA, Carr SA, Hynes RO. The extracellular

matrix: Tools and insights for the "omics" era. Matrix Biol 2016; 49:10-24.

160. Naba A, Clauser KR, Hoersch S, Liu H, Carr SA, Hynes RO. The matrisome: in silico

definition and in vivo characterization by proteomics of normal and tumor extracellular

matrices. Mol Cell Proteomics 2012; 11:M111.014647.

161. Deutsch EW, Csordas A, Sun Z, Jarnuczak A, Perez-Riverol Y, Ternent T, et al. The

ProteomeXchange consortium in 2017: supporting the cultural change in proteomics

public data deposition. Nucleic Acids Res 2017; 45:D1100-d6.

58

162. Verrills NM, Irwin JA, He XY, Wood LG, Powell H, Simpson JL, et al. Identification of

novel diagnostic biomarkers for asthma and chronic obstructive pulmonary disease. Am J

Respir Crit Care Med 2011; 183:1633-43.

163. Ito E, Oka R, Ishii T, Korekane H, Kurimoto A, Kizuka Y, et al. Fucosylated surfactant

protein-D is a biomarker candidate for the development of chronic obstructive pulmonary

disease. J Proteomics 2015; 127:386-94.

164. Schiller HB, Fernandez IE, Burgstaller G, Schaab C, Scheltema RA, Schwarzmayr T, et

al. Time- and compartment-resolved proteome profiling of the extracellular niche in lung

injury and repair. Mol Syst Biol 2015; 11.

165. Mouratis MA, Aidinis V. Modeling pulmonary fibrosis with bleomycin. Curr Opin Pulm

Med 2011; 17:355-61.

166. Veit G, Kobbe B, Keene DR, Paulsson M, Koch M, Wagener R. Collagen XXVIII, a

novel von Willebrand factor A domain-containing protein with many imperfections in the

collagenous domain. J Biol Chem 2006; 281:3494-504.

167. Colombatti A, Spessotto P, Doliana R, Mongiat M, Bressan GM, Esposito G. The

EMILIN/Multimerin family. Front Immunol 2011; 2:93.

168. Mongiat M, Ligresti G, Marastoni S, Lorenzon E, Doliana R, Colombatti A. Regulation

of the extrinsic apoptotic pathway by the extracellular matrix glycoprotein EMILIN2.

Mol Cell Biol 2007; 27:7176-87.

169. Kramer A, Green J, Pollard J, Jr., Tugendreich S. Causal analysis approaches in

Ingenuity Pathway Analysis. Bioinformatics 2014; 30:523-30.

170. Needham EJ, Parker BL, Burykin T, James DE, Humphrey SJ. Illuminating the dark

phosphoproteome. Sci Signal 2019; 12.

59

171. Degryse S, de Bock CE, Demeyer S, Govaerts I, Bornschein S, Verbeke D, et al. Mutant

JAK3 phosphoproteomic profiling predicts synergism between JAK3 inhibitors and

MEK/BCL2 inhibitors for the treatment of T-cell acute lymphoblastic leukemia.

Leukemia 2017.

172. Murray HC, Dun MD, Verrills NM. Harnessing the power of proteomics for

identification of oncogenic, druggable signalling pathways in cancer. Expert Opin Drug

Discov 2017; 12:431-47.

173. Beckett EL, Stevens RL, Jarnicki AG, Kim RY, Hanish I, Hansbro NG, et al. A new

short-term mouse model of chronic obstructive pulmonary disease identifies a role for

mast cell tryptase in pathogenesis. J Allergy Clin Immunol 2013; 131:752-62.

174. Franklin BS, Bossaller L, De Nardo D, Ratter JM, Stutz A, Engels G, et al. The adaptor

ASC has extracellular and 'prionoid' activities that propagate inflammation. Nat Immunol

2014; 15:727-37.

175. Tay HL, Kaiko GE, Plank M, Li J, Maltby S, Essilfie AT, et al. Antagonism of miR-328

increases the antimicrobial function of macrophages and neutrophils and rapid clearance

of non-typeable Haemophilus influenzae (NTHi) from infected lung. PLoS Pathog 2015;

11:e1004549.

176. Hansbro PM, Hamilton MJ, Fricker M, Gellatly SL, Jarnicki AG, Zheng D, et al.

Importance of mast cell Prss31/transmembrane tryptase/tryptase-gamma in lung function

and experimental chronic obstructive pulmonary disease and colitis. J Biol Chem 2014;

289:18214-27.

60

177. Haw TJ, Starkey MR, Nair PM, Pavlidis S, Liu G, Nguyen DH, et al. A pathogenic role

for tumor necrosis factor-related apoptosis-inducing ligand in chronic obstructive

pulmonary disease. Mucosal Immunol 2016; 9:859-72.

178. Liu G, Cooley MA, Jarnicki AG, Hsu AC, Nair PM, Haw TJ, et al. Fibulin-1 regulates

the pathogenesis of tissue remodeling in respiratory diseases. JCI Insight 2016; 1.

179. Jarnicki AG, Schilter H, Liu G, Wheeldon K, Essilfie AT, Foot JS, et al. The inhibitor of

semicarbazide-sensitive amine oxidase, PXS-4728A, ameliorates key features of chronic

obstructive pulmonary disease in a mouse model. Br J Pharmacol 2016; 173:3161-75.

180. Hsu ACY, Dua K, Starkey MR, Haw TJ, Nair PM, Nichol K, et al. MicroRNA-125a and

-b inhibit A20 and MAVS to promote inflammation and impair antiviral response in

COPD. JCI Insight; 2.

181. Fricker M, Goggins BJ, Mateer S, Jones B, Kim RY, Gellatly SL, et al. Chronic cigarette

smoke exposure induces systemic hypoxia that drives intestinal dysfunction. JCI Insight;

3.

182. Fricker M, Deane A, Hansbro PM. Animal models of chronic obstructive pulmonary

disease. Expert Opin Drug Discov 2014; 9:629-45.

183. Starkey MR, Jarnicki AG, Essilfie AT, Gellatly SL, Kim RY, Brown AC, et al. Murine

models of infectious exacerbations of airway inflammation. Curr Opin Pharmacol 2013;

13:337-44.

184. Agusti A, Sobradillo P, Celli B. Addressing the complexity of chronic obstructive

pulmonary disease: from phenotypes and biomarkers to scale-free networks, systems

biology, and P4 medicine. Am J Respir Crit Care Med 2011; 183:1129-37.

61

185. Li CX, Wheelock CE, Skold CM, Wheelock AM. Integration of multi-omics datasets

enables molecular classification of COPD. Eur Respir J 2018; 51.

186. Specht H, Slavov N. Transformative Opportunities for Single-Cell Proteomics. J

Proteome Res 2018; 17:2565-71.

62

Chapter 2

Deep time-resolved proteomic profiling of cigarette smoke-induced chronic obstructive pulmonary disease.

In preparation for submission: The Journal of Allergy and Clinical Immunology

Authors: David A. Skerrett-Byrne,1,2 Heather C. Murray,1,3 M. Fairuz B. Jamaluddin,1,3 Brett Nixon,1,4 Elizabeth G. Bromfield,1,4 Peter A.B. Wark,1,3 Rodney J. Scott,1,4 Matthew D. Dun,1,3,5# and Philip M. Hansbro1,2,5#

Affiliations: 1 School of Biomedical Sciences and Pharmacy, University of Newcastle, Callaghan, NSW, Australia, 2 Hunter Medical Research Institute, VIVA Program, Newcastle, NSW, Australia, 3 Hunter Medical Research Institute, Cancer Research Program, Newcastle, NSW, Australia, 4 Priority Research Centre for Reproductive Science, School of Environmental and Life Sciences, University of Newcastle, Callaghan, NSW 2308, Australia, 5 Centre of Inflammation, Centenary Institute, and University of Sydney, Sydney, NSW, Australia; # Authors contributed equally


Chapter 2: Overview

While chapter one outlined what is known so far regarding changes to the proteome induced by COPD, there remains a considerable gap in knowledge surrounding the pathogenesis of the disease and the proteins that contribute to it. Here, using a well-established mouse model of cigarette smoke-induced COPD, we have performed comparative and quantitative proteomics using several techniques outlined in chapter one. Importantly, we have progressively tracked alterations in protein expression in both the induction and the progression phases of the disease by analysing lung tissue at four distinct time points: 4, 6, 8 and 12 weeks.

The data presented in this chapter highlight the wealth of protein changes that occur throughout the development of COPD and reveal the 8 week time point, in our model, as a key stage of protein dysregulation. This corresponds to an important window of rapid decline in lung tissue and the onset of the irreversible symptoms of COPD. Through the identification of >7,200 proteins in lung tissue, we have now provided the inventory required to gain a molecular understanding of what occurs at this stage of the disease. Through this study, we hope to contribute to the design of novel therapeutics and the identification of temporal biomarkers to unravel the complexity of COPD.


2.1 Abstract:

Chronic obstructive pulmonary disease (COPD) is the third leading cause of death worldwide. Treatment of COPD is restricted to symptom control, with no therapeutic intervention currently available to reverse the lung tissue damage induced by COPD that leads to the fatal course of the disease. To explore the molecular changes important in the pathogenesis of COPD, we have employed a comparative and quantitative label-based proteomic approach to map the progressive changes in the lung tissue proteome, using a clinically relevant mouse model of cigarette smoke-induced COPD. Quantification of 7,324 proteins was achieved across the four key time points of induction and progression of the model and compared to age-matched controls. Hierarchical clustering revealed modulated protein expression linked to cell death and survival, and to cellular assembly, organisation, function and maintenance. Progressive alterations in protein expression patterns were observed in mice at the induction phase, with 18 and 16 protein changes at the 4 and 6 week time points, respectively. Strikingly, 270 proteins showed altered expression at the 8 week progressive stage of COPD, which resolved to 27 changes at 12 weeks of cigarette smoke exposure, highlighting the 8 week time point in our model as an important stage of disease progression.

Proteins potentially implicated in disease progression included heterogeneous nuclear ribonucleoproteins C1/C2 (HNRNPC) and RNA-binding protein Musashi homolog 2 (MSI2), both critical for RNA biosynthesis, and protein S100-A1 (S100A1), a novel calcium signalling protein with the propensity to signal through TLR4 and modulate downstream inflammatory response pathways. These proteins were validated in relevant human patient endobronchial biopsies. Herein, we have provided a detailed inventory of proteins and potential biomarkers associated with the progressive decline of lung tissue induced by cigarette smoke. Moreover, we propose several candidates that may be involved in the pathogenesis of COPD.


Graphical abstract


2.2 Introduction

Chronic obstructive pulmonary disease (COPD) is the third leading cause of death globally.1, 2 Classified as a complex heterogeneous respiratory disorder, it is characterised by chronic pulmonary inflammation and progressive thickening and narrowing of the airways, with destruction of the lung parenchyma leading to a rapid decline in lung function.3 The leading cause of COPD is cigarette smoke (CS), which is associated with >80% of all diagnoses.2, 4 In developing countries, smoking rates continue to rise,5, 6 and so does the prevalence of COPD.5 Additionally, air pollution, environmental smoke exposure and genetic factors such as alpha-1 antitrypsin deficiency7 are known to instigate the progressive and destructive pulmonary changes that cause COPD in non-smoking patients.5, 6

Standard treatments for COPD are focused on symptomatic control, with treatment modalities currently unable to halt the progression of the disease.8-10 Despite emerging therapeutics that aim to reduce airway inflammation, such as antagonists of the cytokine receptors driving neutrophil chemotaxis,11 there is a vital need to characterise the molecular drivers of COPD. This will help inform future diagnoses and aid in the development of new treatments. COPD-affected lung tissue has thus far been well characterised at the genomic and transcriptomic levels; however, limited information is currently available on the progressive changes in the lung tissue proteome.12-15 This is mostly attributed to the large amount of fresh tissue required for accurate quantification of protein expression using proteomics-based mass spectrometry techniques, and to the limited accessibility of healthy or normal tissue to act as a direct comparison.


Given these challenges, sophisticated animal models have emerged to circumvent the limited accessibility of early stage disease tissue and healthy controls. These powerful research tools provide us with access to lung tissues exhibiting the hallmark features of COPD that can be investigated in a multi-organ system in a reasonable timeframe.3, 16 Our laboratory has established a novel mouse model of CS-induced COPD that recapitulates human clinical features such as airway remodelling, chronic pulmonary inflammation, emphysema, impaired lung function and mucus hypersecretion in eight weeks.17 Herein, we have utilised this unique model coupled with high resolution comparative and quantitative proteomics to characterise the pulmonary proteome at both the induction and progression phases of CS-induced COPD. We have revealed a critical juncture in the instigation of COPD, where alterations in the machinery responsible for RNA biogenesis and in damage-associated molecular patterns (DAMPs) may play an important role in pathogenesis, and which requires detailed and focused future investigation.


2.3 Methods

2.3.1 Ethics statement

This study was performed in strict accordance with the recommendations in the Australian code of practice for the care and use of animals for scientific purposes issued by the National Health and Medical Research Council of Australia. All protocols were approved by the Animal Care and Ethics Committee of The University of Newcastle, Australia.

2.3.2 Experimental COPD

Female, 7-8-week-old, WT BALB/c mice were exposed to normal air or cigarette smoke (CS) through the nose only for up to twelve weeks as previously described.17 The design comprised four time points, each with two treatment groups, and six mice within each treatment group. Mice were housed under a 12-hour light/dark cycle and had free access to food (standard chow) and water. After a period of acclimatization (up to 5 days), mice were randomly placed into experimental groups and exposed to either normal air or nose-only inhalation of CS (Supplementary Figure E2.1).16, 17 Weights were recorded every two days (Supplementary Figure E2.1.B). In recent years, some studies have shown that COPD prevalence and mortality are higher in females, and in the USA in 2009 women accounted for 53% of COPD deaths.18 It is for these and logistical reasons that female mice were used. In this experimental setup, the decision was made to investigate the global changes occurring in the lung by focusing on whole lung homogenates rather than on one specific cell type present in the lung. The rationale was that whole lung and isolated cells shed light on different but important aspects of COPD pathogenesis, with the whole lung providing a broader, global picture of disease progression for proteomic studies. Further work will be able to map single-cell omics data to the generated profiles.
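As a minimal sketch of the group structure described above, the following Python snippet enumerates one record per animal. The time points (4, 6, 8 and 12 weeks), the two exposures and the six mice per group are taken from the text; the function name, field names and replicate numbering are illustrative assumptions, not part of the study protocol.

```python
from itertools import product

TIME_POINTS_WEEKS = (4, 6, 8, 12)               # time points stated in the text
TREATMENTS = ("normal air", "cigarette smoke")  # two treatment groups per time point
MICE_PER_GROUP = 6                              # six mice per treatment group


def build_design():
    """Return one record per mouse (field names are hypothetical)."""
    return [
        {"weeks": weeks, "treatment": treatment, "replicate": replicate}
        for weeks, treatment in product(TIME_POINTS_WEEKS, TREATMENTS)
        for replicate in range(1, MICE_PER_GROUP + 1)
    ]


design = build_design()
n_groups = len({(m["weeks"], m["treatment"]) for m in design})
print(n_groups, len(design))  # 8 groups, 48 animals in total
```

Under these assumptions the design contains eight groups and 48 animals in total.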


2.3.3 Pulmonary Inflammation

Airway inflammation was confirmed through differential enumeration of inflammatory cells in bronchoalveolar lavage fluid.2, 19, 20 Lung sections were stained with periodic acid-Schiff and tissue inflammation was assessed by enumeration of inflammatory cells (Supplementary Figure E2.1.D).17, 19, 20 Histopathological score was determined in lung sections stained with hematoxylin and eosin (H&E) based on well-established criteria.19-21

2.3.4 Alveolar Enlargement

Lung sections were stained with H&E. Alveolar septal damage and diameter were assessed using the destructive index technique 22 and mean linear intercept technique, respectively (Supplementary Figure E2.1.E).2, 17, 23, 24

2.3.5 Lung Function

Mice were anaesthetized with ketamine (100 mg/kg) and xylazine (10 mg/kg, Troy Laboratories, Smithfield, Australia) prior to tracheostomy. Tracheas were then cannulated and attached to a Buxco® Forced Maneuvers system apparatus (DSI, St. Paul, Minnesota, USA) to assess total lung capacity (Supplementary Figure E2.1.C).17, 19 Mice were then attached to a FlexiVent apparatus (FX1 System; SCIREQ, Montreal, Canada) to assess transpulmonary resistance (tidal volume of 8 mL/kg at a respiratory rate of 450 breaths/min).17 All assessments were performed at least three times and the average was calculated for each mouse.

2.3.6 Mouse lung tissue sample preparation for proteomic analysis

For each time point (4, 6, 8 and 12 weeks),17 the lungs of four mice from each experimental group (normal air or CS) were perfused with tris-buffered saline supplemented with protease (Sigma) and phosphatase inhibitors (Roche, Complete EDTA free) and extracted for proteomic analysis (Figure 2.1.A). Lung tissues were homogenised in 100 µL of ice-cold 0.1 M Na2CO3 containing protease and phosphatase inhibitors, using the FastPrep-24™ 5G (MP Biomedical, Santa Ana, CA, USA) with the Cool Prep Adaptor at a speed of 6.5 m/s for 2 min. Samples were then sonicated for 3 x 10 s and incubated for 1 h at 4°C. The homogenates were fractionated into membrane-enriched and soluble proteins by ultracentrifugation (Figure 2.1.B) at 100,000 x g for 90 min at 4°C.25 Both the membrane-enriched pellets and soluble proteins were dissolved to a final concentration of 6 M urea, 2 M thiourea, reduced using 10 mM DTT (30 min, room temperature), alkylated using 20 mM iodoacetamide (30 min, 55°C, in the dark), and subsequently digested with a 1:30 Lys-C/Trypsin Mix (Promega); after 3 h the solution was diluted to below 1 M urea using 50 mM triethylammonium bicarbonate (pH 7.8) and left overnight at 37°C.26 Lipids were precipitated from the membrane-enriched peptides using formic acid. All peptide solutions were desalted and cleaned up using commercial desalting columns (Oasis, Waters). Fluorescent peptide quantification (Qubit protein assay kit, Thermo Fisher Scientific, Carlsbad, CA, USA) was employed and 200 µg of peptide was labelled using chemical isobaric tag based methods (Figure 2.1.C),27 according to the manufacturer’s specifications (SCIEX, iTRAQ). Digestion and isobaric tag labelling efficiency were determined by nano liquid chromatography tandem mass spectrometry (nLC-MS/MS) (described below). Samples were then mixed in a 1:1 ratio and the proteome was enriched using a multi-dimensional strategy.26, 28-30 Proteome populations were desalted using a modified StageTip microcolumn 31 and subjected to offline hydrophilic interaction liquid chromatography (HILIC) prior to high resolution nLC-MS/MS (Figure 2.1.D).

2.3.7 Human subjects and sampling

All experiments were conducted in accordance with Hunter New England Area Health Service Ethics Committee (05/08/10/3.09) and University of Newcastle Safety Committee approvals. All subjects underwent a fibreoptic bronchoscopy in accordance with standard guidelines at John Hunter Hospital. The bronchoscope was inserted into the 3-4th generation airway of the subject and bronchial biopsies were then obtained using biopsy forceps applied under direct vision. Endobronchial biopsies were embedded in optimal cutting temperature compound (ProSci Tech IA018), a cryo-preservative, and stored at -80°C until later proteomic work.

The study cohort comprised 24 subjects, split into four groups: healthy controls (n=6), healthy smokers (n=6), mild-moderate COPD (n=6), and severe-very severe COPD (n=6). COPD patients were stratified into mild-moderate and severe-very severe based on the Global Initiative for Chronic Obstructive Lung Disease (GOLD) criteria,2 number of frequent acute exacerbations, and COPD assessment test (CAT) score. The clinical characteristics of the study subjects are summarised in Table 2.1, and additional clinical features can be found in Supplementary Table 2.2.

TABLE 2.1 Clinical characteristics of the study subjects (proteomic analysis).

Characteristic            Healthy controls   Healthy smokers   Mild-moderate COPD   Severe-very severe COPD
                          (n=6)              (n=6)             (n=6)                (n=6)
Age (years), mean (SD)    55.5 (9.6)         64.2 (10.1)       69.0 (8.4)           76.7 (8.5)
Male (n) : Female (n)     2 : 4              4 : 2             4 : 2                5 : 1
FEV1 (%), mean (SD)       97.3 (11.6)        98.2 (13.7)       66.7 (9.0)           40.5 (11.1)
FAE                       N/A                N/A               < 2                  ≥ 2
CAT score, mean (SD)      N/A                N/A               9.6 (4.3)            16.5 (5.6)
GOLD stage                N/A                N/A               1-2                  3-4

FAE, frequent acute exacerbations; CAT, COPD assessment test.

2.3.8 Human biopsies sample preparation for proteomic analysis

Once thawed, endobronchial biopsies were subjected to the same sample preparation of homogenisation, sonication, reduction, alkylation, digestion, lipid precipitation and quantification as described earlier in the mouse lung tissue sample preparation. However, this protocol was performed without ultracentrifugation, but with the addition of a trichloroacetic acid precipitation step.32 Following quantification, 30µg of peptide was taken from each sample and a heavy-labelled peptide standard (Supplementary Table E2.1) was evenly spiked-in across samples.

2.3.9 LC-MS/MS Analysis

Discovery

Reverse phase nLC-MS/MS was performed on 9-11 HILIC enriched fractions (Supplementary Figure E2.2) for each 8plex, using a Q-Exactive Plus hybrid quadrupole-Orbitrap MS system (Thermo Fisher Scientific) coupled to a Dionex Ultimate 3000RSLC nanoflow HPLC system (Thermo Fisher Scientific). Samples were loaded onto an Acclaim PepMap100 C18 75 μm × 20 mm trap column (Thermo Fisher Scientific) for pre-concentration and online desalting. Separation was then achieved using an EASY-Spray PepMap C18 75 μm × 500 mm column (Thermo Fisher Scientific), employing a linear gradient from 2 to 35% acetonitrile at 300 nl/min over 125 min. The Q-Exactive Plus MS System was operated in full MS/data-dependent acquisition MS/MS mode. The Orbitrap mass analyzer was used at a resolution of 70,000 to acquire full MS with an m/z range of 370–1750, incorporating a target automatic gain control value of 3e6 and maximum fill times of 100 ms. The 20 most intense multiply charged precursors were selected for higher-energy collision dissociation fragmentation with a normalized collisional energy of 32. MS/MS fragments were measured at an Orbitrap resolution of 30,000 using an automatic gain control target of 5e5 and maximum fill times of 120 ms.

For endobronchial biopsies, separation was achieved using an EASY-Spray PepMap™ C18 75 μm × 25 mm column (Thermo Fisher Scientific), employing a stepped gradient at 300 nl/min of 2 to 32% acetonitrile over 95 min, then to 45% over 15 min. The Q-Exactive Plus MS System was operated in full MS/data-dependent acquisition MS/MS mode. The Orbitrap mass analyzer was used at a resolution of 70,000 to acquire full MS with an m/z range of 350–2000, incorporating a target automatic gain control value of 1e6 and maximum fill times of 50 ms. The 20 most intense multiply charged precursors were selected for higher-energy collision dissociation fragmentation with normalized collisional energies of 26 and 30. MS/MS fragments were measured at an Orbitrap resolution of 17,500 using an automatic gain control target of 5e5 and maximum fill times of 120 ms.

2.3.10 Targeted proteomics by parallel reaction monitoring

Peptides were loaded onto an Acclaim PepMap™100 C18 75 μm × 25 mm trap column (Thermo Fisher Scientific) for pre-concentration and online desalting. Separation was then achieved using an EASY-Spray PepMap C18 75 μm × 25 mm column (Thermo Fisher Scientific), employing a linear gradient over 80 min, 5 to 40% acetonitrile at 300 nl/min over 64 min. The Q-Exactive Plus MS System was operated in parallel reaction monitoring (PRM) mode with optimised methods for each peptide target. This was performed using a resolution of 35,000, automatic gain control of 2e5, maximum fill times of 100 ms and a 1.6 m/z isolation window.26

2.3.11 Data analysis

Analysis of COPD mouse model proteome

Database searching of all raw files was performed using Proteome Discoverer 2.1 (Thermo Fisher Scientific). Mascot 2.2.3 and SEQUEST HT were used to search against the Swiss_Prot, Uniprot_mouse database (25,041 sequences, downloaded 11th July 2017) and Uniprot_human database (71,523 sequences, downloaded 5th March 2018). Database searching parameters included up to two missed cleavages (full tryptic digestion), a precursor mass tolerance of 10 p.p.m. and a fragment mass tolerance of 0.02 Da. Cysteine carbamidomethylation was set as a fixed modification while dynamic modifications included acetylation (K), oxidation (M), phospho (S/T), phospho (Y) and iTRAQ 8plex. Interrogation of the corresponding reversed database was also performed to evaluate the false discovery rate of peptide identification using Percolator on the basis of q-values, which were estimated from the target-decoy search approach. To filter out target peptide spectrum matches over the decoy-peptide spectrum matches, a fixed false discovery rate of 1% was set at the peptide level. Each iTRAQ 8plex was analysed individually.26, 29, 30 Normalisation was carried out in the Peptide and Protein Quantifier node, which normalised based on total peptide amount, whereby Proteome Discoverer 2.1 “Sums the peptide group abundances for each sample and determines the maximum sum for all files. The normalization factor is the factor of the sum of the sample and the maximum sum in all files”. Following normalisation, Proteome Discoverer 2.1 scales the normalised abundances based on the channel average, whereby it scales the average of all channels to 100.
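The normalisation and scaling steps described above can be illustrated with a minimal Python sketch. It is not the Proteome Discoverer implementation: the function names, the channel labels and the abundance values are hypothetical, and the normalisation factor is assumed here to be the maximum channel sum divided by the sum of the channel in question, which is one reading of the quoted description.

```python
from typing import Dict, List


def normalise_total_peptide_amount(channels: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Rescale every channel so that its summed abundance equals the largest channel sum."""
    sums = {name: sum(values) for name, values in channels.items()}
    max_sum = max(sums.values())
    # Assumed reading of the quoted text: factor = maximum sum / sum of this channel.
    return {
        name: [value * max_sum / sums[name] for value in values]
        for name, values in channels.items()
    }


def scale_to_average_100(row: List[float]) -> List[float]:
    """Scale one protein's abundances so that the mean across channels is 100."""
    mean = sum(row) / len(row)
    return [value * 100.0 / mean for value in row]


# Hypothetical example: two reporter channels, two peptide groups each.
raw = {"113": [10.0, 30.0], "114": [20.0, 60.0]}
normalised = normalise_total_peptide_amount(raw)

# Hypothetical protein measured in two channels after normalisation.
scaled = scale_to_average_100([1.0, 3.0])
```

After normalisation both hypothetical channels carry the same total abundance, so remaining differences between channels reflect relative rather than loading differences.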

Data analysis for targeted proteomics

Peptides were compared to their corresponding human counterpart for sequence conservation. As described previously,26 PRM raw files were processed using Proteome Discoverer 2.1 to create spectral libraries and imported to Skyline (MacCoss Lab, University of Washington).33 The spectral libraries guided the identification of peptide chromatogram peaks. Quantification of peptides was performed using the area under the curve of the MS2 extracted ion chromatogram within Skyline. All peptides were normalised to the spiked-in heavy-labelled peptide mix, to control for variation in peptide amount, and fold changes were generated (patient cohort divided by the average of the healthy control patients) and log transformed.
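As a hedged illustration of the calculation described above, the following Python sketch normalises each endogenous peptide area to its heavy-labelled standard and then expresses each patient value relative to the mean of the healthy controls. The function names and peak areas are hypothetical, and a log2 transformation is assumed because fold changes elsewhere in this chapter are reported in log2 format.

```python
import math
from statistics import mean
from typing import List


def heavy_normalised(light_area: float, heavy_area: float) -> float:
    """Ratio of the endogenous (light) peak area to the spiked-in heavy standard."""
    return light_area / heavy_area


def log2_fold_changes(cohort: List[float], healthy_controls: List[float]) -> List[float]:
    """Divide each patient value by the healthy control mean, then log2 transform."""
    baseline = mean(healthy_controls)
    return [math.log2(value / baseline) for value in cohort]


# Hypothetical MS2 peak areas given as (light, heavy) pairs.
controls = [heavy_normalised(2.0, 2.0), heavy_normalised(3.0, 3.0)]
patients = [heavy_normalised(8.0, 2.0), heavy_normalised(4.0, 2.0)]
fold_changes = log2_fold_changes(patients, controls)
```

Dividing by the heavy standard first means that differences in the amount of peptide injected cancel out before cohorts are compared.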

2.3.12 Bioinformatic analysis and statistics

Protein lists were exported from Proteome Discoverer 2.1 in the form of Excel documents. Using the scaled and normalised abundances, iTRAQ quantification ratios were generated by dividing individual smoke-exposed mice by the average of the control mice (Smoke/Control) in their respective time points. A protein atlas across the time course was generated for the membrane-enriched and soluble iTRAQ experiments by mapping Uniprot accession numbers in Perseus,34 version 1.6.2.3. Ratios were transformed (log2) and histograms were generated (Supplementary Figure E2.3). All references to ratio/fold change are reported in the log2 format. An ANOVA (FDR<0.05) was carried out to identify proteins that were significantly changed in at least one time point. Before reducing the matrices to only ANOVA significant proteins, the missing values imputation algorithm in Perseus was used; the imputed values lie at the lower limit of the intensity scale. The resulting ANOVA significant matrices in Perseus were used for principal component analysis and unsupervised hierarchical clustering to visualise temporal trends and consistency. Volcano plots and graphs were plotted using Prism version 8.0, and Venn diagrams were made using Venny.35 Basic data handling, if not otherwise stated, was carried out using Microsoft Excel® (Version 16.0.4739, Microsoft Corporation, Redmond, WA).

2.3.13 Ingenuity pathway analysis

ANOVA significant lists containing Uniprot accession numbers and transformed ratios were analysed using the Ingenuity® Pathway Analysis software (IPA®, Qiagen) as previously described.26 Canonical pathway analysis, upstream analysis, and disease and function analysis were assessed using two measures: the P-value, an enrichment measurement of the overlap between proteins in the dataset and a particular pathway, function or regulator; and the Z-score, a prediction scoring system of activation or inhibition based upon statistically significant patterns in the dataset and prior biological knowledge manually curated in the Ingenuity Knowledge Base.36 To elucidate the most significant changes in our analyses, we applied stringency criteria of a -log10 P-value > 2 in each time point, and a Z-score of ≤ -2 (inhibited) or ≥ 2 (activated) in at least one time point. For disease and function analysis we restricted our analysis to ‘molecular and cellular functions’ and ‘physiological system development and function’. For analysis of the temporal clusters, the top five most significantly enriched overlapping canonical pathways and molecular and cellular functions were reported.
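The stringency criteria above can be written as a small filter. The sketch below is illustrative only and does not use any IPA interface: the function name, the data layout (time point mapped to a P-value and an optional Z-score) and the example values are hypothetical.

```python
import math
from typing import Dict, Optional, Tuple

# One entry per time point: (enrichment P-value, activation Z-score or None if not predicted).
TimeCourse = Dict[str, Tuple[float, Optional[float]]]


def passes_stringency(results: TimeCourse) -> bool:
    """-log10(P) > 2 in every time point and |Z| >= 2 in at least one time point."""
    enriched_everywhere = all(-math.log10(p) > 2 for p, _ in results.values())
    directional_somewhere = any(
        z is not None and abs(z) >= 2 for _, z in results.values()
    )
    return enriched_everywhere and directional_somewhere


# Hypothetical functions with made-up scores.
kept = passes_stringency({"4wk": (1e-4, 0.5), "8wk": (1e-3, 2.3)})
weak_enrichment = passes_stringency({"4wk": (0.05, 3.0), "8wk": (1e-3, 2.3)})
no_direction = passes_stringency({"4wk": (1e-5, 1.0), "6wk": (1e-5, -1.9)})
```

A -log10 P-value above 2 corresponds to P < 0.01, so the first condition requires enrichment at every time point while the second only requires a confident direction of change at one of them.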

2.3.14 Immunohistochemistry

Immunohistochemistry (IHC) was performed as previously described37 with the following modifications: lung tissue sections underwent deparaffinization and rehydration steps before heat-induced epitope retrieval was carried out in a low pH, citrate-based antigen unmasking solution (Vector Laboratories, California, USA, catalogue number H-3300) using a decloaking chamber (Biocare, West Midlands, United Kingdom) at 105°C for 5 min (SP1), followed by 90°C for 30 s (SP2). The remaining steps were carried out using an ImmPRESS IgG (Peroxidase) Polymer Detection Kit (Vector Laboratories, California, USA) as per the manufacturer’s recommendations. In brief, endogenous peroxidases were quenched using 0.3% H2O2 and 2.5% normal horse serum was applied to the section for blocking. Primary antibodies, listed below, were applied to the sections and stained using DAB Peroxidase (HRP) Substrate Kit (Vector Laboratories, California, USA, catalogue number SK-4100). Finally, tissue sections were counterstained with hematoxylin (Gill’s formulation, Vector Laboratories, California, USA), dehydrated and cleared in xylene before mounting in Ultramount #4 mounting media (Thermo Fisher Scientific, Victoria, Australia). Imaging was performed on an Olympus BX50 microscope with a DP72 digital camera using cellSens software.

The following primary antibodies were used: anti-MSI2 at 1/250 dilution (#AB76148, Abcam), anti-S100 alpha at 1/5000 dilution (#AB11428, Abcam). Negative controls, comprising no primary antibody and isotype control antibodies (Mouse (G3A1) mAb IgG1 Isotype Control, Cell Signaling Technology, Massachusetts, USA, catalogue number #5415), were also performed. All slides were quantified using the open source software QuPath v0.1.2,38 designed for image analysis and digital pathology.

2.4 Results

2.4.1 Proteomic profiling reveals protein expression patterns linked with the progressive decline in lung tissue in a mouse model of CS-induced COPD

Proteomic profiling of mouse lung tissue exposed to air or chronic smoke, during the induction (4 and 6 weeks) and progression (8 and 12 weeks) phases of COPD (Figure 2.1),17 identified 7,324 proteins (FDR ≤ 0.01) across all samples, with an average of 6,034 proteins identified in each time point. After filtering out proteins which had no quantification values or fewer than two unique peptides assigned, a total of 6,220 unique protein groups remained across the time course, with an average of 5,108 unique protein groups per time point. Lung tissue was fractionated into membrane-enriched and soluble protein populations to assist in gaining insights into spatial and membrane dynamics during different phases of the disease.39-41

Principal component analysis revealed a high level of grouping for each lung proteome from each time point (i.e. four mice per treatment, smoked and air, membrane and soluble enrichment for four time points: 4, 6, 8 and 12 weeks) (Figure 2.2.A and Supplementary Figure E2.4). The distribution of proteins identified in each time point was assessed by Venn diagrams (Figure 2.2.B). Notably, approximately 50% of protein identifications were shared across all time points, with 16-20% of proteins identified at a time point. Volcano plots highlight the balance of upregulated and downregulated proteins in each time point and its respective fractionation (Figure 2.2.C). These data revealed that the most significant change in the lung proteome induced by CS occurred at the 8 week time point, where 270 proteins showed significantly altered expression compared to controls (Table 2.2). This time point in our mouse model is where the clinical features of COPD are established and irreversible.17

TABLE 2.2. Proteomic identification summary.

Time point      Total IDs               # Proteins ≥ 0.585    # Proteins ≤ -0.585   Unique protein   Unique protein
                (unique peptides ≥ 2)   fold change,          fold change,          IDs overlap      IDs time course
                                        t-test p ≤ 0.05       t-test p ≤ 0.05
4wk Membrane    4,628                   2                     3                     5,503            6,220
4wk Soluble     4,715                   7                     6
6wk Membrane    4,391                   8                     1                     4,781
6wk Soluble     3,387                   6                     1
8wk Membrane    4,877                   162                   2                     5,235
8wk Soluble     3,600                   103                   3
12wk Membrane   3,379                   3                     11                    4,913
12wk Soluble    4,543                   11                    2

Unique protein IDs overlap is given once per time point, spanning the membrane and soluble rows; unique protein IDs time course is a single value for the whole time course.
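The thresholds used in Table 2.2 can be expressed as a short classification rule. The sketch below assumes the fold change cut-off of 0.585 is on the log2 scale, as stated for all ratios in section 2.3.12, which makes it approximately log2(1.5), i.e. a 1.5-fold change on the linear scale. The function name and the example inputs are hypothetical; the final tally uses the 8 week counts from the table.

```python
import math

LOG2_FC_CUTOFF = 0.585  # approximately log2(1.5)
P_CUTOFF = 0.05


def classify(log2_fold_change: float, p_value: float) -> str:
    """Label a protein as 'up', 'down' or 'unchanged' using the Table 2.2 thresholds."""
    if p_value <= P_CUTOFF and log2_fold_change >= LOG2_FC_CUTOFF:
        return "up"
    if p_value <= P_CUTOFF and log2_fold_change <= -LOG2_FC_CUTOFF:
        return "down"
    return "unchanged"


# Hypothetical proteins: (log2 fold change, t-test p-value).
labels = [classify(1.8, 0.01), classify(-0.6, 0.04), classify(1.8, 0.20)]

# 8 week counts from Table 2.2: membrane (162 up, 2 down) and soluble (103 up, 3 down).
up_8wk = 162 + 103
total_8wk = up_8wk + 2 + 3
```

The tally reproduces the 270 significantly altered proteins reported for the 8 week time point, of which 265 are upregulated.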

In assessing the impact of chronic CS exposure on the proteome of the lung across the time course, 2,828 proteins (membrane-enriched) and 1,176 proteins (soluble) showed significantly altered expression compared to age-matched controls at each time point (ANOVA; FDR < 0.05). Unsupervised hierarchical clustering of significantly altered expression profiles identified unique temporal clusters, which were assigned biological functions and signalling pathways using Ingenuity Pathway Analysis (Figure 2.3.A and C). Here, membrane-enriched cluster 4 revealed 675 overlapping proteins showing increased expression in the progressive 8 week and 12 week stages of COPD. This cluster included eukaryotic initiation factor 2 (EIF2) (p-value: 1.30E-08) and caveolar-mediated endocytosis (p-value: 1.49E-08). In addition, we assigned biological processes and upstream regulators to the consecutive phases of cigarette smoke exposure (Figure 2.3.B and D). In the initial 4 week induction phase of the model, based on the altered expression of 434 and 64 proteins, apoptosis and necrosis, respectively, were predicted to be activated in the membrane-enriched protein profile (Figure 2.3.B). This time point was also associated with an inhibition of cell movement of connective tissue cells (61 proteins) and endothelial cells (32 proteins), formation of the cytoskeleton (72 proteins), and microtubule dynamics (209 proteins). The expression of these proteins demonstrated a shift as the model transitioned to the 6 week time point. This time point was characterised by an activation of immune-related functions such as cellular infiltration, cell movement of leukocytes and immune response to cells. Interestingly, the induction phase of the model also suggested the presence of a high level of cellular stress associated with cell death, inflammation and arrested cell movement dynamics. However, overall during the progression of the disease across these time points there was a predicted increase in ‘activated functions’ including apoptosis of endothelial, fibroblast and lung cells, generation of reactive oxygen species, and dysregulated growth and movement of cells.

Figure 2.3: Heatmap representation of protein pathways and molecular functions influenced by cigarette smoke. Hierarchical clustering analysis of ANOVA significant proteins (Smoke/Control; log2) of the membrane-enriched (A) and soluble populations (C). Clustering identifies unique clusters, listing the top five significant canonical pathways and molecular functions. Hierarchical clustering of activity scores of downstream biological functions and upstream regulators (B, D).

Following on from this analysis we explored upstream regulators and networks, to explain the drivers of this remodelling process. These analyses demonstrated lethal-7 (let-7) to be activated at 4 and 6 weeks and then return to baseline at the progression phase (Figure 2.3.D). Let-7 is a family of microRNAs, the overexpression of which has been linked to delayed tissue repair and decreased movement and proliferation of lung fibroblasts.42, 43 Interestingly microRNA profiling of emphysemic lung tissue found downregulation of let-7 family members correlated with emphysema severity.44 This might explain the loss of activation at the progression phases as emphysema severity increases.

Heterogeneous nuclear ribonucleoproteins C1/C2 (HNRNPC) and RNA-binding protein Musashi homolog 2 (MSI2) were amongst the most overexpressed proteins identified at the 8 week time point (fold change of 3.6 and 1.8, respectively). Additionally, the damage-associated molecular patterns (DAMPs) protein S100-A1 (S100A1) also showed overexpression (fold change of 2.5). Given these unique changes, these candidates were selected for further validation.

Investigation of HNRNPC, MSI2, S100A1 as markers of progressive COPD in clinically relevant cohorts of patients.

To observe the expression of dysregulated proteins in the lung, tissue section analysis of MSI2 and S100A1 was performed using lung sections from the same cohort of mice after 8 weeks of chronic smoke exposure and compared to air controls (Figure 2.4.A and B). Each protein demonstrated significantly increased expression in smoked mice (S100A1 fold change 1.71; MSI2 fold change 0.546). MSI2 and S100A1 were expressed ubiquitously in lung parenchyma but displayed the highest expression at the bronchiole. To further explore the proteins which showed significantly altered expression in our mouse model at the 8 week time point, we performed targeted proteomics by parallel reaction monitoring26 in clinically relevant patient cohorts.

Endobronchial biopsies were collected from 24 patients: 6 healthy controls (HC), 6 healthy smokers (HS), 6 mild-moderate COPD (MC), and 6 severe-very severe COPD (SC) patients (Table 2.1). HNRNPC showed a 1.49 (t-test p = 0.04) and 1.61 (t-test p = 0.04) fold change upregulation in MC and SC patients, respectively, compared to HC (Figure 2.4.C). Expression in human COPD patients was higher than that revealed at the 8 week time point in our CS-induced mouse model of COPD (Figure 2.4.C). Interestingly, HNRNPC expression in HS showed no change (fold change -0.13; t-test p = 0.89), suggesting its expression is not necessarily driven by cigarette exposure alone. Across the six patients in each of the HC and HS groups, MSI2 was undetectable. However, in MC and SC patients MSI2 was detected, suggesting MSI2 is a potential biomarker of both early and late stage human COPD. S100A1 was detected in all human samples, with a moderate increase in expression in MC (fold change 1.22) and SC (fold change 1.74), compared to HC.

[Figure 2.4. Panel A: lung sections (IgG control, 8wk control, 8wk smoke) stained for MSI2 and S100A1. Panel B: MSI2 and S100A1 fold changes by iTRAQ (membrane), iTRAQ (soluble) and IHC. Panel C: Heterogeneous nuclear ribonucleoproteins C1/C2 (0.00, -0.13), RNA-binding protein Musashi homolog 2 (n/d, n/d) and Protein S100-A1 (0.00, 0.25) in patient cohorts.]

2.5 Discussion

COPD and its progression are highly complex, and curative treatments or prevention strategies remain elusive. While progress is being made towards therapeutic goals, a lack of thorough proteomic studies in this field has left a gap in our understanding of the disease and a lack of relevant biomarkers. Here, we have employed a multi-dimensional enrichment strategy to achieve unprecedented depth of the pulmonary proteome, characterising 7,324 unique proteins in lung tissue. Moreover, this study featured several novel aspects. The first is the time-resolved nature of the study, where we examined defined time points of the induction and progression phases of CS-induced COPD. The second is our focus on the progression phases of COPD, which have not been comprehensively studied in previous analyses. Finally, using an ultracentrifugation fractionation technique we performed the first compartment-dependent analysis of lung tissue enduring COPD, which allowed distinctions to be made between the contributions of proteins in the membrane-enriched versus the soluble fractions to the onset of COPD. Together these novel aspects shed light on the dynamic interplay of protein abundance and localisation, providing mechanistic insight into the proteins potentially driving the induction and progression of CS-induced COPD.

Two major risk factors driving the onset and development of COPD are chronic exposure to CS and age. Chronic exposure to CS induces cell death, such as apoptosis and necrosis,46 and oxidative stress through the generation of reactive oxygen species,47 all culminating in persistent inflammation in the lung. The role of ageing is a growing field of interest in understanding COPD pathogenesis, with strong evidence suggesting accelerated ageing in the lung.48 A recent review by Meiners et al. outlined the altered cellular, metabolic and transcriptional hallmarks of ageing in the lung and lung disease including, but not limited to, cellular senescence, epigenetic alterations, genomic instability, and telomere attrition.49, 50 These two risk factors are also explicitly linked: by comparing CS-exposed young and aged mice, John-Schuster et al. demonstrated an age-related increase in susceptibility to CS-induced COPD.48, 51 Strikingly, after eight weeks of chronic CS exposure in our mouse model, the proteomic profiles revealed 270 unique dysregulated proteins, of which 265 were significantly upregulated. This is particularly interesting as this time point recapitulates the well-established features of human COPD, such as airway remodelling, chronic pulmonary inflammation, emphysema, impaired lung function, and mucus hypersecretion. Importantly, these are irreversible hallmarks of COPD.17 Additionally, this time point is representative of the early GOLD stages I/II of COPD,52 allowing our changes to be validated in clinically relevant samples. For these reasons we have focused on the progression phase of the model where, amongst the 265 upregulated proteins characterised at the 8 week time point, we identified a number of candidate proteins potentially playing key roles in COPD pathogenesis: HNRNPC, MSI2 and S100A1.

Proteomic profiling of progressive COPD in our mouse model showed increased expression of proteins involved in RNA biogenesis at 8 weeks. HNRNPC belongs to a family of twenty heterogeneous nuclear ribonucleoproteins with known roles in numerous aspects of RNA biogenesis: the regulation of mRNA metabolism, nucleo-cytoplasmic transport, transcription, translation, and splicing.53 Interestingly, HNRNPC has been shown to be involved in telomerase biogenesis, with its overexpression inducing telomere shortening.54, 55 Telomeres are protective DNA caps which sit at the ends of chromosomes, limiting DNA degradation and maintaining genomic stability.50, 56 Recent longitudinal studies have correlated telomere shortening in lung resident alveolar, endothelial and smooth muscle cells, as well as circulating lymphocytes, directly with disease severity, lung function and mortality in COPD patients.48, 50, 56-58 This correlation with lung function was further supported by a meta-analysis of fourteen studies.59 HNRNPC has been demonstrated in lung epithelial cells to regulate the mRNA stability of urokinase plasminogen activator receptor (uPAR),60 an emerging biomarker of immune activation and inflammation in COPD.61, 62 uPAR has been strongly linked with COPD pathogenesis, and has been shown to be upregulated in alveolar wall cells, pulmonary macrophages and small airway epithelia in COPD patients in comparison with healthy smokers and controls. Its upregulation was significantly correlated with forced expiratory volume in 1 second.63 uPAR has also been implicated in the destruction of alveolar cells and small airways through regulation of the extracellular matrix-degrading matrix metalloproteinases.62, 63 These observations suggest that the marked upregulation of HNRNPC may contribute to the accelerated ageing of the lung tissue and drive the expression of uPAR, which contributes to inflammation and degradation of small airways, making it an interesting future target for novel COPD therapeutics.

MSI2 is one of two members of the Musashi family of RNA binding proteins, containing two RNA recognition motifs, with regulatory roles in cellular proliferation, determination of cell fate, and mRNA translation.64 There is strong evidence supporting the role of increased cellular senescence in COPD lung tissue.48 Cellular senescence is characterised by cells undergoing an irreversible replicative arrest,65 and as the lung ages increased senescence is observed.48 Healthy tissue homeostasis can be perturbed by the accumulation of senescent cells, which secrete pro-inflammatory mediators including DAMPs, increasing the generation of reactive oxygen species and arresting tissue repair.48, 65 Using computational algorithms, transcriptional networks were built from publicly available gene microarray datasets of diverse human lung epithelial cells.66 These gene expression profiles revealed a marked suppression of anti-senescence T-Box (TBX) transcription factors coupled with an increase in TBX-regulated markers of cellular senescence, suggesting an important role in cellular senescence in the context of COPD. These findings were validated in lung tissue samples from COPD GOLD stages 2-3 and healthy smoking subjects by qPCR and immunoblotting. Amongst the TBX transcription factors identified was TBX-1, which has been shown to be translationally repressed by overexpression of MSI2 at both genomic and proteomic levels.67 We observed an upregulation fold change of 1.8 in the soluble profile of MSI2.

MSI2 was recently shown to be upregulated in non-small cell lung cancer, supporting metastasis by modulation of the epithelial–mesenchymal transition (EMT) via transforming growth factor beta (TGF-β) signalling and tight junction proteins.68 Activation of EMT has been suggested to be a driver of progressive fibrosis and remodelling in the airway epithelium in smokers, but further research is required to elucidate this relationship.69 A transcriptome-wide study identified 7,378 distinct genes as targets regulated by MSI2 post-transcriptionally, with roles in cell death, cell survival, DNA repair and replication.64 Small interfering RNA knockdown of MSI demonstrated its regulation of several signalling pathways including epidermal growth factor (EGF), EIF2, hepatocyte growth factor (HGF) and interleukin-6. Both EGF and HGF are known regulators of the ERK/MAPK, JAK/STAT, and PI3K/AKT pathways, all orchestrators of inflammation.70-72 We propose that the marked upregulation of MSI2 at 8 weeks is potentially indicative of its role in the establishment of the clinical features of COPD through its repression of TBX-1 to drive cellular senescence, the modulation of EMT via TGF-β signalling and the regulation of inflammatory pathways via EGF and HGF signalling. Further mechanistic investigations focused on MSI2 are warranted to elucidate its role in COPD pathogenesis.

In response to injury in resident lung cells, DAMPs can be released, mediating an immune response to damaged tissue.48, 73, 74 DAMPs are the subject of emerging interest in understanding COPD pathogenesis, both as a potential driving force of innate and adaptive immune responses75 and because they are strongly implicated in the hallmark features of ageing.76 A number of S100 family members have been extensively characterised as DAMPs, with regulatory roles in cell cycle and proinflammatory activity.76 S100A8, S100A9 and S100A12 have been shown to activate pattern recognition receptors such as toll-like receptor 4 (TLR4) and receptor for advanced glycation end-products (RAGE), both of which drive the activation of nuclear factor-kappaB (NF-κB) signalling,75, 77 which is correlated with airflow limitation.78 Upregulation of these members has been reported in the airways of severe asthma patients79-81 and in the bronchoalveolar fluid, serum and sputum of COPD patients,83-86 and they have been shown to induce mucin 5AC production.82 S100A1 has recently been identified as a DAMP released from necrotic cardiomyocytes upon myocardial infarction.87 Extracellular S100A1 is internalised via caveolin- and clathrin-independent fluid endocytosis, where it can bind to intracellular TLR4, initiating the formation of a signalling complex of S100A1-TLR4 with the TLR adaptor myeloid differentiation primary response gene 88 (MyD88). This in turn leads to the activation of the MAPK and NF-κB pathways, known drivers of inflammation. Importantly, previous research from our lab has shown that Tlr4-/- knockout mice have attenuated airway fibrosis and a marked reduction in apoptosis, emphysema-like alveolar enlargement and lung function impairment.88 Moreover, in the present study we have shown 2.5- and 0.84-fold increased expression of S100A1 in the soluble and membrane-enriched lung profiles, respectively, at the progressive 8 week phase of COPD (Figure 2.4.B). This is coupled with significant enrichment, in the IPA of the 8 week ANOVA-significant proteins, of the canonical pathways caveolar- and clathrin-mediated endocytosis. Toll-like receptor signalling is also listed amongst the enriched canonical pathways, and MyD88 was quantified in our membrane-enriched profile. Further, NF-κB activation by virus signalling was predicted to be active, with abundance changes in 35 known targets (z-score: 1.86) at the 8 week time point. Additionally, activation of ERK/MAPK signalling was identified, with 61 proteins linked to this growth and proliferative signalling pathway. More generally, IPA revealed a significant enrichment for ‘necrosis’ at every time point (Figure 2.3.B and D). S100A1 is also known to interact with RAGE, leading to the activation of the NF-κB and MAPK signalling pathways.89 Interestingly, a RAGE knockout mouse model displayed a protective effect from airway inflammation.90 Given these data, we propose S100A1 to be an important DAMP in our model of CS-induced COPD, leading to the activation of the NF-κB and MAPK pathways, which in turn contributes to pulmonary inflammation. We have additionally shown S100A1 to be upregulated in early and late stages of COPD compared to healthy controls (Figure 2.4.C). These data point to S100A1 as a potential therapeutic target upstream of TLR4 and RAGE, which may be investigated to effectively inhibit the progressive clinical features of COPD.
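To make the reported activation z-score easier to interpret, the following is a minimal sketch that assumes the unweighted form of the statistic described for Ingenuity Pathway Analysis,36 in which each target contributes +1 if its direction of change is consistent with activation and -1 if it is not. The function name and the 23/12 split of the 35 targets are hypothetical and purely illustrative; the actual per-target directions are not reported in this section.

```python
import math


def activation_z_score(n_consistent: int, n_inconsistent: int) -> float:
    """Unweighted activation z-score: (N+ - N-) / sqrt(N).

    n_consistent   -- targets whose change agrees with pathway activation
    n_inconsistent -- targets whose change disagrees with pathway activation
    """
    n_total = n_consistent + n_inconsistent
    if n_total == 0:
        raise ValueError("at least one target with a known direction is required")
    return (n_consistent - n_inconsistent) / math.sqrt(n_total)


# Hypothetical split of 35 targets (illustration only, not data from this study).
z = activation_z_score(23, 12)
print(f"z-score for a 23/12 split of 35 targets: {z:.2f}")
```

Under this assumption, a positive score indicates that more targets changed in the direction expected for activation than against it, and the magnitude grows with the imbalance between the two groups.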

The correlation and validation of HNRNPC, MSI2 and S100A1 in clinically relevant human cohorts not only highlight their potential role in the pathogenesis of COPD, but also add to the validity of the proteomic profiles of the induction phase of this mouse model. Importantly, our laboratory has consistently demonstrated that the exposure of mice to 8 weeks of CS recapitulates the hallmark features of human COPD.2, 17, 19, 23, 88, 91 To our knowledge, this study represents the most comprehensive CS-induced COPD pulmonary proteome to date. It will serve as an important resource that can be further mined to explore the progressive alterations in the proteome of whole lung tissue. Our mouse model provides unique opportunities to therapeutically validate these targets and to investigate these candidates mechanistically for an improved understanding of the pathogenesis of COPD. Future functional studies investigating HNRNPC, MSI2 and S100A1 are now necessary to elucidate the role each plays in COPD.

2.6 References

1. Lozano R, Naghavi M, Foreman K, Lim S, Shibuya K, Aboyans V, et al. Global and

regional mortality from 235 causes of death for 20 age groups in 1990 and 2010: a

systematic analysis for the Global Burden of Disease Study 2010. Lancet 2012;

380:2095-128.

2. Haw TJ, Starkey MR, Nair PM, Pavlidis S, Liu G, Nguyen DH, et al. A pathogenic role

for tumor necrosis factor-related apoptosis-inducing ligand in chronic obstructive

pulmonary disease. Mucosal Immunol 2016; 9:859-72.

3. Jones B, Donovan C, Liu G, Gomez HM, Chimankar V, Harrison CL, et al. Animal

models of COPD: What do they tell us? Respirology 2017; 22:21-32.

4. Eisner MD, Anthonisen N, Coultas D, Kuenzli N, Perez-Padilla R, Postma D, et al. An

official American Thoracic Society public policy statement: Novel risk factors and the

global burden of chronic obstructive pulmonary disease. Am J Respir Crit Care Med

2010; 182:693-718.

5. Bilano V, Gilmour S, Moffiet T, d'Espaignet ET, Stevens GA, Commar A, et al. Global

trends and projections for tobacco use, 1990-2025: an analysis of smoking indicators

from the WHO Comprehensive Information Systems for Tobacco Control. Lancet 2015;

385:966-76.

6. Lopez AD, Shibuya K, Rao C, Mathers CD, Hansell AL, Held LS, et al. Chronic

obstructive pulmonary disease: current burden and future projections. Eur Respir J 2006;

27:397-412.

7. Hazari YM, Bashir A, Habib M, Bashir S, Habib H, Qasim MA, et al. Alpha-1-

antitrypsin deficiency: Genetic variations, clinical manifestations and therapeutic

interventions. Mutat Res 2017; 773:14-25.

8. Barnes PJ. Corticosteroid resistance in patients with asthma and chronic obstructive

pulmonary disease. J Allergy Clin Immunol 2013; 131:636-45.

9. Cazzola M, Page C. Long-acting bronchodilators in COPD: where are we now and where

are we going? Breathe 2014; 10:110-20.

10. Spina D. Pharmacology of novel treatments for COPD: are fixed dose combination

LABA/LAMA synergistic? Eur Clin Respir J 2015; 2.

11. Rennard SI, Dale DC, Donohue JF, Kanniess F, Magnussen H, Sutherland ER, et al.

CXCR2 Antagonist MK-7123. A Phase 2 Proof-of-Concept Trial for Chronic Obstructive

Pulmonary Disease. Am J Respir Crit Care Med 2015; 191:1001-11.

12. Obeidat M, Hao K, Bosse Y, Nickle DC, Nie Y, Postma DS, et al. Molecular mechanisms

underlying variations in lung function: a systems genetics analysis. Lancet Respir Med

2015; 3:782-95.

13. Pillai SG, Ge D, Zhu G, Kong X, Shianna KV, Need AC, et al. A genome-wide

association study in chronic obstructive pulmonary disease (COPD): identification of two

major susceptibility loci. PLoS Genet 2009; 5:e1000421.

14. Sauler M, Lamontagne M, Finnemore E, Herazo-Maya JD, Tedrow J, Zhang X, et al. The

DNA repair transcriptome in severe COPD. Eur Respir J 2018; 52.

15. Hardin M, Silverman EK. Chronic Obstructive Pulmonary Disease Genetics: A Review

of the Past and a Look Into the Future. Chronic Obstr Pulm Dis 2014; 1:33-46.

16. Fricker M, Deane A, Hansbro PM. Animal models of chronic obstructive pulmonary

disease. Expert Opin Drug Discov 2014; 9:629-45.

17. Beckett EL, Stevens RL, Jarnicki AG, Kim RY, Hanish I, Hansbro NG, et al. A new

short-term mouse model of chronic obstructive pulmonary disease identifies a role for

mast cell tryptase in pathogenesis. J Allergy Clin Immunol 2013; 131:752-62.

18. Pinkerton KE, Harbaugh M, Han MK, Jourdan Le Saux C, Van Winkle LS, Martin WJ,

2nd, et al. Women and Lung Disease. Sex Differences and Global Health Disparities. Am

J Respir Crit Care Med 2015; 192:11-6.

19. Hansbro PM, Hamilton MJ, Fricker M, Gellatly SL, Jarnicki AG, Zheng D, et al.

Importance of mast cell Prss31/transmembrane tryptase/tryptase-gamma in lung function

and experimental chronic obstructive pulmonary disease and colitis. J Biol Chem 2014;

289:18214-27.

20. Nair PM, Starkey MR, Haw TJ, Liu G, Horvat JC, Morris JC, et al. Targeting PP2A and

proteasome activity ameliorates features of allergic airway disease in mice. Allergy 2017;

72:1891-903.

21. Horvat JC, Beagley KW, Wade MA, Preston JA, Hansbro NG, Hickey DK, et al.

Neonatal chlamydial infection induces mixed T-cell responses that drive allergic airway

disease. Am J Respir Crit Care Med 2007; 176:556-64.

22. Eidelman DH, Ghezzo H, Kim WD, Cosio MG. The destructive index and early lung

destruction in smokers. Am Rev Respir Dis 1991; 144:156-9.

23. Hsu AC, Starkey MR, Hanish I, Parsons K, Haw TJ, Howland LJ, et al. Targeting PI3K-

p110alpha Suppresses Influenza Virus Infection in Chronic Obstructive Pulmonary

Disease. Am J Respir Crit Care Med 2015; 191:1012-23.

24. Liu G, Cooley MA, Jarnicki AG, Hsu AC, Nair PM, Haw TJ, et al. Fibulin-1 regulates

the pathogenesis of tissue remodeling in respiratory diseases. JCI Insight 2016; 1.

25. Fujiki Y, Hubbard AL, Fowler S, Lazarow PB. Isolation of intracellular membranes by

means of sodium carbonate treatment: application to . J Cell Biol

1982; 93:97-102.

26. Degryse S, de Bock CE, Demeyer S, Govaerts I, Bornschein S, Verbeke D, et al. Mutant

JAK3 phosphoproteomic profiling predicts synergism between JAK3 inhibitors and

MEK/BCL2 inhibitors for the treatment of T-cell acute lymphoblastic leukemia.

Leukemia 2018; 32:788-800.

27. Ross PL, Huang YN, Marchese JN, Williamson B, Parker K, Hattan S, et al. Multiplexed

protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging

reagents. Mol Cell Proteomics 2004; 3:1154-69.

28. Engholm-Keller K, Birck P, Storling J, Pociot F, Mandrup-Poulsen T, Larsen MR.

TiSH--a robust and sensitive global phosphoproteomics strategy employing a combination of

TiO2, SIMAC, and HILIC. J Proteomics 2012; 75:5749-61.

29. Nixon B, De Iuliis GN, Hart HM, Zhou W, Mathe A, Bernstein I, et al. Proteomic

profiling of mouse epididymosomes reveals their contributions to post-testicular sperm

maturation. Mol Cell Proteomics 2018.

30. Nixon B, Johnston SD, Skerrett-Byrne DA, Anderson AL, Stanger SJ, Bromfield EG, et

al. Proteomic profiling of crocodile spermatozoa refutes the tenet that post-testicular

maturation is restricted to mammals. Mol Cell Proteomics 2018.

31. Larsen MR, Cordwell SJ, Roepstorff P. Graphite powder as an alternative or supplement

to reversed-phase material for desalting and concentration of peptide mixtures prior to

matrix-assisted laser desorption/ionization-mass spectrometry. Proteomics 2002; 2:1277-87.

32. Zhao X, Huffman KE, Fujimoto J, Canales JR, Girard L, Nie G, et al. Quantitative

Proteomic Analysis of Optimal Cutting Temperature (OCT) Embedded Core-Needle

Biopsy of Lung Cancer. J Am Soc Mass Spectrom 2017; 28:2078-89.

33. MacLean B, Tomazela DM, Shulman N, Chambers M, Finney GL, Frewen B, et al.

Skyline: an open source document editor for creating and analyzing targeted proteomics

experiments. Bioinformatics 2010; 26:966-8.

34. Tyanova S, Temu T, Sinitcyn P, Carlson A, Hein MY, Geiger T, et al. The Perseus

computational platform for comprehensive analysis of (prote)omics data. Nat Methods

2016; 13:731-40.

35. J.C. O. An interactive tool for comparing lists with Venn’s diagrams.

http://bioinfogp.cnb.csic.es/tools/venny/index.html 2007-2015.

36. Kramer A, Green J, Pollard J, Jr., Tugendreich S. Causal analysis approaches in

Ingenuity Pathway Analysis. Bioinformatics 2014; 30:523-30.

37. Faulkner S, Roselli S, Demont Y, Pundavela J, Choquet G, Leissner P, et al. ProNGF is a

potential diagnostic biomarker for thyroid cancer. Oncotarget 2016; 7:28488-97.

38. Bankhead P, Loughrey MB, Fernandez JA, Dombrowski Y, McArt DG, Dunne PD, et al.

QuPath: Open source software for digital pathology image analysis. Sci Rep 2017;

7:16878.

39. Ahrman E, Hallgren O, Malmstrom L, Hedstrom U, Malmstrom A, Bjermer L, et al.

Quantitative proteomic characterization of the lung extracellular matrix in chronic

obstructive pulmonary disease and idiopathic pulmonary fibrosis. J Proteomics 2018;

189:23-33.

40. Schiller HB, Fernandez IE, Burgstaller G, Schaab C, Scheltema RA, Schwarzmayr T, et

al. Time- and compartment-resolved proteome profiling of the extracellular niche in lung

injury and repair. Mol Syst Biol 2015; 11.

41. Schiller HB, Mayr CH, Leuschner G, Strunz M, Staab-Weijnitz C, Preisendorfer S, et al.

Deep Proteome Profiling Reveals Common Prevalence of MZB1-Positive Plasma B Cells

in Human Lung and Skin Fibrosis. Am J Respir Crit Care Med 2017; 196:1298-310.

42. Osei ET, Florez-Sampedro L, Timens W, Postma DS, Heijink IH, Brandsma C-A.

Unravelling the complexity of COPD by microRNAs: it's a small world after all.

European Respiratory Journal 2015; 46:807.

43. Huleihel L, Ben-Yehudah A, Milosevic J, Yu G, Pandit K, Sakamoto K, et al. Let-7d

microRNA affects mesenchymal phenotypic properties of lung fibroblasts. Am J Physiol

Lung Cell Mol Physiol 2014; 306:L534-42.

44. Christenson SA, Brandsma C-A, Campbell JD, Knight DA, Pechkovsky DV, Hogg JC, et

al. miR-638 regulates gene expression networks associated with emphysematous lung

destruction. Genome Medicine 2013; 5:114.

45. Bourmaud A, Gallien S, Domon B. Parallel reaction monitoring using quadrupole-

Orbitrap mass spectrometer: Principle and applications. Proteomics 2016; 16:2146-59.

46. Messner B, Frotschnig S, Steinacher-Nigisch A, Winter B, Eichmair E, Gebetsberger J, et

al. Apoptosis and necrosis: two different outcomes of cigarette smoke condensate-

induced endothelial cell death. Cell Death Dis 2012; 3:e424.

47. Kirkham PA, Barnes PJ. Oxidative stress in COPD. Chest 2013; 144:266-73.

48. Brandsma C-A, de Vries M, Costa R, Woldhuis RR, Königshoff M, Timens W. Lung

ageing and COPD: is there a role for ageing in abnormal tissue repair? European

Respiratory Review 2017; 26:170073.

49. López-Otín C, Blasco MA, Partridge L, Serrano M, Kroemer G. The hallmarks of aging.

Cell 2013; 153:1194-217.

50. Meiners S, Eickelberg O, Königshoff M. Hallmarks of the ageing lung. European

Respiratory Journal 2015; 45:807.

51. John-Schuster G, Gunter S, Hager K, Conlon TM, Eickelberg O, Yildirim AO.

Inflammaging increases susceptibility to cigarette smoke-induced COPD. Oncotarget

2016; 7:30068-83.

52. Liebler DC, Zimmerman LJ. Targeted Quantitation of Proteins by Mass Spectrometry.

Biochemistry 2013; 52:3797-806.

53. Geuens T, Bouhy D, Timmerman V. The hnRNP family: insights into their role in health

and disease. Hum Genet 2016; 135:851-67.

54. Fu D, Collins K. Purification of human telomerase complexes identifies factors involved

in telomerase biogenesis and telomere length regulation. Mol Cell 2007; 28:773-85.

55. Naro C, Bielli P, Pagliarini V, Sette C. The interplay between DNA damage response and

RNA processing: the unexpected role of splicing factors as gatekeepers of genome

stability. Front Genet 2015; 6:142.

56. Córdoba-Lanús E, Cazorla-Rivero S, Espinoza-Jiménez A, de-Torres JP, Pajares MJ,

Aguirre-Jaime A, et al. Telomere shortening and accelerated aging in COPD: findings

from the BODE cohort. 2017; 18:59.

57. Rutten EP, Gopal P, Wouters EF, Franssen FM, Hageman GJ, Vanfleteren LE, et al.

Various Mechanistic Pathways Representing the Aging Process Are Altered in COPD.

Chest 2016; 149:53-61.

58. Lee J, Sandford AJ, Connett JE, Yan J, Mui T, Li Y, et al. The Relationship between

Telomere Length and Mortality in Chronic Obstructive Pulmonary Disease (COPD).

PLOS ONE 2012; 7:e35567.

59. Albrecht E, Sillanpää E, Karrasch S, Alves AC, Codd V, Hovatta I, et al. Telomere length

in circulating leukocytes is associated with lung function and disease. European

Respiratory Journal 2014; 43:983.

60. Shetty S. Regulation of urokinase receptor mRNA stability by hnRNP C in lung epithelial

cells. Mol Cell Biochem 2005; 272:107-18.

61. Godtfredsen NS, Jørgensen DV, Marsaa K, Ulrik CS, Andersen O, Eugen-Olsen J, et al.

Soluble urokinase plasminogen activator receptor predicts mortality in exacerbated

COPD. 2018; 19:97.

62. Desmedt V, Delanghe JR, Speeckaert R, Speeckaert MM. The intriguing role of soluble

urokinase receptor in inflammatory diseases. Critical Reviews in

Clinical Laboratory Sciences 2017; 54:117-33.

63. Zhang Y, Xiao W, Jiang Y, Wang H, Xu X, Ma D, et al. Levels of components of the

urokinase-type plasminogen activator system are related to chronic obstructive

pulmonary disease parenchymal destruction and airway remodelling. J Int Med Res 2012;

40:976-85.

64. Duggimpudi S, Kloetgen A, Maney SK, Munch PC, Hezaveh K, Shaykhalishahi H, et al.

Transcriptome-wide analysis uncovers the targets of the RNA-binding protein MSI2 and

effects of MSI2's RNA-binding activity on IL-6 signaling. J Biol Chem 2018; 293:15359-69.

65. Kirkland JL, Tchkonia T. Cellular Senescence: A Translational Perspective.

EBioMedicine 2017; 21:21-8.

66. Acquaah-Mensah GK, Malhotra D, Vulimiri M, McDermott JE, Biswal S. Suppressed

expression of T-box transcription factors is involved in senescence in chronic obstructive

pulmonary disease. PLoS Comput Biol 2012; 8:e1002597.

67. Sutherland JM, Sobinoff AP, Fraser BA, Redgrove KA, Siddall NA, Koopman P, et al.

RNA binding protein Musashi-2 regulates PIWIL1 and TBX1 in mouse spermatogenesis.

J Cell Physiol 2018; 233:3262-73.

68. Kudinov AE, Deneka A, Nikonova AS, Beck TN, Ahn Y-H, Liu X, et al. Musashi-2

(MSI2) supports TGF-β signaling and inhibits claudins to promote non-small cell lung

cancer (NSCLC) metastasis. 2016; 113:6955-60.

69. Nowrin K, Sohal SS, Peterson G, Patel R, Walters EH. Epithelial-mesenchymal transition

as a fundamental underlying pathogenic process in COPD airways: fibrosis, remodeling

and cancer. Expert Rev Respir Med 2014; 8:547-59.

70. Singh D. P38 inhibition in COPD; cautious optimism. Thorax 2013; 68:705.

71. Yew-Booth L, Birrell MA, Lau MS, Baker K, Jones V, Kilty I, et al. JAK–STAT

pathway activation in COPD. European Respiratory Journal 2015; 46:843.

72. Bozinovski S, Vlahos R, Hansen M, Liu K, Anderson GP. Akt in the pathogenesis of

COPD. Int J Chron Obstruct Pulmon Dis 2006; 1:31-8.

73. Kaczmarek A, Vandenabeele P, Krysko DV. Necroptosis: the release of damage-

associated molecular patterns and its physiological relevance. Immunity 2013; 38:209-23.

74. Tang D, Kang R, Coyne CB, Zeh HJ, Lotze MT. PAMPs and DAMPs: signal 0s that spur

autophagy and immunity. Immunol Rev 2012; 249:158-75.

75. Pouwels SD, Heijink IH, ten Hacken NH, Vandenabeele P, Krysko DV, Nawijn MC, et

al. DAMPs activating innate and adaptive immune responses in COPD. Mucosal

Immunology 2013; 7:215.

76. Huang J, Xie Y, Sun X, Zeh HJ, 3rd, Kang R, Lotze MT, et al. DAMPs, ageing, and

cancer: The 'DAMP Hypothesis'. Ageing Res Rev 2015; 24:3-16.

77. Srikrishna G. S100A8 and S100A9: new insights into their roles in malignancy. J Innate

Immun 2012; 4:31-40.

78. Di Stefano A, Caramori G, Oates T, Capelli A, Lusuardi M, Gnemmi I, et al. Increased

expression of nuclear factor-kappaB in bronchial biopsies from smokers and patients with

COPD. Eur Respir J 2002; 20:556-63.

79. Lee TH, Chang HS, Bae DJ, Song HJ, Kim MS, Park JS, et al. Role of S100A9 in the

development of neutrophilic inflammation in asthmatics and in a murine model. Clin

Immunol 2017; 183:158-66.

80. Lee TH, Jang AS, Park JS, Kim TH, Choi YS, Shin HR, et al. Elevation of S100 calcium

binding protein A9 in sputum of neutrophilic inflammation in severe uncontrolled

asthma. Ann Allergy Asthma Immunol 2013; 111:268-75.e1.

81. Kim K, Kim HJ, Binas B, Kang JH, Chung IY. Inflammatory mediators ATP and

S100A12 activate the NLRP3 inflammasome to induce MUC5AC production in airway

epithelial cells. Biochem Biophys Res Commun 2018; 503:657-64.

82. Kang JH, Hwang SM, Chung IY. S100A8, S100A9 and S100A12 activate airway

epithelial cells to produce MUC5AC via extracellular signal-regulated kinase and nuclear

factor-kappaB pathways. Immunology 2015; 144:79-90.

83. Merkel D, Rist W, Seither P, Weith A, Lenter MC. Proteomic study of human

bronchoalveolar lavage fluids from smokers with chronic obstructive pulmonary disease

by combining surface-enhanced laser desorption/ionization-mass spectrometry profiling

with mass spectrometric protein identification. Proteomics 2005; 5:2972-80.

84. Pouwels SD, Nawijn MC, Bathoorn E, Riezebos-Brilman A, van Oosterhout AJM,

Kerstjens HAM, et al. Increased serum levels of LL37, HMGB1 and S100A9 during

exacerbation in COPD patients. European Respiratory Journal 2015; 45:1482.

85. Cockayne DA, Cheng DT, Waschki B, Sridhar S, Ravindran P, Hilton H, et al. Systemic

biomarkers of neutrophilic inflammation, tissue injury and repair in COPD patients with

differing levels of disease severity. PLoS One 2012; 7:e38629.

86. Lorenz E, Muhlebach MS, Tessier PA, Alexis NE, Duncan Hite R, Seeds MC, et al.

Different expression ratio of S100A8/A9 and S100A12 in acute and chronic lung

diseases. Respir Med 2008; 102:567-73.

87. Rohde D, Schon C, Boerries M, Didrihsone I, Ritterhoff J, Kubatzky KF, et al. S100A1 is

released from ischemic cardiomyocytes and signals myocardial damage via Toll-like

receptor 4. EMBO Mol Med 2014; 6:778-94.

88. Haw TJ, Starkey MR, Pavlidis S, Fricker M, Arthurs AL, Nair PM, et al. Toll-like

receptor 2 and 4 have opposing roles in the pathogenesis of cigarette smoke-induced

chronic obstructive pulmonary disease. Am J Physiol Lung Cell Mol Physiol 2018;

314:L298-L317.

89. Khan MI, Su YK, Zou J, Yang LW, Chou RH, Yu C. S100B as an antagonist to block the

interaction between S100A1 and the RAGE V domain. PLoS One 2018; 13:e0190545.

90. Chen M, Wang T, Shen Y, Xu D, Li X, An J, et al. Knockout of RAGE ameliorates

mainstream cigarette smoke-induced airway inflammation in mice. Int

Immunopharmacol 2017; 50:230-5.

91. Tay HL, Kaiko GE, Plank M, Li J, Maltby S, Essilfie AT, et al. Antagonism of miR-328

increases the antimicrobial function of macrophages and neutrophils and rapid clearance

of non-typeable Haemophilus influenzae (NTHi) from infected lung. PLoS Pathog 2015;

11:e1004549.

2.7 Supplementary Figures

Figure E2.1: Nose-only exposure of the lungs of BALB/c mice to cigarette smoke induces hallmark features of human COPD. WT BALB/c mice were exposed to cigarette smoke or normal air over a time course of 4–12 weeks. Relative to their age-matched control mice, smoke-exposed mice had reduced weight gain relative to initial weight (A); decreased total lung capacity (B); a chronic increase in the number of macrophages (M), neutrophils (N) and lymphocytes (L) in BALF (C); and alveolar enlargement (D) (scale bar on micrographs = 100 μm). Data are means ± SEM of 4-6 mice/group, # P<0.05, ## P<0.01, ### P<0.001 compared to mice that breathed normal air, * P<0.05, ** P<0.01 compared to other groups indicated.

Figure E2.2: HILIC fractionation chromatograms of proteome populations with fractions denoted. The x-axis and y-axis depict time (min) and intensity (mAU) respectively. 4 week membrane (A) & soluble (B), 6 week membrane (C) & soluble (D), 8 week membrane (E) & soluble (F), 12 week membrane (G) & soluble (H). Each fraction was subjected to a MS gradient of 160 min.

Figure E2.3: Histogram distributions of protein iTRAQ quantification ratios. Ratios were generated by dividing the value for each individual smoke-exposed mouse by the average of the control mice. The x-axis and y-axis depict ratio (log2) and number of proteins in each bin respectively. 4 week membrane (A) & soluble (B), 6 week membrane (C) & soluble (D), 8 week membrane (E) & soluble (F), 12 week membrane (G) & soluble (H).
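The ratio calculation described in this caption can be sketched as follows. This is a minimal illustration for a single protein, assuming one reporter-ion intensity per mouse; the function names, bin width and intensity values are hypothetical and do not come from the dataset.

```python
import math
from collections import Counter


def log2_ratios(smoke_values, control_values):
    """Divide each smoke-exposed value by the control mean and take log2."""
    control_mean = sum(control_values) / len(control_values)
    return [math.log2(value / control_mean) for value in smoke_values]


def bin_ratios(ratios, bin_width=0.5):
    """Count ratios per histogram bin, keyed by the lower edge of each bin."""
    return Counter(math.floor(r / bin_width) * bin_width for r in ratios)


# Hypothetical intensities for one protein (four control, four smoke-exposed mice).
control = [100.0, 120.0, 80.0, 100.0]
smoke = [200.0, 50.0, 100.0, 400.0]

ratios = log2_ratios(smoke, control)
print(ratios)
print(dict(bin_ratios(ratios)))
```

On the log2 scale a value of 0 means no change relative to the control average, so a histogram centred on 0 indicates that most proteins are unchanged between groups.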

Figure E2.4: Principal component analysis of membrane-enriched and soluble profiles. Principal component analysis revealed a high level of grouping within each fractionation at 4 weeks (A), 6 weeks (B), 8 weeks (C) and 12 weeks (D).

2.8 Supplementary Tables

Supplementary Table 2.1 Heavy-labelled Spiketide Mix

Peptide sequence    UniProt accession    Species         Position in protein    Modifications                     Charge state    Mass-to-charge ratio
R.DIATIVADK.C       Q9Y3A5               Homo sapiens    [109, 117]             Label 13C(6) 15N(2) [C-term K]    ++              477.2733
K.RPYTVILIER.A      Q9Y3A5               Homo sapiens    [125, 134]             Label 13C(6) 15N(4) [C-term R]    ++              635.3813
R.GSLEVLSLK.D       P70122               Mus musculus    [231, 239]             Label 13C(6) 15N(2) [C-term K]    ++              477.2915
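The mass-to-charge ratios listed in Supplementary Table 2.1 can be recalculated from the peptide sequences. The sketch below assumes standard monoisotopic residue masses, a doubly protonated precursor (the "++" charge state) and a fully heavy C-terminal lysine or arginine; the constant and function names are illustrative rather than part of any software used in this study.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565        # H2O added for the peptide termini
PROTON = 1.007276        # mass of one proton
C13_SHIFT = 1.003355     # 13C minus 12C
N15_SHIFT = 0.997035     # 15N minus 14N

# Mass added by the heavy C-terminal residue: 13C(6) 15N(2) Lys, 13C(6) 15N(4) Arg.
HEAVY_SHIFT = {
    "K": 6 * C13_SHIFT + 2 * N15_SHIFT,
    "R": 6 * C13_SHIFT + 4 * N15_SHIFT,
}


def heavy_peptide_mz(annotated_sequence: str, charge: int = 2) -> float:
    """Return the m/z of a heavy-labelled peptide written as 'X.PEPTIDE.Z'."""
    core = annotated_sequence.split(".")[1]  # drop the flanking residues
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in core) + WATER
    neutral_mass += HEAVY_SHIFT[core[-1]]    # label sits on the C-terminal K or R
    return (neutral_mass + charge * PROTON) / charge


for peptide in ("R.DIATIVADK.C", "K.RPYTVILIER.A", "R.GSLEVLSLK.D"):
    print(peptide, f"{heavy_peptide_mz(peptide):.4f}")
```

Under these assumptions the calculated values agree with the tabulated mass-to-charge ratios to within rounding, which offers a quick consistency check on the precursor settings.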

Supplementary Table 2.2 Additional clinical characteristics of the study subjects. Mild-moderate COPD Severe - very severe COPD Patient Number cohort (n=6) cohort (n=6) Patient 1 Emphysema Patient 2 Asthma Small airway disease (no Patient 3 Cough bronchiectasis) Patient 4 Patient 5 Patient 6 Bronchiectasis Emphysema

Chapter 3:

Deep time-resolved phosphoproteomic profiling of cigarette smoke-induced chronic obstructive pulmonary disease.

Currently in preparation for submission: The Journal of Allergy and Clinical Immunology

Authors: David A. Skerrett-Byrne,1,2 Heather C. Murray,1,3 Elizabeth G. Bromfield,1,4 Rodney J. Scott,1,3 Matthew D. Dun,1,3# and Philip M. Hansbro1,2,5#

Affiliations: 1 School of Biomedical Sciences and Pharmacy, University of Newcastle, Callaghan, NSW, Australia, 2 Hunter Medical Research Institute, VIVA Program, Newcastle, NSW, Australia, 3 Hunter Medical Research Institute, Cancer Research Program, Newcastle, NSW, Australia, 4 Priority Research Centre for Reproductive Science, School of Environmental and Life Sciences, University of Newcastle, Callaghan, NSW 2308, Australia, 5 Centre of Inflammation, Centenary Institute, and University of Sydney, Sydney, NSW, Australia

# Authors contributed equally

Chapter 3: Overview

Despite the publication of several studies evaluating kinases as therapeutic targets to prevent COPD, there remains a paucity of proteomic studies focused on the role of aberrant phosphorylation in the disease. To address this, we have undertaken an in-depth study of the phosphoproteome using a multi-dimensional enrichment strategy coupled to high-resolution mass spectrometry.

Building on the pulmonary proteome characterised in Chapter 2, we have identified >27,000 unique phosphorylation sites across the induction and progression phases of our cigarette smoke-induced COPD model. Utilising the latest bioinformatic platforms, we have identified potential downstream functions mediated by this post-translational modification and mapped the upstream kinases regulating these phosphorylation events. By filtering these kinases for those targeted by clinically approved drugs, we have identified 13 druggable kinases mapping to individual time points across the time course of COPD.

This study provides a valuable platform for the informed development of kinase-based therapeutic strategies to alleviate the burden of COPD and has improved our understanding of the intricate role of phosphorylation in the development of lung pathologies.

3.1 Introduction

Chronic obstructive pulmonary disease (COPD) is one of the leading causes of morbidity and mortality in the world.1 COPD is a complex and heterogeneous respiratory disease encompassing a variety of clinical characteristics, including chronic bronchitis and emphysema; importantly, not all characteristics are exhibited by a patient within any given timeframe.2-4 Despite overwhelming evidence of the detrimental health effects of cigarette smoke (CS), it remains the leading cause of COPD, associated with over 80% of all diagnoses,5, 6 and smoking rates continue to rise in developing countries.7, 8

Symptomatic control has been the standard treatment for patients suffering from COPD, and no treatment currently available can halt the progression of the disease.9-11 This paucity of effective treatments stems from a lack of understanding of the molecular drivers of pathogenesis in a very complex, heterogeneous disease. Characterisation of these drivers has predominantly focused on changes in the transcriptome; limited information is currently available on the progressive changes in the lung tissue proteome,12-15 and in particular on the role post-translational modifications (PTMs) play in driving COPD.

Currently there are more than 200 known PTMs, all with unique regulatory roles and diverse biological and molecular outcomes.16 Transient protein phosphorylation has emerged as one of the major PTMs of research interest. Phosphorylation is the driving force of cell signalling, mediating numerous biological functions.17 It involves the kinase-catalysed covalent addition of a charged gamma-phosphate from adenosine 5′-triphosphate (ATP) to the side chain of an amino acid residue, a modification that can be reversed by phosphatases.18 The major phosphorylated residues are serine, threonine and tyrosine, which are generally distributed in a ratio of 86%:12%:2%.19 The addition of a phosphate group to these amino acids affects protein conformation, due to the transformation of the hydrophilic side chain into a substantially larger, negatively charged side chain.16 This change in conformation can have substantial effects on the activity and subcellular localisation of the protein, its interaction with other proteins, and its stability.20, 21
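As a small illustration of how phosphorylation is read out by mass spectrometry, the sketch below lists the serine, threonine and tyrosine residues of a peptide that could carry a phosphate group and computes the resulting shift in m/z. It assumes the standard monoisotopic mass added by one phosphate group (HPO3, 79.966331 Da); the peptide sequence and the function names are hypothetical and chosen for illustration only.

```python
PHOSPHO_SHIFT = 79.966331   # monoisotopic mass added per phosphate group (HPO3), Da
PHOSPHO_ACCEPTORS = "STY"   # serine, threonine, tyrosine


def candidate_sites(sequence: str):
    """Return (1-based position, residue) for every S, T or Y in the peptide."""
    return [(i + 1, aa) for i, aa in enumerate(sequence) if aa in PHOSPHO_ACCEPTORS]


def phospho_mz_shift(n_phosphates: int, charge: int) -> float:
    """Increase in precursor m/z caused by n phosphate groups at a given charge."""
    return n_phosphates * PHOSPHO_SHIFT / charge


peptide = "AASPTYLK"  # hypothetical sequence, not taken from the dataset
print(candidate_sites(peptide))
print(f"{phospho_mz_shift(1, 2):.4f}")
```

For a doubly charged peptide, a single phosphate therefore shifts the precursor by roughly 40 m/z units, and the fragment ions that carry the shift are what allow a site to be assigned to one residue among several candidates.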

Revolutionary advancements made in mass spectrometry (MS)-based phosphoproteomics have been instrumental in understanding the true complexity of this PTM, allowing for high-resolution identification of multiple site-specific phosphorylation events.22, 23 A recent study showed that 75% of the expressed proteome can be phosphorylated and predicted the true figure to be in excess of 90%.24 MS-based phosphoproteomics has accelerated the molecular understanding of a variety of diseases including cardiovascular disease,25 diabetes,26 infertility,27-29 liver disease30 and neurological diseases,31, 32 and has been widely applied in cancer biology,33-36 driving improved treatment regimens and aiding in the development of novel therapies.37, 38

Phosphoproteomic profiling sheds light both on the downstream biological functions mediated by aberrant phosphorylation and on the upstream kinases, a major class of drug target,39 which govern these transient phosphorylation profiles.

To date, phosphoproteomics has seldom been applied to understand the pathogenesis of COPD, despite kinase inhibitors emerging as a potential therapeutic strategy.40-42 To address this, we have employed our CS-induced COPD mouse model43 and a multi-dimensional phosphoenrichment strategy,44 in tandem with an unbiased comparative and quantitative phosphoproteomic approach. Through this approach we have characterised the alterations in phosphorylation associated with the induction and progression stages of COPD, with the goal of identifying pathways aberrantly regulated by phosphorylation and the upstream kinases controlling these events as potential therapeutic targets.

3.2 Methods

3.3.1 Ethics statement

This study was performed in strict accordance with the recommendations in the Australian code of practice for the care and use of animals for scientific purposes issued by the National Health and Medical Research Council of Australia. All protocols were approved by the Animal Ethics Committee of The University of Newcastle, Australia.

3.3.2 Experimental COPD

Female, 7-8-week-old, WT BALB/c mice were exposed to normal air or cigarette smoke through the nose only for eight weeks as previously described.43 Mice were housed under a 12-hr light/dark cycle and had free access to food (standard chow) and water. After a period of acclimatization (up to 5 days), mice were randomly allocated to experimental groups and exposed to either normal air or nose-only inhalation of CS for up to twelve weeks.43, 45 In recent years, some studies have shown that COPD prevalence and mortality are higher in females, and in the USA in 2009 women accounted for 53% of COPD deaths.46 It is for these and logistical reasons that female mice were used. See Chapter 2, Supplementary Figure E2.1 for the clinical features of the CS-induced mouse model.

3.3.3 Pulmonary Inflammation

Airway inflammation was assessed by differential enumeration of inflammatory cells in bronchoalveolar lavage fluid.5, 47, 48 Lung sections were stained with periodic acid-Schiff and tissue inflammation assessed by enumeration of inflammatory cells.43, 47, 48 Histopathological score was determined in lung sections stained with hematoxylin and eosin (H&E) based on established custom-designed criteria.47-49

3.3.4 Alveolar Enlargement

Lung sections were stained with H&E. Alveolar septal damage and diameter were assessed using the destructive index technique50 and the mean linear intercept technique, respectively.5, 43, 51, 52

3.3.5 Lung Function

Mice were anaesthetized with ketamine (100 mg/kg) and xylazine (10 mg/kg, Troy Laboratories, Smithfield, Australia) prior to tracheostomy. Tracheas were then cannulated and attached to a Buxco® Forced Maneuvers system apparatus (DSI, St. Paul, Minnesota, USA) to assess total lung capacity (TLC).43, 47 Mice were then attached to a FlexiVent apparatus (FX1 System; SCIREQ, Montreal, Canada) to assess transpulmonary resistance (tidal volume of 8 mL/kg at a respiratory rate of 450 breaths/min).43 All assessments were performed at least three times and the average was calculated for each mouse.

3.3.6 Mouse lung tissue sample preparation for phosphoproteomic analysis

For each time point from the COPD experimental model (4, 6, 8, 12wk),43 the lungs of four mice from each experimental group, normal air or CS, were perfused with tris-buffered saline supplemented with protease (Sigma) and phosphatase inhibitors (Roche, Complete EDTA free) and extracted for proteomic analysis (Figure 3.1). Lung tissues were homogenised in 100 µL of ice-cold 0.1 M Na2CO3 containing protease and phosphatase inhibitors, using the FastPrep-24™ 5G (MP Biomedical, Santa Ana, CA, USA) with the Cool Prep Adaptor at a speed of 6.5 m/s for 2 min. Samples were then sonicated for 3 x 10 s and incubated for 1 hr at 4°C. The homogenates were fractionated into membrane and soluble proteins (Figure 3.1.B) by ultra-centrifugation (100,000 x g for 90 min at 4°C).53 Both the membrane-rich pellets and soluble proteins were dissolved to a final concentration of 6 M urea and 2 M thiourea, reduced using 10 mM DTT (30 min, room temperature), alkylated using 20 mM iodoacetamide (30 min, 55°C, in the dark), and subsequently digested with a 1:30 Lys-C/Trypsin Mix (Promega); after 3 h the solution was diluted below a 1 M urea concentration using 50 mM triethylammonium bicarbonate (pH 7.8) and left overnight at 37°C. Lipids were precipitated from membrane peptides using formic acid. All peptide solutions were desalted and cleaned up using commercial desalting columns (Oasis, Waters). Fluorescent peptide quantification (Qubit protein assay kit, Thermo Fisher Scientific, Carlsbad, CA, USA) was employed and 200 µg of peptide was labelled using chemical isobaric tag-based methods54 (Figure 3.1.C) according to the manufacturer’s specifications (SCIEX, iTRAQ). Digestion and isobaric tag labelling efficiency were determined by nano liquid chromatography tandem mass spectrometry (nLC-MS/MS) (described below). Samples were then mixed in a 1:1 ratio and phosphopeptides were enriched using the multi-dimensional strategy TiSH (Figure 3.1.D).44 In brief, enrichment was achieved using a titanium dioxide pre-enrichment step followed by sequential elution from immobilized metal affinity chromatography to separate mono- and multi-phosphorylated peptides. Enriched phosphorylated peptide populations were desalted using a modified StageTip microcolumn55 and mono-phosphorylated peptides were subjected to offline hydrophilic interaction liquid chromatography (HILIC) prior to high resolution nLC-MS/MS (Figure 3.1.E).


[Figure 3.1, panel labels recovered from the image: (A) smoke exposure time course covering the induction phase and progression phase (smoke n = 6, control n = 6), with sacrifice at 4, 6, 8 and 12 weeks; (B) air-exposed lungs (n = 4) and smoke-exposed lungs (n = 4); (C) labelling and 1:1 combination; (D) phosphopeptide enrichment; (E) MS1 and MS2 selected peaks; (F) Proteome Discoverer 2.1, comparison, hierarchical clustering and Ingenuity Pathway Analysis.]

3.3.7 LC-MS/MS Analysis

nLC-MS/MS was performed on 6-11 HILIC enriched fractions (Supplementary Figure E3.1) for each 8plex, using a Q-Exactive Plus hybrid quadrupole-Orbitrap MS system (Thermo Fisher Scientific) coupled to a Dionex Ultimate 3000RSLC nanoflow HPLC system (Thermo Fisher Scientific). Samples were loaded onto an Acclaim PepMap100 C18 75 μm × 20 mm trap column (Thermo Fisher Scientific) for pre-concentration and online desalting. Separation was then achieved using an EASY-Spray PepMap C18 75 μm × 500 mm column (Thermo Fisher Scientific), employing a stepped gradient of 5 to 20% acetonitrile at 300 nl/min over 101 min, to 40% over 17 min, and to 90% over 15 min. The Q-Exactive Plus MS system was operated in full MS/data-dependent acquisition MS/MS mode. The Orbitrap mass analyzer was used at a resolution of 70 000 to acquire full MS with an m/z range of 370–1750, incorporating a target automatic gain control value of 3e6 and a maximum fill time of 100 ms. The 20 most intense multiply charged precursors were selected for higher-energy collision dissociation fragmentation with normalized stepped collision energies of 28, 30 and 32. MS/MS fragments were measured at an Orbitrap resolution of 35 000 using an automatic gain control target of 5e5 and a maximum fill time of 120 ms.
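As an illustration of the stepped gradient described above, the following minimal Python sketch returns the programmed acetonitrile percentage at a given run time. Linear interpolation within each step, and the names `GRADIENT_STEPS` and `acetonitrile_percent`, are assumptions made for this example and are not taken from the instrument method.

```python
# Stepped gradient as stated in the text: 5 -> 20% acetonitrile over 101 min,
# then to 40% over 17 min, then to 90% over 15 min.
# ASSUMPTION: each step is a linear ramp; the instrument method may differ.
START_PERCENT = 5.0
GRADIENT_STEPS = [  # (step duration in min, %ACN reached at the end of the step)
    (101.0, 20.0),
    (17.0, 40.0),
    (15.0, 90.0),
]


def acetonitrile_percent(t_min: float) -> float:
    """Return the programmed %ACN at run time t_min (clamped to the gradient)."""
    if t_min <= 0:
        return START_PERCENT
    step_start_time, step_start_pct = 0.0, START_PERCENT
    for duration, end_pct in GRADIENT_STEPS:
        step_end_time = step_start_time + duration
        if t_min <= step_end_time:
            fraction = (t_min - step_start_time) / duration
            return step_start_pct + fraction * (end_pct - step_start_pct)
        step_start_time, step_start_pct = step_end_time, end_pct
    return step_start_pct  # past the final step: hold the last composition


# Total programmed gradient length in minutes.
total_runtime_min = sum(duration for duration, _ in GRADIENT_STEPS)
```

Under these assumptions the three steps sum to a 133 min gradient per fraction.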

3.3.8 Computational LC-MS/MS data analysis

Database searching of all raw files was performed using Proteome Discoverer 2.1 (Thermo Fisher Scientific). Mascot 2.2.3 and SEQUEST HT were used to search against the Swiss_Prot, Uniprot_mouse database (25,041 sequences, downloaded 11th July 2017). Database searching parameters included full tryptic digestion with up to two missed cleavages, a precursor mass tolerance of 10 p.p.m. and a fragment mass tolerance of 0.02 Da. Cysteine carbamidomethylation was set as a fixed modification, while dynamic modifications included acetylation (K), oxidation (M), phospho (S/T), phospho (Y) and iTRAQ 8plex. Interrogation of the corresponding reversed database was also performed to evaluate the false discovery rate of peptide identification using Percolator on the basis of q-values, which were estimated from the target-decoy search approach. To filter target peptide spectrum matches from decoy peptide spectrum matches, a fixed false discovery rate of 1% was set at the peptide level. Each iTRAQ 8plex was analysed individually. Normalisation was carried out in the Peptide and Protein Quantifier node, which normalised based on total peptide amount, whereby Proteome Discoverer 2.1 “Sums the peptide group abundances for each sample and determines the maximum sum for all files. The normalization factor is the factor of the sum of the sample and the maximum sum in all files”. Following normalisation, Proteome Discoverer 2.1 scales the normalised abundances based on the channel average, whereby it scales the average of all channels to 100.
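The normalisation and scaling steps quoted above can be sketched as follows. This is a minimal illustration, not the Proteome Discoverer implementation: it assumes that the normalisation factor is the maximum channel sum divided by the channel's own sum, and that scaling is applied per peptide group so that the mean across channels equals 100. The channel labels and abundance values are hypothetical.

```python
def normalise_total_peptide_amount(abundances):
    """Equalise the summed abundance of every channel.

    abundances: dict mapping channel label -> list of peptide-group abundances,
    with all lists in the same peptide-group order.
    ASSUMPTION: factor = (maximum channel sum) / (this channel's sum).
    """
    sums = {channel: sum(values) for channel, values in abundances.items()}
    max_sum = max(sums.values())
    return {
        channel: [value * max_sum / sums[channel] for value in values]
        for channel, values in abundances.items()
    }


def scale_to_channel_average(normalised, target=100.0):
    """Scale each peptide group so its mean across channels equals `target`.

    ASSUMPTION: scaling is done per peptide group; groups with a zero mean
    are not handled in this sketch.
    """
    channels = list(normalised)
    n_groups = len(normalised[channels[0]])
    scaled = {channel: [] for channel in channels}
    for i in range(n_groups):
        mean_i = sum(normalised[channel][i] for channel in channels) / len(channels)
        for channel in channels:
            scaled[channel].append(normalised[channel][i] * target / mean_i)
    return scaled


# Hypothetical two-channel example with two peptide groups.
example = {"113": [10.0, 30.0], "114": [30.0, 50.0]}
normalised = normalise_total_peptide_amount(example)
scaled = scale_to_channel_average(normalised)
```

After the first step both hypothetical channels carry the same total abundance; after the second, each peptide group averages 100 across channels, so values above or below 100 indicate relative enrichment or depletion in a channel.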

3.3.9 Bioinformatic analysis and statistics

Protein lists were exported from Proteome Discoverer 2.1 in the form of Excel documents. The phosphopeptide enrichment efficiency was determined for each time point by dividing the number of high-confidence, unique phosphopeptides by the total number of high-confidence peptides detected (Supplementary Table E3.1). Focusing on phosphopeptides, the scaled and normalised iTRAQ abundances were used to generate quantitation ratios by dividing the values of individual smoke-exposed mice by the average of the control mice (Smoke/Control) at their respective time points. Membrane-enriched and soluble iTRAQ experiments were imported into Perseus56 (version 1.6.2.3), ratios were log2 transformed and histograms were generated (Supplementary Figures E3.2-3). Principal component analysis (PCA) was carried out for each enriched fraction, whereby time points were combined and filtered for three or more quantitation values in at least one time point. The Perseus missing value imputation algorithm was used for the purpose of PCA only; the imputed values were at the lower limit of the intensity. Within each time point and its respective enriched fractions, a standard t test with an FDR of 0.05 and a log2 fold change cut-off of ±0.585 (1.5-fold change) was applied to identify unique dysregulated phosphopeptides. Volcano plots and graphs were plotted using Prism version 8.0, and Venn diagrams were made using Venny.57 Basic data handling, if not otherwise stated, was carried out using Microsoft Excel® (Version 16.0.4739, Microsoft Corporation, Redmond, WA) (Figure 3.1.F).
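The ratio generation and fold change thresholding described above can be sketched in Python as follows. This is an illustrative outline only: the significance value is assumed to be supplied by the t test carried out in Perseus and is not recomputed here, and the function names and all abundance values are hypothetical.

```python
import math

LOG2_FC_CUTOFF = 0.585  # cut-off stated in the text; log2(1.5) is approximately 0.585
ALPHA = 0.05            # significance level stated in the text


def log2_ratios(smoke_values, control_values):
    """log2(Smoke/Control) for each smoke-exposed mouse against the control mean."""
    control_mean = sum(control_values) / len(control_values)
    return [math.log2(value / control_mean) for value in smoke_values]


def classify(mean_log2_ratio, q_value):
    """Label a phosphopeptide as 'up', 'down' or 'unchanged'.

    ASSUMPTION: q_value is the FDR-controlled significance value exported
    from the t test; this sketch does not perform the test itself.
    """
    if q_value > ALPHA:
        return "unchanged"
    if mean_log2_ratio >= LOG2_FC_CUTOFF:
        return "up"
    if mean_log2_ratio <= -LOG2_FC_CUTOFF:
        return "down"
    return "unchanged"


def enrichment_efficiency(n_phosphopeptides, n_total_peptides):
    """Fraction of high-confidence peptides that are phosphopeptides."""
    return n_phosphopeptides / n_total_peptides


# Hypothetical scaled abundances for one phosphopeptide (four mice per group).
ratios = log2_ratios([150.0, 160.0, 170.0, 180.0], [90.0, 110.0, 95.0, 105.0])
mean_ratio = sum(ratios) / len(ratios)
label = classify(mean_ratio, q_value=0.01)
```

The symmetric cut-off of ±0.585 on the log2 scale corresponds to fold changes of at least 1.5 or at most approximately 0.667 on the linear scale.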

3.3.10 Ingenuity pathway analyses

The complete inventory of unique phosphopeptides, with a t test (0.05) applied to any phosphopeptide above the fold change cut-off (±0.585), containing Uniprot accession numbers and transformed ratios, was analysed using the Ingenuity® Pathway Analysis software (IPA®, Qiagen) as previously described.33 Importantly, the Ingenuity® Pathway Analyses presented are based on the protein level and on the assumption that changes in phosphorylation result in either activation or repression of pathways, regulators and downstream functions. Canonical pathway analysis, upstream analysis, and disease and function analysis were assessed using the p-value as an enrichment measurement of the overlap of proteins from the dataset with a particular pathway, function or regulator, and the Z-score as a prediction scoring system of activation or inhibition based upon statistically significant patterns in the dataset and prior biological knowledge manually curated in the Ingenuity Knowledge Base.58 To elucidate the most significant changes in our analyses, we applied stringency criteria of a -log10 p-value > 2 in each time point, and a Z-score ≤ -2 (inhibition) or ≥ 2 (activation) in at least one time point. For disease and function analysis we restricted our analysis to ‘molecular and cellular functions’ and ‘physiological system development and function’. The top five most significantly enriched canonical pathways, upstream regulators and molecular functions, as well as the top five by number of annotated proteins, were reported.
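The stringency criteria above amount to a simple filter over the per-time-point scores of each pathway, regulator or function. The sketch below is one way to express it, assuming the scores have been exported as (-log10 p-value, Z-score) pairs; the function names and the example values are hypothetical and are not part of the Ingenuity software.

```python
def passes_stringency(entries, min_neg_log10_p=2.0, min_abs_z=2.0):
    """Apply the stringency criteria to one pathway, regulator or function.

    entries: list of (neg_log10_p, z_score) tuples, one per time point;
    z_score may be None where no activity prediction was made.
    Criteria from the text: -log10 p-value > 2 in each time point, and
    Z-score <= -2 or >= 2 in at least one time point.
    """
    significant_everywhere = all(p > min_neg_log10_p for p, _ in entries)
    directional_somewhere = any(
        z is not None and abs(z) >= min_abs_z for _, z in entries
    )
    return significant_everywhere and directional_somewhere


def predicted_direction(z_score, min_abs_z=2.0):
    """Translate a Z-score into the predicted activity state."""
    if z_score is None or abs(z_score) < min_abs_z:
        return "no prediction"
    return "activated" if z_score > 0 else "inhibited"


# Hypothetical scores for one function at 4, 6, 8 and 12 weeks.
example_scores = [(3.1, 0.5), (2.4, -2.3), (5.0, None), (2.2, 1.0)]
keep = passes_stringency(example_scores)
```

In this hypothetical case the function is retained, because every time point exceeds the enrichment threshold and the 6 week Z-score predicts inhibition.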

3.3.11 Kinase enrichment

Lists of t-test significant unique dysregulated phosphopeptides with a fold change cut-off (±0.585) were analysed to map kinases using KinMap, a web-based tool for kinase visualisation that links known biochemical, structural, and disease association data to the human kinome phylogenetic tree.59 The size of the red hexagon represents the number of unique phosphopeptides identified for each protein mapped by a kinase.


3.3 Results

3.4.1 Time-resolved quantitative phosphosite identifications of CS-induced COPD

We sought to characterise the transient site-specific phosphorylation changes regulated by chronic smoke exposure. To achieve this we used our CS-induced mouse model of COPD (Figure 3.1),43 focusing on the induction (4 and 6 weeks) and progression (8 and 12 weeks) phases of the disease. Using a multidimensional phosphopeptide enrichment strategy44 coupled with isobaric-tag based labelling, we quantified a total of 27,857 unique phosphopeptides (FDR ≤ 0.01) corresponding to 4,392 proteins across the time course (Supplementary Table E3.2). The phosphopeptide enrichment efficiency was determined for each time point and a median phosphopeptide enrichment of 80% was achieved (Table 3.1).

TABLE 3.1 Phosphoproteomic Identification Summary

Time point      #Unique phosphopeptides   Phosphoenrichment   Unique sites
                FDR ≤ 0.01                Median 80%          time point overlap
4wk Membrane    6,885                     75%                 13,358
4wk Soluble     10,506                    79%
6wk Membrane    11,545                    80%                 15,446
6wk Soluble     6,615                     78%
8wk Membrane    4,233                     81%                 9,247
8wk Soluble     6,283                     83%
12wk Membrane   9,418                     80%                 11,908
12wk Soluble    5,227                     71%

Total sites / Total IDs (time course overlap): 27,857
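As a quick check of the summary statistic reported with Table 3.1, the per-experiment enrichment percentages can be transcribed and their median computed. The dictionary name below is introduced for this example only; the values are those listed in the table.

```python
from statistics import median

# Phosphoenrichment (%) transcribed from Table 3.1.
ENRICHMENT_PERCENT = {
    ("4wk", "Membrane"): 75,
    ("4wk", "Soluble"): 79,
    ("6wk", "Membrane"): 80,
    ("6wk", "Soluble"): 78,
    ("8wk", "Membrane"): 81,
    ("8wk", "Soluble"): 83,
    ("12wk", "Membrane"): 80,
    ("12wk", "Soluble"): 71,
}

# With eight values the median is the mean of the two middle values.
median_enrichment = median(ENRICHMENT_PERCENT.values())
lowest = min(ENRICHMENT_PERCENT.values())
highest = max(ENRICHMENT_PERCENT.values())
```

The median of the eight tabulated values is 79.5%, consistent with the reported median of approximately 80%, with individual experiments ranging from 71% to 83%.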


As outlined in the methods, lung homogenates were initially fractionated into membrane-enriched and soluble protein populations to help gain insights into the spatial dynamics of protein abundance and localisation during smoke exposure.60-62 Quantification ratios generated for each time point, and its respective compartment fractionation, were log transformed and histograms were generated to help visualise the distribution of changes and the effect of normalisation (Supplementary Figures E3.2-3). We next characterised the distribution of phosphorylated residues in each analysis. Phosphorylated serine residues accounted for the largest portion, approximately 73% (Figure 3.2). Phosphorylated threonine accounted for 4-8%, while relatively small numbers of phosphorylated tyrosine residues were characterised (Figure 3.2). This was in line with the expected distribution of these residues.19

Principal component analyses revealed strong separation of the time points and their respective compartment fractionations (i.e. four mice per treatment, smoke and air, with membrane and soluble enrichment for four time points: 4, 6, 8 and 12 weeks) (Figure 3.3.A). Next we explored the distribution of phosphosites identified at each time point using Venn diagrams (Figure 3.3.B). Notably, approximately 11% of phosphosite identifications were shared across all time points, with 55% exclusively identified at a single time point. To explore these distributions further, volcano plots were generated highlighting the balance of upregulated and downregulated proteins in each time point and its respective fractionation (Figure 3.3.C). These data identified the significant dysregulation of 705, 554, 1095 and 693 unique phosphopeptides, with a threshold of ±1.5 fold change, at the 4, 6, 8 and 12 week time points, respectively (Table 3.2). Interestingly, the 8 week time point showed the largest number of changes, mapping to 376 uniquely dysregulated proteins (Supplementary Figure E3.4). Chief amongst the changes at this time point were the upregulation of leucine-rich repeat serine/threonine-protein kinase 2 (Serine 895; 898), coiled-coil domain-containing protein 8 homolog (Serine 142; 146) and vacuolar protein sorting-associated protein 13C (Serine 871; 873), and the downregulation of latent-transforming growth factor beta-binding protein 1 (Serine 313), transmembrane protein 45A (Serine 269) and myotubularin-related protein 2 (Serine 58). Additional unique phosphopeptides were observed exclusively in either smoke-exposed mice (n ≥ 3) or age-matched controls (n ≥ 3), giving final totals of 730, 580, 1332 and 728 dysregulated phosphopeptides at the 4, 6, 8 and 12 week time points, respectively (Table 3.2). Further breakdown of these data revealed a relative balance between decreased and increased phosphorylation at the 4 week time point (46% vs 54%), but a shift in favour of reduced phosphorylation along the induction phase (6 weeks; 67% vs 33%). We observed a return to balance at 8 weeks (49% vs 51%) and finally, at 12 weeks, a reversal of the 6 week profile (39% vs 61%). An investigation of the proteins that these phosphopeptides map to revealed that the majority (63.6%) of proteins were unique to a particular time point (Supplementary Figure E3.4). Due to this diversity, we performed a focused analysis within each time point to explore the heterogeneity of biological functions that may be underpinning disease progression.

TABLE 3.2 Dysregulated unique phosphopeptides summary

#Phosphosites
Time point      Downregulation   Upregulation     Control     Smoke       Downregulation      Upregulation
                FC ≤ 0.667       FC ≥ 1.5         Exclusive   Exclusive   total               total
                P-value ≤ 0.05   P-value ≤ 0.05   n ≥ 3       n ≥ 3       (peptide/protein)   (peptide/protein)
4wk Membrane    58               74               1           10          337 / 279           393 / 298
4wk Soluble     274              299              4           10
6wk Membrane    294              155              12          3           391 / 300           189 / 160
6wk Soluble     75               30               10          1
8wk Membrane    100              191              20          39          659 / 492           673 / 388
8wk Soluble     417              387              122         56
12wk Membrane   217              353              12          15          286 / 233           442 / 315
12wk Soluble    53               70               4           4
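The totals quoted in the text can be reproduced from the peptide counts in Table 3.2. The short Python sketch below transcribes the table and recomputes, for each time point, the number of significantly dysregulated phosphopeptides, the totals after adding the control-exclusive and smoke-exclusive phosphopeptides, and the percentage with decreased phosphorylation. Treating control-exclusive phosphopeptides as decreased and smoke-exclusive phosphopeptides as increased is the reading that reproduces the tabulated totals; the variable and function names are introduced for this example only.

```python
# Counts transcribed from Table 3.2:
# (time point, fraction) -> (down, up, control_exclusive, smoke_exclusive)
TABLE_3_2 = {
    ("4wk", "Membrane"): (58, 74, 1, 10),
    ("4wk", "Soluble"): (274, 299, 4, 10),
    ("6wk", "Membrane"): (294, 155, 12, 3),
    ("6wk", "Soluble"): (75, 30, 10, 1),
    ("8wk", "Membrane"): (100, 191, 20, 39),
    ("8wk", "Soluble"): (417, 387, 122, 56),
    ("12wk", "Membrane"): (217, 353, 12, 15),
    ("12wk", "Soluble"): (53, 70, 4, 4),
}


def summarise(time_point):
    """Return (significant, total_down, total_up, percent_down) for a time point."""
    rows = [counts for (tp, _), counts in TABLE_3_2.items() if tp == time_point]
    down = sum(row[0] for row in rows)
    up = sum(row[1] for row in rows)
    control_only = sum(row[2] for row in rows)
    smoke_only = sum(row[3] for row in rows)
    significant = down + up            # fold change and p-value filtered
    total_down = down + control_only   # control-exclusive counted as decreased
    total_up = up + smoke_only         # smoke-exclusive counted as increased
    percent_down = round(100 * total_down / (total_down + total_up))
    return significant, total_down, total_up, percent_down


summary = {tp: summarise(tp) for tp in ("4wk", "6wk", "8wk", "12wk")}
```

Run against the tabulated counts, this gives 705, 554, 1095 and 693 significant phosphopeptides and final totals of 730, 580, 1332 and 728, with 46%, 67%, 49% and 39% showing decreased phosphorylation at 4, 6, 8 and 12 weeks, respectively.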


Figure 3.2: Distribution of phosphorylated residues. Analysis of the complete inventory of unique phosphopeptides broken down into the total number (A) and proportion (B) of phosphorylation modifications to the amino acids serine (S), threonine (T), and tyrosine (Y), as well as ambiguous phosphorylation of either a serine or threonine residue (S/T), a serine or tyrosine residue (S/Y), a threonine or tyrosine residue (T/Y), or a serine, threonine, or tyrosine residue (S/T/Y).


[Figure 3.3, labels recovered from the image: principal component analysis plots (Component 1: 60.1% and 60%) and volcano plots (x-axis: Log2 Fold Change).]

3.4.2 Phosphoproteome profiles identify perturbed upstream transcriptional regulators and downstream molecular and cellular functions characterising induction and progression stages of CS-induced COPD

The complete significant inventories of mapped phosphoproteins were analysed using Ingenuity Pathway Analysis and assigned upstream regulators and downstream molecular and cellular functions. Observation of the top molecular functions, identified on the basis of significance and number of annotated proteins, pointed to a shared level of cellular stress associated with cell death, homeostasis, and the dynamics of cellular movement and structure (Supplementary Tables E3.3-6). To elucidate the potential activation or inhibition of these downstream molecular functions,63 Ingenuity Pathway Analysis activation z-scores were implemented. Hierarchical clustering of activity scores was used to observe the dysregulated functions across the time course and in each enriched fraction (Figure 3.4). Here, in the membrane-enriched profile at the initial 4 week time point, abundance changes of regulated phosphopeptides were identified in proteins involved in the predicted activation of apoptosis (444 proteins), fibrogenesis (435 proteins) and necrosis (94 proteins). Also observed was an activation of splicing and processing of mRNA (21 and 26 proteins, respectively), coupled with inhibition of cell cycle progression (147 proteins) and growth of microtubules (5 proteins). In the soluble profile we observed similar activity of cell death-related functions, as well as arrested cellular movements (Figure 3.4.A). Next, along the induction phase, there was a marked predicted increase in the activity of functions related to immune cell chemotaxis, transport of carbohydrates, cytoskeleton dynamics and, as observed at the 4 week time point, RNA processes and fibrogenesis (Figure 3.4.B). Interestingly, a predicted inhibition of cell death of epithelial and connective tissue cells was also revealed. Soluble profiling maintained the activation of cell death-related functions (Figure 3.4.A).


Entering the progression phase of the disease, fewer predicted dysregulated activities were observed in the membrane-enriched profile, and overall functions were inhibited. The 8 week time point was marked by inhibition of a mitogen-activated protein kinase (MAPKKK), mitosis and the fibrogenesis function observed earlier in the time course (Figure 3.4.B). Lastly, at 12 weeks a cluster of inhibitions was observed related to cell movement and migration of connective tissue and fibroblast cells, with inhibited apoptosis of the latter cell type (Figure 3.4.B). Soluble profiling of the progression phase continued to show a dysregulation of RNA processes, in addition to apoptosis of hematopoietic and leukocyte cells at the 8 week time point.


Figure 3.4: Predictive downstream cellular and molecular function of time-resolved CS-induced COPD mediated by phosphorylation. Hierarchical clustering of activation z-scores of downstream molecular and cellular functions of soluble (A) and membrane-enriched (B) phosphorylation profiles. Grey indicates no significant data detected.

In the membrane-enriched profiles, application of our stringency criteria revealed only two upstream regulators with a significant z-score in the progression phase of the disease: inhibition of myocardial zonula adherens protein at 8 weeks and activation of MAP kinase interacting serine/threonine kinase 1 (MKNK1) at 12 weeks. Profiling of the soluble fraction identified 17 dysregulated upstream regulators (Figure 3.5). MicroRNAs 27a-3p, 122-5p and 139-5p were exclusively predicted to be activated in the induction phase of the disease, along with the inhibition of MKNK1 and RAC-alpha serine/threonine-protein kinase (AKT1). Across the time course there was a trend towards activation of sphingosine kinase 2 (SPHK2), realised in the transition to the progression phase of the disease (8 weeks). Furthermore, an activation of protein kinase C and tumour protein 53 was identified. Lastly, an inhibition of two members of the tumour necrosis factor receptor associated factor (TRAF) protein family, important in the regulation of apoptosis and inflammation, was observed at 12 weeks.64 See Supplementary Tables E3.5-10 for the top five upstream regulators based on significance and the number of annotated proteins, along with their molecule type, significance scores and number of annotated proteins.


Figure 3.5: Predictive upstream regulators of time-resolved CS-induced COPD mediating phosphorylation profiles. Hierarchical clustering of activation z-scores of upstream regulators of soluble phosphorylation profiles. Grey indicates no significant data detected.


3.4.3 Pathway analysis of phosphoproteome profiles reveals profiles of induction and progression stages of CS-induced COPD

The complete significant inventories of mapped phosphoproteins were analysed using Ingenuity Pathway Analysis and assigned pathways (Figure 3.6). Hierarchical clustering of the activity scores (see methods) of these assignments sheds light on the potential activation or inhibition of pathways driving the progression of COPD (Figure 3.6). In both fractions, 4 weeks is marked by an inhibition of a number of pathways including actin cytoskeleton signalling, Janus kinase/signal transducer and activator of transcription (JAK/STAT) signalling and signalling of the cytokines interleukin 1 (IL-1), IL-6 and IL-22 (Figure 3.6). Only the 4 week membrane-enriched fraction identified activated pathways, namely peroxisome proliferator-activated receptors (PPAR) and phosphatase and tensin homolog deleted on chromosome 10 (PTEN) (Figure 3.6.B). The 6 week time point sees a shift towards activated pathways, with membrane-enriched fractions marked by roles in oxidative stress by nuclear factor erythroid 2-related factor 2 (NRF2), inflammation regulator chemokine C-C motif receptor 5 (CCR5) signalling in macrophages, platelet glycoprotein VI (GP6) and production of nitric oxide and reactive oxygen species in macrophages (Figure 3.6.B). The soluble profile only identified one activated pathway, the fatty acid metabolism transcriptional mediator PPAR/retinoid X receptors (PPARα/RXRα) (Figure 3.6.A). The progression phase of the model was characterised by the predicted inhibition of 57 pathways and the activation of only 7. Amongst the top inhibited pathways were growth factor signalling of epidermal (EGF), fibroblast (FGF) and nerve (NGF) growth factors, and cytokines IL-1, IL-2, IL-17A and IL-17F (Figure 3.6).

Filtering for the top five canonical pathways based on significance, we observed an enrichment of renin-angiotensin signalling (6.31E-20), leukocyte extravasation signalling (3.98E-16) and nitric oxide signalling in the cardiovascular system (3.98E-18) in the membrane-enriched fractions. Soluble profiles were enriched for insulin receptor signalling (3.16E-24), protein kinase A signalling (2.51E-22) and actin cytoskeleton signalling (2.00E-16). Stratification by number of annotated proteins revealed a consistent enrichment between the two fractions, with enrichment for phospholipase C signalling (91 proteins), glucocorticoid receptor signalling (103 proteins) and molecular mechanisms of cancer (129 proteins). See Supplementary Tables E3.11-14 for the top five canonical pathways based on significance and the number of annotated proteins for each time point and fraction, along with their significance scores and number of annotated proteins.

To elucidate these differences further, we focused on the biological processes uniquely assigned to the dysregulated phosphopeptides and the kinases that act upstream of these phosphorylation events.


Figure 3.6: Predictive pathway analysis of time-resolved CS-induced COPD phosphoproteomic profiles. Hierarchical clustering of activation z-scores of canonical pathways of soluble (A) and membrane-enriched (B) phosphorylation profiles. Grey indicates no significant data detected.


3.4.4 Dysregulated phosphorylation sites identify upstream master kinases as potential therapeutic targets

Focusing on the significantly dysregulated unique phosphopeptides, KinMap59 was used to identify the upstream master kinases potentially regulating the changes associated with the induction and progression phases of CS-induced COPD. KinMap categorises 540 kinases into 13 atypical families and 8 typical groups,65 namely AGC, calcium and calmodulin-regulated kinases (CAMK), casein kinase 1 (CK1), CMGC, STE, tyrosine kinase (TK), tyrosine kinase-like (TKL) and others. For the membrane-enriched and soluble fractions, respectively, KinMap identified 32 and 49 kinases at 4 weeks, 48 and 14 kinases at 6 weeks, 36 and 81 kinases at 8 weeks, and 57 and 21 kinases at 12 weeks (Figure 3.7).

To build the CS-induced druggable phosphoproteome, bioinformatic mining of Needham et al. allowed these kinase identifications to be mapped to their families and their associated clinically approved drugs (Table 3.3).66 This led to the identification of 14 clinically approved druggable kinases, of which 11 belong to the tyrosine kinase family, 2 to the tyrosine kinase-like family and 1 to the CMGC family. These 14 druggable kinases map to 27 clinically approved drugs predominantly used in the treatment of cancers such as non-small cell lung cancer, acute myeloid leukaemia and metastatic breast cancer, but also used for the treatment of idiopathic pulmonary fibrosis (Nintedanib). All 27 are listed in Supplementary Table 3.15 with their associated conditions, sourced using DrugBank.67


Figure 3.7: Mapped kinases regulating perturbed phosphorylation profiles of CS-induced COPD. Dysregulated unique phosphopeptides were assessed by KinMap to identify upstream kinases. Mapped kinases were visualised as red hexagons on the human kinome phylogenetic tree. The size of the red hexagon represents the number of unique phosphopeptides identified for each protein mapped by a kinase. Illustrations reproduced courtesy of Cell Signaling Technology, created using KinMap.61

Two tyrosine kinase family members, receptor tyrosine-protein kinase erbB-2 (ErbB2) and ephrin type-A receptor 6 (EphA6), were identified as druggable kinases exclusive to the 4 week and 6 week induction phases of the disease, respectively (Table 3.3). The 8 week time point of the progression phase of the disease offered 5 exclusive druggable kinases, of which 4 belong to the tyrosine kinase family, namely B-lymphoid tyrosine kinase (BLK), discoidin domain receptor tyrosine kinase 1 (DDR1), ephrin type-A receptor 5 (EphA5) and Janus kinase 3 (JAK3) (Table 3.3). The remaining kinase, cell division protein kinase 6 (CDK6), belongs to the CMGC kinase family. The most severe phase of the disease (12 weeks) mapped to two exclusive druggable tyrosine kinases, epidermal growth factor receptor (EGFR) and receptor tyrosine-protein kinase erbB-4 (ErbB4) (Table 3.3). The remaining druggable kinases mapped to all time points, with the exception of fibroblast growth factor receptor 3 (FGFR3), which mapped to the 4, 6 and 12 week time points (Table 3.3).


TABLE 3.3 Druggable mapped kinases

Kinase                                                      Family                Time point mapped  #Clinically approved drugs  Clinical drug names
Abelson murine leukemia viral oncogene homolog 1 (ABL1)     Tyrosine kinase       All                6                           Imatinib; Dasatinib; Nilotinib; Bosutinib; Ponatinib; Regorafenib
B-lymphoid tyrosine kinase (BLK)                            Tyrosine kinase       8wk                2                           Vandetanib; Ponatinib
Serine/threonine-protein kinase B-raf (BRAF)                Tyrosine kinase-like  All                4                           Sorafenib; Vemurafenib; Regorafenib; Dabrafenib
Cell division protein kinase 6 (CDK6)                       CMGC                  8wk                3                           Palbociclib; Abemaciclib; Ribociclib
Discoidin domain receptor tyrosine kinase 1 (DDR1)          Tyrosine kinase       8wk                1                           Nilotinib
Epidermal growth factor receptor (EGFR)                     Tyrosine kinase       12wk               7                           Erlotinib; Lapatinib; Vandetanib; Afatinib; Osimertinib; Gefitinib; Brigatinib
Ephrin type-A receptor 5 (EphA5)                            Tyrosine kinase       8wk                2                           Vandetanib; Ponatinib
Ephrin type-A receptor 6 (EphA6)                            Tyrosine kinase       6wk                2                           Vandetanib; Ponatinib
Receptor tyrosine-protein kinase erbB-2 (ErbB2)             Tyrosine kinase       4wk                3                           Lapatinib; Afatinib; Neratinib
Receptor tyrosine-protein kinase erbB-4 (ErbB4)             Tyrosine kinase       12wk               1                           Afatinib
Fibroblast growth factor receptor 3 (FGFR3)                 Tyrosine kinase       4, 6, 12wk         3                           Pazopanib; Ponatinib; Nintedanib
Janus kinase 3 (JAK3)                                       Tyrosine kinase       8wk                1                           Tofacitinib
Platelet-derived growth factor receptor beta (PDGFRb)       Tyrosine kinase       All                11                          Imatinib; Sorafenib; Dasatinib; Sunitinib; Nilotinib; Axitinib; Ponatinib; Regorafenib; Nintedanib; Lenvatinib; Midostaurin
RAF proto-oncogene serine/threonine-protein kinase (RAF1)   Tyrosine kinase-like  All                2                           Sorafenib; Vemurafenib
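The counts reported for the druggable kinases can be recomputed directly from Table 3.3. The following Python sketch transcribes the table, using the kinase abbreviations as keys, and tallies the kinase families, the number of distinct drugs, and the kinases and drugs exclusive to the 8 week time point. The data structure and variable names are introduced for this example only.

```python
from collections import Counter

# Table 3.3 transcribed: kinase -> (family, time point mapped, drugs)
TABLE_3_3 = {
    "ABL1": ("Tyrosine kinase", "All", ["Imatinib", "Dasatinib", "Nilotinib",
                                         "Bosutinib", "Ponatinib", "Regorafenib"]),
    "BLK": ("Tyrosine kinase", "8wk", ["Vandetanib", "Ponatinib"]),
    "BRAF": ("Tyrosine kinase-like", "All", ["Sorafenib", "Vemurafenib",
                                              "Regorafenib", "Dabrafenib"]),
    "CDK6": ("CMGC", "8wk", ["Palbociclib", "Abemaciclib", "Ribociclib"]),
    "DDR1": ("Tyrosine kinase", "8wk", ["Nilotinib"]),
    "EGFR": ("Tyrosine kinase", "12wk", ["Erlotinib", "Lapatinib", "Vandetanib",
                                          "Afatinib", "Osimertinib", "Gefitinib",
                                          "Brigatinib"]),
    "EphA5": ("Tyrosine kinase", "8wk", ["Vandetanib", "Ponatinib"]),
    "EphA6": ("Tyrosine kinase", "6wk", ["Vandetanib", "Ponatinib"]),
    "ErbB2": ("Tyrosine kinase", "4wk", ["Lapatinib", "Afatinib", "Neratinib"]),
    "ErbB4": ("Tyrosine kinase", "12wk", ["Afatinib"]),
    "FGFR3": ("Tyrosine kinase", "4, 6, 12wk", ["Pazopanib", "Ponatinib",
                                                 "Nintedanib"]),
    "JAK3": ("Tyrosine kinase", "8wk", ["Tofacitinib"]),
    "PDGFRb": ("Tyrosine kinase", "All", ["Imatinib", "Sorafenib", "Dasatinib",
                                           "Sunitinib", "Nilotinib", "Axitinib",
                                           "Ponatinib", "Regorafenib", "Nintedanib",
                                           "Lenvatinib", "Midostaurin"]),
    "RAF1": ("Tyrosine kinase-like", "All", ["Sorafenib", "Vemurafenib"]),
}

# Number of druggable kinases per family.
family_counts = Counter(family for family, _, _ in TABLE_3_3.values())

# Distinct drugs across all kinases (many drugs target several kinases).
unique_drugs = {drug for _, _, drugs in TABLE_3_3.values() for drug in drugs}

# Kinases mapped only at 8 weeks, and the distinct drugs that target them.
week8_kinases = sorted(k for k, (_, tp, _) in TABLE_3_3.items() if tp == "8wk")
week8_drugs = sorted(
    {drug for _, tp, drugs in TABLE_3_3.values() if tp == "8wk" for drug in drugs}
)
```

This tally agrees with the text: 14 kinases (11 tyrosine kinase, 2 tyrosine kinase-like, 1 CMGC) mapping to 27 distinct drugs, with 5 kinases and 7 drugs exclusive to the 8 week time point.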


3.4 Discussion

One of the biggest challenges facing COPD research is delineating the complexity and heterogeneity of the disease to characterise the dynamic interplay of the molecular drivers of disease progression. We have sought to address this problem by utilising our established CS-induced COPD mouse model,43 allowing for a focus on a relatively homogeneous system, and employing advanced phosphoproteomics to map the progressive changes in phosphorylation in a time-resolved manner. Using a multi-dimensional enrichment and label-based strategy,33 we achieved an unparalleled depth of coverage of the pulmonary phosphoproteome, characterising 27,857 unique phosphosites in the lung tissues across the defined time points of the induction (4 weeks and 6 weeks) and progression (8 weeks and 12 weeks) phases of CS-induced COPD. Additionally, we employed an ultracentrifugation fractionation technique to allow distinctions to be made between the alterations to the phosphorylation status of proteins in the membrane versus the cytosol throughout the time course. This sophisticated strategy has shed light on the dynamic regulatory nature of phosphorylation and its influence on downstream biological functions, and has allowed us to identify kinases that may be master regulators driving the induction and progression of CS-induced COPD.

In the proteomic profiling of the progression phase of the model (Chapter 2), the 8 week time point, where the clinical features of COPD are established and irreversible,43 showed the highest number of significant changes (270 proteins). Our phosphoproteomic profiling of the same cohort of mice also displays the highest number of significant changes at 8 weeks (1,332 unique phosphopeptides), indicating that dysregulated phosphorylation may be mediating a rapid deterioration in the lung tissue at this pivotal time point of CS-induced COPD. Due to the large number of dysregulated phosphopeptides, we utilised bioinformatics platforms to elucidate the most likely downstream functions mediated by these alterations in phosphorylation and the upstream regulators at play. Mapping the dysregulated phosphorylation events to upstream kinase regulators59 offered the opportunity to identify potential druggable targets,66 with a view to repurposing clinically approved drugs to improve treatment strategies for COPD patients. This approach identified 14 druggable kinases and 27 clinically approved drugs (Table 3.3; Supplementary Table 3.15). At the 8 week time point we identified 5 exclusive druggable kinases (BLK, DDR1, EphA5, JAK3 and CDK6) targetable by 7 drugs (abemaciclib, nilotinib, palbociclib, ponatinib, ribociclib, tofacitinib and vandetanib). Of these kinases, three have been implicated in lung pathologies and/or COPD. BLK belongs to the Src family of non-receptor tyrosine kinases, a family heavily researched in cancer biology.68

Activation of fellow family member c-SRC has been shown to occur following cigarette smoke exposure in mice, inducing lung tissue destruction and airway inflammation.69 Moreover, the inhibition of c-SRC alleviated smoke-mediated alveolar enlargement and inflammatory changes. Knockout of the DDR1 collagen receptor tyrosine kinase in mice has been demonstrated to protect against bleomycin-induced lung fibrosis and pulmonary inflammation.70 DDR1 has also been implicated in the production of matrix metalloproteinase-2,71 whose expression has been linked to emphysema development in COPD.72

The JAK3/STAT3/NF-κB pathway has been successfully inhibited using ergosterol, ameliorating pathological injury and pro-inflammatory effects in a mouse model of CS-induced COPD, showing promising potential for halting COPD progression.73 Previous publications from our group have also demonstrated the utility of ruxolitinib and tofacitinib for the inhibition of JAK1 and JAK3.33

Cyclin-dependent kinases are critical machinery for the regulation of cell cycle progression, and also modulate alternative splicing and transcription.74 CDK6 is essential for regulating cell cycle entry and progression through the G1-phase.74 A recent study demonstrated that hypoxia induced an overexpression of CDK6, causing a dysregulated cell cycle that drove proliferation of pulmonary smooth muscle cells in pulmonary hypertension patients with COPD.75 Targeting these kinases warrants further mechanistic investigation to elucidate whether their inhibition can reduce the pathogenesis of COPD.

While our coverage of the phosphoproteome of the mouse lung has provided important leads as to the kinases involved in lung degeneration, it is important to note that of the currently identified human phosphorylation sites (>230,000), fewer than 5% have been assigned a biological function or kinase.66 Moreover, approximately 90% of these phosphorylation events are regulated by only 20% of known kinases.66 Continuous updates in the annotation of these and new phosphorylation events, driven by improvements in MS-based phosphoproteomics and knowledge databases such as PhosphoSitePlus,76 Phosphopedia,77 and KinMap,59 are required to help ‘illuminate the dark phosphoproteome’.66 This is an extremely important aspect of the CS-induced phosphoproteome we have produced, as the generated raw data files (spectral data generated by the MS) can be reprocessed retrospectively, making the phosphoproteome a “living” dataset. As more of these living datasets are made publicly available, a growing trend in proteomics,78 it opens up the possibility for new approaches to spectral analysis, improved annotation databases and reprocessing by researchers with new questions.78, 79 The value of reprocessing publicly available datasets has been well demonstrated by Matic et al., who, seeking to better understand adenosine diphosphate (ADP)-ribosylation, a modification that controls important cellular processes, reprocessed a mouse tissue-specific phosphoproteomic atlas covering 9 tissue types.80, 81 They identified 88 ADP-ribosylation modified sites, shedding new light on the tissue specificity of this modification and shifting conventional thought on the subcellular localisation of modified arginines.81, 82

To our knowledge, this study represents the most comprehensive CS-induced COPD pulmonary phosphoproteome to date. These data will serve as an important resource that can be further mined to explore new and updated phosphorylation events that mediate downstream functions, and are mediated by upstream kinases, driving the induction and progression phases of CS-induced COPD. Our mouse model provides unique opportunities to therapeutically validate these targets and to investigate these new candidates mechanistically, toward an improved understanding of the pathogenesis of COPD.


References

1. Lozano R, Naghavi M, Foreman K, Lim S, Shibuya K, Aboyans V, et al. Global and regional mortality from 235 causes of death for 20 age groups in 1990 and 2010: a systematic analysis for the Global Burden of Disease Study 2010. Lancet 2012; 380:2095-128.

2. Agusti A. The path to personalised medicine in COPD. Thorax 2014; 69:857-64.

3. Vestbo J, Hurd SS, Agusti AG, Jones PW, Vogelmeier C, Anzueto A, et al. Global strategy for the diagnosis, management, and prevention of chronic obstructive pulmonary disease: GOLD executive summary. Am J Respir Crit Care Med 2013; 187:347-65.

4. Agusti A, Calverley PM, Celli B, Coxson HO, Edwards LD, Lomas DA, et al. Characterisation of COPD heterogeneity in the ECLIPSE cohort. Respir Res 2010; 11:122.

5. Haw TJ, Starkey MR, Nair PM, Pavlidis S, Liu G, Nguyen DH, et al. A pathogenic role for tumor necrosis factor-related apoptosis-inducing ligand in chronic obstructive pulmonary disease. Mucosal Immunol 2016; 9:859-72.

6. Eisner MD, Anthonisen N, Coultas D, Kuenzli N, Perez-Padilla R, Postma D, et al. An official American Thoracic Society public policy statement: Novel risk factors and the global burden of chronic obstructive pulmonary disease. Am J Respir Crit Care Med 2010; 182:693-718.

7. Bilano V, Gilmour S, Moffiet T, d'Espaignet ET, Stevens GA, Commar A, et al. Global trends and projections for tobacco use, 1990-2025: an analysis of smoking indicators from the WHO Comprehensive Information Systems for Tobacco Control. Lancet 2015; 385:966-76.


8. Lopez AD, Shibuya K, Rao C, Mathers CD, Hansell AL, Held LS, et al. Chronic obstructive pulmonary disease: current burden and future projections. Eur Respir J 2006; 27:397-412.

9. Barnes PJ. Corticosteroid resistance in patients with asthma and chronic obstructive pulmonary disease. J Allergy Clin Immunol 2013; 131:636-45.

10. Cazzola M, Page C. Long-acting bronchodilators in COPD: where are we now and where are we going? Breathe 2014; 10:110-20.

11. Spina D. Pharmacology of novel treatments for COPD: are fixed dose combination LABA/LAMA synergistic? Eur Clin Respir J 2015; 2.

12. Obeidat M, Hao K, Bosse Y, Nickle DC, Nie Y, Postma DS, et al. Molecular mechanisms underlying variations in lung function: a systems genetics analysis. Lancet Respir Med 2015; 3:782-95.

13. Pillai SG, Ge D, Zhu G, Kong X, Shianna KV, Need AC, et al. A genome-wide association study in chronic obstructive pulmonary disease (COPD): identification of two major susceptibility loci. PLoS Genet 2009; 5:e1000421.

14. Sauler M, Lamontagne M, Finnemore E, Herazo-Maya JD, Tedrow J, Zhang X, et al. The DNA repair transcriptome in severe COPD. Eur Respir J 2018; 52.

15. Hardin M, Silverman EK. Chronic Obstructive Pulmonary Disease Genetics: A Review of the Past and a Look Into the Future. Chronic Obstr Pulm Dis 2014; 1:33-46.

16. Walsh CT, Garneau-Tsodikova S, Gatto GJ, Jr. Protein posttranslational modifications: the chemistry of proteome diversifications. Angew Chem Int Ed Engl 2005; 44:7342-72.

17. Hunter T. Protein kinases and phosphatases: the yin and yang of protein phosphorylation and signaling. Cell 1995; 80:225-36.


18. Cohen P. The origins of protein phosphorylation. Nat Cell Biol 2002; 4:E127-30.

19. Olsen JV, Blagoev B, Gnad F, Macek B, Kumar C, Mortensen P, et al. Global, in vivo, and site-specific phosphorylation dynamics in signaling networks. Cell 2006; 127:635-48.

20. Humphrey SJ, James DE, Mann M. Protein Phosphorylation: A Major Switch Mechanism for Metabolic Regulation. Trends Endocrinol Metab 2015; 26:676-87.

21. Ardito F, Giuliani M, Perrone D, Troiano G, Lo Muzio L. The crucial role of protein phosphorylation in cell signaling and its use as targeted therapy (Review). Int J Mol Med 2017; 40:271-80.

22. Doll S, Burlingame AL. Mass spectrometry-based detection and assignment of protein posttranslational modifications. ACS Chem Biol 2015; 10:63-71.

23. Riley NM, Coon JJ. Phosphoproteomics in the Age of Rapid and Deep Proteome Profiling. Anal Chem 2016; 88:74-94.

24. Sharma K, D'Souza RC, Tyanova S, Schaab C, Wisniewski JR, Cox J, et al. Ultradeep human phosphoproteome reveals a distinct regulatory nature of Tyr and Ser/Thr-based signaling. Cell Rep 2014; 8:1583-94.

25. van den Biggelaar M, Hernandez-Fernaud JR, van den Eshof BL, Neilson LJ, Meijer AB, Mertens K, et al. Quantitative phosphoproteomics unveils temporal dynamics of thrombin signaling in human endothelial cells. Blood 2014; 123:e22-36.

26. Li J, Li Q, Tang J, Xia F, Wu J, Zeng R. Quantitative Phosphoproteomics Revealed Glucose-Stimulated Responses of Islet Associated with Insulin Secretion. J Proteome Res 2015; 14:4635-46.


27. Nixon B, Johnston SD, Skerrett-Byrne DA, Anderson AL, Stanger SJ, Bromfield EG, et al. Proteomic profiling of crocodile spermatozoa refutes the tenet that post-testicular maturation is restricted to mammals. Mol Cell Proteomics 2018.

28. Urizar-Arenaza I, Osinalde N, Akimov V, Puglia M, Candenas L, Pinto FM, et al. Phosphoproteomic and functional analyses reveal sperm-specific protein changes downstream of kappa opioid receptor in human spermatozoa. Mol Cell Proteomics 2019:mcp.RA118.001133.

29. Wei Y, Gao Q, Niu P, Xu K, Qiu Y, Hu Y, et al. Integrative Proteomic and Phosphoproteomic Profiling of Testis from Wip1 Phosphatase-Knockout Mice: Insights into Mechanisms of Reduced Fertility. Mol Cell Proteomics 2019; 18:216-30.

30. Krahmer N, Najafi B, Schueder F, Quagliarini F, Steger M, Seitz S, et al. Organellar Proteomics and Phospho-Proteomics Reveal Subcellular Reorganization in Diet-Induced Hepatic Steatosis. Dev Cell 2018; 47:205-21.e7.

31. Jensen P, Myhre CL, Lassen PS, Metaxas A, Khan AM, Lambertsen KL, et al. TNFalpha affects CREB-mediated neuroprotective signaling pathways of synaptic plasticity in neurons as revealed by proteomics and phospho-proteomics. Oncotarget 2017; 8:60223-42.

32. Liu JJ, Sharma K, Zangrandi L, Chen C, Humphrey SJ, Chiu YT, et al. In vivo brain GPCR signaling elucidated by phosphoproteomics. Science 2018; 360.

33. Degryse S, de Bock CE, Demeyer S, Govaerts I, Bornschein S, Verbeke D, et al. Mutant JAK3 phosphoproteomic profiling predicts synergism between JAK3 inhibitors and MEK/BCL2 inhibitors for the treatment of T-cell acute lymphoblastic leukemia. Leukemia 2018; 32:788-800.


34. Murray HC, Dun MD, Verrills NM. Harnessing the power of proteomics for identification of oncogenic, druggable signalling pathways in cancer. Expert Opin Drug Discov 2017; 12:431-47.

35. Koch H, Wilhelm M, Ruprecht B, Beck S, Frejno M, Klaeger S, et al. Phosphoproteome Profiling Reveals Molecular Mechanisms of Growth-Factor-Mediated Kinase Inhibitor Resistance in EGFR-Overexpressing Cancer Cells. J Proteome Res 2016; 15:4490-504.

36. Francavilla C, Lupia M, Tsafou K, Villa A, Kowalczyk K, Rakownikow Jersie-Christensen R, et al. Phosphoproteomics of Primary Cells Reveals Druggable Kinase Signatures in Ovarian Cancer. Cell Rep 2017; 18:3242-56.

37. Wu X, Xing X, Dowlut D, Zeng Y, Liu J, Liu X. Integrating phosphoproteomics into kinase-targeted cancer therapies in precision medicine. J Proteomics 2019; 191:68-79.

38. Wu X, Zahari MS, Renuse S, Nirujogi RS, Kim MS, Manda SS, et al. Phosphoproteomic Analysis Identifies Focal Adhesion Kinase 2 (FAK2) as a Potential Therapeutic Target for Tamoxifen Resistance in Breast Cancer. Mol Cell Proteomics 2015; 14:2887-900.

39. Ferguson FM, Gray NS. Kinase inhibitors: the road ahead. Nat Rev Drug Discov 2018; 17:353.

40. Overgaard CE, Schlingmann B, White SD, Ward C, Fan X, Swarnakar S, et al. The relative balance of GM-CSF and TGF-β1 regulates lung epithelial barrier function. Am J Physiol Lung Cell Mol Physiol 2015; 308:L1212-L23.

41. Schamberger AC, Mise N, Jia J, Genoyer E, Yildirim AO, Meiners S, et al. Cigarette smoke-induced disruption of bronchial epithelial tight junctions is prevented by transforming growth factor-beta. Am J Respir Cell Mol Biol 2014; 50:1040-52.


42. Knobloch J, Jungck D, Charron C, Stoelben E, Ito K, Koch A. Superior anti-inflammatory effects of narrow-spectrum kinase inhibitors in airway smooth muscle cells from subjects with chronic obstructive pulmonary disease. J Allergy Clin Immunol 2018; 141:1122-4.e11.

43. Beckett EL, Stevens RL, Jarnicki AG, Kim RY, Hanish I, Hansbro NG, et al. A new short-term mouse model of chronic obstructive pulmonary disease identifies a role for mast cell tryptase in pathogenesis. J Allergy Clin Immunol 2013; 131:752-62.

44. Engholm-Keller K, Birck P, Storling J, Pociot F, Mandrup-Poulsen T, Larsen MR. TiSH--a robust and sensitive global phosphoproteomics strategy employing a combination of TiO2, SIMAC, and HILIC. J Proteomics 2012; 75:5749-61.

45. Fricker M, Deane A, Hansbro PM. Animal models of chronic obstructive pulmonary disease. Expert Opin Drug Discov 2014; 9:629-45.

46. Pinkerton KE, Harbaugh M, Han MK, Jourdan Le Saux C, Van Winkle LS, Martin WJ, 2nd, et al. Women and Lung Disease. Sex Differences and Global Health Disparities. Am J Respir Crit Care Med 2015; 192:11-6.

47. Hansbro PM, Hamilton MJ, Fricker M, Gellatly SL, Jarnicki AG, Zheng D, et al. Importance of mast cell Prss31/transmembrane tryptase/tryptase-gamma in lung function and experimental chronic obstructive pulmonary disease and colitis. J Biol Chem 2014; 289:18214-27.

48. Nair PM, Starkey MR, Haw TJ, Liu G, Horvat JC, Morris JC, et al. Targeting PP2A and proteasome activity ameliorates features of allergic airway disease in mice. Allergy 2017; 72:1891-903.


49. Horvat JC, Beagley KW, Wade MA, Preston JA, Hansbro NG, Hickey DK, et al. Neonatal chlamydial infection induces mixed T-cell responses that drive allergic airway disease. Am J Respir Crit Care Med 2007; 176:556-64.

50. Eidelman DH, Ghezzo H, Kim WD, Cosio MG. The destructive index and early lung destruction in smokers. Am Rev Respir Dis 1991; 144:156-9.

51. Hsu AC, Starkey MR, Hanish I, Parsons K, Haw TJ, Howland LJ, et al. Targeting PI3K-p110alpha Suppresses Influenza Virus Infection in Chronic Obstructive Pulmonary Disease. Am J Respir Crit Care Med 2015; 191:1012-23.

52. Liu G, Cooley MA, Jarnicki AG, Hsu AC, Nair PM, Haw TJ, et al. Fibulin-1 regulates the pathogenesis of tissue remodeling in respiratory diseases. JCI Insight 2016; 1.

53. Fujiki Y, Hubbard AL, Fowler S, Lazarow PB. Isolation of intracellular membranes by means of sodium carbonate treatment: application to endoplasmic reticulum. J Cell Biol 1982; 93:97-102.

54. Ross PL, Huang YN, Marchese JN, Williamson B, Parker K, Hattan S, et al. Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol Cell Proteomics 2004; 3:1154-69.

55. Larsen MR, Cordwell SJ, Roepstorff P. Graphite powder as an alternative or supplement to reversed-phase material for desalting and concentration of peptide mixtures prior to matrix-assisted laser desorption/ionization-mass spectrometry. Proteomics 2002; 2:1277-87.

56. Tyanova S, Temu T, Sinitcyn P, Carlson A, Hein MY, Geiger T, et al. The Perseus computational platform for comprehensive analysis of (prote)omics data. Nat Methods 2016; 13:731-40.


57. Oliveros JC. An interactive tool for comparing lists with Venn’s diagrams. http://bioinfogp.cnb.csic.es/tools/venny/index.html 2007-2015.

58. Kramer A, Green J, Pollard J, Jr., Tugendreich S. Causal analysis approaches in Ingenuity Pathway Analysis. Bioinformatics 2014; 30:523-30.

59. Eid S, Turk S, Volkamer A, Rippmann F, Fulle S. KinMap: a web-based tool for interactive navigation through human kinome data. BMC Bioinformatics 2017; 18:16.

60. Ahrman E, Hallgren O, Malmstrom L, Hedstrom U, Malmstrom A, Bjermer L, et al. Quantitative proteomic characterization of the lung extracellular matrix in chronic obstructive pulmonary disease and idiopathic pulmonary fibrosis. J Proteomics 2018; 189:23-33.

61. Schiller HB, Fernandez IE, Burgstaller G, Schaab C, Scheltema RA, Schwarzmayr T, et al. Time- and compartment-resolved proteome profiling of the extracellular niche in lung injury and repair. Mol Syst Biol 2015; 11.

62. Schiller HB, Mayr CH, Leuschner G, Strunz M, Staab-Weijnitz C, Preisendorfer S, et al. Deep Proteome Profiling Reveals Common Prevalence of MZB1-Positive Plasma B Cells in Human Lung and Skin Fibrosis. Am J Respir Crit Care Med 2017; 196:1298-310.

63. Schiller HB, Fernandez IE, Burgstaller G, Schaab C, Scheltema RA, Schwarzmayr T, et al. Time- and compartment-resolved proteome profiling of the extracellular niche in lung injury and repair. Mol Syst Biol 2015; 11:819.

64. Xie P. TRAF molecules in cell signaling and in human diseases. J Mol Signal 2013; 8:7.

65. Manning G, Whyte DB, Martinez R, Hunter T, Sudarsanam S. The protein kinase complement of the human genome. Science 2002; 298:1912-34.


66. Needham EJ, Parker BL, Burykin T, James DE, Humphrey SJ. Illuminating the dark phosphoproteome. Sci Signal 2019; 12.

67. Wishart DS, Feunang YD, Guo AC, Lo EJ, Marcu A, Grant JR, et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res 2018; 46:D1074-82.

68. Zhang S, Yu D. Targeting Src family kinases in anti-cancer therapies: turning promise into triumph. Trends Pharmacol Sci 2012; 33:122-8.

69. Geraghty P, Hardigan A, Foronjy RF. Cigarette smoke activates the proto-oncogene c-src to promote airway inflammation and lung tissue destruction. Am J Respir Cell Mol Biol 2014; 50:559-70.

70. Avivi-Green C, Singal M, Vogel WF. Discoidin domain receptor 1-deficient mice are resistant to bleomycin-induced lung fibrosis. Am J Respir Crit Care Med 2006; 174:420-7.

71. Hou G, Vogel WF, Bendeck MP. Tyrosine kinase activity of discoidin domain receptor 1 is necessary for smooth muscle cell migration and matrix metalloproteinase expression. Circ Res 2002; 90:1147-9.

72. Churg A, Zhou S, Wright JL. Matrix metalloproteinases in COPD. Eur Respir J 2012; 39:197.

73. Huan W, Tianzhu Z, Yu L, Shumin W. Effects of Ergosterol on COPD in Mice via JAK3/STAT3/NF-kappaB Pathway. Inflammation 2017; 40:884-93.

74. Diaz-Moralli S, Tarrado-Castellarnau M, Miranda A, Cascante M. Targeting cell cycle regulation in cancer therapy. Pharmacol Ther 2013; 138:255-71.


75. Sang HY, Jin YL, Zhang WQ, Chen LB. Downregulation of microRNA-637 Increases Risk of Hypoxia-Induced Pulmonary Hypertension by Modulating Expression of Cyclin Dependent Kinase 6 (CDK6) in Pulmonary Smooth Muscle Cells. Med Sci Monit 2016; 22:4066-72.

76. Hornbeck PV, Zhang B, Murray B, Kornhauser JM, Latham V, Skrzypek E. PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res 2015; 43:D512-20.

77. Lawrence RT, Searle BC, Llovet A, Villen J. Plug-and-play analysis of the human phosphoproteome by targeted high-resolution mass spectrometry. Nat Methods 2016; 13:431-4.

78. Martens L, Vizcaíno JA. A Golden Age for Working with Public Proteomics Data. Trends Biochem Sci 2017; 42:333-41.

79. A home for raw proteomics data. Nat Methods 2012; 9:419.

80. Huttlin EL, Jedrychowski MP, Elias JE, Goswami T, Rad R, Beausoleil SA, et al. A tissue-specific atlas of mouse protein phosphorylation and expression. Cell 2010; 143:1174-89.

81. Matic I, Ahel I, Hay RT. Reanalysis of phosphoproteomics data uncovers ADP-ribosylation sites. Nat Methods 2012; 9:771-2.

82. Laing S, Unger M, Koch-Nolte F, Haag F. ADP-ribosylation of arginine. Amino Acids 2011; 41:257-69.


Supplementary Figures

Figure E3.1: HILIC chromatograms of mono-phosphorylated peptide populations with fractions denoted. The x-axis and y-axis depict time (min) and intensity (mAU) respectively. 4wk membrane (A) & soluble (B), 6wk membrane (C) & soluble (D), 8wk membrane (E) & soluble (F), 12wk membrane (G) & soluble (H). Each fraction was subjected to a MS gradient of 160 min.


Figure E3.2: Histograms displaying distribution of membrane-enriched unique phosphopeptides iTRAQ quantification ratios. Ratios were generated by dividing individual smoke exposed mice and control mice by the average of the control mice. The x-axis and y-axis depict ratio (log2) and number of proteins in each bin respectively. 4 weeks (A), 6 weeks (B), 8 weeks (C) and 12 weeks (D). MEM = membrane
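The ratio calculation described in this caption can be sketched as follows. This is a minimal illustration only, not the processing pipeline used in the thesis: the function name and the reporter intensities are hypothetical, and a single phosphopeptide is assumed to have one intensity value per mouse.

```python
import math
from statistics import mean


def log2_ratios_to_control_mean(control, smoke):
    """Divide each mouse's intensity by the mean control intensity, then take log2.

    Both control and smoke-exposed animals are normalised to the same
    control average, as described in the figure caption.
    """
    control_mean = mean(control)
    if control_mean <= 0:
        raise ValueError("control mean must be positive to form a ratio")

    def to_log2(intensity):
        return math.log2(intensity / control_mean)

    return [to_log2(x) for x in control], [to_log2(x) for x in smoke]


# Hypothetical intensities for one phosphopeptide (illustrative values only).
control_intensities = [100.0, 120.0, 80.0]
smoke_intensities = [200.0, 50.0, 400.0]

control_ratios, smoke_ratios = log2_ratios_to_control_mean(
    control_intensities, smoke_intensities
)
print(control_ratios)
print(smoke_ratios)
```

On a log2 scale a value of 0 means no change relative to the control average, so the histograms would be expected to centre near 0 when most phosphopeptides are unchanged.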


Figure E3.3: Histograms displaying distribution of soluble unique phosphopeptides iTRAQ quantification ratios. Ratios were generated by dividing individual smoke exposed mice and control mice by the average of the control mice. The x-axis and y-axis depict ratio (log2) and number of proteins in each bin respectively. 4 weeks (A), 6 weeks (B), 8 weeks (C) and 12 weeks (D). SOL = soluble

Figure E3.4: Venn Diagram of mapped proteins for membrane-enriched and soluble dysregulated unique phosphopeptides. Total dysregulated mapped proteins for each time point (A), only decreased phosphorylation (B) and increased phosphorylation (C).


Supplementary Tables

TABLE E3.1 Phosphopeptide enrichment table

#Phosphopeptides (Unique + FDR ≤0.01)

Timepoint       Non-phosphorylated   Phosphorylated   Phosphopeptide enrichment
4wk Membrane    2,716                8,122            75%
4wk Soluble     3,217                12,264           79%
6wk Membrane    3,291                13,379           80%
6wk Soluble     2,098                7,658            78%
8wk Membrane    1,103                4,732            81%
8wk Soluble     1,530                7,279            83%
12wk Membrane   2,836                11,192           80%
12wk Soluble    2,406                5,956            71%
Median                                                80%
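The enrichment percentages in Table E3.1 are consistent with the share of phosphorylated peptides among all identified peptides, although the formula is not stated alongside the table. The sketch below recomputes them under that assumption; the counts are taken from the table, while the function and variable names are hypothetical.

```python
from statistics import median

# (non-phosphorylated, phosphorylated) unique peptide counts from Table E3.1
COUNTS = {
    "4wk Membrane": (2716, 8122),
    "4wk Soluble": (3217, 12264),
    "6wk Membrane": (3291, 13379),
    "6wk Soluble": (2098, 7658),
    "8wk Membrane": (1103, 4732),
    "8wk Soluble": (1530, 7279),
    "12wk Membrane": (2836, 11192),
    "12wk Soluble": (2406, 5956),
}


def enrichment_percent(non_phospho, phospho):
    """Assumed formula: phosphorylated / (phosphorylated + non-phosphorylated)."""
    total = non_phospho + phospho
    if total == 0:
        raise ValueError("no peptides identified")
    return 100.0 * phospho / total


enrichment = {sample: enrichment_percent(*counts) for sample, counts in COUNTS.items()}
median_enrichment = median(enrichment.values())

for sample, pct in enrichment.items():
    print(f"{sample}: {pct:.0f}%")
print(f"Median: {median_enrichment:.1f}%")
```

Taking the median rather than the mean keeps the summary robust to a single low-enrichment sample such as the 12 week soluble fraction.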

TABLE E3.2 Mapped phosphoprotein identifications summary

#Phosphoproteins (unique protein accessions)

Time point   Membrane   Soluble   Time point overlap
4wk          2,589      3,215     3,557
6wk          3,358      2,230     3,596
8wk          1,734      2,077     2,500
12wk         2,915      1,943     3,156

Timecourse overlap: 4,392


TABLE E3.3 IPA downstream molecular functions analysis of membrane-enriched profiles: Highest Significance

Time point  Biological Function           -log (p-value)  #Annotated Proteins
4 weeks     Organization of cytoskeleton  32.76           312
4 weeks     Organization of cytoplasm     32.56           329
4 weeks     Microtubule dynamics          24.61           263
4 weeks     Cell movement                 23.72           404
4 weeks     Migration of cells            23.26           357
6 weeks     Organization of cytoplasm     35.41           410
6 weeks     Organization of cytoskeleton  34.51           385
6 weeks     Cell movement                 27.93           519
6 weeks     Migration of cells            26.04           453
6 weeks     Microtubule dynamics          25.67           325
8 weeks     Organization of cytoskeleton  17.42           189
8 weeks     Organization of cytoplasm     16.52           197
8 weeks     Cell movement                 13.67           250
8 weeks     Migration of cells            12.65           218
8 weeks     Microtubule dynamics          11.80           155
12 weeks    Organization of cytoplasm     31.36           355
12 weeks    Organization of cytoskeleton  30.76           334
12 weeks    Cell movement                 23.80           444
12 weeks    Microtubule dynamics          22.04           279
12 weeks    Migration of cells            20.64           382

TABLE E3.4 IPA downstream molecular functions analysis of membrane-enriched profiles: Highest #Annotated Proteins

Time point  Biological Function   -log (p-value)  #Annotated Proteins
4 weeks     Apoptosis             21.07           444
4 weeks     Necrosis              20.60           435
4 weeks     Expression of RNA     11.97           354
4 weeks     Transcription of RNA  12.47           285
4 weeks     Cellular homeostasis  9.75            280
6 weeks     Apoptosis             22.50           563
6 weeks     Necrosis              20.23           544
6 weeks     Expression of RNA     15.77           467
6 weeks     Transcription of RNA  17.54           380
6 weeks     Cellular homeostasis  12.72           369
8 weeks     Apoptosis             8.36            257
8 weeks     Necrosis              8.22            252
8 weeks     Expression of RNA     9.63            232
8 weeks     Transcription of RNA  9.03            184
8 weeks     Cellular homeostasis  5.70            173
12 weeks    Apoptosis             17.57           474
12 weeks    Necrosis              15.36           456
12 weeks    Expression of RNA     12.68           395
12 weeks    Transcription of RNA  15.74           328
12 weeks    Cellular homeostasis  12.55           323


TABLE E3.5 IPA downstream molecular functions analysis of soluble profiles: Highest Significance

Time point  Biological Function                       -log (p-value)  #Annotated Proteins
4 weeks     Organization of cytoskeleton              30.80           358
4 weeks     Organization of cytoplasm                 30.12           377
4 weeks     Apoptosis                                 28.68           559
4 weeks     Cell movement of connective tissue cells  26.30           88
4 weeks     Cell movement of fibroblasts              24.49           77
6 weeks     Organization of cytoplasm                 32.66           289
6 weeks     Organization of cytoskeleton              32.63           274
6 weeks     Cell movement                             27.18           361
6 weeks     Microtubule dynamics                      24.78           231
6 weeks     Migration of cells                        23.89           312
8 weeks     Organization of cytoskeleton              29.04           245
8 weeks     Organization of cytoplasm                 28.92           258
8 weeks     Microtubule dynamics                      20.77           203
8 weeks     Organization of actin cytoskeleton        20.59           62
8 weeks     Cell movement of fibroblasts              18.29           53
12 weeks    Organization of cytoskeleton              30.63           253
12 weeks    Organization of cytoplasm                 29.95           265
12 weeks    Formation of cytoskeleton                 23.02           77
12 weeks    Development of cytoplasm                  22.85           90
12 weeks    Fibrogenesis                              20.99           85

TABLE E3.6 IPA downstream molecular functions analysis of soluble profiles: Highest #Annotated Proteins

Time point  Biological Function   -log (p-value)  #Annotated Proteins
4 weeks     Necrosis              23.43           530
4 weeks     Expression of RNA     18.86           457
4 weeks     Transcription         18.58           435
4 weeks     Transcription of RNA  22.25           379
4 weeks     Cellular homeostasis  12.43           349
6 weeks     Apoptosis             17.25           368
6 weeks     Necrosis              15.53           355
6 weeks     Expression of RNA     17.68           327
6 weeks     Transcription         15.42           304
6 weeks     Transcription of RNA  20.18           273
8 weeks     Apoptosis             12.27           315
8 weeks     Cell movement         19.81           308
8 weeks     Necrosis              11.87           308
8 weeks     Expression of RNA     15.53           291
8 weeks     Transcription         13.17           269
12 weeks    Cell movement         22.84           324
12 weeks    Apoptosis             12.20           321
12 weeks    Necrosis              11.19           311
12 weeks    Migration of cells    20.42           281
12 weeks    Expression of RNA     11.10           278


TABLE E3.7 IPA upstream regulators analysis of membrane-enriched profiles: Highest Significance

Time point  Upstream regulator  Molecule Type            -log (p-value)  #Annotated Protein Targets
4 weeks     TP53                transcription regulator  8.53            199
4 weeks     FOS                 transcription regulator  6.58            61
4 weeks     ALKBH5              enzyme                   6.23            29
4 weeks     MKNK1               kinase                   5.72            30
4 weeks     MIR17HG             other                    4.74            31
6 weeks     TP53                transcription regulator  6.46            239
6 weeks     MKNK1               kinase                   5.43            35
6 weeks     ALKBH5              enzyme                   5.14            32
6 weeks     LIMS2               other                    4.55            6
6 weeks     CD44                other                    4.46            39
8 weeks     LIMS2               other                    4.69            5
8 weeks     LIMS1               other                    4.18            5
8 weeks     TP53                transcription regulator  4.14            118
8 weeks     ALKBH5              enzyme                   3.83            18
8 weeks     FOS                 transcription regulator  3.79            37
12 weeks    FOS                 transcription regulator  6.84            67
12 weeks    MKNK1               kinase                   5.61            32
12 weeks    TP53                transcription regulator  5.27            202
12 weeks    ALKBH5              enzyme                   5.16            29
12 weeks    LIMS2               other                    4.98            6

TABLE E3.8 IPA upstream regulators analysis of membrane-enriched profiles: Highest #Annotated Proteins

Time point  Upstream regulator  Molecule Type            -log (p-value)  #Annotated Protein Targets
4 weeks     IL4                 cytokine                 2.11            69
4 weeks     TCF7L2              transcription regulator  3.27            58
4 weeks     EP300               transcription regulator  2.54            45
4 weeks     DMD                 other                    2.56            43
4 weeks     PKD1                ion channel              3.18            41
6 weeks     TCF7L2              transcription regulator  3.16            72
6 weeks     FOS                 transcription regulator  4.35            67
6 weeks     EP300               transcription regulator  2.46            56
6 weeks     DMD                 other                    2.84            55
6 weeks     PKD1                ion channel              2.27            47
8 weeks     IL4                 cytokine                 2.22            47
8 weeks     PKD1                ion channel              2.53            27
8 weeks     MKNK1               kinase                   2.82            17
8 weeks     SP1                 transcription regulator  2.42            16
8 weeks     Bvht                other                    3.01            15
12 weeks    IL4                 cytokine                 2.55            79
12 weeks    FOS                 transcription regulator  6.84            67
12 weeks    TCF7L2              transcription regulator  3.82            66
12 weeks    EP300               transcription regulator  4.83            58
12 weeks    CREBBP              transcription regulator  2.45            56


TABLE E3.9 IPA upstream regulators analysis of soluble profiles: Highest Significance

Time point  Upstream regulator  Molecule Type            -log (p-value)  #Annotated Protein Targets
4 weeks     TP53                transcription regulator  8.81            239
4 weeks     ALKBH5              enzyme                   6.78            34
4 weeks     FOS                 transcription regulator  5.24            67
4 weeks     MKNK1               kinase                   5.12            33
4 weeks     EBF1                transcription regulator  4.81            39
6 weeks     FOS                 transcription regulator  5.23            50
6 weeks     MKNK1               kinase                   4.78            25
6 weeks     ALKBH5              enzyme                   4.58            23
6 weeks     LIMS2               other                    4.08            5
6 weeks     EBF1                transcription regulator  4.01            28
8 weeks     MKNK1               kinase                   5.12            24
8 weeks     FOS                 transcription regulator  4.82            45
8 weeks     LIMS2               other                    4.32            5
8 weeks     ALKBH5              enzyme                   3.84            20
8 weeks     LIMS1               other                    3.82            5
12 weeks    FOS                 transcription regulator  6.80            51
12 weeks    TP53                transcription regulator  5.52            146
12 weeks    ALKBH5              enzyme                   5.24            23
12 weeks    MKNK1               kinase                   4.96            24
12 weeks    IPMK                kinase                   4.94            10

TABLE E3.10 IPA upstream regulators analysis of soluble profiles: Highest #Annotated Proteins

Time point  Upstream regulator  Molecule Type            -log (p-value)  #Annotated Protein Targets
4 weeks     IL4                 cytokine                 2.38            85
4 weeks     TCF7L2              transcription regulator  4.48            74
4 weeks     CREBBP              transcription regulator  2.29            60
4 weeks     EP300               transcription regulator  3.59            58
4 weeks     MTOR                kinase                   2.90            42
6 weeks     TP53                transcription regulator  3.83            148
6 weeks     IL4                 cytokine                 2.31            60
6 weeks     TCF7L2              transcription regulator  3.49            51
6 weeks     EP300               transcription regulator  2.83            40
6 weeks     PKD1                ion channel              2.15            32
8 weeks     TP53                transcription regulator  2.05            122
8 weeks     IL4                 cytokine                 2.81            57
8 weeks     TCF7L2              transcription regulator  2.77            44
8 weeks     CREBBP              transcription regulator  2.70            41
8 weeks     EP300               transcription regulator  3.51            39
12 weeks    IL4                 cytokine                 3.49            61
12 weeks    TCF7L2              transcription regulator  3.63            48
12 weeks    EP300               transcription regulator  3.63            40
12 weeks    CREBBP              transcription regulator  2.07            39
12 weeks    DMD                 other                    2.67            35


TABLE E3.11 IPA pathway analysis of membrane-enriched profiles: Highest Significance

Time point  Pathway                                              -log (p-value)  #Annotated Proteins
4 weeks     Sertoli Cell-Sertoli Cell Junction Signaling         19.8            72
4 weeks     Protein Kinase A Signaling                           17.1            111
4 weeks     Nitric Oxide Signaling in the Cardiovascular System  15.4            49
4 weeks     Phospholipase C Signaling                            15.2            76
4 weeks     Renin-Angiotensin Signaling                          15.2            53
6 weeks     Renin-Angiotensin Signaling                          19.2            67
6 weeks     Sertoli Cell-Sertoli Cell Junction Signaling         18.3            81
6 weeks     Nitric Oxide Signaling in the Cardiovascular System  17.4            59
6 weeks     Apelin Endothelial Signaling Pathway                 17              63
6 weeks     Insulin Receptor Signaling                           16.8            69
8 weeks     Sertoli Cell-Sertoli Cell Junction Signaling         12.9            47
8 weeks     Protein Kinase A Signaling                           10.1            69
8 weeks     Phospholipase C Signaling                            9.37            48
8 weeks     Leukocyte Extravasation Signaling                    8.72            44
8 weeks     ILK Signaling                                        8               40
12 weeks    Sertoli Cell-Sertoli Cell Junction Signaling         17.5            73
12 weeks    Protein Kinase A Signaling                           15.9            117
12 weeks    Renin-Angiotensin Signaling                          15.6            57
12 weeks    Leukocyte Extravasation Signaling                    15.4            77
12 weeks    Phospholipase C Signaling                            14.9            81

TABLE E3.12 IPA pathway analysis of membrane-enriched profiles: Highest #Annotated Proteins

Time point  Pathway                            -log (p-value)  #Annotated Proteins
4 weeks     Molecular Mechanisms of Cancer     14.3            106
4 weeks     Axonal Guidance Signaling          7.11            103
4 weeks     Signaling by Rho Family GTPases    14.7            79
4 weeks     Glucocorticoid Receptor Signaling  7.57            78
4 weeks     Integrin Signaling                 13.6            70
6 weeks     Protein Kinase A Signaling         15.9            130
6 weeks     Molecular Mechanisms of Cancer     14.8            129
6 weeks     Axonal Guidance Signaling          5.73            122
6 weeks     Glucocorticoid Receptor Signaling  8.08            97
6 weeks     Phospholipase C Signaling          15.9            91
8 weeks     Molecular Mechanisms of Cancer     7.4             63
8 weeks     Axonal Guidance Signaling          2.93            58
8 weeks     Glucocorticoid Receptor Signaling  3.6             45
8 weeks     Signaling by Rho Family GTPases    6.46            44
8 weeks     Actin Cytoskeleton Signaling       7.28            43
12 weeks    Molecular Mechanisms of Cancer     13.6            113
12 weeks    Axonal Guidance Signaling          5.85            108
12 weeks    Glucocorticoid Receptor Signaling  6.92            83
12 weeks    Signaling by Rho Family GTPases    13.3            82
12 weeks    Thrombin Signaling                 13.2            73


TABLE E3.13 IPA pathway analysis of soluble profiles: Highest Significance

Time point  Pathway                                       -log (p-value)  #Annotated Proteins
4 weeks     Insulin Receptor Signaling                    23.5            76
4 weeks     Sertoli Cell-Sertoli Cell Junction Signaling  22.8            85
4 weeks     Protein Kinase A Signaling                    21.6            137
4 weeks     Renin-Angiotensin Signaling                   21.5            68
4 weeks     14-3-3-mediated Signaling                     19.3            68
6 weeks     Insulin Receptor Signaling                    19.6            58
6 weeks     Protein Kinase A Signaling                    17.7            100
6 weeks     Phospholipase C Signaling                     17.4            72
6 weeks     Molecular Mechanisms of Cancer                15.7            97
6 weeks     Renin-Angiotensin Signaling                   15              48
8 weeks     Insulin Receptor Signaling                    16.7            51
8 weeks     Sertoli Cell-Sertoli Cell Junction Signaling  16.5            57
8 weeks     Actin Cytoskeleton Signaling                  15.7            64
8 weeks     Protein Kinase A Signaling                    15.6            89
8 weeks     Leukocyte Extravasation Signaling             13.7            58
12 weeks    Protein Kinase A Signaling                    22.2            103
12 weeks    Sertoli Cell-Sertoli Cell Junction Signaling  16.7            58
12 weeks    Phospholipase C Signaling                     15.3            65
12 weeks    Signaling by Rho Family GTPases               15.2            68
12 weeks    Neuregulin Signaling                          14.9            38

TABLE E3.14 IPA pathway analysis of soluble profiles: Highest #Annotated Proteins

Time point  Pathway                                -log (p-value)  #Annotated Proteins
4 weeks     Molecular Mechanisms of Cancer         17.1            129
4 weeks     Axonal Guidance Signaling              5.49            115
4 weeks     Glucocorticoid Receptor Signaling      11.9            103
4 weeks     Signaling by Rho Family GTPases        17.8            96
4 weeks     Phospholipase C Signaling              17.7            91
6 weeks     Axonal Guidance Signaling              4.75            81
6 weeks     Glucocorticoid Receptor Signaling      8.54            71
6 weeks     Signaling by Rho Family GTPases        12.7            67
6 weeks     Actin Cytoskeleton Signaling           12.8            63
6 weeks     ERK/MAPK Signaling                     14.9            62
8 weeks     Molecular Mechanisms of Cancer         11.4            81
8 weeks     Axonal Guidance Signaling              5.8             78
8 weeks     Signaling by Rho Family GTPases        13.5            64
8 weeks     Glucocorticoid Receptor Signaling      6.04            59
8 weeks     Phospholipase C Signaling              11.9            58
12 weeks    Molecular Mechanisms of Cancer         13.5            87
12 weeks    Axonal Guidance Signaling              4.36            74
12 weeks    Glucocorticoid Receptor Signaling      9.01            68
12 weeks    Opioid Signaling Pathway               11.6            60
12 weeks    Breast Cancer Regulation by Stathmin1  14.1            59


TABLE E3.15 Mapped clinically approved drugs

| Drug name | Major disease context | Associated conditions |
|---|---|---|
| Abemaciclib | Breast cancer | Advanced or metastatic breast cancer |
| Afatinib | Lung cancer | Metastatic Non-Small Cell Lung Cancer; Refractory, metastatic squamous cell Non-small cell lung cancer |
| Axitinib | Kidney cell cancer | Severe Aplastic Anemia (SAA); Advanced Thyroid cancer; Refractory Aplastic anemia |
| Bosutinib | Leukaemia | Refractory, accelerated phase Chronic myelogenous leukemia; Refractory, blast phase Chronic myelogenous leukemia; Refractory, chronic phase Chronic myelogenous leukemia |
| Brigatinib | Lung cancer | Metastatic Non-Small Cell Lung Cancer |
| Dabrafenib | Melanoma | Metastatic Melanoma; Unresectable Melanoma |
| Dasatinib | Leukaemia | Acute Lymphoblastic Leukaemias (ALL); Chronic Myeloid Leukemia (CML) |
| Erlotinib | Lung cancer | Locally Advanced Non-Small Cell Lung Cancer; Locally Advanced Pancreatic Cancer; Lung Cancer Non-Small Cell Cancer (NSCLC); Metastatic Non-Small Cell Lung Cancer; Pancreatic Cancer Metastatic |
| Gefitinib | Lung cancer | Metastatic Non-Small Cell Lung Cancer |
| Imatinib | Leukaemia | Chordomas; Chronic Eosinophilic Leukemia (CEL); Chronic Myeloid Leukemia (CML); Desmoid Tumors; FIP1L1-PDGFRα fusion kinase status unknown Chronic eosinophilic leukemia; FIP1L1-PDGFRα fusion kinase status unknown Hypereosinophilic syndrome; Hypereosinophilic Syndromes; Metastatic Gastrointestinal Stromal Tumor; Metastatic Melanoma; Myelodysplastic Syndromes; Myeloproliferative Disorders; Refractory Acute Lymphoblastic Leukemia; CKit mutational status unknown Aggressive systemic mastocytosis; Metastatic Dermatofibrosarcoma protuberans; Newly diagnosed Acute Lymphoblastic Leukaemia; Newly diagnosed, chronic phase Chronic myeloid leukemia; Recurrent Dermatofibrosarcoma protuberans; Refractory, accelerated phase Chronic myeloid leukemia; Refractory, blast crisis Chronic myeloid leukemia; Refractory, chronic phase Chronic myeloid leukemia; Systemic mastocytosis with associated hematological neoplasm; Unresectable Gastrointestinal stromal tumor |
| Lapatinib | Breast cancer | Metastatic Breast Cancer (MBC); Refractory, advanced Breast cancer; Refractory, metastatic Breast cancer |
| Lenvatinib | Thyroid cancer | Advanced Renal Cell Carcinoma; Locally recurrent radioactive iodine-refractory Thyroid cancer; Metastatic radioactive iodine-refractory Thyroid cancer; Progressive radioactive iodine-refractory Thyroid cancer |
| Midostaurin | Leukaemia | Leukemia Acute Myeloid Leukemia (AML); Malignant mast cell neoplasm; Systemic Mastocytosis; Systemic mastocytosis with associated hematological neoplasm |
| Neratinib | Breast cancer | Breast Cancer |


TABLE E3.15 Continued

| Drug name | Major disease context | Associated conditions |
|---|---|---|
| Nilotinib | Leukaemia | Chronic Phase Chronic Myeloid Leukemia; Refractory Gastrointestinal stromal tumor; Refractory, accelerated phase Chronic myeloid leukemia |
| Nintedanib | Idiopathic pulmonary fibrosis | Idiopathic pulmonary fibrosis |
| Osimertinib | Lung cancer | Metastatic Non-Small Cell Lung Cancer |
| Palbociclib | Breast cancer | Advanced Breast Cancer; Metastatic Breast Cancer (MBC); Refractory, advanced Breast cancer; Refractory, metastatic Breast cancer |
| Pazopanib | Kidney cancer and sarcoma | Advanced Renal Cell Carcinoma; Advanced Soft Tissue Sarcoma; Advanced Thyroid cancer |
| Ponatinib | Leukaemia | Accelerated phase chronic myelogenous leukemia; Acute Lymphoblastic Leukaemias (ALL); Chronic Phase Chronic Myeloid Leukemia; Blast phase Chronic myelocytic leukemia |
| Regorafenib | Gastrointestinal cancer | Metastatic Gastrointestinal Stromal Tumor; Locally advanced Gastrointestinal stromal tumor; Refractory, metastatic Colorectal cancer; Unresectable Gastrointestinal stromal tumor |
| Ribociclib | Breast cancer | Advanced Breast Cancer; Metastatic Breast Cancer (MBC) |
| Sorafenib | Liver and kidney cancer | Advanced Renal Cell Carcinoma; Gastrointestinal Stromal Tumors; Hemangiosarcoma; Unresectable Hepatocellular Carcinoma; Locally recurrent refractory to radioactive iodine treatment Thyroid carcinoma; Metastatic refractory to radioactive iodine treatment Thyroid carcinoma |
| Sunitinib | Kidney and gastrointestinal cancer | Advanced Renal Cell Carcinoma; Soft Tissue Sarcoma (STS); Thyroid Cancers; Metastatic Pancreatic Neuroendocrine Tumors; Refractory Gastrointestinal stromal tumor; Unresectable, locally advanced Pancreatic Neuroendocrine Tumors |
| Tofacitinib | Rheumatoid arthritis | Moderate Rheumatoid arthritis; Severe Rheumatoid arthritis |
| Vandetanib | Thyroid cancer | Metastatic Medullary Thyroid Cancer; Locally advanced Medullary thyroid cancer |
| Vemurafenib | Melanoma and Erdheim-Chester disease | Metastatic Melanoma; Unresectable Melanoma; Refractory Erdheim-Chester disease; Refractory Non-small cell lung cancer |


Chapter 4

Proteomic profiling of OCT-embedded COPD endobronchial biopsies

Authors: David A. Skerrett-Byrne,1,2 Heather C. Murray,1,3 M. Fairuz B. Jamaluddin,1,3 Brett Nixon,1,4 Elizabeth G. Bromfield,1,4 Peter A.B. Wark1,3, Rodney J. Scott,1,4 Matthew D. Dun,1,3,5# and Philip M. Hansbro1,2,5#

Affiliations: 1 School of Biomedical Sciences and Pharmacy, University of Newcastle, Callaghan, NSW, Australia, 2 Hunter Medical Research Institute, VIVA Program, Newcastle, NSW, Australia, 3 Hunter Medical Research Institute, Cancer Research Program, Newcastle, NSW, Australia, 4 Priority Research Centre for Reproductive Science, School of Environmental and Life Sciences, University of Newcastle, Callaghan, NSW 2308, Australia, 5 Centre of Inflammation, Centenary Institute, and University of Sydney, Sydney, NSW, Australia; # Authors contributed equally

Aspects of this manuscript are in preparation for submission to The Journal of Allergy and Clinical Immunology


Chapter 4: Overview

In the final results chapter of this thesis we have used the proteomic tools optimised in our mouse model of COPD and applied them to human endobronchial biopsies from two patient cohorts suffering from mild-moderate and severe-very severe COPD. Additionally, we have compared the proteome of these two cohorts to those of healthy control individuals and smokers that do not have COPD (‘healthy smokers’).

Using label-free quantitative proteomics we have characterised 1,901 unique proteins from these four cohorts. Further, we have identified a total of 51, 65, and 51 dysregulated proteins associated with the healthy smoker, mild-moderate COPD, and severe-very severe COPD cohorts, respectively, compared to their healthy control counterparts. This has allowed us to observe several protein changes occurring in healthy smokers and in the mild-moderate COPD cohort that may precede the later development of severe-very severe COPD symptoms. The identification of these dysregulated proteins in human patients provides the impetus to perform mechanistic studies to establish their contribution to COPD pathogenesis.


4.1 Introduction

Chronic obstructive pulmonary disease (COPD) is the third leading cause of death globally and an ever-increasing socioeconomic burden worldwide.1, 2 It is an umbrella term for complex heterogeneous respiratory disorders characterised by chronic pulmonary inflammation in response to prolonged exposure to noxious stimuli that promotes progressive thickening and narrowing of the airway with destruction of the lung parenchyma.3 Atrophy or enlargement of the alveoli leads to rapid declines in lung function and severe breathing difficulties, reducing quality of life and causing death due to asphyxia, immune system dysfunction, sepsis and acute lung infections.2, 4, 5

The severity of COPD is classified according to the Global Initiative for Chronic Obstructive Lung Disease (GOLD) guidelines; from mild-moderate to severe-very severe, GOLD stage 1 to 4.2

The leading cause of COPD is tobacco smoking, which is associated with >80% of all diagnoses.2, 6 Despite this overwhelming evidence, smoking rates continue to rise in developing countries,7, 8 and the prevalence of COPD is projected to increase in these areas as a result of the aging population.7 Furthermore, air pollution, environmental smoke exposure and genetic factors are also known to cause COPD in non-smoking patients.7, 8 These issues highlight the need to develop improved treatments and management strategies. Currently, treatment for COPD is limited to long-acting muscarinic antagonists and glucocorticoids, which only provide symptomatic relief and fail to halt the progression of the disease.9-11 Most emerging pharmaceutical therapies aim to target proinflammatory cytokines and chemokines that are known to be associated with inflammation and COPD presentation.12 Early phase II clinical trials using the cytokine receptor CXCR2 antagonist, MK-7123, which aims to reduce neutrophil chemotaxis, showed significant improvements of FEV1 in patients with moderate to severe-very severe COPD.13 The oxidative microenvironment of lung tissues is significantly altered during smoking and in COPD, and attempts have been made to inhibit the excess production of reactive oxygen species (ROS) as a therapeutic strategy, owing to the contribution of ROS to airspace epithelial injury.14

Modulation of mucus hypersecretion has also been trialled.15 Despite this progress, there are currently no effective treatments that halt the progression of COPD, which is due to the lack of understanding of the molecular drivers of pathogenesis. In the present study, we seek to address this knowledge gap by optimising methods for the study of COPD-induced changes to lung tissue using comparative and quantitative proteomics.

Advancements made in mass spectrometry (MS)-based proteomics have allowed for high-resolution comparative and quantitative proteomic analysis of complex samples in a relatively short timeframe.16, 17 These advances can be applied to elucidate the dysregulated proteins potentially underpinning the pathogenesis of COPD, and help identify aberrantly regulated pathways, novel biomarkers of disease progression or distinguish subtypes within current classification systems. There is a plethora of clinically relevant biospecimens available for COPD patients, all providing different and important information about how the disease develops and progresses. Among the most interesting biospecimens are lung biopsies, which provide a snapshot of the physical damage and remodelling of the lung tissue. Ideally, proteomics would be performed using fresh tissue samples, typically lysed and homogenised upon collection to extract proteins in relative real time. However, biopsies are more commonly collected with the intention of histological analysis, requiring the tissue structures to be preserved.


Formalin fixation and paraffin embedding (FFPE) is the routine choice for this procedure and allows for long-term storage at room temperature. However, this approach raises a number of challenges for protein analysis, such as the denaturation of proteins due to heat exposure, crosslinking events induced by formalin fixation and the possibility of post-translational modifications (PTMs) being masked.18 To overcome these issues, we have used the cryo-preservative optimal cutting temperature (OCT) compound, an emerging alternative to FFPE which involves no cross-linking and may allow a better representation of the proteome.19 The OCT composition includes two polymers, polyethylene glycol (PEG) and polyvinyl alcohol (PVA), which themselves raise a challenge for MS-based proteomics.

A simple extraction method was recently developed to overcome these issues by removing these polymers early in the sample preparation.20 Zhao et al. demonstrated that, with the addition of a trichloroacetic acid (TCA)-based protein precipitation step, >5,400 proteins were reproducibly quantified in OCT-embedded core needle biopsies from non-small cell lung cancer patients.20 For these reasons, we sought to incorporate this extraction step into our established proteomic sample preparation protocol,21-24 on OCT-embedded human patient endobronchial biopsies. Thus, this study focused on both the optimisation of protocols for future use in the study of endobronchial biopsies and the elucidation of proteins and pathways aberrantly regulated during the onset and later stages of COPD.


4.2 Methods

4.2.1 Subjects and sampling

All experiments were conducted in accordance with Hunter New England Area Health Service Ethics Committee (05/08/10/3.09) and University of Newcastle Safety Committee approvals. All subjects underwent a fibreoptic bronchoscopy in accordance with standard guidelines at John Hunter Hospital. The bronchoscope was inserted into the 3rd-4th generation airway of the subject and bronchial biopsies were then obtained using biopsy forceps applied under direct vision. Endobronchial biopsies were embedded in optimal cutting temperature (OCT) compound (ProSci Tech IA018), a cryo-preservative, and stored at -80ºC until later proteomic work.

The study cohort comprised 24 subjects split into four cohorts: healthy controls (n=6), healthy smokers (n=6), mild-moderate COPD (n=6), and severe-very severe COPD (n=6). However, healthy smoker patient six was removed from the cohort because insufficient material was obtained from this patient, leading to a negative protein quantitation and an inability to perform equivalent proteomic techniques. The specific steps where this sample was removed are clearly outlined in the results section. COPD patients were stratified into mild-moderate and severe-very severe based on GOLD criteria,2 the number of frequent acute exacerbations (FAE), and the COPD assessment test (CAT) score. The clinical characteristics of the study subjects are summarised in Table 4.1.


TABLE 4.1 Clinical characteristics of the study subjects.

| Proteomic Analysis | Healthy Controls (n=6) | Healthy Smokers (n=6) | Mild-moderate COPD (n=6) | Severe-very severe COPD (n=6) |
|---|---|---|---|---|
| Age (years), mean (SD) | 55.5 (9.6) | 64.2 (10.1) | 69.0 (8.4) | 76.7 (8.5) |
| Male (n) : Female (n) | 2 : 4 | 4 : 2 | 4 : 2 | 5 : 1 |
| FEV1 (%), mean (SD) | 97.3 (11.6) | 98.2 (13.7) | 66.7 (9.0) | 40.5 (11.1) |
| FAE | N/A | N/A | < 2 | ≥ 2 |
| CAT Score, mean (SD) | N/A | N/A | 9.6 (4.3) | 16.5 (5.6) |
| GOLD Stage | N/A | N/A | 1-2 | 3-4 |

4.2.2 Human biopsy sample preparation for proteomic analysis

Once thawed, OCT-embedded endobronchial biopsies were homogenised in 1000 µL of ice-cold 0.1 M Na2CO3 supplemented with protease (Sigma) and phosphatase inhibitors (Roche, Complete EDTA free), using the FastPrep-24™ 5G (MP Biomedical, Santa Ana, CA, USA) with the Cool Prep Adaptor at a speed of 6.5 m/s for 2 min. Samples were then sonicated for 3 x 10 s and incubated for 1 h at 4°C. OCT is composed of a number of polymers which greatly affect MS analysis; thus, a simple trichloroacetic acid (TCA) precipitation step in sample preparation was necessary to purify the proteome and obtain good coverage.20 In brief, ice-cold TCA was added to each sample to a final concentration of 20% TCA and incubated for 20 min. Following centrifugation for 20 min at 16,000 x g at 4°C, the supernatant was removed. The pellet underwent four washes with 500 µL 10% TCA followed by four washes with pre-chilled acetone. The resulting pellet was dissolved to a final concentration of 6 M urea, 2 M thiourea. Samples were quantified using a BCA protein assay kit (Pierce™ BCA Protein Assay Kit, Thermo Fisher Scientific) according to the manufacturer’s specifications. Reduction was then performed using 10 mM DTT (30 min, room temperature) and alkylation using 20 mM iodoacetamide (30 min, 55°C, in the dark). Subsequently, samples were digested with a 1:30 ratio of Lys-C/Trypsin Mix (Promega); after 3 h, the solution was diluted below 1 M urea using 50 mM triethylammonium bicarbonate (pH 7.8) and left overnight at 37°C. Lipids were precipitated from membrane peptides using formic acid. All peptide solutions were desalted and cleaned up using commercial desalting columns (Oasis, Waters). Fluorescent peptide quantification (Qubit protein assay kit, Thermo Fisher Scientific, Carlsbad, CA, USA) was then employed.
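The volumes implied by the precipitation, dilution and digestion steps follow from simple dilution arithmetic. The sketch below is illustrative only: the 100% (w/v) TCA stock, the 100 µL urea sample volume and the 150 µg protein amount are assumed values that are not stated in the protocol, and the function names are hypothetical.

```python
def stock_volume_to_add(sample_ul, stock_pct, final_pct):
    """Volume of stock (µL) to add to sample_ul so the mixture reaches final_pct.

    Solves stock_pct * v = final_pct * (sample_ul + v) for v.
    """
    if not 0 < final_pct < stock_pct:
        raise ValueError("final concentration must be positive and below the stock")
    return sample_ul * final_pct / (stock_pct - final_pct)


def diluent_to_reach(sample_ul, start_molar, target_molar):
    """Diluent volume (µL) that brings start_molar down to exactly target_molar."""
    if not 0 < target_molar <= start_molar:
        raise ValueError("target must be positive and no higher than the start")
    return sample_ul * (start_molar / target_molar - 1)


def enzyme_amount(protein_ug, ratio=30):
    """Enzyme mass (µg) for a 1:ratio enzyme-to-protein digestion."""
    return protein_ug / ratio


# Assumed 100% (w/v) TCA stock added to the 1000 µL homogenate for 20% final.
tca_ul = stock_volume_to_add(1000, stock_pct=100, final_pct=20)

# Assumed 100 µL sample in 6 M urea; more than this volume of buffer is
# needed to fall below 1 M urea.
buffer_ul = diluent_to_reach(100, start_molar=6, target_molar=1)

# Assumed 150 µg of protein digested at the stated 1:30 ratio.
enzyme_ug = enzyme_amount(150)

print(tca_ul, buffer_ul, enzyme_ug)
```

Because the protocol calls for dilution below 1 M urea, the buffer volume computed here is a lower bound rather than a target.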

4.2.3 LC-MS/MS Analysis

nLC-MS/MS was performed using a Q-Exactive Plus hybrid quadrupole-Orbitrap MS system (Thermo Fisher Scientific) coupled to a Dionex Ultimate 3000 RSLC nanoflow HPLC system (Thermo Fisher Scientific). Samples were loaded onto an Acclaim PepMap100 C18 75 μm × 20 mm trap column (Thermo Fisher Scientific) for pre-concentration and online desalting. Separation was then achieved using an EASY-Spray PepMap™ C18 75 μm × 25 mm column (Thermo Fisher Scientific). The initial 95 min optimisation gradient was composed of a stepped gradient at 300 nl/min of 5 to 22% acetonitrile over 40 min, to 35% over 20 min and finally to 90% over 5 min. The optimised 135 min gradient involved a stepped gradient at 300 nl/min of 5 to 32% acetonitrile over 95 min, to 45% over 15 min and finally to 90% over 8 min. The Q-Exactive Plus MS system was operated in full MS/data-dependent acquisition MS/MS mode. The Orbitrap mass analyzer was used at a resolution of 70,000 to acquire full MS with an m/z range of 350–2000, incorporating a target automatic gain control value of 1e6 and maximum fill times of 50 ms. The 20 most intense multiply charged precursors were selected for higher-energy collision dissociation fragmentation with a normalized collisional energy of 26 and 30. MS/MS fragments were measured at an Orbitrap resolution of 17,500 using an automatic gain control target of 5e5 and maximum fill times of 120 ms.
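The two stepped gradients can be written down as piecewise-linear ramps, which makes it easy to compare the acetonitrile percentage at any time point. This is a minimal sketch using only the ramp segments stated above; the remaining time in the 95 min and 135 min methods is not described in the text and is therefore not modelled, and the function names are hypothetical.

```python
from bisect import bisect_right


def build_gradient(start_pct, steps):
    """Turn (end_pct, duration_min) steps into (time_min, pct) breakpoints."""
    points = [(0.0, float(start_pct))]
    elapsed = 0.0
    for end_pct, duration in steps:
        elapsed += duration
        points.append((elapsed, float(end_pct)))
    return points


def percent_acetonitrile(points, t):
    """Linearly interpolate the acetonitrile percentage at time t (min)."""
    times = [time for time, _ in points]
    if t <= times[0]:
        return points[0][1]
    if t >= times[-1]:
        return points[-1][1]
    i = bisect_right(times, t)
    (t0, b0), (t1, b1) = points[i - 1], points[i]
    return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


# Ramp segments as stated in the text (start at 5% acetonitrile).
INITIAL = build_gradient(5, [(22, 40), (35, 20), (90, 5)])
OPTIMISED = build_gradient(5, [(32, 95), (45, 15), (90, 8)])

for name, gradient in (("initial", INITIAL), ("optimised", OPTIMISED)):
    ramp_end = gradient[-1][0]
    print(name, "ramp time:", ramp_end, "min;",
          "% at 20 min:", percent_acetonitrile(gradient, 20))
```

The stated ramps account for 65 min and 118 min of the two methods, respectively.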

4.2.4 Computational LC-MS/MS data analysis

Database searching of all raw files was performed using Proteome Discoverer 2.2 (Thermo Fisher Scientific). SEQUEST HT and MS Amanda 2.0 were used to search against the Uniprot human database (73,653 sequences, downloaded 17th February 2019). Database searching parameters included full tryptic digestion with up to two missed cleavages allowed, a precursor mass tolerance of 10 ppm and a fragment mass tolerance of 0.02 Da. Cysteine carbamidomethylation was set as a fixed modification, while dynamic modifications included acetylation (K), oxidation (M) and phosphorylation (S/T/Y). Interrogation of the corresponding reversed database was also performed to evaluate the false discovery rate of peptide identification using Percolator on the basis of q-values, which were estimated from the target-decoy search approach. To filter out target peptide spectrum matches over the decoy-peptide spectrum matches, a fixed false discovery rate of 1% was set at the peptide level.
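The target-decoy idea behind the q-values can be illustrated in a few lines. The sketch below is a simplified stand-in and not Percolator itself, which additionally re-scores matches with a semi-supervised classifier; the scores, the decoys/targets FDR estimate and the function name are assumptions made for illustration.

```python
def target_decoy_q_values(psms):
    """Estimate q-values from (score, is_decoy) pairs; higher score is better.

    FDR at a score threshold is estimated as decoys / targets accepted so far;
    the q-value is the lowest FDR at which a match would still be accepted.
    Returns q-values in the same order as the input.
    """
    order = sorted(range(len(psms)), key=lambda i: psms[i][0], reverse=True)
    fdrs = []
    targets = decoys = 0
    for i in order:
        if psms[i][1]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    q_values = [0.0] * len(psms)
    running_min = float("inf")
    # Walk from the worst score upwards so each q-value is monotone.
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        q_values[order[rank]] = running_min
    return q_values


# Hypothetical peptide-spectrum matches: (search score, is_decoy).
example_psms = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (6.0, True)]
example_q = target_decoy_q_values(example_psms)

# Keep target matches at a 1% threshold, mirroring the cut-off in the text.
accepted = [psm for psm, q in zip(example_psms, example_q)
            if not psm[1] and q <= 0.01]
print(example_q, accepted)
```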

4.2.5 Bioinformatic analysis and statistics

The in-built nodes of Proteome Discoverer 2.2, Minora Feature Detector and Feature Mapper, were employed for label-free quantification. Normalisation was carried out by summing the peptide abundances for each patient and determining the maximum sum across all files. The software then calculates a normalisation factor for each sample from that sample's sum and the maximum sum across all patients (Figure 4.1). Within the Proteome Discoverer 2.2 analysis, patients within their respective clinical cohorts were classified as biological replicates, which allowed the following quantification ratios to be set up: healthy smokers/healthy controls (HS/HC); mild-moderate COPD/healthy controls (MC/HC); severe-very severe COPD/healthy controls (SC/HC). An ANOVA test was carried out to calculate the p-values, and adjusted p-values were calculated using the Benjamini-Hochberg method.
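The total-abundance normalisation described above can be sketched as follows. This is an illustration rather than the vendor's implementation: it assumes the factor is the maximum sum divided by each sample's own sum, so that every sample is scaled up to the largest total, and the sample names and abundances are hypothetical.

```python
def normalisation_factors(peptide_abundances):
    """Per-sample factor = (largest total abundance) / (this sample's total).

    peptide_abundances maps a sample name to its list of peptide abundances.
    """
    totals = {sample: sum(values) for sample, values in peptide_abundances.items()}
    if any(total <= 0 for total in totals.values()):
        raise ValueError("every sample needs a positive total abundance")
    max_total = max(totals.values())
    return {sample: max_total / total for sample, total in totals.items()}


def normalise(peptide_abundances):
    """Scale every peptide abundance by its sample's normalisation factor."""
    factors = normalisation_factors(peptide_abundances)
    return {
        sample: [value * factors[sample] for value in values]
        for sample, values in peptide_abundances.items()
    }


# Hypothetical abundances for two patients.
raw = {"HC 1": [100.0, 300.0], "HS 1": [50.0, 150.0]}
factors = normalisation_factors(raw)
scaled = normalise(raw)
print(factors)
print({sample: sum(values) for sample, values in scaled.items()})
```

After scaling, every sample has the same total abundance, which is the behaviour the box plots in Figure 4.1 are intended to show.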

Protein lists were exported from Proteome Discoverer 2.2 in the form of Excel documents. Perseus data analysis software,25 version 1.6.2.3, was utilised to transform data (log2) and generate histograms and multi scatter plots (Figure 4.4; Supplementary Figures E4.6-9). All references to ratio/fold change are given in the log2 format. Volcano plots and graphs were plotted using Prism version 8.0, and Venn diagrams were made using Venny.26 Basic data handling, if not otherwise stated, was carried out using Microsoft Excel® (Version 16.0.4739, Microsoft Corporation, Redmond, WA).
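For readers who wish to reproduce the two numerical transformations used in this section outside the named software, a minimal sketch of the Benjamini-Hochberg adjustment and the log2 ratio is given below. The p-values and abundances are hypothetical and the function names are not taken from any of the packages above.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Step down from the largest p-value, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def log2_ratio(cohort_abundance, control_abundance):
    """log2 fold change of a cohort relative to the control abundance."""
    return math.log2(cohort_abundance / control_abundance)


# Hypothetical ANOVA p-values for four proteins.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
print(adj_p)

# A protein four times more abundant than in controls has a log2 ratio of 2.
print(log2_ratio(4.0, 1.0), log2_ratio(1.0, 2.0))
```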


Figure 4.1: Minora normalisation of label-free patient proteomes. Box and whisker plots of calculated abundances prior to normalisation (A) and after (B); x-axis indicates the patients and the y-axis is the log10 abundance.


4.2.6 Pathway analyses

Global proteome lists of each clinical cohort, where at least three patient quantification

values were detected, were submitted to Database for Annotation, Visualization and Integrated

Discovery (DAVID) 27, 28 for Gene Ontology (GO) analysis of enriched biological processes,

cellular components and molecular functions. Additional GO enrichment of biological processes

was carried out for the dysregulated proteins identified in each clinical cohort ratio.

Each clinical cohort ratio list (e.g. HS/HC), containing Uniprot accession numbers and transformed (log2) fold changes, was analysed using the Ingenuity® Pathway Analysis software (IPA®, Qiagen). IPA allows for the analysis, interpretation and scoring of large datasets through its algorithms combined with the Ingenuity Knowledge Base. We focused on three tools within IPA: canonical pathway analysis, upstream analysis, and disease and function analysis. Two important outputs given by IPA are the P-value and the Z-score. The P-value is an enrichment measurement of the overlapping proteins from the dataset with a particular pathway, function or regulator. The Z-score is a prediction scoring system of activation or inhibition based upon statistically significant patterns in the dataset and prior biological knowledge manually curated in the Ingenuity Knowledge Base.29 To elucidate the most significant changes in these analyses, we applied stringency criteria of -log10 p-value ≥ 2 (p-value ≤ 0.01) in each clinical cohort ratio and a Z-score of ≤ -2 (inhibited) or ≥ 2 (activated) in at least one clinical cohort ratio.30 For disease and function analysis we restricted our analysis to ‘molecular and cellular functions’ and ‘physiological system development and function’. The top five most significantly enriched canonical pathways, upstream regulators and molecular functions, as well as the most annotated proteins, were reported.
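The stringency criteria can be expressed as a simple filter over exported results. The sketch below is illustrative: the data layout, pathway names and values are hypothetical and do not correspond to an IPA export format; only the two thresholds come from the text.

```python
RATIOS = ("HS/HC", "MC/HC", "SC/HC")


def passes_stringency(entry, min_neglog_p=2.0, min_abs_z=2.0):
    """Apply the two criteria to one pathway, function or regulator.

    entry maps each cohort ratio to (neg_log10_p, z_score); z_score may be
    None when no activation prediction is available.
    """
    significant_in_all = all(entry[ratio][0] >= min_neglog_p for ratio in RATIOS)
    directional_in_any = any(
        entry[ratio][1] is not None and abs(entry[ratio][1]) >= min_abs_z
        for ratio in RATIOS
    )
    return significant_in_all and directional_in_any


# Hypothetical results for three pathways.
results = {
    "Pathway A": {"HS/HC": (3.1, 0.5), "MC/HC": (2.4, -2.2), "SC/HC": (4.0, None)},
    "Pathway B": {"HS/HC": (5.0, 1.0), "MC/HC": (4.2, -1.5), "SC/HC": (3.3, 0.2)},
    "Pathway C": {"HS/HC": (1.5, 3.0), "MC/HC": (2.8, 2.5), "SC/HC": (2.2, 2.1)},
}
selected = [name for name, entry in results.items() if passes_stringency(entry)]
print(selected)
```

In this example Pathway B is significant throughout but never reaches the Z-score threshold, and Pathway C fails the significance threshold in one ratio, so only Pathway A is retained.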


4.3 Results

4.3.1 Optimisation of protein extraction from OCT-embedded biopsies

The presence of PEG and PVA in OCT is not ideal, as they compete for binding sites during sample preparation steps and separation on a liquid chromatography column, biasing the stoichiometry of any analysis.19, 20 Additionally, these polymers suppress ion formation in MS and reduce protein identification (Supplementary Figure E4.1.A).19, 31 A TCA precipitation step offers a promising solution to overcome this challenge (Supplementary Figure E4.1.B).19, 20

As endobronchial biopsies are scarce and critically important, the effectiveness of the TCA precipitation step on OCT-embedded biopsies for MS-based proteomics was initially trialled on one biopsy. One of the major challenges of proteomics is that the amount of starting material is restricted and cannot be replicated. This highlights the importance of effective protein extraction techniques, which begin with homogenisation of the starting material. To optimise this, we adopted a more rigorous homogenisation technique than the traditional radioimmunoprecipitation assay (RIPA) buffer extraction protocol, which had yielded on average 43.71 µg per biopsy to date. As outlined in our methods (Chapters 2, 3 and 4), we utilised a benchtop automatic homogeniser, which through collisions with lysing matrix particles in tridimensional motion results in a consistent and complete lysis. Following sonication of the sample we carried out a quantitative fluorescent protein quantification, which gave a yield of 150.50 µg, 3.44-fold (344%) that of the historical RIPA-based extraction average (Table 4.2). Liquid chromatography (LC) is an important tool to deconvolute proteome complexity and improve sequencing. Due to the heterogeneity of peptide physicochemical characteristics, LC improves proteome coverage by resolving a dynamic complex peptide mixture into simpler populations.32-34 Initially, 0.7 µg of tryptic peptides were separated on a 95 min gradient fed directly into the mass spectrometer (Figure 4.2.A). This identified 227 unique peptides corresponding to 69 proteins (≥ 2 unique peptides cut-off). The MS1 spectra and 3D chromatographic map revealed a significant amount of peptides co-eluting at the later end of the LC gradient. To overcome this we employed two changes: a larger spread of the gradient time and an increase in the amount of sample injected (3 µg). Through the implementation of these changes, peptides were separated on a 135 min gradient (Figure 4.2.B), identifying 1,312 unique peptides corresponding to 589 proteins (≥ 2 unique peptides cut-off), marking a 3.5 fold increase in identifications.

TABLE 4.2 Comparison of RIPA and improved protein extraction methods.

| RIPA: Sample | Protein (µg) | Improved sample preparation: Sample | Protein (µg) |
|---|---|---|---|
| Patient Biopsy 1 | 83.21 | Mild-moderate COPD Patient 1 | 150.50 |
| Patient Biopsy 2 | 34.13 | | |
| Patient Biopsy 3 | 39.52 | | |
| Patient Biopsy 4 | 51.89 | | |
| Patient Biopsy 5 | 72.58 | | |
| Patient Biopsy 6 | 43.68 | | |
| Patient Biopsy 7 | 31.20 | | |
| Patient Biopsy 8 | 69.53 | | |
| Patient Biopsy 9 | 31.20 | | |
| Patient Biopsy 10 | 5.65 | | |
| Patient Biopsy 11 | 81.28 | | |
| Patient Biopsy 12 | 47.52 | | |
| Patient Biopsy 13 | 40.21 | | |
| Patient Biopsy 14 | 48.49 | | |
| Patient Biopsy 15 | 34.98 | | |
| Patient Biopsy 16 | 44.19 | | |
| Patient Biopsy 17 | 28.27 | | |
| Patient Biopsy 18 | 13.51 | | |
| Patient Biopsy 19 | 21.05 | | |
| Patient Biopsy 20 | 52.12 | | |

Average (RIPA): 43.71 µg. Fold increase: 344%.
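The summary values in Table 4.2 can be checked with a short calculation. The sketch below uses only the yields listed in the table; the variable names are hypothetical, and the percentage is computed as the improved yield relative to the RIPA mean.

```python
# Historical RIPA yields (µg) for Patient Biopsies 1-20, as listed in Table 4.2.
RIPA_YIELDS_UG = [
    83.21, 34.13, 39.52, 51.89, 72.58, 43.68, 31.20, 69.53, 31.20, 5.65,
    81.28, 47.52, 40.21, 48.49, 34.98, 44.19, 28.27, 13.51, 21.05, 52.12,
]
# Single biopsy processed with the improved sample preparation (µg).
IMPROVED_YIELD_UG = 150.50

mean_ripa_ug = sum(RIPA_YIELDS_UG) / len(RIPA_YIELDS_UG)
fold = IMPROVED_YIELD_UG / mean_ripa_ug
percent_of_ripa_mean = 100 * fold

print(f"RIPA mean: {mean_ripa_ug:.2f} µg")
print(f"Improved / RIPA mean: {fold:.2f}-fold ({percent_of_ripa_mean:.0f}%)")
```

Note that the comparison rests on a single biopsy processed with the improved method, so the ratio is indicative rather than a population estimate.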


Figure 4.2: Optimisation of liquid chromatography for improved protein identification. An initial 95 min gradient was used to separate the peptide population (A). Expansion of three key elution phases, marked in purple, red and green, increased the gradient time to 135 min (B). Retention time, flow and percentages of buffers are listed (C).


4.3.2 Pathway analysis for the trial dataset

Using in-built analysis tools in Proteome Discoverer 2.2, the refined dataset of 589 proteins was classified by biological process, cellular component and molecular function. On the basis of the number of annotated proteins, biological process classification returned the broad categories of response to stimulus, regulation of biological process and metabolic process (Figure 4.3.A).

Notably, enriched biological processes with relevance to the mouse model of cigarette smoke-induced COPD in Chapters 2 and 3 included cell proliferation, cell death, cellular homeostasis and defence response. Dominating cellular components in the endobronchial biopsy included extracellular, membrane, and (Figure 4.3.B). Top ranked molecular functions were identified as RNA binding, catalytic activity and protein binding, mapping to 186, 254 and 486 proteins, respectively (Figure 4.3.C).

Here, we have demonstrated the effectiveness of the TCA precipitation step in removing any interfering polymers before MS analysis and obtaining promising proteome coverage. These analyses justified the use of these experimental techniques in larger and more diverse clinical cohorts of patients.


[Figure 4.3 panels, each plotted against the number of annotated proteins. A, Biological Processes: cell growth, development, cell division, coagulation, cell communication, cell proliferation, cellular homeostasis, cell death, defense response, cellular component movement, cell differentiation, transport, cell organization and biogenesis, response to stimulus, regulation of biological process, metabolic process. B, Cellular Component: proteasome, vacuole, spliceosomal complex, Golgi, endosome, cell surface, chromosome, ribosome, endoplasmic reticulum, cytoskeleton, mitochondrion, organelle lumen, extracellular, nucleus, cytosol, cytoplasm, membrane. C, Molecular Functions: translation regulator activity, receptor activity, signal transducer activity, antioxidant activity, motor activity, transporter activity, enzyme regulator activity, DNA binding, structural molecule activity, metal ion binding, nucleotide binding, RNA binding, catalytic activity, protein binding.]

Figure 4.3: Gene ontology annotation of trial human endobronchial biopsy data. All 589 proteins (FDR ≤ 0.01) were subjected to a gene ontology analysis, classifying the most significant biological processes (A), cellular compartments (B) and molecular functions (C).

4.3.3 Proteomic profiling of OCT-embedded COPD endobronchial biopsies

As outlined in “Subjects and sampling”, our focus was on four clinical cohorts of patients, each with a sample size of six. Following the optimised sample preparation methods on the OCT-embedded endobronchial biopsies, a BCA protein assay was carried out. An average of 91.50 µg, 79.19 µg, 269.62 µg and 220.42 µg was recovered from healthy controls, healthy smokers, mild-moderate and severe-very severe COPD, respectively (Table 4.3). In light of the negative quantitation obtained for healthy smoker patient 6, this sample was removed from further experimental steps. The variation in the amount of protein obtained from different endobronchial biopsies could be due to the difficulty of obtaining a standard collection size when carried out by a clinician.

TABLE 4.3 BCA protein quantification of human endobronchial biopsies. Values are protein (µg) per subject.

| Patient | Healthy Controls (HC) | Healthy Smokers (HS) | Mild-moderate COPD (MC) | Severe-very severe COPD (SC) |
|---|---|---|---|---|
| Patient 1 | 150.44 | 64.95 | 123.86 | 260.34 |
| Patient 2 | 64.24 | 38.16 | 652.92 | 203.95 |
| Patient 3 | 60.64 | 217.24 | 251.72 | 471.90 |
| Patient 4 | 69.98 | 2.49 | 136.43 | 87.58 |
| Patient 5 | 115.60 | 168.04 | 255.32 | 197.85 |
| Patient 6 | 88.13 | -15.73 | 197.49 | 100.87 |
| Average | 91.50 | 79.19 | 269.62 | 220.42 |


After tryptic digestion of protein samples, the resulting peptide populations were desalted to remove any impurities and then quantified once more before MS analysis. An average of 100.68 µg, 85.86 µg, 137.04 µg and 125.11 µg was quantified for healthy controls, healthy smokers, mild-moderate and severe-very severe COPD, respectively (Table 4.4). Interestingly, healthy smoker patients 2 and 4 showed an increase in the amount quantified, suggesting some interference in the BCA assay which was alleviated following digestion and the desalting process.

TABLE 4.4 Qubit peptide quantification of human endobronchial biopsies. Values are protein (µg) per subject.

| Patient | Healthy Controls (HC) | Healthy Smokers (HS) | Mild-moderate COPD (MC) | Severe-very severe COPD (SC) |
|---|---|---|---|---|
| Patient 1 | 102.82 | 62.53 | 93.79 | 177.16 |
| Patient 2 | 142.07 | 54.19 | 172.99 | 110.81 |
| Patient 3 | 75.38 | 141.73 | 152.15 | 169.17 |
| Patient 4 | 101.43 | 64.61 | 109.77 | 91.36 |
| Patient 5 | 102.13 | 131.6 | 177.16 | 114.63 |
| Patient 6 | 80.21 | | 116.37 | 87.54 |
| Average | 100.68 | 85.86 | 137.04 | 125.11 |


4.3.4 Protein identification

Using our improved proteomic preparation methods we quantified 1,901 proteins (FDR ≤ 0.01) across all clinical cohorts (Table 4.5). Scatter plots and histograms were generated to evaluate the reproducibility within each cohort (Supplementary Figure E4.2-5) and the distribution of abundances (Figure 4.4), respectively. We observed a relatively high Pearson correlation between patients within each cohort, the highest average score being 0.76 in mild-moderate COPD. Filtering for proteins assigned two unique peptides or more, a total of 1,396 unique protein groups were identified. From this, we focused on proteins which had a minimum of three quant values in at least one clinical cohort, resulting in 1,016, 1,174, 1,135 and 1,121 proteins in healthy controls, healthy smokers, mild-moderate and severe-very severe COPD, respectively (Table 4.6).

TABLE 4.5 Proteomic Identification Summary

                           #Proteins
Clinical cohorts           Total IDs          Total IDs              Quant    Shared with HC
                           Unique proteins    Unique peptides ≥ 2    n ≥ 3
Healthy Control            1,566              1,290                  1,016    N/A
Healthy Smoker             1,619              1,320                  1,029    926
Mild-moderate COPD         1,674              1,346                  1,135    971
Severe-very severe COPD    1,686              1,342                  1,121    969
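
The two filtering steps described above (at least two unique peptides, then at least three quant values in at least one cohort) can be sketched in Python. This is a minimal illustration under stated assumptions, not the Proteome Discoverer workflow used in this chapter: the data layout, the function names and the toy abundances are all hypothetical.

```python
from typing import Dict, List, Optional

MIN_UNIQUE_PEPTIDES = 2  # "two unique peptides or more"
MIN_QUANT_VALUES = 3     # "at minimum three quant values"

# Hypothetical layout: quant values per cohort, None where a patient has no value.
QuantByCohort = Dict[str, List[Optional[float]]]


def n_quantified(values: List[Optional[float]]) -> int:
    """Count the patients in one cohort with a quant value for this protein."""
    return sum(value is not None for value in values)


def cohorts_passing(quant: QuantByCohort) -> List[str]:
    """Cohorts in which the protein has at least MIN_QUANT_VALUES quant values."""
    return sorted(cohort for cohort, values in quant.items()
                  if n_quantified(values) >= MIN_QUANT_VALUES)


def keep_protein(unique_peptides: int, quant: QuantByCohort) -> bool:
    """Apply both filters: peptide evidence first, then quant completeness."""
    return unique_peptides >= MIN_UNIQUE_PEPTIDES and bool(cohorts_passing(quant))


# Toy example with made-up abundances (not data from this study).
toy_quant = {
    "HC": [1.2, None, 0.9, None, None, None],  # 2 values: fails n >= 3
    "HS": [1.1, 1.4, None, 1.0, None],         # 3 values: passes
}
print(keep_protein(3, toy_quant))   # True
print(cohorts_passing(toy_quant))   # ['HS']
print(keep_protein(1, toy_quant))   # False: only one unique peptide
```

Because the second filter only requires one cohort to pass, a protein retained overall can still be missing from the per-cohort counts of the other cohorts.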

[Figure 4.4 panels: one histogram per patient, arranged in columns by cohort (HC 1-6, HS 1-5, MC 1-6, SC 1-6); x-axis label "iTRAQ abundance".]

Figure 4.4: Histogram of all patient cohorts. Distribution of quantified proteins for healthy control (HC), healthy smokers (HS), mild (MC) and severe COPD (SC); x-axis is the normalised abundance and y-axis is protein counts.

4.3.5 Gene Ontology enrichment of clinical cohorts

An initial GO enrichment analysis using DAVID27, 28 was carried out on the global proteome of each clinical cohort to gain insight into the variability of biological processes and molecular functions at play, and the distribution of cellular components linked to the proteins identified.
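
As a rough illustration of what a GO term enrichment test computes, the sketch below evaluates a one-sided hypergeometric P-value in Python. It is a simplification and an assumption on our part: DAVID reports a more conservative modified Fisher exact P-value (the EASE score), and the function name and toy counts here are hypothetical.

```python
from math import comb


def enrichment_p_value(k: int, n: int, K: int, N: int) -> float:
    """One-sided hypergeometric P(X >= k).

    N: proteins in the background
    K: background proteins annotated with the GO term
    n: proteins in the study list
    k: study-list proteins annotated with the GO term
    """
    if not (0 <= k <= n <= N and 0 <= K <= N):
        raise ValueError("counts are inconsistent")
    upper = min(n, K)
    # Count the ways to draw i annotated and n - i unannotated proteins, for i >= k.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Toy counts for illustration only: 2 of 3 listed proteins carry a term
# that annotates 4 of 10 background proteins.
print(round(enrichment_p_value(k=2, n=3, K=4, N=10), 4))  # 0.3333
```

A small P-value indicates that the term is represented in the list more often than expected from the background alone.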

There were no overt changes in the biological processes (62.5%) and molecular functions (69.2%) between clinical cohorts (Figure 4.5). However, the enriched biological processes included hallmark features of ageing, such as changes to mRNA splicing, proteolysis and extracellular matrix organisation (Figure E4.2, E4.3, E4.4, E4.5, A). Dominating cellular components included cell-cell adherens junction, cytoplasm, extracellular matrix (ECM) and the extracellular subcategories region, space and exosome (Figure E4.2, E4.3, E4.4, E4.5, B). The top-ranked molecular functions were ATP binding, poly(A) RNA binding and protein binding, mapping to 115, 253 and 690 proteins, respectively (Figure E4.2, E4.3, E4.4, E4.5, C).

Given that no overt changes were observed through GO analysis of these clinical cohorts, in-built grouping and statistics were carried out using Proteome Discoverer 2.2 (see Methods) to allow for a deeper understanding of the datasets. This was performed by investigating the quantification ratios of each clinical cohort in relation to healthy controls: HS/HC, MC/HC and SC/HC.

Figure 4.5: Distribution of GO enrichments in each clinical cohort. Venn diagrams depicting the overlaps of the GO enrichment results for biological processes (A) and molecular functions (B) for the healthy control, healthy smoker, mild-moderate and severe-very severe COPD cohorts.

4.3.6 Quantitative proteomic differences in clinical cohorts identify an inventory of potential biomarkers of disease progression

Generation of quantification ratios (see Methods) resulted in the identification of 926, 971 and 969 proteins in the HS/HC, MC/HC and SC/HC lists, respectively (Table 4.5). An ANOVA test (p-value ≤ 0.05) paired with a log2 fold change cut-off of ±1 (fold change ≥ 2 or ≤ 0.5) was employed to identify the significantly dysregulated proteins potentially characterising these clinical cohorts. According to these criteria, the mild-moderate COPD ratios revealed the largest number of changes (33 proteins) and the severe-very severe COPD ratios the lowest (18 proteins), while the ratios for healthy smokers revealed 28 dysregulated proteins. Additionally, several proteins were found to be exclusively expressed in a patient cohort and not in the healthy controls or, alternatively, were present in the control cohort but not in the patient cohorts. This gave a final total of 51, 65 and 51 proteins in HS/HC, MC/HC and SC/HC, respectively (Table 4.6). It is important to note that these exclusively identified proteins could be indicative of either low abundance or absence in the healthy controls, as they may fall below the limit of detection of the mass spectrometer. These dysregulated proteins are listed in Supplementary Tables 4.1, 4.2 and 4.3 with their accession numbers, protein names, gene symbols and fold changes noted.
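
The selection criteria above can be expressed as a short Python sketch. It assumes that the ±1 cut-off is applied on the log2 scale (equivalent to the fold change ≥ 2 and ≤ 0.5 thresholds in Table 4.6) and that a protein with no quant value in one group is treated as exclusive to the other. The function name, labels and inputs are hypothetical and do not reproduce the Proteome Discoverer implementation.

```python
from math import log2
from typing import Optional

P_CUTOFF = 0.05
LOG2_FC_CUTOFF = 1.0  # assumed log2 scale: fold change >= 2 or <= 0.5


def classify(cohort: Optional[float], control: Optional[float],
             p_value: Optional[float]) -> str:
    """Label one protein from its mean abundances and ANOVA p-value.

    None means the protein was not quantified in that group.
    """
    if cohort is None and control is None:
        return "not quantified"
    if control is None:
        return "exclusive to cohort"
    if cohort is None:
        return "exclusive to healthy controls"
    if p_value is None or p_value > P_CUTOFF:
        return "unchanged"
    log2_fc = log2(cohort / control)
    if log2_fc >= LOG2_FC_CUTOFF:
        return "upregulated"
    if log2_fc <= -LOG2_FC_CUTOFF:
        return "downregulated"
    return "unchanged"


# Made-up values for illustration only.
print(classify(4.0, 1.0, 0.01))   # upregulated
print(classify(1.0, 4.0, 0.01))   # downregulated
print(classify(4.0, 1.0, 0.20))   # unchanged: fails the p-value cut-off
print(classify(3.0, None, None))  # exclusive to cohort
```

Keeping the exclusive proteins as separate labels mirrors the separate columns of Table 4.6, since no fold change can be computed for them.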

Volcano plots were generated to further visualise the dysregulated changes amongst each clinical cohort (Figure 4.6. A, B, C). Venn Diagrams were used to investigate the distribution of the complete inventory of proteomes (Figure 4.6. D) and the dysregulated proteins between healthy smokers, mild-moderate and severe-very severe COPD (Figure 4.6. E).

TABLE 4.6 Summary of dysregulated proteins identified in each clinical cohort

                           #Proteins
Clinical cohorts           Fold change ≥ 2    Fold change ≤ 0.5    Exclusively    Exclusive to        Total
                           p-value ≤ 0.05     p-value ≤ 0.05       expressed      Healthy Controls
Healthy Smoker             16                 12                   15             8                   51
Mild-moderate COPD         9                  24                   25             6                   64
Severe-very severe COPD    12                 6                    27             5                   50

Figure 4.6: Volcano plots and Venn diagrams for each cohort. Volcano plots for healthy smoker (HS) (A), mild-moderate COPD (MC) (B) and severe-very severe COPD (SC) (C) ratios compared to healthy control (HC); the x-axis is the log2 fold change and the y-axis is the negative log10 of the p-value. Venn diagrams depict the intersects of the full proteome profiles of each cohort ratio (D) and the distribution of dysregulated proteins, including exclusively expressed proteins (E).

The volcano plots (Figure 4.6. A, B, C) highlight several distinct changes in the proteome across the different cohorts, with the top five changes marked. However, there was also a subset of proteins unique to each clinical cohort that did not appear in the healthy controls (these are denoted in Supplementary Tables 4.1, 4.2 and 4.3 by an * or #). A total of 28 uniquely dysregulated proteins were associated with severe-very severe COPD (Table 4.6); top amongst this list were the upregulation of pre-mRNA-splicing factor ATP-dependent RNA helicase DHX15, cytokine receptor-like factor 1 and leucine zipper transcription factor-like protein, and the downregulation of aspartyl/asparaginyl beta-hydroxylase, 3-hydroxyisobutyrate dehydrogenase, mitochondrial, and fibroleukin (Figure 4.6. C, E; Supplementary Table 4.3). Mild-moderate COPD amassed the largest portion of uniquely dysregulated proteins (Figure 4.6.E), characterised by the upregulation of 14-3-3 protein sigma, arachidonate 15-lipoxygenase and regulator of microtubule dynamics protein 3, and the downregulation of transforming protein RhoA, heterogeneous nuclear ribonucleoprotein A0 and tRNA-splicing ligase RtcB homolog (Figure 4.6. B, E; Supplementary Table 4.2). The proteins unique to healthy smokers (36) included the upregulation of myeloid leukemia factor 1, E3 ubiquitin/ISG15 ligase TRIM25 and collagen alpha-1(IV) chain, and the downregulation of HHIP-like protein 2, secreted phosphoprotein 24 and aminopeptidase N (Figure 4.6. A, E; Supplementary Table 4.1). All these proteins possess the potential to distinguish each clinical cohort and add to the mechanistic understanding of disease progression.

A total of four proteins were shared amongst all three cohorts, namely thioredoxin, Rho GDP-dissociation inhibitor 1, isoform delta 11 of calcium/calmodulin-dependent protein kinase type II subunit delta, and leucine--tRNA ligase, cytoplasmic (Figure 4.6.E). All displayed a marked increase in expression, with some proteins not expressed at a detectable level in healthy control patients. Overall, fifteen proteins were shared with healthy smokers (Figure 4.6.E); interestingly, all displayed the same direction of change, which could be indicative of changes related to cigarette smoking. A total of thirteen dysregulated proteins were shared between the two COPD cohorts (Figure 4.6.E).
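
The overlaps and the direction-of-change check described above can be sketched in Python with plain sets. This is only an illustration: the protein identifiers (P1 to P4), the log2 fold changes and the function names are hypothetical, and the Venn diagrams in this chapter were produced with a dedicated tool.

```python
from typing import Dict, FrozenSet, Set


def venn_regions(lists: Dict[str, Set[str]]) -> Dict[FrozenSet[str], Set[str]]:
    """Map each combination of cohorts to the proteins found in exactly that combination."""
    regions: Dict[FrozenSet[str], Set[str]] = {}
    for protein in set().union(*lists.values()):
        members = frozenset(name for name, proteins in lists.items() if protein in proteins)
        regions.setdefault(members, set()).add(protein)
    return regions


def same_direction(protein: str, log2_fc: Dict[str, Dict[str, float]]) -> bool:
    """True if the protein changes in the same direction in every cohort that reports it."""
    signs = {fc[protein] > 0 for fc in log2_fc.values() if protein in fc}
    return len(signs) == 1


# Hypothetical identifiers and log2 fold changes, for illustration only.
toy_fc = {
    "HS/HC": {"P1": 1.3, "P2": -1.1, "P3": 2.0},
    "MC/HC": {"P1": 1.7, "P2": -1.5, "P4": 1.2},
    "SC/HC": {"P1": 1.8, "P4": -1.4},
}
regions = venn_regions({name: set(fc) for name, fc in toy_fc.items()})
print(regions[frozenset({"HS/HC", "MC/HC", "SC/HC"})])  # {'P1'}
print(regions[frozenset({"MC/HC", "SC/HC"})])           # {'P4'}
print(same_direction("P1", toy_fc), same_direction("P4", toy_fc))  # True False
```

Each region holds proteins found in exactly that combination of lists, so a protein shared by all three cohorts is not counted again in the pairwise regions.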

To further explore the roles of these quantitative ratio lists and dysregulated proteins we undertook higher-level systems information analyses.

4.3.7 Proteome profiles identify perturbed upstream regulators and downstream molecular and cellular functions characterising progressive stages of COPD

The complete inventories of these quantitative ratio lists were analysed by IPA and assigned molecular and cellular functions, and upstream regulators. Hierarchical clustering of the activity scores (see Methods) of these assignments sheds light on the potential activation or inhibition of upstream transcription regulators and downstream molecular implications30 (Figure 4.7). This approach identified activating transcription factor 6 (ATF6), frizzled-8 (FZD8), X-box binding protein 1 (XBP1), cullin 4B (CUL4B), tumor necrosis factor type 1 receptor-associated protein (TRAP1) and microRNA 122-5p (mir-122-5p) as uniquely dysregulated upstream regulators in the COPD cohorts (Figure 4.7.A).
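
For readers unfamiliar with activity scores, the sketch below shows the simplified, unweighted form of an activation z-score as described for IPA: the number of target proteins whose observed change agrees with activation of the regulator, minus the number that disagree, divided by the square root of the number of targets considered. It is an assumption that this simplified form reflects the scores reported here; IPA applies additional weighting and bias corrections, and the function name and toy targets (T1 to T9) are hypothetical.

```python
from math import sqrt
from typing import Dict, Optional


def activation_z_score(observed: Dict[str, int],
                       expected_if_active: Dict[str, int]) -> Optional[float]:
    """Simplified activation z-score for one upstream regulator.

    observed: target -> +1 (measured up) or -1 (measured down)
    expected_if_active: target -> +1 if the regulator increases the target,
                        -1 if it decreases the target
    """
    shared = [t for t in expected_if_active if t in observed]
    if not shared:
        return None  # no measured targets, so no score can be given
    # Each target contributes +1 when consistent with activation, -1 otherwise.
    agreement = sum(observed[t] * expected_if_active[t] for t in shared)
    return agreement / sqrt(len(shared))


# Toy regulator with four known targets; T9 is measured but is not a target.
toy_expected = {"T1": +1, "T2": +1, "T3": -1, "T4": +1}
toy_observed = {"T1": +1, "T2": +1, "T3": -1, "T4": -1, "T9": +1}
print(activation_z_score(toy_observed, toy_expected))  # 1.0
```

Positive scores point towards predicted activation and negative scores towards predicted inhibition, which is the colour scale clustered in Figure 4.7.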

Shared trends were observed for urokinase plasminogen activator receptor (PLAUR), the enzymes carboxypeptidase X CPX-1 (CPXM1) and guanine nucleotide-binding protein subunit alpha-12 (GNA12), and transforming growth factor beta 1 (TGFβ1), which were predicted to be downregulated in all three clinical cohorts, based on the abundance changes in 4, 10 and 47 known target proteins respectively (Figure 4.7.A). The reverse was also observed for the transcription regulators breast cancer type 1 susceptibility protein (BRCA1) and nuclear factor erythroid 2-related factor 2 (NFE2L2), the growth factor WNT1-inducible-signaling pathway protein 2 (WISP2) and estrogen receptor (Figure 4.7.A).

By analysing these data on the basis of significance and number of annotated protein targets instead of activity score, upstream regulators included the transcription factors MYC, tumour protein p53 (TP53), cystatin 5 (CST5) and SMARCA4; the peptidases lon protease homolog, mitochondrial (LONP1) and matrix metalloproteinase 12 (MMP-12); and additionally T-cell receptor (TCR), tumour necrosis factor (TNF), transforming growth factor beta 1 (TGFB1) and phospholipase A2 receptor 1 (PLA2R1). See Supplementary Table 4.4 for the top five upstream regulators based on significance and the number of annotated protein targets for each clinical cohort, along with their classification type, significance scores and number of annotated proteins.

Hierarchical clustering of the downstream molecular and cellular functions helped identify a number of functions potentially activated or inhibited in each clinical cohort (Figure 4.7.B). Severe-very severe COPD patients uniquely displayed an activation of a number of assembly and organisation functions, including adhesion of ECM, organisation of cytoplasm and migration of endothelial cells. Dysregulation of RNA and protein biosynthesis functions was also associated with the severe-very severe COPD patients, such as activation of decay of mRNA and inhibition of splicing and translation of mRNA, and synthesis and oligomerisation of protein (Figure 4.7.B). The profile of mild-moderate COPD was in contrast to that of the severe-very severe cohort, characterised by an activation of cellular proliferation and movement functions and replication of RNA virus, and an inhibition of interactions and localisation of proteins (Figure 4.7.B). Interestingly, the analysis of healthy smokers in this way revealed a clearly defined profile with the majority of functions downregulated, such as cellular migration of endothelial cells, binding of leukocytes and organisation of the actin cytoskeleton. All activated functions in healthy smokers were related to RNA post-transcriptional modification: processing and splicing of RNA and mRNA, and translation of mRNA (Figure 4.7.B).

Focusing on downstream functions stratified on the basis of significance, we observed an enrichment of degranulation of cells (3.24E-69), metabolism of protein (3.72E-66), decay of mRNA (7.59E-65), nonsense-mediated mRNA decay (6.03E-63) and initiation of translation of protein (2.63E-61). Stratification by number of annotated proteins revealed an enrichment of necrosis (257), cell movement (240), apoptosis (240), cell death of tumor cell lines (218) and migration of cells (214). See Supplementary Table 4.5 for the top five downstream functions based on significance and the number of annotated proteins for each clinical cohort, along with their significance scores and number of annotated proteins.

Figure 4.7: Ingenuity pathway analysis of quantitative clinical cohort ratios. Hierarchical clustering of activation z-scores of upstream regulators (A) and downstream molecular and cellular functions (B). Grey indicates no significant data detected.

4.3.8 Pathway analysis reveals distinct activity profiles for COPD clinical cohorts

The complete inventories of each clinical cohort's quantitative ratios were analysed by IPA and assigned pathways (Figure 4.8). Hierarchical clustering of the activity scores (see Methods) of these assignments sheds light on the potential activation or inhibition of pathways driving the progression of COPD (Figure 4.8.A). The two COPD cohorts were distinguished from the healthy smokers by the activation of the signalling pathways eukaryotic initiation factor 2 (EIF2), acute phase response, production of nitric oxide and reactive oxygen species in macrophages, ribosomal protein S6 kinase beta-1 (p70S6K) and granzyme B. A predicted inhibition of glycogen degradation pathways I and II was also observed. Four pathways were exclusively activated in the mild-moderate COPD cohort: insulin-like growth factor 1 (IGF-1) signalling, mammalian target of rapamycin (mTOR) signalling, sirtuin signalling and chemokine signalling. Shared trends of activation z-scores across the cohorts included signalling by Rho family GTPases, integrin-linked kinase (ILK) signalling, glycoprotein VI (GP6) signalling, G2/M DNA damage checkpoint regulation and liver X receptor/retinoid X receptor (LXR/RXR) activation.

Filtering for the top five canonical pathways based on significance (Figure 4.8.B) and the number of annotated proteins (Figure 4.8.C) showed no significant differences in identifications. While this is not surprising, as 91% of the clinical cohort proteomes were shared (Figure 4.6.D), the differences in fold changes marked distinctive and interesting differences, highlighted by the z-scores calculated from the IPA analyses. To elucidate these differences further, we focused on the biological processes uniquely assigned to the dysregulated proteins associated with each clinical cohort ratio.

Figure 4.8: Ingenuity pathway analysis of clinical cohort ratios reveals activation patterns in pathways. Hierarchical clustering of activation z-scores of canonical pathways (A). Lists of the top five most significant pathways (B) and pathways with the highest number of proteins annotated (C). Grey indicates no significant data detected.

4.3.9 Gene ontology analysis of dysregulated proteins in clinical cohorts

To better characterise the functions driven by the dysregulated proteins identified, we carried out a GO enrichment of the biological processes for each clinical cohort ratio. The mild-moderate COPD cohort mapped predominantly to the GO biological process categories of cellular process, biological adhesion and response to stimulus (Figure 4.9.B), subdividing into categories with relevance to ageing effects such as extracellular matrix organisation, cell adhesion, ER to Golgi vesicle-mediated transport, cell proliferation and negative regulation of protein kinase activity. We noted that, despite a balanced number of upregulated and downregulated changes in mild-moderate COPD (35 vs 30, respectively), the majority of biological processes identified as significantly enriched mapped to downregulated proteins (Figure 4.9.B).

Cellular process, biological regulation and localisation were the dominant categories characterising the severe-very severe COPD cohort. Subcategories included a number of lung homeostasis-related processes: response to reactive oxygen species, oxidation-reduction process, cell proliferation, transport between Golgi and ER, muscle filament sliding and ribosomal small subunit assembly (Figure 4.9.C). As expected, the majority of mapped processes involved upregulated proteins (41 upregulated vs 11 downregulated).

The dominant categories for healthy smokers were once again cellular process and biological regulation, but also featured immune system process (Figure 4.9.A). Interesting subcategories included MAPK cascade, T cell receptor signalling pathway, histone H2A and H4 acetylation, and tumour necrosis factor-mediated signalling pathway.

Figure 4.9: GO annotation of the dysregulated proteins characterising each clinical cohort. Gene ontology analysis was performed to assess the biological processes significantly enriched based on proteins exhibiting a two-fold change in either direction (fold change ≥ 2 or ≤ 0.5) and ANOVA significance in HS/HC (A), MC/HC (B) and SC/HC (C). Downregulated proteins are denoted in green and upregulated proteins in red.

4.3.10 Overlap with proteomic profile of cigarette smoke-induced chronic obstructive pulmonary disease mouse model.

The proteomes of Homo sapiens (Taxon identifier 9606) and Mus musculus (Taxon identifier 10090) were downloaded from UniProt and mapped together to allow ease of comparison of the datasets generated in Chapters 2 and 4. Conversion of the human endobronchial biopsy proteomic identifications to mouse equivalents mapped 95% of identifications. When compared to the cigarette smoke (CS)-induced COPD mouse model, the mouse model captured 1,207 out of 1,320 (92%) mapped identifications (Figure 4.10.A). Focusing on the dysregulated changes found in mild-moderate COPD (Supplementary Table 4.2) and severe-very severe COPD (Supplementary Table 4.3), log2 fold changes were clustered with mapping data from the CS-induced COPD mouse model (Figure 4.10.B & C). Trends of changes can be seen at the critical time points in comparison to the human endobronchial biopsies, in particular tubulin beta-6 chain (Q922F4) and heterogeneous nuclear ribonucleoprotein A0 (Q9CX86) in mild-moderate COPD, and actin, alpha skeletal muscle (P68134) and fibulin-5 (Q9WVH9) in severe-very severe COPD.

It is important to recognise that the mouse study was carried out on whole lung homogenates, which encompass over forty different cell types35, whereas the human samples are a snapshot of a specific region of the lung. Importantly, the proteomic depth currently achieved in the human samples (1,901 unique proteins) is also not on par with that of the mouse model (7,324 unique proteins); greater depth would give a more representative picture of the correlation between the two datasets.
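
The cross-species comparison above amounts to two steps: converting human accessions to mouse equivalents, then checking how many mapped identifications appear in the mouse dataset. A minimal Python sketch follows. The accessions (H1 to H4, M1 to M7), the mapping table and the function names are hypothetical placeholders; a real mapping would be derived from the UniProt proteomes mentioned above.

```python
from typing import Dict, Iterable, Set


def map_to_mouse(human_accessions: Iterable[str],
                 human_to_mouse: Dict[str, str]) -> Dict[str, str]:
    """Keep only the human accessions that have a mouse equivalent in the mapping."""
    return {acc: human_to_mouse[acc] for acc in human_accessions if acc in human_to_mouse}


def capture_rate(mapped: Dict[str, str], mouse_dataset: Set[str]) -> float:
    """Fraction of mapped identifications that were also detected in the mouse dataset."""
    if not mapped:
        return 0.0
    captured = sum(mouse_acc in mouse_dataset for mouse_acc in mapped.values())
    return captured / len(mapped)


# Hypothetical accessions for illustration only; H4 has no mouse equivalent.
toy_mapping = {"H1": "M1", "H2": "M2", "H3": "M3"}
human_ids = ["H1", "H2", "H3", "H4"]
mouse_dataset = {"M1", "M3", "M7"}

mapped = map_to_mouse(human_ids, toy_mapping)
print(len(mapped) / len(human_ids))                   # 0.75 of identifications mapped
print(round(capture_rate(mapped, mouse_dataset), 3))  # 0.667 captured in the mouse data
```

The two fractions use different denominators: the mapping rate is relative to all human identifications, whereas the capture rate is relative to the mapped identifications only.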

Figure 4.10: Mapped proteomes of cigarette smoke-induced chronic obstructive pulmonary disease mouse model and human endobronchial biopsies. UniProt accessions of the human endobronchial biopsies were mapped to mouse equivalents and compared to the CS-induced COPD model, resulting in a 92% capture in the mouse model (A). Dysregulated proteins from MC/HC (B) and SC/HC (C) were clustered with data from the mouse model where a match was found. Grey indicates no significant data detected.

4.4 Discussion

COPD remains a global problem and its incidence continues to rise; this is highlighted by the lack of curative or effective therapies to halt its progression. Driven by a knowledge gap in the understanding of the molecular nature of this highly complex disease, we have sought to provide new knowledge of COPD pathogenesis by utilising recent advancements in label-free quantitative proteomics. With this approach we characterised 1,901 unique proteins from OCT-embedded endobronchial biopsies across four clinically relevant cohorts. We have also demonstrated the utility of TCA precipitation in purifying the proteome from OCT-embedded endobronchial biopsies, observing strong correlation between patients within each cohort (Supplementary Figure E.4.6-9) and an even distribution of the number of proteins identified (Table 4.6). Importantly, Zhao et al. achieved a quantitative proteome depth of >5,400 proteins on OCT-embedded core needle lung biopsies, but their study included an additional pre-fractionation technique.36 This fractionation step is an important distinction which contributes greatly to the depth of proteome coverage by deconvoluting a complex sample into smaller subpopulations, thus increasing protein identification. Our past studies have demonstrated that a multi-dimensional enrichment strategy coupled with chemical labelling can produce unprecedented quantitative proteome depths across a variety of sample types.21-24

One of the strongest aspects of this study is the use of four clinically relevant cohorts mapping progressive stages of COPD decline with appropriate controls: healthy controls, healthy smokers, mild-moderate COPD (GOLD stages I/II) and severe-very severe COPD (GOLD stages III/IV) (Table 4.1). This allowed for the generation of quantitative ratios identifying a total of 51, 65 and 51 dysregulated proteins associated with the healthy smoker, mild-moderate and severe-very severe COPD cohorts, respectively (Table 4.6). Further analysis identified dysregulated proteins uniquely associated with each cohort (Figure 4.6.E), providing an inventory of potential novel biomarkers of disease progression which may assist in both the diagnosis of COPD and the delineation of subtypes within current diagnostic classification systems.

Utilising higher-level bioinformatic systems analyses, we were able to map our proteomic profiles to upstream regulators, downstream functions and pathways, and extract evidence-based information on their potential activation and inhibition across our patient cohorts. Hierarchical clustering of enriched upstream regulators allowed us to visualise the trends of activity and identify “master” regulators which may characterise the progressive decline from early to late stages of COPD. This led to the identification of two transcription factors, ATF6 and XBP1, known drivers of the unfolded protein response (UPR).37, 38 Secreted and membrane proteins enter the endoplasmic reticulum (ER) to undergo folding and post-translational modifications.37 The ER can come under stress due to an accumulation of misfolded and unfolded proteins, leading to activation of the UPR.37, 39, 40 The UPR seeks to resolve this imbalance by either restoring proteostasis through enacting degradation pathways or inducing apoptosis.37 Three sensory proteins drive the activation of this response: PKR-like ER kinase, ATF6, and inositol-requiring protein 1 acting via XBP1.37, 38 Activation of the transcription factors ATF6 and XBP1 is known to regulate autophagy, cell survival, mitochondrial biogenesis, cholesterol metabolism/synthesis, energy metabolism, inflammation and a number of protein functions including chaperoning, degradation, folding, synthesis and transport.38, 41

A loss of proteostasis is one of the hallmark features of the ageing lung,42 with COPD being reported to be an accelerated ageing disease.43 In this context, there are several protein changes in the HS and MC cohorts in our study that may also be pre-emptive of changes occurring in patients with severe-very severe COPD. An example of this is the lipoxygenase enzyme arachidonate 15-lipoxygenase (ALOX15), which is overexpressed in the mild-moderate COPD cohort. Lipoxygenase enzymes are well known for their role in inflammatory lung diseases including COPD, asthma and chronic bronchitis, as well as later onset diseases such as Alzheimer’s disease and Parkinson’s disease.44 While this is largely due to their role in the synthesis of pro-inflammatory mediators such as leukotrienes, lipoxygenases simultaneously catalyse the production of highly reactive lipid aldehydes such as 4-hydroxynonenal (4HNE) from the breakdown of polyunsaturated fatty acids, which can drive cellular ageing in many disease contexts and propagate oxidative stress.45 Several studies have demonstrated that the modification of succinate dehydrogenase (SDHB) by 4HNE results in its dysfunction and can lead to electron leakage from the mitochondria and subsequent oxidative stress.46, 47 While the protein changes identified in this study remain to be validated, the dramatic decrease in SDHB levels in the severe-very severe COPD patient cohort provides a striking example of a potential downstream consequence of increased ALOX15 activity in patients with mild-moderate COPD. While there are some studies evaluating the utility of ALOX5 inhibitors to decrease the production of leukotrienes for the treatment of COPD,48 the present study highlights a potential role for ALOX15 in COPD that warrants explicit investigation in patient cohorts. In a similar manner, the increasing dysregulation of thioredoxin (TXN) across the three patient cohorts in this study (2.53 in HS < 3.24 in MC < 3.41 in SC) is also highly indicative of the onset of oxidative stress, accelerated ageing and severe changes to proteostasis that occur in the lung.49 Importantly, these changes accord with independent studies highlighting a role for TXN in the pathogenesis of COPD.50

Finally, in the HS cohort we have identified a marked upregulation of the aldehyde dehydrogenase ALDH3A1, an enzyme that has previously been shown to be upregulated in both human bronchial epithelial cells51 and in lung tissue obtained from smokers.52 This adds validity to the identification of ALDH3A1 in our dataset but also highlights the potential of this early-onset change in the proteome to form a valuable biomarker of the first signs of cellular dysfunction. Indeed, cigarette smoke contains reactive aldehydes in great abundance, which can participate in the activation of an oxidative stress cascade (via ALOX15 and SDHB) that leads to a decline in cellular proteostasis and the onset of ageing-like phenotypes in lung tissue.53

In summary, we have provided an extensive inventory of the proteomic alterations in endobronchial biopsies across three clinically important cohorts related to COPD and one healthy control cohort. We have identified a number of promising biomarkers and therapeutic targets, warranting further mechanistic investigation to elucidate their impact on the pathogenesis of COPD.

4.5 References

1. Lozano R, Naghavi M, Foreman K, Lim S, Shibuya K, Aboyans V, et al. Global and

regional mortality from 235 causes of death for 20 age groups in 1990 and 2010: a

systematic analysis for the Global Burden of Disease Study 2010. Lancet 2012;

380:2095-128.

2. Haw TJ, Starkey MR, Nair PM, Pavlidis S, Liu G, Nguyen DH, et al. A pathogenic role

for tumor necrosis factor-related apoptosis-inducing ligand in chronic obstructive

pulmonary disease. Mucosal Immunol 2016; 9:859-72.

3. Jones B, Donovan C, Liu G, Gomez HM, Chimankar V, Harrison CL, et al. Animal

models of COPD: What do they tell us? Respirology 2017; 22:21-32.

4. Celli BR, MacNee W. Standards for the diagnosis and treatment of patients with COPD:

a summary of the ATS/ERS position paper. Eur Respir J 2004; 23:932-46.

5. Han MK, Agusti A, Calverley PM, Celli BR, Criner G, Curtis JL, et al. Chronic

obstructive pulmonary disease phenotypes: the future of COPD. Am J Respir Crit Care

Med 2010; 182:598-604.

6. Eisner MD, Anthonisen N, Coultas D, Kuenzli N, Perez-Padilla R, Postma D, et al. An

official American Thoracic Society public policy statement: Novel risk factors and the

global burden of chronic obstructive pulmonary disease. Am J Respir Crit Care Med

2010; 182:693-718.

7. Bilano V, Gilmour S, Moffiet T, d'Espaignet ET, Stevens GA, Commar A, et al. Global

trends and projections for tobacco use, 1990-2025: an analysis of smoking indicators

from the WHO Comprehensive Information Systems for Tobacco Control. Lancet 2015;

385:966-76.

8. Lopez AD, Shibuya K, Rao C, Mathers CD, Hansell AL, Held LS, et al. Chronic

obstructive pulmonary disease: current burden and future projections. Eur Respir J 2006;

27:397-412.

9. Barnes PJ. Corticosteroid resistance in patients with asthma and chronic obstructive

pulmonary disease. J Allergy Clin Immunol 2013; 131:636-45.

10. Spina D. Pharmacology of novel treatments for COPD: are fixed dose combination

LABA/LAMA synergistic? Eur Clin Respir J 2015; 2.

11. Cazzola M, Page C. Long-acting bronchodilators in COPD: where are we now and where

are we going? Breathe 2014; 10:110-20.

12. Hallstrand TS, Hackett TL, Altemeier WA, Matute-Bello G, Hansbro PM, Knight DA.

Airway epithelial regulation of pulmonary immune homeostasis and inflammation. Clin

Immunol 2014; 151:1-15.

13. Rennard SI, Dale DC, Donohue JF, Kanniess F, Magnussen H, Sutherland ER, et al.

CXCR2 Antagonist MK-7123. A Phase 2 Proof-of-Concept Trial for Chronic Obstructive

Pulmonary Disease. Am J Respir Crit Care Med 2015; 191:1001-11.

14. MacNee W. Oxidants/antioxidants and COPD. Chest 2000; 117:303s-17s.

15. Lakshmi SP. Emerging pharmaceutical therapies for COPD. 2017; 12:2141-56.

16. Walther TC, Mann M. Mass spectrometry-based proteomics in cell biology. J Cell Biol

2010; 190:491-500.

17. Nilsson T, Mann M, Aebersold R, Yates JR, 3rd, Bairoch A, Bergeron JJ. Mass

spectrometry in high-throughput proteomics: ready for the big time. Nat Methods 2010;

7:681-5.

18. Maes E, Broeckx V, Mertens I, Sagaert X, Prenen H, Landuyt B, et al. Analysis of the

formalin-fixed paraffin-embedded tissue proteome: pitfalls, challenges, and future

prospectives. Amino Acids 2013; 45:205-18.

19. Zhang W, Sakashita S, Taylor P, Tsao MS, Moran MF. Comprehensive proteome

analysis of fresh frozen and optimal cutting temperature (OCT) embedded primary non-

small cell lung carcinoma by LC-MS/MS. Methods 2015; 81:50-5.

20. Zhao X, Huffman KE, Fujimoto J, Canales JR, Girard L, Nie G, et al. Quantitative

Proteomic Analysis of Optimal Cutting Temperature (OCT) Embedded Core-Needle

Biopsy of Lung Cancer. J Am Soc Mass Spectrom 2017; 28:2078-89.

21. Degryse S, de Bock CE, Demeyer S, Govaerts I, Bornschein S, Verbeke D, et al. Mutant

JAK3 phosphoproteomic profiling predicts synergism between JAK3 inhibitors and

MEK/BCL2 inhibitors for the treatment of T-cell acute lymphoblastic leukemia.

Leukemia 2018; 32:788-800.

22. Nixon B, De Iuliis GN, Hart HM, Zhou W, Mathe A, Bernstein I, et al. Proteomic

profiling of mouse epididymosomes reveals their contributions to post-testicular sperm

maturation. Mol Cell Proteomics 2018.

23. Nixon B, Johnston SD, Skerrett-Byrne DA, Anderson AL, Stanger SJ, Bromfield EG, et

al. Proteomic profiling of crocodile spermatozoa refutes the tenet that post-testicular

maturation is restricted to mammals. Mol Cell Proteomics 2018.

24. Jamaluddin MFB, Ko YA, Kumar M, Brown Y, Bajwa P, Nagendra PB, et al. Proteomic

Profiling of Human Uterine Fibroids Reveals Upregulation of the Extracellular Matrix

Protein Periostin. Endocrinology 2018; 159:1106-18.

25. Tyanova S, Temu T, Sinitcyn P, Carlson A, Hein MY, Geiger T, et al. The Perseus

computational platform for comprehensive analysis of (prote)omics data. Nat Methods

2016; 13:731-40.

26. Oliveros JC. Venny. An interactive tool for comparing lists with Venn’s diagrams.

http://bioinfogp.cnb.csic.es/tools/venny/index.html 2007-2015.

27. Huang da W, Sherman BT, Lempicki RA. Bioinformatics enrichment tools: paths toward

the comprehensive functional analysis of large gene lists. Nucleic Acids Res 2009; 37:1-

13.

28. Huang da W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large

gene lists using DAVID bioinformatics resources. Nat Protoc 2009; 4:44-57.

29. Kramer A, Green J, Pollard J, Jr., Tugendreich S. Causal analysis approaches in

Ingenuity Pathway Analysis. Bioinformatics 2014; 30:523-30.

30. Schiller HB, Fernandez IE, Burgstaller G, Schaab C, Scheltema RA, Schwarzmayr T, et

al. Time- and compartment-resolved proteome profiling of the extracellular niche in lung

injury and repair. Mol Syst Biol 2015; 11:819.

31. Schwartz SA, Reyzer ML, Caprioli RM. Direct tissue analysis using matrix-assisted laser

desorption/ionization mass spectrometry: practical aspects of sample preparation. J Mass

Spectrom 2003; 38:699-708.

32. Issaq HJ, Conrads TP, Janini GM, Veenstra TD. Methods for fractionation, separation

and profiling of proteins and peptides. Electrophoresis 2002; 23:3048-61.

33. Aebersold R, Mann M. Mass spectrometry-based proteomics. Nature 2003; 422:198-207.

214

34. Bateman NW, Goulding SP, Shulman NJ, Gadok AK, Szumlinski KK, MacCoss MJ, et

al. Maximizing peptide identification events in proteomic workflows using data-

dependent acquisition (DDA). Mol Cell Proteomics 2014; 13:329-38.

35. Schiller HB, Montoro DT, Simon LM, Rawlins EL, Meyer KB, Strunz M, et al. The

Human Lung Cell Atlas: A High-Resolution Reference Map of the Human Lung in

Health and Disease. Am J Respir Cell Mol Biol 2019; 61:31-41.

36. Annesley TM. Ion suppression in mass spectrometry. Clin Chem 2003; 49:1041-4.

37. Schroder M, Kaufman RJ. The mammalian unfolded protein response. Annu Rev

Biochem 2005; 74:739-89.

38. Kelsen SG. The Unfolded Protein Response in Chronic Obstructive Pulmonary Disease.

Ann Am Thorac Soc 2016; 13 Suppl 2:S138-45.

39. Dufey E, Sepulveda D, Rojas-Rivera D, Hetz C. Cellular mechanisms of endoplasmic

reticulum stress signaling in health and disease. 1. An overview. Am J Physiol Cell

Physiol 2014; 307:C582-94.

40. Zhang K, Kaufman RJ. The unfolded protein response: a stress signaling pathway critical

for health and disease. Neurology 2006; 66:S102-9.

41. Dombroski BA, Nayak RR, Ewens KG, Ankener W, Cheung VG, Spielman RS. Gene

expression and genetic variation in response to endoplasmic reticulum stress in human

cells. Am J Hum Genet 2010; 86:719-29.

42. Meiners S, Eickelberg O, Königshoff M. Hallmarks of the ageing lung. European

Respiratory Journal 2015; 45:807.

215

43. Brandsma C-A, de Vries M, Costa R, Woldhuis RR, Königshoff M, Timens W. Lung

ageing and COPD: is there a role for ageing in abnormal tissue repair? European

Respiratory Review 2017; 26:170073.

44. Eleftheriadis N, Thee SA, Zwinderman MR, Leus NG, Dekker FJ. Activity-Based Probes

for 15-Lipoxygenase-1. Angew Chem Int Ed Engl 2016; 55:12300-5.

45. Ayala A, Munoz MF, Arguelles S. Lipid peroxidation: production, metabolism, and

signaling mechanisms of malondialdehyde and 4-hydroxy-2-nonenal. Oxid Med Cell

Longev 2014; 2014:360438.

46. Lashin OM, Szweda PA, Szweda LI, Romani AM. Decreased complex II respiration and

HNE-modified SDH subunit in diabetic heart. Free Radic Biol Med 2006; 40:886-96.

47. Lord T, Martin JH, Aitken RJ. Accumulation of electrophilic aldehydes during

postovulatory aging of mouse oocytes causes reduced fertility, oxidative stress, and

apoptosis. Biol Reprod 2015; 92:33.

48. Kilfeather S. 5-lipoxygenase inhibitors for the treatment of COPD. Chest 2002;

121:197s-200s.

49. Cunningham GM, Flores LC, Roman MG, Cheng C, Dube S, Allen C, et al. Thioredoxin

overexpression in both the cytosol and mitochondria accelerates age-related disease and

shortens lifespan in male C57BL/6 mice. Geroscience 2018; 40:453-68.

50. Xu J, Li T, Wu H, Xu T. Role of thioredoxin in lung disease. Pulm Pharmacol Ther 2012;

25:154-62.

51. Franciosi L, Postma DS, van den Berge M, Govorukhina N, Horvatovich PL, Fusetti F, et

al. Susceptibility to COPD: differential proteomic profiling after acute smoking. PLoS

One 2014; 9:e102037.

216

52. Jang JH, Bruse S, Liu Y, Duffy V, Zhang C, Oyamada N, et al. Aldehyde dehydrogenase

3A1 protects airway epithelial cells from cigarette smoke-induced DNA damage and

cytotoxicity. Free Radic Biol Med 2014; 68:80-6.

53. Marchitti SA, Brocker C, Stagos D, Vasiliou V. Non-P450 aldehyde oxidizing enzymes:

the aldehyde dehydrogenase superfamily. Expert Opin Drug Metab Toxicol 2008; 4:697-

720.

217

4.6 Supplementary Figures

Figure E4.1: The effect of OCT presence on mass spectra analysis in a complex peptide population. The initial mass spectra show the peptide elution profile over time, with the subsequent fragmentation spectra (smaller window), for a non-small-cell lung carcinoma cell line (H1581) prior to (A) and following (B) removal of the OCT compound. At the end of a 230 min gradient, a characteristic polymer signal is seen (A); removal of the OCT compound reveals tryptic peptides (B). (Extracted from Zhang et al. 2015 for illustrative purposes.19)

[Bar charts: top gene ontology terms and their counts for the healthy control (HC) cohort. Panel A: biological process (HC - BP). Panel B: cellular component (HC - CC). Panel C: molecular function (HC - MF).]

Figure E4.2: Gene ontology annotation of healthy control endobronchial biopsies. Refined lists (FDR ≤ 0.01; #unique peptides ≥ 2; quant n ≥ 3) were subjected to a gene ontology analysis, classifying the most significant biological processes (A), cellular components (B) and molecular functions (C).
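The refinement criteria quoted in this caption can be expressed as a simple filter. The following is a minimal sketch only: the record fields, the function names and the reading of "quant n ≥ 3" as "quantified in at least three patients of the cohort" are assumptions made for illustration, not the software pipeline used in this thesis.

```python
from dataclasses import dataclass


@dataclass
class ProteinRecord:
    """Hypothetical per-protein summary for one cohort (illustrative fields)."""
    accession: str
    fdr: float              # protein-level false discovery rate
    unique_peptides: int    # number of unique peptides identified
    quantified_in: int      # number of patients with a quantitative value


def passes_refinement(rec, max_fdr=0.01, min_unique_peptides=2, min_quantified=3):
    """Return True if the record meets all three caption criteria."""
    return (
        rec.fdr <= max_fdr
        and rec.unique_peptides >= min_unique_peptides
        and rec.quantified_in >= min_quantified
    )


def refine(records):
    """Keep only the records that pass every threshold, preserving order."""
    return [rec for rec in records if passes_refinement(rec)]
```

A refined list produced this way would then be passed to the gene ontology tool; the thresholds are exposed as keyword arguments so that the effect of relaxing any one of them can be checked.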

[Bar charts: top gene ontology terms and their counts for the healthy smoker (HS) cohort. Panel A: biological process (HS - BP). Panel B: cellular component (HS - CC). Panel C: molecular function (HS - MF).]

Figure E4.3: Gene ontology annotation of healthy smoker endobronchial biopsies. Refined lists (FDR ≤ 0.01; #unique peptides ≥ 2; quant n ≥ 3) were subjected to a gene ontology analysis, classifying the most significant biological processes (A), cellular components (B) and molecular functions (C).

[Bar charts: top gene ontology terms and their counts for the mild COPD (MC) cohort. Panel A: biological process (MC - BP). Panel B: cellular component (MC - CC). Panel C: molecular function (MC - MF).]

Figure E4.4: Gene ontology annotation of mild COPD endobronchial biopsies. Refined lists (FDR ≤ 0.01; #unique peptides ≥ 2; quant n ≥ 3) were subjected to a gene ontology analysis, classifying the most significant biological processes (A), cellular components (B) and molecular functions (C).

[Bar charts: top gene ontology terms and their counts for the severe COPD (SC) cohort. Panel A: biological process (SC - BP). Panel B: cellular component (SC - CC). Panel C: molecular function (SC - MF).]

Figure E4.5: Gene ontology annotation of severe COPD endobronchial biopsies. Refined lists (FDR ≤ 0.01; #unique peptides ≥ 2; quant n ≥ 3) were subjected to a gene ontology analysis, classifying the most significant biological processes (A), cellular components (B) and molecular functions (C).


Figure E4.6: Multi scatter plot of healthy control patients. A multi scatter plot of all six patients from the healthy control (HC) cohort, with Pearson correlation values displayed for each comparison.
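The Pearson correlation values shown in these multi scatter plots are pairwise comparisons between patients. As a minimal sketch of that calculation, assuming each patient is represented by a list of intensities for the same proteins in the same order (the function names and data layout are illustrative, not taken from the software used to draw the figures):

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def pairwise_pearson(samples):
    """Correlate every pair of samples; `samples` maps sample name -> values."""
    return {
        (a, b): pearson(samples[a], samples[b])
        for a, b in combinations(sorted(samples), 2)
    }
```

For a cohort of six patients this yields 15 pairwise comparisons, and for five patients it yields 10, which matches the number of off-diagonal panels in a multi scatter plot.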


Figure E4.7: Multi scatter plot of healthy smoker patients. A multi scatter plot of five patients from the healthy smoker (HS) cohort, with Pearson correlation values displayed for each comparison.


Figure E4.8: Multi scatter plot of mild COPD patients. A multi scatter plot of all six patients from the mild COPD (MC) cohort, with Pearson correlation values displayed for each comparison.


Figure E4.9: Multi scatter plot of severe COPD patients. A multi scatter plot of all six patients from the severe COPD (SC) cohort, with Pearson correlation values displayed for each comparison.


4.7 Supplementary Tables

Supplementary Table E.4.1 Dysregulated proteins in healthy smoker cohort

Uniprot accession | Protein name | #Unique peptides | MS Amanda 2.0 score | Gene name | Fold change (log2)
Q96QK1 | Vacuolar protein sorting-associated protein 35 | 5 | 1317.2 | VPS35 | -6.64#
Q9Y3E5 | Peptidyl-tRNA hydrolase 2, mitochondrial | 2 | 766.71 | PTRH2 | -6.64#
P55290-4 | Cadherin-13 | 4 | 963.48 | CDH13 | -6.64#
Q9UHA4 | Ragulator complex protein LAMTOR3 | 2 | 1022.01 | LAMTOR3 | -6.64#
P20849 | Collagen alpha-1 (IX) chain | 2 | 1022.24 | COL9A1 | -6.64#
Q6UWX4 | HHIP-like protein 2 | 3 | 568.47 | HHIPL2 | -6.64#
Q9UGT4 | Sushi domain-containing protein 2 | 2 | 545.57 | SUSD2 | -6.64#
P02746 | Complement C1q subcomponent subunit B | 2 | 446.34 | C1QB | -6.64#
Q13103 | Secreted phosphoprotein 24 | 2 | 682.79 | SPP2 | -4.69
P15144 | Aminopeptidase N | 5 | 2080.69 | ANPEP | -3.83
Q9BX97 | Plasmalemma vesicle-associated protein | 3 | 2175.56 | PLVAP | -3.76
P61769 | Beta-2-microglobulin | 3 | 16223.1 | B2M | -3.62
O15335 | Chondroadherin | 14 | 20381.14 | CHAD | -3.29
P48681 | Nestin | 4 | 1192.05 | NES | -2.73
O75596 | C-type lectin domain family 3 member A | 4 | 1722.11 | CLEC3A | -2.72
P13646 | Keratin, type I cytoskeletal 13 | 7 | 60409.49 | KRT13 | -2.43
P29536 | Leiomodin-1 | 5 | 8437.34 | LMOD1 | -2.2
Q9UFN0 | Protein NipSnap homolog 3A | 4 | 8615.98 | NIPSNAP3A | -2.05
P01903 | HLA class II histocompatibility antigen, DR alpha chain | 2 | 3136.52 | HLA-DRA | -2
Q13753 | Laminin subunit gamma-2 | 17 | 10108.28 | LAMC2 | -2
O14818 | Proteasome subunit alpha type-7 | 3 | 4326.86 | PSMA7 | 2.05
P38159 | RNA-binding motif protein, X chromosome | 9 | 6471.02 | RBMX | 2.07
P30838 | Aldehyde dehydrogenase, dimeric NADP-preferring | 14 | 7851.7 | ALDH3A1 | 2.09
P61626 | Lysozyme C | 5 | 9649.22 | LYZ | 2.11
Q9Y230 | RuvB-like 2 | 7 | 7565.41 | RUVBL2 | 2.18
P61970 | Nuclear transport factor 2 | 2 | 2733.08 | NUTF2 | 2.22
Q9Y265 | RuvB-like 1 | 11 | 11080.41 | RUVBL1 | 2.25
P04080 | Cystatin-B | 6 | 3060.01 | CSTB | 2.39
P30085 | UMP-CMP kinase | 7 | 2180.82 | CMPK1 | 2.48
O75832 | 26S proteasome non-ATPase regulatory subunit 10 | 2 | 1350.23 | PSMD10 | 2.51
P10599 | Thioredoxin | 2 | 13150.5 | TXN | 2.53
Q16698 | 2,4-dienoyl-CoA reductase, mitochondrial | 6 | 9873.38 | DECR1 | 2.57
Q14258 | E3 ubiquitin/ISG15 ligase TRIM25 | 3 | 2194.31 | TRIM25 | 2.9
P62263 | 40S ribosomal protein S14 | 3 | 3661.38 | RPS14 | 3.75
P02462 | Collagen alpha-1 | 2 | 3276.04 | COL4A1 | 3.82
P52565 | Rho GDP-dissociation inhibitor 1 | 4 | 6008.83 | ARHGDIA | 4.67
P07225 | Vitamin K-dependent protein S | 2 | 916.37 | PROS1 | 6.64*
P62857 | 40S ribosomal protein S28 | 2 | 494.49 | RPS28 | 6.64*
P68133 | Actin, alpha skeletal muscle | 5 | 338594.84 | ACTA1 | 6.64*
Q13557-11 | Calcium/calmodulin-dependent protein kinase type II subunit delta | 2 | 726.75 | CAMK2D | 6.64*
Q13885 | Tubulin beta-2A chain | 3 | 105724.25 | TUBB2A | 6.64*
Q9BVK6 | Transmembrane emp24 domain-containing protein 9 | 3 | 794.63 | TMED9 | 6.64*
Q9P2J5 | Leucine--tRNA ligase, cytoplasmic | 2 | 370.75 | LARS | 6.64*
Q9UKZ9 | Procollagen C-endopeptidase enhancer 2 | 2 | 846.74 | PCOLCE2 | 6.64*
O00264 | Membrane-associated progesterone receptor component 1 | 2 | 1234.82 | PGRMC1 | 6.64*
O75602-3 | Sperm-associated antigen 6 | 4 | 1434.71 | SPAG6 | 6.64*
O95433 | Activator of 90 kDa heat shock protein ATPase homolog 1 | 2 | 483.69 | AHSA1 | 6.64*
P08263 | Glutathione S-transferase A1 | 2 | 31515.17 | GSTA1 | 6.64*
P58340 | Myeloid leukemia factor 1 | 3 | 993.5 | MLF1 | 6.64*
Q14677-3 | Clathrin interactor 1 | 2 | 4760.58 | CLINT1 | 6.64*
Q96M98 | Parkin coregulated gene protein | 2 | 392.25 | PACRG | 6.64*

* Proteins detected (n ≥ 3) in the cohort and absent from healthy control patients
# Proteins detected (n ≥ 3) in healthy control patients and absent from cohort of interest
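The values of ±6.64 flagged with * and # mark proteins detected in only one of the two groups being compared, for which a ratio cannot be calculated. The table does not state how this value was chosen; it coincides with log2(100) ≈ 6.64, so the sketch below assumes that presence/absence was encoded as a 100-fold change. The function name and the use of mean abundances are likewise illustrative assumptions.

```python
import math

# Assumption: presence/absence is encoded as a 100-fold change, log2(100) ~ 6.64.
PRESENCE_ABSENCE_LOG2FC = math.log2(100)


def log2_fold_change(cohort_mean, control_mean, presence_absence=PRESENCE_ABSENCE_LOG2FC):
    """Log2 fold change of cohort versus control.

    A mean of 0 stands for "not detected" in that group (hypothetical convention).
    """
    if cohort_mean > 0 and control_mean > 0:
        return math.log2(cohort_mean / control_mean)
    if cohort_mean > 0:
        return presence_absence       # detected in cohort only (marked *)
    if control_mean > 0:
        return -presence_absence      # detected in controls only (marked #)
    raise ValueError("protein not detected in either group")
```

Under this assumption, a protein seen only in the cohort of interest is reported as 6.64, a protein seen only in healthy controls as -6.64, and a protein four times more abundant in the cohort as 2.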


Supplementary Table E.4.2 Dysregulated proteins in mild-moderate COPD cohort

Uniprot accession | Protein name | #Unique peptides | MS Amanda 2.0 score | Gene name | Fold change (log2)
Q96QK1 | Vacuolar protein sorting-associated protein 35 | 5 | 1317.2 | VPS35 | -6.64#
P20849 | Collagen alpha-1 (IX) chain | 2 | 1022.24 | COL9A1 | -6.64#
Q9UGT4 | Sushi domain-containing protein 2 | 2 | 545.57 | SUSD2 | -6.64#
P05204 | Non-histone chromosomal protein HMG-17 | 2 | 8377.61 | HMGN2 | -6.64#
P61586 | Transforming protein RhoA | 3 | 1244.14 | RHOA | -6.64#
Q9H1E5 | Thioredoxin-related transmembrane protein 4 | 2 | 866.95 | TMX4 | -6.64#
O15335 | Chondroadherin | 14 | 20381.14 | CHAD | -5.41
Q08431 | Lactadherin | 8 | 4185.66 | MFGE8 | -4.97
Q13151 | Heterogeneous nuclear ribonucleoprotein A0 | 7 | 9817.5 | HNRNPA0 | -4.6
Q9Y3I0 | tRNA-splicing ligase RtcB homolog | 7 | 6225.5 | RTCB | -3.98
P15502-11 | Elastin | 2 | 52943.03 | ELN | -3.77
P10915 | Hyaluronan and proteoglycan link protein 1 | 12 | 58960.46 | HAPLN1 | -3.67
Q969X5 | Endoplasmic reticulum-Golgi intermediate compartment protein 1 | 5 | 1608.69 | ERGIC1 | -3.54
P23142 | Fibulin-1 | 4 | 9804.42 | FBLN1 | -3.24
Q06828 | Fibromodulin | 7 | 5319.77 | FMOD | -3.16
P02792 | Ferritin light chain | 6 | 5064.03 | FTL | -3.13
P04275 | von Willebrand factor | 12 | 13819.56 | VWF | -2.87
P21810 | Biglycan | 26 | 202196.84 | BGN | -2.69
P55083-2 | Microfibril-associated glycoprotein 4 | 7 | 28195.32 | MFAP4 | -2.58
P13645 | Keratin, type I cytoskeletal 10 | 18 | 35647.38 | KRT10 | -2.52
Q04837 | Single-stranded DNA-binding protein, mitochondrial | 5 | 3293.01 | SSBP1 | -2.52
P21397 | Amine oxidase | 6 | 3206.62 | MAOA | -2.37
P51888 | Prolargin | 30 | 159365.13 | PRELP | -2.35
P35527 | Keratin, type I cytoskeletal 9 | 20 | 26057.73 | KRT9 | -2.31
Q9Y678 | Coatomer subunit gamma-1 | 4 | 3354.25 | COPG1 | -2.25
P50225 | Sulfotransferase 1A1 | 2 | 688.57 | SULT1A1 | -2.12
P24534 | Elongation factor 1-beta | 3 | 5579.93 | EEF1B2 | -2.09
O75531 | Barrier-to-autointegration factor | 6 | 14935.06 | BANF1 | -2.08
P21941 | Cartilage matrix protein | 34 | 87007.49 | MATN1 | -2.07
P28906 | Hematopoietic progenitor cell antigen CD34 | 2 | 1234.95 | CD34 | -2.02
P24821 | Tenascin | 18 | 21533.47 | TNC | 2.18
P16050 | Arachidonate 15-lipoxygenase | 12 | 12078.5 | ALOX15 | 2.34
P02538 | Keratin, type II cytoskeletal 6A | 23 | 164495.68 | KRT6A | 2.41
P52209 | 6-phosphogluconate dehydrogenase, decarboxylating | 6 | 6330.2 | PGD | 2.56
P02533 | Keratin, type I cytoskeletal 14 | 6 | 51985.83 | KRT14 | 2.61
P67936 | Tropomyosin alpha-4 chain | 2 | 112178.39 | TPM4 | 2.95
P10599 | Thioredoxin | 2 | 13150.5 | TXN | 3.24
P31947 | 14-3-3 protein sigma | 2 | 3217.23 | SFN | 3.43
P52565 | Rho GDP-dissociation inhibitor 1 | 4 | 6008.83 | ARHGDIA | 4.12
O60763-2 | General vesicular transport factor p115 | 6 | 1557.41 | USO1 | 6.64*
O75937 | DnaJ homolog subfamily C member 8 | 2 | 699.87 | DNAJC8 | 6.64*
P04181 | Ornithine aminotransferase, mitochondrial | 3 | 2354.49 | OAT | 6.64*
P07225 | Vitamin K-dependent protein S | 2 | 916.37 | PROS1 | 6.64*
Q13557-11 | Calcium/calmodulin-dependent protein kinase type II subunit delta | 2 | 726.75 | CAMK2D | 6.64*
Q16881 | Thioredoxin reductase 1, cytoplasmic | 3 | 912.71 | TXNRD1 | 6.64*
Q7LBR1 | Charged multivesicular body protein 1b | 2 | 513.81 | CHMP1B | 6.64*
Q96C19 | EF-hand domain-containing protein D2 | 2 | 1198.19 | EFHD2 | 6.64*
Q9BUF5 | Tubulin beta-6 chain | 4 | 49502.99 | TUBB6 | 6.64*
Q9HD45 | Transmembrane 9 superfamily member 3 | 2 | 638.26 | TM9SF3 | 6.64*
Q9NZ08-2 | Endoplasmic reticulum aminopeptidase 1 | 4 | 1650.55 | ERAP1 | 6.64*
Q9P2J5 | Leucine--tRNA ligase, cytoplasmic | 2 | 370.75 | LARS | 6.64*
O00203 | AP-3 complex subunit beta-1 | 2 | 1513.33 | AP3B1 | 6.64*
O00232 | 26S proteasome non-ATPase regulatory subunit 12 | 2 | 586.55 | PSMD12 | 6.64*
O75306 | NADH dehydrogenase | 2 | 894.26 | NDUFS2 | 6.64*
P15927-3 | Replication protein A 32 kDa subunit | 2 | 551.2 | RPA2 | 6.64*
P29966 | Myristoylated alanine-rich C-kinase substrate | 3 | 1511.08 | MARCKS | 6.64*
P42677 | 40S ribosomal protein S27 | 2 | 2088.77 | RPS27 | 6.64*
P54578 | Ubiquitin carboxyl-terminal hydrolase 14 | 3 | 833.52 | USP14 | 6.64*
Q14839-2 | Chromodomain-helicase-DNA-binding protein 4 | 2 | 638.72 | CHD4 | 6.64*
Q96N66 | Lysophospholipid acyltransferase 7 | 2 | 1178.56 | MBOAT7 | 6.64*
Q96TC7 | Regulator of microtubule dynamics protein 3 | 3 | 1184.22 | RMDN3 | 6.64*
Q9BPW8 | Protein NipSnap homolog 1 | 2 | 526.38 | NIPSNAP1 | 6.64*
Q9UM54 | Unconventional myosin-VI | 4 | 1215.13 | MYO6 | 6.64*
Q9Y6A9 | Signal peptidase complex subunit 1 | 2 | 478.66 | SPCS1 | 6.64*

* Proteins detected (n ≥ 3) in the cohort and absent from healthy control patients
# Proteins detected (n ≥ 3) in healthy control patients and absent from cohort of interest


Supplementary Table E.4.3 Dysregulated proteins in severe-very severe COPD cohort

Uniprot accession | Protein name | #Unique peptides | MS Amanda 2.0 score | Gene name | Fold change (log2)
Q12797 | Aspartyl/asparaginyl beta-hydroxylase | 4 | 772.52 | ASPH | -6.64#
P31937 | 3-hydroxyisobutyrate dehydrogenase, mitochondrial | 3 | 1832.43 | HIBADH | -6.64#
Q14314 | Fibroleukin | 2 | 1961.05 | FGL2 | -6.64#
Q02218 | 2-oxoglutarate dehydrogenase, mitochondrial | 6 | 3439.95 | OGDH | -6.64#
Q96E40 | Sperm acrosome-associated protein 9 | 2 | 576.32 | SPACA9 | -6.64#
Q14254 | Flotillin-2 | 2 | 1869.9 | FLOT2 | -5.02
P21397 | Amine oxidase | 6 | 3206.62 | MAOA | -4.8
P21912 | Succinate dehydrogenase | 4 | 1886.36 | SDHB | -3.94
P62820 | Ras-related protein Rab-1A | 3 | 11546.88 | RAB1A | -3.25
P13797 | Plastin-3 | 6 | 2217.05 | PLS3 | -2.68
P37108 | Signal recognition particle 14 kDa protein | 7 | 9104.3 | SRP14 | -2.2
P02538 | Keratin, type II cytoskeletal 6A | 23 | 164495.68 | KRT6A | 2.74
Q9H9B4 | Sideroflexin-1 | 2 | 1374.38 | SFXN1 | 2.75
Q99460 | 26S proteasome non-ATPase regulatory subunit 1 | 4 | 1016.1 | PSMD1 | 2.8
Q53GG5-2 | PDZ and LIM domain protein 3 | 7 | 2658.4 | PDLIM3 | 2.87
P62829 | 60S ribosomal protein L23 | 4 | 2295.27 | RPL23 | 3.01
P40925-3 | Malate dehydrogenase, cytoplasmic | 3 | 7743.64 | MDH1 | 3.21
P67936 | Tropomyosin alpha-4 chain | 2 | 112178.39 | TPM4 | 3.38
P10599 | Thioredoxin | 2 | 13150.5 | TXN | 3.41
P22694-2 | cAMP-dependent protein kinase catalytic subunit beta | 4 | 2866.25 | PRKACB | 3.47
P52565 | Rho GDP-dissociation inhibitor 1 | 4 | 6008.83 | ARHGDIA | 3.54
P62263 | 40S ribosomal protein S14 | 3 | 3661.38 | RPS14 | 3.69
O15144 | Actin-related protein 2/3 complex subunit 2 | 7 | 4085.39 | ARPC2 | 4.75
O60763-2 | General vesicular transport factor p115 | 6 | 1557.41 | USO1 | 6.64*
O75937 | DnaJ homolog subfamily C member 8 | 2 | 699.87 | DNAJC8 | 6.64*
P04181 | Ornithine aminotransferase, mitochondrial | 3 | 2354.49 | OAT | 6.64*
P62857 | 40S ribosomal protein S28 | 2 | 494.49 | RPS28 | 6.64*
P68133 | Actin, alpha skeletal muscle | 5 | 338594.84 | ACTA1 | 6.64*
Q13557-11 | Calcium/calmodulin-dependent protein kinase type II subunit delta | 2 | 726.75 | CAMK2D | 6.64*
Q13885 | Tubulin beta-2A chain | 3 | 105724.25 | TUBB2A | 6.64*
Q16881 | Thioredoxin reductase 1, cytoplasmic | 3 | 912.71 | TXNRD1 | 6.64*
Q7LBR1 | Charged multivesicular body protein 1b | 2 | 513.81 | CHMP1B | 6.64*
Q96C19 | EF-hand domain-containing protein D2 | 2 | 1198.19 | EFHD2 | 6.64*
Q9BUF5 | Tubulin beta-6 chain | 4 | 49502.99 | TUBB6 | 6.64*
Q9BVK6 | Transmembrane emp24 domain-containing protein 9 | 3 | 794.63 | TMED9 | 6.64*
Q9HD45 | Transmembrane 9 superfamily member 3 | 2 | 638.26 | TM9SF3 | 6.64*
Q9NZ08-2 | Endoplasmic reticulum aminopeptidase 1 | 4 | 1650.55 | ERAP1 | 6.64*
Q9P2J5 | Leucine--tRNA ligase, cytoplasmic | 2 | 370.75 | LARS | 6.64*
Q9UKZ9 | Procollagen C-endopeptidase enhancer 2 | 2 | 846.74 | PCOLCE2 | 6.64*
O43143 | Pre-mRNA-splicing factor ATP-dependent RNA helicase DHX15 | 4 | 1744.82 | DHX15 | 6.64*
O75462 | Cytokine receptor-like factor 1 | 2 | 475.67 | CRLF1 | 6.64*
P22570-3 | NADPH:adrenodoxin oxidoreductase, mitochondrial | 3 | 1158.12 | FDXR | 6.64*
Q12907 | Vesicular integral-membrane protein VIP36 | 2 | 768.59 | LMAN2 | 6.64*
Q13492 | Phosphatidylinositol-binding clathrin assembly protein | 3 | 2029.19 | PICALM | 6.64*
Q66GS9 | Centrosomal protein of 135 kDa | 2 | 678.52 | CEP135 | 6.64*
Q86SF2 | N-acetylgalactosaminyltransferase 7 | 2 | 866.22 | GALNT7 | 6.64*
Q8IXB1 | DnaJ homolog subfamily C member 10 | 2 | 1230.64 | DNAJC10 | 6.64*
Q9NQ48 | Leucine zipper transcription factor-like protein 1 | 2 | 848.89 | LZTFL1 | 6.64*
Q9UBX3-2 | Mitochondrial dicarboxylate carrier | 4 | 714.49 | SLC25A10 | 6.64*
Q9UBX5 | Fibulin-5 | 2 | 516.09 | FBLN5 | 6.64*

* Proteins detected (n ≥ 3) in the cohort and absent from healthy control patients
# Proteins detected (n ≥ 3) in healthy control patients and absent from cohort of interest


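Supplementary Table E.4.4 IPA upstream regulators analyses

Highest significance

Cohort | Upstream regulator | Classification type | -log (p-value) | #Annotated protein targets
Healthy smokers | TCR | Complex | 33.47 | 66
Healthy smokers | LONP1 | Peptidase | 28.37 | 35
Healthy smokers | MMP12 | Peptidase | 22.94 | 27
Healthy smokers | PLA2R1 | Transmembrane Receptor | 17.39 | 18
Healthy smokers | CST5 | Other | 17.00 | 52
Mild-moderate COPD | TCR | Complex | 35.47 | 68
Mild-moderate COPD | LONP1 | Peptidase | 27.06 | 34
Mild-moderate COPD | MMP12 | Peptidase | 21.57 | 26
Mild-moderate COPD | PLA2R1 | Transmembrane Receptor | 19.05 | 19
Mild-moderate COPD | CST5 | Other | 17.06 | 52
Severe-very severe COPD | TCR | Complex | 35.50 | 68
Severe-very severe COPD | LONP1 | Peptidase | 24.44 | 32
Severe-very severe COPD | MMP12 | Peptidase | 21.58 | 26
Severe-very severe COPD | CST5 | Other | 17.78 | 53
Severe-very severe COPD | PLA2R1 | Transmembrane Receptor | 17.42 | 18

Highest #annotated protein targets

Cohort | Upstream regulator | Classification type | -log (p-value) | #Annotated protein targets
Healthy smokers | TP53 | Transcription Regulator | 5.88 | 61
Healthy smokers | TGFB1 | Growth Factor | 10.11 | 47
Healthy smokers | TNF | Cytokine | 5.78 | 47
Healthy smokers | SMARCA4 | Transcription Regulator | 6.04 | 46
Healthy smokers | MYC | Transcription Regulator | 9.77 | 38
Mild-moderate COPD | TP53 | Transcription Regulator | 5.29 | 59
Mild-moderate COPD | TNF | Cytokine | 6.19 | 48
Mild-moderate COPD | SMARCA4 | Transcription Regulator | 6.46 | 47
Mild-moderate COPD | TGFB1 | Growth Factor | 9.64 | 46
Mild-moderate COPD | MYC | Transcription Regulator | 10.40 | 39
Severe-very severe COPD | TP53 | Transcription Regulator | 5.00 | 58
Severe-very severe COPD | TGFB1 | Growth Factor | 9.65 | 46
Severe-very severe COPD | TNF | Cytokine | 5.47 | 46
Severe-very severe COPD | SMARCA4 | Transcription Regulator | 5.71 | 45
Severe-very severe COPD | MYC | Transcription Regulator | 9.82 | 38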
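The two halves of this table rank the same kind of result in two ways: by significance and by the number of annotated protein targets. The short sketch below illustrates why the orderings differ, using the healthy smoker values from the second half of the table. It assumes that "-log" denotes a base-10 logarithm, which the table does not state explicitly, and the function and variable names are illustrative only.

```python
def p_value_from_neg_log10(neg_log_p):
    """Convert a -log10(p-value) score back to a p-value (base 10 is an assumption)."""
    return 10.0 ** (-neg_log_p)


# (regulator, -log p-value, #annotated protein targets), healthy smoker cohort
HEALTHY_SMOKER_REGULATORS = [
    ("TP53", 5.88, 61),
    ("TGFB1", 10.11, 47),
    ("TNF", 5.78, 47),
    ("SMARCA4", 6.04, 46),
    ("MYC", 9.77, 38),
]


def rank_by_significance(rows):
    """Regulator names ordered from most to least significant."""
    return [name for name, score, _ in sorted(rows, key=lambda r: r[1], reverse=True)]


def rank_by_targets(rows):
    """Regulator names ordered from most to fewest annotated targets (stable for ties)."""
    return [name for name, _, n in sorted(rows, key=lambda r: r[2], reverse=True)]
```

With these values TP53 has the most annotated targets, whereas TGFB1 is the most significant of the five, which is why both rankings are reported.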


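Supplementary Table E.4.5 IPA downstream molecular and cellular functions analyses

Highest significance

Cohort | Biological function | -log (p-value) | #Annotated proteins
Healthy smokers | Metabolism of protein | 68.76 | 204
Healthy smokers | Degranulation of cells | 67.28 | 157
Healthy smokers | Decay of mRNA | 63.96 | 71
Healthy smokers | Nonsense-mediated mRNA decay | 62.07 | 69
Healthy smokers | Initiation of translation of protein | 60.41 | 73
Mild-moderate COPD | Metabolism of protein | 67.59 | 202
Mild-moderate COPD | Degranulation of cells | 67.55 | 157
Mild-moderate COPD | Decay of mRNA | 64.09 | 71
Mild-moderate COPD | Nonsense-mediated mRNA decay | 62.19 | 69
Mild-moderate COPD | Initiation of translation of protein | 60.54 | 73
Severe-very severe COPD | Degranulation of cells | 68.49 | 158
Severe-very severe COPD | Metabolism of protein | 65.43 | 199
Severe-very severe COPD | Decay of mRNA | 64.12 | 71
Severe-very severe COPD | Nonsense-mediated mRNA decay | 62.22 | 69
Severe-very severe COPD | Initiation of translation of protein | 60.58 | 73

Highest #annotated proteins

Cohort | Biological function | -log (p-value) | #Annotated proteins
Healthy smokers | Necrosis | 25.55 | 252
Healthy smokers | Cell movement | 33.85 | 237
Healthy smokers | Apoptosis | 23.60 | 235
Healthy smokers | Cell death of tumor cell lines | 23.21 | 213
Healthy smokers | Migration of cells | 30.16 | 212
Mild-moderate COPD | Necrosis | 26.60 | 254
Mild-moderate COPD | Apoptosis | 25.00 | 238
Mild-moderate COPD | Cell movement | 34.16 | 237
Mild-moderate COPD | Cell death of tumor cell lines | 23.84 | 214
Mild-moderate COPD | Migration of cells | 30.42 | 212
Severe-very severe COPD | Necrosis | 27.84 | 257
Severe-very severe COPD | Cell movement | 35.62 | 240
Severe-very severe COPD | Apoptosis | 25.84 | 240
Severe-very severe COPD | Cell death of tumor cell lines | 25.49 | 218
Severe-very severe COPD | Migration of cells | 31.40 | 214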


Chapter 5: General Discussion & Future Research Directions

In this final chapter, the observations and results from our proteomic and phosphoproteomic analysis from chapters 2-4 are contextualised around existing knowledge. We discuss the potential of this work and provide ways forward to further dissect this complex disease.


5.1 Discussion

COPD continues to rise as a leading cause of death worldwide and an ever-increasing socioeconomic burden. Cigarette smoking remains the leading cause of COPD, despite numerous evidence-based studies describing its detrimental effects. To better understand this global problem, researchers have carried out sophisticated genomic and transcriptomic analyses of COPD. Notwithstanding these findings, the dynamic molecular players driving the development and progression of this disease are still largely unknown. This is reflected in current treatment strategies which provide patients with symptomatic relief but cannot halt the progression of the disease. Recent literature has provided some promising pharmaceutical therapies aiming to reduce inflammation, inhibit reactive oxygen species production and modulate mucus hypersecretion. Despite this progress there is still incomplete knowledge of the mechanisms underpinning the disease. Given this situation, we surmised that the implementation of proteomic and phosphoproteomic techniques coupled with our cigarette smoke-induced COPD mouse model, would assist in developing a meaningful understanding of the progressive changes leading to the induction and progression of COPD. Herein, I will present the key findings and limitations of these studies and highlight directions for future investigation that have arisen from this body of work.


5.1.1 Key findings across these results chapters

The studies described in this thesis have utilised a CS-induced COPD mouse model, which robustly induces the key hallmark human clinical features of the disease in a relatively short period (12 weeks). A unique aspect of this model is its defined time points allowing for investigation into the induction (4 and 6 weeks) and progression (8 and 12 weeks) phases of the disease. Utilising this model, we have comprehensively characterised the largest CS-induced pulmonary proteome to date, identifying >7,200 proteins and relating their changes in expression to the induction and progression of the disease. This vast inventory and the downstream functions and upstream regulators we have annotated are reflective of a highly dynamic and complex tissue environment.

We focused our attention on the 8-week time point in the progression phase due to the compelling number of dysregulated proteins (270). This time point corresponds with a critical window which recapitulates the well-established features of human COPD, such as airway remodelling, chronic pulmonary inflammation, emphysema, impaired lung function, and mucus hypersecretion. Importantly, these are irreversible hallmarks of COPD.1 We identified two critical RNA biosynthesis proteins, HNRNPC and MSI2, and a novel calcium signalling protein, S100A1, with the propensity to signal through TLR4 and modulate downstream inflammatory response pathways. We validated these changes using a targeted proteomic approach, adding to the cogency of the proteomic identifications in this mouse model. We have accompanied this vast inventory with the identification of >27,000 unique phosphorylation sites across the induction and progression phases of our CS-induced COPD model. We annotated these changes to potential downstream functions mediated by this post-translational modification and mapped the upstream kinases regulating such phosphorylation events. This has assisted us in compiling a list of clinically approved drugs targeting 13 distinct kinases that map to individual time points across the time course of COPD. To round out this body of work we completed a proteomic investigation of human samples, characterising 1,901 unique proteins in endobronchial biopsies from two patient cohorts suffering from mild and severe COPD, and control cohorts of healthy individuals and healthy smokers. This has allowed us to build a large annotated dataset of dysregulated changes associated with the progression of the disease in humans. Across all three of the results chapters, the dysregulated proteins and modified pathways were indicative of a number of the hallmark features of ageing in the lung related to cellular senescence, changes in proteostasis, oxidative stress, epigenetic alterations, genomic instability, and telomere attrition.2-4

5.1.2 Limitations and future directions

While this study has contributed a wealth of information on protein dysregulation in COPD, there are aspects which could be improved in our future work. The observations outlined in Chapter 4 on the human endobronchial biopsies are very promising but, due to the complexity and heterogeneity of COPD,5 small patient cohorts like the ones featured in this study risk biasing or masking findings of significance. Thus, larger clinical cohorts are required to truly delineate the meaning of the findings we have procured in this thesis. Notwithstanding these limitations, the methods outlined in this study have provided a new way to utilise OCT samples, which are conventionally collected from patients but not often used in MS-based proteomics. These new methods provide an avenue for researchers to obtain larger sample cohorts and make use of samples that have previously been stored for other purposes.

With regard to the utility of the TCA precipitation step in purifying the proteome, it should be noted that Zhao et al. achieved a quantitative proteome depth of >5,400 proteins through the use of chemical labelling with tandem mass tags and basic pH fractionation.6 When an unfractionated or unenriched complex sample is analysed on the mass spectrometer, a substantial amount of information can be lost due to ion suppression or overshadowing by more abundant peptides with similar elution times.7 The inclusion of fractionation steps is an important distinction which contributes greatly to proteome coverage depth. In Chapter 2, we demonstrated through the use of multi-dimensional strategies that we could achieve an unprecedented depth of coverage of mouse lung tissue by building on methods that we have published previously.8-10 However, in future work the implementation of hydrophilic interaction liquid chromatography (HILIC) as a fractionation technique would help to further characterise the proteome of these clinically important biopsies. Finally, a promising new strategy was recently published whereby proteome samples undergo a tandem HILIC – high pH reverse-phase (HPRP) fractionation prior to mass spectrometry.11 This involves taking the otherwise discarded flow-through eluate collected in the first ~15 min of a HILIC run (Supplementary Figures E2.2 and E3.1) and subjecting it to HPRP. This study obtained a noteworthy recovery using this tandem approach. Additional studies are required to validate this approach, but it shows promise as a superior strategy to utilise all biological material and ensure that no information is lost.

Future experiments will be required to test the utility and biological importance of the potentially druggable kinases listed in Table 3.3. Initial experiments will focus on specific time points, evaluating whether the phosphoproteomic profile at each time point can be shifted to more closely reflect that of the age-matched controls and whether disease establishment can be delayed or halted. All druggable kinases would be assessed in time, but an initial experiment would be best suited to target ErbB2 (4wk), EphA6 (6wk), DDR1 (8wk) and ErbB4 (12wk). In each case, mice would be treated for the weeks leading up to the time point of interest, at which tissue would be collected for testing of clinical features such as lung function, alveolar enlargement and pulmonary inflammation.


Also, within each time course, additional mice would undergo the smoking regime for the full 8 weeks to see if treatment disrupts the establishment of the key clinical features of COPD (Figure 5.1). These experiments would aim to understand whether the inhibition of these kinases can alleviate or delay disease establishment at the induction phases of disease and whether, at the time of disease establishment, where the majority of patients will be, the progression of disease can be halted.

Figure 5.1: Potential experiments targeting druggable kinases from Chapter 3. Mice would undergo eight or twelve weeks of the CS regime with targeted inhibition of the kinases ErbB2 (A), EphA6 (B), DDR1 (C) and ErbB4 (D), with lung tissue and BALF collected at the targeted time point and at the time point of disease establishment.


Taken together, our characterisation of the pulmonary proteome and its associated phosphorylation changes unequivocally points to a pivotal window of significant molecular change occurring in our CS-induced COPD mouse model. This 8-week time point may provide a unique opportunity to explore whether the significant candidates discussed in this thesis hold promise therapeutically and investigate their individual roles in the pathogenesis of COPD. In this way, future gene manipulation studies could focus on the effects of the targeted deletion of candidate proteins such as HNRNPC, MSI2, S100A1 and ALOX15 during smoke exposure in mice and the downstream impact of this on the developing features of COPD. Moreover, the selective inhibition of these proteins during different phases of the smoke exposure time course may allow us to understand their contribution to different phases of the disease.

5.1.3 Conclusions

These studies demonstrate the power of advancements made in proteomics to deconvolute a highly complex tissue and disease. We have provided methods to guide future studies in the field and we have demonstrated how the integration of bioinformatic tools is essential to comprehend these vast lists of changes. This is evidenced, in the case of the phosphoproteomic data, by the establishment of 13 potentially druggable kinases that warrant further investigation. These approaches could be applied to other model systems of asthma and lung cancer, through their respective overlaps with COPD.12-16

In conclusion, this body of work provides the field with an extensive inventory of protein candidates, their associated phosphorylation events, mapped functions, pathways and upstream kinases potentially mediating important changes to lung tissue. We anticipate that these studies will ultimately bring about a meaningful understanding of the complexity of COPD and contribute to the design of novel therapeutics.


5.1.4 References

1. Beckett EL, Stevens RL, Jarnicki AG, Kim RY, Hanish I, Hansbro NG, et al. A new short-term mouse model of chronic obstructive pulmonary disease identifies a role for mast cell tryptase in pathogenesis. J Allergy Clin Immunol 2013; 131:752-62.

2. López-Otín C, Blasco MA, Partridge L, Serrano M, Kroemer G. The hallmarks of aging. Cell 2013; 153:1194-217.

3. Meiners S, Eickelberg O, Königshoff M. Hallmarks of the ageing lung. European Respiratory Journal 2015; 45:807.

4. Brandsma C-A, de Vries M, Costa R, Woldhuis RR, Königshoff M, Timens W. Lung ageing and COPD: is there a role for ageing in abnormal tissue repair? European Respiratory Review 2017; 26:170073.

5. Agusti A. The path to personalised medicine in COPD. Thorax 2014; 69:857-64.

6. Zhao X, Huffman KE, Fujimoto J, Canales JR, Girard L, Nie G, et al. Quantitative Proteomic Analysis of Optimal Cutting Temperature (OCT) Embedded Core-Needle Biopsy of Lung Cancer. J Am Soc Mass Spectrom 2017; 28:2078-89.

7. Annesley TM. Ion suppression in mass spectrometry. Clin Chem 2003; 49:1041-4.

8. Degryse S, de Bock CE, Demeyer S, Govaerts I, Bornschein S, Verbeke D, et al. Mutant JAK3 phosphoproteomic profiling predicts synergism between JAK3 inhibitors and MEK/BCL2 inhibitors for the treatment of T-cell acute lymphoblastic leukemia. Leukemia 2018; 32:788-800.

9. Nixon B, De Iuliis GN, Hart HM, Zhou W, Mathe A, Bernstein I, et al. Proteomic profiling of mouse epididymosomes reveals their contributions to post-testicular sperm maturation. Mol Cell Proteomics 2018.

10. Nixon B, Johnston SD, Skerrett-Byrne DA, Anderson AL, Stanger SJ, Bromfield EG, et al. Proteomic profiling of crocodile spermatozoa refutes the tenet that post-testicular maturation is restricted to mammals. Mol Cell Proteomics 2018.

11. Sun Z, Ji F, Jiang Z, Li L. Improving deep proteome and PTMome coverage using tandem HILIC-HPRP peptide fractionation strategy. Anal Bioanal Chem 2019; 411:459-69.

12. Essilfie AT, Horvat JC, Kim RY, Mayall JR, Pinkerton JW, Beckett EL, et al. Macrolide therapy suppresses key features of experimental steroid-sensitive and steroid-insensitive asthma. Thorax 2015; 70:458-67.

13. Kim RY, Horvat JC, Pinkerton JW, Starkey MR, Essilfie AT, Mayall JR, et al. MicroRNA-21 drives severe, steroid-insensitive experimental asthma by amplifying phosphoinositide 3-kinase-mediated suppression of histone deacetylase 2. J Allergy Clin Immunol 2017; 139:519-32.

14. Kim RY, Pinkerton JW, Essilfie AT, Robertson AAB, Baines KJ, Brown AC, et al. Role for NLRP3 Inflammasome-mediated, IL-1beta-Dependent Responses in Severe, Steroid-Resistant Asthma. Am J Respir Crit Care Med 2017; 196:283-97.

15. Fricker M, Deane A, Hansbro PM. Animal models of chronic obstructive pulmonary disease. Expert Opin Drug Discov 2014; 9:629-45.

16. Starkey MR, Jarnicki AG, Essilfie AT, Gellatly SL, Kim RY, Brown AC, et al. Murine models of infectious exacerbations of airway inflammation. Curr Opin Pharmacol 2013; 13:337-44.


Appendix: Publications


Shwachman-Bodian-Diamond syndrome (SBDS) protein is a direct inhibitor of protein phosphatase 2A (PP2A) activity and negative prognostic marker in acute myeloid leukemia.

Matthew D. Dun 1,2,#, Callum J. Rigby 1,2, Hamish D. Toop 3, Stephen Butler 3, Jonathan Sillar 1,2,4, Ryan J. Duchatel 1,2, Zacary Germon 1,2, Sam Faulkner 1,2, Mengna Chi 1,2, Abdul Mannan 1,2, David Skerrett-Byrne 1,5, Heather C. Murray 1,2, Richard G. S. Kahl 1,2, Hayley Flanagan 1,2, Juhura G. Almazi 1,2, Brett Nixon 5, Geoff De Iuliis 5, Charles E. de Bock 6, Frank Alvaro 2,7, Jonathan C. Morris 3, Anoop K. Enjeti 1,2,4,8,9, Nicole M. Verrills 1,2,#

1 University of Newcastle, Faculty of Health and Medicine, University of Newcastle, Callaghan, NSW, Australia.

2 Hunter Medical Research Institute, Cancer Research Program, New Lambton Heights, NSW, Australia.

3 School of Chemistry, University of New South Wales, Sydney, New South Wales, Australia.

4 Calvary Mater Hospital, Newcastle, NSW, Australia

5 University of Newcastle, Reproductive Science Group, Faculty of Science, Callaghan, NSW, Australia

6 Children’s Cancer Institute Australia, University of New South Wales, NSW, Australia

7 John Hunter Children’s Hospital, New Lambton Heights, NSW, Australia

8 NSW Health Pathology, John Hunter Hospital, Lookout Road, New Lambton Heights, NSW, Australia

9 School of Medicine and Public Health, Faculty of Health and Medicine, University of Newcastle, Callaghan, NSW, Australia

Running title: SBDS is a PP2A inhibitory protein

Conflict of Interest

NIL

# Corresponding authors:

Matthew D. Dun
University Drive, Callaghan
NSW 2308, Australia
Phone: (02) 4921 5693, Email: [email protected], Twitter: @MattDun17

Nicole M. Verrills
University Drive, Callaghan
NSW 2308, Australia
Phone: (02) 4921, Email: [email protected]

ABSTRACT

Protein phosphatase 2A (PP2A) is a serine/threonine phosphatase inactivated in many cancers including acute myeloid leukemia (AML). Activation of PP2A is emerging as a potential therapeutic strategy; however, the mechanisms underpinning PP2A inhibition are not well understood. We have previously shown that PP2A inhibition is essential for leukemia driven by oncogenic mutant c-KIT, and PP2A activating drugs FTY720 and AAL(S) inhibit mutant c-KIT cell survival. Using this model, we have identified the ribosome biogenesis protein SBDS as a target of FTY720 and AAL(S). SBDS is mutated in Shwachman-Diamond syndrome, a disease resulting in bone marrow failure and myeloid malignancies in adolescents and young adults. Herein, we show SBDS binds to PP2A complexes comprised of the B55α regulatory subunit of PP2A. shRNA mediated knockdown of SBDS increased PP2A activity and induced apoptosis. In contrast, addition of recombinant SBDS to PP2A complexes inhibited activity. Immunohistochemical analysis revealed high SBDS protein expression in AML blasts, whereas primitive myeloid cells of healthy controls or an AML patient in complete remission did not express SBDS. Furthermore, high SBDS mRNA expression was associated with decreased overall survival in high-risk AML patients. Together, our data present a role for SBDS in the dysregulation of PP2A in AML.

Word count: 199

INTRODUCTION

PP2A is a family of serine/threonine phosphatases commonly inhibited in cancer, particularly acute myeloid leukaemia (AML) 1, 2. PP2A inhibition is required for cell transformation 3, and results in the enhanced activation of proliferative signalling pathways including the PI3K/AKT/mTOR 4, ERK 5 and TGF-β 6 pathways, and inhibition of apoptotic processes 7. PP2A has therefore become an attractive pharmacological target for improved anticancer therapies. Work from our laboratory and others has shown that PP2A is inhibited in mutant c-KIT 8, 9 and mutant FLT3 2, 10 AML cell lines and patient samples. Genetic mutations in PP2A subunits have been described in solid cancers 11; however, this is extremely rare in AML 12. PP2A activity is low in up to 78% of cases of AML at diagnosis 2. Reduced expression of the catalytic subunit of PP2A, PP2Ac (PPP2CA), has been observed in AML patients harbouring TP53 mutations and cytogenetically complex karyotypes 13. The inhibition of PP2A, on the other hand, is usually associated with the over-expression of PP2A inhibitory proteins such as SET 14, CIP2A 2 and SETBP 15. The mechanisms underpinning PP2A inhibition in c-KIT mutant AML are not known.

Pharmacological activation of PP2A is possible using PP2A activating drugs such as Forskolin 16, the immunosuppressant FTY720 17, and the non-immunosuppressive analogue of FTY720, AAL(S) 10, 18, 19. The immunosuppressive properties of FTY720 are attributed to its phosphorylation at one of the alcohol groups to generate the active metabolite, FTY720-P 20. Importantly, non-phosphorylated FTY720 21 has been shown to induce apoptosis in a number of cancer cell types 8, 14. We have recently performed structure-activity relationship studies using a library of non-phosphorylatable FTY720 analogues that we have synthesized, and revealed that the amino-alcohol head group of AAL(S) is superior to that of FTY720 for PP2A activation and inducing AML cell death in vitro 19. However, the cellular targets of AAL(S) have not been reported.

In this study, we use a chemical proteomics approach to identify targets of AAL(S) and FTY720 in myeloid cell lines harbouring the receptor tyrosine kinase (RTK) c-KIT point mutation, D816V (c-KIT/D816V). Mutant c-KIT/D816V occurs in 46% of core binding factor (CBF)-AMLs (t(8;21)) 22, and is associated with increased risk of relapse and decreased overall survival 23, 24. Expression of c-KIT/D816V confers constitutive activation of the receptor, hyper-activation of downstream signalling pathways, and inhibition of PP2A 8. Herein, we identify the Shwachman-Bodian-Diamond syndrome (SBDS) protein as a direct target of AAL(S) and FTY720. We further show that SBDS binds to PP2A complexes and inhibits their activity, while knockdown of SBDS increased PP2A activity and cell death in c-KIT mutant myeloid cells, but not in factor-dependent, non-leukemogenic myeloid cells. Thus, SBDS is a novel PP2A inhibitor and target of PP2A activating drugs. Finally, we show that primary AML patient blasts express SBDS, and high SBDS mRNA expression was significantly associated with worse disease outcome in patients classified as high-risk, or in those harbouring relevant c-KIT or FLT3:NPM1 expression profiles, thereby revealing a novel prognostic marker and drug target for new anti-AML therapies.

MATERIALS AND METHODS

Cell lines, retroviral infection, and survival assays

The FDC-P1 growth factor–dependent mouse myeloid progenitor cell line 25 stably expressing an empty vector (EV) or the imatinib-resistant human c-KIT-D816V were used as previously described 26. Stable and transient knockdown of SBDS was achieved in FDC-P1 cells using shRNA targeting the mouse Sbds gene or a non-specific control sequence (shSCRM) in the pFSYci vector with a YFP reporter gene under the control of a bicistronic promoter (a kind gift from Dr Daniel Link, Washington University School of Medicine, St Louis, USA and prepared as described 27). Transient molecular inhibition was achieved by lentiviral transduction of FD-EV and c-KIT/D816V cells with shRNA viral supernatant for 24 hours. Cells were then sorted on a FACSAria (BD Biosciences, San Jose, CA, USA) into populations expressing high or low YFP, as a proxy marker for relative SBDS knockdown. Annexin V staining was performed 24 hours post sorting 8. Cell growth and YFP expression were monitored for 6 days by Trypan blue viability staining and flow cytometry, respectively. In addition, stable cell lines were selected by sorting populations of YFP+ cells 27.

Patient recruitment, sample processing and survival analysis

Human bone marrow aspirates from 14 patients diagnosed with AML between 2010 and 2013 at the Calvary Mater Newcastle (CMN) Hospital, Australia, were obtained after informed consent according to institutional guidelines in keeping with the Declaration of Helsinki, and studies were approved by the Hunter New England Area Health Human Ethics Committee (see Supplemental Table S1 for patient information). Diagnosis was confirmed using cytomorphology, cytogenetics, and flow cytometry according to the World Health Organization (WHO) classification of the myeloid neoplasms 28. Cytogenetic risk classification categories were defined according to the Medical Research Council schema. Immunohistochemical evaluation of SBDS expression was performed using the automated tissue staining facility at the Berghofer Queensland Institute of Medical Research. c-KIT status was assessed using high-resolution melt analysis and direct sequencing from Ficoll-purified mononuclear cells as described 8. RNA-Seq data was downloaded from the Cancer Genome Atlas (TCGA) on the 12th of June 2019, and analysed using SurvExpress 29. Risk was calculated using the Cox model: samples were ranked by their prognostic index and then split into low-, medium- and high-risk groups.
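The risk grouping described above can be sketched as follows. This is a minimal illustration and not the SurvExpress implementation: it assumes that the prognostic index is the sum of Cox coefficients multiplied by expression values, and that the ranked samples are cut into three equally sized groups. The coefficients, sample names and helper functions are hypothetical.

```python
from typing import Dict

RISK_LABELS = ("low", "medium", "high")


def prognostic_index(expression: Dict[str, float], coefficients: Dict[str, float]) -> float:
    """Linear predictor of a Cox model: sum of coefficient * expression per gene."""
    return sum(beta * expression[gene] for gene, beta in coefficients.items())


def split_into_risk_groups(index_by_sample: Dict[str, float]) -> Dict[str, str]:
    """Rank samples by prognostic index and cut the ranking into three equal parts."""
    ranked = sorted(index_by_sample, key=index_by_sample.get)
    n = len(ranked)
    return {
        sample: RISK_LABELS[min(rank * 3 // n, 2)]
        for rank, sample in enumerate(ranked)
    }


# Hypothetical Cox coefficients and expression values, for illustration only.
coefficients = {"SBDS": 0.6, "KIT": 0.3}
samples = {
    "s1": {"SBDS": 1.0, "KIT": 1.0},
    "s2": {"SBDS": 2.0, "KIT": 1.0},
    "s3": {"SBDS": 3.0, "KIT": 2.0},
    "s4": {"SBDS": 4.0, "KIT": 2.0},
    "s5": {"SBDS": 5.0, "KIT": 3.0},
    "s6": {"SBDS": 6.0, "KIT": 3.0},
}
index_by_sample = {
    name: prognostic_index(expr, coefficients) for name, expr in samples.items()
}
risk_groups = split_into_risk_groups(index_by_sample)
```

With an equal-count split, each group holds one third of the cohort regardless of how the prognostic index is distributed; other splitting rules would change group sizes but not the ranking step.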

PP2A activating drug (PAD) affinity chromatography

A click reaction was used to couple either hydrophilic (PEG4) or hydrophobic (C4) terminal azide agarose affinity beads to AAL(S) and O-FTY720 synthesised with a terminal acetylene group on their hydrophobic tail (described in Supplementary Information). Cells were lysed in PP2A activity assay buffer 10 and 500 µg of protein lysate was incubated with control beads at 4 °C for 2 hours before incubation with AAL(S) or O-FTY720 beads at 4 °C overnight. Control and active beads were washed 3 times in sterile Tris-buffered saline (TBS), 0.01% Tween 20 v/v for 5 minutes at room temperature. Bound proteins were first eluted using a competitive elution strategy employing 250 nM native AAL(S) or FTY720 for 10 minutes at room temperature. The remaining bound proteins were subjected to reducing conditions by boiling in 1× NuPAGE® LDS Sample Buffer (Invitrogen, Carlsbad, CA, USA) containing 2% β-mercaptoethanol v/v for 5 minutes. Proteins were separated and silver stained 30, and proteins that bound uniquely to active beads from c-KIT/D816V cell lysates were excised and subjected to liquid chromatography tandem mass spectrometry (LC-MS/MS) as previously described 31. MS results were searched with Mascot 2.2 against the Uniprot_mouse database, and proteins were reported if the Mascot score was ≥67.

Western blotting, co-immunoprecipitation, PP2A activity assay

Whole cell lysates for Western blot analysis were prepared using ice cold RIPA buffer 32 containing 5 mM Na3VO4, protease inhibitors (Complete; Roche, Basel, Switzerland) and PhosSTOP (Roche). Cells were probe-tip sonicated for 2 × 20 seconds on ice and mixed for 30 minutes at 4 °C. Western blot analysis was performed using anti-SBDS, anti-PP2A-B56α, anti-Histone H3, anti-SET, anti-SETBP (Santa Cruz Biotechnology, Dallas, TX, USA), anti-SBDS (Genetex, Irvine, CA, USA), in-house anti-PP2Ac 33 and commercial anti-PP2Ac, anti-PP2A-B55α, anti-PP2A-A (Millipore, Burlington, MA, USA), anti-CIP2A and anti-β-actin (Sigma-Aldrich). Secondary antibodies were horseradish peroxidase (HRP) conjugated antibodies (BioRad, Hercules, CA, USA) and native anti-mouse-HRP (Rockland Immunochemicals, Limerick, PA, USA). Bands were visualised using a cooled charge coupled device camera (ImageQuant LAS-4000; GE Healthcare, Chicago, IL, USA) 31. SBDS and PP2A interacting proteins were identified by co-immunoprecipitation coupled to LC-MS/MS as described 30, 31. PP2A activity assay was performed as described, using anti-PP2Ac (1D6) (Millipore) 8, 10.

Immunolocalisation

Cells were washed in phosphate-buffered saline (PBS) twice and diluted to 5 × 10⁵ cells/ml. Slides were coated with foetal calf serum (FCS) and 100 µL of cell suspension was centrifuged onto each glass slide at 500 × g for 5 minutes. Cells were fixed in 3.7% paraformaldehyde/PBS for 10 minutes, washed 3 times with PBS, permeabilised with 0.1% Triton X-100 for 3 minutes and blocked in 10% FCS/PBS at room temperature for 20 minutes. Primary incubation was undertaken with 1:100 dilution of primary antibody (SBDS – Genetex, PP2Ac – in house) at 4 °C overnight. Slides were subjected to 3 × 5 minute washes with PBS and incubated in a 1:500 dilution of the appropriate Alexa Fluor conjugated secondary antibody (BD Biosciences, Franklin Lakes, NJ, USA) at room temperature for 45 minutes, then washed and mounted in ProLong Gold anti-fade with DAPI (Life Technologies, Carlsbad, CA, USA), and imaged using a LSM510 laser scanning confocal microscope (Carl Zeiss, Oberkochen, Germany) 31.
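As a quick check of the cell numbers implied by this protocol, the number of cells deposited on each slide can be worked out from the stated density and volume. The sketch below assumes the density reads 5 × 10⁵ cells/ml (the exponent is inferred from the text) and uses a hypothetical helper name.

```python
def cells_per_slide(cells_per_ml: float, volume_ul: float) -> float:
    """Cells loaded onto one slide: density (cells/ml) times volume converted to ml."""
    return cells_per_ml * (volume_ul / 1000.0)


# Assumed density of 5 x 10^5 cells/ml and the stated 100 µL per slide.
loaded_cells = cells_per_slide(5e5, 100)
```

Under these assumptions each cytospin carries about 5 × 10⁴ cells.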

Molecular modelling

All proteins used were prepared by the addition of protons, protonation at biological pH, and restrained minimisation using an optimised potentials for liquid simulations (OPLS3) force field. Protein structures were taken from the Research Collaboratory for Structural Bioinformatics Protein Data Bank archive (RCSB PDB): accession codes 3DW8 for PP2A (Protein Phosphatase 2A Holoenzyme with B55α) 34, 2L9N for SBDS 35 and 2E50 for SET 36. Each of the 20 available conformers of SBDS was prepared separately, and assigned a number, 2L9N_1-20, as per the original designation in the PDB file 35. Solvent mapping was conducted using the FTMap web server, and averaged across all 20 SBDS conformers 37 (Supplementary Table S2). Protein-protein docking was conducted using the ClusPro web server 38-40. Results were analysed with ‘balanced’ coefficients, and ranking was based on cluster size, model scores, and comparison with results from solvent mapping. Automated binding site detection was conducted using Schrödinger’s SiteMap, implemented through Maestro 11.0, both with and without the recommended pre-set values for shallow binding sites 41, 42. The top five ranked binding sites were retained for consideration. All ligands were prepared for use by protonation at biological pH and minimisation with an OPLS3 force field. Ligand-protein docking was conducted using Extra Precision Glide, implemented through Schrödinger’s Maestro 11.0 43.
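The conformer labelling and the averaging of solvent-mapping results across conformers can be illustrated with a short sketch. It does not reproduce FTMap output: the per-residue contact counts, the input layout and the function name are hypothetical, and a residue absent from a conformer is assumed to contribute zero.

```python
from typing import Dict

# Labels for the 20 NMR conformers, following the 2L9N_1-20 designation.
conformer_ids = [f"2L9N_{i}" for i in range(1, 21)]


def average_across_conformers(
    counts_by_conformer: Dict[str, Dict[str, float]]
) -> Dict[str, float]:
    """Mean per-residue value over all conformers; missing residues count as zero."""
    totals: Dict[str, float] = {}
    for counts in counts_by_conformer.values():
        for residue, value in counts.items():
            totals[residue] = totals.get(residue, 0.0) + value
    n_conformers = len(counts_by_conformer)
    return {residue: total / n_conformers for residue, total in totals.items()}


# Made-up contact counts for two conformers, for illustration only.
example_counts = {
    "2L9N_1": {"L12": 4, "V15": 2},
    "2L9N_2": {"L12": 2},
}
mean_contacts = average_across_conformers(example_counts)
```

Dividing by the total number of conformers, rather than by the number in which a residue appears, keeps residues that are contacted in only a few conformers from being over-weighted.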


RESULTS

FTY720 and AAL(S) directly bind to SBDS that is part of a larger PP2A protein complex

FTY720 and its close structural analogue AAL(S) are small molecule inhibitors that activate PP2A in AML cells. FTY720 has been reported to bind to the PP2A inhibitory protein SET 44; however, the cellular target of AAL(S) is not known. To identify cellular targets of these agents, we synthesised a series of AAL(S)- and FTY720 agarose affinity beads using either a hydrophobic or hydrophilic linker (Figure 1A) and screened these in non-reduced lysates from isogenic growth factor-dependent FDC-P1 myeloid progenitor cell lines expressing either an empty vector (FD-EV) or the c-KIT mutation D816V (c-KIT/D816V), which are sensitive to PP2A activation and cytotoxicity induced by FTY720 and AAL(S) 8, 19.

A protein band resolving at 28 kDa (Figure 1B - red arrowhead), which competitively eluted from the beads using native AAL(S) in c-KIT/D816V lysates but not in FD-EV cells, was excised and subjected to LC-MS/MS analysis, identifying the Shwachman-Bodian-Diamond Syndrome protein (SBDS) (Supplementary Table S3). Western blotting after a drug competition assay confirmed SBDS as a target of both AAL(S) hydrophobic (Figure 1C) and hydrophilic linker drug-affinity beads (Figure 1D) and also of O-FTY720 beads (Figure 1E). Importantly, SBDS was not eluted from control beads lacking the amino alcohol group of either AAL(S) or FTY720 (Figure 1B-E). This strategy also eluted the catalytic subunit of PP2Ac and the SET protein under denaturing conditions, but not with competitive drug elution, albeit considerably less SET was eluted from AAL(S) drug affinity beads than from O-FTY720-beads (Figure 1E). These results suggested that SBDS directly binds to FTY720 and AAL(S), and forms a macromolecular complex with PP2A.

To independently identify SBDS interacting proteins, SBDS immunoprecipitates from c-KIT/D816V cells were subjected to LC-MS/MS (Supplementary Figure S1A). Peptides were mapped with high confidence to SBDS, the PP2A catalytic subunit (PP2Ac), PP2A structural subunit (PP2A-A), PP2A regulatory subunit B55α (B55α), nucleophosmin (NPM1), and SET (Figure 1F, Supplementary Table S4). No other PP2A regulatory subunits were identified in the immunoprecipitates. These novel and specific interactions were then validated in c-KIT/D816V cells using reciprocal co-immunoprecipitation with both SBDS and PP2Ac antibodies (Figure 1G, H). SBDS and PP2Ac also showed strong colocalisation in the nucleus and cytoplasm of c-KIT/D816V cells, but not in the control FD-EV cells (Supplementary Figure S1B). Potential direct binding was further investigated in silico using all available nuclear magnetic resonance (NMR) structures of SBDS, which identified an amphipathic binding pocket within the N-terminal domain of SBDS that likely interacts with PP2A (Figure 1I). This region corresponded well with binding hot spots identified using the solvent mapping platform FTMap 37 (Supplementary Figure S1C, Supplementary Table S2). We next docked AAL(S) and FTY720 into SBDS, which were predicted to bind leucine 12 (L12) and valine 15 (V15) within a small hydrophobic pocket localized in the FYSH domain of SBDS. Serine 61 (S61) and aspartic acid (Q94) formed hydrogen bonds, while glutamic acid 28 (E28) formed a salt bridge with the amino alcohol region of AAL(S) and FTY720 and SBDS (Figure 1J, magnified). This region is between the predicted site of interaction with PP2Ac and PP2A-B55α subunits. Taken together, these data suggest that SBDS binds PP2A, and this is likely disrupted by direct binding of AAL(S) and FTY720.

SBDS is a novel inhibitor of PP2A activity

We sought to determine whether SBDS can directly inhibit PP2A activity. Western blot analysis showed no difference in SBDS protein expression in c-KIT/D816V cells compared to control factor-dependent myeloid progenitor (FD-EV) cells (Supplementary Figure S2A). However, PP2A activity was decreased in c-KIT/D816V cells compared to FD-EV cells (Figure 2A), and could be reactivated in c-KIT/D816V cells by AAL(S) or following shRNA mediated knockdown of SBDS. In contrast, SBDS knockdown had no effect on PP2A activity in the control FD-EV cells (Figure 2A, B) or in either cell line following treatment with the c-KIT inhibitor Dasatinib (Figure 2A, B). Notably, the addition of recombinant SBDS (rSBDS) to PP2A immunoprecipitates reduced the activity of PP2A in both the c-KIT/D816V and FD-EV cells (Figure 2A, B). Activation of PP2A by AAL(S) also reduced SBDS association with PP2A complexes as determined by co-immunoprecipitation (Figure 2B) and immunofluorescence (Figure 2C).

SBDS expression is essential for the survival of c-KIT/D816V cells.

To examine the functional effects of SBDS inhibition, FD-EV and c-KIT/D816V cells were transiently transfected with a YFP-tagged SBDS-shRNA (shSBDS) or a YFP-tagged scrambled shRNA control (shSCRM) for 24 hours and sorted for populations expressing high YFP. SBDS knockdown resulted in increased apoptosis (Figure 3A, Supplementary Figure S3), increased cleaved caspase 3 (Figure 3B) and decreased proliferation in c-KIT/D816V cells but not in FD-EV cells (Figure 3C). This was also reflected in a reduction in the percentage of YFP+ cells over time in c-KIT/D816V with SBDS knockdown (Figure 3D). Together, these data show that inhibition of SBDS increases PP2A activity, inhibits proliferation, and induces cell death in c-KIT/D816V myeloid cells.

SBDS overexpression is a poor prognostic marker in AML.

To determine the clinical relevance of SBDS expression in AML, we assessed SBDS mRNA expression across AML risk groups using publicly available TCGA RNAseq data of 168 AML patients. Expression levels of SBDS were strongly associated with cytogenetic risk factors in AML groups (p = 1.55 × 10⁻⁵⁷) (Figure 4A). Patients harbouring combined high-level SBDS and c-KIT expression profiles were significantly associated with shorter survival (hazard ratio: high / low 1.86, 95% CI of ratio 1.26 to 2.73, p=0.001672) (Figure 4B), in contrast to patients harbouring high-level c-KIT expression profiles alone (data not shown). Patients with combined high-level SBDS and FLT3 expression profiles showed an overall survival of 23 months (hazard ratio = 1.72, CI 1.17 ~ 2.52, p=0.006155) (Figure 4C) compared to 25 months for patients expressing high-level FLT3 alone (FLT3 hazard ratio = 1.6, CI 1.09 ~ 2.36, p=0.01629) (data not shown). Patients expressing high-level SBDS|FLT3|NPM1 profiles were significantly associated with shorter overall survival (hazard ratio = 1.59, CI 1.08 ~ 2.34, p=0.001757) (Figure 4D), which included patients classified with favourable (hazard ratio high / low = 3.89, CI 1.05 ~ 14.41, p=0.04234) (data not shown) or intermediate risk cytogenetics (hazard ratio = 1.71, CI 1.04 ~ 2.81, p=0.03509) (Figure 4E). Importantly, we also assessed SBDS protein expression in the bone marrow of 14 AML patients harbouring various risk groups and cytogenetics as well as in normal hematopoietic progenitor cells from a patient in remission and a healthy control. This approach revealed a median overall survival of 17.5 months (1.5 years) for patients expressing SBDS, and 55 months (4.6 years) for patients with absent SBDS expression in their blast cells, not reaching statistical significance (hazard ratio high / low = 1.552, CI 0.5313 ~ 5.797, p=0.2225) (Figure 4F). Overexpression of SBDS by immunohistochemistry in myeloblasts was observed in trephines obtained from c-KIT mutant (Figure 4G, H), c-KIT wild type cytogenetically normal (Figure 4I) and complex karyotype patients (Figure 4K, L). Patients harbouring favourable risks (e.g. FLT3/ITD negative plus NPM1 mutation - NPMc) did not express SBDS in their myeloblasts (Figure 4J). Importantly, no SBDS expression was observed in the myeloblasts of a patient with a c-KIT/M541L mutation in complete remission following standard induction chemotherapy (Figure 4M), nor was there any SBDS expression in myeloid precursors in the trephine from a healthy control (Figure 4N). Erythroid precursors, as expected, showed uniformly high SBDS expression in all bone marrow trephines evaluated.
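A hazard ratio, its confidence interval and its p-value are linked through the standard error of the log hazard ratio, so a reported triplet can be sanity-checked. The sketch below is an approximation under stated assumptions: it treats the interval as a 95% Wald interval that is symmetric on the log scale, which may differ from the test actually used to produce the reported p-values. The function names are hypothetical.

```python
import math


def wald_z_and_p(hazard_ratio: float, ci_low: float, ci_high: float,
                 z_crit: float = 1.959964) -> tuple:
    """Recover the Wald z statistic and two-sided p-value from a hazard ratio and its CI."""
    # Standard error of log(HR), assuming a symmetric interval on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hazard_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


def months_to_years(months: float) -> float:
    """Convert months to years, rounded to one decimal place."""
    return round(months / 12, 1)


# Combined high-level SBDS and c-KIT expression: HR 1.86, 95% CI 1.26 to 2.73.
z_stat, p_value = wald_z_and_p(1.86, 1.26, 2.73)
```

For the SBDS and c-KIT comparison this approximation gives a p-value close to the reported p=0.001672, and `months_to_years` reproduces the bracketed conversions of the median survival times.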


DISCUSSION

PP2A inhibition is common in cancer, thus activation of PP2A has become an attractive therapeutic strategy 2, 8, 10. Despite this promise, the mechanisms underpinning PP2A dysregulation and reactivation remain unclear. In seeking to address this, here we report the identification of a novel PP2A inhibitory protein, the ribosome biogenesis protein, SBDS, as a novel target of FTY720 and the chiral deoxy analogue of FTY720, AAL(S) (Figure 1). Additionally, we show that SBDS is a novel PP2A interacting protein (Figures 1, 2) that is critical for the survival of mutant c-KIT/D816V myeloid progenitor cells (Figure 3). Thus, re-activation of PP2A via SBDS targeting provides a novel therapeutic approach for cancers harbouring c-KIT mutations.

The most well characterized PP2A activating compound, the immunosuppressant FTY720 (fingolimod/Gilenya®), is a pro-drug that is phosphorylated by SPHK2 in vivo 20. FTY720-P acts as a high-affinity agonist for four of the five G protein-coupled sphingosine-1-phosphate receptors (S1PRs), causing receptor internalization and increased activity of the JAK2-PI3Kγ-PKC signalling axis, a common molecular driver of myeloproliferative neoplasms 45. However, we and others have shown that phosphorylation of FTY720 is dispensable for PP2A activating and antineoplastic activities 10, 19. Non-phosphorylated FTY720 binds to the SET nuclear proto-oncogene in leukaemia 46 and lung cancer 47 cell lines, leading to PP2A activation and apoptosis. To enhance the antineoplastic activity of these compounds, and to limit side-effects, we and others have developed non-phosphorylatable analogues of FTY720 that inhibit proliferation and induce apoptosis in several cancer cell lines through the activation of PP2A 19, 48. However, the molecular targets of these novel chemicals were hitherto unknown.

The protein at the centre of PP2A activity, the catalytic subunit (PP2Ac), is highly expressed and maintained at constant levels in all cell types 49. The dephosphorylating activity of PP2Ac is regulated by a combination of post-translational modifications and the association of regulatory subunits and endogenous inhibitors 50. The endogenous PP2A inhibitor, SET, has previously been reported to bind to FTY720 47. In these experiments, SET was affinity purified by binding to biotinylated FTY720 47, and was associated with PP2A. However, unlike SET, which we found only eluted from AAL(S)- and FTY720-drug affinity beads upon denaturation, we show here that SBDS eluted from AAL(S)- and FTY720-drug affinity beads using native drug as well as denaturation (Figure 1B, C). These findings suggest that FTY720 and AAL(S) can interact with SET when it is bound to PP2A complexes, a finding supported by a recent NMR analysis showing FTY720 bound to SET when in complex with B56-containing PP2A complexes 44. Our data further indicate that FTY720 and AAL(S) can bind to both monomeric and PP2A-B55α associated SBDS protein.

SBDS is mutated in the rare autosomal recessive condition Shwachman-Diamond Syndrome (SDS). This mutation arises from a gene conversion with an adjacent pseudogene and results in the insertion of in-frame stop codons at conserved sites, leading to multisystem dysfunction with haematopoiesis being particularly affected 51. Bone marrow failure is common 52, and SDS patients are predisposed to leukaemic transformation, particularly AML 53; however, the mechanisms of transformation are very poorly understood. The syndrome is characterised by SBDS loss of function, which is in contrast to the oncogenic role we describe for the protein here. Whether the loss of function of SBDS drives over-activation of PP2A and anti-apoptotic processes in SDS patients, analogous to T cell leukaemia cells 54, remains to be determined.

The novel PP2A inhibitory complex comprising SBDS-PP2Ac showed distinct colocalisation patterns, predominantly in the nucleus of c-KIT mutant myeloid progenitor cells (Figure 2C). This pattern of localisation has previously been reported for SBDS in mature myeloid cells 55. Co-immunoprecipitation of SBDS identified the PP2A regulatory subunit PP2A-B55α as part of the holoenzyme bound by SBDS, together with the structural subunit PP2A-A. The PP2A-B55α complex plays an important role in mitotic exit, at least in part via its dephosphorylation of the pocket proteins Rb and p107 56. This, together with the reduced proliferation induced by transient SBDS knockdown in c-KIT/D816V cells, suggests that SBDS regulation of PP2A-B55α complexes plays an important role in mitosis.

We used the resolved crystal structure of PP2A-AB55αC 34 in molecular modelling studies to reveal putative interactions between the N-terminal FYSH domain (Fungal, Yhr087wp, Shwachman) of SBDS and PP2Ac. This interaction may modulate the activity of PP2Ac by blocking substrate phosphopeptides from binding and hence preventing their dephosphorylation. Molecular modelling also identified the N-terminal domain of SBDS as being important for AAL(S) and FTY720 binding, with both drugs docking into an amphipathic pocket (Figure 1J). It is interesting to note that the SBDS gene mutations commonly seen in Shwachman-Diamond syndrome (SDS) are located in exon 2 and intron 2, and result in a premature stop codon (K62X) and a frameshift mutation resulting in a stop codon (C84fsX3), respectively 51. Whether these truncated forms are expressed, can interact with PP2A, and are targets of AAL(S) and FTY720 in Shwachman-Diamond syndrome remains to be determined.

Activation of PP2A was achieved pharmacologically using AAL(S) and by molecular inhibition of SBDS expression in c-KIT/D816V cells, and could be rescued by incubating PP2A complexes with recombinant SBDS (Figure 2A, B). Interestingly, introduction of recombinant SBDS also reduced the activity of PP2A in FD-EV control myeloid progenitor cells. Therefore, SBDS can inhibit the PP2A complexes in these cells, yet we saw no SBDS in PP2Ac immunoprecipitates from these cells, and the activity of PP2A is much higher in these cells than in the c-KIT/D816V cells. This was not due to altered expression of SBDS in FD-EV compared to c-KIT/D816V cells; hence, future studies are required to determine why SBDS associates with PP2A complexes in c-KIT/D816V cells but not in FD-EV cells.

Analysis of AML patients in the TCGA dataset showed a strong correlation between cytogenetic risk and level of SBDS mRNA expression. Relevant to our study, high SBDS and c-KIT gene expression were associated with worse overall survival (Figure 4B). This was mirrored for AML patients expressing SBDS with FLT3 (Figure 4C) and patients expressing SBDS with FLT3 and NPM1. The co-expression modelling may be able to further risk stratify favourable (Figure 4D) and intermediate cytogenetic risk AML (Figure 4E). In particular, the intermediate cytogenetic group is a challenging cohort with variable prognosis. Although the consensus ELN (European LeukemiaNet) guideline provides a significant improvement in risk assessment, further refinement of prognostic evaluation may help overcome the limitations of the current systems 57, 58.

Assessment of SBDS protein expression in the blasts of AML patients showed cytoplasmic and nuclear localization in 9/14 patients (Figure 4G-L). By contrast, we detected very little SBDS expression in the myeloblasts or neutrophils of an AML patient in remission, or those of a healthy control; however, high expression of SBDS was observed in the erythroid cells and erythrocytes of these individuals (Figure 4M, N). This latter finding is consistent with previous reports of high levels of SBDS protein expression in early neutrophil precursors, plasma cells, megakaryocytes and osteoblasts 59. Our protein analysis found that patients with moderate to high SBDS expression showed a trend towards worse overall survival, at 17 months, compared to 61 months for patients with no or low-level SBDS expression.
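Overall-survival comparisons of this kind are commonly summarised with a Kaplan-Meier estimate, which accounts for patients who are still alive (censored) at last follow-up. The following is a minimal Python sketch of that estimator; the function names and the follow-up times are hypothetical and chosen purely for illustration. They are not the patient data of this study, and the sketch is not a description of the analysis actually performed.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival probability).

    times  : follow-up in months for each patient
    events : 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    survival, curve = 1.0, []
    for t in sorted(deaths):
        # Patients still under observation just before time t
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time at which survival falls to 0.5 or below (None if never)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up data in months (assumption, NOT study data)
high_sbds = kaplan_meier([5, 9, 14, 17, 30], [1, 1, 1, 1, 1])
low_sbds = kaplan_meier([12, 40, 55, 61, 70], [1, 0, 1, 0, 0])

print(median_survival(high_sbds), median_survival(low_sbds))
```

With small cohorts such as the 14 trephines examined here, a median may not be reached in the better-performing group, which is one reason a difference of this kind is reported as a trend rather than a significant result.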

Our studies provide a novel mechanism of PP2A inhibition in myeloid progenitor cells harbouring mutant c-KIT. SBDS is a novel drug target of FTY720 and AAL(S), with SBDS expression essential for the factor-independent survival of these mutant c-KIT cells. Co-expression modelling of SBDS with other common prognostically relevant genes suggests further improvements to risk stratification for AML patients.

Word count: 4087

Acknowledgments

M.D.D. is supported by a Cancer Institute NSW Fellowship. N.M.V. is supported by an ARC Future Fellowship. A.K.E. is supported by a HNE/NSW Health Pathology/CMN Clinical Translational Research Fellowship. This study was supported by the Hunter Medical Research Institute, Zebra Equities, Hunter District Hunting Club and Ski for Kids, and The Estate of James Scott Lawrie.

REFERENCES

1. Perrotti D, Neviani P. Protein phosphatase 2A: a target for anticancer therapy. Lancet Oncol 2013 May; 14(6): e229-238.

2. Cristobal I, Garcia-Orti L, Cirauqui C, Alonso MM, Calasanz MJ, Odero MD. PP2A impaired activity is a common event in acute myeloid leukemia and its activation by forskolin has a potent anti-leukemic effect. Leukemia 2011 Apr; 25(4): 606-614.

3. Westermarck J, Hahn WC. Multiple pathways regulated by the tumor suppressor PP2A in transformation. Trends Mol Med 2008 Apr; 14(4): 152-160.

4. Kuo YC, Huang KY, Yang CH, Yang YS, Lee WY, Chiang CW. Regulation of phosphorylation of Thr-308 of Akt, cell proliferation, and survival by the B55alpha regulatory subunit targeting of the protein phosphatase 2A holoenzyme to Akt. J Biol Chem 2008 Jan 25; 283(4): 1882-1892.

5. Letourneux C, Rocher G, Porteu F. B56-containing PP2A dephosphorylate ERK and their activity is controlled by the early gene IEX-1 and ERK. EMBO J 2006 Feb 22; 25(4): 727-738.

6. Batut J, Schmierer B, Cao J, Raftery LA, Hill CS, Howell M. Two highly related regulatory subunits of PP2A exert opposite effects on TGF-beta/Activin/Nodal signalling. Development 2008 Sep; 135(17): 2927-2937.

7. Aho TL, Sandholm J, Peltola KJ, Mankonen HP, Lilly M, Koskinen PJ. Pim-1 kinase promotes inactivation of the pro-apoptotic Bad protein by phosphorylating it on the Ser112 gatekeeper site. FEBS Lett 2004 Jul 30; 571(1-3): 43-49.

8. Roberts KG, Smith AM, McDougall F, Carpenter H, Horan M, Neviani P, et al. Essential requirement for PP2A inhibition by the oncogenic receptor c-KIT suggests PP2A reactivation as a strategy to treat c-KIT+ cancers. Cancer Res 2010 Jul 01; 70(13): 5438-5447.

9. Yang Y, Huang Q, Lu Y, Li X, Huang S. Reactivating PP2A by FTY720 as a novel therapy for AML with C-KIT tyrosine kinase domain mutation. J Cell Biochem 2012 Apr; 113(4): 1314-1322.

10. Smith AM, Dun MD, Lee EM, Harrison C, Kahl R, Flanagan H, et al. Activation of protein phosphatase 2A in FLT3+ acute myeloid leukemia cells enhances the cytotoxicity of FLT3 tyrosine kinase inhibitors. Oncotarget 2016 Jul 26; 7(30): 47465-47478.

11. Ruediger R, Pham HT, Walter G. Alterations in protein phosphatase 2A subunit interaction in human carcinomas of the lung and colon with mutations in the A beta subunit gene. Oncogene 2001 Apr 05; 20(15): 1892-1899.

12. Arriazu E, Pippa R, Odero MD. Protein Phosphatase 2A as a Therapeutic Target in Acute Myeloid Leukemia. Front Oncol 2016; 6: 78.

13. Sallman DA, Wei S, List A. PP2A: The Achilles Heal in MDS with 5q Deletion. Front Oncol 2014; 4: 264.

14. Neviani P, Santhanam R, Oaks JJ, Eiring AM, Notari M, Blaser BW, et al. FTY720, a new alternative for treating blast crisis chronic myelogenous leukemia and Philadelphia chromosome-positive acute lymphocytic leukemia. J Clin Invest 2007 Sep; 117(9): 2408-2421.

15. Cristobal I, Blanco FJ, Garcia-Orti L, Marcotegui N, Vicente C, Rifon J, et al. SETBP1 overexpression is a novel leukemogenic mechanism that predicts adverse outcome in elderly patients with acute myeloid leukemia. Blood 2010 Jan 21; 115(3): 615-625.

16. Feschenko MS, Stevenson E, Nairn AC, Sweadner KJ. A novel cAMP-stimulated pathway in protein phosphatase 2A activation. J Pharmacol Exp Ther 2002 Jul; 302(1): 111-118.

17. Matsuoka Y, Nagahara Y, Ikekita M, Shinomiya T. A novel immunosuppressive agent FTY720 induced Akt dephosphorylation in leukemia cells. Br J Pharmacol 2003 Apr; 138(7): 1303-1312.

18. Hatchwell L, Girkin J, Dun MD, Morten M, Verrills N, Toop HD, et al. Salmeterol attenuates chemotactic responses in rhinovirus-induced exacerbation of allergic airways disease by modulating protein phosphatase 2A. J Allergy Clin Immunol 2014 Jun; 133(6): 1720-1727.

19. Toop HD, Dun MD, Ross BK, Flanagan HM, Verrills NM, Morris JC. Development of novel PP2A activators for use in the treatment of acute myeloid leukaemia. Org Biomol Chem 2016 May 18; 14(20): 4605-4616.

20. Brinkmann V, Davis MD, Heise CE, Albert R, Cottens S, Hof R, et al. The immune modulator FTY720 targets sphingosine 1-phosphate receptors. J Biol Chem 2002 Jun 14; 277(24): 21453-21457.

21. Cohen JA, Barkhof F, Comi G, Hartung HP, Khatri BO, Montalban X, et al. Oral fingolimod or intramuscular interferon for relapsing multiple sclerosis. N Engl J Med 2010 Feb 04; 362(5): 402-415.

22. Wakita S, Yamaguchi H, Miyake K, Mitamura Y, Kosaka F, Dan K, et al. Importance of c-kit mutation detection method sensitivity in prognostic analyses of t(8;21)(q22;q22) acute myeloid leukemia. Leukemia 2011 Sep; 25(9): 1423-1432.

23. Chevallier P, Labopin M, Turlure P, Prebet T, Pigneux A, Hunault M, et al. A new Leukemia Prognostic Scoring System for refractory/relapsed adult acute myelogeneous leukaemia patients: a GOELAMS study. Leukemia 2011 Jun; 25(6): 939-944.

24. De Kouchkovsky I, Abdul-Hay M. 'Acute myeloid leukemia: a comprehensive review and 2016 update'. Blood Cancer J 2016 Jul 01; 6(7): e441.

25. Dexter TM, Garland J, Scott D, Scolnick E, Metcalf D. Growth of factor-dependent hemopoietic precursor cell lines. J Exp Med 1980 Oct 01; 152(4): 1036-1047.

26. Frost MJ, Ferrao PT, Hughes TP, Ashman LK. Juxtamembrane mutant V560GKit is more sensitive to Imatinib (STI571) compared with wild-type c-kit whereas the kinase domain mutant D816VKit is resistant. Mol Cancer Ther 2002 Oct; 1(12): 1115-1124.

27. Rawls AS, Gregory AD, Woloszynek JR, Liu F, Link DC. Lentiviral-mediated RNAi inhibition of Sbds in murine hematopoietic progenitors impairs their hematopoietic potential. Blood 2007 Oct 01; 110(7): 2414-2422.

28. Vardiman JW, Thiele J, Arber DA, Brunning RD, Borowitz MJ, Porwit A, et al. The 2008 revision of the World Health Organization (WHO) classification of myeloid neoplasms and acute leukemia: rationale and important changes. Blood 2009 Jul 30; 114(5): 937-951.

29. Aguirre-Gamboa R, Gomez-Rueda H, Martinez-Ledesma E, Martinez-Torteya A, Chacolla-Huaringa R, Rodriguez-Barrientos A, et al. SurvExpress: an online biomarker validation tool and database for cancer gene expression data using survival analysis. PLoS One 2013; 8(9): e74250.

30. Dun MD, Anderson AL, Bromfield EG, Asquith KL, Emmett B, McLaughlin EA, et al. Investigation of the expression and functional significance of the novel mouse sperm protein, a disintegrin and metalloprotease with thrombospondin type 1 motifs number 10 (ADAMTS10). Int J Androl 2012 Aug; 35(4): 572-589.

31. Dun MD, Smith ND, Baker MA, Lin M, Aitken RJ, Nixon B. The chaperonin containing TCP1 complex (CCT/TRiC) is involved in mediating sperm-oocyte interaction. J Biol Chem 2011 Oct 21; 286(42): 36875-36887.

32. Alcaraz C, De Diego M, Pastor MJ, Escribano JM. Comparison of a radioimmunoprecipitation assay to immunoblotting and ELISA for detection of antibody to African swine fever virus. J Vet Diagn Invest 1990 Jul; 2(3): 191-196.

33. Sim AT, Collins E, Mudge LM, Rostas JA. Developmental regulation of protein phosphatase types 1 and 2A in post-hatch chicken brain. Neurochem Res 1998 Apr; 23(4): 487-491.

34. Cho US, Xu W. Crystal structure of a protein phosphatase 2A heterotrimeric holoenzyme. Nature 2007 Jan 04; 445(7123): 53-57.

35. Hilcenko C, Freund SMV, Warren AJ. Structure of the Human Shwachman-Bodian-Diamond Syndrome (SBDS) Protein. RCSB PDB; 2011.

36. Muto S, Senda M, Akai Y, Sato L, Suzuki T, Nagai R, et al. Relationship between the structure of SET/TAF-Ibeta/INHAT and its histone chaperone activity. Proc Natl Acad Sci U S A 2007 Mar 13; 104(11): 4285-4290.

37. Kozakov D, Grove LE, Hall DR, Bohnuud T, Mottarella SE, Luo L, et al. The FTMap family of web servers for determining and characterizing ligand-binding hot spots of proteins. Nat Protocols 2015 May; 10(5): 733-755.

38. Comeau SR, Gatchell DW, Vajda S, Camacho CJ. ClusPro: an automated docking and discrimination method for the prediction of protein complexes. Bioinformatics 2004; 20(1): 45-50.

39. Kozakov D, Beglov D, Bohnuud T, Mottarella SE, Xia B, Hall DR, et al. How good is automated protein docking? Proteins: Structure, Function, and Bioinformatics 2013; 81(12): 2159-2166.

40. Kozakov D, Hall DR, Xia B, Porter KA, Padhorny D, Yueh C, et al. The ClusPro web server for protein-protein docking. Nat Protocols 2017 Feb; 12(2): 255-278.

41. Halgren TA. New Method for Fast and Accurate Binding-site Identification and Analysis. Chem Biol Drug Des 2007; 69: 146-148.

42. Halgren TA. Identifying and Characterizing Binding Sites and Assessing Druggability. J Chem Inf Model 2009; 49: 377-389.

43. Friesner RA, Banks JL, Murphy RB, Halgren TA, Klicic JJ, Mainz DT, et al. Glide: A new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. J Med Chem 2004; 47: 1739-1749.

44. De Palma RM, Parnham SR, Li Y, Oaks JJ, Peterson YK, Szulc ZM, et al. The NMR-based characterization of the FTY720-SET complex reveals an alternative mechanism for the attenuation of the inhibitory SET-PP2A interaction. FASEB J 2019 Jun; 33(6): 7647-7666.

45. Oaks JJ, Santhanam R, Walker CJ, Roof S, Harb JG, Ferenchak G, et al. Antagonistic activities of the immunomodulator and PP2A-activating drug FTY720 (Fingolimod, Gilenya) in Jak2-driven hematologic malignancies. Blood 2013 Sep 12; 122(11): 1923-1934.

46. Pippa R, Dominguez A, Christensen DJ, Moreno-Miralles I, Blanco-Prieto MJ, Vitek MP, et al. Effect of FTY720 on the SET-PP2A complex in acute myeloid leukemia; SET binding drugs have antagonistic activity. Leukemia 2014 Sep; 28(9): 1915-1918.

47. Saddoughi SA, Gencer S, Peterson YK, Ward KE, Mukhopadhyay A, Oaks J, et al. Sphingosine analogue drug FTY720 targets I2PP2A/SET and mediates lung tumour suppression via activation of PP2A-RIPK1-dependent necroptosis. EMBO molecular medicine 2013 Jan; 5(1): 105-121.

48. McCracken AN, McMonigle RJ, Tessier J, Fransson R, Perryman MS, Chen B, et al. Phosphorylation of a constrained azacyclic FTY720 analog enhances anti-leukemic activity without inducing S1P receptor activation. Leukemia 2017 Mar; 31(3): 669-677.

49. Baharians Z, Schonthal AH. Autoregulation of protein phosphatase type 2A expression. J Biol Chem 1998 Jul 24; 273(30): 19019-19024.

50. Seshacharyulu P, Pandey P, Datta K, Batra SK. Phosphatase: PP2A structural importance, regulation and its aberrant expression in cancer. Cancer Lett 2013 Jul 10; 335(1): 9-18.

51. Boocock GR, Morrison JA, Popovic M, Richards N, Ellis L, Durie PR, et al. Mutations in SBDS are associated with Shwachman-Diamond syndrome. Nature genetics 2003 Jan; 33(1): 97-101.

52. Kuijpers TW, Alders M, Tool AT, Mellink C, Roos D, Hennekam RC. Hematologic abnormalities in Shwachman Diamond syndrome: lack of genotype-phenotype relationship. Blood 2005 Jul 1; 106(1): 356-361.

53. Maserati E, Pressato B, Valli R, Minelli A, Sainati L, Patitucci F, et al. The route to development of myelodysplastic syndrome/acute myeloid leukaemia in Shwachman-Diamond syndrome: the role of ageing, karyotype instability, and acquired chromosome anomalies. British journal of haematology 2009 Apr; 145(2): 190-197.

54. Boudreau RT, Conrad DM, Hoskin DW. Apoptosis induced by protein phosphatase 2A (PP2A) inhibition in T leukemia cells is negatively regulated by PP2A-associated p38 mitogen-activated protein kinase. Cell Signal 2007 Jan; 19(1): 139-151.

55. Orelio C, Verkuijlen P, Geissler J, van den Berg TK, Kuijpers TW. SBDS expression and localization at the mitotic spindle in human myeloid progenitors. PLoS One 2009 Sep 17; 4(9): e7084.

56. Wlodarchak N, Xing Y. PP2A as a master regulator of the cell cycle. Crit Rev Biochem Mol Biol 2016 May-Jun; 51(3): 162-184.

57. Dohner H, Estey E, Grimwade D, Amadori S, Appelbaum FR, Buchner T, et al. Diagnosis and management of AML in adults: 2017 ELN recommendations from an international expert panel. Blood 2017 Jan 26; 129(4): 424-447.

58. Straube J, Ling VY, Hill GR, Lane SW. The impact of age, NPM1(mut), and FLT3(ITD) allelic ratio in patients with acute myeloid leukemia. Blood 2018 Mar 8; 131(10): 1148-1153.

59. Wong TE, Calicchio ML, Fleming MD, Shimamura A, Harris MH. SBDS protein expression patterns in the bone marrow. Pediatr Blood Cancer 2010 Sep; 55(3): 546-549.

60. Punna S, Kaltgrad E, Finn MG. "Clickable" agarose for affinity chromatography. Bioconjug Chem 2005 Nov-Dec; 16(6): 1536-1541.

61. Schollkopf U, Groth U. Asymmetric-Synthesis Via Heterocyclic Intermediates .9. Enantioselective Synthesis of (R)-Alpha-Vinylamino Acids. Angew Chem Int Edit 1981; 20(11): 977-978.

References: 61

Figure Captions

Figure 1. SBDS is a target of PP2A activating drugs and a PP2A interacting protein. (A) Structure of immobilized diaminodipropylamine drug beads. Control-beads lack the functional group of the PP2A activating drugs. (B) Silver-stained gel of proteins pulled down by AAL(S)-beads and eluted using excess native drug (250 nM AAL(S)) in factor dependent (GM-CSF) control myeloid progenitor cells (FD-EV) and c-KIT/D816V cells. The red arrowhead indicates the band excised and subjected to LC-MS/MS. Western blot analysis confirmed SBDS as a target of AAL(S) using (C) hydrophilic- and (D) hydrophobic-beads. (E) Western blot confirmation of SBDS as a target of FTY720-beads (hydrophobic-beads) eluted using native drug (250 nM FTY720, 250 nM AAL(S)) and further eluted with denaturing conditions (SDS). PP2Ac and SET also eluted using reducing conditions. (F) Proteins identified by LC-MS/MS enriched following SBDS co-immunoprecipitation (CoIP) from c-KIT/D816V cells. (G) Western blot confirmation of SBDS CoIP LC-MS/MS results. (H) Reciprocal CoIP using PP2Ac as bait confirming SBDS-PP2A interactions. (I) Protein-protein docking of SBDS (red) with PP2A subunits Aα (green), B55α (purple), and Cα (blue) predicts that the SBDS NTD binds directly to PP2Ac and PP2A-B55α. (J) Docking with SBDS indicates that AAL(S) (orange) and FTY720 (green) form hydrogen bonds (E28, S61, Q94), a salt bridge (E28), and hydrophobic interactions (L12, V15) with the NTD of SBDS, in the region surrounded by predicted binding sites with PP2Ac (blue) and B55α (purple).

Figure 2. SBDS inhibits PP2A activity. (A) PP2A activity was assessed following molecular inhibition of SBDS using SBDS-shRNA, or treatment of FD-EV and c-KIT/D816V cells for 12 hours with 2.5 µM AAL(S), and following c-KIT inhibition using 60 nM Dasatinib for 12 hours. (B) Western-blot assessment of SBDS association with PP2Ac immunoprecipitated complexes with and without addition of recombinant SBDS, as for (A). (C) Colocalisation of SBDS and PP2Ac was assessed by immunofluorescence analyses in c-KIT/D816V cells following 90-minute treatment with 2.5 µM AAL(S) or 60 nM Dasatinib (bar = 10 µm). *p<0.05, **p<0.01, ***p<0.001, ****p<0.0001 determined by Student's t-test.

Figure 3. SBDS expression is essential for the survival of c-KIT/D816V mutant cells. FD-EV and c-KIT/D816V cells were transiently transfected with YFP-tagged scrambled DNA shRNA control or two different YFP-tagged SBDS shRNA constructs for 24 hours and then sorted for populations of cells expressing high YFP. (A) Cells were stained 24 hours post sort with Annexin V and 7AAD; dead cells (Annexin V/7AAD positive) are expressed as a percentage of scrambled controls. (B) Western blot for cleaved caspase 3 following transient SBDS knockdown. (C) c-KIT/D816V cells with high-level SBDS knockdown showed significantly reduced proliferation, as determined by trypan blue exclusion assay, and (D) reduced YFP+ cells, as determined via flow cytometry (**p<0.01, two-way ANOVA).

Figure 4. SBDS expression in AML blasts correlated with reduced overall patient survival. (A) TCGA RNAseq data of AML patients revealed a strong correlation between SBDS mRNA expression and cytogenetic risk (p=1.55e-57). (B) High-level SBDS and c-KIT expression profiles were associated with significantly lower survival (red line). (C) Combined high-level SBDS | FLT3 expression profiles were associated with lower survival across all patients. High-level SBDS | FLT3 | NPM1 expression profiles were associated with lower survival for patients classified in the (D) favourable and (E) intermediate cytogenetic risk groups. (F) Trephine biopsies from AML patients were immuno-stained for SBDS. 9/14 patients showed strong cytoplasmic and nuclear SBDS staining (high-expression), while 5/14 showed no or very weak SBDS staining (no-low expression). Relative expression was plotted as a function of overall survival. Mean overall survival for patients expressing high SBDS in AML blast cells was 17 months, whereas that for patients with no-low staining was 61 months. (G, H) c-KIT mutant AML patients show high-level SBDS staining, predominantly in the cytoplasm. Favourable risk patients harbouring (I) wild type c-KIT and NPM mutations also showed high-level staining, whereas (J) patients with favourable risk FLT3/ITD and NPM1 mutations showed no SBDS staining in blast cells. (K, L) Complex karyotype AML patients show strong SBDS staining. SBDS expression was absent in the myeloblasts of (M) an AML patient in complete remission and in (N) normal healthy bone marrow, but was observed in normal erythroid precursors (bar indicates 100 µm).

Supplementary Information

Supplementary Methods

Reagents

Unless specified otherwise, all reagents were obtained from Sigma-Aldrich (St. Louis, MO, USA) or Thermo Fisher Scientific (Waltham, MA, USA) and were of research grade. Modified trypsin was from Promega (Madison, WI, USA). Poros R2 and Poros Oligo R3 reverse-phase material were from Applied Biosystems (Foster City, CA, USA). GELoader tips were from Eppendorf (Hamburg, Germany). The 3M Empore™ C8 disk was from 3M Bioanalytical Technologies (St. Paul, MN, USA). All solutions were made with ultrapure Sigma water (Sigma-Aldrich). Dasatinib was purchased from LC Laboratories (Woburn, MA, USA). Fingolimod (FTY720) was purchased from Cayman Chemicals (Ann Arbor, MI, USA). AAL(S) was synthesized as described 19.

Synthesis of click agarose bead for affinity chromatography

Structure activity relationship data confirmed that the aminoalcohol head group was essential for biological activity 19; therefore, the affinity beads were attached to the hydrophobic tail, as this part of the molecule was shown to be less important for the cytotoxicity and PP2A activating capacity of the analogues. Finn and co-workers developed click reactions to couple a molecule of interest to an agarose bead for affinity chromatography 60. Thus, using our recently reported modular synthesis of AAL(S) analogues, we synthesized an analogue of AAL(S) with a terminal acetylene on the hydrophobic tail from the previously reported bis-lactim ether 61 shown in Figure 1A. Before embarking on the probe synthesis, a test substrate was synthesised to see if functionalization at the end of the hydrophobic tail would affect activation of PP2A. The terminal acetylene AAL(S) analogue was subjected to click reaction conditions (CuI, TBTA, DIPEA, DMF) in the presence of t-butyl (2-(2-(2-azidoethoxy)ethoxy)ethyl)carbamate, which afforded the triazole in 53% yield.

A similar protocol was applied to the synthesis of an FTY720 analogue where we incorporated an oxygen into the hydrophobic tail of FTY720 (Figure 1A). O-FTY720 is only slightly less cytotoxic in c-KIT/D816V cells (IC50 = 5.7 μM) than FTY720 (IC50 = 3.6 μM) and AAL(S) (IC50 = 3.7 μM) 19. The AAL(S) and O-FTY720 affinity chromatography substrates were attached to a terminal azide solid support using the method developed by Finn to afford AAL(S) and O-FTY720 affinity chromatography probes in 16% and 45% yields, respectively. A negative control bead was also synthesised in a similar manner (Figure 1A – Control).

Supplementary Data

Supplementary Table S1: AML patient data. Diagnosis was confirmed by using cytomorphology, cytogenetics, and leukocyte antigen expression and was evaluated according to the WHO 2008 myeloid neoplasms classification.

Supplementary Table S2: PP2A-ABC holoenzyme clustering scores for docked PP2A inhibitory proteins. (A) Cluster score for the SBDS – PP2A-ABC holoenzyme interaction was 145 poses (interfacial RMSD between poses is <9Å).

Supplementary Table S3: AAL(S) affinity bead chromatography LC-MS/MS. Bands eluting from the AAL(S) affinity bead pulldown using native drug were excised, subjected to in-gel digestion and identified by mass spectrometry.

Supplementary Table S4: c-KIT/D816V SBDS interactome identified by SBDS co-immunoprecipitation coupled to LC-MS/MS. Bands eluting from the SBDS pulldown using low pH were excised, subjected to in-gel digestion and identified by mass spectrometry.

Supplementary Figure S1: SBDS is a PP2A interacting protein. (A) Silver-stained SDS-PAGE of c-KIT/D816V SBDS co-immunoprecipitation; proteins were eluted from control beads or SBDS CoIP using 100 mM glycine, pH 2.5. Bands eluted following SBDS pulldown were excised and subjected to LC-MS/MS. (B) Immuno-colocalisation of SBDS (red) bound to PP2Ac (green) in FD-EV and c-KIT/D816V cells (bar = 10 µm). (C) Predicted amino acids facilitating SBDS and PP2AB’C interactions. Binding hot spots were identified using the solvent mapping platform FTMap 37. (D) AAL(S) (Glide XP docking score = -6.7), FTY720 (Glide docking score = -7.1) and AAL(S) analogues 19 plotted as a function of their cytotoxicity in c-KIT/D816V cells. Pearson’s coefficient of 0.74 and p<0.003.

Supplementary Figure S2. SBDS knockdown or treatment of c-KIT/D816V cells with AAL(S) alters PP2A interacting protein expression. (A) Western-blot assessment of PP2A subunit expression in FD-EV and c-KIT/D816V cells expressing SBDS shRNA or following treatment with 2.5 µM AAL(S) and 60 nM Dasatinib for 12 hours. (B) Western-blot assessment of known PP2A inhibiting proteins in FD-EV and c-KIT/D816V cells expressing SBDS shRNA or following treatment with 2.5 µM AAL(S) and 60 nM Dasatinib for 12 hours. (C) Colocalisation of SBDS and PP2Ac was assessed by immunofluorescence analysis in FD-EV cells following 90-minute treatment with 2.5 µM AAL(S) and 60 nM Dasatinib (bar = 10 µm).

Supplementary Figure S3. Annexin V analysis of FD-EV and c-KIT/D816V cells following transient SBDS knockdown. FD-EV and c-KIT/D816V cells were transfected with YFP-tagged scrambled DNA shRNA control or YFP-tagged SBDS shRNA construct for 24 hours, then sorted for populations of cells expressing high YFP, and then stained 24 hours post sort with Annexin V and 7-AAD.


Dun et al Figure 1 (image not reproduced). Panel labels: control-hydrophobic bead, AAL(S)-hydrophobic bead, O-FTY720-hydrophobic bead, control-hydrophilic bead, AAL(S)-hydrophilic bead; FD-EV and c-KIT/D816V input; immunoblots for PP2Ac, SET and SBDS; LC-MS/MS (SBDS); drug elution and SDS elution; CoIP and lysate; PP2A-A binding, PP2A-B55α binding, PP2A-C binding.

Protein   Gene      kDa   Mascot   Coverage
PP2A-A    Pr65      65    989      37%
B55α      Ppp2r2a   52    75       5%
PP2Ac     Ppp2ra    36    613      35%
SET       Set       33    136      8%
SBDS      Sbds      29    78       9%

Dun et al Figure 2 (image not reproduced). (A) PP2A activity (% FD-EV) in FD-EV and c-KIT/D816V cells; conditions: shSCRM, shSBDS, AAL(S), Ab Cont., Das., rProtein Cont., rSBDS. (B) CoIP: PP2Ac; SBDS; IgG + rSBDS, IgG; FD-EV and c-KIT/D816V. (C) % SBDS / PP2A colocalization; untreated, AAL(S), Dasatinib; PP2A, SBDS, SBDS + DAPI; FD-EV and c-KIT/D816V.

Dun et al Figure 3 (image not reproduced). (A) % Annexin V / 7AAD +ve (shSBDS vs. SCRM control) in FD-EV and c-KIT/D816V cells. (B) Immunoblots for SBDS, CASP3 and ACTB; FD-EV SCRM, FD-EV shSBDS-A, FD-EV shSBDS-B, cKIT/D816V SCRM, cKIT/D816V shSBDS-A, cKIT/D816V shSBDS-B. (C) Cell count (+/-SEM, cells/mL) against time (hours post sort). (D) YFP expression (+/-SEM YFP) against time (hours post sort). Series: FD-EV shSCRM, FD-EV shSBDS, c-KIT/D816V shSCRM, c-KIT/D816V shSBDS-A, c-KIT/D816V shSBDS-B.

Unlabelled figure (image not reproduced). Panels A to N: SBDS gene expression in AML patients (low, medium and high risk groups); survival (months) stratified by SBDS, SBDS|KIT, SBDS|FLT3 and SBDS|FLT3|NPM1 (unstratified, favourable risk and intermediate risk), and by SBDS protein expression, each with risk-group hazard ratios; panel labels c-KIT/M541L; c-KIT+, NPM1c; FLT3/ITD, NPM1c; c-KIT+, Del7(q); c-KIT+, CK; remission AML; normal bone marrow; control.

Proteomic Profiling of Mouse Epididymosomes Reveals Their Contributions to Post-testicular Sperm Maturation

Authors
Brett Nixon, Geoffry N. De Iuliis, Hanah M. Hart, Wei Zhou, Andrea Mathe, Ilana R. Bernstein, Amanda L. Anderson, Simone J. Stanger, David A. Skerrett-Byrne, M. Fairuz B. Jamaluddin, Juhura G. Almazi, Elizabeth G. Bromfield, Martin R. Larsen, and Matthew D. Dun

Correspondence
Matt.Dun@newcastle.edu.au

In Brief
The proteomic composition of extracellular vesicles (epididymosomes) secreted by the mouse epididymis has been determined by applying multiplexed tandem mass tag based quantification coupled with high resolution LC-MS/MS. This analysis confirmed that epididymosomes encapsulate an extremely rich and diverse proteomic cargo, which is commensurate with their putative role in coordinating the post-testicular maturation and storage of spermatozoa.

Highlights
• Comparative proteomics of extracellular vesicles isolated from the mouse epididymis.

• Epididymosome proteome displays pronounced segment-to-segment variation.

• Epididymosomes deliver protein cargo to the sperm head.

• Mechanistic insights into role of epididymosomes in sperm maturation and storage.

Nixon et al., 2019, Molecular & Cellular Proteomics 18, S91–S108, March 2019. © 2019 Nixon et al. Published by The American Society for Biochemistry and Molecular Biology, Inc. https://doi.org/10.1074/mcp.RA118.000946

Author’s Choice

Proteomic Profiling of Mouse Epididymosomes Reveals their Contributions to Post-testicular Sperm Maturation

Brett Nixon‡, Geoffry N. De Iuliis‡, Hanah M. Hart‡, Wei Zhou‡, Andrea Mathe‡¶, Ilana R. Bernstein‡, Amanda L. Anderson‡, Simone J. Stanger‡, David A. Skerrett-Byrne¶, M. Fairuz B. Jamaluddin¶, Juhura G. Almazi¶, Elizabeth G. Bromfield‡, Martin R. Larsen§, and Matthew D. Dun¶**

The functional maturation of spermatozoa that is necessary to achieve fertilization occurs as these cells transit through the epididymis, a highly specialized region of the male reproductive tract. A defining feature of this maturation process is that it occurs in the complete absence of nuclear gene transcription or de novo protein translation in the spermatozoa. Rather, it is driven by sequential interactions between spermatozoa and the complex external milieu in which they are bathed within lumen of the epididymal tubule. A feature of this dynamic microenvironment are epididymosomes, small membrane encapsulated vesicles that are secreted from the epididymal soma. Herein, we report comparative proteomic profiling of epididymosomes isolated from different segments of the mouse epididymis using multiplexed tandem mass tag (TMT) based quantification coupled with high resolution LC-MS/MS. A total of 1640 epididymosome proteins were identified and quantified via this proteomic method. Notably, this analysis revealed pronounced segment-to-segment variation in the encapsulated epididymosome proteome. Thus, 146 proteins were identified as being differentially accumulated between caput and corpus epididymosomes, and a further 344 were differentially accumulated between corpus and cauda epididymosomes (i.e. fold change of < −1.5 or > 1.5; p < 0.05). Application of gene ontology annotation revealed a substantial portion of the epididymosome proteins mapped to the cellular component of extracellular exosome and to the biological processes of transport, oxidation-reduction, and metabolism. Additional annotation of the subset of epididymosome proteins that have not previously been identified in exosomes revealed enrichment of categories associated with the acquisition of sperm function (e.g. fertilization and binding to the zona pellucida). In tandem with our demonstration that epididymosomes are able to convey protein cargo to the head of maturing spermatozoa, these data emphasize the fundamental importance of epididymosomes as key elements of the epididymal microenvironment responsible for coordinating post-testicular sperm maturation. Molecular & Cellular Proteomics 18: S91–S108, 2019. DOI: 10.1074/mcp.RA118.000946.

From the ‡Priority Research Centre for Reproductive Science, School of Environmental and Life Sciences, Discipline of Biological Sciences, The University of Newcastle, University Drive, Callaghan, NSW 2308, Australia; §Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campusvej 55, DK-5230 Odense M, Denmark; ¶School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, The University of Newcastle, Callaghan, NSW 2308, Australia; Hunter Medical Research Institute, Cancer Research Program, New Lambton Heights, NSW 2305, Australia

Author’s Choice: Final version open access under the terms of the Creative Commons CC-BY license.

Received July 25, 2018, and in revised form, August 28, 2018

Published, MCP Papers in Press, September 13, 2018, DOI 10.1074/mcp.RA118.000946

Mammalian spermatozoa acquire the potential to fertilize an ovum as they navigate the epididymis, an exceptionally long convoluted tubule that connects the testis to the vas deferens. This maturation process encompasses a suite of cellular modifications that endow spermatozoa with the potential to sustain forward progressive motility, capacitate and subsequently participate in the cellular interactions required to achieve conception (1). Among the singular features that discriminate epididymal maturation from that of the preceding phases of gamete development (2) is that it is driven entirely by extrinsic factors in the absence of nuclear gene transcription and de novo protein translation in the spermatozoa (3, 4). Indeed, it is widely held that the complex intraluminal microenvironment created by the epididymal epithelium serves as the key determinant in the functional transformation of spermatozoa (5, 6). Accordingly, the epididymal soma is characterized by a marked division of labor such that the proximal segments (initial segment, caput and corpus epididymis) promote sperm maturation, whereas the distal caudal segment supports sperm storage (1). Such functions are reflected in distinctive gene expression profiles (7–9) that, in turn, dictate segment-specific secretion of proteins and a range of additional biomolecules into the luminal fluid and thus establish

the unique physiological compartments that affect sperm maturation and prolonged sperm survival (3, 10–12).

In recognition of the importance of epididymal function in governing sperm quality, this tissue has long been of interest as a potential site for contraceptive intervention (13–16). Conversely, the epididymis has also generated interest from the standpoint of therapeutic treatment strategies to combat sperm dysfunction associated with male factor infertility (17–20). The realization of both goals is predicated on resolution of the mechanistic basis by which the sperm proteome is so dramatically altered during the key developmental window of epididymal maturation. Among the potential mechanisms capable of mediating the bulk exchange of proteomic information to maturing spermatozoa, epididymosomes have emerged as attractive candidates (21–26). Epididymosomes represent a heterogeneous population of small membrane bound extracellular vesicles (EVs)¹ (27–29) that are released from the epididymal epithelium via an apocrine secretory mechanism (30–32). This pathway is characterized by the formation of cytoplasmic protrusions along the apical margin of the principal epithelial cells (30). Following detachment, these “apical blebs” break down to release their contents, including epididymosomes, into the luminal environment (30) where they have the potential to interact with spermatozoa and mediate the transfer of a complex proteinaceous cargo to these cells (29).

The participation of epididymosomes in the alteration of the sperm proteome draws on a wealth of evidence that EVs, released from virtually all somatic tissues, can facilitate the delivery of a diverse macromolecular payload (comprising proteins, lipids, and nucleic acids) to recipient cells (33). It is also consistent with pioneering studies of Sullivan and colleagues who have demonstrated that bovine epididymosomes have the capacity to mediate the selective transfer of epididymal secretory proteins to homologous spermatozoa (26). At present however, the conservation of this form of intercellular communication for the en masse delivery of proteins has yet to be substantiated in common laboratory models such as the rodents. To begin to address this challenge, we have surveyed the proteomic composition of epididymosomes isolated from different segments of the mouse epididymis using multiplexed tandem mass tag based relative quantification coupled with offline HPLC and LC-MS/MS. Further, we have exploited a co-culture system to demonstrate the uptake of biotinylated protein cargo from mouse epididymosomes primarily into the sperm head.

¹ The abbreviations used are: EV, extracellular vesicle; ADAM3, a disintegrin and metallopeptidase domain 3 (cyritestin); ADAM7, a disintegrin and metallopeptidase domain 7; B4GALT1, beta-1,4-galactosyltransferase 1; BWW, Biggers, Whitten, and Whittingham medium; CD, complement dependent; CUZD1, CUB and zona pellucida-like domain-containing protein; CLU, clusterin; DAVID, database for annotation, visualization and integrated discovery; DNM2, dynamin 2; DTYMK, deoxythymidylate kinase; FDR, false discovery rate; FLOT1, flotillin 1; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; GLUL, glutamate-ammonia ligase (glutamine synthetase); GO, gene ontology; HILIC, hydrophilic interaction chromatography; HSPA2, heat shock protein 2; HSP90B1, heat shock protein 90, beta (Grp94), member 1; IZUMO1, izumo sperm-egg fusion 1; LC-MS/MS, liquid chromatography-tandem mass spectrometry; MFGE8, milk fat globule-EGF factor 8 protein; MPC2, mitochondrial pyruvate carrier 2; NOLC1, nucleolar and coiled-body phosphoprotein 1; NUCB1, nucleobindin-1; ODF2, outer dense fiber protein 2; PDIA6, protein disulfide isomerase associated 6; PROM2, prominin 2; PSM, peptide spectrum matches; PSMD7, proteasome (prosome, macropain) 26S subunit non-ATPase, 7; SNARE, soluble NSF attachment protein receptor; TEAB, triethylammonium bicarbonate; TMT, tandem mass tag; ZP3R, zona pellucida 3 receptor; ZPBP2, zona pellucida binding protein 2.

EXPERIMENTAL PROCEDURES

Reagents: Unless specified otherwise, all reagents were obtained from Sigma Aldrich (St. Louis, MO) or Thermo Fisher Scientific (Waltham, MA) and were of research or mass spectrometry grade. Antibodies were purchased from the following suppliers: anti-DNM2 (PA5–19800; Thermo Fisher Scientific); anti-ADAM7, anti-BAG6, anti- , anti-HSPA2, anti-IZUMO1 (SC-25137, SC-365928, SC-166907, SC-79543, SC-79543; Santa Cruz Biotechnology, Dallas, TX); anti-ODF2, anti-PSMD7 (ab121023, ab11436; Abcam, Cambridge, UK); anti-GAPDH, anti-PDIA6 (G9545, HPA034653; Sigma-Aldrich); anti-PROM2 (NBP1–47941; Novus Biologicals, Littleton, CO). Anti-B4GALT1 and anti-MFGE8 antibodies were kindly provided by Professor Barry Shur, University of Colorado (Denver, CO).

Ethics Statement: All experimental procedures were conducted with the approval of the University of Newcastle’s Animal Care and Ethics Committee (approval number A-2013–322), in accordance with relevant national and international guidelines. Inbred Swiss mice were housed under a controlled lighting regime (16L:8D) at 21–22 °C and supplied with food and water ad libitum. Prior to dissection, animals were euthanized via CO2 inhalation.

Mouse Epididymosome Isolation and Characterization: Mouse epididymosome isolation and validation of enrichment were conducted as previously described (34). Briefly, Swiss mice (adult males of at least 8 weeks of age) were euthanized and their vasculature immediately perfused with pre-warmed PBS to minimize the possibility of blood contamination. Epididymides were then removed, separated from fat and connective tissue and dissected into three anatomical regions corresponding to the caput, corpus and cauda. Luminal fluid was aspirated from each region by placing the tissue in a 500 µl droplet of modified Biggers, Whitten, and Whittingham media [BWW; pH 7.4, osmolarity 300 mOsm/kg (35, 36)] and making multiple incisions with a razor blade. The tissue was then subjected to mild agitation and the medium subsequently filtered through 70 µm membranes. This suspension was then sequentially centrifuged at increasing velocity (500 × g, 5 min; 2000 × g, 5 min; 4000 × g, 5 min; 8000 × g, 5 min; 17,000 × g, 20 min; and finally 17,000 × g for an additional 10 min) to eliminate all cellular debris prior to the supernatant being layered onto a discontinuous iodixanol gradient (40%, 20%, 10%, and 5%; created by diluting 60% OptiPrep medium with a solution of 0.25 M sucrose, 10 mM Tris). The gradients were ultracentrifuged (100,000 × g, 18 h, 4 °C), after which twelve equivalent fractions were collected, diluted in PBS and subjected to a final ultracentrifugation step (100,000 × g, 3 h, 4 °C).

All isolated epididymosome fractions were characterized in accordance with the minimal experimental requirements for definition of extracellular vesicles (37), featuring analysis of their purity, particle size and overall homogeneity as previously described (34) (please see supplemental Fig. S1). Briefly, this included quantitative assessment

of protein content and particle size heterogeneity of each of the twelve fractions; with the latter being achieved via measurement of mean particle size using dynamic light scattering (34). Additional immunoblot analyses were performed to determine the distribution of the exosome/epididymosome marker flotillin 1 (FLOT1) within each fraction, and a combination of FLOT1 and CD9 markers were also used to dual-label epididymosomes bound to aldehyde/sulfate latex beads (34). Finally, epididymosome preparations were also assessed via transmission electron microscopy to confirm the size and heterogeneity of the isolated populations. Notably, this experimental workflow was performed on all preparations of epididymosomes, irrespective of the downstream application. Moreover, as described below our proteomic analyses confirmed the presence of the top 50 proteins that are most commonly identified in exosomes, and whose identification is recommended as part of the minimal experimental requirements for definition of extracellular vesicles exosome protein markers (38).

To visualize changes in the epididymosome proteome, populations of epididymosomes from each epididymal segment (caput, corpus, cauda) were pooled from three animals to generate a single biological sample prior to labeling with cyanine dyes (with three such samples being analyzed in this study). Briefly, epididymosomes were lysed in rehydration buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS for 1 h on ice with regular vortexing. Extracted protein was quantified using a 2-D Quant kit in accordance with the manufacturer’s instructions (GE Healthcare, Pittsburgh, PA) and a total of 75 µg of protein from each epididymosome sample was labeled with 600 pmol of appropriate cyanine-dye reagents (i.e. either Cyanine3 or Cyanine5 NHS esters; Lumiprobe, Hunt Valley, MD) for 1 h on ice. Labeling reactions were quenched by addition of excess L-lysine (10 mM, 10 min on ice) after which differentially labeled epididymosome samples were combined (i.e. either caput and corpus or corpus and cauda epididymosomes), prepared for resolution by 2D SDS-PAGE (39), and imaged using a Typhoon FLA 9500 laser scanner (GE Healthcare).

Epididymosome Protein Digestion and Labeling for Comparative and Quantitative Proteomic Analysis: Epididymosome preparations from each epididymal segment surveyed (caput, corpus, cauda) were pooled from twelve animals to generate a single biological replicate; with three such replicates being generated for analysis in this study. Epididymosome suspensions were then subjected to fractionation by dissolving in 200 µl of ice-cold 0.1 M Na2CO3 (pH 11.3) supplemented with protease and phosphatase inhibitors (Complete EDTA free; Roche, Basel, Switzerland) and probe tip sonicated at 4 °C for 2 × 20 s intervals prior to incubation for 1 h at 4 °C. Na2CO3 soluble proteins were isolated from insoluble proteins by ultracentrifugation (100,000 × g for 90 min at 4 °C) and dried (40). Both fractions were then dissolved in urea (6 M urea, 2 M thiourea) separately, reduced using 10 mM DTT (1 h, 56 °C, in the dark), and alkylated using 20 mM iodoacetamide (45 min, room temperature, in the dark). Proteins were digested using 1:50 ratio Lys-C/Trypsin to protein concentration, for 3 h at room temperature. The concentration of urea was then reduced below 0.75 M by adding 50 mM TEAB, pH 7.8 and incubated overnight at 37 °C. Peptides were desalted and cleaned up using a modified StageTip microcolumn and solid phase extraction (SPE) columns (Oasis PRiME HLB; Waters, Rydalmere, NSW, Australia), respectively (41). Quantitative fluorescent peptide quantification (Qubit protein assay kit; Thermo Fisher Scientific) was employed and 100 µg of each sample was labeled using tandem mass tags, and comparative and quantitative analyses were performed in biological triplicate (42, 43) (TMT 10 plex labels; caput 1 = 127N, caput 2 = 127C, caput 3 = 128N, corpus 1 = 128C, corpus 2 = 129N, corpus 3 = 129C, cauda 1 = 130N, cauda 2 = 130C, cauda 3 = 131) (TMT-10plex 2 kits; Thermo Fisher Scientific) (44). Digestion and tandem mass tag labeling efficiency was determined by LC-MS/MS (42). Samples were then mixed in 1:1 ratio and fractionated by hydrophilic interaction chromatography (HILIC; (45)) using a Dionex UltiMate 3000 capLC system (Dionex, Sunnyvale, CA) prior to nanoLC-MS/MS.

Tandem Mass Spectrometry (nanoLC-MS/MS) Comparative and Quantitative Analyses: NanoLC-MS/MS was performed using a Dionex UltiMate 3000 nanoLC system (Dionex). HILIC fractionated peptides were suspended in buffer A (2% ACN/0.1% TFA) and directly loaded onto a 50 cm analytical column packed with Acclaim PepMap C18 2 µm sorbent. Peptides were eluted using a 110 min gradient from 7 to 40% buffer B (95% ACN, 0.1% TFA) at 250 nl min⁻¹ and nanoelectrosprayed into a Q-Exactive Plus (Thermo Fisher Scientific). Precursor scan of intact peptides was measured in the Orbitrap by scanning from m/z 350–1500 (with a resolution of 70,000); the fifteen most intense multiply charged precursors were selected for HCD fragmentation with a normalized collision energy of 32.0, then measured in the Orbitrap at a resolution of 35,000. Automatic gain control targets were 3E6 ions for Orbitrap scans and 5E5 for MS/MS scans. Dynamic exclusion was employed for 15 s. Fragmentation data were converted to peak lists using Xcalibur version 4.027.19 (Thermo Fisher Scientific) and the HCD data were processed using Proteome Discoverer 2.1 (Thermo Fisher Scientific). MS spectra were then searched with Mascot V2.6 (accessed 05/06/2018) against all mouse entries in SwissProt (Release 2018_02; 16,976 entries). Mass tolerances in MS and MS/MS modes were 10 ppm and 0.02 Da, respectively; trypsin was designated as the digestion enzyme, and up to two missed cleavages were allowed. S-carbamidomethylation of cysteine residues was designated as a fixed modification. Variable modifications included oxidation of methionine, acetylation of lysine, deamidation of asparagine or glutamine and TMT labeling of amines and lysine. Interrogation of the corresponding reversed database was also performed to evaluate the false discovery rate (FDR) of peptide identification using Percolator based on q-values, which were estimated from the target-decoy search approach. To filter out target peptide spectrum matches (target-PSMs) over the decoy-PSMs, a fixed false discovery rate (FDR) of 1% was set at the peptide level (46). Additional identification criteria consisted of a minimum of two uniquely matched peptides per protein and a Mascot score of 67 (47).

In Silico Analysis of Epididymosome Protein Cargo: In silico analysis of epididymosome protein profiles was undertaken using a suite of techniques. Briefly, protein abundance data were assessed via volcano plots to visualize trends associated with differentially accumulating proteins in the epididymosomes sampled from opposing ends of the tract (i.e. caput versus cauda epididymal segments). Epididymosome protein datasets were also interrogated for enrichment of functional pathways using bioinformatic enrichment tools available via the Database for Annotation, Visualization and Integrated Discovery (DAVID; v6.8) (48, 49). Data sets were further curated based on sub-fertility phenotype terms using the MGI, Jackson Laboratory US, Genes and Genome Features database.

Validation of Quantitative Protein Accumulation in Epididymosomes: Orthogonal validation of the quantitative protein profiles generated by nanoLC-MS/MS was conducted using standard immunoblotting techniques. Representative proteins selected for analysis included those that exhibited highest expression in the caput or remained at relatively constant levels in epididymosomes sampled throughout the epididymis. All immunoblotting analyses were performed in biological triplicate, with each biological sample comprising epididymosome proteins pooled from a total of twelve mice. However, because of limitations in generating the volume of epididymosome material required for nano-LC-MS/MS, the protein used for immunoblot analyses was generated from different animals to those used for MS sequencing. Prior to analysis, pooled epididymosomes were solubilized by boiling in SDS extraction buffer (0.375 M Tris pH 6.8, 2% w/v SDS, 10% w/v sucrose, protease inhibitor mixture) at

100 °C for 5 min. Insoluble material was removed by centrifugation (17,000 × g, 10 min, 4 °C) and soluble protein remaining in the supernatant was quantified using a BCA protein assay kit (Thermo Fisher Scientific). Equivalent amounts of protein (5 µg) were boiled in SDS-PAGE sample buffer (2% v/v mercaptoethanol, 2% w/v SDS, and 10% w/v sucrose in 0.375 M Tris, pH 6.8, with bromphenol blue) at 100 °C for 5 min, prior to being resolved by SDS-PAGE (150 V, 1 h) and transferred to nitrocellulose membranes (350 mA, 1 h). Membranes were then blocked and incubated with appropriate antibodies raised against target proteins. Briefly, blots were washed 3 × 10 min with Tris-buffered saline with 0.1% (v/v) Tween-20 (TBST), before being probed with appropriate HRP-conjugated secondary antibodies. After three further washes, labeled proteins were detected using an enhanced chemiluminescence kit (GE Healthcare).

Transfer of Epididymosome Protein Cargo to Spermatozoa: Caput epididymal spermatozoa were isolated as previously described (50) in preparation for co-incubation with purified epididymosomes using methodology optimized for the in vitro transfer of proteins between bovine epididymosomes and spermatozoa (51). Prior to co-culture, freshly isolated epididymosomes obtained from either the caput or cauda epididymal segments of 3 mice were pooled and resuspended in PBS to generate a single biological replicate. Epididymosome proteins were then labeled with a membrane impermeant, long-chain NHS-ester activated biotinylation reagent (EZ-Link sulfo-NHS-LC-Biotin, Thermo Fisher Scientific) (26.9 µM) for 30 min at room temperature followed immediately by overnight incubation at 4 °C. Following biotinylation, epididymosome suspensions were diluted into PBS supplemented with 50 mM glycine to arrest the biotinylation reaction (52) and subjected to ultracentrifugation (100,000 × g, 3 h, 4 °C). The resultant epididymosome pellets were resuspended in modified BWW (pH 6.5) (34) in preparation for co-incubation with spermatozoa for 1 h at 37 °C in 5% CO2 with gentle agitation. These experiments were titrated such that the caput spermatozoa (10 × 10⁶ cells/ml) from one mouse were incubated with the equivalent of a single animal’s epididymosomes (i.e. the pooled epididymosome preparation described above was split into 3 equal portions prior to incubation with sperm). Following incubation, spermatozoa were washed three times by centrifugation (400 × g, 3 min) in BWW to remove any unbound or loosely adherent epididymosomes, before a subset were set aside for affinity labeling with FITC-conjugated streptavidin to determine the localization of transferred proteins. The remaining cells were processed for total protein extraction to confirm the uptake of biotinylated proteins. Additional controls for this experiment included incubation of spermatozoa with FITC-conjugated streptavidin in the absence of prior exposure to epididymosomes, and direct biotinylation/FITC-streptavidin labeling of populations of caput spermatozoa (1 × 10⁶) for 1 h at 37 °C in 5% CO2. These experiments were performed in triplicate, with independent biological samples (i.e. spermatozoa and epididymosomes) having each been isolated from different animals and a minimum of 100 spermatozoa were assessed/replicate within each treatment group.

To assess the putative involvement of candidate epididymosome ligands (i.e. milk fat globule-EGF factor 8; MFGE8) in epididymosome-sperm interaction, the experimental procedure described above was replicated using biotinylated caput epididymosomes pre-incubated with anti-MFGE8 antibodies (10 µg/ml) for 1 h at room temperature with constant slow rotation. After incubation, the epididymosomes were washed free of unbound antibody with PBS, collected by ultracentrifugation (100,000 × g, 1.5 h, 4 °C) and subsequently assessed for the efficiency with which they were able to transfer biotinylated proteins to the postacrosomal sheath of caput spermatozoa, using equivalent techniques to those described above. Controls for this experiment included the omission of the anti-MFGE8 antibody as well as its substitution with an irrelevant antibody control (i.e. anti-GAPDH antibodies).

Immunofluorescence and Electron Microscopy: Immunolocalization of MFGE8 was performed on isolated spermatozoa and epididymal tissue sections in accordance with previously described protocols (53, 54). Briefly, spermatozoa were settled onto poly-L-lysine-coated coverslips overnight at 4 °C. All subsequent incubations were performed at room temperature in a humidified chamber, and all antibody dilutions and washes were conducted in PBS containing 0.1% Tween-20 (PBST). Fixed cells were permeabilized in 0.2% Triton X-100/PBS for 10 min and blocked in 3% (w/v) BSA in PBST for 1 h. Coverslips were then sequentially labeled with anti-MFGE8 antibodies (diluted 1:100) for 1 h at room temperature. After incubation, coverslips were washed three times, and then incubated in goat anti-rabbit Alexa Fluor 488 (diluted 1:200) secondary antibody for 1 h at room temperature. Cells were then washed and counterstained in 4′,6-diamidino-2-phenylindole (DAPI) before mounting in antifade reagent (Mowiol 4–88). Labeled cells were viewed on an Axio Imager A1 microscope (Carl Zeiss MicroImaging, Oberkochen, Germany) equipped with epifluorescent optics and images captured with an Olympus DP70 microscope camera (Olympus, Shinjuku, Tokyo, Japan). Alternatively, mouse epididymal tissue was fixed in Bouin’s fixative, embedded in paraffin wax and sectioned at 5 µm thickness. Sections were de-paraffinized, rehydrated, and antigen retrieval was performed by boiling the slides for 10 min in 10 mM Tris (pH 10). Sections were blocked in 3% w/v BSA/PBST for 1 h at room temperature, after which they were incubated with anti-MFGE8 (diluted 1:200 with 1% w/v BSA/PBST) overnight at 4 °C. Sections were then washed and incubated with fluorescent-conjugated secondary antibody, Alexa Fluor 594 goat anti-rabbit IgG (1:200 with 1% w/v BSA/PBST), for 1 h at room temperature. DAPI counterstaining was conducted for 3 min. Sections were then mounted in Mowiol and viewed on an Axio Imager A1 microscope (Carl Zeiss) as described above.

To visualize in situ epididymosome-sperm interactions, mouse caput epididymal tissue was fixed in 4% (w/v) paraformaldehyde containing 0.5% (v/v) glutaraldehyde as previously described (54). The tissue was then processed via dehydration, infiltration and embedding in LR White resin. Sections (100 nm) were cut with a diamond knife (Diatome Ltd., Bienne, Switzerland) on an EM UC6 ultramicrotome (Leica Microsystems, Vienna, Austria) and placed on 150-mesh nickel grids. For MFGE8 detection, epididymal sections were blocked in 3% (w/v) BSA in PBS at 37 °C for 30 min before being subjected to overnight incubation with anti-MFGE8 antibody (MBS2004903, diluted 1:20 in 1% BSA in PBS) at 4 °C. After incubation, sections were washed free of primary antibody via immersion in five sequential changes of 1% BSA in PBS (5 min each), and then incubated with appropriate secondary antibody conjugated to 10 nm gold particles (G7402, diluted 1:10 in 1% BSA in PBS) for 2 h at 37 °C. After washing, labeled sections were counterstained in 2% (w/v) uranyl acetate in 40% (v/v) methanol for 20 min. Micrographs were captured on a JEOL-100CX transmission electron microscope (JEOL, Tokyo, Japan) at 80 kV. These experiments were performed in triplicate, with independent biological samples having each been isolated from different animals.

Experimental Design and Statistical Rationale: For all experiments, individual biological replicates comprised pooled preparations of epididymosomes isolated from the appropriate epididymal segment (caput, corpus, cauda) of between three and twelve animals. Pooling of material from this number of animals was necessitated based on recovering enough protein for each downstream application. Three such biological replicates were used for tandem mass tag labeling to generate our primary epididymosome proteomic inventory and to facilitate comparative and quantitative proteomic analyses. Epididymosome proteins were identified as being differentially accumulated

S94 Molecular & Cellular Proteomics 18.13 Proteomic Profiling of Mouse Epididymosomes

between epididymal segments if they experienced a fold change of ≥ 1.5 or ≤ −1.5; p < 0.05. Immunoblotting of candidate epididymosome proteins was performed (n = 3) in order to validate our quantitative epididymosome protein abundance data. Similarly, additional replicates were also employed for confirmation of epididymosome-mediated transfer of biotinylated proteins to mouse spermatozoa (n = 3). Please see supplemental Fig. S2 for full details of experimental design including the number of animals/replicates used per experiment.

RESULTS

Global Proteomic Analysis of Mouse Epididymosomes — Mouse epididymosomes were recovered separately from the caput, corpus and cauda epididymides prior to being subjected to Lys-C/Trypsin digestion, TMT labeling and MS analysis. Using stringent identification criteria (described above), this experimental strategy identified a complex proteomic cargo comprising a total of 1640 unique proteins in epididymosomes sampled from across all three epididymal segments. Among these proteins, an average of 11.8 unique peptide matches were generated per protein, representing an average peptide coverage of 30% per protein (Table I, supplemental Table S1).

TABLE I
Summary of mouse epididymosome proteome data set (mouse epididymosomes)

Total proteins identified: 1640
Av. peptide hits/protein: 13.1
Av. unique peptide hits/protein: 11.8
Av. protein coverage (%): 29.9
Number of differentially accumulated proteins (fold change ≥ 1.5):
    Corpus vs Caput: 146
    Cauda vs Corpus: 344
    Cauda vs Caput: 474

Provisional interrogation of this global epididymosome proteome on the basis of shared functional classification using DAVID Gene Ontology (GO) annotation tools (version 6.8) returned dominant terms of “protein binding” (GO identifiers: 5515, 32403, 42802, 42803, 98641), “nucleotide binding” (GO identifiers: 166, 3723, 5524, 5525, 44822), “oxidoreductase activity” (GO identifier: 16491) and “catalytic activity” (GO identifiers: 3824, 16491, 16787) among the top 15 GO molecular function categories when ranked on the basis of number of annotated proteins (Fig. 1A, supplemental Table S2). Similarly, in terms of GO biological process categories, notable enrichment was identified in the broad term of “transport” (GO identifier: 6810), as well as the more specific terms of “protein transport” (GO identifier: 15031), “vesicle-mediated transport” (GO identifier: 16192) and “intracellular protein transport” (GO identifier: 6886) (Fig. 1B, supplemental Table S2). Other notable GO biological process categories of direct relevance to epididymal physiology/function included: “oxidation-reduction process” (GO identifiers: 6979, 55114), “proteolysis” (GO identifier: 6508), “metabolic process” (GO identifiers: 5975, 6629, 6631, 8152), “binding of sperm to zona pellucida” (GO identifier: 7339), and “cell adhesion” (GO identifiers: 7155, 98609) (supplemental Table S2). As might be expected of epididymosome-encapsulated cargo, the dominant GO cellular component categories were identified as “membrane,” “extracellular exosome,” and “cytoplasm” with 983, 815, and 799 proteins mapping to these respective categories (Fig. 1C, supplemental Table S2).

Additional curation of the epididymosome proteome on the basis of male sub-fertility phenotypes identified at least 98 proteins whose dysregulated expression has previously been linked with male fertility phenotypes and/or defective sperm production/function (supplemental Table S3). Aside from the broad categories of “male infertility” and “reduced male fertility,” common pathologies attributed to the loss/dysregulation of male fertility in this sub-class of proteins included: “reproductive system phenotype,” “abnormal epididymis morphology,” “abnormal male reproductive system physiology,” and “abnormal sperm morphology/physiology.” Presumably reflecting conserved protein expression in both the testes and epididymis, “abnormal testicular morphology,” “decreased testis weight,” and associated defects in “spermatogenesis” and “germ cell number” also featured prominently among the defined pathological lesions associated with these proteins (supplemental Table S3).

Conservation of Epididymosome Cargo — In view of the potential for overlapping distribution of proteins between the testicular and extra-testicular regions of the male reproductive tract, we sought to confirm that the epididymosome proteins identified herein do indeed constitute those expected of an enriched exosome population. For this purpose, our complete inventory of identified proteins was used to interrogate ExoCarta, a web-based database featuring a comprehensive assemblage of exosomal cargo identified across multiple tissues and organisms (38). In contrast to our GO analysis, in which a conservative 50% (815/1640) of the identified proteins mapped to the cellular component category of “extracellular exosome” (Fig. 1C, supplemental Table S2), our survey of the ExoCarta database identified as many as 1352/1640 proteins (representing 82%) that have previously been identified among exosome-borne cargo (supplemental Table S4). Notably, this conserved list featured many proteins, such as those implicated in transcription and translation, which one may not normally consider to be secreted to the extracellular environment (supplemental Table S4). It also comprised all of the top 50 exosome protein markers (38), as well as 92 out of the top 100 proteins that are most commonly identified in exosomes, and whose identification is recommended as part of the minimal experimental requirements for definition of extracellular vesicles (37). Examples included: tetraspanins


FIG. 1. GO annotation of mouse epididymosome protein cargo. Among the core inventory of 1640 epididymosome-associated proteins identified in this study, a total of 1564 (95%) were able to be annotated according to GO information based on (A) molecular function, (B) biological process, and (C) cellular component. The percentage of epididymosome proteins mapping to the 15 highest ranked (based on number of assigned proteins) (A) molecular function and (B) biological process categories are depicted. (C) Similarly, the percentage of epididymosome proteins mapping to the 3 highest ranked cellular component categories are also shown.
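The ranking described in this legend (categories ordered by number of assigned proteins, then expressed as a percentage of the proteome) can be sketched in a few lines of Python. This is only an illustration of the counting step, not the DAVID workflow used in the study; the function name `top_go_categories` and the toy protein-to-term mapping are hypothetical.

```python
from collections import Counter

def top_go_categories(annotations, total_proteins, n=15):
    """Rank GO categories by the number of proteins assigned to them.

    annotations: dict mapping a protein ID to an iterable of GO term names.
    Returns a list of (term, protein_count, percent_of_total) tuples
    for the n highest ranked terms.
    """
    counts = Counter()
    for terms in annotations.values():
        counts.update(set(terms))  # count each protein once per term
    # Sort by descending count; ties are broken alphabetically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(term, c, 100.0 * c / total_proteins) for term, c in ranked[:n]]

# Hypothetical toy data, not taken from the study.
toy = {
    "P1": ["transport", "proteolysis"],
    "P2": ["transport"],
    "P3": ["transport", "cell adhesion"],
    "P4": ["proteolysis"],
}
for term, count, pct in top_go_categories(toy, total_proteins=len(toy), n=2):
    print(f"{term}: {count} proteins ({pct:.0f}%)")
```

Because a protein can map to several terms, the percentages across categories need not sum to 100.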

(CD9, CD63, CD81, CD151), integrins (ITGA1, ITGA2, ITGA3, ITGA5, ITGAM, ITGAV, ITGB1, ITGB2), endosome/membrane binding proteins (TSG101, ANXA2, ANXA5, 22 RAB family members), signal transduction/scaffolding proteins (syntenin-1, syntenin-2), and molecular chaperones (HSPA8, HSP90AA1) (supplemental Table S4). In contrast, functional annotation of those epididymosome proteins that did not correspond to entries in the ExoCarta database revealed enrichment in the GO biological process categories that one might expect to be restricted to the male reproductive tract, including “spermatogenesis,” “binding of sperm to zona pellucida,” and “fertilization” (GO identifiers: 7283, 7339, 7338) (Fig. 2, green columns). In addition to an abundance of proteins linked to transport, this latter subset of proteins also featured substantial enrichment of GO categories known to be associated with modification of the maturing sperm proteome, such as “protein glycosylation” (GO identifier: 6486), “proteolysis” (GO identifier: 6508), “peptidase activity” (GO identifier: 10466), and “GPI anchor” (GO identifier: 6506) (Fig. 2, yellow columns).

FIG. 2. GO biological processes associated with mouse epididymosome proteins not represented in the ExoCarta database. A total of 288 proteins were identified in mouse epididymosomes that did not correspond with entries curated in the ExoCarta database. These proteins were annotated according to GO information based on biological process and the 20 highest ranked (based on number of assigned proteins) categories are depicted. Colored columns represent those proteins clustered to the GO terms related to transport (pink), oxidation-reduction (blue), metabolism (gray), protein synthesis/degradation (yellow), and sperm function (green).

In seeking to explore the conservation of epididymosome proteomic cargo across species, we also surveyed published proteomic lists generated for epididymosomes isolated from the bovine (24), human (55), and ovine (56) epididymis. Although such information has yet to be curated in ExoCarta, the most comprehensive of these datasets (i.e. that generated from the bovine model) comprises an impressive 762 proteins that map to epididymosomes isolated from the caput and/or caudal epididymal segments (24). Among these proteins, we were able to confirm the conservation of at least 367/762 (48%) in both mouse and bovine epididymosomes (supplemental Table S4). Although considerable overlap was evident across most functional protein categories, we recorded particularly high conservation among the subset of ribosomal and proteasomal protein cargo, as well as those proteins mapping to the broad functional categories of “transporters and protein trafficking,” “chaperone molecules,” and “enzymes.” We also noted enrichment of proteins associated with “defense” such as the beta-defensin family and many of the complement dependent (CD) proteins that have previously been characterized in bovine epididymosomes. In assessing the more modest proteomic profiles generated for human and ram epididymosomes we were able to identify relatively high levels of conservation. Indeed, our data comprised at least 101/146 (69%) and 25/28 (89%) of those proteins previously identified in human (55) and ram (56) epididymosomes, respectively (supplemental Table S4).

As an extension of this analysis, we also explored the similarity of the epididymosome proteome with that of an independently published mouse sperm proteome (57) (supplemental Table S4). Notwithstanding technical limitations imposed by incomplete sequence coverage of the sperm proteome and conversion of UniProt accession numbers to corresponding Gene Name IDs (limiting our comparison to 1560/1640 epididymosome proteins), this approach revealed the conservation of 589, 624, and 407 epididymosome proteins within the caput, corpus, and cauda sperm proteomes, respectively.

Relative Quantification of Differential Protein Accumulation into Epididymosomes — Having established the overall proteomic composition of mouse epididymosomes, we next investigated changes encountered in different epididymal segments. These studies were initiated via labeling of epididymosome proteins with Cy-dyes prior to their resolution by 2D SDS-PAGE to visually compare their proteomic profiles. Using this approach, we documented a relatively high degree of commonality in the proteomic cargo of epididymosomes from the proximal epididymal segments (i.e. caput versus corpus epididymis) (Fig. 3A). By contrast, more overt qualitative and quantitative differences were noted in the proteomic cargo of corpus versus cauda epididymosomes (Fig. 3B). Accordingly, this analysis was expanded to include an interrogation of the intensity of reporter ions tagged to each peptide to determine the differential accumulation of proteins into epididymosomes recovered from the caput, corpus and cauda segments of the mouse epididymis. In this analysis, an arbitrary threshold of ≥ 1.5-fold change (p < 0.05) was selected as the basis for assignment of differentially accumulated proteins. Using this criterion, we identified a substantial number of proteins whose relative abundance remained consistent in all epididymosome sub-populations surveyed (Fig. 3C). Indeed, in considering the cumulative changes in protein abundance between the proximal (caput) and distal (cauda) epididymosome sub-populations, 71% of the total proteome was detected at equivalent levels (Fig. 3C, 3D).

Notwithstanding the conservation of this subset of core epididymosome proteins, we did document apparent gradients of accumulation/reduction associated with the abundance of many of the epididymosome proteins. As anticipated, these changes were more pronounced when considered across the entire epididymal tubule (i.e. caput versus cauda). Indeed, we identified 474 (29%) proteins that were differentially accumulated in cauda epididymosomes versus those sampled from the caput, including 296 proteins that were under-represented and a further 178 proteins that were over-represented in the cauda samples (Fig. 3E). In evaluating the temporal appearance of these changes in the epididymosome proteome, the majority coincided with the transition from the corpus to the cauda epididymal segments as opposed to between the more proximal caput to corpus segments. Such findings accord with our Cy-dye labeling data (Fig. 3A, 3B) as well as the physiological roles of the different epididymal segments; with the caput and corpus participating in sperm maturation and the cauda fulfilling a key role in

sperm storage/maintenance (58). In extending support for these divergent roles, abundance cluster analyses confirmed that epididymosomes from the caput and corpus possess a strikingly similar protein abundance profile (Fig. 4A). Conversely, the population of cauda epididymosomes were characterized by an almost reciprocal protein abundance profile (Fig. 4A).

FIG. 3. Comparative analysis of mouse epididymosome protein abundance. The differential accumulation of epididymosome protein cargo was assessed via cyanine-dye labeling of epididymosome protein extracts, recovered from (A) the proximal (caput versus corpus) and (B) distal (corpus versus cauda) epididymal segments. Labeled proteins extracted from the two different epididymosome populations were mixed, prepared for resolution by 2D SDS-PAGE and visualized via the use of a Typhoon FLA 9500 laser scanner. This experiment was replicated three times using paired epididymosome samples (i.e. either caput and corpus or corpus and cauda epididymosomes) and depicted are representative merged gel images for (A) caput (cyanine3-labeled, pseudo-colored in green) and corpus (cyanine5-labeled, red) epididymosome proteins or alternatively, (B) corpus (cyanine3-labeled, pseudo-colored in green) and cauda (cyanine5-labeled, red) epididymosome proteins. C–E, The relative abundance of epididymosome proteins was also determined via assessment of TMT reporter ion intensity in samples isolated from the caput, corpus and cauda segments of the mouse epididymis. For this analysis a threshold of ≥ 1.5 fold change (p < 0.05) was set as the basis for assignment of differentially accumulated proteins. C, The overall number of proteins that experienced no-change (black columns), increased (red columns), or decreased (green columns) accumulation in each epididymosomes population are shown. Similarly, the conservation of proteins that were either (D) unchanged or (E) experienced ≥ 1.5-fold increase (red font, ↑) or decrease (green font, ↓) change among different epididymosome sub-populations are also shown.

These quantitative changes included several proteins that experienced 5-fold changes between the sub-populations of caput and cauda epididymosomes (Fig. 4B–4D and supplemental Table S1). One of the most dominant among these was CUB and zona pellucida-like domain-containing protein (CUZD1), a protein that was detected at levels of 9-fold higher in the cauda segment versus that of the caput, respectively (Fig. 4D and supplemental Table S1). Conversely, proteins such as PRRG3, SCL38A5, B4GALT4, ADAM28, ADAM7, RNASE10 were characterized by an apparent reduction of between 5- and 9-fold in epididymosome fractions over the same epididymal segments (Fig. 4D and supplemental Table S1).

Gene Ontology Analysis of Differentially Accumulated Epididymosome Proteins — To investigate the functional characteristics of conserved proteins as opposed to those that were differentially accumulated into populations of caput and cauda epididymosomes, each subset was classified according to their known, or predicted, biological processes using Gene Ontology categories (48, 49) (Fig. 5). As previously documented following curation of the entire epididymosome proteome (Fig. 1), many of the proteins that were detected at equivalent levels in caput and cauda epididymosomes mapped to the dominant GO biological process categories of “transport” (GO identifier: 6810), “protein transport” (GO identifier: 15031), “vesicle-mediated transport” (GO identifier: 16192), “intracellular protein transport” (GO identifier:


FIG. 4. Plots depicting fold changes associated with differentially accumulated epididymosome proteins. A, Protein abundance cluster analysis depicting patterns of relative abundance for each of the 1640 epididymosome proteins quantitated in this study. The y axis represents the average reporter ion abundance (log10 scale) determined for each of the 1640 proteins identified in this study (x axis). B–D, Volcano plots were constructed to demonstrate the log2 fold change (x axis) and probability score (y axis) of proteins that were determined to be differentially accumulated in epididymosomes isolated from the (B) corpus versus caput, (C) cauda versus corpus, and (D) cauda versus caput epididymal segments. Thresholds of ≥ 1.5-fold change (p < 0.05) in TMT reporter ion intensity were implemented to identify proteins subject to differential accumulation in epididymosomes from different epididymal segments. Several epididymosome proteins that experienced prominent fold-changes between epididymal segments are annotated, as is the MFGE8 protein that was targeted for additional characterization.
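The threshold rule described in this legend can be illustrated with a short Python sketch. It assumes each protein has an abundance ratio between two segments (for example cauda/caput reporter ion intensity) and a p-value. The function names, the strict `<` comparison on p, the treatment of a 1.5-fold decrease as a ratio of 1/1.5, and the use of −log10(p) as the volcano plot y axis are assumptions made for illustration, not details confirmed by the source.

```python
import math

FOLD_THRESHOLD = 1.5   # fold-change threshold stated in the text
P_THRESHOLD = 0.05     # significance threshold stated in the text

def classify(ratio, p_value):
    """Label a protein as 'increased', 'decreased' or 'unchanged'.

    ratio: abundance ratio between two segments (must be > 0).
    A ratio <= 1/1.5 is treated here as a >= 1.5-fold decrease.
    """
    if p_value < P_THRESHOLD:  # assumption: strict inequality
        if ratio >= FOLD_THRESHOLD:
            return "increased"
        if ratio <= 1.0 / FOLD_THRESHOLD:
            return "decreased"
    return "unchanged"

def volcano_point(ratio, p_value):
    """Return (log2 fold change, -log10 p) coordinates for a volcano plot."""
    return math.log2(ratio), -math.log10(p_value)

# Hypothetical values, not taken from the study.
examples = {
    "protein_a": (9.0, 0.001),
    "protein_b": (0.2, 0.01),
    "protein_c": (1.1, 0.4),
}
for name, (ratio, p) in examples.items():
    x, y = volcano_point(ratio, p)
    print(name, classify(ratio, p), round(x, 2), round(y, 2))
```

Working on the log2 scale makes increases and decreases symmetric about zero, which is why volcano plots use it for the x axis.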

0006886), “ER to Golgi vesicle-mediated transport” (GO identifier: 6888), and “endocytosis” (GO identifier: 6897) (Fig. 5A; pink columns). Notably, these transport categories also dominated the GO profile of proteins that experienced a decreased abundance in cauda versus caput epididymosomes (Fig. 5C; pink columns). In contrast, those epididymosome proteins for which a positive gradient of accumulation was documented between the caput and caudal segments mapped to common GO biological process categories of protein degradation/modification: “proteolysis,” “phosphorylation,” “negative regulation of peptidase activity” (GO identifiers: 6508, 16310, 10466) (Fig. 5B; yellow columns); metabolism: “metabolic process,” “carbohydrate metabolic process,” and “lipid metabolic process” (GO identifiers: 8152, 5975, 6629) (Fig. 5B; gray columns); “oxidation-reduction process” (GO identifier: 55114) (Fig. 5B; blue columns); as well as proteins that clustered into GO biological processes synonymous with immunological responses: “innate immune response,” “complement activation,” “immune system process,” “B-cell receptor signaling pathway,” “defense response to bacterium,” “positive regulation of B cell activation,” and “phagocytosis, recognition/engulfment” (GO identifiers: 45087, 6958, 2376, 50853, 42742, 50871, 6910, 6911) (Fig. 5B; orange columns). In the opposing subset of proteins characterized by lower abundance in cauda epididymosomes, the prevailing GO biological processes were clearly differentiated, featuring numerous categories associated with vesicle transport/secretion, which in addition to those described above, included “retrograde vesicle-mediated transport, Golgi to ER/endosome to Golgi” and “positive regulation of exosomal secretion” (GO identifiers: 6890, 42147, 1903543) (Fig. 5C; pink columns).

Validation of Differentially Accumulated Epididymosome Proteins — To confirm the differential accumulation of proteins into epididymosomes, 12 candidate proteins were selected for orthogonal targeted validation via immunoblotting. Most of these proteins were selected based on highest abundance in the caput epididymosomes before experiencing reduced abundance in the cauda segment of the epididymis (i.e. ADAM7, B4GALT1, HSP90B1, MFGE8, PDIA6). However, this analysis also included proteins exhibiting the reciprocal trend of increasing accumulation in cauda epididymosomes (i.e. ALDH2, CLU, PROM2, BAG6), as well as those that remained at relatively constant levels in epididymosomes sampled


FIG. 5. GO annotation of differentially accumulated epididymosome protein cargo. Epididymosome proteins were segregated based on relative levels of abundance in cauda versus caput epididymosomes into those that experienced (A) no change, (B) increased accumulation, or (C) decreased accumulation. Proteins within each category were subsequently annotated according to GO biological process. Shown are the top 20 biological processes assigned based on the number of mapped proteins. Colored columns represent those proteins clustered to the GO terms related to transport (pink), metabolism (gray), oxidation-reduction (blue), protein interactions/catabolism (yellow), cell-cell adhesion (purple) and cellular responses to stress, etc. (orange). Additional, dominant GO categories are denoted by black columns.


throughout the epididymis (PSMD7, DNM2, HSPA2). All immunoblotting experiments were performed in triplicate using pooled biological samples (n ≥ 3 animals/sample) differing from those employed for MS analyses and, in each experiment, flotillin 1 (FLOT1) was employed as an endogenous control to normalize the abundance levels of targeted proteins (Fig. 6D). This analysis confirmed the differential accumulation of 9 of the targeted epididymosome proteins, as well as the equivalent abundance of 2 of the remaining candidates (Fig. 6A); with each of these proteins characterized by an accumulation profile that closely paralleled the trends identified by MS analyses (Fig. 6B). The one exception was that of HSPA2, which was under-represented in the cauda epididymosomes via immunoblotting, yet recorded at equivalent levels in caput and cauda epididymosomes via MS analysis (Fig. 6A, 6B). Moreover, we were unable to detect the presence of selected sperm proteins of testicular origin that were included as negative controls (i.e. IZUMO1, ADAM3, ODF2) (Fig. 6C). Accordingly, a linear regression comparing the fold-changes recorded for each of these targets revealed significant correlation (R² = 0.61; p < 0.005) between the quantification data obtained via TMT and immunoblotting analyses (Fig. 6E). Together, such findings support the accuracy of our data in reflecting the spatial patterns of mouse epididymosome proteomic signatures.

FIG. 6. Immunoblot validation of the abundance of differentially accumulating epididymosome proteins. A, Quantitative MS data were validated via immunoblotting of differentially accumulating proteins. Candidate proteins included representatives with the highest abundance (according to TMT reporter ion intensity) in epididymosomes from the proximal segment of the epididymis (caput) (ADAM7, B4GALT1, HSP90B1, MFGE8, PDIA6) in addition to proteins exhibiting increasing accumulation in cauda epididymosomes (i.e. ALDH2, CLU, PROM2, BAG6), and those that remained at relatively constant levels in epididymosomes sampled throughout the epididymis (PSMD7, DNM2, HSPA2). B, Corresponding MS quantification data are presented. C, Negative controls included sperm proteins of testicular origin (IZUMO1, ADAM3, ODF2), D, whereas positive controls included validated epididymosome/exosome proteins (GAPDH, FLOT1). Analyses were performed in triplicate using biological samples comprising pooled epididymosomes purified from 12 mice and representative immunoblots are depicted. E, A linear regression was performed to compare the quantification data obtained via TMT (x axis) and immunoblotting (y axis) analyses for each of the targeted epididymosome proteins, revealing significant correlation (R² = 0.61; p < 0.005) between these data sets.

FIG. 7. Examination of the transfer of biotinylated proteins to spermatozoa after co-incubation with epididymosomes. The ability of mouse epididymosomes to transfer protein cargo to spermatozoa was assessed via labeling of caput epididymosomes with a membrane-impermeant biotin reagent. The biotinylated epididymosomes (biotinylated ES) were co-incubated for 1 h with spermatozoa isolated from the caput epididymis, after which the cells were washed thoroughly and split into two fractions in preparation for assessment of biotinylated protein uptake via either (A, B) immunoblotting or (C) affinity labeling with HRP- or FITC-conjugated streptavidin, respectively. D, To explore the specificity of epididymosome-sperm interactions, an equivalent experiment was performed in which caput spermatozoa were incubated with epididymosomes isolated from the cauda epididymis. E, F, Controls included equivalent populations of spermatozoa incubated under identical conditions in the absence of epididymosomes. These cells were either left in an unlabeled state (A, F: Unlabeled sperm) to confirm the absence of auto-fluorescence or incubated directly with biotin reagent (B, E) to confirm the specificity of epididymosome-mediated protein transfer. G, Additionally, the efficacy of biotinylated protein transfer into the post-acrosomal sheath of the sperm head was assessed following co-incubation of caput spermatozoa with either caput (pink column) or cauda (blue column) epididymosomes. These analyses were performed in triplicate using biological samples comprising pooled epididymosomes purified from three mice. During co-incubation, spermatozoa and epididymosomes were titrated to a ratio of 1:1; that is, aliquots of spermatozoa recovered from one animal (10 × 10^6 cells) were inseminated with epididymosomes also equating to those isolated from a single animal. A minimum of 100 spermatozoa were assessed/replicate within each treatment group and graphical data are presented as means ± S.E.; ** p < 0.01. Representative immunoblots and immunofluorescence images are shown.

Accumulation of Biotinylated Epididymosome Cargo into Spermatozoa — Having confirmed substantial changes in the overall profile and relative levels of proteins present within mouse epididymosomes, we next sought to determine whether these vesicles were capable of delivering this macromolecular cargo to spermatozoa. Specifically, we applied an optimized co-incubation strategy (34) to track the transfer of biotinylated epididymosome protein cargo into mouse spermatozoa sampled from the caput epididymis. This analysis revealed significant accumulation of biotinylated protein into the sperm proteome (Fig. 7A). Of note, the epididymosome-mediated transfer of biotinylated protein appeared to be selective such that at 1 h postincubation with caput epididymosomes, these cargo were predominantly localized within the post-acrosomal sheath of the head of 40% of the sperm population (Fig. 7C). Additional labeling, albeit far less intense, was detected within the anterior acrosomal region of the head of these spermatozoa (Fig. 7C). Alternatively, a small number of cells (i.e. 15%) were characterized by punctate labeling that was either distributed throughout the sperm head or restricted to the sub-acrosomal ring; however, we rarely (i.e. 5%) observed any labeling of the sperm flagellum. To extend our analysis of the specificity of epididymosome-sperm interaction, we performed a heterologous co-incubation assay in which caput sperm were incubated with

S102 Molecular & Cellular Proteomics 18.13 Proteomic Profiling of Mouse Epididymosomes

epididymosomes recovered from the cauda epididymis. Thereafter, we recorded the transfer of biotinylated protein cargo primarily into the post-acrosomal sheath of the sperm head, an equivalent domain to that witnessed following incubation with caput epididymosomes (Fig. 7D). Notably however, both the intensity of the biotin labeling (Fig. 7D), and the number of spermatozoa incorporating this label were significantly reduced compared with caput sperm incubated with caput epididymosomes (i.e. 24.7 ± 2.7 versus 40.3 ± 2.9%, respectively; p < 0.01) (Fig. 7G). To control for the possibility of nonspecific labeling because of the presence of unreacted biotin reagent, spermatozoa were also subjected to direct biotinylation, revealing a distinct pattern of labeling that was uniformly distributed across all sperm domains (Fig. 7E). Similarly, the profile of sperm proteins targeted for direct biotinylation also differed from that present in either biotinylated epididymosomes or in sperm lysates following their co-incubation with biotinylated epididymosomes (Fig. 7B). Alternatively, we failed to detect any endogenous biotin labeling within naive populations of sperm incubated in the absence of epididymosomes (Fig. 7A, 7F).

Our collective data supporting the ability of epididymosomes to act as vehicles for modification of the maturing sperm proteome prompted us to seek a more detailed characterization of the mechanistic basis of this interaction. For this purpose, we elected to focus on MFGE8, a prevalent extracellular vesicle marker that has been variously implicated in the formation, secretion and uptake of exosomes from numerous cell types (59). Moreover, MFGE8 holds functional significance in terms of promoting sperm maturation owing to its downstream role in mediating initial adhesion to the egg coat (60). Consistent with previous evidence that mouse spermatozoa acquire MFGE8 during two distinct phases of their development; the first coinciding with spermatogenic development in the testis and the second attributed to uptake of the protein from the proximal epididymal segments, we documented discrete patterns of MFGE8 localization in testicular spermatozoa versus those isolated from the distal caput epididymis. Thus, MFGE8 localized to the peri-nuclear domain of the head and the principal-piece of the flagellum in immature testicular spermatozoa (Fig. 8A). In contrast, caput spermatozoa were characterized by additional foci of labeling within the post-acrosomal sheath and the anterior dorsal aspect of the head (Fig. 8B). Equivalent labeling of caput epididymal tissue sections revealed intense MFGE8 localization within the supranuclear domain of principal cells and juxtaposed with spermatozoa within the lumen of the tract (Fig. 8C); commensurate with that expected of a secretory protein. Accordingly, transmission immunoelectron microscopy analyses confirmed the presence of MFGE8 within epididymosomes, illustrating that immunogold particles were primarily restricted to the membrane of these vesicles and extended into stalk-like projections that formed at sites of interaction with the post-acrosomal sheath of spermatozoa (Fig. 8D, 8E). Notably, we were also able to demonstrate that pre-incubation of epididymosomes with anti-MFGE8 antibodies significantly compromised the efficacy of biotinylated protein cargo transfer between caput epididymosomes and spermatozoa (Fig. 8G; p < 0.05).

DISCUSSION

A salient feature of the mammalian epididymis is its tremendous capacity for protein synthesis and secretion (12). Such activity underpins the primary roles of this tissue in promoting the functional maturation of the male gamete as well as their prolonged storage in a viable state (10, 20). Both roles necessitate efficient mechanism(s) of delivering protein, and presumably other regulatory cargo (50), to the developing sperm cells. Among the potential mechanisms that could facilitate such bulk transfer, there is now compelling evidence supporting at least two; namely, via nonpathological amyloid matrices (61) and/or epididymosomes (21). The former of these, which may also equate to epididymal dense bodies (62, 63), have been proposed to coordinate interactions between the epididymal luminal contents and spermatozoa, although the extent and biological significance of such interactions remain to be fully resolved. By contrast, the constitutive shedding of epididymosomes appears pivotal in terms of modulating sperm function (29). Indeed, our study adds to a growing body of evidence that, despite the relatively simple structure of these nano-sized membranous vesicles, they encapsulate an extremely rich and diverse proteomic cargo; and one that is commensurate with their putative role as key intermediaries in soma-spermatozoa communication (21).

Although the preparation of epididymosomes has been reported in several species, to date the comprehensive molecular profiling of their cargo has predominantly been restricted to that of large domestic species (e.g. bull). Such species hold obvious advantages in terms of permitting the collection of large volumes of uncontaminated intraluminal fluids from along the length of the epididymal tract (21, 26). Regrettably, the application of equivalent collection protocols in smaller laboratory animals such as the mouse is technically very challenging, particularly in the context of recovering enough material to permit detailed endpoint characterization of the epididymosome proteome. Through necessity, we have instead pursued the isolation of epididymosomes from samples of luminal fluid obtained by puncture of the epididymis and processing by successive centrifugations to purify these vesicles. Although our previous studies have reported the utility of this approach in effectively eliminating cellular debris and sperm fragments (34), we readily acknowledge that we cannot entirely mitigate against the possibility of some epithelial and/or sperm cell contamination. One potential source of contamination is that of the cytoplasmic droplet, a nascent structure formed as a legacy of spermiogenesis during which spermatozoa are remodeled to remove most of their cytoplasm. This residual body is subsequently shed from the


FIG. 8. Characterization of the exosome marker, MFGE8, in mouse epididymosomes. Anti-MFGE8 antibodies were used to localize the protein in populations of (A) testicular spermatozoa and those isolated from the (B) caput epididymis. Spermatozoa were sequentially labeled with an appropriate Alexa Fluor 488-conjugated secondary antibody (green) and counterstained with the nuclear stain DAPI (blue). C, Similarly, anti-MFGE8 antibodies were also used to label caput epididymal tissue sections, prior to the application of an Alexa Fluor 594-conjugated secondary antibody (red) and DAPI counterstain (blue). Arrowheads indicate supranuclear labeling of epithelial cells characteristic of epididymal secretory proteins. Ep = epithelium, Sp = luminal spermatozoa, Int = interstitial tissue. D–F, Transmission immunoelectron microscopy was used to assess the localization of anti-MFGE8 antibodies in caput epididymosomes in situ. These sections were labeled with appropriate secondary antibody conjugated to 10 nm gold particles, with (F) controls including equivalent sections in which the primary antibody was omitted (i.e. secondary antibody only). (E) Depicts an enlargement of the boxed area in panel (D). G, The efficacy of biotinylated protein transfer into the post-acrosomal sheath of the head of caput spermatozoa (inset) was assessed following pre-incubation of caput epididymosomes with either anti-MFGE8 or anti-GAPDH antibodies, the latter being a protein that one would not expect to be associated with the membrane of extracellular vesicles. All analyses were performed in triplicate with representative immunofluorescence and immunogold images being shown. (G) A minimum of 100 spermatozoa were assessed/replicate within each treatment group and graphical data are presented as means ± S.E.; *p < 0.05.
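The legend above reports graphical data as means with standard errors from triplicate analyses. As an illustration only, the following minimal Python sketch shows one conventional way such a summary could be computed. The function name and the replicate percentages are hypothetical and are not data from this study, and the sample standard deviation (n - 1 denominator) is an assumption, since the legend does not state which estimator was used.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for replicate measurements."""
    if len(values) < 2:
        raise ValueError("need at least two replicates to estimate a standard error")
    mean = statistics.mean(values)
    # Assumption: sample standard deviation (n - 1 denominator) divided by sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical percentages of labeled spermatozoa from three replicates
# (illustrative values only, not taken from the study).
replicates = [38.0, 40.0, 42.0]
mean, sem = mean_and_standard_error(replicates)
print(f"{mean:.1f} ± {sem:.1f} %")
```

With three replicates the standard error rests on only two degrees of freedom, so intervals derived from it are wide; this is one reason replicate counts are stated alongside such summaries.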
maturing sperm cell as they are conveyed through the epididymis. Of relevance to our study, the cytoplasmic droplet does contain numerous vesicles of roughly equivalent size to epididymosomes and could thus be co-isolated alongside this population of exosomes. Adding to this concern is the recognition that the cytoplasmic droplet serves to compartmentalize proteins implicated in membrane trafficking pathways, glucose transport, glycolysis, actin, tubulin and the proteasomal complex (64). Given this possibility for contamination, it was particularly reassuring that “extracellular exosome” featured among the top ranked cellular component categories identified in the epididymosome reported herein. Indeed, 82% of the total epididymosome protein cargo identified herein have previously been identified as genuine exosome-borne cargo in other cellular models. Similarly, “transport” and the ancillary categories of “protein transport” and “vesicle-mediated transport” were identified among the dominant biological processes annotated from the complete mouse epididymosome proteome. Moreover, immunoblotting confirmed that the epididymosome fractions studied herein were devoid of several sperm-specific markers, including well characterized proteins of testicular origin (i.e. IZUMO1, ADAM3 and ODF2).

Although such evidence affirms our enrichment of epididymosomes, our compiled proteomic inventory did contain several categories of protein that one may not reasonably expect to be associated with an extracellular vesicle destined to be delivered to spermatozoa, or downstream segments of the male/female reproductive tracts. We did not anticipate the presence of proteins mapping to the broad functional categories of nucleotide binding and processing; with relatively large numbers of histone variants and ribonucleoproteins serving as cases in point. Although it is difficult to envisage the functional significance of such findings, they are certainly not without precedent. In this context, equivalent proteins have been documented in populations of exosomes originating from cell types as diverse as fibroblasts, mast cells, neural stem cells, dendritic cells, and oligodendrocytes (65–68). A subset of these proteins have also been recorded among the constituents of bovine epididymosomes (24). Further, it is acknowledged that mature spermatozoa do harbor the basic, and presumably obsolete, machinery to synthesize proteins, including numerous cytoplasmic and mitochondrial ribosomal proteins (69). It is widely held that such proteins simply represent remnants of the spermatogenic process. However, our study raises the intriguing prospect that the complement of these proteins may also be supplemented during post-testicular sperm development via interaction with epididymosomes. In the absence of evidence substantiating the synthesis of proteins from nuclear-encoded genes in sperm, such proteins may be subverted for alternative non-canonical functions or may be transmitted to the oocyte upon fertilization to participate in early embryogenesis. Further work is clearly required to substantiate these possibilities and thus refine our understanding of the biological implications of such enrichment.

Notably, functional annotation of the subset of the 18% of epididymosome proteins that were identified as not having been reported in previous exosome proteomic catalogues, revealed an abundance of candidates linked to sperm maturation and/or fertilization; characteristics that one may logically expect to be associated with the functional transformation of the male gamete. Notable examples include the functional subunits of the chaperonin containing TCP1 complex (CCT/TRiC) as well as the putative interacting partners of ZP3R and ZPBP2, which have been implicated in the mediation of sperm-oocyte interactions (70, 71). Such findings suggest that this “non-conserved” subset of epididymosome-borne proteins may be of interest in helping to decipher the mechanisms driving the functional maturation of spermatozoa; potentially extending to a directed analysis of species-specific elements of these pathways. In a similar context, the abundance of these proteins assigned to the broad categories of “transport” and “oxidation-reduction” may hold important information in terms of dissecting the mechanistic basis of epididymosome-biogenesis/trafficking and protection of the mature gamete from free radical injury, respectively.

Our collective data also support the notion that epididymosome-sperm interaction is selective, with biotinylated epididymosome proteins being preferentially sequestered into a discrete physiological domain known as the post-acrosomal sheath, which is located within the posterior of the sperm head. Although such selectivity raises the prospect that unique compositional characteristics of the sperm plasma membrane directly influence the efficacy of epididymosome uptake, regrettably the mechanistic basis of this process has yet to be completely resolved. Notably, it has been argued that endocytic uptake, one of the principal routes of exosome internalization in somatic cells (33), is severely compromised in mature spermatozoa (72). Indeed, cytochemical investigations have reported that spermatozoa lack the machinery needed to internalize exogenous molecules via endocytosis and are also devoid of the lysosomal organelles that serve as the typical targets for endocytosed cargo (72). Further, mature spermatozoa are apparently also incapable of the active lipid recycling necessitated by endocytosis (73). In view of this evidence, it is possible that spermatozoa employ non-canonical pathways for uptake of epididymosomes, such as direct fusion occurring at the interface of the respective membranes (33). In this context, several complementary protein families implicated as key regulators of membrane/vesicle fusion-based pathways have been identified in spermatozoa (69) and in the epididymosome proteome reported herein. Examples of the latter proteins include those of the soluble NSF attachment protein receptor (SNARE) (e.g. VAT1, STX3, STX4, STX5, STX6, STX7, STX8, STX12, STX16, STX17, STXBP2, STXBP3), RAB small GTPase (RAB1A, RAB1B, RAB2A, RAB2B, RAB3A, RAB5A, RAB5B, RAB5C, RAB6B, RAB7A, RAB8A, RAB8B, RAB9A, RAB11B, RAB13, RAB14, RAB18, RAB22A, RAB23, RAB24, RAB25, RAB35), and SEC (SEC11A, SEC13, SEC22B, SEC23A, SEC23B, SEC23IP, SEC24A, SEC31A) related families. Alternatively, it has been postulated that selective trafficking of epididymosome cargo to recipient sperm cells may be coordinated by the lipid raft-like properties of the vesicular membranes (74). In this regard, it is known that the lipid composition of mouse epididymosomes is dynamically remodeled in different epididymal segments such that these vesicles become progressively more rigid in the distal segments of the duct (75). At present it is not known what implications this has in terms of epididymosome-sperm and/or epididymosome-soma interactions. Clearly, additional work is needed to distinguish the relative contribution of the putative route(s) of sequestration of epididymosome contents into recipient cells. In guiding this work however, it is notable that previous studies have described the fusogenic properties of bovine epididymosomes and provided compelling evidence that such interactions lead to significant changes in the lipid and protein composition of epididymal sperm (22). These findings are consistent with our own immunoelectron microscopy analyses, which confirmed the presence of stalk-like projections forming at the interface of epididymosome-spermatozoa contact within the lumen of the epididymis. Such ultrastructural features have previously been recorded and taken as evidence of vesicle fusion between spermatozoa and oviductosomes (extracellular membrane vesicles released into the oviductal fluid) (76), raising the prospect of conserved mechanisms for facilitating cargo delivery between spermatozoa and the different populations


of extracellular vesicles they encounter en route to the site of fertilization. Accordingly, epididymosome protein transfer was significantly inhibited by antibody masking of MFGE8, a protein that possesses an RGD recognition motif implicated in integrin/ligand interactions that precede cellular fusion (59).

In summary, the data obtained in the present study provide novel insights into the diversity of the proteomic landscape encapsulated within mouse epididymosomes. In accordance with previous work, our findings emphasize the fundamental importance of epididymosomes as key elements of the epididymal microenvironment necessary for coordinating post-testicular sperm maturation and storage. This work encourages further studies aimed at deciphering the biogenesis and cargo-sorting mechanisms responsible for epididymosome formation as well as more detailed examination of the mechanism(s) by which they can coordinate the delivery of proteinaceous cargo to recipient cells.

Acknowledgments: This work was supported by a National Health and Medical Research Council of Australia Project Grant (APP1147932) awarded to B. N. and M. D. D. B. N. is supported by an Australian Research Council Future Fellowship. M. D. D. is supported by a Cancer Institute NSW ECR Fellowship. This work was supported by Mr Nathan Smith from The University of Newcastle Analytical and Biomolecular Research Facility (ABRF). The Academic and Research Computing Support (ARCS) team, within IT Services at the University of Newcastle, provided high performance computing (HPC) infrastructure for supporting the bioinformatics.

DATA AVAILABILITY

The dataset (Dataset S1) analyzed here has been deposited in the Mass Spectrometry Interactive Virtual Environment (MassIVE) database with the dataset identifier MSV000082497 and is publicly accessible at: https://massive.ucsd.edu/ProteoSAFe/dataset.jsp?task=5cf36b642ed54e318f3b5dbb0d1db830. The FTP download is accessible via the link: ftp://massive.ucsd.edu/MSV000082497.

This article contains supplemental material.

** To whom correspondence should be addressed: School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, The University of Newcastle, Callaghan, NSW 2308, Australia. Tel.: 61-2-4921-5693; Fax: 61-2-4921-6903; E-mail: Matt.Dun@newcastle.edu.au.

Author contributions: B. N., G. N. D. I., M. R. L., and M. D. D. designed research; B. N., G. N. D. I., H. M. H., W. Z., A. M., I. B., A. L. A., S. J. S., J. G. A., E. G. B., M. R. L., and M. D. D. analyzed data; B. N., M. R. L., and M. D. D. wrote the paper; H. M. H., W. Z., A. M., I. B., A. L. A., S. J. S., D. A. S.-B., M. F. B. J., J. G. A., E. G. B., M. R. L., and M. D. D. performed research.

REFERENCES

1. Cornwall, G. A. (2009) New insights into epididymal biology and function. Hum. Reprod. Update 15, 213–227
2. Hermo, L., Pelletier, R. M., Cyr, D. G., and Smith, C. E. (2010) Surfing the wave, cycle, life history, and genes/proteins expressed by testicular germ cells. Part 1: background to spermatogenesis, spermatogonia, and spermatocytes. Microsc. Res. Tech. 73, 241–278
3. Aitken, R. J., Nixon, B., Lin, M., Koppers, A. J., Lee, Y. H., and Baker, M. A. (2007) Proteomic changes in mammalian spermatozoa during epididymal maturation. Asian J. Androl. 9, 554–564
4. Cornwall, G. A. (2014) Role of posttranslational protein modifications in epididymal sperm maturation and extracellular quality control. Adv. Exp. Med. Biol. 759, 159–180
5. Cooper, T. G. (1986) The epididymis, sperm maturation and fertilisation. Springer-Verlag, Berlin, Heidelberg, New York, London, Paris, Tokyo
6. Cooper, T. G. (1995) Role of the epididymis in mediating changes in the male gamete during maturation. Adv. Exp. Med. Biol. 377, 87–101
7. Jelinsky, S. A., Turner, T. T., Bang, H. J., Finger, J. N., Solarz, M. K., Wilson, E., Brown, E. L., Kopf, G. S., and Johnston, D. S. (2007) The rat epididymal transcriptome: comparison of segmental gene expression in the rat and mouse epididymides. Biol. Reprod. 76, 561–570
8. Johnston, D. S., Jelinsky, S. A., Bang, H. J., Di Candeloro, P., Wilson, E., Kopf, G. S., and Turner, T. T. (2005) The mouse epididymal transcriptome: transcriptional profiling of segmental gene expression in the epididymis. Biol. Reprod. 73, 404–413
9. Johnston, D. S., Jelinsky, S. A., Bang, H. J., Di Candeloro, P., Wilson, E., Kopf, G. S., and Turner, T. T. (2005) The mouse epididymal transcriptome: transcriptional profiling of segmental gene expression in the epididymis. Biol. Reprod. 73, 404–413
10. Dacheux, J. L., and Dacheux, F. (2014) New insights into epididymal function in relation to sperm maturation. Reproduction 147, R27–R42
11. Dacheux, J. L., Dacheux, F., and Druart, X. (2016) Epididymal protein markers and fertility. Anim. Reprod. Sci. 169, 76–87
12. Guyonnet, B., Dacheux, F., Dacheux, J. L., and Gatti, J. L. (2011) The epididymal transcriptome and proteome provide some insights into new epididymal regulations. J. Androl. 32, 651–664
13. Hinton, B. T., and Cooper, T. G. (2010) The epididymis as a target for male contraceptive development. Handb. Exp. Pharmacol. 117–137
14. Turner, T. T., Johnston, D. S., and Jelinsky, S. A. (2006) Epididymal genomics and the search for a male contraceptive. Mol. Cell Endocrinol. 250, 178–183
15. Cooper, T. G., and Yeung, C. H. (1999) Recent biochemical approaches to post-testicular, epididymal contraception. Hum. Reprod. Update 5, 141–152
16. Jones, R. (1989) Membrane remodelling during sperm maturation in the epididymis. Oxf. Rev. Reprod. Biol. 11, 285–337
17. Bedford, J. M. (2015) Human spermatozoa and temperature: the elephant in the room. Biol. Reprod. 93, 97
18. Cooper, T. G., Waites, G. M., and Nieschlag, E. (1986) The epididymis and male fertility. A symposium report. Int. J. Androl. 9, 81–90
19. Cooper, T. G., Yeung, C. H., Nashan, D., and Nieschlag, E. (1988) Epididymal markers in human infertility. J. Androl. 9, 91–101
20. Sullivan, R., and Mieusset, R. (2016) The human epididymis: its function in sperm maturation. Hum. Reprod. Update 22, 574–587
21. Sullivan, R. (2015) Epididymosomes: a heterogeneous population of microvesicles with multiple functions in sperm maturation and storage. Asian J. Androl. 17, 726–729
22. Schwarz, A., Wennemuth, G., Post, H., Brandenburger, T., Aumuller, G., and Wilhelm, B. (2013) Vesicular transfer of membrane components to bovine epididymal spermatozoa. Cell Tissue Res. 353, 549–561
23. Suryawanshi, A. R., Khan, S. A., Joshi, C. S., and Khole, V. V. (2012) Epididymosome-mediated acquisition of MMSDH, an androgen-dependent and developmentally regulated epididymal sperm protein. J. Androl. 33, 963–974
24. Girouard, J., Frenette, G., and Sullivan, R. (2011) Comparative proteome and lipid profiles of bovine epididymosomes collected in the intraluminal compartment of the caput and cauda epididymidis. Int. J. Androl. 34, e475–e486
25. Frenette, G., Girouard, J., D’Amours, O., Allard, N., Tessier, L., and Sullivan, R. (2010) Characterization of two distinct populations of epididymosomes collected in the intraluminal compartment of the bovine cauda epididymis. Biol. Reprod. 83, 473–480
26. Sullivan, R., Frenette, G., and Girouard, J. (2007) Epididymosomes are involved in the acquisition of new sperm proteins during epididymal transit. Asian J. Androl. 9, 483–491
27. Machtinger, R., Laurent, L. C., and Baccarelli, A. A. (2016) Extracellular vesicles: roles in gamete maturation, fertilization and embryo implantation. Hum. Reprod. Update 22, 182–193
28. Barkalina, N., Jones, C., Wood, M. J., and Coward, K. (2015) Extracellular vesicle-mediated delivery of molecular compounds into gametes and embryos: learning from nature. Hum. Reprod. Update 21, 627–639


29. Sullivan, R., and Saez, F. (2013) Epididymosomes, prostasomes, and liposomes: their roles in mammalian male reproductive physiology. Reproduction 146, R21–R35
30. Hermo, L., and Jacks, D. (2002) Nature’s ingenuity: bypassing the classical secretory route via apocrine secretion. Mol. Reprod. Dev. 63, 394–410
31. Farkas, R. (2015) Apocrine secretion: New insights into an old phenomenon. Biochim. Biophys. Acta 1850, 1740–1750
32. Aumuller, G., Wilhelm, B., and Seitz, J. (1999) Apocrine secretion–fact or artifact? Ann. Anat. 181, 437–446
33. Mulcahy, L. A., Pink, R. C., and Carter, D. R. (2014) Routes and mechanisms of extracellular vesicle uptake. J. Extracell Vesicles 3, 24641
34. Reilly, J. N., McLaughlin, E. A., Stanger, S. J., Anderson, A. L., Hutcheon, K., Church, K., Mihalas, B. P., Tyagi, S., Holt, J. E., Eamens, A. L., and Nixon, B. (2016) Characterisation of mouse epididymosomes reveals a complex profile of microRNAs and a potential mechanism for modification of the sperm epigenome. Sci. Rep. 6, 31794
35. Biggers, J. D., Whitten, W. K., and Whittingham, D. G. (1971) The culture of mouse embryos in vitro. In: Daniel, J. C., ed. Methods in Mammalian Embryology, pp. 86–116, Freeman Press, San Francisco, CA
36. Anderson, A. L., Stanger, S. J., Mihalas, B. P., Tyagi, S., Holt, J. E., McLaughlin, E. A., and Nixon, B. (2015) Assessment of microRNA expression in mouse epididymal epithelial cells and spermatozoa by next generation sequencing. Genom. Data 6, 208–211
37. Lotvall, J., Hill, A. F., Hochberg, F., Buzas, E. I., Di Vizio, D., Gardiner, C., Gho, Y. S., Kurochkin, I. V., Mathivanan, S., Quesenberry, P., Sahoo, S., Tahara, H., Wauben, M. H., Witwer, K. W., and Thery, C. (2014) Minimal experimental requirements for definition of extracellular vesicles and their functions: a position statement from the International Society for Extracellular Vesicles. J. Extracell Vesicles 3, 26913
38. Keerthikumar, S., Chisanga, D., Ariyaratne, D., Al Saffar, H., Anand, S., Zhao, K., Samuel, M., Pathan, M., Jois, M., Chilamkurti, N., Gangoda, L., and Mathivanan, S. (2016) ExoCarta: A Web-Based Compendium of Exosomal Cargo. J. Mol. Biol. 428, 688–692
39. Nixon, B., Mitchell, L. A., Anderson, A. L., McLaughlin, E. A., O’Bryan, M. K., and Aitken, R. J. (2011) Proteomic and functional analysis of human sperm detergent resistant membranes. J. Cell Physiol. 226, 2651–2665
40. Fujiki, Y., Hubbard, A. L., Fowler, S., and Lazarow, P. B. (1982) Isolation of intracellular membranes by means of sodium carbonate treatment: application to endoplasmic reticulum. J. Cell Biol. 93, 97–102
41. Larsen, M. R., Cordwell, S. J., and Roepstorff, P. (2002) Graphite powder as an alternative or supplement to reversed-phase material for desalting and concentration of peptide mixtures prior to matrix-assisted laser desorption/ionization-mass spectrometry. Proteomics 2, 1277–1287
42. Degryse, S., de Bock, C. E., Demeyer, S., Govaerts, I., Bornschein, S., Verbeke, D., Jacobs, K., Binos, S., Skerrett-Byrne, D. A., Murray, H. C., Verrills, N. M., Van Vlierberghe, P., Cools, J., and Dun, M. D. (2018) Mutant JAK3 phosphoproteomic profiling predicts synergism between JAK3 inhibitors and MEK/BCL2 inhibitors for the treatment of T-cell acute lymphoblastic leukemia. Leukemia 32, 788–800
43. Koch, H., Wilhelm, M., Ruprecht, B., Beck, S., Frejno, M., Klaeger, S., and Kuster, B. (2016) Phosphoproteome Profiling Reveals Molecular Mechanisms of Growth-Factor-Mediated Kinase Inhibitor Resistance in EGFR-Overexpressing Cancer Cells. J. Proteome Res. 15, 4490–4504
44. Ross, P. L., Huang, Y. N., Marchese, J. N., Williamson, B., Parker, K., Hattan, S., Khainovski, N., Pillai, S., Dey, S., Daniels, S., Purkayastha, S., Juhasz, P., Martin, S., Bartlet-Jones, M., He, F., Jacobson, A., and Pappin, D. J. (2004) Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell Proteomics 3, 1154–1169
45. Engholm-Keller, K., Hansen, T. A., Palmisano, G., and Larsen, M. R. (2011) Multidimensional strategy for sensitive phosphoproteomics incorporating protein prefractionation combined with SIMAC, HILIC, and TiO(2) chromatography applied to proximal EGF signaling. J. Proteome Res. 10, 5383–5397
46. Dun, M. D., Chalkley, R. J., Faulkner, S., Keene, S., Avery-Kiejda, K. A., Scott, R. J., Falkenby, L. G., Cairns, M. J., Larsen, M. R., Bradshaw, R. A., and Hondermarck, H. (2015) Proteotranscriptomic Profiling of 231-BR Breast Cancer Cells: Identification of Potential Biomarkers and Therapeutic Targets for Brain Metastasis. Mol. Cell Proteomics 14, 2316–2330
47. Perkins, D. N., Pappin, D. J., Creasy, D. M., and Cottrell, J. S. (1999) Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20, 3551–3567
48. Huang, D. W., Sherman, B. T., and Lempicki, R. A. (2009) Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 37, 1–13
49. Huang, D. W., Sherman, B. T., and Lempicki, R. A. (2009) Systematic and integrative analysis of large gene lists using DAVID Bioinformatics Resources. Nature Protoc. 4, 44–57
50. Nixon, B., Stanger, S. J., Mihalas, B. P., Reilly, J. N., Anderson, A. L., Tyagi, S., Holt, J. E., and McLaughlin, E. A. (2015) The microRNA signature of mouse spermatozoa is substantially modified during epididymal maturation. Biol. Reprod. 93, 91
51. Frenette, G., Lessard, C., and Sullivan, R. (2002) Selected proteins of “prostasome-like particles” from epididymal cauda fluid are transferred to epididymal caput spermatozoa in bull. Biol. Reprod. 67, 308–313
52. Lin, D. T., Makino, Y., Sharma, K., Hayashi, T., Neve, R., Takamiya, K., and Huganir, R. L. (2009) Regulation of AMPA receptor extrasynaptic insertion by 4.1N, phosphorylation and palmitoylation. Nat. Neurosci. 12, 879–887
53. Dun, M. D., Anderson, A. L., Bromfield, E. G., Asquith, K. L., Emmett, B., McLaughlin, E. A., Aitken, R. J., and Nixon, B. (2012) Investigation of the expression and functional significance of the novel mouse sperm protein, a disintegrin and metalloprotease with thrombospondin type 1 motifs number 10 (ADAMTS10). Int. J. Androl. 35, 572–589
54. Zhou, W., De Iuliis, G. N., Turner, A. P., Reid, A. T., Anderson, A. L., McCluskey, A., McLaughlin, E. A., and Nixon, B. (2017) Developmental expression of the dynamin family of mechanoenzymes in the mouse epididymis. Biol. Reprod. 96, 159–173
55. Thimon, V., Frenette, G., Saez, F., Thabet, M., and Sullivan, R. (2008) Protein composition of human epididymosomes collected during surgical vasectomy reversal: a proteomic and genomic approach. Hum. Reprod. 23, 1698–1707
56. Gatti, J. L., Metayer, S., Belghazi, M., Dacheux, F., and Dacheux, J. L. (2005) Identification, proteomic profiling, and origin of ram epididymal fluid exosome-like vesicles. Biol. Reprod. 72, 1452–1465
57. Skerget, S., Rosenow, M. A., Petritis, K., and Karr, T. L. (2015) Sperm Proteome Maturation in the Mouse Epididymis. PLoS ONE 10, e0140650
58. Zhou, W., De Iuliis, G. N., Dun, M. D., and Nixon, B. (2018) Characteristics of the Epididymal Luminal Environment Responsible for Sperm Maturation and Storage. Front. Endocrinol. 9, 59
59. Raymond, A., Ensslin, M. A., and Shur, B. D. (2009) SED1/MFG-E8: a bi-motif protein that orchestrates diverse cellular interactions. J. Cell Biochem. 106, 957–966
60. Ensslin, M. A., and Shur, B. D. (2003) Identification of mouse sperm SED1, a bimotif EGF repeat and discoidin-domain protein involved in sperm-egg binding. Cell 114, 405–417
61. Whelly, S., Muthusubramanian, A., Powell, J., Johnson, S., Hastert, M. C., and Cornwall, G. A. (2016) Cystatin-related epididymal spermatogenic subgroup members are part of an amyloid matrix and associated with extracellular vesicles in the mouse epididymal lumen. Mol. Hum. Reprod. 22, 729–744
62. Asquith, K. L., Harman, A. J., McLaughlin, E. A., Nixon, B., and Aitken, R. J. (2005) Localization and significance of molecular chaperones, heat shock protein 1, and tumor rejection antigen gp96 in the male reproductive tract and during capacitation and acrosome reaction. Biol. Reprod. 72, 328–337
63. Reid, A. T., Anderson, A. L., Roman, S. D., McLaughlin, E. A., McCluskey, A., Robinson, P. J., Aitken, R. J., and Nixon, B. (2015) Glycogen synthase kinase 3 regulates acrosomal exocytosis in mouse spermatozoa via dynamin phosphorylation. FASEB J. 29, 2872–2882
64. Au, C. E., Hermo, L., Byrne, E., Smirle, J., Fazel, A., Kearney, R. E., Smith, C. E., Vali, H., Fernandez-Rodriguez, J., Simon, P. H., Mandato, C., Nilsson, T., and Bergeron, J. J. (2015) Compartmentalization of membrane trafficking, glucose transport, glycolysis, actin, tubulin and the proteasome in the cytoplasmic droplet/Hermes body of epididymal sperm. Open. Biol. 5
65. Luga, V., Zhang, L., Viloria-Petit, A. M., Ogunjimi, A. A., Inanlou, M. R., Chiu, E., Buchanan, M., Hosein, A. N., Basik, M., and Wrana, J. L. (2012) Exosomes mediate stromal mobilization of autocrine Wnt-PCP signaling in breast cancer cell migration. Cell 151, 1542–1556
66. Valadi, H., Ekstrom, K., Bossios, A., Sjostrand, M., Lee, J. J., and Lotvall, J. O. (2007) Exosome-mediated transfer of mRNAs and microRNAs is a

Molecular & Cellular Proteomics 18.13 S107 Proteomic Profiling of Mouse Epididymosomes

novel mechanism of genetic exchange between cells. Nat. Cell Biol. 9, 72. Jones, S., Lukanowska, M., Suhorutsenko, J., Oxenham, S., Barratt, C., 654–659 Publicover, S., Copolovici, D. M., Langel, U., and Howl, J. (2013) Intra- 67. Cossetti, C., Iraci, N., Mercer, T. R., Leonardi, T., Alpi, E., Drago, D., Alfaro- cellular translocation and differential accumulation of cell-penetrating Cervello, C., Saini, H. K., Davis, M. P., Schaeffer, J., Vega, B., Stefanini, M., peptides in bovine spermatozoa: evaluation of efficient delivery vectors Zhao, C., Muller, W., Garcia-Verdugo, J. M., Mathivanan, S., Bachi, A., that do not compromise human sperm motility. Hum. Reprod. 28, Enright, A. J., Mattick, J. S., and Pluchino, S. (2014) Extracellular vesicles 1874–1889 from neural stem cells transfer IFN-gamma via Ifngr1 to activate Stat1 73. Gadella, B. M., and Evans, J. P. (2011) Membrane fusions during mamma- signaling in target cells. Mol. Cell 56, 193–204 lian fertilization. Adv. Exp. Med. Biol. 713, 65–80 68. Kramer-Albers, E. M., Bretz, N., Tenzer, S., Winterstein, C., Mobius, W., 74. Girouard, J., Frenette, G., and Sullivan, R. (2009) Compartmentalization of Berger, H., Nave, K. A., Schild, H., and Trotter, J. (2007) Oligodendrocytes proteins in epididymosomes coordinates the association of epididymal secrete exosomes containing major myelin and stress-protective proteins: proteins with the different functional structures of bovine spermatozoa. Trophic support for axons? Proteomics Clin. Appl 1, 1446–1461 Biol. Reprod. 80, 965–972 69. Amaral, A., Castillo, J., Ramalho-Santos, J., and Oliva, R. (2014) The 75. Rejraji, H., Sion, B., Prensier, G., Carreras, M., Motta, C., Frenoux, J. M., combined human sperm proteome: cellular pathways and implications for basic and clinical science. Hum. Reprod. Update 20, 40–62 Vericel, E., Grizard, G., Vernet, P., and Drevet, J. R. (2006) Lipid remod- 70. Dun, M. D., Aitken, R. J., and Nixon, B. 
(2012) The role of molecular eling of murine epididymosomes and spermatozoa during epididymal chaperones in spermatogenesis and the post-testicular maturation of maturation. Biol. Reprod. 74, 1104–1113 Downloaded from mammalian spermatozoa. Hum. Reprod. Update 18, 420–435 76. Al-Dossary, A. A., Bathala, P., Caplan, J. L., and Martin-DeLeon, P. A. 71. Dun, M. D., Smith, N. D., Baker, M. A., Lin, M., Aitken, R. J., and Nixon, (2015) Oviductosome-sperm membrane interaction in cargo delivery: B. (2011) The chaperonin containing TCP1 complex (CCT/TRiC) is Detection of fusion and underlying molecular players using three-dimen- involved in mediating sperm-oocyte interaction. J. Biol. Chem. 286, sional super-resolution structured illumination microscopy (SR-SIM). 36875–36887 J. Biol. Chem. 290, 17710–17723 http://www.mcponline.org/ at UNIV OF NEWCASTLE (CAUL) on September 5, 2019

Research

Modification of Crocodile Spermatozoa Refutes the Tenet That Post-testicular Sperm Maturation Is Restricted To Mammals

Authors
Brett Nixon, Stephen D. Johnston, David A. Skerrett-Byrne, Amanda L. Anderson, Simone J. Stanger, Elizabeth G. Bromfield, Jacinta H. Martin, Philip M. Hansbro, and Matthew D. Dun

Correspondence
Brett.Nixon@newcastle.edu.au

In Brief
Quantitative proteomic and phosphoproteomic analyses of Australian saltwater crocodile spermatozoa reveal that these cells respond to capacitation stimuli by mounting a cAMP-mediated signaling cascade that is analogous to that of spermatozoa from the mammalian lineage. Similarly, the phosphorylated proteins responsible for driving the functional maturation of crocodile spermatozoa share substantial evolutionary overlap with those documented in mammalian spermatozoa.

Highlights

• Comparative and quantitative phosphoproteomics of Australian saltwater crocodile spermatozoa.

• Capacitation stimuli elicit a cAMP-mediated signaling cascade in crocodile spermatozoa.

• Mechanistic insights into conservation of sperm activation pathways.

Nixon et al., 2019, Molecular & Cellular Proteomics 18, S59–S76, March 2019. © 2019 Nixon et al. Published under exclusive license by The American Society for Biochemistry and Molecular Biology, Inc. https://doi.org/10.1074/mcp.RA118.000904

Modification of Crocodile Spermatozoa Refutes the Tenet That Post-testicular Sperm Maturation Is Restricted To Mammals

Brett Nixon‡§‡‡, Stephen D. Johnston¶, David A. Skerrett-Byrne§, Amanda L. Anderson‡, Simone J. Stanger‡, Elizabeth G. Bromfield‡§, Jacinta H. Martin‡§, Philip M. Hansbro§∥, and Matthew D. Dun§**

Competition to achieve paternity has contributed to the development of a multitude of elaborate male reproductive strategies. In one of the most well-studied examples, the spermatozoa of all mammalian species must undergo a series of physiological changes, termed capacitation, in the female reproductive tract before realizing their potential to fertilize an ovum. However, the evolutionary origin and adaptive advantage afforded by capacitation remain obscure. Here, we report the use of comparative and quantitative proteomics to explore the biological significance of capacitation in an ancient reptilian species, the Australian saltwater crocodile (Crocodylus porosus). Our data reveal that exposure of crocodile spermatozoa to capacitation stimuli elicits a cascade of physiological responses that are analogous to those implicated in the functional activation of their mammalian counterparts. Indeed, among a total of 1119 proteins identified in this study, we detected 126 that were differentially phosphorylated (1.2 fold-change) in capacitated versus noncapacitated crocodile spermatozoa. Notably, this subset of phosphorylated proteins shared substantial evolutionary overlap with those documented in mammalian spermatozoa, and included key elements of signal transduction, metabolic and cellular remodeling pathways. Unlike mammalian sperm, however, we noted a distinct bias for differential phosphorylation of serine (as opposed to tyrosine) residues, with this amino acid featuring as the target for 80% of all changes detected in capacitated spermatozoa. Overall, these results indicate that the phenomenon of sperm capacitation is unlikely to be restricted to mammals and provide a framework for understanding the molecular changes in sperm physiology necessary for fertilization. Molecular & Cellular Proteomics 18: S59–S76, 2019. DOI: 10.1074/mcp.RA118.000904.

From the ‡Priority Research Centre for Reproductive Science, School of Environmental and Life Sciences, The University of Newcastle, Callaghan, NSW 2308, Australia; §Hunter Medical Research Institute, New Lambton Heights, NSW 2305, Australia; ¶School of Agriculture and Food Science, The University of Queensland, Gatton, QLD 4343, Australia; ∥Priority Research Centre for Healthy Lungs, Faculty of Health and Medicine, The University of Newcastle, Newcastle, NSW 2308, Australia; **Priority Research Centre for Cancer Research, Innovation and Translation, School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, The University of Newcastle, Callaghan, NSW 2308, Australia
Received June 5, 2018, and in revised form, July 24, 2018
Published, MCP Papers in Press, August 2, 2018, DOI 10.1074/mcp.RA118.000904

The molecular processes leading to fertilization remain among the key unresolved questions in the field of reproductive biology. Based on studies of the mammalian lineage, it is widely accepted that terminally differentiated spermatozoa develop the capacity to fertilize an ovum during sequential phases of post-testicular maturation as they pass through the male (epididymis) (1) and female reproductive tracts (2). The latter of these is termed capacitation and is defined as a time-dependent process during which spermatozoa experience a suite of biochemical and biophysical changes that collectively endow them with the ability to recognize and fertilize an ovum (2, 3). A distinctive characteristic of this phase of functional maturation is that it occurs in the complete absence of nuclear gene transcription and de novo protein synthesis. Instead, the regulation of capacitation rests almost exclusively with convergent signaling cascades that transduce extracellular signals to effect extensive post-translational modification of the intrinsic sperm proteome (4). In this context, differential protein phosphorylation has emerged as a dominant molecular switch, which regulates sperm-oocyte recognition and adhesion, the ability to undergo acrosomal exocytosis, and the propagation of an altered pattern of movement referred to as hyperactivation (5).

A curiosity of the capacitation cascade is its evolutionary origin and the adaptive advantage that is afforded by such an elaborate form of post-testicular sperm maturation. Indeed, studies of the spermatozoa of sub-therian vertebrate species, such as those of the Aves, have failed to document a process equivalent to capacitation (6–8), with spermatozoa of fowls and turkeys appearing refractory to the need for any such physiological changes during their residence in the female reproductive system (6, 8). Similarly, in the quail it has been

shown that most testicular sperm can bind to a perivitelline membrane and acrosome react with no additional advantage being afforded by exposure to capacitation stimuli (9). Although it has been suggested that reptilian spermatozoa also experience minimal post-testicular maturation, this paradigm has recently been challenged by functional analysis of ejaculated spermatozoa from the Australian saltwater crocodile (Crocodylus porosus) (10). In this context, exposure to capacitation stimuli, which elevate intracellular levels of cyclic AMP (cAMP), promoted a significant enhancement of the motility profile recorded in crocodile spermatozoa. Notably, dilution in capacitation medium also enhances the post-thaw survival of cryopreserved crocodile spermatozoa (11). Conversely, crocodile spermatozoa are rendered quiescent upon incubation in bicarbonate-free media formulated to suppress the capacitation of eutherian spermatozoa (10). We contend that such changes may reflect physiological demands imposed by the transferal of sperm storage responsibilities from the male to the female reproductive tract, and the attendant need to alternatively silence and reactivate spermatozoa to enhance their longevity and fertilization competence, respectively. Nevertheless, the mechanistic basis of these opposing responses remains obscure, as does the identity of the proteins implicated in their regulation.

Here, we have used mass spectrometry-based proteomics to generate a comprehensive protein inventory of mature crocodile spermatozoa and subsequently explore signatures of capacitation via quantitative phosphoproteomic profiling strategies. Our data confirm that the phosphorylation status of the crocodile sperm proteome is substantially modified in response to capacitation stimuli, thus refuting the tenet that this phenomenon is restricted to the mammalian lineage and providing a framework for understanding the molecular changes in sperm physiology necessary for fertilization.

EXPERIMENTAL PROCEDURES

Chemicals and Reagents. Unless otherwise specified, chemical reagents were obtained from Sigma (St. Louis, MO). Anti-phosphoserine (P5747), phosphothreonine (P6623), phosphotyrosine (P5964), flotillin1 (F1180), CABYR (SAB2107035), and tubulin (T5168) antibodies were purchased from Sigma. Anti-phospho (Ser/Thr) PKA substrate antibody (9621) was from Cell Signaling (Danvers, MA). Anti-ZPBP2 (H00124626-B01) antibody was from Abnova (Taipei City, Taiwan). Anti-DNM3 (14737–1-AP) and SPATC1 (25861–1-AP) antibodies were from ProteinTech (Rosemont, IL). Anti-ZPBP1 (S1587) antibody was from Epitomics (Burlingame, CA). Anti-ACR (sc-46284) and ACRBP (sc-109379) were from Santa Cruz Biotechnology (Dallas, TX). Anti-AKAP4 (4BDX-1602) antibody was from 4BioDx (Lille, France). Anti-rabbit IgG-horseradish peroxidase (HRP) was purchased from Merck Millipore (Billerica, MA), and anti-mouse IgG and anti-goat IgG HRP were from Santa Cruz Biotechnology. All fluorescently labeled (Alexa Fluor) secondary antibodies were from Thermo Fisher Scientific (Waltham, MA). Fluorescein isothiocyanate (FITC) conjugated Pisum sativum agglutinin (PSA) (FL-1051) was from Vector Laboratories (Burlingame, CA).

Animals and Semen Collection. The study was undertaken at Koorana Crocodile Farm, QLD, Australia with the approval of the University of Queensland Animal Ethics Committee (SAS/361/10) and Queensland Government Scientific Purposes Permit (WISP09374911). Semen used throughout this study was collected by digital massage (12) from mature (3.0 m) saltwater crocodiles during the breeding season (November 2014, 2015).

Sperm Capacitation. The in vitro capacitation of crocodile spermatozoa was achieved via elevation of intracellular cAMP levels as previously described (13, 14). In brief, raw semen samples were diluted 1:4 into one of two modified formulations of Biggers, Whitten and Whittingham (BWW)¹ medium (15), namely: (1) noncapacitating BWW media control (NC) [comprising 120 mM NaCl, 4.6 mM KCl, 1.7 mM CaCl2·2H2O, 1.2 mM KH2PO4, 1.2 mM MgSO4·7H2O, 5.6 mM D-glucose, 0.27 mM sodium pyruvate, 44 mM sodium lactate, 5 U/ml penicillin, 5 mg/ml streptomycin and 20 mM HEPES buffer and 3 mg/ml BSA (pH 7.4, osmolality of 300 mOsm/kg)], or (2) capacitating BWW (CAP), an equivalent formulation to that of NC BWW, with additional supplementation of 25 mM NaHCO3, a phosphodiesterase inhibitor (pentoxifylline, 1 mM) and a membrane permeable cAMP analogue (dibutyryl cyclic AMP, dbcAMP, 1 mM). Following dilution, an aliquot of sperm was immediately assessed for viability, motility characteristics, acrosomal integrity, and phosphorylation status. The remainder of the sample was incubated at 30 °C for 120 min. At the completion of incubation, sperm suspensions were prepared for mass spectrometry analysis as described below. To substantiate capacitation-like changes in crocodile spermatozoa, a sub-population of these cells was assessed for phosphorylation of serine, threonine and tyrosine residues. Similarly, the spermatozoa were also assayed for overall levels of phospho-PKA substrates.

¹ The abbreviations used are: BWW, Biggers, Whitten, and Whittingham medium; ACR, Acrosin; AKAP4, A-kinase anchoring protein 4; CABYR, Calcium binding tyrosine phosphorylation regulated protein; cAMP, cyclic adenosine monophosphate; CCCP, carbonyl cyanide m-chlorophenyl hydrazone; DDA, data dependent acquisition; DNM, Dynamin; FDR, false discovery rate; FITC, Fluorescein isothiocyanate; FLOT1, Flotillin 1; GO, gene ontology; HILIC, hydrophilic interaction chromatography; HRP, Horseradish peroxidase; LC-MS/MS, liquid chromatography-tandem mass spectrometry; PKA, protein kinase A; PLA, proximity ligation assay; PSA, Pisum sativum agglutinin; Ser, Serine; SPATC1, Spermatogenesis and centriole associated 1; Thr, Threonine; Tyr, Tyrosine; TEAB, triethylammonium bicarbonate; ZPBP1, Zona pellucida binding protein 1; ZPBP2, Zona pellucida binding protein 2.

Comparative and Quantitative Sperm Proteome and Phosphoproteome Analysis. Preparations of crocodile spermatozoa (noncapacitated and capacitated) were subjected to membrane protein enrichment by dissolving in 100 µl of ice-cold 0.1 M Na2CO3 supplemented with protease (Sigma) and phosphatase inhibitors (Roche, Complete EDTA free). These suspensions were subjected to probe tip sonication at 4 °C for 3 × 10 s intervals before incubation for 1 h at 4 °C. Soluble proteins were isolated from membrane proteins by ultracentrifugation (100,000 × g for 90 min at 4 °C) (16). Membrane-enriched pellets and soluble proteins were dissolved in urea (6 M urea, 2 M thiourea), reduced using 10 mM DTT (30 min, room temperature), alkylated using 20 mM iodoacetamide (30 min, room temperature, in the dark), and subsequently digested with 0.05 activity units of Lys-C endoproteinase (Wako, Osaka, Japan) for 3 h at 37 °C. After Lys-C digestion, the solution was diluted in 0.75 M urea, 0.25 M thiourea with TEAB, and digested using 2% w/w trypsin (specificity for positively charged lysine and arginine side chains; Promega, Madison, WI) overnight at 37 °C in 500 mM triethylammonium bicarbonate (TEAB) and centrifuged at 14,000 × g for 30 min at 4 °C (17, 18). Peptides were desalted and cleaned up using a modified StageTip microcolumn (19). Fluorescent peptide quantification (Qubit

Assay; Thermo Fisher Scientific) was employed and 100 µg of each peptide sample was labeled using isobaric tag based methods (20), according to manufacturer's specifications (iTRAQ; SCIEX, Framingham, MA). Digestion and isobaric tag labeling efficiency was determined by NanoLC-MS/MS (described below). Samples were then mixed in a 1:1 ratio and phosphopeptides enriched using a multidimensional strategy employing a TiO2 pre-enrichment step followed by separation of multi- and mono-phosphorylated peptides post-fractionation using sequential elution from immobilized metal affinity chromatography (21). Nonmodified peptides were then subjected to offline hydrophilic interaction liquid chromatography (HILIC) before high resolution LC-MS/MS (18).

Tandem Mass Spectrometry (NanoLC-MS/MS) Quantitative Analyses. NanoLC-MS/MS was performed using a Dionex UltiMate 3000RSLC nanoflow HPLC system (Thermo Fisher Scientific). Membrane and soluble mono- and multiphosphorylated peptides, and nonmodified peptides (seven membrane enriched fractions and seven soluble fractions) were suspended in buffer A (0.1% formic acid) and directly loaded onto an Acclaim PepMap100 C18 75 µm × 20 mm trap column (Thermo Fisher Scientific) for pre-concentration and online desalting. Separation was then achieved using an EASY-Spray PepMap C18 75 µm × 500 mm column (Thermo Fisher Scientific), employing a linear gradient from 2 to 32% acetonitrile at 300 nl/min over 120 min. A Q-Exactive Plus MS System (Thermo Fisher Scientific) was operated in full MS/data dependent acquisition MS/MS mode (data-dependent acquisition). The Orbitrap mass analyzer was used at a resolution of 70,000 to acquire full MS with an m/z range of 390–1400, incorporating a target automatic gain control value of 1e6 and maximum fill times of 50 ms. The 20 most intense multiply charged precursors were selected for higher-energy collision dissociation fragmentation with a normalized collisional energy of 32. MS/MS fragments were measured at an Orbitrap resolution of 35,000 using an automatic gain control target of 2e5 and maximum fill times of 110 ms (18). Fragmentation data were converted to peak lists using Xcalibur (Thermo Fisher Scientific), and the HCD data were searched using Proteome Discoverer version 2.1 (Thermo Fisher Scientific) against the Archosauria crown group comprising birds and crocodilians within the UniProtKB database (Archosauria, downloaded on 31 January 2018, 622,090 sequences). Mass tolerances in MS and MS/MS modes were 10 ppm and 0.02 Da, respectively; trypsin was designated as the digestion enzyme, and up to two missed cleavages were allowed. S-carbamidomethylation of cysteine residues was designated as a fixed modification, as was modification of lysines and peptide N termini with the isobaric iTRAQ 8-plex label. Variable modifications considered were: acetylation of lysine, oxidation of methionine and phosphorylation of serine, threonine and tyrosine residues. Results from searches of membrane-enriched and soluble fractions were merged and interrogation of the corresponding reversed database was also performed to evaluate the false discovery rate (FDR) of peptide identification using Percolator based on q-values, which were estimated from the target-decoy search approach. To filter out target peptide spectrum matches (target-PSMs) over the decoy-PSMs, a fixed false discovery rate (FDR) of 1% was set at the peptide level (17). The dataset (Dataset S1) analyzed here has been deposited in the Mass Spectrometry Interactive Virtual Environment (MassIVE) database and is publicly accessible at: https://massive.ucsd.edu/ProteoSAFe/dataset.jsp?task=8acd6725da734f6f89bbd64460d03686.

SDS-PAGE and Immunoblotting. After incubation, crocodile spermatozoa were pelleted (400 × g, 1 min), washed in NC BWW media and re-centrifuged (400 × g, 1 min). The sperm pellet was re-suspended in SDS extraction buffer (0.375 M Tris pH 6.8, 2% (w/v) SDS, 10% (w/v) sucrose, protease inhibitor mixture), incubated at 100 °C for 5 min and equivalent amounts of protein (10 µg) were separated by SDS-PAGE (22). Gels were either stained with silver reagent or transferred onto nitrocellulose membranes (Hybond C-extra; GE Healthcare, Buckinghamshire, England, UK) (23). Membranes were blocked for 1 h in Tris buffered saline (TBS) containing 5% w/v skim milk powder. After rinsing with TBS containing 0.1% v/v Tween-20 (TBST), membranes were sequentially incubated with appropriate primary antibody at 4 °C overnight and its corresponding HRP-conjugated secondary antibody for 1 h. Following three washes in TBST, labeled proteins were detected using enhanced chemiluminescence reagents (GE Healthcare).

Immunocytochemistry. Spermatozoa were fixed in 4% paraformaldehyde, washed three times with 0.05 M glycine in PBS and then applied to poly-L-lysine coated glass coverslips. The cells were permeabilized with 0.2% Triton X-100 and blocked in 3% BSA/PBS for 1 h. Coverslips were then washed in PBS and incubated in a humidified chamber with appropriate primary and secondary antibodies (1 h at 37 °C). Coverslips were washed (3 × 5 min) in filtered PBS after each antibody incubation, before mounting in antifade reagent comprising 10% Mowiol 4–88 (Merck Millipore), 30% glycerol and 2.5% 1,4-diazobicyclo-(2.2.2)-octane (DABCO) in 0.2 M Tris (pH 8.5). Sperm cells were then examined by confocal microscopy (Olympus, Nagano, Japan).

Duolink Proximity Ligation Assay. In situ proximity ligation assays (PLA) were conducted according to manufacturer's instructions (OLINK Biosciences, Uppsala, Sweden; as described by (24)) using combinations of either anti-CABYR and anti-phosphoserine or anti-SPATC1 and anti-phosphoserine antibodies. Coverslips were mounted as previously described for immunocytochemistry and visualized by fluorescence microscopy (Carl Zeiss, Sydney, NSW, Australia). If target proteins resided within a maximum distance of 40 nm, this reaction resulted in the production of discrete fluorescent foci that appeared as red spots (25). PLA fluorescence was quantified for 200 cells per slide.

Experimental Design and Statistical Rationale. All MS analyses were performed in duplicate using ejaculated semen samples collected from different crocodiles (n = 2). Similarly, immunoblotting analyses to confirm the phospho-serine, -threonine, -tyrosine, and -PKA status of crocodile spermatozoa were conducted on the same two semen samples (n = 2). However, immunocytochemical, PLA, and functional experiments to substantiate crocodile sperm proteomic data were performed in triplicate using individual biological samples differing from those employed for MS analyses (n = 3). Where appropriate, graphical data are presented as means ± S.E. (n = 3), with statistical significance being determined by analysis of variance (ANOVA).

RESULTS

Global Proteomic Analysis of Crocodile Spermatozoa. Before analysis, the quality of spermatozoa in each ejaculate was assessed via immunolabeling with markers of acrosomal (PSA, green) and nuclear integrity (DAPI, blue) and counterstaining of the flagellum with anti-tubulin antibodies (red) (Fig. 1A). A portion of the sperm sample from each animal was subjected to standard cell lysis and the proteins resolved by SDS-PAGE to confirm broadly equivalent proteomic profiles, which differed substantially from equivalent lysates of mouse and human spermatozoa (Fig. 1B). The complexity of the remaining samples was reduced via fractionation into membrane-enriched and soluble cell lysates, both of which were subjected to mass spectrometry analysis. Notwithstanding the incomplete annotation of the crocodile genome, and the attendant need for sequence alignment to

be performed against the Archosauria crown group, our experimental strategy identified a complex proteomic signature comprising a total of 1119 proteins (supplemental Table S1). Among these proteins, an average of 4.6 peptide matches (encompassing 3.4 unique peptide matches) were generated per protein, representing an average peptide coverage of 11% per protein (supplemental Table S1).

FIG. 1. Assessment of crocodile sperm samples. A, The integrity of crocodile spermatozoa isolated from ejaculated semen samples was assessed via immunolabeling of cells with acrosomal (FITC-conjugated PSA, green), nuclear (DAPI, blue) and flagellar (tubulin, red) markers. B, A portion of the spermatozoa from each of two animals was subjected to standard cell lysis before extracted proteins were resolved by SDS-PAGE and silver stained to confirm broadly equivalent proteomic profiles. For comparative purposes, crocodile sperm lysates were resolved alongside equivalent lysates prepared from mouse and human spermatozoa.

Provisional interrogation of this global crocodile sperm proteome on the basis of Gene Ontology (GO) classification (26) returned dominant terms of catalytic activity, binding, structural molecule activity, and transporter activity among the top GO molecular function categories when ranked on the basis of number of annotated proteins (Fig. 2A, supplemental Table S2). Notable enrichment was also identified in the broad GO biological process categories of cellular process, metabolic process, single organism process, biological regulation and localization (Fig. 2B, supplemental Table S2). Additional categories of direct relevance to sperm physiology/function included: developmental process, reproductive process, cell motility and cell adhesion (Fig. 2B, supplemental Table S2). The dominant GO cellular compartments represented in the crocodile sperm proteome included the cell, organelle, macromolecular complex, and membrane, with some 438, 201, 174, and 136 proteins mapping to each of these respective categories (Fig. 2C, supplemental Table S2).

Our application of isobaric peptide labeling afforded the opportunity to investigate the relative abundance of each protein in opposing populations of noncapacitated and capacitated spermatozoa. As might be expected of a cell that lacks the capacity to engage in protein synthesis, the majority (90%) of crocodile sperm proteins were detected at equivalent levels irrespective of capacitation status. Among the remaining proteins, we documented an apparent reduction in the abundance of 53 proteins and conversely, an apparent increase in abundance of 71 proteins (supplemental Table S2), possibly reflecting differential partitioning and/or translocation between intracellular domains upon exposure to capacitation stimuli, thus influencing the efficiency of their extraction.

Conservation of the Crocodile Sperm Proteome. To begin to explore the extent of conservation between the crocodile sperm proteome and that of more widely studied mammalian species, we elected to survey the published proteome compiled for human spermatozoa, a comprehensive dataset comprising 6199 of the predicted 7500 proteins that constitute this cell type (27). To facilitate this comparison, all Archosauria UniProt accession numbers were manually converted into accession numbers equating to their human homologues (supplemental Table S1). During this annotation, 232 proteins were unable to be assigned accession numbers because of ambiguous nomenclature (e.g. proteins remaining as uncharacterized or failure to unequivocally determine the relevant protein isoform) and an additional 82 potentially redundant protein identifications were detected among our initial crocodile proteomic inventory (e.g. apparently equivalent proteins that have been assigned different gene names within the Archosauria database). Of the remaining 805 proteins, homologues of 675 (84%) were identified in the compiled human sperm proteome (supplemental Table S1; Fig. 3A). Although we remain cognizant that neither sperm proteomic database is yet to be fully annotated, functional classification of the 130 putatively nonconserved crocodile sperm proteins revealed notable enrichment in the molecular function category of catalytic activity (Fig. 3C) and the biological process of metabolism (Fig. 3D). Moreover, a substantial number of these proteins mapped to the membrane domain (Fig. 3E), suggestive of differing specialization of the surface and possibly the metabolic characteristics of these cells. In expanding this

S 6 2 Molecular & Cellular Proteo mics 18.13 Proteo mic Profiling of Crocodile Sper matozoa loadedf m o dfr e d a o nl w o D t :/www l .o g/ or e. n nli o p c m w. w w p:// htt t IV NEWCASTLE(CAUL on tembe 5 2019 1 0 2 5, er b m e pt e S n o L) U A C E( L T S A C W E N F O V NI U at

F IG. 2.G O annotation of crocodile sper m proteo me. The co mplete inventory of 1119 identified proteins was subjected to provisional Gene Ontology ( G O) classification using the universal protein kno wledgebase ( Uni Prot K B) functional annotation tools (26). Proteins were curated on the basis of (A ) G O molecular function, (B ) G O biological process, and (C ) G O cellular co mpart ment. analysis to include co mparison with the published rooster (A C R BP), zona pellucida binding protein 1 (ZP BP1) and 2 sper m proteo me co mprising 822 proteins (28), we were only ( Z P B P2), flotillin 1 (FL OT1), dyna min 3 ( D N M3); sper m flagel- able to identify ho mologues of 264 crocodile sper m proteins lu m: protein kinase A anchoring protein 4 ( A K A P4), calciu m (supple mental Table S1; Fig. 3B ). Ho wever, o wing to the binding tyrosine phosphorylation regulated protein ( C A B Y R), modest annotation of these datasets we caution against in- and sper matogenesis and centriole associated 1 (for merly terpreting these data as evidence for the divergence of the speriolin; S P AT C1)] (Fig. 4). Moreover, we failed to docu ment sper m proteo mes in these species. any for m of overt capacitation-associated change in protein To validate these in silico co mparisons, nine candidate abundance or distribution bet ween subcellular do mains (Fig. proteins with well characterized roles in the sper matozoa of 4). One possible exception was that of acrosin, a highly con- eutherian species were assessed for their presence and lo- served proteinase that do minates the profile of acroso mal calization in crocodile sper matozoa. This analysis confir med proteins found in ma m malian sper matozoa. 
Thus, although the presence of each of the nine targete d proteins in croco dile acrosin was present throughout the peri-nuclear do main of sper matozoa and de monstrated labeling patterns consistent both noncapacitated and capacitated crocodile sper matozoa, with those expected based on studies of eutherian sper ma- this localization pattern was acco mpanied by a particularly tozoa [i.e.sper m head: acrosin ( A C R), acrosin binding protein intense foci of posterior hea d la beling in nonca pacitate d cells.
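The protein counts quoted in the conservation analysis can be cross-checked with simple arithmetic. The sketch below is illustrative only: the variable names are hypothetical and the numbers are those stated in the text, not values re-derived from the supplemental tables.

```python
# Counts quoted in the text; variable names are illustrative assumptions.
total_identified = 1119       # crocodile sperm proteins identified
ambiguous = 232               # could not be assigned a human accession number
redundant = 82                # potentially redundant identifications
human_homologues = 675        # found in the compiled human sperm proteome
reduced, increased = 53, 71   # proteins changing in abundance after capacitation

# Proteins left for the human comparison after the two exclusions.
comparable = total_identified - ambiguous - redundant
# Proteins with no homologue in the compiled human sperm proteome.
nonconserved = comparable - human_homologues
# Share of comparable proteins conserved in human, as a whole percentage.
pct_conserved = round(100 * human_homologues / comparable)
# Share of all identified proteins with no reported abundance change.
pct_unchanged = 100 * (total_identified - reduced - increased) / total_identified

print(comparable, nonconserved, pct_conserved)
print(f"{pct_unchanged:.1f}% of proteins unchanged")
```

Under these inputs the comparable set, the nonconserved set and the conserved percentage match the 805, 130 and 84% reported above. The unchanged share comes out slightly below the rounded 90% quoted for proteins detected at equivalent levels.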

Molecular & Cellular Proteomics 18.13: Proteomic Profiling of Crocodile Spermatozoa

FIG. 3. Conservation of the crocodile sperm proteome and annotation of proteins not curated within the compiled human sperm proteome. A, B, Venn diagrams illustrating the conservation of crocodile sperm proteins with those reported for (A) human (27) and (B) rooster spermatozoa (28). C–E, Gene Ontology (GO) analysis was performed to assess the functional classification of the 130 crocodile sperm proteins that are not currently annotated within the compiled human sperm proteome. For this purpose, proteins were curated on the basis of (C) GO molecular function, (D) GO biological process, and (E) GO cellular compartment using the universal protein knowledgebase (UniProtKB) functional annotation tools (26).

The loss of this labeling in capacitated cells coincided with a 1.77-fold increase in acrosin abundance in the membrane-enriched sperm fraction, and a reciprocal 1.56-fold decrease in the soluble fraction (supplemental Table S1).

FIG. 4. Validation of the conservation of crocodile sperm proteins. To validate our proteomic data, 9 proteins were targeted for assessment of their presence and localization in crocodile spermatozoa. These candidates included proteins known to reside in either the sperm head [acrosin (ACR), acrosin binding protein (ACRBP), flotillin 1 (FLOT1), zona pellucida binding protein 1 (ZPBP1) and 2 (ZPBP2), and dynamin 3 (DNM3)] or flagellum [protein kinase A anchoring protein 4 (AKAP4), calcium binding tyrosine phosphorylation regulated protein (CABYR), and spermatogenesis and centriole associated 1 (formerly speriolin; SPATC1)]. For this purpose, populations of noncapacitated and capacitated crocodile spermatozoa were fixed and permeabilized before labeling with appropriate antibodies (red). The cells were then counterstained with DAPI (blue) and examined by confocal microscopy. These experiments were replicated on independent samples from three different crocodiles, and representative labeling patterns are shown. Scale bar, 5 µm.

Analysis of Capacitation-associated Protein Phosphorylation of Crocodile Spermatozoa—In seeking to refine our analysis of proteins potentially involved in the regulation of crocodile sperm function, we performed phosphopeptide-enrichment in tandem with isobaric-tag based labeling to identify signatures of capacitation-associated signaling within these cells. Before performing this analysis, sperm lysates were subjected to immunolabeling with anti-phospho-serine, -threonine, -tyrosine, and -PKA antibodies, confirming that capacitation stimuli elicited equivalent responses in the spermatozoa of crocodiles used in this study (n = 2; Fig. 5) and those we have reported in previous studies (n = 5; (10)). Importantly, this consistency also extended to the relative abundance of phosphorylated peptides detected in each biological replicate (supplemental Table S3).

FIG. 5. Assessment of equivalent physiological response to capacitation stimuli. To confirm that capacitation stimuli elicited equivalent responses in the spermatozoa of both crocodiles used in the phospho-enrichment aspect of this study, cell lysates were prepared from noncapacitated (Non-cap) and capacitated (Cap) populations of crocodile spermatozoa. Lysates were resolved by SDS-PAGE before being subjected to immunoblotting with (A) anti-phosphoserine (pS), (B) anti-phosphothreonine (pT), (C) anti-phosphotyrosine (pY) or (D) anti-phospho-PKA substrate (pPKA) antibodies.

Using a threshold of 1.2 fold-change, we identified 174/269 (65%; supplemental Table S3) phosphopeptides that experienced differential phosphorylation in capacitated versus noncapacitated spermatozoa (Table I). Among these peptides, 31 were characterized by the presence of multiple phosphorylation sites and similarly, several additional candidates mapped to proteins possessing multiple phosphopeptides. Thus, 22 proteins were identified as being targeted for multiple (de)phosphorylation events, with prominent examples including: fibrous sheath CABYR protein, centrosomal protein of isoform B, phosphodiesterase, outer dense fiber 2 and sulfotransferase family cytosolic 1B member 1 isoform C, each with as many as 17, 6, 5, 3, and 3 differentially phosphorylated residues, respectively (Table I). Taking these events into consideration, a total of 126 unique proteins were identified as being differentially phosphorylated in capacitated versus noncapacitated spermatozoa (Table I). Among these phosphorylation events, a notable bias was detected for serine residues (Fig. 6A, Table I), with this amino acid featuring as the target for 80% of all differentially phosphorylated sites (Fig. 6B, Table I). Thereafter, threonine was the next most common phospho-target, with only relatively few tyrosine residues being identified as differentially phosphorylated in our analysis (Fig. 6).

Overall, we noted a trend for proportionally more peptides experiencing increased, as opposed to reduced, phosphorylation in capacitated spermatozoa versus that of their noncapacitated counterparts (77 versus 55, respectively; Table I). The former of these capacitation-associated changes included several phosphopeptides that more than doubled their basal levels documented in noncapacitated spermatozoa (Fig. 7, Table I). Among the most dominant of these were peptides mapping to CABYR, a protein that was common to both membrane-enriched and soluble sperm fractions; and featured phosphopeptides that showed reduced phosphorylation in capacitated spermatozoa (Fig. 7, Table I). Alternatively, phosphopeptides mapping to proteins such as CEP57, TEKT2, ODF5, LMNTD2, and NDST1 were characterized by an apparent halving of their abundance in capacitated cells (Fig. 7, Table I).

Although the quantitative changes documented among most of these peptides are likely attributed to genuine changes in phosphorylation status driven by capacitation-associated signaling cascades, we cannot unequivocally conclude that all such changes are strictly tied to these events. Rather, some 41 phosphopeptides were identified that displayed reciprocal profiles of accumulation into different sub-cellular fractions depending on the capacitation status of the spermatozoa from which they were extracted (Table I), thus supporting our previous supposition that their parent proteins may have undergone capacitation-associated translocation between different sub-cellular domains.

Gene Ontology Analysis of Differentially Phosphorylated Proteins—The majority of proteins that experienced increased phosphorylation during capacitation mapped to the dominant GO biological process categories of cellular process, metabolic process, and biological regulation (Fig. 8A), with additional sub-categories of direct relevance to sperm capacitation including: cellular response to stimulus, regulation of signal transduction, cell surface receptor signaling pathway, regulation of cAMP-dependent protein kinase (i.e. PKA), cilium movement, and reproductive process (Fig. 8A). Interestingly, the opposing subset of crocodile sperm proteins that experienced reduced phosphorylation following induction of capacitation mapped to similar biological process categories (i.e. cellular process, metabolic process, capacitation, flagellated sperm motility, response to stress, and signal transduction) suggesting that regulation of sperm activity is tightly coupled to the opposing action of cellular kinases and phosphatases.

Validation of Phosphorylated Proteins—To begin to validate the phosphorylation of crocodile sperm proteins, we employed a proximity ligation assay with paired anti-phosphoserine and anti-CABYR or anti-SPATC1 antibodies. This strategy confirmed that CABYR was targeted for serine phosphorylation throughout the mid and principal-piece of the crocodile sperm flagellum (Fig. 9A); cellular domains consistent with those that harbor the CABYR protein itself (Fig. 3). Equivalent results were also obtained for PLA labeling of phosphoserine/SPATC1 albeit restricted to the sperm neck coinciding with the distribution of this centriole associated protein. The specificity of PLA labeling was supported by the absence of fluorescence, save for a discrete focus of weak staining in the anterior region of the sperm head, in negative controls featuring the use of anti-phosphoserine antibodies alone or in combination with a nonphosphorylated target (i.e. ZPBP1).
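The grouping of phosphopeptides used in Table I (changed in the membrane-enriched fraction only, in the soluble fraction only, in both, or reciprocally) can be expressed as a small classification rule around the 1.2 fold-change threshold. The following is a minimal sketch rather than the authors' pipeline. The function names are hypothetical, and it assumes a signed convention in which an x-fold decrease is written as -x and an unchanged or undetected fraction is `None`. The example values are taken from rows of Table I.

```python
from typing import Optional

THRESHOLD = 1.2  # fold-change cut-off quoted in the text


def direction(fc: Optional[float]) -> Optional[str]:
    """Classify one fraction as 'increased', 'decreased' or None (no change).

    Assumes signed fold changes: +x for an x-fold increase, -x for an
    x-fold decrease, None when no change was recorded in that fraction.
    """
    if fc is None or abs(fc) < THRESHOLD:
        return None
    return "increased" if fc > 0 else "decreased"


def classify(membrane_fc: Optional[float], soluble_fc: Optional[float]) -> str:
    """Assign a peptide to one of the Table I style groups."""
    m, s = direction(membrane_fc), direction(soluble_fc)
    if m is None and s is None:
        return "unchanged"
    if s is None:
        return f"membrane-enriched only, {m}"
    if m is None:
        return f"soluble only, {s}"
    if m == s:
        return f"{m} in both fractions"
    return f"reciprocal ({m} in membrane-enriched, {s} in soluble)"


# (membrane-enriched, soluble) fold changes for a few Table I entries.
rows = {
    "A-kinase anchor protein 4": (1.87, None),
    "Aconitate hydratase, mitochondrial": (-1.35, None),
    "Axin-1": (None, 5.58),
    "Dynamin 3": (1.73, 1.23),
}
for name, (m, s) in rows.items():
    print(f"{name}: {classify(m, s)}")

# Share of quantified phosphopeptides passing the threshold (174 of 269).
pct_differential = round(100 * 174 / 269)
print(f"{pct_differential}% differentially phosphorylated")
```

Treating the threshold as inclusive keeps peptides reported at exactly 1.20-fold within the differential set, which is consistent with several entries in Table I.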

TABLE I
Differential phosphorylation of peptides in capacitated versus non-capacitated crocodile spermatozoa

Columns: Phosphorylated peptide | Phosphorylated residue(s) | Protein name | Uniprot ID | Fold change (Cap/Non-cap), membrane-enriched | Fold change (Cap/Non-cap), soluble
Rows are grouped by change, with the number of peptides in parentheses.

MEMBRANE-ENRICHED ONLY

Increased (19)
R.DSHFEKAGTSQSSR.S | pS | A-kinase anchor protein 4 | A0A151M6A1 | 1.87 | –
R.SEHALQSEQR.K | pS | Coiled-coil domain-containing protein 40 | A0A151NKJ4 | 1.71 | –
K.AQQTYSLISLGGETWINR.R | 2 pS | Fibrous sheath CABYR-binding protein | A0A151PDZ5 | 1.28 | –
R.RGSQQPTGQESR.R | pS | Fibrous sheath CABYR-binding protein | A0A151PDZ5 | 2.02 | –
R.RTGKVMASHSQQTR.E | pS | Fibrous sheath CABYR-binding protein | A0A151PDZ5 | 2.49 | –
R.SEEYAEQLTVQLAEKDSYVAEALSTLESWR.S | pS | Outer dense fiber protein 2 (Fragment) | A0A087RGV2 | 2.99 | –
K.TLTDVVSPGPCYFVDPHVSR.F | pS | Outer dense fiber protein 3-like protein 1 | A0A151M9U6 | 1.79 | –
K.STGQNDGDEPQSAEHIESR.T | pS | Perilipin | A0A091HQZ3 | 1.29 | –
R.RGSADQGYPAR.R | pS | Phosphatidylinositol 4,5-bisphosphate 5-phosphatase A isoform A | A0A151NDF4 | 1.66 | –
K.DWEDDSDEDLSNFDR.F | pS | Prostaglandin E synthase 3 | A0A151PBS0 | 1.52 | –
R.RASVCAEAYNPDEEEDDAESR.I | pS | Protein kinase cAMP-dependent type II regulatory subunit beta | H0YY13 | 1.31 | –
R.RNSSISGTATPR.H | pS | RNA-binding protein MEX3D | A0A151NGQ2 | 1.37 | –
R.SRSPSPLPLR.S | pS | Spermatogenesis associated 18 | U3I5S0 | 1.36 | –
R.SRSPSPLPLR.S | pS | Spermatogenesis associated 18 | U3I5S0 | 1.26 | –
R.LQAEAPVQTSGFDLLERLSPLSQTESQTQR.L | pT/S | Sulfotransferase family cytosolic 1B member 1 isoform C | A0A151MJ41 | 1.38 | –
R.NIYNPPESNASVIMDYNEEGQLLQTAFLGTSR.R | pS pY | Teneurin transmembrane protein 3 | U3KAS7 | 1.40 | –
K.RAHTPTPGIYMGRPTHSGGGGGGGAGR.R | pT | Transformer 2 alpha homolog | F1NPM7 | 1.22 | –
R.RRSPSPYYSR.GY | pS | Transformer 2 alpha homolog | F1NPM7 | 1.40 | –
R.QASTDAGTAGALTPQHVR.A | pS | Yorkie isoform X1 | A0A0Q3MCH6 | 1.21 | –

Decreased (28)
K.CKSQFTITPGSEQIR.A | pS | Aconitate hydratase, mitochondrial (Aconitase) | A0A093TJW6 | -1.35 | –
K.EVNQQSSNRNEFCHER.G | pS | A-kinase anchor protein 7 | A0A151P6K4 | -1.32 | –
R.SLAEASGSRPAGTR.N | pS | A-kinase anchor protein 9 | A0A151M6A1 | -1.30 | –
K.AKTSPDAFIQLALQLAHYR.D | pT/S/Y | Carnitine palmitoyltransferase 1A | U3J8S7 | -1.31 | –
R.TPEELDDSDFETEDFDVR.S | pS | Catenin alpha-2 isoform A | A0A151NQU8 | -1.20 | –
R.SLSPSPPPSR.Y | pS | Centrosomal protein of | A0A151N2J7 | -1.22 | –
R.SSSPLSSTLRSPSHSPER.A | pS | Centrosomal protein of isoform B | A0A151PAH0 | -1.28 | –
R.RLQEQLLNVASEDDITTSR.K | pS | Centrosome and spindle pole-associated protein 1-like | A0A151MN72 | -1.30 | –
K.QGISNTVPSGELLPSPGVLR.L | pS | Cilia- and flagella-associated protein 57 | A0A151MKT8 | -1.61 | –
R.ALAAMEQDSPVQR.I | pS | Cilia- and flagella-associated protein 58 | A0A151NVA7 | -1.45 | –
R.QDPSPTVSSEGVGAR.A | pS | Diphthine methyltransferase isoform B | A0A151MF01 | -1.25 | –
R.SAATSGAGSTTSGVVSGSLGSR.E | pS | E3 ubiquitin-protein ligase HUWE2 | A0A151NCW7 | -1.47 | –
R.GSGTASDDEFENLR.I | pS | E3 ubiquitin-protein ligase HUWE3 | A0A151NCW7 | -1.46 | –
K.AQQTYSLISLGGETWINRR.T | pS | Fibrous sheath CABYR-binding protein | A0A151PDZ5 | -1.23 | –
K.VMASHSQQTRESWIQEFR.V | pS/T | Fibrous sheath CABYR-binding protein | A0A151PDZ5 | -1.33 | –
K.AEEPAPVPVEKPPEPAMSELTVGINGFGRIGR.L | pT | Glyceraldehyde-3-phosphate dehydrogenase | A0A151M7J4 | -1.22 | –
R.QDSFPDENHLSR.K | pS | Intraflagellar transport 43-like protein isoform B | A0A151MLI4 | -1.41 | –
R.ADSSDTDLEIEDAER.S | pS | Leucine-rich repeat-containing protein 74A isoform A | A0A151MLE8 | -1.27 | –
K.APASPPPTSGLWTTQR.D | pS | Long-chain-fatty-acid–CoA ligase ACSBG1 | A0A151NWJ6 | -1.31 | –
R.VDHGAEIITQSPGR.S | pS | Microtubule-associated protein | A0A151N9F1 | -1.23 | –
R.IRTPEQIPSPVNTYLTEEDLFHR.K | pT pS | MYCBP-associated protein isoform B | A0A151N2T7 | -1.79 | –
R.EVDNLTLTPSDSQDDVR.S | pS | Pericentrin isoform A | A0A151NAF1 | -1.43 | –
K.TSPETSGIFSGEDFPIIR.F | pT/S | Regulator of G-protein signaling protein-like isoform B | A0A151MIB2 | -1.46 | –
R.SASGLLEGLSPLVSEQDLSTIQPLIR.Y | pS pS/T | TAO kinase 2 (Fragment) | A0A151NQE0 | -1.40 | –
K.TQYASAESQR.S | pS | Tektin-1 | A0A151N5N4 | -1.38 | –
R.SASHQIR.Q | pS | Tektin-2 isoform B | A0A151PGB9 | -2.87 | –
R.SSGASSSSLNLIR.W | pS | Testis anion transporter 1 (Solute carrier family 26 member 8) | A0A151N8J2 | -1.40 | –
K.RLQYVQSELR.L | pS | Uncharacterized protein | A0A151P2B6 | -1.67 | –

SOLUBLE ONLY

Increased (33)
K.QDAANLYHHK.H | pY | Axin-1 | A0A093PEG4 | – | 5.58
R.RTSMGGTQQQFVEGVR.M | pS | Catenin beta 1 | G1NGP2 | – | 1.37
R.RHGLSTSSLR.AD | pS | Centrosomal protein of 135 kDa (Fragment) | A0A091F4F0 | – | 1.33
R.SSSPLSSTLRSPSHSPER.A | pS | Centrosomal protein of isoform B | A0A151PAH0 | – | 1.54
R.RASPRPSSSISFRPAAER.A | pS | Coiled-coil domain-containing protein 136-like | A0A151M3U3 | – | 1.34
R.HADHGALTLGSGSATTR.L | pT/S | E3 ubiquitin-protein ligase HUWE1 | A0A151NCW7 | – | 1.26
R.SAATSGAG-TTSGVVSGSLGSR.E | pS | E3 ubiquitin-protein ligase HUWE2 | A0A151NCW7 | – | 1.31
K.GQQTSLIWSR.KQ | pS | Fibrous sheath CABYR-binding protein | A0A151PDZ5 | – | 1.20
K.VMASHSQQTRESWIQEFR.V | pS | Fibrous sheath CABYR-binding protein | A0A151PDZ5 | – | 3.91
K.VMASHSQQTRESWIQEFR.V | pS/T | Fibrous sheath CABYR-binding protein | A0A151PDZ5 | – | 1.94
K.FGNLQIEESR.R | pS | Fibrous sheath-interacting protein 3 | A0A151MNU1 | – | 1.66
R.EINQSETNVTNEIIR.T | pS | Fibrous sheath-interacting protein 4/5 | A0A151MNU1 | – | 2.35
R.GEPNVSYICSR.Y | pY | Glycogen synthase kinase 3 beta | A0A151MD62 | – | 1.48
RK.NAEPEQSHSNTSTLTER.E | pS | GTPase-activating Rap/Ran GAP domain-like 1 protein transcript variant 4 | G0ZS69 | – | 1.45
K.LFPIGSSTSSIQGDHPQGR.R | pS | Lamin tail domain-containing protein 2 isoform A | A0A151NXE0 | – | 1.24
R.GRYDSQVALR.G | pS | Myeloid leukemia factor 3 | A0A151NB71 | – | 1.26
R.HGESAWNLENR.F | pS | Phosphoglycerate mutase (EC 5.4.2.11) (EC 5.4.2.4) | A0A0Q3PMK9 | – | 1.49
R.THNGESVSYLFSHVPL.- | pS | Phosphoribosyl pyrophosphate synthetase 2 | U3I8U7 | – | 1.66
R.RFSEGTSADR.E | pS | Proteasome 26S subunit, ATPase 6 | F1NCS8 | – | 1.50
R.YHGHSMSDPGVSYR.T | pS | Pyruvate dehydrogenase E1 component subunit alpha | A0A151N056 | – | 1.55
K.IGPQQPSGVAPGAGSR.H | pS | Ras-related protein Rab-2B | A0A151P513 | – | 1.46
R.RNSAPVSVSAVR.T | pS | Rho GTPase-activating protein 31 | A0A0Q3MRE9 | – | 1.27
K.SASSISLFASR.E | pS | Sperm-tail PG-rich repeat-containing protein 2 isoform C | A0A151NMF5 | – | 1.44
R.SASGLLEGLSPLVSEQDLSTIQPLIR.Y | 2 pS pS/T | TAO kinase 2 (Fragment) | A0A151NQE0 | – | 2.45
K.LHEVALNTGPDSSCGLATAGFR.T | pS | Tektin-4 | A0A151N2D8 | – | 1.67
K.GQSIHLLNGR.K | pS | Transmembrane protein 41A | A0A151NH62 | – | 1.55
K.FWEVISDEHGIDIAGNYYGGASLQLER.I | pS | Tubulin beta chain | A0A0Q3U5V4 | – | 1.65
K.FWEVISDEHGIDPSGNYVGDSDLQLER.I | pS/Y | Tubulin beta-4 chain (Beta-tubulin class-III) | P09652 | – | 1.20
K.HAFSLHQLQNDIR.I | pS | Ubiquitinyl hydrolase 1 (EC 3.4.19.12) | A0A151MCT8 | – | 1.47
R.NSFQNVLEPDITR.V | pS | Uncharacterized protein | A0A151P8I0 | – | 1.29
R.VHSFLSSCLPHR.K | pS | Uncharacterized protein | A0A151MIN4 | – | 1.88
K.STDESPYTPPSDSQR.M | pT | WD repeat-containing protein 17 isoform C | A0A151PB29 | – | 1.36
R.DTVISLSDVQVR.R | pS | WD repeat-containing protein 78 | A0A151NJC1 | – | 1.42

Decreased (18)
K.AQQTYSLISLGGETWINRR.T | 2 pS | Fibrous sheath CABYR-binding protein | A0A151PDZ5 | – | 1.27
K.IESEGGLQLLQRIYQLRK.D | pY | Protein SERAC1 (Fragment) | A0A091W9Y3 | – | 1.27
K.SESMGNTSPRR.S | pS | Lamin tail domain-containing protein 2 isoform A | A0A151NXE0 | – | 2.71
K.SETSAFGAPSQNSLGAVSNAETQR.R | 2 pS | Phosphodiesterase (EC 3.1.4.-) | A0A151LZS8 | – | 1.20
K.YISLCRSEHALQSEQR.K | pS/Y | Coiled-coil domain-containing protein 41 | A0A151NKJ4 | – | 1.33
R.AGLSQLCDSSDEEQQDTQPGPR.E | pS | Nucleolar protein with MIF4G domain 1 | U3JWQ2 | – | 1.44
R.EPSLHNVEELPPSR.R | pS | Fibroblast growth factor (FGF) | A0A151P6U2 | – | 1.35
R.HMFYHDLQVRPEDHALLMSDPPLSPTTNR.E | 2 pS | Actin-like protein 9 | A0A151NXK5 | – | 1.28
R.LQYVQSELR.L | pS | Uncharacterized protein | A0A151P2B6 | – | 1.23
R.LSPLSQTESQTQR.L | pS | Sulfotransferase family cytosolic 1B member 1 isoform C | A0A151MJ41 | – | 1.32
R.QVQDTQQLLER.A | pT | Laminin subunit gamma-2 (Fragment) | A0A091Q153 | – | 1.91
R.SQSSAQFLSGDQEPWAFR.G | pS | Meningioma expressed antigen 5 (hyaluronidase) | H0ZI42 | – | 1.22
R.SSGASSSSLNLIR.W | pS | Testis anion transporter 1 (Solute carrier family 26 member 8) | A0A151N8J2 | – | 1.33
R.VISQEAIGLQSR.H | 2 pS | Fibrous sheath CABYR-binding protein | A0A151PDZ5 | – | 1.43
R.VISQEAIGLQSR.H | pS | Fibrous sheath CABYR-binding protein | A0A151PDZ5 | – | 1.59
R.YDSQVALR.G | pS | Myeloid leukemia factor 6 | A0A151NB71 | – | 1.60
R.YHGHSMSDPGISYR.T | pS | Pyruvate dehydrogenase E1 component subunit alpha (EC 1.2.4.1) | A0A1D5PEH3 | – | 1.63
R.TGKVMASHSQQTR.E | pS | Fibrous sheath CABYR-binding protein | A0A151PDZ5 | – | 3.18

CHANGED IN BOTH MEMBRANE-ENRICHED AND SOLUBLE FRACTIONS

Increased in both fractions (25)
K.DGSGQHVDVSPTSQR.L | pS | Aconitate hydratase, mitochondrial (Aconitase) (EC 4.2.1.-) | A0A1D5NWW1 | 1.31 | 1.28
R.ASSPGYIDSPTYSR.Q | pS | Actin binding LIM protein family member 3 | U3K2G6 | 1.40 | 1.53
K.RNIQQYNSFVSLSV.- | pS | cAMP-dependent protein kinase type I-alpha regulatory subunit | R0L5G8 | 1.86 | 1.78
K.CSPSGHLNTQPHYR.L | pS | Centrosomal protein of isoform B | A0A151P2B4 | 1.57 | 2.34
R.RLYGGSQSSR.K | pS | Doublecortin domain-containing protein 2C isoform A | A0A151N849 | 2.94 | 1.73
R.RSPPPSPSTQR.R | pS | Dynamin 3 | A0A151NTM6 | 1.73 | 1.23
R.SHHAAGAAPAPTPAAR.A | pS | E3 ubiquitin-protein ligase HUWE1 | A0A151NPY6 | 2.06 | 1.20
R.ASSTTPEHDATR.S | pS | EF-hand calcium-binding domain-containing protein 3 isoform C | A0A151PBR5 | 1.67 | 1.61
K.RLSSIGAENTEENRR.W | pS | Fructose-bisphosphate aldolase (EC 4.1.2.13) | A0A151MLN4 | 1.32 | 1.31
R.HCGGSHTITYPYR.H | pS | Glutamine-rich protein 2 | A0A151N352 | 1.39 | 1.61
R.SASLVEESR.I | pS | Hydrocephalus-inducing protein-like protein isoform B | A0A151MSN2 | 1.21 | 1.23
R.DGHSSVEDAR.A | pS | Interferon-stimulated exonuclease-like 2 | A0A151P910 | 1.28 | 1.30
K.SESMGNTSPR.R | pS | Lamin tail domain-containing protein 2 isoform A | A0A151NXE0 | 1.91 | 1.49
R.LLLKPHIQSXEDLQLILELLEK.M | pS | Malignant fibrous histiocytoma-amplified sequence 1 | A0A0Q3PZZ6 | 1.79 | 3.14
R.TASNEHLTR.A | pS | MICOS complex subunit | A0A151NPV1 | 2.23 | 1.26
R.LTNASRHAHLVAR.F | pS | Mitochondria-eating protein isoform A | A0A151PA77 | 1.34 | 1.30
K.STDAQLQEEAAR.T | pS | Phosphodiesterase (EC 3.1.4.-) | A0A151LZS8 | 1.53 | 1.21
K.SETSAFGAPSQNSLGAVSNAETQR.R | pS | Phosphodiesterase (EC 3.1.4.-) | A0A151LZS8 | 1.33 | 1.33
K.RNSFGSCQDR.N | pS | Protein pitchfork | A0A151NBQ8 | 1.48 | 1.90
K.EHLQTRTPEPVEGR.K | pT | Radial spoke head 3-like protein | A0A151M2H0 | 1.23 | 1.28
R.NLGSINTELQDVQR.I | pS | SEC22 homolog B, vesicle trafficking protein (gene/pseudogene) | G1MZU9 | 1.59 | 2.43
K.RASGQAFELILSPR.S | pS | Stathmin isoform A | A0A151MMG4 | 1.55 | 2.58
K.RAHTPTPGIYMGRPTHSGGGGGGGAGR.R | 2 pT | Transformer 2 alpha homolog | F1NPM7 | 1.27 | 1.40
K.FWEVISDEHGIDIAGNYRGAAPLQLER.I | pY | Tubulin beta chain (Fragment) | A0A093GRG0 | 1.22 | 1.51
R.HVLHDAY.- | pY | Uncharacterized protein | A0A151MFN7 | 1.28 | 1.59

Decreased in both fractions (9)
R.RVSVCAEAFNPDEEEEDTEPR.V | pS | cAMP-dependent protein kinase type II-alpha regulatory subunit | A0A151MKS7 | 1.78 | 1.44
R.HGLSTSSLR.AD | pS | Centrosomal protein of 135 kDa (Fragment) | A0A091F4F0 | 1.69 | 1.42
K.LNQAQQTDSNLSVYKR.K | 2 pS | Fibrous sheath CABYR-binding protein | A0A151PDZ5 | 1.20 | 1.67
K.LGLGIDEDEVTAEVLGAAAADEIPPLEGDEDTSR.M | pT/S | Heat shock cognate protein HSP 90-beta (Fragment) | A0A091RXG0 | 1.38 | 1.29
RK.HVLIAEVIFTNIGGAATAVGDPPNVIIVSK.QR | pS | P protein | A0A151NX71 | 1.37 | 1.29
R.YYSPGYSEALLER.V | pS | Sperm-associated antigen 6 | A0A151PCW8 | 1.35 | 1.52
R.ATNELDQVPSPELGSEGIFYR.H | pS | Sperm-associated antigen 8 isoform A | A0A151MTE8 | 1.87 | 1.74
R.EAVCGSPASARSAGNATVLAFSR.C | pS | WD repeat-containing protein 16 (Fragment) | A0A099YQN6 | 1.22 | 1.47
R.RDTVISLSDVQVR.R | pS | WD repeat-containing protein 80/81 | A0A151NJC1 | 2.31 | 1.20

RECIPROCAL CHANGE IN MEMBRANE-ENRICHED AND SOLUBLE FRACTIONS

Increased in soluble and decreased in membrane-enriched (31)
K.GYSVGDILQEVMR.Y | pS | A-kinase anchor protein 10/11 | A0A151M6A1; A0A151ND25 | 1.43 | 2.02
K.MAQNSDTSLK.S | pS | A-kinase anchor protein 5 | A0A151M6A1 | 1.42 | 1.93
K.STEILEAMVR.R | pS | A-kinase anchor protein 8 | A0A151M6A1 | 1.22 | 1.41
KR.NLQAVVQTPGGR.KR | pT | Centrosomal protein of 135 kDa (Fragment) | A0A091F4F0; A0A151PAH0 | 1.75 | 1.70
K.GTSEVRVTTTVTTR.G | pT/S | Centrosomal protein of isoform B | A0A151P2B4 | 1.37 | 1.67
R.LVLALDGGRSHDIISLESR.S | pS | Centrosomal protein of isoform B | A0A151PAH0 | 2.39 | 1.50
R.SHDIISLESR.S | pS | Centrosomal protein of isoform B | A0A151PAH0 | 3.22 | 1.65
R.DLILGNSETDQSR.S | pS | Coiled-coil domain-containing protein 63 | A0A151NSE3 | 2.04 | 2.19
R.DKKYKPTWHCIVGR.N | pY | Dynein light chain 1, cytoplasmic (Fragment) | A0A094KGE2 | 1.20 | 1.42
K.GQQTSLIWSR.KQ | pT/S | Fibrous sheath CABYR-binding protein | A0A151PDZ5 | 1.25 | 1.22
K.VMASHSQQTRESWIQEFR.V | 2 pS | Fibrous sheath CABYR-binding protein | A0A151PDZ5 | 1.30 | 1.38
K.VMASHSQQTRESWIQEFR.V | pS | Fibrous sheath CABYR-binding protein | A0A151PDZ5 | 1.69 | 2.25
K.LTQHIRSLEELR.N | pS | Golgin subfamily B member 1 isoform B | A0A151PCT3 | 1.52 | 1.94
K.HTGPNSPDTANDGFVR.L | pS | Heterogeneous nuclear ribonucleoprotein H1-like protein | Q6WNG8 | 1.23 | 1.35
K.GVHTAMSALSVAPTR.A | pS | MICOS complex subunit MIC13/14 | A0A151P0P1 | 1.34 | 1.35
K.NFDELSINPDAHTFR.S | pS | Myeloid leukemia factor 5 | A0A151NB71 | 1.32 | 1.27
K.EELNEVTQELAESEHENTLLR.R | pT/S | Outer dense fiber protein 2 | E1BSP2 | 1.93 | 1.25
R.SEEYAQQLTVQLAEKDSYVAEALSTLESWR.S | pS/T/Y | Outer dense fiber protein 2 (Fragment) | A0A091SXF2 | 1.53 | 1.35
K.HMTSSDINTLTR.Q | pS | Outer dense fiber protein 4 | A0A151MFS7 | 1.73 | 1.36
K.MLDLETQLSR.NIST | pS | Outer dense fiber protein 5 | A0A151MFS7 | 2.40 | 1.20
R.SIVHAVQAGIFVER.M | pS | Phosphodiesterase (EC 3.1.4.-) | A0A151LZS8 | 2.15 | 1.53
R.YNHSHDQLVLTGSSDSR.V | pS | Protein TSSC1 | A0A151N7Y4 | 1.20 | 1.21


In view of the demonstration that a significant proportion of differentially phosphorylated proteins corresponded to those housed in the sperm flagellum and mapped to aerobic metabolic pathways (e.g. fructose-bisphosphate aldolase, glyceraldehyde-3-phosphate dehydrogenase, phosphoglycerate mutase, pyruvate dehydrogenase, aconitase, long-chain-fatty-acid-CoA ligase, carnitine palmitoyltransferase), we sought to determine whether oxidative phosphorylation does support the enhanced motility profile of crocodile spermatozoa that occurs in response to capacitation stimuli (10). For this purpose, populations of capacitating crocodile spermatozoa were co-incubated with carbonyl cyanide m-chlorophenyl hydrazone (CCCP), a chemical uncoupler of oxidative phosphorylation. As noted in Fig. 10, the application of CCCP led to a significant decrease in sperm motility, which was manifest in the form of a reduced percentage of motile spermatozoa (Fig. 10A) as well as a pronounced reduction in the overall rate of movement among these cells (Fig. 10B). This suppression of sperm motility occurred independent of an attendant loss of vitality, which remained above 80% in all treatment groups. Furthermore, lipidomic profiling of noncapacitated versus capacitated crocodile sperm membranes revealed a significant, 3-fold reduction in the abundance of palmitoleic acid (16:1, n-9) (n = 3; 2.33 ± 0.67 versus 0.78 ± 0.50; p 0.02). Notably, the loss of palmitoleic acid appeared selective such that the levels of an additional 17 phospholipid fatty acid substrates remained essentially unchanged in response to capacitation (data not shown).

[TABLE I (continued). Columns: Change (peptides); Phosphorylated residue(s); Protein name; UniProt ID; Phosphorylated peptide; Fold change (Cap/Non-cap) in membrane-enriched and soluble fractions. Category shown: increased in membrane-enriched and decreased in soluble (10). Row-level entries are not recoverable from the extracted text.]

DISCUSSION

It is well recognized that the spermatozoa of all mammalian species only acquire functional maturity as they are conveyed through the male and female reproductive tracts. Despite decades of research, however, the evolutionary origin, and adaptive advantage, of these elaborate forms of post-testicular maturation remain obscure. Here, we have exploited quantitative proteomics coupled with phosphopeptide-enrichment strategies to explore the crocodile sperm proteome and identify signatures of post-translational modification associated with the functional activation of these cells. Our data confirm that the phosphorylation status of the crocodile sperm proteome is substantially modified in response to stimuli formulated to elevate intracellular levels of the second messenger cAMP; thus supporting the necessity for capacitation-like changes in promoting the fertility of these cells. Moreover, we have established that the enhanced motility profile of capacitated crocodile spermatozoa is likely fueled by aerobic metabolism of selective membrane fatty acid substrates.

Although spermatozoa have now been successfully recovered from several reptilian species, systematic attempts to modulate the physiology of these cells are rare and global proteomic analyses are currently lacking. Thus, in completing the first comprehensive proteomic assessment of reptilian

Molecular & Cellular Proteomics 18.13

FIG. 6. Assessment of differentially phosphorylated peptides. The (A) total number and (B) proportion of phospho-serine (pS), -threonine (pT), and -tyrosine (pY) residues were determined among those peptides experiencing differential phosphorylation (i.e. a fold change of at least 1.2) in populations of noncapacitated versus capacitated crocodile spermatozoa. pS/T, pS/Y, pS/T/Y = ambiguous phosphorylation of either: a serine or threonine residue, a serine or tyrosine residue, or a serine, threonine, or tyrosine residue, respectively.

FIG. 7. Plots depicting fold changes associated with differentially phosphorylated peptides. Plots were constructed to demonstrate the fold change (x axis; depicted as log2 fold change) and overall Sequest HT Score (y axis) of phosphorylated peptides in (A) membrane-enriched and (B) soluble lysates extracted from noncapacitated and capacitated crocodile spermatozoa. For the purpose of this analysis, a threshold of at least a 1.2 fold change in reporter ion intensity was implemented to identify differentially phosphorylated peptides. The identity of the parent protein from which a portion of the peptides originate has been annotated.
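The thresholding described in the legend to Fig. 7 can be illustrated with a minimal Python sketch. It assumes that the 1.2 fold cut-off is applied symmetrically on the log2 scale (at least 1.2 fold up or at least 1.2 fold down), which the legend does not state explicitly. The function name `classify_peptide` and its intensity arguments are hypothetical and are not taken from the study's analysis pipeline.

```python
import math

# Threshold quoted in the figure legend; applying it symmetrically is an assumption.
FOLD_CHANGE_THRESHOLD = 1.2


def classify_peptide(cap_intensity, noncap_intensity, threshold=FOLD_CHANGE_THRESHOLD):
    """Return (log2 fold change, label) for one phosphopeptide.

    cap_intensity and noncap_intensity are hypothetical reporter ion
    intensities for capacitated and noncapacitated samples.
    """
    log2_fc = math.log2(cap_intensity / noncap_intensity)
    cutoff = math.log2(threshold)  # about 0.263 for a 1.2 fold change
    if log2_fc >= cutoff:
        label = "increased"
    elif log2_fc <= -cutoff:
        label = "decreased"
    else:
        label = "unchanged"
    return log2_fc, label


# Illustrative, made-up intensities: a doubling and a halving.
print(classify_peptide(2.0, 1.0))
print(classify_peptide(1.0, 2.0))
```

Working on the log2 scale makes increases and decreases of the same magnitude equidistant from zero, which is why fold changes are commonly plotted this way.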


FIG. 8. GO annotation of crocodile sperm proteins harboring peptides that experienced differential capacitation-associated phosphorylation. Gene Ontology (GO) analysis was performed to assess the functional classification of all peptides that experienced at least a 1.2 fold change in reporter ion intensity in noncapacitated versus capacitated crocodile spermatozoa. Proteins were curated on the basis of GO biological process using the universal protein knowledgebase (UniProtKB) functional annotation tools (26) and whether they underwent (A) increased or (B) decreased phosphorylation.

spermatozoa, we have been able to initiate comparative analyses with the curated proteomes from representative mammalian (human, (27)) and avian (rooster, (28)) spermatozoa with a view to furthering our understanding of this cell's complex biological machinery. These analyses confirmed the presence of some 84% of the identified crocodile spermatozoa proteins within the human sperm proteome; a level of conservation that suggests the core proteomic architecture of spermatozoa from these distantly related vertebrate species is broadly comparable. Working within the limitations imposed by incomplete coverage, functional classification of crocodile sperm proteins that are not currently annotated in the human sperm proteome revealed enrichment in the molecular function category of catalytic activity, and the biological process of metabolism. In addition, a substantial number of these proteins mapped to the membrane domain. Based on these data, we infer specialization of the surface, and possibly the metabolic characteristics, of crocodile spermatozoa.

The former explanation is consistent with evidence that the plasma membrane of crocodile spermatozoa displays exceptionally high tolerance to anisotonic osmotic stress (29). Indeed, crocodile spermatozoa retain high levels of plasma membrane integrity (50%) during exposure to osmotic excursions of between 25–1523 mOsm kg⁻¹ (29). Such characteristics are perhaps a physiological necessity owing to the potential for these cells to encounter dilution into fresh, or brackish, water following ejaculation into the cloacal chamber of the female. Alternatively, a high tolerance to anisotonic media could be linked to sperm storage in this species, as it appears to be in microbats (30). Although the preservation of plasma membrane integrity in the face of extreme osmotic challenge undoubtedly reflects its lipid architecture, such properties may be augmented by the synergistic action of ion transport and drug efflux proteins in the lipid bilayer. An interesting example of one such protein is that of the testis anion transporter 1 (SLC26A8), an anion exchanger that mediates chloride, sulfate and oxalate transport and has been postulated to fulfil critical functions in the male germ line (31). Noteworthy in the context of our study, SLC26A8 has been implicated in the formation of a molecular complex involved in the regulation of chloride and bicarbonate ion fluxes during induction of sperm capacitation (32). Thus, an increased understanding of the functional relationships between the proteomic composition of the crocodile sperm membrane and


FIG. 9. Confirmation of CABYR and SPATC1 as targets for serine phosphorylation in crocodile spermatozoa. Proximity ligation assays (PLA) were employed to confirm that representative proteins, CABYR and SPATC1, were substrates for phosphorylation in noncapacitated (Non-cap) and capacitated crocodile spermatozoa. This assay results in the production of punctate red fluorescent signals when target antigens of interest, i.e. (A) CABYR and phosphoserine residues or (B) SPATC1 and phosphoserine residues, reside within a maximum of 40 nm from each other. These experiments were replicated on independent samples from three different crocodiles, and representative PLA labeling patterns are shown. Arrowheads in (B) indicate PLA labeling of the neck of the flagellum. C, Negative controls included the labeling of spermatozoa with paired antibodies against phosphoserine and ZPBP1, a protein that was not identified as a substrate for serine phosphorylation, and the omission of one of the primary antibodies from the initial incubation (i.e. phosphoserine only). As anticipated, neither of these negative controls generated positive PLA labeling of the sperm flagellum. They did, however, result in discrete, nonspecific PLA labeling in the anterior region of the sperm head (arrows). Scale bar, 5 μm.

their ability to survive osmotic excursions may ultimately help inform protocols to address the emerging need for the successful cryopreservation of crocodile spermatozoa (11).

Consistent with energy-production being a key attribute in the support of motility needed for spermatozoa to ascend the female reproductive tract and achieve fertilization, metabolic enzymes were identified as one of the dominant functional categories represented among the crocodile sperm proteome. Notably, enzymes mapping to glycolysis, oxidative phosphorylation and lipid metabolism were each highly enriched in the crocodile sperm proteome, with those of the latter category including proteins implicated in lipid catabolism, modification, and transport. Among these proteins we identified carnitine palmitoyl transferase 1 (CPT1A), an enzyme that catalyzes the rate-limiting reaction of beta-oxidation of fatty acids (33), and one that experienced among the highest fold changes (2.27 increase) of accumulation into the detergent labile (soluble) fraction of capacitated sperm lysates. From its position in the outer mitochondrial membrane, CPT1 catalyzes the formation of long-chain acylcarnitines from their respective CoA esters and thus commits them to β-oxidation within the mitochondrial matrix (33). It follows that


FIG. 10. Assessment of the impact of the mitochondrial uncoupling agent, CCCP, on crocodile sperm motility. To determine the contribution of oxidative phosphorylation to supporting the enhanced motility parameters elicited in response to capacitation stimuli, capacitating populations of crocodile spermatozoa were co-incubated with CCCP (a chemical uncoupler of oxidative phosphorylation) or the appropriate DMSO vehicle control. Spermatozoa were sampled from each treatment group immediately after introduction of CCCP (t = 0) and at 30 and 60 min intervals and assessed for (A) overall percentage of motile cells and (B) the rate of sperm movement using criteria defined by Barth (49). These experiments were replicated on independent samples from three different crocodiles, and data are presented as mean ± S.E. * p < 0.05, ** p < 0.01.

pharmacological inhibition of CPT1A reduces flux through β-oxidation and, in species such as the horse, this manifests in the form of compromised sperm motility (34). The fact that this response occurs independently of any attendant loss of vitality has been taken as evidence that stallion spermatozoa are able to effectively use endogenous fatty acids as an energy substrate to support motility (34). Although we have not yet had the opportunity to test this hypothesis directly in crocodile spermatozoa, we did secure several lines of correlative evidence that these cells use a similar metabolic strategy. Thus, capacitated crocodile spermatozoa experienced a selective depletion of palmitoleic acid [a monounsaturated fatty acid substrate known to enhance the motility profile of spermatozoa from species as diverse as sheep and fowls (35, 36)], as well as a significant reduction in motility following the uncoupling of oxidative phosphorylation. Moreover, we identified CPT1A as a substrate for differential phosphorylation in noncapacitated versus capacitated crocodile spermatozoa. This finding takes on added significance in view of the demonstration that CPT1 catalytic activity can be selectively modulated by a mechanism of cAMP-dependent phosphorylation/dephosphorylation in somatic cells (37–39). It is therefore conceivable that the differential phosphorylation of CPT1A witnessed in crocodile spermatozoa may serve as a physiological switch to divert their metabolism either toward, or away from, fatty acid oxidation. Because fatty acid metabolism is conducive to long-term sustained release of energy, this strategy could assist with prolonging in vivo sperm storage before ovulation (40, 41), while also proving advantageous in the context of enabling crocodile spermatozoa to negotiate the many meters of female reproductive tract before arriving at the site of fertilization (41).

Beyond its putative impact on CPT1A activity, elevation of intracellular cAMP also elicited the phosphorylation of numerous alternative substrates implicated in sperm motility initiation and maintenance. Notably, these proteins included peptides mapping to the alpha and beta regulatory subunits of protein kinase A (PKA), a promiscuous cAMP-dependent serine/threonine kinase. In eutherian spermatozoa, PKA is widely acknowledged as the central hub of the canonical capacitation cascade owing to its ability to integrate cAMP signaling with the downstream tyrosine kinase signaling pathways that underpin the functional activation of the cell (2). Consistent with data from our own immunolocalization studies (10), PKA primarily resides in the axoneme, a structure that forms an integral part of the motility apparatus of the sperm flagellum. Indeed, PKA is effectively anchored within this specific subcellular location by interaction between a docking domain present in the enzyme's regulatory subunit and that of scaffolding proteins of the protein kinase A anchoring protein (AKAP) family (42, 43); multiple members of which also displayed differential phosphorylation in capacitating crocodile spermatozoa (i.e. AKAP4, AKAP5, AKAP8, AKAP10/11). This sequestration of PKA ensures that the enzyme is juxtaposed with its relevant axonemal protein targets, while simultaneously segregating its activity to prevent indiscriminate phosphorylation of alternative substrates.

These data are entirely consistent with the demonstration that the bulk of the crocodile sperm proteins identified as undergoing differential capacitation-associated phosphorylation were those harbored within the sperm flagellum; with prominent examples including fibrous sheath CABYR-binding protein, outer dense fiber proteins (ODF2, ODF3, ODF4, ODF5), cilia- and flagella-associated proteins (CFAP57, CFAP58),

fibrous sheath-interacting proteins (FSIP3, FSIP4/5), microtubule-associated protein, tubulins (TUBA, TUBB), and dynein (DYNLL1). They also accord with our previous observations of a rapid and sustained increase in the rate of motility as being among the principal changes witnessed in capacitating crocodile spermatozoa (10). Although the conservation of phospho-substrates documented above suggests conservation of the core activation pathways employed by reptilian and eutherian spermatozoa, it is also apparent that downstream signaling events show some degree of divergence. Thus, unlike eutherian sperm capacitation in which tyrosine phosphorylation appears to exert overriding control (4), we identified comparatively few tyrosine phosphorylated peptides in capacitated crocodile spermatozoa. Such findings agree with our previous immunoblotting studies in which we also documented only relatively subtle changes in tyrosine phosphorylation status, save for a small subset of very high molecular weight proteins (10). With the increased resolution afforded by the MS strategy employed herein, we have now affirmed the identity of at least one of these proteins as dynein; a microtubule-dependent force-generating ATPase that plays a pivotal role in axonemal microtubule sliding and hence the propagation of sustained flagellum beating (44).

Although the identification of relatively few phosphotyrosine substrates represents a departure from the widely accepted models of eutherian sperm capacitation, our findings do more closely approximate those experienced in somatic cells wherein phosphorylation of serine, threonine and tyrosine amino acids occurs at an estimated ratio of 1000:100:1 (45). In seeking to reconcile these apparently incongruous results, it is perhaps noteworthy that a subset of the serine/threonine substrates identified herein are instead regulated by tyrosine phosphorylation in the spermatozoa of mammalian species, thus raising the possibility of lineage specific expansion of the role of tyrosine kinases in the spermatozoa of higher vertebrates. Illustrative of this phenomenon, we identified the fibrous sheath calcium-binding tyrosine phosphorylation-regulated protein (CABYR) as comprising as many as 17 differentially phosphorylated peptides, not one of which features a phospho-tyrosine residue. As its name suggests, this represents a marked departure from the homologue characterized in mouse (46, 47) and human spermatozoa (48), the former of which harbors as many as seven potential tyrosine phosphorylation motifs that are subject to extensive phosphorylation during in vitro capacitation (46). Characterization of the adaptive significance of such changes remains as an intriguing focus for future research.

In summary, we have exploited an advanced proteomic platform to improve our understanding of sperm biology in a model reptilian species, the Australian saltwater crocodile. Through the identification of recognized hallmarks of the capacitation cascade, our collective data affirm the hypothesis that crocodile sperm do engage a network of signaling pathways, centering on PKA activity, to promote their functional activation (10). In doing so, these data challenge the widely promulgated view that post-testicular sperm maturation is limited to the mammalian lineage.

Acknowledgments: We thank the staff at Koorana Crocodile Farm, and in particular John Lever and Robbie McLeod, for assistance with collection of crocodile semen. We are also grateful for the technical assistance of Tamara Keeley and Ed Qualischefski.

DATA AVAILABILITY

The data set (supplemental Dataset S1) analyzed here has been deposited in the Mass Spectrometry Interactive Virtual Environment (MassIVE) database (Project ID: MassIVE MSV000082258), and is publicly accessible at: https://massive.ucsd.edu/ProteoSAFe/dataset.jsp?task=8acd6725da734f6f89bbd64460d03686.

This article contains supplemental Tables.

‡‡ To whom correspondence should be addressed: Priority Research Centre for Reproductive Science, School of Environmental and Life Sciences, The University of Newcastle, Callaghan, NSW 2308, Australia. Tel.: 61-2-4921-6977; Fax: 61-2-4921-6308; E-mail: Brett.Nixon@newcastle.edu.au.

Author contributions: B. N., S. D. J., and M. D. D. designed research; B. N., S. D. J., D. A. S.-B., A. L. A., S. J. S., E. G. B., J. H. M., and M. D. D. performed research; B. N., D. A. S.-B., A. L. A., S. J. S., E. G. B., P. M. H., and M. D. D. analyzed data; B. N., S. D. J., and M. D. D. wrote the paper.

REFERENCES

1. Zhou, W., De Iuliis, G. N., Dun, M. D., and Nixon, B. (2018) Characteristics of the epididymal luminal environment responsible for sperm maturation and storage. Front. Endocrinol. 9, 59
2. Aitken, R. J., and Nixon, B. (2013) Sperm capacitation: a distant landscape glimpsed but unexplored. Mol. Hum. Reprod. 19, 785–793
3. Baker, M. A., Nixon, B., Naumovski, N., and Aitken, R. J. (2012) Proteomic insights into the maturation and capacitation of mammalian spermatozoa. Syst. Biol. Reprod. Med. 58, 211–217
4. Gervasi, M. G., and Visconti, P. E. (2016) Chang's meaning of capacitation: A molecular perspective. Mol. Reprod. Dev. 83, 860–874
5. Stival, C., Puga Molina L del, C., Paudel, B., Buffone, M. G., Visconti, P. E., and Krapf, D. (2016) Sperm capacitation and acrosome reaction in mammalian sperm. Adv. Anat. Embryol. Cell Biol. 220, 93–106
6. Howarth, B., Jr. (1970) An examination for sperm capacitation in the fowl. Biol. Reprod. 3, 338–341
7. Howarth, B., Jr. (1983) Fertilizing ability of cock spermatozoa from the testis epididymis and vas deferens following intramagnal insemination. Biol. Reprod. 28, 586–590
8. Howarth, B., Jr., and Palmer, M. B. (1972) An examination of the need for sperm capacitation in the turkey, Meleagris gallopavo. J. Reprod. Fertil. 28, 443–445
9. Nixon, B., Ewen, K. A., Krivanek, K. M., Clulow, J., Kidd, G., Ecroyd, H., and Jones, R. C. (2014) Post-testicular sperm maturation and identification of an epididymal protein in the Japanese quail (Coturnix coturnix japonica). Reproduction 147, 265–277
10. Nixon, B., Anderson, A. L., Smith, N. D., McLeod, R., and Johnston, S. D. (2016) The Australian saltwater crocodile (Crocodylus porosus) provides evidence that the capacitation of spermatozoa may extend beyond the mammalian lineage. Proc. Biol. Sci. 283, 1–9
11. Johnston, S. D., Qualischefski, E., Cooper, J., McLeod, R., Lever, J., Nixon, B., Anderson, A. L., Hobbs, R., Gosalvez, J., Lopez-Fernandez, C., and Keeley, T. (2017) Cryopreservation of saltwater crocodile (Crocodylus porosus) spermatozoa. Reprod. Fertil. Dev. 29, 2235–2244
12. Johnston, S. D., Lever, J., McLeod, R., Oishi, M., Qualischefski, E., Omanga, C., Leitner, M., Price, R., Barker, L., Kamaue, K., Gaughan, J., and D'Occhio, M. (2014) Semen collection and seminal characteristics of the Australian saltwater crocodile (Crocodylus porosus). Aquaculture 422–423, 25–35
13. Asquith, K. L., Baleato, R. M., McLaughlin, E. A., Nixon, B., and Aitken, R. J. (2004) Tyrosine phosphorylation activates surface chaperones facilitating sperm-zona recognition. J. Cell Sci. 117, 3645–3657
14. Mitchell, L. A., Nixon, B., and Aitken, R. J. (2007) Analysis of chaperone proteins associated with human spermatozoa during capacitation. Mol. Hum. Reprod. 13, 605–613
15. Biggers, J. D., Whitten, W. K., and Whittingham, D. G. (1971) The culture of mouse embryos in vitro. In: Daniel, J. C., ed. Methods in Mammalian Embryology, pp. 86–116, Freeman Press, San Francisco, CA
16. Fujiki, Y., Hubbard, A. L., Fowler, S., and Lazarow, P. B. (1982) Isolation of intracellular membranes by means of sodium carbonate treatment: application to endoplasmic reticulum. J. Cell Biol. 93, 97–102
17. Dun, M. D., Chalkley, R. J., Faulkner, S., Keene, S., Avery-Kiejda, K. A., Scott, R. J., Falkenby, L. G., Cairns, M. J., Larsen, M. R., Bradshaw, R. A., and Hondermarck, H. (2015) Proteotranscriptomic Profiling of 231-BR Breast Cancer Cells: Identification of Potential Biomarkers and Therapeutic Targets for Brain Metastasis. Mol. Cell. Proteomics 14, 2316–2330
18. Degryse, S., de Bock, C. E., Demeyer, S., Govaerts, I., Bornschein, S., Verbeke, D., Jacobs, K., Binos, S., Skerrett-Byrne, D. A., Murray, H. C., Verrills, N. M., Van Vlierberghe, P., Cools, J., and Dun, M. D. (2017) Mutant JAK3 phosphoproteomic profiling predicts synergism between JAK3 inhibitors and MEK/BCL2 inhibitors for the treatment of T-cell acute lymphoblastic leukemia. Leukemia 32, 788–800
19. Larsen, M. R., Cordwell, S. J., and Roepstorff, P. (2002) Graphite powder as an alternative or supplement to reversed-phase material for desalting and concentration of peptide mixtures before matrix-assisted laser desorption/ionization-mass spectrometry. Proteomics 2, 1277–1287
20. Ross, P. L., Huang, Y. N., Marchese, J. N., Williamson, B., Parker, K., Hattan, S., Khainovski, N., Pillai, S., Dey, S., Daniels, S., Purkayastha, S., Juhasz, P., Martin, S., Bartlet-Jones, M., He, F., Jacobson, A., and Pappin, D. J. (2004) Multiplexed protein quantitation in Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol. Cell. Proteomics 3, 1154–1169
21. Engholm-Keller, K., Birck, P., Storling, J., Pociot, F., Mandrup-Poulsen, T., and Larsen, M. R. (2012) TiSH–a robust and sensitive global phosphoproteomics strategy employing a combination of TiO2, SIMAC, and HILIC. J. Proteomics 75, 5749–5761
22. Laemmli, U. K. (1970) Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227, 680–685
23. Towbin, H., Staehelin, T., and Gordon, J. (1979) Electrophoretic transfer of proteins from polyacrylamide gels to nitrocellulose sheets: procedure and some applications. Proc. Natl. Acad. Sci. U.S.A. 76, 4350–4354
24. Dun, M. D., Smith, N. D., Baker, M. A., Lin, M., Aitken, R. J., and Nixon, B. (2011) The chaperonin containing TCP1 complex (CCT/TRiC) is involved in mediating sperm-oocyte interaction. J. Biol. Chem. 286, 36875–36887
25. Soderberg, O., Gullberg, M., Jarvius, M., Ridderstrale, K., Leuchowius, K. J., Jarvius, J., Wester, K., Hydbring, P., Bahram, F., Larsson, L. G., and Landegren, U. (2006) Direct observation of individual endogenous protein complexes in situ by proximity ligation. Nat. Methods 3, 995–1000
26. Consortium, T. U. (2017) UniProt: the universal protein knowledgebase. Nucleic Acids Res. 45, D158–D169
27. Amaral, A., Castillo, J., Ramalho-Santos, J., and Oliva, R. (2014) The combined human sperm proteome: cellular pathways and implications for basic and clinical science. Hum. Reprod. Update 20, 40–62
28. Labas, V., Grasseau, I., Cahier, K., Gargaros, A., Harichaux, G., Teixeira-Gomes, A. P., Alves, S., Bourin, M., Gerard, N., and Blesbois, E. (2015) Qualitative and quantitative peptidomic and proteomic approaches to phenotyping chicken semen. J. Proteomics 112, 313–335
29. Johnston, S. D., Lever, J., McLeod, R., Qualischefski, E., Brabazon, S., Walton, S., and Collins, S. N. (2014) Extension, osmotic tolerance and cryopreservation of saltwater crocodile (Crocodylus porosus) spermatozoa. Aquaculture 426–427, 213–221
30. Crichton, E. G. (2000) Sperm storage and fertilization. In: Crichton, E. G., and Krutzsch, P. H., eds. Reproductive Biology of Bats, pp. 295–320, Academic Press, London
31. Toure, A., Lhuillier, P., Gossen, J. A., Kuil, C. W., Lhote, D., Jegou, B., Escalier, D., and Gacon, G. (2007) The testis anion transporter 1 (Slc26a8) is required for sperm terminal differentiation and male fertility in the mouse. Hum. Mol. Genet. 16, 1783–1793
32. Rode, B., Dirami, T., Bakouh, N., Rizk-Rabin, M., Norez, C., Lhuillier, P., Lores, P., Jollivet, M., Melin, P., Zvetkova, I., Bienvenu, T., Becq, F., Planelles, G., Edelman, A., Gacon, G., and Toure, A. (2012) The testis anion transporter TAT1 (SLC26A8) physically and functionally interacts with the cystic fibrosis transmembrane conductance regulator channel: a potential role during sperm capacitation. Hum. Mol. Genet. 21, 1287–1298
33. Lee, K., Kerner, J., and Hoppel, C. L. (2011) Mitochondrial carnitine palmitoyltransferase 1a (CPT1a) is part of an outer membrane fatty acid transfer complex. J. Biol. Chem. 286, 25655–25662
34. Swegen, A., Curry, B. J., Gibb, Z., Lambourne, S. R., Smith, N. D., and Aitken, R. J. (2015) Investigation of the stallion sperm proteome by mass spectrometry. Reproduction 149, 235–244
35. Eslami, M., Ghasemiyan, H., and Zadeh Hashem, E. (2017) Semen supplementation with palmitoleic acid promotes kinematics, microscopic and antioxidative parameters of ram spermatozoa during liquid storage. Reprod. Domest. Anim. 52, 49–59
36. Rad, H. M., Eslami, M., and Ghanie, A. (2016) Palmitoleate enhances quality of rooster semen during chilled storage. Anim. Reprod. Sci. 165, 38–45
37. Harano, Y., Kashiwagi, A., Kojima, H., Suzuki, M., Hashimoto, T., and Shigeta, Y. (1985) Phosphorylation of carnitine palmitoyltransferase and activation by glucagon in isolated rat hepatocytes. FEBS Lett. 188, 267–272
38. Pegorier, J. P., Garcia-Garcia, M. V., Prip-Buus, C., Duee, P. H., Kohl, C., and Girard, J. (1989) Induction of ketogenesis and fatty acid oxidation by glucagon and cyclic AMP in cultured hepatocytes from rabbit fetuses. Evidence for a decreased sensitivity of carnitine palmitoyltransferase I to malonyl-CoA inhibition after glucagon or cyclic AMP treatment. Biochem. J. 264, 93–100
39. Guzman, M., and Geelen, M. J. (1992) Activity of carnitine palmitoyltransferase in mitochondrial outer membranes and in digitonin-permeabilized hepatocytes. Selective modulation of mitochondrial enzyme activity by okadaic acid. Biochem. J. 287 (Pt 2), 487–492
40. Davenport, M. (1995) Evidence of possible sperm storage in the caiman, Paleosuchus palpebrosus. Herpetol. Rev. 26, 14–15
41. Gist, D. H., Bagwill, A., Lance, V., Sever, D. M., and Elsey, R. M. (2008) Sperm storage in the oviduct of the American alligator. J Exp Zool A Ecol Genet Physiol 309, 581–587
42. Wong, W., and Scott, J. D. (2004) AKAP signalling complexes: focal points in space and time. Nat. Rev. Mol. Cell Biol. 5, 959–970
43. Luconi, M., Carloni, V., Marra, F., Ferruzzi, P., Forti, G., and Baldi, E. (2004) Increased phosphorylation of AKAP by inhibition of phosphatidylinositol 3-kinase enhances human sperm motility through tail recruitment of protein kinase A. J. Cell Sci. 117, 1235–1246
44. Loreng, T. D., and Smith, E. F. (2017) The Central Apparatus of Cilia and Eukaryotic Flagella. Cold Spring Harb. Perspect. Biol. 9, a028118
45. Raggiaschi, R., Gotta, S., and Terstappen, G. C. (2005) Phosphoproteome analysis. Biosci. Rep. 25, 33–44
46. Naaby-Hansen, S., Mandal, A., Wolkowicz, M. J., Sen, B., Westbrook, V. A., Shetty, J., Coonrod, S. A., Klotz, K. L., Kim, Y. H., Bush, L. A., Flickinger, C. J., and Herr, J. C. (2002) CABYR, a novel calcium-binding tyrosine phosphorylation-regulated fibrous sheath protein involved in capacitation. Dev. Biol. 242, 236–254
47. Li, Y. F., He, W., Kim, Y. H., Mandal, A., Digilio, L., Klotz, K., Flickinger, C. J., and Herr, J. C. (2010) CABYR isoforms expressed in late steps of spermiogenesis bind with AKAPs and ropporin in mouse sperm fibrous sheath. Reprod. Biol. Endocrinol. 8, 101
48. Li, Y. F., He, W., Mandal, A., Kim, Y. H., Digilio, L., Klotz, K., Flickinger, C. J., Herr, J. C., and Herr, J. C. (2011) CABYR binds to AKAP3 and Ropporin in the human sperm fibrous sheath. Asian J. Androl. 13, 266–274
49. Barth, A. D. (1995) Evaluation of frozen bovine semen by the veterinary practitioner. Proc. Bovine Short Course, pp. 105–110

RESEARCH ARTICLE

Proteomic Profiling of Human Uterine Fibroids Reveals Upregulation of the Extracellular Matrix Protein Periostin

Downloaded from https://academic.oup.com/endo/article-abstract/159/2/1106/4736307 by University of Newcastle user on 25 February 2019

M. Fairuz B. Jamaluddin,1 Yi-An Ko,1 Manish Kumar,1 Yazmin Brown,1 Preety Bajwa,1 Prathima B. Nagendra,1 David A. Skerrett-Byrne,1 Hubert Hondermarck,1 Mark A. Baker,2 Matt D. Dun,1 Rodney J. Scott,1 Pravin Nahar,3,4 and Pradeep S. Tanwar1

1School of Biomedical Sciences and Pharmacy, University of Newcastle, Callaghan, 2308 New South Wales, Australia; 2School of Environmental and Life Sciences, University of Newcastle, Callaghan, 2308 New South Wales, Australia; 3School of Medicine and Public Health, University of Newcastle, Callaghan, 2308 New South Wales, Australia; and 4Department of Maternity and Gynaecology, John Hunter Hospital, New Lambton Heights, 2305 New South Wales, Australia

The central characteristic of uterine fibroids is excessive deposition of extracellular matrix (ECM), which contributes to fibroid growth and bulk-type symptoms. Despite this, very little is known about patterns of ECM protein expression in fibroids and whether these are influenced by the most common genetic anomalies, which relate to MED12. We performed extensive genetic and proteomic analyses of clinically annotated fibroids and adjacent normal myometrium to identify the composition and expression patterns of ECM proteins in MED12 mutation–positive and mutation–negative uterine fibroids. Genetic sequencing of tissue samples revealed MED12 alterations in 39 of 65 fibroids (60%) from 14 patients. Using isobaric tagged–based quantitative mass spectrometry on three selected patients (n = 9 fibroids), we observed a common set of upregulated (>1.5-fold) and downregulated (<0.66-fold) proteins in small, medium, and large fibroid samples of annotated MED12 status. These two sets of upregulated and downregulated proteins were the same in all patients, regardless of variations in fibroid size and MED12 status. We then focused on one of the significant upregulated ECM proteins and confirmed the differential expression of periostin using western blotting and immunohistochemical analysis. Our study defined the proteome of uterine fibroids and identified that increased ECM protein expression, in particular periostin, is a hallmark of uterine fibroids regardless of MED12 mutation status. This study sets the foundation for further investigations to analyze the mechanisms regulating ECM overexpression and the functional role of upregulated ECM proteins in leiomyogenesis. (Endocrinology 159: 1106–1118, 2018)

Uterine leiomyomas, also known as fibroids, are the most prevalent tumors in women, occurring in >70% of premenopausal women (1–3). Although fibroids are benign, depending on their size, number, and location, they can cause substantial morbidity and burden on the health care system (4). The clinical signs and

ISSN Online 1945-7170
Copyright © 2018 Endocrine Society
Received 9 October 2017. Accepted 4 December 2017.
First Published Online 13 December 2017

Abbreviations: ACN, acetonitrile; ANM, adjacent normal myometrium; ANXA1, annexin A1; ANXA2, annexin A2; COL12A1, collagen type XII α 1; COL2A1, collagen type II α 1; COL3A1, collagen type III α 1; COL5A2, collagen type V α 2; COL7A1, collagen type VII α 1; ECM, extracellular matrix; FMOD, fibromodulin; HILIC, hydrophilic interaction liquid chromatography; IHC, immunohistochemistry; LAMA5, laminin subunit α 5; LAMB2, laminin subunit β 2; LC, liquid chromatography; MED12, mediator complex subunit 12 gene; MS, mass spectrometry; MS/MS, tandem mass spectrometry; PCR, polymerase chain reaction; POSTN, periostin; pSMAD2, phosphorylated form of mothers against decapentaplegic homolog 2; S100A6, protein S100-A6; SERPINA1, α-1-antitrypsin; SMOC2, secreted protein acidic and rich in cysteine–related modular calcium-binding protein 2; SPARC, secreted protein acidic and rich in cysteine; SPARCL1, secreted protein acidic and rich in cysteine–like 1; TFA, trifluoroacetic acid; TGF-β, transforming growth factor-β; TGFBI, transforming growth factor-β–induced protein ig-h3; TINAGL1, tubulointerstitial nephritis antigen-like; TNC, tenascin; VCAN, versican core protein.

https://academic.oup.com/endo Endocrinology, February 2018, 159(2):1106–1118 doi: 10.1210/en.2017-03018

symptoms of fibroids include heavy or prolonged menstrual bleeding, discomfort, abdominal pain, pregnancy complications, and infertility (4, 5). Fibroids are also a major reason for hysterectomy (4). Treatment options for patients with fibroids are predominantly limited to surgery and hormonal agents because of an incomplete understanding of the etiology and pathogenesis of these tumors.

Whole genome sequencing has provided insight into the genetic basis of fibroids (6). Fibroids harbor four main types of driver mutations: high mobility group AT-hook 2 (HMGA2) reorganizations; biallelic inactivation of fumarate hydratase; deletions in collagen, type IV α 5 (COL4A5) and collagen, type IV α 6 (COL4A6); and mutations in the mediator complex subunit 12 (MED12) (7). However, mutations in exon 1 and exon 2 of MED12 represent the most common genetic aberrations in fibroids (8, 9). MED12 is a subunit of the mediator complex that participates in genome-wide regulation as well as gene-specific transcription (10). The mediator complex connects transcription factors to RNA polymerase (10). MED12 mutations or deletions are mutually exclusive to certain other genetic alterations, such as fumarate hydratase inactivation, suggesting that distinct mechanisms operating downstream contribute to the pathogenesis of this disease (11). Conditional expression of a Med12 missense variant in a mouse model, a liaison commonly present in human patients, led to the development of uterine fibroids, providing further evidence that Med12 genetic aberrations are drivers of uterine leiomyogenesis (12). In contrast, fibroids did not form in another mouse model in which the Med12 was deleted, indicating that fibroids associated with Med12 defects arise from a gain of function mechanism (12). However, the downstream signaling pathways involved in the development of fibroids remain unknown.

Fibroids are thought to arise from a single myocyte that undergoes hyperplastic transformation (13). Unlike malignant tumors, fibroids are characterized by modest rates of cell proliferation (5, 14). However, fibroids can expand to the size of a grapefruit because of their abundant extracellular matrix (ECM), which is a defining feature of uterine leiomyomas (4, 5, 13, 15). The ECM is a component of all organs and plays an integral role in tissue development, wound healing, and maintenance of normal tissue homeostasis (16). This three-dimensional matrix structure is composed of a complex network of proteins belonging to the collagen, proteoglycan, and glycoprotein families (17, 18). The ECM not only supports cell organization but also regulates cell response and behavior through ligand-integrin interactions and the release of bound soluble factors, including growth factors and Wnt signaling molecules (19). It is well established that aberrant remodeling of the ECM contributes to the development and progression of a range of diseases, including cancer and fibrosis (17).

Despite disorganized and copious ECM being a hallmark feature of fibroids, no studies to date have investigated whether there is an association between underlying genetic anomalies of fibroids and ECM expression patterns. In this study, using extensive genetic and proteomic analyses of clinically annotated fibroids and adjacent normal myometrium (ANM), we have identified the composition and expression patterns of ECM proteins in MED12 mutation–positive and mutation–negative uterine fibroids.

Material and Methods

Patient recruitment and tissue samples
To study the genetic basis of uterine fibroids, we determined the mutation status in exon 1 and exon 2 of the MED12 gene in women from the Hunter New England region of New South Wales, Australia. Uterine fibroids from 2 to 14 cm in diameter and ANM tissues were collected from individuals who underwent hysterectomies at the John Hunter Hospital, Newcastle, New South Wales, Australia. Human tissue collection and experimentation were conducted in accordance with the guidelines of the Institutional Human Research Ethics Committee at the University of Newcastle. The age range of the patients was 42 to 72 years, with an average age of 47 years. Sixty-five uterine fibroids of different sizes from 14 patients (most patients had multiple fibroids) and 14 ANM samples from the same patients (1 ANM sample per patient), were collected. We classified fibroids into three groups according to their tumor size: small (diameter <2.0 cm), medium (diameter 2 to 4 cm), and large (diameter >4 cm). Histopathological analysis by a pathologist validated ANM and fibroid tissue samples. The collected tissues were immediately transferred to the laboratory, washed with phosphate-buffered saline to remove excessive blood, snap-frozen, and stored in liquid nitrogen until further analysis. Tissue arrays from ANM and fibroid tissue samples were generated with a core diameter of 2 mm, with two cores per patient.

Genomic DNA extraction from tissue samples
To evaluate the status of MED12 mutations in fibroids and ANM, 79 tissue samples were selected for DNA isolation. To isolate the genomic DNA, approximately 25 mg of frozen tissue (fibroid or ANM) was cut into smaller pieces, homogenized in tissue lysis buffer (Buffer ATL; Qiagen) and treated with proteinase K to aid protein degradation and tissue lysis. Genomic DNA was extracted using the QIAamp DNA Mini Kit (Qiagen) according to the manufacturer's instructions. DNA concentrations were determined by using a NanoDrop™ 2000/2000c spectrophotometer (Thermo Fisher Scientific).

Polymerase chain reaction amplification and Sanger sequencing
DNA amplification and sequencing were performed at the Hunter North Pathology sequencing core facility, Newcastle.

The DNA from fibroid and ANM tissue samples was amplified with Immolase™ DNA polymerase (Bioline) using specific primers (Sigma-Aldrich) (sense 5′-CCTCCGGAACGTTTCATAGAT-3′ and antisense 5′-TTCGGGACTTTTGCTCTCAC-3′) targeting exon 1 of the MED12 gene (sense 5′-GCCCTTTCACCTTGTTCCTT-3′ and antisense 5′-TGTCCCTATAAGTCTTCCCAACC-3′) for exon 2 of the MED12 gene, as described previously (8). Briefly, a total of 24 µL master mix consisting of genomic DNA (3 µL), forward and reverse primers (each 0.4 µL), Immolase™ DNA polymerase (0.2 µL), buffer D (12 µL), and nuclease-free water (8.4 µL) was prepared for each reaction. Samples were subjected to the following thermal cycling conditions on a T100™ thermal cycler (Bio-Rad): denaturation at 95°C for 10 minutes followed by 30 cycles of 95°C for 30 seconds, 58°C (for exon 1) and 60°C (for exon 2) for 30 seconds, 72°C for 30 seconds, and extension at 72°C for 10 minutes, followed by a final soak step at 4°C. After polymerase chain reaction (PCR) amplification, the products were purified by using the QIAquick PCR Purification Kit (Qiagen) and eluted in 20 µL buffer EB (10 mM Tris·Cl, pH 8.5) according to the manufacturer's instructions. Purified PCR products were quantified to determine the suitability of the DNA concentration (should be >30 ng/µL) for sequencing. PCR products were bidirectionally sequenced using the Sanger method and detected with capillary electrophoresis on an ABI 3730xl Automatic DNA Analyzer (Applied Biosystems). PCR products were sequenced using the BigDye Terminator v.3.1 Ready Reaction Premix and Sequencing Buffer (5X) (Applied Biosystems) using the original PCR primers specific to MED12 exon 1 and exon 2, respectively. (The 5X Sequencing Buffer allows users to dilute the BigDye Terminator Ready Reaction Premix; using the 5X buffer maintains optimal pH and magnesium concentration for the reaction.) Sequence chromatograms were analyzed for somatic mutations in exon 1 and exon 2 of MED12 using Mutation Surveyor software (SoftGenetics).

Protein sample preparation and 4plex iTRAQ labeling and processing
To perform proteomic analysis of genetically annotated fibroids and ANM, 75 to 100 mg of each tissue sample was cut into smaller pieces, resuspended in 100 µL of ice-cold lysis buffer (0.1 M Na2CO3, 10 mM Na3VO4) with protease inhibitor cocktail (Complete Mini; Roche), and phosphatase inhibitors PhosStop (Roche). Homogenization was performed by using the BeadBug homogenizer (Benchmark Scientific). The homogenates, which contained 150 µg of proteins, were dissolved in urea buffer (12 M urea, 4 M thiourea), reduced with 10 mM dithiothreitol, and heated at 56°C for 1 hour, followed by alkylation with 20 mM iodoacetamide and incubation in the dark at room temperature for 45 minutes, as previously described (20). Protein samples were then digested with trypsin [Mass Spectrometry Grade (Promega), 2 µg/µL in Milli-Q water (Millipore), and 50 µg of trypsin/sample] at room temperature for 3 hours. The digest was diluted further by using 100 mM triethylammonium bicarbonate, pH 7.8, to a final concentration of urea buffer (0.75 M urea, 0.25 M thiourea) and subjected to a second digestion using trypsin for 16 hours at room temperature. Following digestion, trypsin was inactivated by acidifying the samples to a pH of 2 using 10% trifluoroacetic acid (TFA). Lipid precipitation was achieved by centrifuging the acidified samples at 14,000g for 10 minutes at room temperature, and the supernatant containing the peptides was collected in a LoBind tube (Eppendorf) for further processing. Before fractionation, peptide samples were desalted using C18-SD 4 mm/1 mL Extraction Disk Cartridge (Empore). Cartridges were activated with acetonitrile (ACN) and then equilibrated with 0.1% TFA in MilliQ water, according to the manufacturer's protocol. Peptides were loaded into the cartridge and washed three times with 500 µL of MilliQ water containing 0.1% TFA. Bound peptides were eluted into a LoBind tube in two steps using 100 µL of 60% ACN, 0.1% TFA and then 100 µL of 80% ACN, 0.1% TFA. Peptide concentrations were determined using a Qubit 2.0 Fluorometer assay (Invitrogen). Peptides (minimum quantity ~100 µg) were dried by SpeedVac SC100 (Thermo Fisher Scientific) vacuum centrifugation and were reconstituted in the dissolution buffer. Reconstituted peptides were then processed according to the manufacturer's protocol for 4plex iTRAQ reagent (AB SCIEX). The 4plex labeling of three fibroids (small, medium, and large) and one ANM tissue sample from each of the three patients was performed for 2 hours at room temperature. Labeled peptides were pooled into a single tube and dried with the SpeedVac.

Fractionation of labeled peptides by hydrophilic interaction liquid chromatography
Hydrophilic interaction liquid chromatography (HILIC) fractionation was performed to purify and fractionate the mixed peptides. The fractionation procedure was conducted at the Mass Spectrometry Core Facility of the Charles Perkins Centre, University of Sydney. Dried mixed peptides were reconstituted in 90% ACN incorporating 0.1% TFA. Any insoluble material was removed by centrifugation at 20,000g for 5 minutes at 4°C. Twenty micrograms of peptide material was resolved onto the HILIC column using an inverted organic gradient of solvent A (water, 0.1% TFA) and solvent B (ACN, 0.1% TFA). Fractions were collected in a deep 96-well plate, dried, and resuspended in 0.1% formic acid. Each HILIC fraction was subjected to liquid chromatography (LC)/tandem mass spectrometry (MS/MS).

Mass spectrometry and data analysis
Peptides were injected onto a trapping column for preconcentration (Acclaim PepMap100, 20 × 0.075 mm, 3 µm C18, Thermo Fisher Scientific), followed by nanoflow LC (Thermo Dionex, Ultimate 3000 RSLCnano, Thermo Fisher Scientific). Peptide separation was achieved using a 500 × 0.075-mm ID, PepMap 2-µm EasySpray C18 column (Thermo Fisher Scientific) with the following mobile phases: 0.1% formic acid in high-performance LC water (solvent A) and 80% ACN combined with 0.1% formic acid (solvent B). Peptides were resolved using a linear gradient from 2% to 35% solvent B, over 120 minutes, with a constant flow of 250 nL/min. The peptide eluent flowed into a nano-electrospray emitter at the sampling region of a Q-Exactive Plus Orbitrap mass spectrometer (Thermo Fisher Scientific). The electrospray process was initiated by applying 2.0 kV to the liquid junction of the emitter, and data were acquired under the control of Xcalibur (Thermo Fisher Scientific) in data-dependent mode. The mass spectrometry (MS) survey scan was performed using a resolution of 35,000. The 10 most intense multiply charged precursors were selected for high-energy collisional dissociation fragmentation with a normalized collision energy of 30.0, then measured in the Orbitrap (Thermo Fisher Scientific) at a resolution of 17,500. Automatic gain control targets were 3E6 ions for Orbitrap scans and 17,000 for MS/MS scans. The raw MS data were processed with the Proteome Discoverer software package, version 2.0.0.802 (Thermo Fisher Scientific). Proteins and peptides were identified by searching against the UniProt Human reference proteome database (downloaded 21 October 2016, with a total of 48,140 entries). The following search parameters were used: mass tolerances in MS and MS/MS modes of 10 ppm and 20 ppm, respectively; trypsin designated as the digestion enzyme, with up to two missed cleavages allowed; S-carbamidomethylation of cysteine residues; oxidation on methionine; phosphorylation on serine, threonine, and tyrosine; and acetylation and methylation of lysine and deamidation of asparagine and glutamine set as variable modifications. The false discovery rate was set to 1% for positive identification of proteins, peptides, and phosphorylation sites. Data were analyzed by using Gitools 2.3.1 software (Biomedical Genomics Group) to generate heat maps and Venny 2.1.0 (BioinfoGP–CSIC) to create Venn diagrams illustrating the number of shared and differentially expressed proteins between patients. Proteins represented by at least two unique peptides were included in the analysis. We classified a protein as upregulated if the level of expression was 1.5-fold higher than the corresponding ANM sample and downregulated if it was 0.66-fold lower than the corresponding ANM sample. A q value must be below 0.05 to identify significant changes in expression.

Western blotting
Lysed soluble tissue extracts containing 30 µg of proteins were mixed in Laemmli buffer containing 10% β-mercaptoethanol (Sigma-Aldrich), heated for 5 minutes, and separated by sodium dodecyl sulfate–polyacrylamide gel electrophoresis (10% Mini-PROTEAN® TGX™ Gels, Bio-Rad), then transferred to nitrocellulose membrane (Amersham™ Protran™ 0.45 µm NC) for western blotting. After blocking with 5% milk (weight-to-volume ratio) in Tris-buffered saline (0.1% Tween-20), the membrane was incubated overnight at 4°C with primary antibodies (Table 1). Horseradish peroxidase–conjugated secondary antibodies (Jackson ImmunoResearch Laboratories) were applied to the membrane for 1 hour at room temperature. LAS-3000 Imager (Fujifilm) was used to capture the image. The band intensity of western blot was quantified using an ImageJ plugin (National Institutes of Health).

Immunohistochemistry
Immunohistochemistry (IHC) was performed as we described previously (21). Briefly, fresh tissues were washed three times with phosphate-buffered saline for 15 minutes per wash and fixed overnight at 4°C in 10% buffered formalin. After fixation, tissues were embedded in paraffin and 5-µm sections were prepared. Tissue sections were deparaffinized, quenched to eliminate internal peroxidase activity, and incubated with primary antibodies (Table 1) or normal rabbit immunoglobulin G (negative control; Jackson ImmunoResearch Laboratories) overnight at 4°C. Following three washes in phosphate-buffered saline with 0.1% Tween-20, tissue sections were incubated for 1 hour in biotinylated secondary antibody, followed by incubation in horseradish peroxidase streptavidin (1:250; Jackson ImmunoResearch Laboratories). Tissue sections were exposed to diaminobenzidine (Sigma-Aldrich) to develop antibody signal. Nuclei were counterstained using hematoxylin. Human endometrium was used as positive control (Supplemental Fig. 1). Images were captured using an Aperio AT2 slide scanner (Leica Biosystems). Periostin (POSTN) staining was quantified using a Halo™ image analysis platform (Indica Laboratories). An area quantification algorithm was used to quantify the pixel intensities of diaminobenzidine staining. The H-score was then calculated using pixel intensities, according to the following formula:

$$\text{H-score} = (3 \times \text{percentage of pixels with strong staining}) + (2 \times \text{percentage of pixels with intermediate staining}) + (1 \times \text{percentage of pixels with weak staining})$$

Statistical analysis
Data were analyzed and graphed with Prism 6.0 software (GraphPad). Values are presented as mean ± standard error of the mean. Statistical significance was calculated using the Student t test. P values <0.05 were considered to indicate statistically significant differences. For MS data analysis, the Student t test values comparing protein expression differences between ANM and fibroid across patients were corrected to P values using the Benjamini-Hochberg method (22). Pearson correlation analysis was used to determine the correlation between two groups.

Results

Analysis of mutations in exon 1 and exon 2 of the MED12 gene in human uterine fibroids
Previous studies have established that approximately 70% of uterine fibroids harbor specific mutations in MED12, either missense mutations or small in-frame insertions and deletions in exon 2 (8, 23, 24). Additional mutations in exon 1 have also been reported (9)

Table 1. Antibody Dilutions, Conditions, and Providers With Catalog Numbers

Antibody   Provider                    Catalog No.   Dilution                  RRID
POSTN      Sigma-Aldrich               HPA012306     WB: 1:1000; IHC: 200      AB_1854827
pSMAD2     Merck Millipore             566415        WB: 1:1000; IHC: 1:200    AB_565143
GAPDH      Cell Signaling Technology   5174          WB: 1:5000                AB_10622025
Actin      DSHB                        JLA20         1:4000                    AB_528068

Abbreviations: DSHB, Developmental Studies Hybridoma Bank; GAPDH, glyceraldehyde 3–phosphate dehydrogenase; POSTN, periostin; pSMAD2, phosphorylated form of mothers against decapentaplegic homolog 2; RRID, Research Resource Identifier; WB, western blot.
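The H-score formula given in the Immunohistochemistry methods is a weighted sum of staining percentages, so it can be illustrated with a few lines of Python. This is only a minimal sketch: the function name `h_score` and its argument names are hypothetical, the percentages are assumed to be on a 0 to 100 scale, and the example values are invented for illustration rather than taken from the study. The authors computed the score with the Halo image analysis platform, not with this code.

```python
def h_score(pct_strong: float, pct_intermediate: float, pct_weak: float) -> float:
    """Weighted H-score from the percentage (0-100) of pixels in each staining class.

    Hypothetical helper: weights 3, 2 and 1 follow the formula in the
    Immunohistochemistry methods; everything else is an assumption.
    """
    percentages = (pct_strong, pct_intermediate, pct_weak)
    if any(p < 0 for p in percentages):
        raise ValueError("percentages must be non-negative")
    if sum(percentages) > 100 + 1e-9:
        raise ValueError("stained percentages cannot exceed 100% in total")
    return 3 * pct_strong + 2 * pct_intermediate + 1 * pct_weak


# Invented example: 10% strong, 30% intermediate, 40% weak (20% unstained).
example = h_score(10, 30, 40)
print(example)
```

Under these assumptions the score ranges from 0 (no stained pixels) to 300 (all pixels strongly stained).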

Table 2. Summary of Somatic MED12 Mutations Observed in Fibroids of Various Sizes

Mutation Type   Location per Nucleotide Change   Predicted Protein Change   No. of Mutations Out of 65 Fibroids (%)
Missense        Exon 1  c.28G>A                  p.E10K                     1 (1.5)
                Exon 1  c.296G>A                 p.E33K                     1 (1.5)
                Exon 2  c.122T>A                 p.V41Q                     1 (1.5)
                Exon 2  c.128A>C                 p.Q43P                     1 (1.5)
                Exon 2  c.130G>A                 p.G44S                     5 (7.7)
                Exon 2  c.130G>C                 p.G44R                     3 (4.6)
                Exon 2  c.130G>T                 p.G44C                     2 (3.0)
                Exon 2  c.131G>A                 p.G44D                     9 (13.8)
                Exon 2  c.131G>C                 p.G44A                     1 (1.5)
                Exon 2  c.131G>T                 p.G44V                     5 (7.7)
Deletion        Exon 2  c.100_131del             p.D34_G44del               1 (1.5)
                Exon 2  c.114_134del             p.L39_F45del               2 (3.0)
                Exon 2  c.115_135del             p.L39_F45del               3 (4.6)
                Exon 2  c.120_165del             p.N40_E55del               1 (1.5)
                Exon 2  c.121_125del             p.V41_K42del               1 (1.5)
                Exon 2  c.126_132del             p.K42_G44del               1 (1.5)
                Exon 2  c.127_143del             p.Q43_Q48del               1 (1.5)

and presented similar tumorigenic mechanism as exon 2 mutations (25, 26). To determine the genetic basis of the uterine fibroids used in our study, we performed Sanger sequencing and screened for somatic mutations in 14 Australian patients. Altogether, 39 of 65 fibroids from 14 different patients had MED12 mutations [39 of 65 (60%)] (Table 2). More specifically, 27 of 37 fibroids (72%) from 11 patients displayed missense mutations, affecting codon 44 in exon 2 of MED12. These include G44V, G44C, G44R, G44A, G44S, and G44D mutations. Another hot spot was found in codons V41Q (sample ULM 5.4) and Q43P (sample ULM 5.7) and, interestingly, these were identified in only one patient (Table 2; Supplemental Table 1). As opposed to the missense mutations, we also identified 10 tumors (27%) from four patients with deletion mutations in MED12 exon 2 (Table 2). We next sought to analyze the mutation in exon 1 and observed that only two missense mutations were detected in two patients [namely, codons E10K (sample ULM3.3) and E33K (sample ULM10.4)] (Table 2; Supplemental Table 1). As expected, we did not detect any mutations in MED12 in any of the ANM samples (Supplemental Table 1). The remaining 26 of 65 fibroids from 14 patients were found to be MED12 mutation negative (Supplemental Table 1). Our genetic analysis highlighted that multiple fibroids within the same patient can harbor diverse genetic profiles (Supplemental Table 1). For example, eight variously sized tumors from one of the patients harbored five different hot spots, all of which affected glycine at codon 44 in exon 2 of MED12 (Fig. 1). This analysis also confirmed that mutations in exon 2 of MED12 are common in uterine fibroids in Australian women. On the basis of the mutation screening of 14 patients, we selected three patients (n = 12 tissue samples: 9 fibroid and 3 ANM samples) harboring both MED12 mutation–positive and mutation–negative tumors and performed proteomic analysis.

Proteome analysis reveals changes in ECM expression profile in genetically annotated uterine fibroids
To determine whether fibroid size and MED12 mutation status influence fibroid protein expression, we compared the protein composition of genetically annotated fibroids from each size category with ANM control tissue samples. We first performed proteomic analysis of three fibroids (small, medium, and large) and one ANM tissue sample from each of the three patients using the 4plex iTRAQ labeling system (n = 3 normal and n = 9 fibroids) (Fig. 2). This system enabled comparisons between different-size fibroids and ANM samples in the same LC/MS run, thereby minimizing experimental variation. From the three patients, a combined total of 1061, 1081, and 1116 upregulated proteins in small, medium, and large fibroids, respectively, compared with ANM, were identified using the iTRAQ experiment (Fig. 3D). In contrast, a total of 466, 659, and 592 downregulated proteins in small, medium, and large fibroids, respectively, were identified (Supplemental Fig. 2). Quantitative LC-MS/MS results from these three patients were then compared using Venn diagrams to identify the number of common and differentially expressed proteins within each fibroid size category. A representative Venn diagram for upregulated proteins in small fibroids is shown in Fig. 3D. The complete set of Venn diagrams, classified into downregulated proteins, is provided in

Figure 1. Somatic MED12 exon 2 mutations in one patient with uterine leiomyomata showing different genomic profiles. Sequence chromatograms of various size (large, medium, small) fibroids and corresponding ANM tissues are shown. Codon 44 in MED12 exon 2 is highlighted by the horizontal bars above the traces. Mutated bases are indicated by arrows.

Supplemental Fig. 2. We set our predefined criteria of q < 0.05 with relative expression levels at least >1.5-fold or <0.66-fold compared with ANM for upregulated and downregulated proteins, respectively; proteins represented by at least two unique iTRAQ-labeled peptides were included in the analysis. The P values were then calculated by Student t test comparing the protein expression differences between ANM and fibroid across the three patients and adjusted for multiple testing by the Benjamini-Hochberg method (22). (P < 0.05 is considered to indicate statistical significance after the Benjamini-Hochberg correction for multiple testing) (Fig. 3A–3C and Fig. 4A–4C; Supplemental Table 2).

On the basis of our quantitative results, we extracted the list of proteins that were up- and downregulated in fibroids and performed pathway analysis using STRING, version 10.5 (a protein-protein interaction network database), to identify the associated protein-protein interaction pathways. These analyses highlighted several protein networks, including extracellular matrix receptor and focal adhesion interactions, and PI3K-Akt signaling (Supplemental Fig. 3).

Analysis of small fibroids vs ANM control revealed that 18 proteins were upregulated in all patients (Fig. 3A and 3D). Of these 18 proteins, 10 ECM proteins, including collagen type II α 1 (COL2A1), collagen type III α 1 (COL3A1), collagen type VII α 1 (COL7A1), matrix metalloproteinase 2 (MMP2), POSTN, serpin H1 (SERPINH1), secreted protein acidic and rich in cysteine (SPARC)–related modular calcium-binding protein 2 (SMOC2), transforming growth factor-β–induced protein ig-h3 (TGFBI), tenascin (TNC), and versican core protein (VCAN) were identified. In the analysis of medium fibroid vs ANM tissue, 15 common proteins were detected in the three patients (Fig. 3B and 3D). Thirteen of these proteins (asporin, collagen type XII α 1 (COL12A1), COL2A1, COL3A1, collagen type V α 2 (COL5A2), COL7A1, fibromodulin (FMOD), POSTN, α-1-antitrypsin (SERPINA1), SMOC2, SPARC-like 1 (SPARCL1), TGFBI, and TNC) were of ECM origin. In addition, we also detected 22 upregulated proteins present in large fibroids that were commonly shared among patients (Fig. 3C and 3D). Of these proteins, 19 were ECM proteins: COL12A1,

a-2 (LAMA2),lumican (LUM), osteo- glycin (OGN), POSTN, SERPINA1, SMOC2, SPARCL1, TGFBI, TNC, and VCAN. We further examined proteins that were commonly downregulated in each of the different-sized fibroids (27 Downloaded from https://academic.oup.com/endo/article-abstract/159/2/1106/4736307 by University of Newcastle user on 25 February 2019 proteins in small fibroids, 25 proteins in medium fibroids, and 32 proteins in large fibroids) (Fig. 4A–4C; Supple- mental Fig. 2). The ECM-related pro- teins that were downregulated across the different-sized fibroids were annexin A1 (ANXA1), annexin A2 (ANXA2), col- lagen type XIV a 1 (COL14A1), collagen type XVIII a 1 (COL18A1), laminin subunit a 5 (LAMA5), laminin sub- unit b 2 (LAMB2), protein S100-A6 (S100A6), serpin peptidase inhibitor clade C (SERPINC1), and tubulointer- stitial nephritis antigen-like (TINAGL1) in small fibroids; ANXA1, collagen type XV a 1 (COL15A1), fibrillin-1 (FBN1), LAMA5, LAMB2, and TINAGL1 in medium fibroids; and ANXA1, ANXA2, LAMA5, LAMB2, S100A6, and TINAGL1 in large fibroids. The pro- teomic data (accession number, gene ID, number of unique peptides, cov- erage, fold change in abundance of fi- broid compared with ANM, q values, and P, 0.05 adjusted by the Benjamini- Hochberg method) of the commonly identified proteins found in small, me- dium, and large fibroids from the three patients are shown in Supplemental Table 2.

POSTN is a significantly upregulated ECM protein during early stages of uterine leiomyogenesis From the list of ECM proteins identified as upregulated in fibroids, COL2A1, COL3A1, COL7A1, POSTN, SMOC2, TGFBI, and TNC were the Figure 2. Schematic representation of the proteomics workflow. Twelve samples (nine only ones detected across the fibroid size fibroids and three ANM) from three patients were prepared in parallel, digested, and labeled range. POSTN was one of the signifi- with 4plex iTRAQ and mixed. To reduce sample complexity, pooled peptides were subjected to HILIC fractionation before LC-MS analysis. Experimental procedures are described in the cantly upregulated ECM proteins ob- Materials and Methods section. m/z 114–117 are the iTRAQ reporter ions. served in small fibroids, suggesting that it is upregulated at the earliest stages of collagen, type I a 1 (COL1A1), collagen type I a 2 uterine leiomyogenesis. One small fibroid containing (COL1A2), COL2A1, COL3A1, collagen type V a 1 MED12 mutation–positive or mutation–negative and (COL5A1), COL5A2, COL7A1, FMOD, laminin subunit corresponding ANM tissue (total n = 24) from each of the doi: 10.1210/en.2017-03018 https://academic.oup.com/endo 1113 Downloaded from https://academic.oup.com/endo/article-abstract/159/2/1106/4736307 by University of Newcastle user on 25 February 2019
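The selection criteria stated in this section (q < 0.05, more than 1.5-fold or less than 0.66-fold relative to ANM, at least two unique peptides, with Benjamini-Hochberg adjustment) can be sketched in Python as follows. This is a minimal illustration and not the authors' analysis pipeline: the function names, the example P values, and the "not called" label are assumptions introduced here.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted values, in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def classify(fold_change: float, q_value: float, unique_peptides: int) -> str:
    """Apply the thresholds quoted in the text (labels are hypothetical)."""
    if unique_peptides < 2 or q_value >= 0.05:
        return "not called"
    if fold_change > 1.5:
        return "up"
    if fold_change < 0.66:
        return "down"
    return "not called"


if __name__ == "__main__":
    # Hypothetical P values for four proteins, for illustration only.
    raw_p = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(raw_p))
    print(classify(fold_change=3.0, q_value=0.02, unique_peptides=4))
```

The adjustment is done over the whole list at once because each adjusted value depends on the rank of its P value among all tests, not on the P value alone.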

Figure 3. Upregulated proteins in uterine fibroids. Heat map represents the protein expression patterns in the various sizes, (A) small, (B) medium, and (C) large fibroids, against the corresponding ANM tissues, which were commonly identified as shown in Table 2. Ratio is mapped from red (increase) to blue (decrease) or black (no change); see color key inset. ECM proteins are highlighted in green and POSTN is indicated in a box. *P < 0.05, adjusted by the Benjamini-Hochberg method. (D) Venn diagram illustrating the number of shared and uniquely differentially upregulated (>1.5-fold) expressed proteins identified by 4plex iTRAQ in the fibroid (small, medium, and large) vs ANM. +, MED12 mutation–positive; −, MED12 mutation–negative.

Figure 4. Differentially downregulated (<0.66-fold) proteins identified by 4plex iTRAQ in uterine fibroids. Heat map depicts the protein expression patterns in fibroids of different sizes, (A) small, (B) medium, and (C) large fibroids, against the corresponding ANM tissues. Ratio is mapped from red (increase) to blue (decrease) or black (no change); see color key inset. ECM proteins are highlighted in green. *P < 0.05, adjusted by the Benjamini-Hochberg method. +, MED12 mutation–positive; −, MED12 mutation–negative.

eight patients were selected and analyzed for POSTN expression by western blotting and IHC. Western blot results revealed higher levels of POSTN in small fibroids than in corresponding ANM in the eight patients regardless of MED12 mutations (Fig. 5A) [eight ANM vs eight fibroids with MED12-positive mutation (P = 0.02); eight ANM vs eight fibroids with MED12-negative mutation (P = 0.04)]. Phospho-SMAD2, which indicates the phosphorylated form of mothers against decapentaplegic homolog 2 (pSMAD2), is one of the key components for transforming growth factor-β (TGF-β) signaling, which is an upstream regulator of POSTN (27). We next monitored the expression level of pSMAD2 in these patients to determine whether any correlation existed with POSTN. We observed greater levels of pSMAD2 in small fibroids compared with corresponding ANM in these patients irrespective of MED12 mutations (Fig. 5A). We observed a significant linear correlation between the expression of pSMAD2 and POSTN (r = 0.48; P = 0.0068) (Fig. 5B). IHC of small fibroids with MED12-positive mutations also exhibited upregulation of POSTN compared with ANM, supporting western blot results (Fig. 5B). Quantitative analysis of IHC POSTN expression using H-Score (Fig. 5B) confirmed higher staining intensity and, therefore, POSTN upregulation in fibroids compared with ANM (n = 15 fibroids, n = 4 ANM; P = 0.00083).

Discussion

Uterine fibroids affect up to 70% of women during their reproductive years and are an important source of gynecological and obstetrical problems (1, 2). Despite the major health care burden posed by fibroids, very little is known about their etiology and pathogenesis. Treatment strategies for these benign tumors are limited to invasive surgical procedures and hormonal therapies. Therefore, identification of novel drug targets and improved treatment strategies are required. Our research data showed

Figure 5. Differential expression of POSTN in human uterine fibroids. (A) Greater upregulation of POSTN and pSMAD2 in fibroid than ANM. Western blot showing the overexpression of POSTN and pSMAD2 in small MED12 mutation–positive or mutation–negative fibroids compared with ANM controls (n = 8 patients; *P < 0.05). Levels of β-actin and glyceraldehyde 3-phosphate dehydrogenase (GAPDH) were monitored as a quantitative control in western blot analysis. (B) IHC detection of POSTN showing elevated levels of staining (in brown) in fibroid (III, IV) compared with ANM (I, II; n = 4 normal and n = 15 fibroids). Correlation analysis of pSMAD2 and POSTN expression levels in human fibroid tissues (n = 30). Scale bars: 100 μm. +, MED12 mutation–positive; −, MED12 mutation–negative; wt, wild-type.
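As a rough consistency check on the correlation reported for Fig. 5B (r = 0.48, n = 30), the sketch below converts a correlation coefficient into a two-sided P value. It assumes a Pearson correlation tested with a Student t statistic on n − 2 degrees of freedom, which the text does not state explicitly. The function names are illustrative, and the tail area is obtained by simple numerical integration rather than a statistics library.

```python
import math


def t_statistic_from_r(r: float, n: int) -> float:
    """t statistic for testing a Pearson r against zero (n - 2 df)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def student_t_pdf(x: float, df: int) -> float:
    """Probability density of Student's t distribution."""
    coeff = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coeff * (1.0 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-sided P value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = student_t_pdf(0.0, df) + student_t_pdf(upper, df)
    for k in range(1, steps):
        weight = 4 if k % 2 else 2  # Simpson weights: odd -> 4, even -> 2
        total += weight * student_t_pdf(k * h, df)
    central_half = total * h / 3  # area between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central_half)


if __name__ == "__main__":
    r, n = 0.48, 30  # values quoted in the text and the Figure 5 legend
    t = t_statistic_from_r(r, n)
    print(round(t, 3), two_sided_p(t, n - 2))
```

With r = 0.48 and n = 30 this gives a t statistic of about 2.9 and a P value of roughly 0.007, the same order as the reported P = 0.0068. A small difference is expected because r is quoted to two decimals.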
that fibroid clusters taken from the same patient possess diverse genomic profiles. In the literature, mutations in exon 2 of the MED12 gene are the most common genetic anomalies in fibroids (8), and our genomic data supported this observation. Our study determined the MED12 status of fibroids in the Australian population. We identified 27 tumors from 11 patients that displayed missense mutations affecting codon 44 in exon 2 of MED12. The glycine residue at codon 44 appears to be essential for normal MED12 function as all six base substitutions lead to changes in this amino acid.

Excessive ECM deposition is a major hallmark of this disease (5), which contributes to fibroid growth and bulk-type symptoms. Despite the significance of ECM in fibroids, there has been limited research on ECM characterization and expression in fibroids. Our study investigated whether MED12 status and fibroid size influence ECM composition and expression patterns. After genetic analysis, we generated heat maps showing protein upregulation and downregulation in fibroids compared with ANM control tissues. Interestingly, we observed that, overall, the set of proteins (ECM-related and otherwise) that were upregulated or downregulated were the same in all three patients, irrespective of MED12 mutation status and fibroid size.

Because a copious amount of ECM is characteristic of fibroids, we next focused on the upregulated ECM proteins. Comparison of expression levels revealed that POSTN, an ECM protein, was one of the most upregulated proteins, with an average threefold higher expression in fibroids than in ANM. Because fibroids grow as they progress, small fibroids indicate early stages of disease. Therefore, because POSTN had upregulated expression levels in small fibroids, this ECM protein may have potential as a novel drug target to hinder the progression of uterine leiomyogenesis. We further validated the expression of POSTN in our fibroid samples using western blot and IHC.

Our proteomic and western blot analysis demonstrated that in all cases of MED12 mutation–positive and mutation–negative fibroids, POSTN was upregulated compared with ANM. In contrast, in the absence of mutations in MED12, POSTN expression was not consistent (Fig. 5A). In the case of MED12-negative fibroids, the expression of POSTN may be affected by the presence of other genetic mutations. Consequently, the effect of other genetic mutations on fibroid ECM protein expression requires further investigation. Our study focused on MED12 mutations because in the literature these are the most commonly detected mutations in fibroids.

POSTN is a 90-kDa secreted ECM protein with a multidomain structure that contains an amino-terminal EMI domain, a tandem repeat of four fasciclin (FAS1) domains, and a carboxyl-terminal domain (28–30). Each domain interacts with specific ECM proteins and cell-surface integrins to promote the assembly of extracellular architectures (31–34), which govern the biomechanical properties of connective tissues. Previous studies have shown that POSTN interacts with collagens, laminins, and tenascins (31–34). In fibroids of all sizes (small, medium, and large), we observed upregulation of POSTN, along with upregulation of collagens, laminins, and tenascins (Fig. 3A–3C). This suggests that these proteins are important contributors to the excessive ECM build-up that leads to fibroid expansion. Further studies are required to target the binding sites on the multidomain structure of POSTN to investigate the relationship between POSTN and the interacting proteins.

POSTN plays a central role in normal tissue homeostasis and disease development (34). Previous studies have demonstrated that high levels of POSTN expression are associated with a variety of cancers, including head and neck (35, 36), oral (37), lung (38), breast (39), ovarian (40), colon (41), pancreatic (42), and liver (43, 44) cancer. POSTN participates in many biological processes involved in cancer, including cell adhesion, invasion, metastasis, and tumor angiogenesis (36, 37, 41, 45). Our study compared the expression level of POSTN in fibroids relative to ANM. POSTN is a candidate for further functional and clinical investigations given its abundance in uterine fibroids. The molecular mechanism of POSTN is still unclear, but a recent study revealed that it interacts with integrins and activates focal adhesion kinase and PI3K-Akt–mediated signaling pathways, promoting tumor angiogenesis, invasion, and metastasis (46). Our STRING analysis also identified the involvement of these signaling networks in fibroids (Supplemental Fig. 3).

Our proteomic analysis revealed that TGF-β was also upregulated in fibroids of all sizes compared with ANM (Fig. 3A–3C). Deletion of the Periostin (Postn) gene in a mouse model of muscular dystrophy altered TGF-β signaling, resulting in enhanced tissue regeneration and reduced levels of fibrosis, thereby providing evidence for interaction between POSTN and TGF-β (47). Therefore, we can also speculate that POSTN interacts with the TGF-β signaling pathway to promote ECM formation in leiomyogenesis. Further research is warranted to define the functional role of POSTN and other ECM-related proteins in fibroid initiation and progression.

In summary, our study defined the global protein expression patterns of fibroids vs ANM in human patients. Furthermore, we used genomic and proteomic analysis to study the relationship among exon 2 MED12 mutation status, fibroid size, and ECM protein expression. This analysis revealed that the same group of ECM proteins was upregulated or downregulated in all patients, despite variations in fibroid size and the MED12 gene. Furthermore, we validated the expression level of POSTN because it was one of the significantly upregulated ECM proteins identified in small fibroids and these fibroids indicate early stages of leiomyogenesis. Because excessive ECM is a prominent feature of fibroids, targeting upregulated ECM proteins is a rational approach for overcoming this disease.

Investigation of the role of the ECM in cancer and other abnormalities has illustrated that the ECM is an active participant in disease progression and is responsive to surrounding cell types and signaling molecules. Further studies are essential for understanding the mechanisms and mediators responsible for the overproduction of specific ECM components. This understanding will provide an opportunity to develop intervening strategies that will curb the production of these ECM proteins, thereby reducing fibroid size and removing the associated disease burden.

Acknowledgments

The authors thank Dr. Ben Crossett and Dr. Trisha Al Mazi for help with HILIC and Nathan Smith for help with LC-MS/MS.

Financial Support: Work in the Tanwar laboratory was in part supported by funding from the National Health and Medical Research Council (Grant APP1081461 to P.S.T.), the Australian Research Council, the Cancer Institute New South Wales, and the John Hunter Hospital Charitable Trust. Y.B., P.B., and P.B.N. are recipients of the University of Newcastle Postgraduate Research Fellowship. M.D.D. is supported by a Cancer Institute New South Wales, Australia Early Career Fellowship.

Author Contributions: M.F.B.J. and P.S.T. designed the research. M.F.B.J., Y.-A. K., M.K., Y.B., P.B., P.B.N, and D.A.S.-B. performed the research. M.F.B.J., H.H., M.A.B., M.D.D., R.J.S., P.N., and P.S.T. analyzed the data. M.F.B.J., Y.B., and P.S.T. wrote the paper. P.S.T. supervised the study, provided financial support, and edited and had final approval of the manuscript. All authors reviewed, approved, and commented on the manuscript.

Correspondence: Pradeep S. Tanwar, PhD, Life Sciences Building 236, University Drive, University of Newcastle, Callaghan, 2308 New South Wales, Australia. E-mail: pradeep.tanwar@newcastle.edu.au.

Disclosure Summary: The authors have nothing to disclose.

References

1. Cramer SF, Patel A. The frequency of uterine leiomyomas. Am J Clin Pathol. 1990;94(4):435–438.
2. Baird DD, Dunson DB, Hill MC, Cousins D, Schectman JM. High cumulative incidence of uterine leiomyoma in black and white women: ultrasound evidence. Am J Obstet Gynecol. 2003;188(1):100–107.
3. Arslan AA, Gold LI, Mittal K, Suen TC, Belitskaya-Levy I, Tang MS, Toniolo P. Gene expression studies provide clues to the pathogenesis of uterine leiomyoma: new evidence and a systematic review. Hum Reprod. 2005;20(4):852–863.
4. Bulun SE. Uterine fibroids. N Engl J Med. 2013;369(14):1344–1355.
5. Stewart EA, Laughlin-Tommaso SK, Catherino WH, Lalitkumar S, Gupta D, Vollenhoven B. Uterine fibroids. Nat Rev Dis Primers. 2016;2:16043.
6. Mehine M, Kaasinen E, Mäkinen N, Katainen R, Kämpjärvi K, Pitkänen E, Heinonen HR, Bützow R, Kilpivaara O, Kuosmanen A, Ristolainen H, Gentile M, Sjöberg J, Vahteristo P, Aaltonen LA. Characterization of uterine leiomyomas by whole-genome sequencing. N Engl J Med. 2013;369(1):43–53.
7. Mehine M, Kaasinen E, Heinonen H-R, Mäkinen N, Kämpjärvi K, Sarvilinna N, Aavikko M, Vähärautio A, Pasanen A, Bützow R, Heikinheimo O, Sjöberg J, Pitkänen E, Vahteristo P, Aaltonen LA. Integrated data analysis reveals uterine leiomyoma subtypes with distinct driver pathways and biomarkers. Proc Natl Acad Sci USA. 2016;113(5):1315–1320.
8. Mäkinen N, Mehine M, Tolvanen J, Kaasinen E, Li Y, Lehtonen HJ, Gentile M, Yan J, Enge M, Taipale M, Aavikko M, Katainen R, Virolainen E, Böhling T, Koski TA, Launonen V, Sjöberg J, Taipale J, Vahteristo P, Aaltonen LA. MED12, the mediator complex subunit 12 gene, is mutated at high frequency in uterine leiomyomas. Science. 2011;334(6053):252–255.
9. Kämpjärvi K, Park MJ, Mehine M, Kim NH, Clark AD, Bützow R, Böhling T, Böhm J, Mecklin JP, Järvinen H, Tomlinson IP, van der Spuy ZM, Sjöberg J, Boyer TG, Vahteristo P. Mutations in Exon 1 highlight the role of MED12 in uterine leiomyomas. Hum Mutat. 2014;35(9):1136–1141.
10. Taatjes DJ. The human Mediator complex: a versatile, genome-wide regulator of transcription. Trends Biochem Sci. 2010;35(6):315–322.
11. Kämpjärvi K, Mäkinen N, Mehine M, Välipakka S, Uimari O, Pitkänen E, Heinonen HR, Heikkinen T, Tolvanen J, Ahtikoski A, Frizzell N, Sarvilinna N, Sjöberg J, Bützow R, Aaltonen LA, Vahteristo P. MED12 mutations and FH inactivation are mutually exclusive in uterine leiomyomas. Br J Cancer. 2016;114(12):1405–1411.
12. Mittal P, Shin YH, Yatsenko SA, Castro CA, Surti U, Rajkovic A. Med12 gain-of-function mutation causes leiomyomas and genomic instability. J Clin Invest. 2015;125(8):3280–3284.
13. Moravek MB, Yin P, Ono M, Coon JS V, Dyson MT, Navarro A, Marsh EE, Chakravarti D, Kim JJ, Wei J-J, Bulun SE. Ovarian steroids, stem cells and uterine leiomyoma: therapeutic implications. Hum Reprod Update. 2015;21(1):1–12.
14. Peddada SD, Laughlin SK, Miner K, Guyon JP, Haneke K, Vahdat HL, Semelka RC, Kowalik A, Armao D, Davis B, Baird DD. Growth of uterine leiomyomata among premenopausal black and white women. Proc Natl Acad Sci USA. 2008;105(50):19887–19892.
15. Stewart EA, Shuster LT, Rocca WA. Reassessing hysterectomy. Minn Med. 2012;95(3):36–39.
16. Mammoto T, Ingber DE. Mechanical control of tissue and organ development. Development. 2010;137(9):1407–1420.
17. Bonnans C, Chou J, Werb Z. Remodelling the extracellular matrix in development and disease. Nat Rev Mol Cell Biol. 2014;15(12):786–801.
18. Pickup MW, Mouw JK, Weaver VM. The extracellular matrix modulates the hallmarks of cancer. EMBO Rep. 2014;15(12):1243–1253.
19. Malik M, Norian J, McCarthy-Keith D, Britten J, Catherino WH. Why leiomyomas are called fibroids: the central role of extracellular matrix in symptomatic women. Semin Reprod Med. 2010;28(3):169–179.
20. Dun MD, Chalkley RJ, Faulkner S, Keene S, Avery-Kiejda KA, Scott RJ, Falkenby LG, Cairns MJ, Larsen MR, Bradshaw RA, Hondermarck H. Proteotranscriptomic profiling of 231-BR breast cancer cells: identification of potential biomarkers and therapeutic targets for brain metastasis. Mol Cell Proteomics. 2015;14(9):2316–2330.
21. Bajwa P, Nielsen S, Lombard JM, Rassam L, Nahar P, Rueda BR, Wilkinson JE, Miller RA, Tanwar PS. Overactive mTOR signaling leads to endometrial hyperplasia in aged women and mice. Oncotarget. 2017;8(5):7265–7275.
22. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B. 1995;57:289–300.
23. Markowski DN, Bartnitzke S, Löning T, Drieschner N, Helmke BM, Bullerdiek J. MED12 mutations in uterine fibroids–their relationship to cytogenetic subgroups. Int J Cancer. 2012;131(7):1528–1536.
24. McGuire MM, Yatsenko A, Hoffner L, Jones M, Surti U, Rajkovic A. Whole exome sequencing in a random sample of North American women with leiomyomas identifies MED12 mutations in majority of uterine leiomyomas. PLoS One. 2012;7(3):e33251.
25. Turunen M, Spaeth JM, Keskitalo S, Park MJ, Kivioja T, Clark AD, Mäkinen N, Gao F, Palin K, Nurkkala H, Vähärautio A, Aavikko M, Kämpjärvi K, Vahteristo P, Kim CA, Aaltonen LA, Varjosalo M, Taipale J, Boyer TG. Uterine leiomyoma-linked MED12 mutations disrupt mediator-associated CDK activity. Cell Reports. 2014;7(3):654–660.
26. Kämpjärvi K, Järvinen TM, Heikkinen T, Ruppert AS, Senter L, Hoag KW, Dufva O, Kontro M, Rassenti L, Hertlein E, Kipps TJ, Porkka K, Byrd JC, de la Chapelle A, Vahteristo P. Somatic MED12 mutations are associated with poor prognosis markers in chronic lymphocytic leukemia. Oncotarget. 2015;6(3):1884–1888.
27. Kudo A. Introductory review: periostin-gene and protein structure. Cell Mol Life Sci. 2017;74(23):4259–4268.
28. Horiuchi K, Amizuka N, Takeshita S, Takamatsu H, Katsuura M, Ozawa H, Toyama Y, Bonewald LF, Kudo A. Identification and characterization of a novel protein, periostin, with restricted expression to periosteum and periodontal ligament and increased expression by transforming growth factor beta. J Bone Min Res. 1999;14(7):1239–1249.
29. Kudo A. Periostin in fibrillogenesis for tissue regeneration: periostin actions inside and outside the cell. Cell Mol Life Sci. 2011;68(19):3201–3207.
30. Sugiura T, Takamatsu H, Kudo A, Amann E. Expression and characterization of murine osteoblast-specific factor 2 (OSF-2) in a baculovirus expression system. Protein Expr Purif. 1995;6(3):305–311.
31. Kii I, Nishiyama T, Li M, Matsumoto K, Saito M, Amizuka N, Kudo A. Incorporation of tenascin-C into the extracellular matrix by periostin underlies an extracellular meshwork architecture. J Biol Chem. 2010;285(3):2028–2039.
32. Maruhashi T, Kii I, Saito M, Kudo A. Interaction between periostin and BMP-1 promotes proteolytic activation of lysyl oxidase. J Biol Chem. 2010;285(17):13294–13303.
33. Snider P, Hinton RB, Moreno-Rodriguez RA, Wang J, Rogers R, Lindsley A, Li F, Ingram DA, Menick D, Field L, Firulli AB, Molkentin JD, Markwald R, Conway SJ. Periostin is required for maturation and extracellular matrix stabilization of non-cardiomyocyte lineages of the heart. Circ Res. 2008;102(7):752–760.
34. Kii I, Ito H. Periostin and its interacting proteins in the construction of extracellular architectures. Cell Mol Life Sci. 2017;74(23):4269–4277.
35. Chang Y, Lee TC, Li JC, Lai TL, Chua HH, Chen CL, Doong SL, Chou CK, Sheen TS, Tsai CH. Differential expression of osteoblast-specific factor 2 and polymeric immunoglobulin receptor genes in nasopharyngeal carcinoma. Head Neck. 2005;27(10):873–882.
36. Kudo Y, Ogawa I, Kitajima S, Kitagawa M, Kawai H, Gaffney PM, Miyauchi M, Takata T. Periostin promotes invasion and anchorage-independent growth in the metastatic process of head and neck cancer. Cancer Res. 2006;66(14):6928–6935.
37. Siriwardena BS, Kudo Y, Ogawa I, Kitagawa M, Kitajima S, Hatano H, Tilakaratne WM, Miyauchi M, Takata T. Periostin is frequently overexpressed and enhances invasion and angiogenesis in oral cancer. Br J Cancer. 2006;95(10):1396–1403.
38. Ouyang G, Liu M, Ruan K, Song G, Mao Y, Bao S. Upregulated expression of periostin by hypoxia in non-small-cell lung cancer cells promotes cell survival via the Akt/PKB pathway. Cancer Lett. 2009;281(2):213–219.
39. Shao R, Bao S, Bai X, Blanchette C, Anderson RM, Dang T, Gishizky ML, Marks JR, Wang XF. Acquired expression of periostin by human breast cancers promotes tumor angiogenesis through up-regulation of vascular endothelial growth factor receptor 2 expression. Mol Cell Biol. 2004;24(9):3992–4003.
40. Ismail RS, Baldwin RL, Fang J, Browning D, Karlan BY, Gasson JC, Chang DD. Differential gene expression between normal and tumor-derived ovarian epithelial cells. Cancer Res. 2000;60(23):6744–6749.
41. Bao S, Ouyang G, Bai X, Huang Z, Ma C, Liu M, Shao R, Anderson RM, Rich JN, Wang XF. Periostin potently promotes metastatic growth of colon cancer by augmenting cell survival via the Akt/PKB pathway. Cancer Cell. 2004;5(4):329–339.
42. Baril P, Gangeswaran R, Mahon PC, Caulee K, Kocher HM, Harada T, Zhu M, Kalthoff H, Crnogorac-Jurcevic T, Lemoine NR. Periostin promotes invasiveness and resistance of pancreatic cancer cells to hypoxia-induced cell death: role of the beta4 integrin and the PI3k pathway. Oncogene. 2007;26(14):2082–2094.
43. Lv Y, Wang W, Jia WD, Sun QK, Huang M, Zhou HC, Xia HH, Liu WB, Chen H, Sun SN, Xu GL. High preoparative levels of serum periostin are associated with poor prognosis in patients with hepatocellular carcinoma after hepatectomy. Eur J Surg Oncol. 2013;39(10):1129–1135.
44. Lv Y, Wang W, Jia WD, Sun QK, Li JS, Ma JL, Liu WB, Zhou HC, Ge YS, Yu JH, Xia HH, Xu GL. High-level expression of periostin is closely related to metastatic potential and poor prognosis of hepatocellular carcinoma. Med Oncol. 2013;30(1):385.
45. Gillan L, Matei D, Fishman DA, Gerbin CS, Karlan BY, Chang DD. Periostin secreted by epithelial ovarian carcinoma is a ligand for αVβ3 and αVβ5 integrins and promotes cell motility. Cancer Res. 2002;62:5358–5364.
46. Utispan K, Sonongbua J, Thuwajit P, Chau-In S, Pairojkul C, Wongkham S, Thuwajit C. Periostin activates integrin α5β1 through a PI3K/AKT-dependent pathway in invasion of cholangiocarcinoma. Int J Oncol. 2012;41(3):1110–1118.
47. Lorts A, Schwanekamp JA, Baudino TA, McNally EM, Molkentin JD. Deletion of periostin reduces muscular dystrophy and fibrosis in mice by modulating the transforming growth factor-β pathway. Proc Natl Acad Sci USA. 2012;109(27):10978–10983.

OPEN Leukemia (2017), 1–13 www.nature.com/leu

ORIGINAL ARTICLE

Mutant JAK3 phosphoproteomic profiling predicts synergism between JAK3 inhibitors and MEK/BCL2 inhibitors for the treatment of T-cell acute lymphoblastic leukemia

S Degryse1,2,8, CE de Bock1,2,8, S Demeyer1,2, I Govaerts1,2, S Bornschein1,2, D Verbeke1,2, K Jacobs1,2, S Binos3, DA Skerrett-Byrne4,5, HC Murray4,5, NM Verrills4,5, P Van Vlierberghe6,7, J Cools1,2 and MD Dun4,5

Mutations in the interleukin-7 receptor (IL7R) or the Janus kinase 3 (JAK3) kinase occur frequently in T-cell acute lymphoblastic leukemia (T-ALL) and both are able to drive cellular transformation and the development of T-ALL in mouse models. However, the signal transduction pathways downstream of JAK3 mutations remain poorly characterized. Here we describe the phosphoproteome downstream of the JAK3(L857Q)/(M511I) activating mutations in transformed Ba/F3 lymphocyte cells. Signaling pathways regulated by JAK3 mutants were assessed following acute inhibition of JAK1/JAK3 using the JAK kinase inhibitors ruxolitinib or tofacitinib. Comprehensive network interrogation using the phosphoproteomic signatures identified significant changes in pathways regulating cell cycle, translation initiation, mitogen-activated protein kinase and phosphatidylinositol-4,5-bisphosphate 3-kinase (PI3K)/AKT signaling, RNA metabolism, as well as epigenetic and apoptotic processes. Key regulatory proteins within pathways that showed altered phosphorylation following JAK inhibition were targeted using selumetinib and trametinib (MEK), buparlisib (PI3K) and ABT-199 (BCL2), and found to be synergistic in combination with JAK kinase inhibitors in primary T-ALL samples harboring JAK3 mutations. These data provide the first detailed molecular characterization of the downstream signaling pathways regulated by JAK3 mutations and provide further understanding into the oncogenic processes regulated by constitutive kinase activation aiding in the development of improved combinatorial treatment regimens.

Leukemia advance online publication, 22 September 2017; doi:10.1038/leu.2017.276

INTRODUCTION

T-cell acute lymphoblastic leukemia (T-ALL) is an aggressive leukemia that is common in children and adolescents. Long-term childhood ALL survival rates have improved significantly following refined chemotherapeutic treatment regimens; however, these are associated with substantial acute and long-term side effects.1 Recently, Janus kinase 3 (JAK3) mutations were identified in 16% of T-ALL cases2 and it is now known to be sufficient to drive T-ALL development in mice using an in vivo bone marrow transplant model.3 JAK3 is a non-receptor tyrosine kinase and functions in class I cytokine receptor complexes through binding of the common γ chain (IL2RG). JAK3 binds the IL2RG that forms heterodimers with other receptors such as the IL7Rα chain, which in turn binds JAK1. Under normal conditions, ligand binding to the receptor complex activates cytokine signaling through cooperative JAK1/JAK3 phosphorylation. In this active conformation, they phosphorylate downstream targets including STAT5. In addition to canonical STAT5 activation, increasing evidence shows that the interleukins (ILs) can activate additional signaling pathways in T-cells. For example, a phosphoproteomic approach following 5 min stimulation with IL2 and IL15 found recruitment and phosphorylation of the SHC–GRB2–SOS complex at the cytokine receptor which then activates the canonical mitogen-activated protein kinase (MAPK) pathway.4 Likewise, very early work found that IL7 stimulation leads to JAK3-mediated association and phosphorylation of the p85 subunit of the PI3-kinase.5

The recent identification of activating JAK3 mutations in T-ALL cases6–8 shows promising therapeutic potential. Indeed, small molecules targeting JAK family members are in development or are already in use for the treatment of several diseases which could be repurposed for T-ALL.9 However, detailed functional analysis of the different JAK3 mutations has found these have different dependencies on members of the IL7R complex for their ability to cause cellular transformation.3,10 For example the JAK3(M511I) mutation requires the presence of both JAK1 and the IL2RG for cellular transformation, whereas the JAK3(L857Q) can also signal independently of the JAK1/IL2RG complex.3,10 Therefore, the pathways activated downstream of different JAK3 mutations, in addition to canonical STAT5 activation, may not only differ from wild-type cytokine activation of JAK3, but also between the different JAK3 mutations themselves. Hence, the detailed molecular characterization of the oncogenic JAK3 pathways based on the exact mutation present will aid in the development of improved treatment strategies.

Accordingly, we have used quantitative assessment of the phosphorylation status of proteins downstream of mutant JAK3(L857Q) and JAK3(M511I) using transformed Ba/F3 cells treated with the JAK1/JAK3-selective inhibitors ruxolitinib or tofacitinib, or a vehicle control. Our data have mapped associations between mutant JAK3 signaling and multiple components of the phosphatidylinositol-4,5-bisphosphate 3-kinase (PI3K)/AKT and MAPK pathways, as well as to critical components of the cell cycle machinery, anti-apoptotic constituents, RNA metabolism and epigenetic regulators.

1VIB Center for Cancer Biology, Leuven, Belgium; 2KU Leuven Center for Human Genetics, Leuven, Belgium; 3Thermo Fisher Scientific, Scoresby, Victoria, Australia; 4Faculty of Health and Medicine, University of Newcastle, Callaghan, New South Wales, Australia; 5Cancer Research Program, School of Biomedical Sciences and Pharmacy, Hunter Medical Research Institute, University of Newcastle, New Lambton Heights, New South Wales, Australia; 6Department of Pediatrics and Genetics, Center for Medical Genetics, Ghent University, Ghent, Belgium and 7Cancer Research Institute Ghent, Ghent, Belgium. Correspondence: Professor J Cools, VIB Center for Cancer Biology, Herestraat 49, Box 912 Leuven B-3000, Belgium or Dr MD Dun, School of Biomedical Sciences and Pharmacy, Hunter Medical Research Institute, University of Newcastle, University Drive, Callaghan, New South Wales 2308, Australia. E-mail: [email protected] or [email protected]
8These authors contributed equally to this work.
Received 30 March 2017; revised 17 July 2017; accepted 15 August 2017; accepted article preview online 30 August 2017

Phosphoproteomic screen of mutant JAK3 S Degryse et al 2

MATERIALS AND METHODS

Expression plasmids

JAK3 mutant sequences were synthesized by GenScript (Piscataway, NJ, USA). All constructs were cloned into the murine stem cell virus–green fluorescent protein vectors.

Cell culture, virus production and retroviral transduction

Cell culture, virus production and retroviral transduction were performed as previously described.3

Quantitative phosphoproteomics

Discovery. Three independent sets of Ba/F3 cells expressing mutant human JAK3(L857Q) were cultured as described previously3 and treated with 500 nM tofacitinib, ruxolitinib or dimethylsulfoxide for 90 min (3 × biological). Cell pellets were prepared as previously described.11 Membranes were enriched by ultra-centrifugation12 and proteins dissolved in v/v 6 M urea/2 M thiourea. Proteins were then reduced, alkylated and digested as previously described.11 Lipids were precipitated from membrane peptides using formic acid and quantitated (Qubit protein assay kit, Thermo Fisher Scientific, Carlsbad, CA, USA). One hundred micrograms of membrane and soluble peptides from each of the nine samples were individually labeled using tandem mass tags (TMT-10plex 2 × kits, Thermo Fisher Scientific, Bremen DE, Germany) and mixed with a 1:1 ratio.13 Phosphopeptides were isolated from the proteome using titanium dioxide and immobilized metal affinity chromatography before offline hydrophilic interaction liquid chromatography (LC).14 LC tandem mass spectrometry (MS/MS) was performed on 24 mono-phosphopeptide enriched hydrophilic interaction LC fractions, 12 membrane and 12 soluble enriched fractions, as well as multi-phosphorylated peptides enriched from both using a Q-Exactive Plus hybrid quadrupole-Orbitrap MS system (Thermo Fisher Scientific, Bremen, Germany) coupled to a Dionex Ultimate 3000RSLC nanoflow HPLC system (Thermo Fisher Scientific). Samples were loaded onto an Acclaim PepMap100 C18 75 μm × 20 mm trap column (Thermo Fisher Scientific) for pre-concentration and online desalting. Separation was then achieved using an EASY-Spray PepMap C18 75 μm × 500 mm column (Thermo Fisher Scientific), employing a linear gradient from 2 to 32% acetonitrile at 300 nl/min over 120 min. The Q-Exactive Plus MS System (Thermo Fisher Scientific) was operated in full MS/data-dependent acquisition MS/MS mode (data-dependent acquisition). The Orbitrap mass analyzer was used at a resolution of 70 000, to acquire full MS with an m/z range of 390–1400, incorporating a target automatic gain control value of 1e6 and maximum fill times of 50 ms. The 20 most intense multiply charged precursors were selected for higher-energy collision dissociation fragmentation with a normalized collisional energy of 32. MS/MS fragments were measured at an Orbitrap resolution of 35 000 using an automatic gain control target of 2e5 and maximum fill times of 110 ms.

Parallel reaction monitoring. Phosphopeptides were loaded onto an Acclaim PepMap100 C18 75 μm × 20 mm trap column (Thermo Fisher Scientific) for pre-concentration and online desalting, then separated using an EASY-Spray PepMap C18 75 μm × 250 mm column (Thermo Fisher Scientific). An optimized stepped 82 min gradient was employed (5 to 22 to 35%). Each sample was run in data-dependent acquisition mode (described above) to evaluate phosphopeptide enrichment efficiency (93–97%) and loading. Parallel reaction monitoring (PRM) was performed using optimized methods for collision energy, charge state and retention times at a resolution of 17 500, automatic gain control of 2e5, maximum fill times of 90 ms and 1.6 m/z isolation window.

Data analysis

Database searching of all .raw files was performed using Proteome Discoverer 2.1 (Thermo Fisher Scientific). Mascot 2.2.3 and SEQUEST HT were used to search against the Swiss_Prot, Uniprot_mouse database (24910 sequences, downloaded 19 April 2016). Database searching parameters included up to two missed cleavages, to allow for full tryptic digestion, a precursor mass tolerance set to 10 p.p.m. and fragment mass tolerance 0.02 Da. Cysteine carbamidomethylation was set as a fixed modification while dynamic modifications included oxidation (M), phospho (S/T), phospho (Y) and TMT6plex (modification designated for TMT10plex). Interrogation of the corresponding reversed database was also performed to evaluate the false discovery rate of peptide identification using Percolator on the basis of q-values, which were estimated from the

Figure 1. Experimental workflow. Three independent sets of Ba/F3 cells expressing mutant human JAK3 (L857Q) were cultured and treated with 500 nM tofacitinib, ruxolitinib or dimethylsulfoxide (DMSO) for 90 min (biological and technical replicates). Proteins were digested with Trypsin/Lys-C mix. Peptides from each of the nine samples were labeled using tandem mass tags (TMT-10plex, Thermo Fisher Scientific) and mixed 1:1. Phosphopeptides were isolated and liquid chromatography-tandem mass spectrometry (LC-MS/MS) was performed using a Q-Exactive Plus hybrid quadrupole-Orbitrap MS system (Thermo Fisher Scientific). Database searching of all .raw files was performed using Proteome Discoverer 2.1 (Thermo Fisher Scientific). Mascot 2.2.3 and SEQUEST HT were used to search against the Swiss_Prot, Uniprot_mouse database.
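The target-decoy filtering described in the Data analysis section (a reversed database searched alongside the real one, with identifications accepted at a 1% false discovery rate) can be illustrated with a minimal sketch. This is not Percolator's algorithm, which re-scores matches with a trained classifier before estimating q-values; the function names, the score scale and the example values below are all hypothetical and serve only to show how decoy hits translate into a q-value cutoff.

```python
from typing import List, Tuple

def target_decoy_qvalues(psms: List[Tuple[float, bool]]) -> List[Tuple[float, bool, float]]:
    """Return (score, is_decoy, q_value) for each match, best score first.

    psms holds (score, is_decoy) pairs; a higher score is assumed to be better.
    The FDR at each rank is estimated as decoys / targets seen so far.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))
    # q-value: the smallest FDR at this rank or at any lower-scoring rank.
    qvals = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvals.append(running_min)
    qvals.reverse()
    return [(s, d, q) for (s, d), q in zip(ranked, qvals)]

def accepted_targets(psms: List[Tuple[float, bool]], fdr_cutoff: float = 0.01) -> List[float]:
    """Scores of target matches passing the q-value cutoff (1% by default)."""
    return [s for s, d, q in target_decoy_qvalues(psms) if not d and q <= fdr_cutoff]

# Hypothetical scores: two confident targets, one decoy, one weaker target.
example = [(10.0, False), (9.0, False), (8.0, True), (7.0, False)]
print(accepted_targets(example))
```

In this toy input the decoy at score 8.0 raises the estimated FDR for everything ranked at or below it, so only the two top-scoring targets pass the 1% cutoff.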

Leukemia (2017) 1 – 13

target-decoy search approach. To filter out target peptide spectrum matches over the decoy-peptide spectrum matches, a fixed false discovery rate of 1% was set at the peptide level. The experimental workflow is shown in Figure 1. PRM data files were processed with PD 2.1 to generate spectral libraries and uploaded into Skyline (MacCoss Lab, University of Washington), which was then used to identify peptide chromatogram peaks. Peptide quantification was performed by quantifying the area under the curve of the MS2 extracted ion chromatogram within Skyline.15 Results were normalized to the spiked-in heavy-labelled peptide to account for variations in sample injection. MS raw data are available in the Massive public repository (PXD007046).

In silico analysis of phosphosites abundance and pathway prediction
Quantitative phosphoproteomic differences across treatments were clustered by unique phosphosites (Cluster3, Stanford University, Palo Alto, CA, USA) and examined using heatmaps (Java Treeview, Stanford University) to visualize trends and consistency (as described16) in phosphopeptide abundance across treatment groups. Ingenuity Pathway Analysis software (version 8.8, Ingenuity Systems, Redwood City, CA, USA) together with the Core Analysis function was used to identify pathways regulated by the JAK3(L857Q) mutation.

Western blotting
Cells were lysed in cold lysis buffer containing 5 mM Na3VO4, protease inhibitors (Complete, Roche, Basel, Switzerland) and PhosSTOP (Roche). The proteins were separated on NuPAGE NOVEX Bis-Tris 4 to 12% gels (Invitrogen, Carlsbad, CA, USA). Western blot analysis was performed using antibodies against JAK3 (Cell Signaling, Danvers, MA, USA 3775), phospho-STAT5 (Y694) (Cell Signaling 9359), STAT5 (Invitrogen 335900), phospho-RanBP3 (Cell Signaling 9380), phospho-4E-BP1 (Cell Signaling 2855), pERK1/2 (Cell Signaling 9101), ERK1 (Santa Cruz, Dallas, TX, USA, Sc-93), pMEK1/2 (Cell Signaling 9154), MEK1/2 (Cell Signaling 9126) and β-actin (Sigma-Aldrich A1978); secondary antibodies were conjugated with horseradish peroxidase (GE Healthcare, Chicago, IL, USA). Bands were visualized using a cooled charge-coupled device camera (ImageQuant LAS-4000; GE Healthcare).

Inhibitor treatment in Ba/F3 cells
Ba/F3 cells (www.dsmz.de) were cultured and transduced as described.3 Cells were confirmed mycoplasma negative. Ba/F3 cells were seeded in 96-well plates (1 × 10^5 cells/ml) and treated with compound or vehicle (dimethylsulfoxide). A quantitative evaluation of proliferation was done after 24 h using ATPlite (PerkinElmer, Waltham, MA, USA) and measured on the VICTOR X4 Reader (PerkinElmer).

Ex vivo treatment of primary T-ALL samples
The mutational background of T-ALL samples used in this work is shown in Supplementary Table S1. All human T-ALL samples were collected with informed patient consent and with approval of the UZ Leuven ethics committee. T-ALL cells were expanded in vivo by injection into 8-week-old female NSG mice. After 2–3 months, human T-ALL cells were isolated from the spleen, seeded in 96-well plates (5 × 10^5 cells/well) and incubated with vehicle (dimethylsulfoxide) or inhibitor. Cell viability was assessed 48 h after adding the compounds by using the Guava easyCyte Flow Cytometer (Millipore, Overijse, Belgium). CompuSyn was used to calculate the combination index (CI). All animal experiments were approved by the KU Leuven ethics committee.

Annexin V/PI staining and flow cytometry analyses
For apoptosis staining the fluorescein isothiocyanate Annexin V apoptosis detection kit with propidium iodide was used as per the manufacturer’s instructions (Biolegend, Dedham, MA, USA). Briefly, 500 000 cells of each condition were stained for 20 min. Stained cells were washed in 1 × phosphate-buffered saline and data were acquired on the MACSQuant VYB flow cytometer (Miltenyi Biotec, Cambridge, MA, USA). For phospho-flow analysis, cells were fixed with Inside Fix and permeabilisation buffer (Miltenyi Biotec), and stained with an APC-labeled antibody against phospho-STAT5 (eBioscience, San Diego, CA, USA). Cells were analyzed using the FACSVerse (BD, Bedford, MA, USA). All FACS data analysis was carried out using FlowJo software (Tree Star, Ashland, OR, USA).

In vivo drug treatment of primary T-ALL sample
Patient sample 389E was injected into the lateral tail vein of 28 female NSG mice. Leukemia burden was assessed by hCD45+ staining in the peripheral blood on day 19 post injection. Mice were ranked from highest to lowest hCD45 and then sequentially placed in four treatment groups. Animal weight distribution was also assessed to determine equivalence between all groups. Ruxolitinib and ABT-199 were administered daily by oral gavage for 14 consecutive days; vehicle was given as placebo. No preliminary data were available to perform a statistical power analysis. The investigators were not blinded to the treatment groups.

RESULTS

JAK3 pathway network enrichment analysis
To determine the signal transduction pathways downstream of mutant JAK3, Ba/F3 cells transformed by the JAK3(L857Q) mutation (signaling partially independent from JAK1) were initially screened using unbiased global quantitative phosphoproteomic profiling. This approach simultaneously determined the identity, phosphorylation status and relative abundance of proteins isolated from three biological replicates following JAK1 inhibition (ruxolitinib) or JAK3 inhibition (tofacitinib) (Figure 1). Deep phosphoproteomic coverage was achieved by employing a multidimensional strategy incorporating protein pre-fractionation and sequential phosphopeptide enrichment techniques to enrich for mono- and multi-phosphorylated peptides, coupled to hydrophilic interaction LC, to reduce sample complexity before high-resolution tandem MS. A total of 6036 unique proteins and 2070 unique phosphoproteins, associated with ~ 5400 unique phosphorylation sites, were quantitatively sequenced across all 9 samples (Figures 2a–c and Supplementary Table S2). Accurate site-specific assignment of phosphorylation was achieved using phosphoRS17 with a false discovery rate of 1%, followed by manual interrogation for all peptides that were significantly altered after JAK inhibition. The ratio of phosphosites identified was 82.5% serine: 14.9% threonine: 2.6% tyrosine. Using a cutoff of log2 fold change > 0.5 we identified 84 downregulated unique phosphorylation sites and 29 upregulated phosphorylation sites following 90 min of JAK3 inhibition using tofacitinib across biological and technical replicates (Figure 2d). Similarly, treatment with ruxolitinib identified 86 downregulated and 20 upregulated phosphorylation sites (Figure 2e). In both instances, the phosphorylation of STAT5A at tyrosine Y694 was significantly reduced by − 5.9- and − 3.5-fold following tofacitinib or ruxolitinib treatment, respectively (Figure 2f), which was also confirmed by western blotting (Figure 2g).

The spectral intensity for each unique phosphosite (reporter-ion) within each phosphoprotein was averaged across biological and technical replicates for each treatment and grouped. The global phosphopeptide profiles of JAK3 and JAK1 inhibition were clustered into canonical and biological pathways (as described in Materials and Methods, Figure 3). As expected, canonical Ingenuity Pathway Analysis pathways associated with decreased phosphosite abundance after both tofacitinib and ruxolitinib treatments were linked to JAK/STAT signaling and also IL-22 signaling. This cluster was also significantly enriched for proteins implicated in 14-3-3-mediated signaling (Figure 3, Cluster 2). Cluster 3 showed increased phosphosite abundance after tofacitinib and ruxolitinib treatments with a significant enrichment for proteins implicated in DNA methylation, transcriptional repression and ERK/MAPK signaling pathways.

Of interest, cluster 1 and cluster 5 are associated with pharmacological inhibition of JAK3 specifically. These clusters are enriched for pathways involved in altered DNA repair mediated by the homologous recombination repair pathways, multiple components of complementary apoptotic cascades such as the cytolytic granzyme B pathway, the growth arrest and DNA damage-inducible 45 (GADD45) pathway (Figure 3, Cluster 1) and


[Figure 2, panels a–g. (a) Venn diagram: 6036 unique proteins, 2070 unique phosphoproteins. (b, c) Ratio distributions for 5396 unique phosphosites (tofacitinib/control) and 5393 unique phosphosites (ruxolitinib/control). (d, e) Volcano plots of −log10 P-value against log2 fold change (tofacitinib/control and ruxolitinib/control), with STAT5A labelled. (f) Log2 fold change of phospho-STAT5 (Y694) for tofacitinib and ruxolitinib. (g) Western blot of phospho-STAT5 (Y694) and STAT5 in replicates 1–3; phospho-STAT5 fold change relative to control: tofacitinib 0.1, 0.1, 0.1; ruxolitinib 0.3, 0.3, 0.1.]

Figure 2. Quantitative phosphoproteomic analysis of mutant JAK3 L857Q Ba/F3 cells. (a) Venn diagram summary of results from experimental workflow after mapping of all phospho-proteomic data across all experiments. (b and c) Distribution of quantitative phosphorylation changes of tofacitinib/control and ruxolitinib/control. (d) Tofacitinib (500 nM) treatment resulted in 84 down- and 29 upregulated phosphorylation sites (significance reported for changes versus control of ± 0.5 log2 fold). (e) Ruxolitinib (500 nM) treatment resulted in 86 down- and 20 upregulated phosphorylation sites (significance reported for changes versus control of ± 0.5 log2 fold). (f) STAT5A showed consistent downregulation across all three replicates upon tofacitinib and ruxolitinib treatment in the phospho-proteomic screen. (g) Western blot analysis further validated the decrease in phospho-STAT5 (Y694) upon treatment with JAK selective inhibitors tofacitinib and ruxolitinib.
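The counts of down- and upregulated sites in panels d and e follow from applying a ± 0.5 log2 fold-change threshold to each phosphosite. A minimal sketch of that classification is given below. The P-value cutoff of 0.05, the use of an inclusive threshold and the example values are assumptions made for illustration; the source states only the ± 0.5 log2 criterion.

```python
from typing import Dict, Tuple

def classify_sites(
    sites: Dict[str, Tuple[float, float]],
    lfc_cutoff: float = 0.5,   # log2 fold-change threshold taken from the text
    p_cutoff: float = 0.05,    # assumed significance cutoff, not stated in the source
) -> Dict[str, str]:
    """Label each phosphosite as 'down', 'up' or 'unchanged'.

    sites maps a site name to (log2 fold change versus control, P-value).
    """
    labels = {}
    for name, (lfc, p) in sites.items():
        if p <= p_cutoff and lfc <= -lfc_cutoff:
            labels[name] = "down"
        elif p <= p_cutoff and lfc >= lfc_cutoff:
            labels[name] = "up"
        else:
            labels[name] = "unchanged"
    return labels

# Hypothetical values for three sites named in the paper.
demo = {
    "STAT5A (Y694)": (-2.5, 1e-6),
    "SF3B1 (T7)": (0.8, 0.002),
    "ACTB (Y198)": (-0.2, 0.4),
}
labels = classify_sites(demo)
print(labels)
print(sum(v == "down" for v in labels.values()), "down;",
      sum(v == "up" for v in labels.values()), "up")
```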

also PI3K/AKT signaling and protein ubiquitination (Figure 3, Cluster 5). These data show that specific inhibition of JAK1 or JAK3 may have slightly different effects on downstream signaling pathways.

The global phosphopeptide list was then filtered to generate a short list of significantly altered changes common to both tofacitinib and ruxolitinib treatment (average ± 0.5 log2-fold change/dimethylsulfoxide control) (Supplementary Table 3). This stringent short list consisted of 84 and 48 unique phosphosites decreasing or increasing in phosphorylation, respectively, after both tofacitinib and ruxolitinib treatment. A heatmap showing individual replicate values of phosphosite abundances compared to control is presented in Figure 4a for each protein identified. Within this list, four proteins had altered tyrosine phosphorylation: STAT5(Y694) and ACTB(Y198) sites were decreased after inhibitor treatment whilst PGRMC2(Y204) and MCM2(Y137) were increased after inhibitor treatment. The remaining changes occurred on serine and threonine residues. These proteins and their major cellular functions are illustrated schematically in Figure 4b.

Proteins involved in RNA processing and chromatin remodeling are directly regulated downstream of mutant JAK3 signaling
There were 18 proteins associated with RNA metabolism including mRNA stability/processing, splicing and degradation, some of which have established roles in cancer (Figure 4b). De novo phosphorylation of Threonine 7 (T7) of SF3B1 (splicing factor 3b, subunit 1) increased upon inhibition of JAK3 mutant signaling. This factor is important for anchoring the spliceosome to precursor mRNA and is one of the most frequently mutated genes in chronic lymphocytic leukemia.18–20 Serine 100 of RNMT showed increased phosphorylation and this protein is important in the coordination of mRNA capping and linked to cancer.21 In addition to RNA metabolism, an unexpected link with several


[Figure 3 heatmap. Columns: control, tofacitinib, ruxolitinib. Colour scale: log phosphosite abundance, 1.5 to 3.0. Assigned IPA canonical pathways and IPA biological functions per cluster:]

Cluster 1. Canonical pathways: DNA Double-Strand Break Repair by Homologous Recombination; Granzyme B Signaling; GADD45 Signaling; Hereditary Breast Cancer Signaling; Estrogen-mediated S-phase Entry. Biological functions: RNA Post-Transcriptional Modification; Gene Expression; Cellular Growth and Proliferation; Cell Cycle; Cell Death and Survival.

Cluster 2. Canonical pathways: IL-22 Signaling; Role of JAK family kinases in IL-6-type Cytokine Signaling; 14-3-3-mediated Signaling; UVA-Induced MAPK Signaling; EIF2 Signaling. Biological functions: Cell Signaling; Post-Translational Modification; Protein Synthesis; Cell Death and Survival; Cell Morphology.

Cluster 3. Canonical pathways: Molecular Mechanisms of Cancer Signaling; DNA Methylation and Transcriptional Repression Signaling; ERK/MAPK Signaling; Non-Small Cell Lung Cancer Signaling; Chronic Myeloid Leukemia Signaling. Biological functions: Gene Expression; Cell Cycle; Cellular Assembly and Organization; Cellular Function and Maintenance; RNA Post-Transcriptional Modification.

Cluster 4. Canonical pathways: DNA Methylation and Transcriptional Repression Signaling; Nucleotide Excision Repair Pathway; Huntington's Disease Signaling; Hereditary Breast Cancer Signaling; EIF2 Signaling. Biological functions: RNA Post-Transcriptional Modification; Gene Expression; Cell Death and Survival; Cellular Compromise; Protein Synthesis.

Cluster 5. Canonical pathways: PI3K/AKT Signaling; Mitotic Roles of Polo-Like Kinase; EIF2 Signaling; Protein Ubiquitination Pathway; Signaling in Hematopoietic Progenitor Cells. Biological functions: Cell Cycle; Cellular Growth and Proliferation; Gene Expression; Protein Synthesis; Cell Death and Survival.

Figure 3. Identification of five independent phosphosite clusters following tofacitinib and ruxolitinib treatment. A heatmap of the average of reporter ion quantifications for each phosphosite in three replicate experiments is shown. Biological functions and signaling pathways identified by Ingenuity Pathway Analysis (IPA) have been assigned to each cluster if significantly over-represented by phosphopeptides. Fisher’s exact test was used to evaluate significance, which was set at P-values < 0.01.

proteins involved in epigenetic regulation was also found. HIST1H1E, a regulator of higher-order chromatin structure that is mutated in lymphoma,22 showed an increased phosphorylation on serine 36 upon JAK inhibition. MCM8, which is linked with the development of chronic myelogenous leukemia, showed a decrease in phosphorylation upon JAK inhibition.23

Confirmation of novel phosphorylation changes in proteins regulating RNA processing and chromatin remodeling was performed by an independent targeted proteomics experiment (parallel reaction monitoring, PRM) (Figures 4c–e and Supplementary Table 2). This approach revealed a significant level of correlation between methods of phosphopeptide quantification (that is, discovery by data-dependent acquisition and validation by PRM) and also between JAK3(L857Q) and JAK3(M511I) mutants (Figures 4c and d). Reduction in relative fold-change is a consequence of data-dependent acquisition based quantitative approaches (reporter-ion based relative quantification by TMT; Figure 4a), owing to the thousands of events measured across nine samples simultaneously.24 Therefore, changes in phosphorylation in proteins such as STAT5 and YBX3 were more accurately measured using targeted methods such as PRM (Figure 4e). Nevertheless, discovery proteomics is a powerful tool that guided our PRM validation study and revealed a highly significant correlation between phosphoproteins regulating RNA processing and chromatin remodeling in both JAK3 L857Q- and M511I-transformed cells.

Human JAK3 mutant T-ALL samples 389E and XC65 (Supplementary Table 1) were also validated by PRM. Significant correlation of the abundance of phosphoproteins regulated by mutant JAK3 was seen between patients following JAK1 inhibition using ruxolitinib (Figure 4e and Supplementary Figure 1A). STAT5 and proteins implicated in DNA methylation (DNMT1 and KDM2A), genome replication (MCM8 and NCAPD2) and pre-mRNA processing (SF1 and NCBP1) were significantly modulated in both patient samples following JAK inhibition (Figure 4e and Supplementary Table 2). However, regulation of


[Figure 4, panels a–e. (a) Heatmap of phosphosites up- or downregulated (UP, DOWN) by tofacitinib and ruxolitinib; colour scale in log2. (b) Schematic of IL7Rα/γc, JAK1 and JAK3 signaling with phosphosites grouped by cellular process: ARF regulators, vesicle trafficking, PI3K pathway, MAPK pathway, cytoskeletal organization (actin remodelling, tubulin associated), mRNA stability/processing, mRNA splicing, mRNA degradation, ribosomal complex, translation, nuclear export, lamina associated proteins, transcription, DNA damage/repair, nuclear pore, chromatin remodelling, histone (de)acetylation and histone (de)methylation. (c) JAK3(L857Q): PRM vs TMT, for tofacitinib and ruxolitinib (R = 0.6 in both plots; P < 0.003 and P < 0.008). (d) PRM: JAK3(L857Q) vs JAK3(M511I), for tofacitinib and ruxolitinib (R = 0.9, P < 0.001 in both plots). (e) PRM heatmap for Ba/F3 L857Q and M511I cells and patient samples 389E and XC65 treated with tofacitinib (TOF) or ruxolitinib (RUX); n/f, not found.]

Figure 4. Heatmap and schematic representation of phosphosites down- or upregulated common to both tofacitinib and ruxolitinib treatment. Phosphosites with an average decrease or increase of 0.5 log2 fold relative to dimethylsulfoxide (DMSO) control are shown. (a) Color reflects fold change in phosphorylation relative to control (DMSO treated) cells. Blue and yellow indicate down- and upregulation, respectively. An asterisk indicates phosphosites that could not be annotated with 100% certainty. (b) Schematic for a selection of phosphosites with an average decrease or increase of 0.5 log2 fold relative to DMSO control, shown alongside the main cellular process in which they are involved. Yellow and blue indicate an increase or decrease in phosphorylation, respectively, upon JAK selective inhibition (tofacitinib and ruxolitinib). (c) PRM validation of phosphoproteomic profiling of JAK3(L857Q) signaling following tofacitinib or ruxolitinib treatment. (d) PRM correlation between JAK3(L857Q) and JAK3(M511I) signaling following tofacitinib and ruxolitinib treatment. (e) PRM validation of phosphorylation upon JAK selective inhibition (tofacitinib and ruxolitinib) in Ba/F3 JAK3(L857Q) and JAK3(M511I) transformed cells and in patient samples 389E and XC65. (PRM phosphosite annotation reflects mouse residues); n/f denotes phosphopeptide not found.
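Panels c and d summarise agreement between quantification methods, and between the two JAK3 mutants, as a correlation coefficient R. The sketch below shows how such a coefficient could be computed from paired log2 fold changes. It assumes a Pearson correlation, which the source does not state explicitly, and the paired values are invented purely for illustration.

```python
import math
from typing import Sequence

def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired log2 fold changes for the same phosphosites.
tmt_log2fc = [-2.4, -0.9, -0.6, 0.7, 1.1]   # discovery (TMT reporter ions)
prm_log2fc = [-5.1, -1.8, -0.4, 1.9, 2.6]   # targeted validation (PRM)
print(round(pearson_r(tmt_log2fc, prm_log2fc), 3))
```

The invented PRM values are larger in magnitude than the TMT values, mirroring the reduction in relative fold change that the text attributes to reporter-ion quantification; correlation is insensitive to that scaling, which is why it suits a comparison between the two methods.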


[Figure 5, panels a–d. (a) IPA canonical pathways ranked by −log10 P-value. Tofacitinib: ERK/MAPK Signaling; Molecular Mechanisms of Cancer; DNA Methylation and Transcriptional Repression; Apoptosis Signaling; Cell Cycle: DNA Damage Checkpoint Regulation; PI3K/AKT Signaling. Ruxolitinib: Molecular Mechanisms of Cancer; PI3K/AKT Signaling; ERK/MAPK Signaling; mTOR Signaling. (b) Schematic of the PI3K, apoptosis and MEK/ERK pathways downstream of IL7Rα/γc, JAK1 and JAK3, with the inhibitors tofacitinib, buparlisib, ABT199, selumetinib and trametinib. (c) Western blots of pSTAT5 (Y694), STAT5, pMEK1/2 (S217/221), MEK1/2, pERK1/2 (p42/44), ERK1 and β-actin in JAK3(L857Q) and JAK3(M511I) cells treated with DMSO, tofacitinib (500 nM), selumetinib (300 nM) or trametinib (100 nM). (d) RNA-seq FPKM for BCL2, BCL-XL and MCL1 in the groups JAK3, JAK-STAT other and None.]

Figure 5. Ingenuity Pathway Analysis (IPA) identifies significant enrichment for PI3K/AKT, MEK/ERK and apoptosis signaling. (a) Pathways for which phospho-sites showed a 0.5 log2 fold change increase or decrease in phosphorylation compared to control following tofacitinib or ruxolitinib treatment are shown. (b) Schematic representation of PI3K, apoptosis, and MEK/ERK pathway. Proteins that show an increase or decrease in phosphorylation by tofacitinib or ruxolitinib treatment (average ± 0.5 log2 fold change/dimethylsulfoxide (DMSO) control) are indicated with an asterisk. Inhibitors used in downstream experiments are indicated. (c) Western blot assessment of downstream STAT5, ERK and MEK activation in JAK3(L857Q) or JAK3(M511I) transformed Ba/F3 cells treated with tofacitinib, selumetinib or trametinib for 90 min. (d) RNA-seq analysis of BCL2, MCL1 and BCL-XL in T-ALL patients that either have a JAK3 mutation (n = 20), other JAK-STAT pathway mutations (n = 45, includes mutations in IL7R, PTPN2, FLT3, STAT5B, JAK1, JAK3, SH2B3, IL2RB) or no mutations in JAK-STAT (n = 199). (P-value calculated using Welch's t-test).

phosphoproteins following mutant JAK3 inhibition using tofacitinib was not as consistent as with ruxolitinib, indicating again differences between these drugs (Figure 4e).

To further complement the PRM, western blot analysis was used to confirm the phosphorylation status for EIF4EBP1 at threonine T45 and RANBP3 at serine S58, for which site-specific phospho-antibodies were available. Both showed significantly reduced phosphorylation following treatment with tofacitinib or ruxolitinib (Supplementary Figure S1B). Moreover, these phosphorylation sites were confirmed as being directly downstream of JAK3 (L857Q and M511I) mutations, as Ba/F3 cells transformed by the activating mutation STAT5 N642H, or parental Ba/F3 cells stimulated with IL3, showed significantly less phosphorylation at these sites compared with untreated Ba/F3 cells expressing either of these JAK3 mutations (Supplementary Figure S1B).

Inhibiting JAK in combination with the MAPK, PI3K and Bcl-2 pathway results in synergistic or additive effects on Ba/F3 cells transformed by JAK3 mutants
Although a large number of novel phosphorylation changes were identified downstream of mutant JAK3 signaling, including in RNA metabolism and chromatin remodeling, the majority of these proteins remain difficult to target due to the lack of small-molecule inhibitors. To this end, pathway interrogation using Ingenuity Pathway Analysis revealed multiple components of the MAPK, PI3K and apoptosis pathways to be modulated following either tofacitinib or ruxolitinib treatment (Figure 5a, Supplementary Figures S2–S4 and Supplementary Table S4). Specifically, phosphorylation changes were seen in SHC, GRB2 and GAB1/2, which are upstream of the MEK/ERK pathway. Moreover, GAB1/2 is also known to associate with the PI3K pathway (Figure 5b). In line with this analysis, acute inhibition of JAK3(L857Q)- and JAK3(M511I)-transformed Ba/F3 cells with tofacitinib revealed significantly decreased pERK1/2 and pMEK1 levels. This was further reduced when given in combination with the MEK inhibitors selumetinib and trametinib (Figure 5c). Although the phosphoproteomics data did not identify phosphorylation changes in any anti- or pro-apoptotic proteins, mutant JAK3 signaling would be expected to upregulate members of the anti-apoptotic BCL2 family including BCL-2 and MCL1 at the transcriptional level.25 Analysis of expression data from 264 T-ALL patients26 revealed that cases with JAK3 mutations (n = 20) had significantly higher levels of BCL2 compared with non-mutated


[Figure 6, panels a and b. (a) Combination index (CI, 0 to 2) plotted against fraction affected (0.0 to 1.0) for tofacitinib + selumetinib, tofacitinib + trametinib, tofacitinib + ABT199 and tofacitinib + buparlisib, in JAK3 M511I (upper row) and JAK3 L857Q (lower row) cells. (b) Relative proliferation (%) for each single compound and for its combination with tofacitinib, in JAK3 M511I and JAK3 L857Q cells.]

Figure 6. Inhibiting JAK in combination with the MAPK, PI3K and Bcl-2 pathway results in synergistic or additive effects on Ba/F3 cells transformed by JAK3 mutants. (a) Chou-Talalay plots showing the effect of tofacitinib with MEK inhibitors (selumetinib, trametinib), Bcl-2 inhibitor (ABT-199) or PI3K inhibitor (buparlisib) on Ba/F3 cells transformed by JAK3 M511I or L857Q after 24 h incubation. CompuSyn was used to calculate the combination index (CI). CI < 1 indicates synergistic effects, CI = 1 indicates additive effects, CI > 1 indicates antagonistic effects. (b) Ba/F3 cells transformed by M511I or L857Q were treated for 24 h with single compounds or combination. Relative proliferation compared to vehicle treated cells is shown. Data represent the average of three experiments ± s.e.m. Significance was calculated using one-way analysis of variance (ANOVA) and the Bonferroni correction. (****P ⩽ 0.0001; ***P ⩽ 0.001; **P ⩽ 0.01). Concentrations used: 0.15 μM tofacitinib in combination with 3.75 μM selumetinib, 0.67 μM trametinib or 0.6 μM buparlisib, and 0.3 μM tofacitinib in combination with 0.15 μM ABT-199.

cases. There were no differences for BCL-XL or MCL1 expression (Figure 5d).

These data therefore provided the rationale to test whether selective inhibitors of these pathways, such as the selective Bcl-2 inhibitor (ABT-199), PI3K inhibitor (buparlisib) or MEK inhibitors (selumetinib and trametinib), would synergistically inhibit cellular growth and survival when used in combination with a JAK selective inhibitor.

Ba/F3 cells transformed by two different JAK3 mutants, JAK3(M511I) or JAK3(L857Q), were treated with a combination of various concentrations of tofacitinib and the MEK inhibitor selumetinib. Most combinations showed a synergistic effect ranging from moderate (0.3–0.05 μM tofacitinib and 0.89–1.582 μM selumetinib) to high synergism (0.3 μM tofacitinib and 2.109–5 μM selumetinib) (Figure 6a). Interestingly, all combinations of tofacitinib with the MEK inhibitor trametinib showed synergistic cytotoxicity (CI <0.1–0.9) in a dose-dependent manner (Figure 6a). This combination had no effect on parental Ba/F3 cells stimulated by IL3 (data not shown).

Inhibition of the anti-apoptotic protein Bcl-2 using ABT-199 in combination with tofacitinib was also synergistic at moderate concentrations. However, low concentrations of both tofacitinib and ABT-199 were antagonistic, with CI values reaching 2.7. Effects from intermediate concentration combinations varied from moderate synergism to high synergism. The combination of 0.3 μM ABT-199 and 0.3 μM tofacitinib induced synergistic cytotoxicity (CI values up to 0.3).
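The CI values quoted here come from CompuSyn, whose internals the paper does not describe. As a rough illustration only, the sketch below applies the standard Chou-Talalay definition for two drugs, CI = D1/(Dx)1 + D2/(Dx)2, where (Dx)i is the dose of drug i alone that produces the same fraction affected (fa), obtained from the median-effect equation fa/(1 − fa) = (D/Dm)^m. The class and function names and all parameter values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MedianEffect:
    """Median-effect parameters for one drug: fa / (1 - fa) = (D / Dm) ** m."""
    dm: float  # median-effect dose (dose giving fa = 0.5), in the same units as the doses
    m: float   # slope of the median-effect plot

    def dose_for_effect(self, fa: float) -> float:
        """Dose of this drug alone needed to reach fraction affected `fa`."""
        if not 0.0 < fa < 1.0:
            raise ValueError("fa must lie strictly between 0 and 1")
        return self.dm * (fa / (1.0 - fa)) ** (1.0 / self.m)


def combination_index(d1, d2, drug1, drug2, fa):
    """Chou-Talalay CI for doses d1 and d2 that together produce effect fa."""
    return d1 / drug1.dose_for_effect(fa) + d2 / drug2.dose_for_effect(fa)


def interpret(ci, tol=1e-9):
    """Map a CI value onto the labels used in the figure captions."""
    if abs(ci - 1.0) <= tol:
        return "additive"
    return "synergistic" if ci < 1.0 else "antagonistic"


# Hypothetical single-drug parameters and doses (illustration only).
drug_a = MedianEffect(dm=1.0, m=1.0)
drug_b = MedianEffect(dm=4.0, m=1.0)
ci = combination_index(0.25, 1.0, drug_a, drug_b, fa=0.5)
print(round(ci, 3), interpret(ci))  # prints: 0.5 synergistic
```

In this toy case each drug is given at a quarter of the dose it would need alone to reach fa = 0.5, so the two terms sum to 0.5 and the combination is classed as synergistic. By the same rule, the CI of 2.7 reported for low-dose tofacitinib plus ABT-199 falls in the antagonistic range.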

Leukemia (2017) 1 – 13 Phosphoproteomic screen of mutant JAK3 S Degryse et al 9

Figure 7. Synergistic effect of tofacitinib in combination with MEK or Bcl-2 inhibitors in JAK3 mutant PDX samples. (a) Schematic representation of the experimental workflow. (b) Chou-Talalay plots showing the effect of tofacitinib with MEK inhibitors (selumetinib, trametinib) or Bcl-2 inhibitor (ABT-199) on T-ALL-derived PDX samples. CompuSyn was used to calculate the combination index (CI). CI < 1 indicates synergistic effects, CI = 1 indicates additive effects and CI > 1 indicates antagonistic effects. (c–e) Relative viable cell count after 48 h treatment of PDX samples with a combination of 0.2 μM tofacitinib and 0.3 μM selumetinib (c) or 0.2 μM trametinib (d), or a combination of 0.3 μM tofacitinib and 0.13 μM ABT-199 (e). Data represent the average of three experiments ± s.e.m. Significance was calculated using one-way analysis of variance (ANOVA) and the Bonferroni correction (****P ⩽ 0.0001; ***P ⩽ 0.001; nonsignificant (ns) P ⩾ 0.05).


[Figure 8 graphic not reproduced. Panel (a): % hCD45 in the blood against days after injection, with the start of treatment marked. Panels (b) and (c): % hCD45 in peripheral blood and bone marrow for vehicle, ruxolitinib, ABT-199 and ruxolitinib + ABT-199.]

Figure 8. Combination treatment of mice xenografted with JAK3(M511I) patient sample 389E. (a) Leukemia burden was followed on the basis of human CD45 expression in the peripheral blood. Treatment with vehicle (n = 7), ruxolitinib (40 mg/kg/day; n = 7), ABT-199 (20 mg/kg/day; n = 7) or ruxolitinib (40 mg/kg/day) + ABT-199 (20 mg/kg/day) (n = 8) was begun once CD45% was >2%, as indicated with an arrow. Treatment continued daily for 14 days. (b) End-stage assessment of human (h)CD45% in the peripheral blood. (c) End-stage assessment of human (h)CD45% in the bone marrow. Significance was calculated using Student's t-test (***P ⩽ 0.01; **P ⩽ 0.05).

Inhibition of the PI3K pathway revealed both high synergism and high antagonism dependent on dosing schedules. Low dosage of tofacitinib (up to 0.05 μM) in combination with buparlisib elicited a highly antagonistic effect. However, significant synergism (CI <0.1) was elicited through the use of intermediate tofacitinib concentrations (0.2 μM) in combination with high-dose buparlisib (2.5 μM).

These initial screening matrices were then used to determine the optimal dosage regimens required to elicit synergistic cytotoxicity in mutant JAK3-expressing cells, which were subsequently used in experiments at fixed concentrations, alone and combined. These fixed-dose regimens significantly reduced proliferation when used in combination compared with single-compound treatments for both JAK3(M511I) and JAK3(L857Q) mutant-transformed Ba/F3 cells (Figure 6b).

Synergistic effect of tofacitinib in combination with MEK or Bcl-2 inhibitors in JAK3 mutant PDX samples

To determine the preclinical efficacy of these drug combinations, we next sought to determine the cytotoxic effect on primary patient-derived T-ALL cells. Two T-ALL samples expressing a JAK3(M511I) mutation (X11 and 389E) and one sample that was JAK3 wild type (X10) were used (Supplementary Table 1). These patient samples were initially expanded within NSG mice before collection of the leukemic cells, which were immediately incubated ex vivo with our optimized JAK3 pathway targeting regimens for 48 h (Figure 7a).

Tofacitinib in combination with the MEK and Bcl-2 inhibitors was selected, as these combinations showed the most consistent synergistic effects over a wide range of concentrations. Synergy was initially assessed and confirmed using the T-ALL X11 sample (JAK3 M511I) for the combination of tofacitinib and the MEK inhibitors, selumetinib and trametinib, and the Bcl-2 inhibitor ABT-199. Although most concentration combinations showed synergistic efficacy, some combinations of selumetinib and tofacitinib were antagonistic, revealing CI values up to 1.7. Notably, this antagonism was only seen when combinations had the highest concentration of tofacitinib (1.28 μM) (Figure 7b). Importantly, the combination of trametinib and tofacitinib elicited cytotoxicity with CI values ranging from <0.1 to 1, which is indicative of highly synergistic efficacy. Furthermore, the most synergistic combination was elicited through the combination of low-dose tofacitinib (0.01–0.04 μM) with low-dose trametinib concentrations (0.015 μM) (Figure 7b). Targeting BCL2 using ABT-199 in combination with tofacitinib revealed high synergism. Low-dose tofacitinib (0.08 μM) with low-dose ABT-199 (0.044 μM) resulted in a highly synergistic CI value of <0.1 (Figure 7b).

Single synergistic dosage combinations determined using the X11 sample were then used on the 389E (JAK3 M511I mutant T-ALL sample) and X10 (JAK3 wild-type T-ALL sample) samples (Figures 7c and d). The T-ALL cells expressing a JAK3(M511I) mutation showed significantly reduced viability upon treatment with the combination of the JAK selective inhibitor tofacitinib and a MEK inhibitor, selumetinib (Figure 7c) or trametinib (Figure 7d), compared with single-compound treatment. Interestingly, T-ALL cells 389E showed less sensitivity to JAK selective inhibition alone (compared with X11), but synergy was still observed. The T-ALL sample X10, which expresses wild-type JAK3 (and also does not harbor IL7R or JAK1 mutations), did not show any reduction in cell viability after treatment with tofacitinib and/or selumetinib and trametinib, showing that the synergy is dependent on the presence of a JAK3 mutation (Figures 7c and d).

The T-ALL samples were also treated with the Bcl-2 inhibitor ABT-199 alone or in combination with tofacitinib. T-ALL samples with a JAK3 mutation were sensitive to either tofacitinib or ABT-199 alone, and the combination of these drugs was more effective than each drug separately. The T-ALL sample X10 (JAK3 wild type) showed reduced cell viability after treatment with the Bcl-2 inhibitor ABT-199, but there was no additional reduction when combined with the JAK-selective inhibitor tofacitinib (Figure 7e), indicating again that synergy between tofacitinib and ABT-199 is only obtained in T-ALL samples harboring a JAK3 mutation. Similar results were shown after apoptosis staining using Annexin-V and propidium iodide. Treatment of the T-ALL sample X11 with tofacitinib in combination with selumetinib, trametinib or ABT-199 resulted in an increase in Annexin-V+/propidium iodide+ cells compared with cells treated with a single compound (Supplementary Figure S5).

In vivo combination treatment of patient-derived xenograft sample carrying a JAK3 mutation leads to a significant decrease in leukemia burden

Having established strong synergism between JAK3 and Bcl-2 inhibition, we moved to determine whether this could also be observed in vivo. To this end, an in vivo patient-derived xenograft model was established with T-ALL sample 389E, which carries a JAK3(M511I) mutation. For these studies, the JAK1-selective inhibitor ruxolitinib was used in combination with ABT-199 because of its known side effect profile and known efficacy in other hematological disorders.27 To accurately detect synergism, suboptimal concentrations were used for ABT-199 (20 mg/kg/day/PO) and ruxolitinib (40 mg/kg/day/PO), which were 70–80% reduced compared with previously published studies when used as a single agent.28,29 Leukemia burden was assessed by human CD45 expression in the peripheral blood at the beginning, midpoint and end stage of the 14-day treatment phase. After 14 days, disease burden was assessed in all mice and the combination treatment of ruxolitinib and ABT-199 led to a significant decrease of leukemia cells in the peripheral blood and significantly fewer in the bone marrow compared with vehicle, ABT-199-only or ruxolitinib-only treatment (Figures 8a–c).

DISCUSSION

Although improved treatment and supportive care have led to a better outcome for T-ALL patients, poor response to therapy and relapse remain significant problems, especially in adult patients. In addition, survivors continue to suffer long-term effects from chemotherapy and in adults the rates of survival remain below 50%.30,31 Although there is no evidence that JAK3 mutations are associated with a poor prognosis, this tyrosine kinase is an attractive therapeutic target to help establish a more patient-specific therapy.2,9 Precision therapeutic strategies targeting recurring oncogene-addicted cells along with the pathways that cooperate in malignant cell growth offer treatment potentials that are believed to have the greatest efficacy, while reducing side effects.

Here we describe the first comprehensive analysis of phosphoproteome signaling pathways regulated downstream of mutant JAK3. We identified changes in phospho-peptides upon inhibition of mutant JAK3 signaling, with the identification of proteins involved in MAPK, PI3K/mTOR and apoptosis, and hitherto unknown links to proteins involved in RNA processing and epigenetic regulation. It is known that JAK proteins are able to phosphorylate the SH2 domain-containing protein SHC1, which is recruited to the cytokine receptor and in turn recruits the GRB2–SOS complex. SOS is the guanosine nucleotide exchange factor for RAS. Through RAS, the GRB2–SOS complex is able to activate the canonical MAPK pathway.32 Both SHC1 and GAB2 phosphorylation were significantly decreased upon inhibition of mutant JAK3 signaling.

In addition to the RAS/MAPK pathway, the PI3K pathway has also been shown to have an important role downstream of the JAK kinases. Sharfe et al.5 showed that JAK3 was able to interact with the p85 subunit of PI3K upon IL7 stimulation and subsequently induced phosphorylation of this p85 subunit. PI3K also interacts with the GAB2 adaptor protein, which indicates a link between PI3K and MAPK activation.33

A number of candidate proteins found to be directly downstream of mutant JAK3 have already been found to have important roles either in the pathogenesis of T-ALL or cancer in general. We have recently described a link between mutations in proteins involved in epigenetic regulation, such as the PRC2 complex, and mutations in the IL7R/JAK3 signaling pathway.2 Here we show phosphoproteomic data linking mutant JAK3 signaling with proteins implicated in epigenetic regulation. In the current work, we describe an increased phosphorylation of KDM2A upon inhibition of mutant JAK3 signaling. KDM2A is a histone demethylase that specifically demethylates lysine-36 of histone 3. Increased expression levels of KDM2A promote tumor cell growth and migration in gastric cancer34 but there are currently no reports on how the phosphorylation status of KDM2A threonine 713 (T713) regulates its activity. However, KDM2A is phosphorylated at threonine 632 (T632) following DNA damage, and this largely abrogates the chromatin-binding capacity of KDM2A.35 We also reveal reduced phosphorylation of the DNA methyltransferase family member DNMT1 upon inhibition of mutant JAK3 signaling (Figure 4). DNMT1 is required to maintain the methylation of the entire genome and altered expression has been observed in a range of cancers including lymphoma, breast, colon, liver, pancreas and esophagus cancer.36 For both KDM2A and DNMT1, it will be important now to determine the role of these phosphorylation sites in regulating protein function and their role in T-ALL.

One of the more surprising links we found downstream of mutant JAK3 signaling was RNA metabolism, including proteins involved in mRNA stability, splicing and degradation. There were significant decreases in phosphorylation upon inhibition of JAK3 in RRP1B, NCBP1, SERBP1, YBX3, ATXN2L, U2SURP, HNRNPM, HNRNPK and GTPBP1, and significant increases in phosphorylation upon inhibition of JAK3 in CMTR1, RNMT, ZC3H13, SF3B1, PNN, CASC3 and SF1. Of these candidate proteins, a number already have potential roles in cancer. RRP1B expression levels can predict patient outcome in breast cancer and can regulate histone methylation.37 SERBP1 has been found to be significantly upregulated in CD34+ CML cases compared with control CD34+ cells.38 Significant hypermethylation of the YBX3 promoter to downregulate its mRNA expression has been found in AML.39 HNRNPM has been found to promote cancer metastasis by regulating alternative splicing.40 Depletion of RNMT has been described to effectively and specifically inhibit cancer cell growth and cell invasive capacities in different types of cancer.21 SRRM1 has been shown to regulate splicing of the CD44 protein41 and the identification of different residues that increase and decrease in phosphorylation status may indicate a functional switch for this protein. Finally, SF3B1 is recurrently mutated in MDS and CLL,20,42 and SF3B1 inhibitors are in preclinical development for the treatment of cancer and therefore may have potential indications in combination with JAK3 inhibitors in T-ALL.

To this end, the phosphoproteomic profile provided a list of pathways that could be targeted by small-molecule inhibitors in combination with JAK3 inhibitors. We show that the combined inhibition of JAK3 signaling with inhibitors targeting PI3K, MEK or BCL2 resulted in a synergistic cytotoxic effect in Ba/F3 samples transformed by JAK3 mutants and primary T-ALL patient samples carrying JAK3 mutations. This synergistic effect was selective for JAK3-mutated T-ALL samples, as no synergy was observed in the T-ALL sample that was wild type for JAK3. We also tested the combination of ruxolitinib with ABT-199 in vivo and this also resulted in a significant decrease in leukemia burden, suggesting this combination would be an effective treatment strategy in patients with a JAK3 mutation. Recently, Canté-Barrett et al.43 showed that inhibiting both the MEK and PI3K-AKT pathways synergistically prevents the proliferation of Ba/F3 cells expressing JAK3 mutants. Furthermore, combined inhibition of MEK and PI3K/AKT was cytotoxic to four primary T-ALL patient samples, which carried a JAK3 mutation. It has also been shown that immature T-ALL patients with high levels of BCL2 were sensitive to ABT-199 as a single agent.44 We have now extended this observation to take into account the JAK3 mutational status and tested the inhibitory effects of ABT-199 in combination with tofacitinib, a JAK selective inhibitor, and show that this combination is highly effective in causing cytotoxicity for T-ALL cells. It is important to note that for some concentration combinations, MEK or BCL2 inhibitors with the JAK inhibitor tofacitinib resulted in an antagonistic or additive effect rather than a synergistic effect. This highlights the importance of identifying the optimal treatment dose for patients in order to obtain maximal synergy.

In summary, this study has characterized signaling downstream of mutant JAK3, showing constitutive activation of proteins involved in cell cycle, translation, apoptosis, MAPK and PI3K/AKT pathways and epigenetic regulation. Through the use of primary T-ALL samples we show synergistic inhibition by combining JAK-selective inhibitors with MEK, PI3K and BCL2 inhibitors. Together, our research gives the incentive to further explore the use of JAK inhibitors in combination with MEK or BCL2 inhibitors in order to optimize current treatment regimes.

CONFLICT OF INTEREST

SB is an employee of Thermo Fisher Scientific, the corporation that produces Orbitrap mass spectrometers and proteomics software. SB is an LCMS applications specialist and provided technical support for MS methodology and data interpretation. Beyond this, the authors are not aware of any affiliations, memberships, funding, or financial holdings that might be perceived as affecting the objectivity of this manuscript. There are no patents, products in development, or marketed products to declare. All other authors declare no conflict of interest.

ACKNOWLEDGEMENTS

Dr Ben Crossett from the Mass Spectrometry Core Facility at The University of Sydney, Mr Nathan Smith from The University of Newcastle Analytical and Biomolecular Research Facility (ABRF) and Dr Michael Mariani from Thermo Fisher Scientific provided MS support. The Academic and Research Computing Support (ARCS) team, within IT Services at the University of Newcastle, provided high performance computing (HPC) infrastructure for supporting the bioinformatics. This study was supported by grants from the Belgian government (cancer plan), the FWO-Vlaanderen (JC and PVV), the Children Cancer Fund Ghent (PVV), the Belgian Stand Up To Cancer Foundation (PVV) and the Foundation Against Cancer (JC and PVV); a European Research Council grant (JC); the Interuniversity Attraction Poles granted by the Federal Office for Scientific, Technical and Cultural Affairs, Brussels, Belgium (JC); and the Agency for Innovation by Science and Technology in Flanders, Belgium (SD and DB); Cancer Institute NSW, Australia ECF (MDD and NV). The Hunter Cancer Research Alliance and the European Association for Cancer Research provided a travel scholarship for S Degryse. IG and SB are both Aspirant Fellows of the FWO-Vlaanderen. Jennie Thomas, Life Governor of the Hunter Medical Research Institute, provided a travel scholarship for MDD. The Cancer Institute NSW in partnership with the Faculty of Health and Medicine from the University of Newcastle funded the MS platform.

REFERENCES

1 Stock W, La M, Sanford B, Bloomfield CD, Vardiman JW, Gaynon P et al. What determines the outcomes for adolescents and young adults with acute lymphoblastic leukemia treated on cooperative group protocols? A comparison of Children's Cancer Group and Cancer and Leukemia Group B studies. Blood 2008; 112: 1646–1654.

2 Vicente C, Schwab C, Broux M, Geerdens E, Degryse S, Demeyer S et al. Targeted sequencing identifies association between IL7R-JAK mutations and epigenetic modulators in T-cell acute lymphoblastic leukemia. Haematologica 2015; 100: 1301–1310.

3 Degryse S, de Bock CE, Cox L, Demeyer S, Gielen O, Mentens N et al. JAK3 mutants transform hematopoietic cells through JAK1 activation, causing T-cell acute lymphoblastic leukemia in a mouse model. Blood 2014; 124: 3092–3100.

4 Osinalde N, Sanchez-Quiles V, Akimov V, Guerra B, Blagoev B, Kratchmarova I. Simultaneous dissection and comparison of IL-2 and IL-15 signaling pathways by global quantitative phosphoproteomics. Proteomics 2015; 15: 520–531.

5 Sharfe N, Dadi HK, Roifman CM. JAK3 protein tyrosine kinase mediates interleukin-7-induced activation of phosphatidylinositol-3' kinase. Blood 1995; 86: 2077–2085.

6 Zhang J, Ding L, Holmfeldt L, Wu G, Heatley SL, Payne-Turner D et al. The genetic basis of early T-cell precursor acute lymphoblastic leukaemia. Nature 2012; 481: 157–163.

7 Girardi T, Vicente C, Cools J, De Keersmaecker K. The genetics and molecular biology of T-ALL. Blood 2017; 129: 1113–1123.

8 De Keersmaecker K, Atak ZK, Li N, Vicente C, Patchett S, Girardi T et al. Exome sequencing identifies mutation in CNOT3 and ribosomal genes RPL5 and RPL10 in T-cell acute lymphoblastic leukemia. Nat Genet 2013; 45: 186–190.

9 Degryse S, Cools J. JAK kinase inhibitors for the treatment of acute lymphoblastic leukemia. J Hematol Oncol 2015; 8: 91.

10 Losdyck E, Hornakova T, Springuel L, Degryse S, Gielen O, Cools J et al. Distinct acute lymphoblastic leukemia (ALL)-associated janus kinase 3 (JAK3) mutants exhibit different cytokine-receptor requirements and JAK inhibitor specificities. J Biol Chem 2015; 290: 29022–29034.

11 Dun MD, Chalkley RJ, Faulkner S, Keene S, Avery-Kiejda KA, Scott RJ et al. Proteotranscriptomic profiling of 231-BR breast cancer cells: identification of potential biomarkers and therapeutic targets for brain metastasis. Mol Cell Proteomics 2015; 14: 2316–2330.

12 Fujiki Y, Hubbard AL, Fowler S, Lazarow PB. Isolation of intracellular membranes by means of sodium carbonate treatment: application to endoplasmic reticulum. J Cell Biol 1982; 93: 97–102.

13 Thompson A, Schäfer J, Kuhn K, Kienle S, Schwarz J, Schmidt G et al. Tandem mass tags: a novel quantification strategy for comparative analysis of complex protein mixtures by MS/MS. Anal Chem 2003; 75: 1895–1904.

14 Engholm-Keller K, Birck P, Størling J, Pociot F, Mandrup-Poulsen T, Larsen MR. TiSH--a robust and sensitive global phosphoproteomics strategy employing a combination of TiO2, SIMAC, and HILIC. J Proteomics 2012; 75: 5749–5761.

15 MacLean B, Tomazela DM, Shulman N, Chambers M, Finney GL, Frewen B et al. Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics 2010; 26: 966–968.

16 Nixon B, Stanger SJ, Mihalas BP, Reilly JN, Anderson AL, Dun MD et al. Next generation sequencing analysis reveals segmental patterns of microRNA expression in mouse epididymal epithelial cells. PLoS One 2015; 10: e0135605.

17 Taus T, Köcher T, Pichler P, Paschke C, Schmidt A, Henrich C et al. Universal and confident phosphorylation site localization using phosphoRS. J Proteome Res 2011; 10: 5354–5362.

18 Quesada V, Conde L, Villamor N, Ordóñez GR, Jares P, Bassaganyas L et al. Exome sequencing identifies recurrent mutations of the splicing factor SF3B1 gene in chronic lymphocytic leukemia. Nat Genet 2011; 44: 47–52.

19 Wang L, Lawrence MS, Wan Y, Stojanov P, Sougnez C, Stevenson K et al. SF3B1 and other novel cancer genes in chronic lymphocytic leukemia. N Engl J Med 2011; 365: 2497–2506.

20 Rossi D, Bruscaggin A, Spina V, Rasi S, Khiabanian H, Messina M et al. Mutations of the SF3B1 splicing factor in chronic lymphocytic leukemia: association with progression and fludarabine-refractoriness. Blood 2011; 118: 6904–6908.

21 Stefanska B, Cheishvili D, Suderman M, Arakelian A, Huang J, Hallett M et al. Genome-wide study of hypomethylated and induced genes in patients with liver cancer unravels novel anticancer targets. Clin Cancer Res 2014; 20: 3118–3132.

22 Lunning MA, Green MR. Mutation of chromatin modifiers; an emerging hallmark of germinal center B-cell lymphomas. Blood Cancer J 2015; 5: e361.

23 Cai L, Zhao K, Yuan X. Expression of minichromosome maintenance 8 in chronic myelogenous leukemia. Int J Clin Exp Pathol 2015; 8: 14180–14188.

24 Karp NA, Huber W, Sadowski PG, Charles PD, Hester SV, Lilley KS. Addressing accuracy and precision issues in iTRAQ quantitation. Mol Cell Proteomics 2010; 9: 1885–1897.

25 Takada K, Jameson SC. Naive T cell homeostasis: from awareness of space to a sense of place. Nat Rev Immunol 2009; 9: 823–832.

26 Liu Y, Easton J, Shao Y, Maciaszek J, Wang Z, Wilkinson MR et al. The genomic landscape of pediatric and young adult T-lineage acute lymphoblastic leukemia. Nat Genet 2017; 63: 5329.

27 Senkevitch E, Durum S. The promise of Janus kinase inhibitors in the treatment of hematological malignancies. Cytokine 2016; 98: 33–41.

28 Chonghaile TN, Roderick JE, Glenfield C, Ryan J, Sallan SE, Silverman LB et al. Maturation stage of T-cell acute lymphoblastic leukemia determines BCL-2 versus BCL-XL dependence and sensitivity to ABT-199. Cancer Discov 2014; 4: 1074–1087.

29 Evrot E, Ebel N, Romanet V, Roelli C, Andraos R, Qian Z et al. JAK1/2 and Pan-deacetylase inhibitor combination therapy yields improved efficacy in preclinical mouse models of JAK2V617F-driven disease. Clin Cancer Res 2013; 19: 6230–6241.

30 Hunger SP, Lu X, Devidas M, Camitta BM, Gaynon PS, Winick NJ et al. Improved survival for children and adolescents with acute lymphoblastic leukemia between 1990 and 2005: a report from the children's oncology group. J Clin Oncol 2012; 30: 1663–1669.

31 DeAngelo DJ, Stevenson KE, Dahlberg SE, Silverman LB, Couban S, Supko JG et al. Long-term outcome of a pediatric-inspired regimen used for adults aged 18-50 years with newly diagnosed acute lymphoblastic leukemia. Leukemia 2015; 29: 526–534.

32 Osinalde N, Moss H, Arrizabalaga O, Omaetxebarria MJ, Blagoev B, Zubiaga AM et al. Interleukin-2 signaling pathway analysis by quantitative phosphoproteomics. J Proteomics 2011; 75: 177–191.

33 Gadina M, Sudarshan C, Visconti R, Zhou YJ, Gu H, Neel BG et al. The docking molecule gab2 is induced by lymphocyte activation and is involved in signaling by interleukin-2 and interleukin-15 but not other common gamma chain-using cytokines. J Biol Chem 2000; 275: 26959–26966.

34 Huang Y, Liu Y, Yu L, Chen J, Hou J, Cui L et al. Histone demethylase KDM2A promotes tumor cell growth and migration in gastric cancer. Tumour Biol 2015; 36: 271–278.

35 Cao L-L, Wei F, Du Y, Song B, Wang D, Shen C et al. ATM-mediated KDM2A phosphorylation is required for the DNA damage repair. Oncogene 2016; 35: 301–313.

36 Zhang W, Xu J. DNA methyltransferases and their roles in tumorigenesis. Biomark Res 2017; 5: 1.

37 Lee M, Dworkin AM, Lichtenberg J, Patel SJ, Trivedi NS, Gildea D et al. Metastasis-associated protein ribosomal RNA processing 1 homolog B (RRP1B) modulates metastasis through regulation of histone methylation. Mol Cancer Res 2014; 12: 1818–1828.

38 Čokić VP, Mojsilović S, Jauković A, Kraguljac-Kurtović N, Mojsilović S, Šefer D et al. Gene expression profile of circulating CD34(+) cells and granulocytes in chronic myeloid leukemia. Blood Cells Mol Dis 2015; 55: 373–381.

39 Wong JJ-L, Lau KA, Pinello N, Rasko JEJ. Epigenetic modifications of splicing factor genes in myelodysplastic syndromes and acute myeloid leukemia. Cancer Sci 2014; 105: 1457–1463.

40 Xu Y, Gao XD, Lee J-H, Huang H, Tan H, Ahn J et al. Cell type-restricted activity of hnRNPM promotes breast cancer metastasis via regulating alternative splicing. Genes Dev 2014; 28: 1191–1203.

41 Cheng C, Sharp PA. Regulation of CD44 alternative splicing by SRm160 and its potential role in tumor cell invasion. Mol Cell Biol 2006; 26: 362–370.

42 Malcovati L, Karimi M, Papaemmanuil E, Ambaglio I, Jädersten M, Jansson M et al. SF3B1 mutation identifies a distinct subset of myelodysplastic syndrome with ring sideroblasts. Blood 2015; 126: 233–241.

43 Canté-Barrett K, Spijkers-Hagelstein JAP, Buijs-Gladdines JGCAM, Uitdehaag JCM, Smits WK, van der Zwet J et al. MEK and PI3K-AKT inhibitors synergistically block activated IL7 receptor signaling in T-cell acute lymphoblastic leukemia. Leukemia 2016; 30: 1832–1843.

44 Peirs S, Matthijssens F, Goossens S, Van de Walle I, Ruggero K, de Bock CE et al. ABT-199 mediated inhibition of BCL-2 as a novel therapeutic strategy in T-cell acute lymphoblastic leukemia. Blood 2014; 124: 3738–3747.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/

© The Author(s) 2017

Supplementary Information accompanies this paper on the Leukemia website (http://www.nature.com/leu)
