COLLAGEN CROSSLINKING AND EXTRACELLULAR MATRIX REMODELING IN

BREAST AND PANCREATIC CANCER

by

ALEXANDER STEVEN BARRETT

B.S., Florida State University, 2010

M.S., University of South Florida, 2013

A thesis submitted to the

Faculty of the Graduate School of the

University of Colorado in partial fulfillment

of the requirements for the degree of

Doctor of Philosophy

Structural Biology and Biochemistry

2017

This thesis for the Doctor of Philosophy degree by Alexander S. Barrett has been approved for the Structural Biology and Biochemistry Program by

Jeff S. Kieft, Chair
Elan Z. Eisenmesser
Robert S. Hodges
Traci R. Lyons
Virginia F. Borges
Kirk C. Hansen, Advisor

Date: 12/15/2017


Barrett, Alexander Steven (Ph.D., Structural Biology and Biochemistry)

Collagen Crosslinking and Extracellular Matrix Remodeling in Breast and Pancreatic Cancer

Thesis directed by Associate Professor Kirk C. Hansen

ABSTRACT

Much effort has been devoted to understanding the molecular mechanisms by which stromal remodeling contributes to solid tumor progression. The extracellular matrix (ECM) is a major component of the stroma that normally regulates tissue development and homeostasis; however, its dysregulation contributes to tumor progression. While the ECM serves as the scaffold upon which tissues are organized, it also provides essential biochemical and biomechanical cues that direct cell growth, survival, migration, differentiation and immune function. Although genetic mutations are known to initiate and drive malignant transformation, cancer progresses within a dynamic ECM that is capable of modulating the hallmarks of cancer. Many solid tumors are characterized by tissue fibrosis that contributes to poor prognosis; however, ECM composition and biomechanically relevant features, such as lysyl oxidase (LOX)-mediated collagen crosslinking, remain understudied in tumor progression. Significant hurdles in characterizing a desmoplastic stroma include the lack of methods capable of characterizing insoluble ECM components and of downstream analytical techniques to analyze them. Therefore, this research aims to develop and apply new analytical methods to unravel the complex connection between fibrosis, tissue stiffness, and collagen crosslinking by revealing a more complete and detailed molecular view of how the ECM changes during tumor progression. We have applied these methods to characterize ECM composition and crosslinking in breast and pancreatic cancer, both of which are characterized by substantial ECM remodeling leading to formation of a desmoplastic stroma. We used genetically engineered mouse models of breast and pancreatic cancer designed to modulate tumor fibrosis in vivo, and then assessed similarities and differences between these models and clinical patient samples. Taken together, the work included in this thesis provides an updated toolkit for ECM characterization and sheds light on new prognostic markers and therapeutic targets that may improve the survival of patients whose tumors are characterized by fibrosis.

The form and content of this abstract are approved. I recommend its publication.

Approved: Kirk C. Hansen


I dedicate this work to my grandmother, Eunice Barrett.


ACKNOWLEDGEMENTS

Completion of this thesis and my academic journey has been one of the great joys of my life thus far. The discoveries presented here are a testament to the love, support, and encouragement from a number of people, without whom none of this would be possible. Many of these people have helped to shape me into the scientist that I am today and to you all, I am forever grateful.

First and foremost, I would like to thank my thesis advisor and mentor Dr. Kirk Hansen. Thank you for giving me intellectual freedom in my work, engaging me in new ideas, and demanding a high quality of work in all my endeavors. Your passion and drive to push the limits of scientific discovery is a constant unwavering beacon that has always drawn me in closer. Thank you for always making me look forward to the next experiment with your endless passion for discovery.

Next, I would like to thank the members of the Hansen Lab and Mass Spectrometry Core Facilities. Without the help of Ryan Hill and Monika Dzieciatkowska, I would not have become the expert in mass spectrometry and proteomics that I am today and I am standing on their shoulders with the completion of this thesis. Monika, you are an exceptional scientist and have taught me more about mass spectrometers than I thought I would ever get to know. Thank you for always being there for me with kind words and intelligent solutions. Ryan, you have laid the groundwork for all the exciting work I was able to accomplish in the lab. I count some of the projects we have worked on together among some of my best times in the lab and am grateful for your help along the way. Travis Nemkov, thank you for being my buddy in the lab, motivating me to accomplish my goals and going through the journey with me. The fun times we have had along the way are some of my greatest memories from school. Angelo D’Alessandro, your unwavering support of my professional development and your dedication to scientific discovery has directly motivated me to do more each day. I have cherished the intelligent discussions we have had and your words of encouragement. I would like to thank the many collaborators that I have worked with during my time in school who were integral to the success of the projects presented here. In particular, I would like to thank Ori Maller for his friendship and help along the way. I would also like to thank the members of my thesis committee for their insightful comments and suggestions: Robert Hodges, Elan Eisenmesser, Jeff Kieft, Traci Lyons and Ginger Borges. Also, thank you to the students, faculty and program administrators for their support during my time in school. Elizabeth Wethington, thank you for your friendship and always being there for me.

Lastly, I would like to thank my friends and family. To my parents, Steve and Veronica, you have taught me to believe in myself and pushed me to achieve the goals I set out for myself. Thank you for always knowing just what to say when the going got tough. I would like to thank my brother Cam Barrett for always being there for me and reminding me to have fun. I would like to thank my grandma, Eunice, for always being my number one supporter in everything that I do. To my soon-to-be wife, Katie: I would not have come this far without your love and belief in me. You are my best friend and your unbreakable spirit has motivated me to accomplish my goals and strive for success in everything that I do.


TABLE OF CONTENTS

CHAPTER

I: THE EXTRACELLULAR MATRIX AT A GLANCE
    Thesis Direction and Summary
    Introduction
    The Role of the ECM in Tissue Development and Homeostasis
    Properties and Features of the ECM
    Collagen and Elastin
    Collagen
    History
    The collagen family
    Collagen biosynthesis
    Collagen Crosslinking
    Immature bivalent reducible crosslinks
    Lysyl hydroxylases
    Hydroxy lysine aldehyde crosslinking pathway
    Lysine aldehyde crosslinking pathway
    Mature trivalent crosslinks
    Elastin
    Properties of elastin
    Elastin crosslinking
    Fibronectin
    Proteoglycans
    Laminin and Basement Membranes
    Matricellular Proteins
    The ECM and Cancer
    The Provisional ECM and the Wound Healing Response
    Biomechanical Signaling in the Tumor Microenvironment
    Dysregulated ECM Dynamics are a Hallmark of Cancer
    ECM Stiffness and Collagen Crosslinking are Tumor Promotional
    Abnormal ECM Architectures are Associated with Tumor Progression
    Figures

II: HYDROXYLAMINE CHEMICAL DIGESTION FOR INSOLUBLE EXTRACELLULAR MATRIX CHARACTERIZATION
    Introduction
    Materials and Methods
    Reagents
    QconCAT Design
    Sample Preparation for Proteomics
    Hydroxylamine (NH2OH) Treatment
    Cyanogen Bromide (CNBr) Treatment
    Trypsin Digestion
    LC-SRM Analysis
    LC-MS/MS Analysis
    Database Searching and Identification
    Error Tolerant Searches
    Results
    CNBr Versus NH2OH Digestion of Insoluble ECM Components
    Femur
    Skin
    Lung
    Muscle
    Liver
    Evaluation of Chemical Digestion Cleavage Specificity
    Error Tolerant Searches
    Discussion
    Figures

III: EXTRACELLULAR MATRIX REMODELING IN MAMMARY GLAND AND LIVER TISSUE MICROENVIRONMENTS DURING THE REPRODUCTIVE CYCLE
    Introduction
    Materials and Methods
    Rodent Studies
    Cell Lines
    Portal Vein Injections
    Immunofluorescence and Imaging
    Sample Preparation for Proteomic Analysis
    Detergent/chaotrope Removal & Protein Digestion
    Liquid Chromatography Tandem Mass Spectrometry & Data Analysis
    Statistics
    Results
    Development of Quantitative ECM Proteomic Methodology
    QconCAT Based Proteomics Reveals Unique and Shared Mammary Gland and Liver ECM Profiles
    Mammary Gland ECM Proteomics Across the Reproductive Cycle
    Discussion
    Note About Use of CNBr
    Figures

IV: INCREASED MAMMOGRAPHIC DENSITY IS CORRELATED WITH FIBRILLAR COLLAGEN ABUNDANCE
    Introduction
    Results
    Discussion
    Figures

V: HYDROXY LYSINE DERIVED COLLAGEN CROSSLINKS PROMOTE POOR BREAST CANCER PATIENT PROGNOSIS AND TREATMENT RESISTANCE
    Introduction
    Materials and Methods
    Preparation of Tissue for Hydrolysis
    Protein Hydrolysis
    Preparation of Crosslink Enrichment Column
    UHPLC Analysis
    MS Data Acquisition
    Quantification of Crosslinked Amino Acids
    Human Breast Tissue
    Picrosirius Red Staining and Quantification
    Tissue Preparation for AFM Measurements of ECM Stiffness
    AFM Measurements of ECM Stiffness on Tissue Sections
    SHG Image Acquisition
    LH2 IHC and Prognostic Analyses
    Study population
    Tumor evaluation
    IHC statistical analyses
    Statistical Analysis
    Results
    Overexpression of Lysyl Oxidase in the Mammary Tumor Stroma Results in Increased Collagen Crosslinking
    Association Between Collagen Crosslinking Abundance and Fiber Organization in Human Breast Tissue
    Characterization of Collagen Crosslinking in Human Breast Tumor Subtypes
    High PLOD2 Expression is Associated with Poor Breast Cancer Prognosis and Treatment Resistance
    Discussion
    Figures

VI: GENOTYPE TUNES PDAC TENSION TO DRIVE MATRICELLULAR-ENRICHED FIBROSIS AND TUMOR AGGRESSION
    Introduction
    Materials and Methods
    Mice Studies
    Histology
    LC-MS/MS and LC-SRM Proteomic Analysis
    Atomic Force Microscopy Measurements
    Two-Photon Second Harmonic Microscopy and Analysis
    Results
    Discussion
    Human PDAC Pilot Summary
    Figures

VII: EXAMINATION OF MATRICELLULAR FIBROSIS AND WOUND HEALING IN A MODEL OF PANCREATIC DUCTAL ADENOCARCINOMA PROGRESSION
    Introduction
    Materials and Methods
    Reagents
    Proteomic Sample Preparation
    Hydroxylamine (NH2OH) Treatment
    Trypsin Digestion
    LC-SRM Analysis
    LC-MS/MS Analysis
    Proteomic Data Analysis
    xAAA Sample Preparation
    Protein Hydrolysis
    xAAA Data Analysis
    Database Searching and Protein Identification
    H&E Staining
    Picrosirius Red Staining and Quantification
    Results
    Discussion
    Figures

VIII: SUMMARY OF FINDINGS AND FUTURE DIRECTIONS
    Tools to Investigate the Extracellular Matrix
    Mammary Gland ECM Remodeling
    Human Breast Cancer
    Pancreatic Cancer
    Figures

IX: COLORADO CLINICAL AND TRANSLATIONAL SCIENCE FELLOWSHIP
    Medical Oncology - Breast Cancer Center
    Autopsy Pathology
    Summary

REFERENCES

APPENDIX
    A. Crosslinked Amino Acid Standard Characterization
    B. Publications

LIST OF FIGURES

FIGURE

1.1 Schematic Diagram of Lysine Hydroxylation and Crosslinking in Collagen
1.2 Hydroxy Lysine Aldehyde (Hyl^ald) Crosslinking Pathway
1.3 Lysine Aldehyde (Lys^ald) Collagen Crosslinking Pathway
1.4 Proposed Scheme of Natural Crosslinking Reactions in Elastin
1.5 Abnormal Breast ECM Architectures are Associated with Tumor Progression
2.1 Workflow Diagram for Comparative Analysis of CNBr and NH2OH Using Quantitative QconCAT ECM Proteomics
2.2 Quantitative Comparison of Chemical Digestion Methods for Enrichment of Insoluble Femur Matrix
2.3 Quantitative Comparison of Chemical Digestion Methods for Enrichment of Insoluble Femur Matrix
2.4 Quantitative Comparison of Chemical Digestion Methods for Enrichment of Insoluble Skin Matrix
2.5 Quantitative Comparison of Chemical Digestion Methods for Enrichment of Insoluble Lung Matrix
2.6 Quantitative Comparison of Chemical Digestion Methods for Enrichment of Insoluble Muscle Matrix
2.7 Quantitative Comparison of Chemical Digestion Methods for Enrichment of Insoluble Liver Matrix
2.8 CNBr Selectively Brominates Tyrosine Containing Collagen Peptides
2.9 Fibrillar Collagen and ECM Abundance Across Tissues
3.1 Quantitative QconCAT ECM Proteomics Pipeline
3.2 QconCAT Based ECM Proteomics Reveals Unique and Shared Mammary Gland and Liver ECM Profiles
3.3 Principal Component Analysis Reveals Dynamic and Cyclical Mammary Gland ECM Remodeling Across the Reproductive Cycle
3.4 Quantitative ECM Proteomics Unravels The Unique Composition and Abundance of ECM Proteins Across the Reproductive Cycle
4.1 Comparing the ECM Composition of Human Breast Tissue at Different Densities
4.2 Fibrillar Collagen Abundance Correlates with Increased Mammographic Density
4.3 Mammographic Density is Not Associated with Significant Changes in Collagen Crosslinking
5.1 xAAA Workflow Diagram and Crosslink Analysis of Normal Tissues
5.2 Overexpression of Lysyl Oxidase in the Mammary Tumor Epithelium Does Not Alter Collagen Crosslinking
5.3 Collagen Crosslinking Closely Correlates with Fibrillar Collagen Accumulation and ECM Stiffness in Mammary Tumors Overexpressing Lysyl Oxidase in the Stromal Compartment
5.4 Curly and Straightened Invasive Ductal Adenocarcinoma Architectures are Both Associated with Increased Collagen Crosslinking
5.5 Triple-Negative Breast Cancer Patients Favor the Formation of Hydroxy Lysine Derived Collagen Crosslinks
5.6 High LH2 Expression Correlates with Poorly Differentiated Tumors and Cumulative Distant Metastasis-Free Survival in Triple-Negative Patients
6.1 PDAC Genotype Tunes Epithelial Tension to Regulate Fibrosis
6.2 Targeted Proteomics and Crosslinking Analysis Reveals Changes in Protein Abundance, Solubility and Crosslinking
6.3 JAK-Stat3 Signaling Drives ECM Remodeling and Stiffening
6.4 Human PDAC Lesions are Characterized by Increased Matricellular Protein Abundance and Crosslinking
7.1 Characterization of Cell Population Markers in Normal Pancreas and During PDAC Progression
7.2 KTC PDAC Demonstrate a Unique ECM Composition Relative to Normal Pancreatic Tissue
7.3 KTC PDAC Demonstrate a Unique ECM Composition Relative to Normal Pancreatic Tissue
7.4 Collagen Crosslinking Decreases During KTC PDAC Progression
8.1 Towards Comprehensive Characterization of the Solid Tumor ECM
8.2 Broad Applicability of xAAA Method to Hard and Soft Tissues
A.1 Summary of xAA Standard Characterization
A.2 xAA Standard Characterization by MS

ABBREVIATIONS

BAPN  β-aminopropionitrile
BGN  Biglycan
CELA2A  Chymotrypsin Like Elastase Family Member 2A
CHAPS  3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate
CNBr  Cyanogen Bromide
COMP  Cartilage Oligomeric Matrix Protein
CPA1  Carboxypeptidase A1
CSPG  Chondroitin Sulphate Proteoglycan
CTRB1  Chymotrypsinogen B1
CV  Coefficient of Variation
DCIS  Ductal Carcinoma In Situ
DCN  Decorin
deH-DHLNL  dehydro-Dihydroxylysinonorleucine
deH-HHMD  dehydro-Histidino-Hydroxymerodesmosine
deH-HLNL  dehydro-Hydroxylysinonorleucine
deH-LNL  dehydro-Lysinonorleucine
DHLNL  Dihydroxy Lysinonorleucine
DPT  Dermatopontin
dPyr  deoxy Lysyl Pyridinoline
ECM  Extracellular Matrix
EDTA  Ethylenediaminetetraacetic Acid
EGFR  Epidermal Growth Factor Receptor
EHS  Engelbreth-Holm-Swarm
EMT  Epithelial to Mesenchymal Transition
ER  Endoplasmic Reticulum
FA  Formic Acid
FACIT  Fibril-Associated Collagens with Interrupted Triple Helices
FGA  Fibrinogen α
FGB  Fibrinogen β
FGG  Fibrinogen γ
FN  Fibronectin
GAG  Glycosaminoglycan
GCG  Glucagon Precursor
GEMM  Genetically Engineered Mouse Model
GFAP  Glial Fibrillary Acidic Protein
GFP  Green Fluorescent Protein
GPCR  G-Protein Coupled Receptor
H&E  Hematoxylin and Eosin
HHMD  Histidino-Hydroxymerodesmosine
His  Histidine
HLCC  Hydroxy Lysine aldehyde (Hyl^ald)-derived Collagen Crosslinks
HLKNL  Hydroxylysino-Keto-Norleucine
HLNL  Hydroxy Lysinonorleucine
HSPG  Heparan Sulfate Proteoglycans
Hyl  Hydroxy Lysine
Hyl^ald  Hydroxy Lysine Aldehydes
Hyp  Hydroxy Proline
IDC  Invasive Ductal Carcinoma
iECM  Insoluble ECM
IHC  Immunohistochemistry
INS2  Insulin-2 Precursor
JAK  Janus Kinase
KRT20  Keratin, type-I, Cytoskeletal 20
KRT7  Keratin, type-II, Cytoskeletal 7
LCC  Lysine aldehyde (Lys^ald)-derived Collagen Crosslinks
LC-MS/MS  Liquid Chromatography Tandem Mass Spectrometry
LC-SRM  Liquid Chromatography Selected Reaction Monitoring
LH  Lysyl Hydroxylase
LKNL  Lysino-Keto-Norleucine
LNL  Lysinonorleucine
LOX  Lysyl Oxidase
LOXL  Lysyl Oxidase Like
LUM  Lumican
Lys  Lysine
Lys^ald  Lysine Aldehydes
MD  Mammographic Density
NaBH4  Sodium Borohydride
NG  Asparagine Glycine (Asn-Gly)
NH2OH  Hydroxylamine
P3H  Prolyl-3 Hydroxylase
P4H  Prolyl-4 Hydroxylase
PanIN  Pancreatic Intraepithelial Neoplasia
PCA  Principal Component Analysis
PDAC  Pancreatic Ductal Adenocarcinoma
PEDF  Pigment Epithelium-Derived Factor
PLOD  Procollagen-Lysine, 2-Oxoglutarate 5-Dioxygenase
PLS-DA  Partial Least Squares Discriminant Analysis
Pro  Proline
PRSS2  Anionic Trypsin-2 Precursor
PTM  Post-Translational Modification
Pyr  Hydroxy Lysyl Pyridinoline
QconCAT  Quantitative Concatemer
QM-PDA  Quasi-Mesenchymal PDAC
ROCK  Rho-associated Kinase
sECM  Soluble ECM
SHG  Second Harmonic Generation
SHH  Stromal Sonic Hedgehog
SIL  Stable Isotope Labeled
SLRP  Small Leucine-Rich Proteoglycan
SMA  Smooth Muscle Actin
SPARC  Secreted Protein Acidic and Rich in Cysteine
SPE  Solid Phase Extraction
SRY  Sex-determining Region Y
STAT3  Signal Transducer and Activator of Transcription 3
TACS  Tumor Associated Collagen Signatures
TFA  Trifluoroacetic Acid
TFM  Traction Force Microscopy
TGF-β  Transforming Growth Factor Beta
THBS  Thrombospondin
TIC  Total Ion Current
TNC  Tenascin-C
TPM2  Tropomyosin beta chain
TSR  Thrombospondin type-1 Repeats
VIM  Vimentin
VPLIR  Virgin, Pregnancy, Lactation, Involution, Regression
YAP  Yes-associated Protein 1

CHAPTER I

THE EXTRACELLULAR MATRIX AT A GLANCE

Thesis Direction and Summary

Tumors are stiffer than normal tissue and are characterized by a collagen-rich extracellular matrix (ECM) that is dynamically remodeled during disease progression. Recently, much effort has been devoted to understanding how ECM architecture and composition are involved in the pathogenesis and progression of solid tumors. However, the majority of studies have taken reductionist approaches in which only a handful of ECM components are studied at one time. While these studies remain highly influential, knowledge of ECM composition and crosslinking in the tumor microenvironment is still lacking. The problem is two-fold: 1) there is a lack of methods capable of solubilizing insoluble ECM proteins (i.e. fibrillar collagen) and 2) direct characterization of collagen crosslinking in human tumors has not been attempted. Together these problems have led to a persistent underrepresentation of ECM components in large proteomic datasets and a gap in knowledge regarding how alterations in the activity of crosslinking-related enzymes, such as lysyl oxidase (LOX) and lysyl hydroxylase (LH), directly affect the abundance and specificity of collagen crosslinks in tumors. In this chapter, I provide a primer on the components, properties and features of the ECM and on how the ECM can modulate some of the hallmarks of cancer. The remaining chapters are organized such that each represents an individual manuscript that is either accepted, under review, or in preparation. In Chapter II, I develop a new approach for the characterization of insoluble ECM components and demonstrate how it may be applied to tissues that vary in their overall ECM content. This approach is used to solubilize the insoluble ECM components of breast and pancreatic tissue in several different contexts throughout this work (Chapters III, IV, VI, VII), further demonstrating its broad utility. In Chapter III, targeted proteomics is used to quantitatively characterize ECM remodeling in the rodent mammary gland and liver during the reproductive cycle. In Chapter IV, I investigate the relationship between mammographic density, fibrillar collagen abundance and collagen crosslinking in clinical patient samples using targeted proteomics and crosslinked amino acid analysis (xAAA). In Chapter V, I provide a detailed description of the xAAA method. Additionally, I offer evidence for the direct relationship between LOX, collagen crosslinking, and tissue stiffness in human breast cancer. I go on to demonstrate the importance of LH in the formation of hydroxy lysine derived collagen crosslinks (HLCCs) and how its expression affects distant metastasis-free survival and treatment resistance in a large breast cancer patient cohort. In Chapters VI and VII, I apply both ECM proteomics and crosslinking analysis to study matricellular fibrosis and wound healing in human pancreatic ductal adenocarcinoma (PDAC) and three genotypically distinct murine models of PDAC. I also perform a follow-up study that explores differences between the normal pancreas stroma and PDAC stroma at early and late timepoints to further investigate matricellular fibrosis in the context of disease progression. Taken together, the work included in this thesis delivers an extensive resource regarding ECM composition and crosslinking in the tumor microenvironment and provides insight into potential therapeutic strategies that may improve the survival of breast and pancreatic cancer patients whose tumors are characterized by fibrosis. A summary of findings is presented in Chapter VIII. A reflection on my experience in the Colorado Clinical and Translational Scientist training program is presented in Chapter IX.

Introduction

The Role of the ECM in Normal Tissue Development and Homeostasis

The extracellular matrix (ECM) is the non-cellular component present within all tissues and organs and is a major component of the cellular microenvironment (1, 2). In addition to functioning as a physical scaffold for cells, the ECM initiates and maintains essential biochemical and biomechanical cues required for tissue development, differentiation and homeostasis through, among other mechanisms, sequestration and release of growth factors and matricryptins and modulation of the hydration level and pH of the local microenvironment (3). Taking this into account, the structure of tissues and organs is critical for their function. Indeed, loss of normal tissue architecture is a prerequisite for, and one of the hallmarks of, most cancers. As such, normal organ architecture can act as a powerful tumor suppressor, capable of preventing and reverting malignant phenotypes even in cells with disease-causing mutations (4-6). The fact that organ function and homeostasis are driven by organ architecture, and that cells in every organ carry the same genetic information, raises the question of how tissue-specific form and function are achieved. Dynamic reciprocity is a concept that describes how tissue-specific function is achieved through interactions between the cell and the surrounding extracellular matrix (7). The model describes the dynamic bi-directional crosstalk between the ECM and the cellular microenvironment that determines the structure and function of a given tissue (8).

The continuously regulated process of differentiation can be defined as the acquisition of tissue-specific functions that modulate interactions between the ECM and the cellular microenvironment. In this way, bi-directional dynamic reciprocity plays a major role in maintaining stable expression of differentiation-specific genes (9). However, final tissue-specific architecture, form and function are influenced by the unique context in which they develop. Every organ is composed of tissues derived from one of the three embryonic germ layers: 1) endoderm, which forms the epithelium of the lungs, pancreas, liver and digestive organs, among others; 2) mesoderm, which generates blood vessels, bone, muscle, and mesenchymal connective tissue, among others; and 3) ectoderm, which gives rise to the epithelium of the skin and its derivatives, including the mammary gland (8). Interactions between epithelial and mesenchymal constituents during development direct the creation of normal tissue architecture (i.e. morphogenesis). The idea that tissue development is not cell autonomous, but is instead instructed by the surrounding environment, was hypothesized as early as 1817 (10).

While cell-ECM crosstalk is integral for the initial development of tissues, it is also important within the context of tissue repair after injury. The process of somatic wound healing, regardless of direct cause (e.g. immunological, physical etc.), proceeds similarly to initial embryonic development. As such, the various stages of wound healing are characterized by massive cell migration, phenotypic differentiation, and heightened biosynthetic activity at the site of repair. The dynamic interactions that occur between growth factors and the ECM during these stages are integral to wound healing.

Properties and Features of the ECM

The direct and indirect mechanisms by which the ECM regulates cell behavior are complex. The ECM is composed of biochemically distinct macromolecules that can be broken down into three broad groups: fibrous proteins (including collagens and elastin), glycoproteins (including fibronectin, proteoglycans, and basement membranes) and matricellular proteins (including tenascin and thrombospondins). The protein and non-protein constituents of the ECM vary not only in terms of their functional roles but also in terms of their structure. These components work together to form both a basement membrane and an interstitial matrix. Stromal cells primarily lay down the proteins that form the interstitial matrix, such as fibrillar collagen, matricellular proteins such as tenascin C, and structural glycoproteins such as fibronectin. The basement membrane is produced by epithelial, endothelial and stromal cells to separate the epithelium from the stroma. It is primarily composed of type IV collagens and laminins (among others), which help to connect the basement membrane to the stroma through interactions with collagen. While the basement membrane is readily solubilized in high salt, detergent-rich buffers, the stroma is far more insoluble and requires extensive extraction in a strong chaotrope and subsequent chemical digestion. Nevertheless, both insoluble and soluble ECM components are ordered in a tissue-specific manner that can impart distinct positive or negative biochemical and biomechanical cellular cues. Mechanisms of ECM function include anchoring cells to the basement membrane (11, 12); blocking or facilitating cell migration (e.g. tumor cell dissemination) (13, 14); acting as a signal reservoir for growth factors and cytokines that helps form concentration gradients and promotes timed release mechanisms (15, 16); binding growth factors and acting as a low affinity co-receptor or presenter that influences cell-cell crosstalk (12, 17); signaling through ECM cleavage products (i.e. matricryptins) (18, 19); and generating biomechanical force through the focal adhesion complex (20, 21).

Each of these unique properties is related to one another and contributes to the importance of the ECM in development and disease. The ECM is highly dynamic with all of the above mentioned properties occurring simultaneously in space and time. Changes in dynamics can result from genotypic alterations that drive changes in ECM composition and architecture through synthesis and degradation of individual components (14).

Collagen and Elastin

Collagen

History. Collagen is the main structural protein of the ECM and is widely considered to be the most abundant protein in the human body. The term was coined in France in the 19th century as collagene, to describe the constituent of connective tissues that produced glue. The word was adapted to its current English form, collagen, in 1865; however, it was not officially defined by the Oxford Dictionary until 1893: “that constituent of connective tissue that yields gelatin on boiling” (22). The uniqueness of the collagen fibril itself, however, had been documented in great detail earlier in the 19th century by pathologists such as Henle and Ranvier (23).

During this time, an active controversy was brewing between two schools of thought regarding the origin of collagen fibers. One school believed that collagen fibers developed directly from the cytoplasm of the fibroblast (24-27), and the other that they formed apart from the cell in what was called the “intercellular ground substance” (28-30). In the middle of this controversy, Jean Nageotte proved that acid solubilized collagen could be precipitated into observable fibers which looked similar to those of the intact tissue. This led him to suggest that extracted collagen was a precursor of collagen fibers – providing strong support for the theory of extracellular formation of collagen fibers in vivo (31).

Within the next century, significant progress was made, including the discovery of the monomeric building block (i.e. tropocollagen) that orders collagen fibrillogenesis (Jerome Gross, 1956) (32). It was soon discovered that the general “collagen molecule” (which we now know was type I collagen) was composed of three polypeptide chains (2 α1 chains and 1 α2 chain) that assembled into a triple helix with a coiled-coil conformation. The primary sequence of this protein is exceptionally unusual, being made up for the most part of repeating Gly-Xaa-Yaa triplets where proline (Pro) is often in the Xaa position and hydroxy proline (Hyp) in the Yaa position. In an extraordinary finding at the time, Ramachandran showed in 1967 that these modifications are necessary for the maturation of collagen into the fibrillar form found in collagen-rich tissues (33).
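The Gly-Xaa-Yaa repeat described above can be illustrated with a short Python sketch. This is an illustration only and not a method used in this thesis. It assumes a one-letter amino acid string read in a single fixed frame, with hydroxyproline written as "O" (a common convention for collagen-like peptides); the function names and the toy sequence are hypothetical.

```python
def triplets(seq):
    """Split a sequence into consecutive three-residue windows, dropping any incomplete tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def gly_fraction(seq):
    """Fraction of triplets whose first residue is glycine (G)."""
    ts = triplets(seq)
    if not ts:
        return 0.0
    return sum(t[0] == "G" for t in ts) / len(ts)


def xaa_yaa_counts(seq):
    """Among Gly-led triplets, count Pro (P) at Xaa and Hyp (O) at Yaa."""
    gly_led = [t for t in triplets(seq) if t[0] == "G"]
    pro_at_xaa = sum(t[1] == "P" for t in gly_led)
    hyp_at_yaa = sum(t[2] == "O" for t in gly_led)
    return pro_at_xaa, hyp_at_yaa


# Hypothetical toy sequence, not taken from any real collagen chain.
toy = "GPOGPOGAOGPKGER"
print(triplets(toy))
print(gly_fraction(toy), xaa_yaa_counts(toy))
```

A real analysis would also need to locate the start of the helical domain and tolerate interruptions in the repeat, which this fixed-frame sketch does not attempt.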

The collagen family. In broader terms, we now know that collagens comprise a family of extracellular proteins that collectively make up ~30% of total protein mass (31, 34, 35). In vertebrates, at least 28 collagen types, encoded by more than 40 different genes, have been identified; these can be categorized into families based on their supramolecular assemblies and other specific features. Specific collagen families include fibrillar collagens, fibril-associated collagens with interrupted triple helices (FACITs), network-forming collagens and transmembrane collagens, among others. The common structural feature that all collagens share is the presence of a triple helix, which ranges from 96% of total structure in the case of collagen I to as little as 10% of total structure in the case of collagen XII (35).

Interestingly, we continue to find proteins that contain triple-helical collagenous domains (e.g. C1q, adiponectin, collectins, ficolins, macrophage scavenger receptors, among others), many of which are involved in innate immunity (36).

Fibrillar collagens (types I, II, III, V, XI, XXIV and XXVII) are distinguished from other collagens by their ability to form fibrils with a characteristic repeated banding pattern called D-periodicity, which arises from the ordered staggering of collagen molecules (22, 35). Each molecule is formed from three polypeptide chains, named α chains, in either a homotrimeric or a heterotrimeric fashion, depending on the collagen type and some existing variants. Another common characteristic of fibrillar collagens is a long triple-helical region of Gly-X-Y triplet repeats, flanked by short non-helical segments called telopeptides. Because type I collagen is the most abundant collagen found in various connective tissues such as skin, bone, tendon and dentin, it is critical for maintaining the integrity and elasticity of tissues (22, 35).

Collagen biosynthesis. The assembly of collagen molecules into fibrils is an entropy-driven process, similar to other protein self-assembly systems such as actin filaments. It is believed that these processes are driven by the loss of solvent molecules from the surface of the collagen molecule and result in assemblies with a circular cross-section, minimizing the surface area/volume ratio of the final assembly. An essential feature of collagen fibril formation is that collagens are synthesized as soluble procollagens in the endoplasmic reticulum (ER). In the rough ER, procollagens undergo extensive post-translational modifications, including hydroxylation of Pro and lysine (Lys) residues (Figure 1.1A), subsequent O-linked glycosylation of specific hydroxylysine (Hyl) residues, and asparagine-linked glycosylation. Proline residues in the Y position of the Gly-X-Y repeat are mostly hydroxylated by prolyl-4-hydroxylase (P4H), while some of those in the X position are hydroxylated by prolyl-3-hydroxylase (P3H) (37). During this process, several proteins act as chaperones during trimerization and folding of the α chains, either by selectively binding unfolded procollagen α chains to prevent premature triple helix formation (e.g. P4H, protein disulphide isomerase (PDI), Bip/Grp78, Grp94, and immunophilins), or by binding the completely folded collagen molecule to stabilize the triple helix and possibly prevent premature aggregation. The modified procollagen molecules are then transported through the Golgi network and secreted.

Shortly after secretion, N- and C-terminal propeptides are removed by proteases, leaving short N- and C-terminal telopeptides. Following processing, collagen molecules spontaneously self-assemble in ordered staggered arrays to form a right-handed super-helix whose intramolecular sterics force the center of the helix to be occupied by glycine residues (any amino acid sequence other than the Gly-X-Y repeat disrupts triple helix formation). Adjacent monomers are staggered relative to one another by 234 residues, giving rise to the 67 nm D-period of the collagen fibril, which consists of the unique “hole zone channels” and overlap zones (36, 38).
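The relationship between the 234-residue stagger and the 67 nm D-period can be made explicit with a short calculation. The sketch below is illustrative only. It assumes a molecule length of 300 nm (the value given in Figure 1.1) and the conventional staggered-array picture, in which a molecule whose length is a non-integer multiple of D leaves one overlap zone and one gap (hole) zone within each period. The function and key names are hypothetical.

```python
import math


def d_period_geometry(stagger_residues: int,
                      d_period_nm: float,
                      molecule_length_nm: float) -> dict:
    """Derive simple fibril geometry from the stagger and D-period.

    Assumption: the fractional part of (molecule length / D) is the overlap
    zone, and the remainder of each D-period is the gap (hole) zone.
    """
    # Axial rise per residue implied by the stagger and the D-period.
    rise_nm = d_period_nm / stagger_residues

    # Molecule length expressed in units of D.
    length_in_d = molecule_length_nm / d_period_nm

    # Split each D-period into overlap and gap zones.
    overlap_fraction = length_in_d - math.floor(length_in_d)
    gap_fraction = 1.0 - overlap_fraction

    return {
        "rise_per_residue_nm": rise_nm,
        "length_in_d": length_in_d,
        "overlap_nm": overlap_fraction * d_period_nm,
        "gap_nm": gap_fraction * d_period_nm,
    }


# 234 residues and 67 nm are taken from the text; 300 nm is an assumed length.
geometry = d_period_geometry(234, 67.0, 300.0)
for name, value in geometry.items():
    print(f"{name}: {value:.3f}")
```

With these rounded inputs the molecule spans between four and five D-periods, which is why each period contains both an overlap zone and a hole zone rather than a uniform density of molecules.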

There are several factors known to influence collagen fibrillogenesis and organization. The levels of post-translational modifications on the collagen molecule itself (e.g. lysine hydroxylation and glycosylation) can alter the type of fiber that forms, with more highly modified collagen molecules being associated with the formation of fibrils of smaller diameter (39). Additionally, proteoglycans bearing glycosaminoglycan (GAG) chains, such as the small leucine-rich proteoglycan (SLRP) decorin (DCN), influence collagen fibril outgrowth through interactions with fibronectin (FN), thrombospondin (THBS) and transforming growth factor-beta (TGF-β). As such, ECM proteins work together to promote collagen fibril formation through dense networks of protein-protein interactions that stabilize the initial aggregation of collagen and its subsequent outgrowth (36, 38).

Collagen Crosslinking

After collagen’s spontaneous aggregation into fibrils, it is further stabilized by a final post-translational modification: the formation of intra- and intermolecular crosslinks. Crosslinking is initiated by the lysyl oxidase (LOX) family of enzymes, which generate aldehydes at specific Lys or Hyl residues in the telopeptide regions of the α chains (Figure 1.1B). LOX is a copper-dependent amine oxidase that initiates covalent intra- and intermolecular crosslinking of collagen by oxidatively deaminating specific Lys and Hyl residues in the telopeptide domains. The family comprises LOX and the LOX-like (LOXL) proteins LOXL1, LOXL2, LOXL3, and LOXL4.

LOX and LOXL enzymes have been shown to be active on fibrillar collagen, while LOXL2 has been associated with basement membrane type IV collagen (40). The substrate specificity of the other isoforms has not been clearly defined (41). It was reported that LOX is highly active on growing, native collagen fibrils, and it was suggested that the intermolecular interaction between collagen molecules is important for enzyme activity (42). Studies have shown that the binding sites for LOX in type I collagen are in the triple-helical region, potentially in the area with highly conserved sequences (Hyl-Gly-His-Arg), where it can catalyze the formation of aldehydes in the telopeptides of the adjacent collagen molecule (43). The oxidative deamination activity of LOX enzymes can be inhibited by the lathyritic agent β-aminopropionitrile (BAPN) (44). The resulting Lys aldehydes (Lysald) or Hyl aldehydes (Hylald) determine the tissue-specific crosslinking pathway through their involvement in a series of spontaneous intra- or intermolecular reactions, thus providing the matrices with the tensile strength and elasticity that are essential for the functional integrity of the tissue. To that end, the mechanical properties of collagen fibers depend primarily on the formation of head-to-tail Schiff base crosslinks between end-overlapped collagen molecules (44, 45).

Immature bivalent reducible crosslinks. The types of collagen crosslinks that form in normal connective tissues are determined prior to crosslink formation by the optional hydroxylation of specific telopeptide and helical lysine residues by lysyl hydroxylases (LH1, LH2, and LH3), which are encoded by distinct procollagen-lysine, 2-oxoglutarate 5-dioxygenase (PLOD) genes (46). In fibrillar collagens, each telopeptide contains one crosslinking site: residue 9 of the N-terminal telopeptide (Ntx) and residue 16 of the C-terminal telopeptide (Ctx). In the helical region of processed mature collagen, crosslinking sites are found at residues 87 and 930 (47). After optional lysine hydroxylation, LOX acts on Lys and Hyl residues to initiate covalent intra- and intermolecular crosslinking of mature collagen, thereby increasing the structural integrity, strength, and stiffness of the ECM. The dual action of LHs and LOXs ultimately gives rise to two unique, tissue-specific crosslinking pathways: 1) hydroxylysine aldehyde (Hylald)-derived collagen crosslinks (HLCCs) and 2) lysine aldehyde (Lysald)-derived collagen crosslinks (LCCs) (44, 45).

Lysyl hydroxylases. Lysine hydroxylation is important for the formation of collagen crosslinks as well as for glycosylation (48). Lysyl hydroxylase (LH) is a 2-oxoglutarate dioxygenase that catalyzes the hydroxylation of Lys residues in procollagen α chains and in other proteins with collagenous sequences, both co- and post-translationally. The reaction requires ferrous iron, oxygen, 2-oxoglutarate and ascorbate as cofactors and releases succinate and carbon dioxide (CO2) along with the formation of Hyl (52). In the helical region of collagens, Hyl residues are present exclusively in the Y position of Gly-X-Y triplets, whereas in the non-helical regions of the α chains Gly is replaced by serine in the N-telopeptide (X-Hyl-Ser) and by alanine in the C-telopeptide (X-Hyl-Ala) (49).

Hydroxylysine aldehyde crosslinking pathway. Lysine residues in the telopeptide region of fibrillar collagen are primarily hydroxylated prior to conversion to Hylald by LOX (Figure 1.2). Once formed, Hylald reacts with the ε-amino group of a Hyl or Lys residue in the helical region of the adjacent collagen molecule. The resulting Schiff bases (dehydro-dihydroxylysinonorleucine (deH-DHLNL) and dehydro-hydroxylysinonorleucine (deH-HLNL)) then undergo Amadori rearrangements to form hydroxylysino-keto-norleucine (HLKNL, Hylald × Hyl) or lysino-keto-norleucine (LKNL, Hylald × Lys), respectively. These keto-amine forms are more stable crosslinks and likely contribute to the insolubility of fibrillar collagen in mineralized tissues. However, for these immature crosslinks to withstand acid hydrolysis, they must be stabilized by reduction with sodium borohydride (NaBH4) and analyzed in their reduced forms (i.e. dihydroxylysinonorleucine (DHLNL) and hydroxylysinonorleucine (HLNL)) (44, 50).

Lysine aldehyde crosslinking pathway. The Lysald in the telopeptides reacts with Hyl or Lys in the helical region, forming dehydro-hydroxylysinonorleucine (deH-HLNL, Lysald × Hyl) or dehydro-lysinonorleucine (deH-LNL, Lysald × Lys) (Figure 1.3). In addition, a unique reducible tetravalent crosslink, dehydro-histidino-hydroxymerodesmosine (deH-HHMD), can be formed between the aldol condensation product of two Lysald, histidine (His) and a helical Hyl (Lysald × Lysald × His × Hyl). Like the keto-amine crosslinks, these aldimines are analyzed in their reduced forms, i.e. hydroxylysinonorleucine (HLNL), lysinonorleucine (LNL) and histidino-hydroxymerodesmosine (HHMD) (44, 45, 50).
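The bivalent crosslinks described in the two pathways above can be summarized as a lookup table keyed by the telopeptide aldehyde and its helical partner. The following Python sketch is only an organizational aid that follows the nomenclature used in this chapter; the names `Crosslink`, `BIVALENT_CROSSLINKS` and `precursors_of_reduced` are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Crosslink(NamedTuple):
    schiff_base: str            # initial aldimine product
    keto_amine: Optional[str]   # Amadori product (Hylald pathway only)
    reduced: str                # form analyzed after NaBH4 reduction


# Keys are (telopeptide aldehyde, helical residue), as described in the text.
BIVALENT_CROSSLINKS = {
    ("Hylald", "Hyl"): Crosslink("deH-DHLNL", "HLKNL", "DHLNL"),
    ("Hylald", "Lys"): Crosslink("deH-HLNL", "LKNL", "HLNL"),
    ("Lysald", "Hyl"): Crosslink("deH-HLNL", None, "HLNL"),
    ("Lysald", "Lys"): Crosslink("deH-LNL", None, "LNL"),
}


def precursors_of_reduced(reduced_name: str) -> List[Tuple[str, str]]:
    """Return every (aldehyde, helical residue) pair yielding a reduced form."""
    return sorted(
        pair
        for pair, crosslink in BIVALENT_CROSSLINKS.items()
        if crosslink.reduced == reduced_name
    )


for name in ("DHLNL", "HLNL", "LNL"):
    print(name, precursors_of_reduced(name))
```

One consequence visible in the table is that reduced HLNL can arise from either pathway (Hylald × Lys or Lysald × Hyl), so the reduced product alone does not identify the pathway of origin.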

Mature trivalent crosslinks. With age, the abundance of immature reducible crosslinks declines along with collagen solubility, yet paradoxically the strength of connective tissue increases (51, 52). This paradox led to the hypothesis that immature bivalent crosslinks mature into trivalent crosslinks. In 1977, Fujimoto was the first to isolate and characterize these crosslinks (53). So far, only two major forms of trivalent crosslinks have been identified and characterized (discussed below in detail). Bone is a unique example of the need for both immature and mature crosslinks.

The reducible crosslinks in bone collagen decrease steeply in content between birth and 25 years of age, but persist in significant amounts throughout adult life (54). This may be because bone is constantly remodeled, so new collagen is always being formed. In addition, it has been suggested that the mineralization process may inhibit the maturation of bivalent into trivalent crosslinks (55). In human cartilage, by contrast, reducible crosslinks have virtually disappeared by 10-15 years of age, having been replaced by their maturation products, crosslinked pyridinoline residues (54).

The best characterized and most widely distributed mature crosslinks are hydroxylysyl pyridinoline (Pyr) and its deoxy analog, lysyl pyridinoline (dPyr) (Figures 1.2 and 1.3) (53, 55, 56). Pyr is widely distributed in the collagens of most vertebrate connective tissues, whereas dPyr, although also widely distributed, predominates in bone and dentin (57). Pyr was first discovered in bovine Achilles tendon as a crosslink among two Hylald and one Hyl (Hylald × Hylald × Hyl) (53). Later it was discovered that deoxypyridinoline crosslinks two Hylald and one Lys (Hylald × Hylald × Lys) and is formed from the condensation of lysino-keto-norleucine (LKNL) with Hylald (56). Their intrinsic fluorescence has allowed these crosslinks to be characterized by traditional HPLC approaches (45, 58).

Pyrrole crosslinks were first identified by Kuyper in 1990 in bovine tendon collagen, and decreased levels have been found in aged connective tissue (59). Interestingly, pyrrole crosslinks have only been found at the N-telopeptide (Hylald × Lysald × Hyl or Lys) (57, 59).

Elastin

Properties of elastin. Elastin is the major protein component of the elastic fiber and is critical to the structural integrity and function of tissues in which reversible extensibility or deformability are a requirement, such as in major arterial vessels, lungs, and skin. In contrast to collagen, elastin is encoded by a single gene.

Elastin matures in the ECM through the assembly of a soluble precursor molecule (i.e. tropoelastin) into a highly crosslinked polymer. Of the 37 lysines per 800 residues in tropoelastin, approximately 10 are involved in desmosine and isodesmosine crosslinks, approximately 15 are present in crosslink intermediates, and approximately 5 remain as lysine in mature elastin, leaving 7 residues unaccounted for (47, 60). In contrast to collagen, which primarily self-assembles, elastin requires the assistance of helper proteins to align the multiple crosslinking sites on elastin monomers (60). Once the maturation process is complete, crosslinked elastin is among the most insoluble and hydrophobic proteins known, with few polar groups. In fact, elastin from higher vertebrates, including humans, contains over 30% glycine, and approximately 75% of the entire sequence is made up of just four hydrophobic amino acids (Gly, Val, Ala, Pro) (61). This property makes elastin one of the most stable proteins in the body, able to last the entire lifetime of the organism. Tissues that are rich in elastin include the aorta and major vascular vessels (28-32% dry mass), lung (3-7%), elastic ligaments (50%), tendon (4%), and skin (2-3%) (61, 62).

Elastin crosslinking. The highly insoluble and hydrophobic elastin molecule is stabilized by the formation of covalent crosslinks. It is crosslinked by desmosine and isodesmosine, two amino acids unique to elastin that form the centers of tetravalent crosslinks (63). In fact, these were the first crosslinks to be discovered in either collagen or elastin (64). All known elastin crosslinks are derived from lysine only, as hydroxylysine is not present in the protein (61). The exact route of formation remains unknown; however, it has been proposed that condensation of allysine aldol with dehydrolysinonorleucine forms dihydrodesmosines, which are then oxidized to form desmosine and isodesmosine (Figure 1.4) (47, 60). The complete sequence of pig tropoelastin has been determined, 50% of it by analysis of tryptic peptides, which separate into two groups: 1) hydrophobic peptides ranging in size from 17 to 81 residues and 2) polar di-, tri-, and tetravalent peptides. The polar peptides present in elastin contain mostly Ala and Lys residues in the form of -Ala-Ala-Lys- and -Ala-Ala-Ala-Lys-, which are the crosslinking sequences of elastin. Three Lys residues in these sequence motifs are converted to allysine residues and the fourth, distinguished by the sequence Lys-Tyr (in pig elastin), provides the ring nitrogen of the pyridinium crosslink (65). Indeed, much of the biochemistry of elastin was elucidated from pig and bovine samples. However, in bovine elastin phenylalanine replaces the tyrosine C-terminal to the desmosine crosslink in several of the crosslinking sequence motifs, and it has been observed that bovine elastin contains fewer tyrosine residues (6 per 1000) than pig elastin (16 per 1000) (65).

Fibronectin

Fibronectin is a large adhesive glycoprotein that primarily functions by binding membrane spanning receptor proteins called integrins. Through these interactions it plays a role in cell adhesion, migration, growth and differentiation (66, 67).

Fibronectin also interacts directly with ECM components such as collagen, fibrin and proteoglycans, including heparan sulfate proteoglycans (HSPGs), among others. Fibronectins are produced from a single gene by a wide variety of epithelial and mesenchymal cells in vitro, including fibroblasts, chondrocytes, myofibroblasts, macrophages, and hepatocytes. Fibronectin is secreted as a dimer consisting of two similar subunits joined by a disulfide linkage (68). Each domain of fibronectin is responsible for one of fibronectin’s many binding functions. Three repeating sequence motifs (types I, II and III) are organized into functional domains that contain binding sites for ECM proteins and cell surface receptors (e.g. integrins) (67, 69). Type III repeats account for more than 60% of the sequence and are considered the predominant structural feature of fibronectin (66). However, fibronectin mRNAs have been shown to give rise to multiple versions of the protein through alternative RNA splicing, which occurs predominantly at three sites: 1) extra type III domain A (EDA or EIIIA), 2) extra type III domain B (EDB or EIIIB), and 3) the connecting segment between the fourteenth and fifteenth type III repeats (IIICS) (70, 71). Five variants are produced from splicing in the IIICS segment. Splicing at these three segments can result in over twenty different fibronectin subunits (72). In addition, several malignant cell lines in three-dimensional, laminin-rich ECM have been shown to preferentially upregulate the EDA+ fibronectin splice variant at the protein level compared with nonmalignant cells (73). Despite this knowledge, the biological functions of fibronectin isoforms remain poorly understood.

Fibronectin and its integrin receptors have been shown to play an integral role in the progression of metastatic disease (74). Fibronectin exhibits at least two independent cell adhesion regions with different integrin receptor specificities. As cancer cells are less adhesive (i.e. more invasive) than normal cells, these processes are implicated in tumor progression and are important for tumor cell migration, invasion and metastasis. This interaction may also play a role in chemotaxis and control of proliferative pathways.

Proteoglycans

In contrast to the predominantly fibrillar structure of collagens, proteoglycans form the basis of higher-order ECM structures around cells and are composed of genetically distinct families of multidomain proteins that have one or more covalently attached glycosaminoglycan (GAG) chains (75). GAGs are long, negatively charged, linear chains of disaccharide repeats. At least 25 gene products with at least one GAG modification have been identified, with many structural variants. Of note, there are no structural domains common to all proteoglycans. The primary biological function of proteoglycans derives from the biochemical and hydrodynamic characteristics of the GAG components, which bind water to provide hydration and compressive resistance. As such, proteoglycans are highly abundant in compressible tissues like cartilage. Proteoglycan classes are named according to the type of GAG chain that is attached, as well as the distribution and density of these chains along the core protein. This distinguishing characteristic allows proteoglycans to be grouped into several broad categories: heparan sulfate proteoglycans (HSPGs), chondroitin sulfate proteoglycans (CSPGs), small leucine-rich proteoglycans (SLRPs), hyaluronan and keratan sulfate, each having a unique function in the ECM (76, 77).

In addition to being dominant components of the ECM, proteoglycans can also function as accessory proteins in tissues rich in other matrix proteins. For example, the SLRPs are a family of proteoglycans that have been implicated in fibrillar collagen assembly and include well-known members such as decorin, biglycan and lumican, which act to stabilize collagen through association with the mature fibril. SLRPs have also been shown to participate in cell-ECM signaling, with binding sites for cytokines and growth factors having recently been discovered (78).

Laminin and Basement Membranes

Laminin and other basement membrane components (i.e. collagen IV) are primarily found in the basal lamina and mesenchymal compartments, where they bridge the gap between structural ECM molecules and reinforce the network that provides support for cells and soluble molecules within the matrix (79). These large glycoproteins are primarily composed of laminin-type epidermal growth factor (EGF)-like repeats and alpha-helical domains. Laminins mediate cell interactions with other ECM components through cell surface receptors (e.g. integrins) and consist of α, β, and γ chains that combine via a triple-helical coiled-coil domain at the center of each chain to form a cruciform shape (80). Perhaps the most famous laminin isoform is laminin 111, which is composed of single α, β and γ chains that associate to form a cruciform shape. Dr. Mina Bissell’s work over the past several decades has provided extensive insight into how laminins mediate tissue-specific gene expression in the mammary epithelium and can enhance functional differentiation.

Collagen-IV is also a major component of the basement membrane, associates with laminin directly, and is integral to the formation of normal physiologic collagen networks (81, 82). Collagen-IV assembles to form tetramers that are stabilized by a covalent Met-Lys crosslink (S-hydroxylysino-methionine) between Met93 and Hyl211 (83). Loss of an intact basement membrane is a prerequisite for tumor cell invasion and metastasis, and proteomic studies in our own lab have shown that collagen-IV abundance changes drastically during malignant transformation (84).

Matricellular Proteins

Matricellular proteins are a diverse group of proteins that modulate cell function by interacting with cell-surface receptors, proteases, and hormones, among others (85). Matricellular proteins are secreted into the ECM but do not have a structural role (86). Instead, distinguishing characteristics of this group of proteins include 1) increased expression during development and wound healing, 2) binding to many cell-surface receptors, ECM components, growth factors, cytokines and proteases, 3) roles in de-adhesion or counter-adhesion, in contrast to the adhesive nature of most structural ECM proteins, and 4) a subtle phenotype in mice with a targeted disruption of some matricellular protein genes (86, 87). Their role in these processes is highly contextual and depends on the individual properties of a tissue-specific matrix.

Initially, only three members comprised the matricellular group of proteins (SPARC, TSP-1, and TNC), grouped mainly as secreted proteins that modulate cell-ECM interactions (88-90). This narrow classification has been expanded to include additional SPARC, TSP (TSPs 1-4, COMP) and tenascin (TN-C, R, W, X, Y) family members, with new proteins introduced in recent years including osteopontin, the CCN family of proteins, periostin, R-spondins, short fibulins, galectins, SLRPs, PEDF, and plasminogen activator inhibitor-1 (91). Although new and old members are considered structurally diverse, most contain repeats of common ECM structural motifs, such as thrombospondin type 1 repeats (TSRs), fibronectin type III repeats and EGF-like repeats, and are able to bind calcium (91).

Much attention has been given to the role of matricellular proteins such as thrombospondins during tumor progression and fibrosis. It has been suggested that these proteins help to regulate formation of the provisional matrix during solid tumor progression, which is characterized by a wound healing response (92). Indeed, our group and others have observed drastic changes in the expression of tenascins (i.e. TNC), fibrinogens and thrombospondins that support a de-adhesive and pro-migratory cellular phenotype (93). Along these same lines, a myriad of studies have linked matricellular protein expression to increased production of fibrillar collagen and the formation of fibrosis, a hallmark of many solid tumors (94). Through a diverse set of interactions with soluble growth factors, cytokines and ECM molecules themselves, these proteins can either promote or suppress cancer progression in a tissue-specific manner.

The ECM and Cancer

The Provisional ECM and the Wound Healing Response

Repair of tissue after injury depends on the synthesis and deposition of a fibrous extracellular matrix to replace lost or damaged tissue (92). The architecture of the newly synthesized ECM is remodeled over time to emulate a normal tissue matrix. The ECM plays an integral role in the repair process through regulation of the behavior of the wide variety of cell types that are mobilized to the damaged area in order to rebuild the tissue (90, 95). Acute inflammation, re-epithelialization, and contraction all depend on cell–ECM interactions and help to minimize infection and promote rapid wound closure (96). Matricellular proteins are up-regulated during wound healing where they modulate interactions between cells and the extracellular matrix to exert control over events that are essential for efficient tissue repair (86).

Many solid cancers are characterized by a perpetual wound healing response that contributes to fibrotic disease. Aberrant wound healing processes act to stiffen a tissue, offsetting biomechanically sensitive signaling cascades involved in cell proliferation (87). The role of this process in tumor progression is only now beginning to be appreciated. Further exploration of the interplay between these mechanisms may better identify how proteins implicated in these processes may be exploited for therapeutic intervention.

Biomechanical Signaling in the Tumor Microenvironment

In vivo studies have demonstrated how aberrant mechanical properties in cancerous tissue contribute to tumor aggression and reduced treatment efficiency (94, 97-99). Studies have shown that stromal cells regulate the mechanical properties of the ECM through associations with numerous growth factors that act as a signaling reservoir to modulate cellular behavior (17, 100). For example, transforming growth factor β1 (TGF-β1) is a potent primary activator of quiescent fibroblasts, converting them into highly contractile myofibroblasts involved in tissue fibrosis. Under fibrotic conditions where the ECM is stiff, TGF-β1 is more highly retained in the matrix, promotes TGF-β-induced epithelial-to-mesenchymal transition (EMT) and induces a basal-like tumor cell phenotype that promotes invasion and metastasis (95, 101). As another example, several KRAS-driven genetically engineered mouse models of pancreatic cancer exhibit both loss of TGF-β signaling and elevated β1-integrin mechanosignaling, thereby engaging a positive feedback loop whereby STAT3 signaling promotes tumor progression by increasing matricellular fibrosis and tissue tension (94, 102).

Dysregulated ECM Dynamics are a Hallmark of Cancer

A fibrotic extracellular matrix (ECM) can obliterate the normal architecture and function of the underlying tissue. In the context of a tumor-promotional state, a fibrotic ECM is capable of modulating tumor outcome by regulating soluble factors that induce inflammation and angiogenesis and stimulate cell growth and migration (101, 103). Although alterations in the fibrotic ECM can initially suppress cellular migration away from a primary site by creating a physical barrier around tumor cells, increased collagen deposition is paradoxically correlated with poor patient outcomes, which suggests that the organization of ECM components is an important feature of tumor progression (104). Thus, the ECM remodeling process must be tightly regulated; it is primarily controlled by the expression of ECM enzymes at multiple levels (i.e. prolyl and lysyl hydroxylases in the cell, lysyl oxidases and matrix metalloproteases outside the cell). During solid tumor progression, there is a natural increase in tissue stiffness as a result of compositional changes in the ECM as well as alterations in the abundance and activity of at least one class of crosslinking enzymes, the LOXs (105). Levental et al. showed that increased LOX activity can directly modify mammary tumor progression by regulating collagen crosslinking and stiffness (106). Although collagen crosslinking accompanies fibrosis and fibrosis increases the risk of malignancy, the link between collagen crosslinking, tissue stiffness and fibrosis has not been clearly defined at the molecular level.

Activities of ECM remodeling enzymes may become dysregulated with age or in a disease state, leading to abnormal ECM dynamics. Stromal cells and immune cells are the major contributors to the altered activity of ECM remodeling enzymes and have been implicated in almost all solid cancers (99). Indeed, increased matrix fiber deposition or reduced ECM turnover is prominent in tissue fibrosis of many organs (e.g. idiopathic pulmonary fibrosis). To this end, various collagens and proteoglycans have shown increases in abundance during tumor formation and progression (94). Altered abundance of ECM components greatly influences the biomechanical properties of the matrix and can potentiate tumor-promotional signaling cascades to promote malignant transformation (107).

ECM Stiffness and Collagen Crosslinking are Tumor Promotional

Collagen crosslinking is a critical regulator of desmoplasia, and it has been suggested that the abundance and specificity of ECM crosslinks in a tissue can impact malignant transformation and alter tumor progression (106). Indeed, breast cancer patients whose tumors have high LOX expression have poor distant metastasis-free and overall survival (108-110). LOX stiffens tissues and promotes tumor progression and fibrosis, while inhibition of LOX has been shown to eliminate metastasis in mice with orthotopically grown breast tumors (110). This provides evidence that modifying the state of collagen crosslinking and ECM stiffness, two physical properties of the tumor microenvironment, can alter the invasive behavior of an oncogene-pretransformed mammary epithelium.

Previous findings have suggested that tissue fibrosis can regulate cancer behavior by influencing the biophysical properties of the microenvironment to alter force at the cell and/or tissue level. Integrins are heterodimeric, transmembrane cell adhesion receptors for fibronectin and other extracellular matrix molecules. Integrins bind to the ECM and initiate biochemical signaling and stimulate cytoskeletal remodeling to regulate cell behavior (111). Force generation increases integrin expression and the formation of focal adhesions. Concurrent with high LOX activity, breast tumor cells with higher tensile force have elevated integrins and focal adhesions (112). Attenuation of these elements (i.e. decreased cell tension, reduced integrin expression, or inhibition of integrin or LOX activity) has been shown to slow and inhibit breast tumor progression (106).

Abnormal Breast ECM Architectures are Associated with Tumor Progression

Before the matrisome was ever defined (113), pathologists and histologists had observed changes in the density and architecture of the connective tissue surrounding a tumor mass (114). Initially, our ability to visualize how the macromolecular collagen architecture of tumors changed was limited; this improved with the application of connective tissue stains like Masson’s Trichrome to brain tumors in 1938 (115, 116) and of ECM-specific imaging techniques such as second harmonic generation (SHG), which was first described on melanoma tumor slices in 2003 (117). Only a few years later, Dr. Patricia Keely and colleagues were able to use SHG to observe collagen fibers in breast tissue and their subsequent remodeling during breast tumor progression, a phenomenon she dubbed tumor-associated collagen signatures (TACS) (118). Her initial observations included a distinct progression from wavy and relaxed collagen fibers in normal mammary gland, to more linear fibers in ductal carcinoma in situ (DCIS), and finally to radially aligned fibers perpendicular to the tumor border prior to invasion and metastasis.

Through these experiments we have been able to correlate these findings with patient prognosis. Further, these findings have allowed us to discern that not just collagen fiber deposition, but also collagen architecture (i.e. fiber density and crosslinking), is a key determinant of tumor progression (94). Despite this knowledge, it remains unclear what the functional outcomes of different macro-architectures are at the molecular level of collagen crosslinking and global ECM composition (119) (Figure 1.5).

Figure 1.1: Schematic Diagram of Lysine Hydroxylation and Crosslinking in Collagen. A) Cartoon representation of a mature fibrillar collagen fiber. N- and C-terminal telopeptides are hydroxylated by lysyl hydroxylase 2 (LH2). N- and C-terminal propeptides are cleaved by procollagen endopeptidases to form the mature collagen fiber (300 nm). B) Lysine (Lys) and hydroxylysine (HyLys) residues in the telopeptide region of mature collagen are targeted by the crosslinking enzyme lysyl oxidase (LOX), which forms reactive aldehyde groups that spontaneously react to form covalent collagen crosslinks.

Figure 1.2: Hydroxy Lysine Aldehyde (Hylald) Crosslinking Pathway (47, 120, 121). Telopeptide lysine residues are modified by lysyl hydroxylase. Lysyl oxidase modifies Hyl residues to hydroxy allysine (Hylald), which spontaneously reacts with Lys and Hyl residues to form the Schiff bases (dehydro-dihydroxylysinonorleucine (deH-DHLNL) and dehydro-hydroxylysinonorleucine (deH-HLNL)). They then undergo Amadori rearrangements to form hydroxylysino-keto-norleucine (HLKNL) or lysino-keto-norleucine (LKNL), respectively. These crosslinks can be reduced with NaBH4 to form LNL and DHLNL. Mature crosslink products (Pyr and dPyr) are formed from the reaction of lysino-keto-norleucine (LKNL) or hydroxylysino-keto-norleucine (HLKNL) with hydroxy allysine.

Figure 1.3: Lysine Aldehyde (Lysald) Collagen Crosslinking Pathway (47, 120, 121). Lysyl oxidase modifies Lys residues to form allysine (Lysald), which spontaneously reacts with Lys and Hyl residues in the helical region to form the Schiff bases (dehydro-hydroxylysinonorleucine (deH-HLNL) and dehydro-lysinonorleucine (deH-LNL)). The mature products of these crosslinks are currently unknown. Allysine can combine with an additional allysine residue to form allysine aldol. Allysine aldol can form the trivalent crosslinks hydroxy merodesmosine and aldol histidine, or the tetravalent crosslink histidino-hydroxymerodesmosine (HHMD) (only post-reduction products shown), through aldol condensation reactions with Hyl or His, or a combination of the two.

Figure 1.4: Proposed Scheme of Natural Crosslinking Reactions in Elastin (47, 64). Lysyl oxidase modifies Lys residues on elastin to form allysine (Lysald), which spontaneously reacts with Lys and Hyl to form the Schiff bases (dehydro-hydroxylysinonorleucine (deH-HLNL) and dehydro-lysinonorleucine (deH-LNL)). These products undergo aldol condensation reactions to form dihydro merodesmosine, which is ultimately oxidized to form desmosine and isodesmosine.

Figure 1.5: Abnormal Breast ECM Architectures are Associated with Tumor Progression (106, 118, 122). Tumor associated collagen signatures (TACS) are associated with tumor prognosis. TACS-1 is associated with good prognosis (1st image), TACS-2, moderate prognosis (2nd image), and TACS-3, poor prognosis (3rd image). TACS remodeling is associated with breast cancer progression and tissue stiffness. We hypothesize that increased abundance of lysyl oxidase (LOX) alters collagen crosslinking to increase tissue stiffness. TACS images courtesy of Dr. Patricia Keely, University of Wisconsin Madison.

CHAPTER II

HYDROXYLAMINE CHEMICAL DIGESTION FOR INSOLUBLE

EXTRACELLULAR MATRIX CHARACTERIZATION

Introduction

The extracellular matrix (ECM) serves diverse functions and is a major component of the cellular microenvironment (1, 2). In addition to functioning as a physical scaffold for cells, the ECM initiates and maintains essential biochemical and biomechanical cues required for tissue development, differentiation and homeostasis through, among other mechanisms, sequestration and release of growth factors and cryptic matrix sites and modulation of the hydration level and pH of the local microenvironment (3). Although it is fundamentally composed of water, proteins and polysaccharides, the ECM exhibits an extraordinary level of tissue specificity as a result of reciprocal interactions with various cellular components that generate unique ECM compositions and architectures (1, 8). However, dysregulation of the ECM can lead to fibrosis, which is prevalent in many pathologies such as idiopathic pulmonary fibrosis, atherosclerosis and solid tumors (94, 123-126). Despite these important functions and roles in disease, analytical methods to accurately quantify the abundance of ECM components in tissues are not standard. In particular, there are very few methods currently available to characterize insoluble ECM (iECM) components, which are rich in high molecular weight, covalently crosslinked fiber proteins that undoubtedly play a dominant role in defining tissue architectures (47).

Isolation of insoluble tissue ECM requires enrichment through extensive decellularization steps to remove bulk cellular components that are solubilized by detergents. Post-decellularization, a chaotrope is often used to facilitate protein denaturation and the solubilization of a fraction of the remaining ECM components. Chaotropic agents (e.g. urea, guanidine hydrochloride (GndHCl), phenol) are dissolved in water and act to disrupt the hydrogen bonding network between water molecules, thereby promoting protein denaturation and solubilization. After this extraction, an insoluble pellet remains that contains a significant amount of protein based on acid hydrolysis and amino acid analysis (127). It was recently shown that greater than 80% of fibrillar collagen in murine liver and mammary gland resides in the iECM fraction (93), a fraction excluded from traditional proteomic methods. Additionally, we have shown previously that chemical digestion with cyanogen bromide (CNBr), which cleaves at methionine residues, can be used prior to enzymatic cleavage with trypsin. We have found that these steps are both necessary and sufficient to fully solubilize insoluble ECM components, such as fibrillar collagen from the liver and lungs (93, 127). This method was recently used to achieve higher levels of core matrisome protein sequence coverage compared to other ECM analysis strategies (128) and has been successfully applied to the characterization of ECM from mammary gland, liver, heart, lung, bone, and solid tumors, further supporting the general utility of the approach (93, 94, 127, 129-132).

Chemical digestion with cyanogen bromide (CNBr) is an obvious choice for ECM proteomics due to its extensive use in the biochemical analysis of collagens since its introduction as a cleavage agent in the early 1960s (133). Initially, a cyanosulfonium bromide derivative of methionine is formed. Under acidic conditions, this intermediate leads to the formation of methylthiocyanate and homoserine iminolactone, which decompose spontaneously to a carboxy-terminal homoserine lactone and a new NH2-terminal fragment (Figure 2.1A) (133). Despite the ability of CNBr digestion to solubilize insoluble collagen matrices, there are several caveats associated with its use. First, CNBr has a high degree of toxicity and requires careful handling during preparation, use and disposal (134). Second, the protocol for CNBr digestion is relatively laborious and involves a critical transfer step of the post-chaotrope insoluble ECM pellet into a glass vessel suitable for the strongly acidic conditions of digestion, introducing an opportunity for increased technical variability. Third, we have observed declining mass spectrometer performance during acquisition of large data sets of CNBr-digested iECM, necessitating additional sample clean-up steps and instrument preventative maintenance measures. Therefore, we sought to implement an orthogonal chemical digestion approach to circumvent these issues and improve sample throughput.

Hydroxylamine (NH2OH) is a nucleophilic amine reported to cleave at Asn-Gly (NG) sites in proteins. NG sites cyclize to an imide, which is then subjected to nucleophilic attack by NH2OH. Hydrolysis releases a C-terminal hydroxamate and N-terminal glycinyl residue (Figure 2.1B). In the past, this method has been used successfully to produce small soluble peptides, such as insulin-like growth factor (135). It has several advantages over CNBr, including elimination of the transfer step after chaotrope extraction, safety, low cost, relatively low cross-reactivity, and reported selectivity for Asn-Gly linkages (136, 137). Therefore, we sought to address two major questions: 1) Compared to CNBr, will hydroxylamine digestion allow for comparable solubilization and quantification of ECM proteins while reducing technical variability? 2) Does tissue type influence the recovery of matrisome components across the chemical digestion approaches?
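The two cleavage specificities described above (CNBr at Met, NH2OH at Asn-Gly) can be illustrated with a minimal in silico sketch. The function names and the toy sequence are hypothetical and are not part of the workflow used in this work; the sketch assumes ideal, fully specific cleavage, C-terminal to each Met for CNBr and between Asn and Gly for NH2OH.

```python
def cnbr_sites(seq: str) -> list[int]:
    """1-based positions of Met residues; CNBr cleaves C-terminal to each Met."""
    # A Met at the very end of the chain releases no new fragment, so skip it.
    return [i + 1 for i, aa in enumerate(seq[:-1]) if aa == "M"]


def nh2oh_sites(seq: str) -> list[int]:
    """1-based positions of the Asn in each Asn-Gly pair (cleavage between N and G)."""
    return [i + 1 for i in range(len(seq) - 1) if seq[i] == "N" and seq[i + 1] == "G"]


def cleave(seq: str, sites: list[int]) -> list[str]:
    """Split seq after each 1-based site position and return the fragments."""
    fragments, start = [], 0
    for pos in sites:
        fragments.append(seq[start:pos])
        start = pos
    fragments.append(seq[start:])
    return fragments


# Hypothetical toy sequence for illustration only (not a real collagen sequence).
toy = "GPMGPRNGEKGMA"
cnbr_fragments = cleave(toy, cnbr_sites(toy))
nh2oh_fragments = cleave(toy, nh2oh_sites(toy))
```

Counting sites this way on a real sequence gives only the theoretical number of cleavage sites; observed cleavage, including any cleavage at other Asn-X sites, has to be established from the MS data.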

While traditional global proteomics approaches have improved our understanding of matrix composition, these approaches lack the ability to accurately quantify ECM protein abundance and stoichiometries in the microenvironment. Here we employ our previously described targeted LC-SRM (liquid chromatography-selected reaction monitoring) approach with ECM-specific stable isotope labeled (SIL) quantitative concatamers (QconCATs) to evaluate the two methods. Five previously described QconCAT proteins covering 170 peptides, which represent 82 ECM, ECM-associated, and common cellular contaminant proteins of ECM preparations (93, 94, 127), are utilized here. We leveraged this approach to provide a quantitative comparison of insoluble ECM prepared by CNBr and NH2OH chemical digestion methods. Our data reveal that NH2OH achieves equivalent or greater abundance of ECM components in all tissues analyzed. Importantly, our results provide an alternative ECM enrichment methodology that can be applied to a variety of tissues for identification and quantification of chaotrope-insoluble matrix proteins.

Materials and Methods

Reagents

Reagents were purchased from Sigma-Aldrich (St. Louis, MO) unless otherwise noted. Sodium chloride was from Acros Organics (part of Thermo Fisher). Microcentrifuge tubes and other consumables were from Axygen Inc. (Union City, CA). Formic acid (FA), trifluoroacetic acid (TFA), and hydroxylamine hydrochloride were from Fluka (Buchs, Switzerland). Anhydrous potassium carbonate, guanidine hydrochloride, sodium hydroxide, and acetonitrile (LC-MS grade) were from Fisher Scientific (Pittsburgh, PA). Trypsin (sequencing grade, TPCK treated) was from Promega (Madison, WI).

QconCAT Design

Stable isotope-labeled QconCATs were designed as previously described (138). Six QconCATs were used to make 170 peptides covering 82 proteins in the Mus musculus proteome. Sequences can be found in (93).

Sample Preparation for Proteomics

Three replicates of frozen femur, lung, liver, skin, and skeletal muscle harvested from a 6-week-old female mouse (The Jackson Laboratory, Bar Harbor, ME) were powderized in liquid nitrogen using a ceramic mortar and pestle. Weighed tissue (approximately 5 mg of each) was homogenized in freshly prepared high-salt buffer (50 mM Tris-HCl, 3 M NaCl, 25 mM EDTA, 0.25% w/v CHAPS, pH 7.5) containing 1x protease inhibitor (Halt Protease Inhibitor, Thermo Scientific) at a concentration of 10 mg/mL. Homogenization took place in a bead beater (Bullet Blender Storm 24, Next Advance, 1 mm glass beads) for 3 min at 4 °C. Samples were then spun for 20 min at 18,000 x g at 4 °C, and the supernatant was removed and stored as Fraction 1. A fresh aliquot of high-salt buffer was added to the remaining pellet at 10 mg/mL of the starting weight, vortexed at 4 °C for 15 min, and spun for 15 min. The supernatant was removed and stored as Fraction 2. This high-salt extraction was repeated once more to generate Fraction 3, after which freshly prepared guanidine extraction buffer (6 M guanidinium chloride adjusted to pH 9.0 with NaOH) was added at 10 mg/mL and vortexed for 1 hour at room temperature. The samples were then spun for 15 min, and the supernatant was removed and stored as Fraction 4. Fractions 1, 2, & 3 were combined and all fractions were stored at -20 °C until further analysis. The remaining pellets of each tissue, representing insoluble ECM proteins, were treated with either hydroxylamine or cyanogen bromide.

Hydroxylamine (NH2OH) Treatment

Following Fraction 4, pellets were treated with freshly prepared hydroxylamine buffer (1 M NH2OH-HCl, 4.5 M guanidine-HCl, 0.2 M K2CO3, pH adjusted to 9.0 with NaOH) at 10 mg/mL of the starting tissue weight (139). The samples were briefly vortexed, then incubated at 45 °C with end-over-end rotation for 17 hours. The tubes were fastened shut due to pressure build-up during incubation. Following incubation, the samples were spun for 15 min at 18,000 x g, and the supernatant was removed and stored as Fraction 5 at -20 °C until further proteolytic digestion with trypsin. The final pellet was stored at -20 °C until further analysis.

Cyanogen Bromide (CNBr) Treatment

Following Fraction 4, pellets were transferred to a glass vial and treated with 100 mM CNBr/86% TFA solution at 10 mg/mL of the starting tissue weight. The samples were agitated in the dark at room temperature for 17 hours. Following incubation, the solvent was evaporated under N2, followed by 3 x 1 mL washes with 100 mM ammonium bicarbonate (pH 8) in which the samples were briefly vortexed and dried in a speedvac. The dried samples were stored as Fraction 5 at -20 °C until further proteolytic digestion with trypsin. Immediately prior to trypsin digestion, Fraction 5 was re-suspended in freshly prepared urea buffer (8 M urea in 100 mM ammonium bicarbonate) at 10 mg/mL, vortexed for 1 hour at room temperature, and the supernatant used for tryptic digestion.

Trypsin Digestion

200 µL of Fractions 4 and 5 of all samples were subsequently subjected to reduction, alkylation, and enzymatic digestion with trypsin. 200 fmol of each SIL peptide (170 peptides total) was spiked into 200 µL of sample to allow for two injections per sample (100 fmol of eQ 1-6 per injection). A filter-aided sample preparation (FASP) approach, as well as C18 cleanup, was performed as previously described (140).

LC-SRM Analysis

Samples were analyzed by LC-SRM and LC–MS/MS as described (132). Equal volumes from each post-digestion sample were combined and injected every third run and used to monitor technical reproducibility. Skyline was used for method development and to extract the ratio of endogenous light peptides to heavy internal standards from LC-SRM data for protein quantification as described (141). LC–MS/MS data were processed as previously described (93). Limits of detection, quantification, and dynamic range were determined for each peptide as previously described (127). Final values are expressed as fmol/mg, where milligrams represents milligrams of starting wet tissue weight. Principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA) were performed using the MetaboAnalyst online platform with sum and range scaling normalizations (142). To control for any technical variability introduced prior to chemical digestion, we normalized the CNBr and NH2OH treated samples to the measured protein abundance of the sECM fraction. Additionally, in order to ensure that the increased ratios (12C6/13C6) were a result of increased endogenous protein signal and not a result of decreased QconCAT digestion efficiency, we analyzed a subset of each tissue extract with pre-digested QconCAT and found no evidence of reduced QconCAT digestion efficiency. All data have been made publicly available through the PRIDE database (PX-Submission #185523).
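The conversion from a light:heavy ratio to fmol/mg can be sketched as follows. This is an illustrative calculation only, not the exact procedure used here: the function and parameter names are hypothetical, and it assumes that the digested aliquot is representative of the fraction, that the heavy spike amount is accurately known, and that no losses occur. It omits the sECM normalization and the limit-of-detection and limit-of-quantification checks described above.

```python
def fmol_per_mg(light_to_heavy: float,
                heavy_spike_fmol: float,
                aliquot_ul: float,
                extraction_mg_per_ml: float) -> float:
    """Convert a light:heavy peak-area ratio to fmol per mg of wet tissue."""
    # Endogenous amount in the digested aliquot, inferred from the known heavy spike.
    endogenous_fmol = light_to_heavy * heavy_spike_fmol
    # Wet tissue represented by that aliquot (uL -> mL, then mL * mg/mL).
    tissue_mg = (aliquot_ul / 1000.0) * extraction_mg_per_ml
    return endogenous_fmol / tissue_mg


# Hypothetical ratio of 0.5 with a 200 fmol spike in a 200 uL aliquot
# of a fraction prepared at 10 mg/mL of starting tissue weight.
example = fmol_per_mg(light_to_heavy=0.5, heavy_spike_fmol=200.0,
                      aliquot_ul=200.0, extraction_mg_per_ml=10.0)
```

Under these assumptions a 200 µL aliquot corresponds to 2 mg of starting tissue, so the ratio is effectively multiplied by 100 to give fmol/mg.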

LC-MS/MS Analysis

Samples were analyzed on an LTQ Orbitrap Velos Pro mass spectrometer (Thermo Fisher Scientific) coupled to an Eksigent nanoLC-2D system through a nanoelectrospray source. 8 μL of sample was injected onto a trapping column (ZORBAX 300SB-C18, dimensions 5 x 0.3 mm, 5 μm) and washed with 0.1% formic acid (FA) in water at a flow rate of 5 μL/min for 5 min. The analytical column (100 μm i.d. × 150 mm fused silica capillary packed in house with 4 μm 80 Å Synergi Hydro C18 resin (Phenomenex; Torrance, CA)) was then switched on-line at 600 nL/min for 10 min to load the sample. The flow rate was adjusted to 350 nL/min, and peptides were separated over a 120-min linear gradient of 2–40% ACN with 0.1% FA. Data acquisition was performed using the instrument supplied Xcalibur™ (version 2.1) software. The mass spectrometer was operated in positive ion mode. Full MS scans were acquired in the Orbitrap mass analyzer over the 300–1800 m/z range with 60,000 resolution at m/z 400. Automatic gain control (AGC) was set at 5.00E+05 and the ten most intense peaks from each full scan were fragmented via HCD with normalized collision energy of 35. MS2 spectra were acquired in the Orbitrap mass analyzer with 15,000 resolution. All replicates of each tissue were run sequentially and pre-digested yeast alcohol dehydrogenase standard (nanoLCMS Solutions LLC, Rancho Cordova, CA) was run between tissue groups to monitor drift in analytical performance.

Database Searching and Protein Identification

MS/MS spectra were extracted from raw data files and converted into .mgf files using MS Convert (ProteoWizard, Ver. 3.0). Peptide spectral matching was performed with Mascot (Ver. 2.5) against the Uniprot mouse database (release 201701). Mass tolerances were +/- 15 ppm for parent ions, and +/- 0.5 Da for fragment ions. Trypsin + NH2OH or Trypsin + CNBr enzyme specificity was used, allowing for 1 missed cleavage. Met oxidation, Pro hydroxylation, protein N-terminal acetylation, and peptide N-terminal pyroglutamic acid formation were set as variable modifications (maximum number of 8) with Cys carbamidomethylation set as a fixed modification. Scaffold (version 4.4.6, Proteome Software, Portland, OR, USA) was used to validate MS/MS based peptide and protein identifications. NH2OH cleavage at NG or NX sites was evaluated by merging Scaffold peptide reports for all tissues analyzed and reporting the total number of NG or NX cleavage observations in the entire dataset. Peptide identifications were accepted if they could be established at greater than 95.0% probability. Protein identifications were accepted if they could be established at greater than 99.0% probability and contained at least two identified unique peptides.
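The tally of NG versus NX cleavage observations described above can be sketched as a simple classification of each identified peptide's N-terminal cleavage site. This is a minimal illustration under stated assumptions: the function name, category labels, and toy protein are hypothetical, the input is assumed to be a protein sequence plus the 1-based start position of each peptide, and the tryptic rule is simplified (no proline exception). It does not reproduce the Scaffold report format.

```python
from collections import Counter


def classify_n_terminal_cleavage(protein_seq: str, peptide_start: int) -> str:
    """Classify the bond cleaved to create a peptide's N-terminus.

    peptide_start is the 1-based position of the peptide's first residue.
    """
    if peptide_start <= 1:
        return "protein N-terminus"
    before = protein_seq[peptide_start - 2]  # residue preceding the peptide
    first = protein_seq[peptide_start - 1]   # first residue of the peptide
    if before == "N":
        return "NG" if first == "G" else "NX"
    if before in ("K", "R"):
        return "tryptic"
    return "other"


# Hypothetical toy protein and peptide start positions, for illustration only.
toy_protein = "AKNGSTNAR"
toy_starts = [4, 8, 3, 4]
tally = Counter(classify_n_terminal_cleavage(toy_protein, s) for s in toy_starts)
```

Merging peptide reports across tissues then amounts to summing such counters over all datasets.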

Error Tolerant Searches

Raw LC-MS/MS data was converted to peak lists using MS Convert (ProteoWizard, Ver. 3.0). Peptide spectral matching was performed with Mascot (Ver. 2.5) against the Uniprot mouse database (release 201701). Unbiased error tolerant searches were performed using Byonic (143) in order to identify unknown and unpredicted modifications induced by each chemical digestion method. PSMs were only used for analysis if their expectation value (analogous to p-value) was less than 0.05, and the total number of occurrences of each mass addition/subtraction was plotted using GraphPad Prism 6.0 (GraphPad Software Inc, La Jolla, CA).
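The filtering and counting step described above can be sketched as follows. This is an assumption-laden illustration rather than the actual analysis: the record layout (`expect`, `delta_mass`), the function name, and the two-decimal binning of mass shifts are hypothetical choices and do not correspond to the Byonic output format.

```python
from collections import Counter


def tally_mass_shifts(psms, max_expect=0.05, decimals=2):
    """Count occurrences of each mass shift among PSMs passing the expectation cutoff."""
    counts = Counter()
    for psm in psms:
        # Keep only PSMs whose expectation value is below the threshold.
        if psm["expect"] < max_expect:
            # Bin mass shifts by rounding so near-identical values are pooled.
            counts[round(psm["delta_mass"], decimals)] += 1
    return counts


# Hypothetical PSM records, for illustration only.
toy_psms = [
    {"expect": 0.01, "delta_mass": 15.9949},
    {"expect": 0.20, "delta_mass": 15.9949},   # fails the 0.05 cutoff
    {"expect": 0.001, "delta_mass": 15.9948},
    {"expect": 0.04, "delta_mass": -17.0265},
]
shift_counts = tally_mass_shifts(toy_psms)
```

The resulting counts per mass shift are the kind of values that would then be plotted as a histogram for each digestion method.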

Results

CNBr Versus NH2OH Digestion of Insoluble ECM Components

Using previously reported conditions for NH2OH (139) and our existing method for CNBr digestion (127), we evaluated the digestion of five tissues using several criteria, including 1) the number of unique proteins identified by data-dependent, global LC-MS/MS analysis, 2) quantitative yield using LC-SRM with heavy standards, and 3) potential side product generation (i.e. modifications) using error-tolerant database searches. In order to assess these criteria, we utilized an experimental workflow based on a modified version of our previously published protocol to characterize five murine organs: lung, liver, muscle, skin and bone (Figure 2.2A). These tissues were chosen as they represent organs or organ systems that vary greatly in the abundance of their ECM components relative to total protein content. For example, bone was chosen as a non-compliant and high ECM abundance tissue. Along these same lines, skin, muscle and lung were chosen as compliant tissues with high to medium ECM abundance, while liver is representative of tissues with low ECM abundance. Proteomic analyses of ECM components were compared across the two chemical digestion methods (CNBr vs. NH2OH) to determine the ability of each method to solubilize and, with the aid of trypsin, generate proteolytic peptides for ECM protein quantification.

The tissue extraction/fractionation and digestion workflow used prior to LC-SRM data acquisition is shown as a schematic representation in Figure 2.2A. Our sample preparation protocol yields three distinct fractions: 1) cellular, 2) soluble ECM (sECM), and 3) iECM. Samples destined for CNBr or NH2OH digestion were prepared identically. The locations of CNBr and NH2OH cleavage sites within the primary sequence of collagen alpha-1(I) are shown in Figure 2.2B. CNBr cleaves specifically at methionine residues, with collagen alpha-1(I) having 8 cleavage sites. In contrast, NH2OH is reported to cleave at Asn-Gly sites, with collagen alpha-1(I) having 6 sites. However, we also observed cleavage at other Asn-X sites in our dataset, which accounts for an additional 7 potential cleavage sites on collagen alpha-1(I), although these were not observed in our MS data (Figure 2.2B). Our SIL QconCAT library contains 5 reporter peptides mapping to the main chain of collagen alpha-1(I) and 3 reporter peptides mapping to the main chain of collagen alpha-2(I). Our hypothesis that NH2OH would be a good alternative to CNBr was based on early reports of hydroxylamine-sensitive bonds in collagen alpha-1(I) (144), the most abundant iECM protein in the tissues analyzed.

Femur

The skeletal system functions to provide structural support, maintain calcium homeostasis, and replace old or compromised bone in order to preserve structural integrity under a wide range of loading conditions (145). In both humans and mice, bone forms initially as woven bone (immature fibers) during development or during a wound-healing response to injury (145). Woven bone has a collagen matrix that is progressively remodeled into parallel layers of collagen lamellae in which osteocytes are embedded, called lamellar bone. When bone development and remodeling go awry, serious conditions, such as osteopetrosis, can occur (146). Methods for accurate ECM characterization in bone can generate information that will guide the development of disease models and biologically relevant biomaterials.

Data-dependent acquisition of the iECM fractions of femur prepared using the two methods generated 633 peptides and 97 protein identifications shared across both methods, with 6 and 2 unique protein identifications for CNBr and NH2OH digestions, respectively. Consistent with our previous work, the highest number of identifications in this fraction is for fibrillar collagen. Total ion current (TIC) scatterplots from global LC-MS/MS experiments highlight the high correlation in signal abundance among peptides between methods for the same protein (Figure 2.3A). Fibrillar collagen (e.g. Col1a1, Col1a2, Col3a1) falls outside the 95% confidence band, highlighting the difference in peptide yield between CNBr and NH2OH digestion methods using this semi-quantitative approach.

Targeted LC-SRM analyses used to quantify differences in abundance by functional class showed that fibrillar collagen represents 91% and 94% of total quantifiable structural ECM proteins in NH2OH and CNBr replicates, respectively (Figure 2.3B). Additionally, in both methods fibrillar collagen was found to give rise to >99% of total structural ECM protein composition. Despite slight variations in yield within functional classes, the overall yields of CNBr and NH2OH were very comparable with few notable differences.

Comparison of quantitative LC-SRM results using partial least squares-discriminant analysis (PLS-DA) revealed that CNBr- and NH2OH-digested iECMs cluster apart from one another, highlighting variations in measured protein abundance (Figure 2.3C). Basement membrane and FACIT collagen components were more abundant in NH2OH digested ECM (1.4- and 1.6-fold higher versus CNBr, respectively), while matricellular and structural ECM protein classes were found to be more abundant by CNBr digestion (1.3- and 1.2-fold higher in CNBr versus NH2OH, respectively).
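Fold differences by functional class, such as those reported above, can be computed by summing protein abundances within each class and taking the ratio between methods. The sketch below is illustrative only: the function names are hypothetical, and the abundance values are invented toy numbers, not measurements from this study.

```python
from collections import defaultdict


def class_totals(abundance_fmol_per_mg, protein_class):
    """Sum protein abundances (fmol/mg) within each functional class."""
    totals = defaultdict(float)
    for protein, value in abundance_fmol_per_mg.items():
        totals[protein_class[protein]] += value
    return dict(totals)


def fold_change(numerator_totals, denominator_totals):
    """Per-class ratio of two sets of class totals (classes with a zero denominator are skipped)."""
    return {cls: numerator_totals[cls] / denominator_totals[cls]
            for cls in numerator_totals if denominator_totals.get(cls)}


# Hypothetical class assignments and abundances, for illustration only.
toy_classes = {"Col4a2": "basement membrane", "Lama2": "basement membrane",
               "Col1a1": "structural ECM"}
toy_nh2oh = {"Col4a2": 8.0, "Lama2": 6.0, "Col1a1": 100.0}
toy_cnbr = {"Col4a2": 6.0, "Lama2": 4.0, "Col1a1": 120.0}

nh2oh_vs_cnbr = fold_change(class_totals(toy_nh2oh, toy_classes),
                            class_totals(toy_cnbr, toy_classes))
```

A ratio above 1 indicates a class recovered at higher abundance with NH2OH, and a ratio below 1 indicates higher recovery with CNBr.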

The loadings plots revealed that these differences are largely driven by variations in basement membrane (i.e. Col4a2, Lama2) and cytoskeletal (i.e. Act) component abundance between methods. This is despite these proteins having fewer Asn-Gly motifs than Met residues (for example, actin proteins have one Asn-Gly sequence at position 12 compared to 16 to 17 Met residues, depending on the isoform, spread throughout the primary sequence). This indicates that trypsin access to these proteins is not dependent on release of the protein from crosslinked insoluble protein. Overall, these proteins compose less than 3% of the total protein fraction in the insoluble bone ECM.

Skin

The skin acts as an important environmental barrier that covers all other organ systems in the body and is composed of several distinct layers. The ECM component of the skin is largely found in the dermis, a region of dense irregular collagenous connective tissue that gives the skin its toughness and provides structural protection for underlying skeletal muscles and organs (147). Alterations in maintenance and architecture of the dermal microenvironment during aging have been shown to perpetuate disease progression (148). Deciphering a compositional baseline of the skin will not only shed light on mechanisms of aging and disease, but also help to design better products for the emerging field of tissue engineered biomaterials. Informed design of engineered skin biomaterials that accurately recapitulate tissue biomechanics and composition requires a comprehensive characterization of ECM components.

The two methods generated 554 shared peptides matching to 212 shared protein identifications for the three analytical replicates by global LC-MS/MS analysis. NH2OH-digested iECM resulted in the identification of 79 more proteins than CNBr-digested iECM; however, by comparing total ion currents (TICs) from all spectra used for protein identification for CNBr and NH2OH preparations, we see that the total signal is predominantly linear for fibrillar proteins identified by both methods (Figure 2.4A).

We used the increased sensitivity of our quantitative LC-SRM approach to compare the overall yield and functional class abundance of CNBr and NH2OH iECM fractions. Digestion with NH2OH yielded 1.6 times more total protein than CNBr, which largely accounts for the variation between the two methods. The large degree of variability in CNBr replicates, relative to NH2OH iECM samples, can be observed as a high CV between replicates (43%), based on the median CV for all quantified proteins. In contrast, NH2OH replicates had a median CV of 13% for all quantified proteins. Further, endogenous peptides quantified in NH2OH iECM fractions were more abundant than those of CNBr across every functional class (Figure 2.4B). Structural ECM, composed primarily of fibrillar collagen, is the most abundant class of proteins, followed by matricellular and cytoskeletal components. Along these lines, fibrillar collagen represented 75% of total quantified protein in NH2OH replicates in contrast to 85% in CNBr replicates. This discrepancy is primarily driven by differences in the overall yield of basement membrane and cytoskeletal components such as Col4a2, Act, and Vim. Figure 2.4C also shows differences in intragroup variability between CNBr and NH2OH replicates, with CNBr digestion showing greater variance than NH2OH digestion.
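The median CV used above as a summary of replicate variability can be computed as in the following sketch. The function names and toy values are hypothetical, and the sketch assumes the CV is calculated per protein from the sample standard deviation across replicates before taking the median across proteins; the source does not state which standard deviation estimator was used.

```python
from statistics import mean, median, stdev


def cv_percent(values):
    """Coefficient of variation (%) of replicate measurements, using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)


def median_cv(replicates_by_protein):
    """Median of the per-protein CVs across all quantified proteins."""
    return median(cv_percent(v) for v in replicates_by_protein.values())


# Hypothetical fmol/mg values for three replicates of three proteins.
toy_replicates = {
    "ProteinA": [10.0, 10.0, 10.0],   # CV = 0%
    "ProteinB": [8.0, 10.0, 12.0],    # CV = 20%
    "ProteinC": [6.0, 10.0, 14.0],    # CV = 40%
}
toy_median_cv = median_cv(toy_replicates)
```

Using the median rather than the mean keeps a few highly variable, low-abundance proteins from dominating the summary.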

Lung

Extracellular matrix in the lung is essential to maintain patency of the airway, which, in conjunction with the large epithelial surface area, permits oxygenation and ventilation. The ECM provides structural support to facilitate gas exchange and prevent airway collapse, as well as offering a scaffold capable of imparting mechanical forces onto resident cells (149). These primary mechanical functions are imparted by major ECM components identified in the insoluble ECM fraction. In addition to playing an integral role in normal lung function, the composition of the ECM is thought to directly impact the pathophysiology of diseased lungs by subverting control of both biomechanical and biochemical cues, especially in fibrotic phenotypes such as idiopathic pulmonary fibrosis (150, 151). Beyond maintenance of lung homeostasis, accurate characterization of lung iECM also has important relevance for tissue engineering efforts, as it provides a molecular-level readout to facilitate rational design of engineered organs (127).

Data-dependent acquisition of the iECM fractions of lung prepared using the two methods generated 326 shared peptides and 220 shared protein identifications. Further, our analysis showed that CNBr and NH2OH iECM fractions share similar total TIC for all identified proteins (Figure 2.5A). CNBr and NH2OH showed almost no discernible variation in the number of identified proteins between the two methods: 86 proteins were unique to CNBr and 84 proteins were unique to NH2OH.
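The shared and method-specific identification counts reported above follow from simple set operations on the accepted protein lists. The sketch below is a minimal illustration; the function name is hypothetical and the accessions are invented placeholders rather than identifiers from this dataset.

```python
def compare_identifications(cnbr_ids, nh2oh_ids):
    """Partition protein identifications into shared and method-specific sets."""
    cnbr, nh2oh = set(cnbr_ids), set(nh2oh_ids)
    return {
        "shared": cnbr & nh2oh,        # identified by both methods
        "cnbr_only": cnbr - nh2oh,     # unique to CNBr
        "nh2oh_only": nh2oh - cnbr,    # unique to NH2OH
    }


# Hypothetical placeholder accessions, for illustration only.
toy_cnbr = ["P1", "P2", "P3", "P4"]
toy_nh2oh = ["P2", "P3", "P5"]
overlap = compare_identifications(toy_cnbr, toy_nh2oh)
overlap_counts = {name: len(ids) for name, ids in overlap.items()}
```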

LC-SRM resulted in a median CV for NH2OH replicates of 11%, compared to 28% for CNBr replicates. The average overall yield (i.e. total quantifiable protein) varied slightly between NH2OH and CNBr replicates as well, with NH2OH showing a 1.5-fold increase relative to CNBr. Fibrillar collagen did, however, represent a similar percentage of total quantifiable protein in the insoluble ECM from NH2OH (52%) and CNBr (57%) replicates. Functional protein class analysis supported these overall yield observations but also revealed that some classes are more amenable to a particular method (Figure 2.5B). For example, FACIT collagen (i.e. Col14a1) was found to be more abundant in CNBr replicates compared to NH2OH replicates.

Multivariate analysis of the iECM fractions from either CNBr or NH2OH further highlighted abundance-driven differences between CNBr and NH2OH replicates, with CNBr replicates showing more intragroup variance than NH2OH replicates (Figure 2.5C).
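The sum and range scaling normalizations applied before multivariate analysis (see Materials and Methods) can be sketched as follows. This is an assumption-based illustration, not MetaboAnalyst's implementation: it uses the common definitions in which each sample is divided by its total signal and each feature is then mean-centered and divided by its range. The function names and the toy matrix are hypothetical.

```python
def sum_normalize(rows):
    """Divide each sample (row) by its total so that every row sums to 1."""
    return [[v / sum(row) for v in row] for row in rows]


def range_scale(rows):
    """Mean-center each feature (column) and divide it by its range (max - min)."""
    scaled_columns = []
    for col in zip(*rows):
        center = sum(col) / len(col)
        spread = max(col) - min(col)
        # A constant feature carries no information; map it to zeros.
        scaled_columns.append([(v - center) / spread if spread else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical matrix: 3 samples (rows) x 2 proteins (columns).
toy_matrix = [[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]]
toy_scaled = range_scale(sum_normalize(toy_matrix))
```

Sum normalization removes differences in total yield between samples, so separation in the resulting scores plots reflects differences in relative composition rather than overall recovery.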

Muscle

The ECM in skeletal muscle acts to anchor muscle tissue to tendon and bones as well as to effect skeletal movement such as locomotion and maintaining posture (149). In response to increases in compression stress (i.e. loading), microtubule dynamics induce changes in cell shape; at the same time, durotactic gradients of ECM stiffness direct cell motility and migration of fibroblasts and smooth muscle cells (112, 152). Under pathological conditions, such as Ehlers-Danlos syndrome, the ECM surrounding skeletal muscles and joints can become hyperelastic and osteoarthritic as a result of defects in structure, production, or processing of collagen or proteins that interact with collagen (153, 154).

Using global proteomics data, we determined that CNBr and NH2OH iECM fractions from muscle share a similar ECM composition but vary in terms of exclusive unique spectral counts for a given protein. Despite this discrepancy, TIC scatterplots demonstrate a high correlation in signal abundance between CNBr and NH2OH methods (Figure 2.6A). Furthermore, iECM fractions of muscle prepared using the two methods generated 1201 shared peptides and 213 shared protein identifications. However, the NH2OH method identified an additional 63 unique proteins and 790 total unique peptides, whereas the CNBr method identified 77 unique proteins and 675 total unique peptides.

The median CV for LC-SRM quantified proteins in CNBr replicates was 29%, compared to 25% for NH2OH replicates. Differences between digestion methods can be further realized by grouping proteins together based on their functional class and comparing class abundance between CNBr and NH2OH replicates (Figure 2.6B). Not surprisingly, the most abundant class of proteins in these muscle samples was cytoskeletal proteins (i.e. Act, Tubb, Des, Myh, Vim), which were found to represent an average of 51% of the total protein quantified in the iECM fraction of NH2OH muscle preparations and 48% for CNBr muscle preparations (Figure 2.6B).

Further analysis showed that fibrillar collagen makes up a smaller percentage of total insoluble protein in both CNBr and NH2OH iECM fractions of muscle relative to other tissues (43% of total protein in the NH2OH iECM fraction versus 45% of total protein in the CNBr iECM fraction), further highlighting the similarities in overall yield between digestion methods in this case.
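The functional class percentages discussed above amount to summing protein-level quantities within each class and dividing by the total quantified protein. A minimal sketch follows, assuming hypothetical protein quantities and class assignments that are not taken from the study data.

```python
from collections import defaultdict


def class_percentages(fmol_by_protein, class_by_protein):
    """Sum protein quantities per functional class and express each as % of total."""
    totals = defaultdict(float)
    for protein, amount in fmol_by_protein.items():
        totals[class_by_protein[protein]] += amount
    grand_total = sum(totals.values())
    return {cls: 100.0 * amt / grand_total for cls, amt in totals.items()}


# Hypothetical quantities (fmol) and class assignments, for illustration only.
quantities = {"Col1a1": 300.0, "Col1a2": 150.0, "Des": 400.0, "Vim": 100.0, "Col4a2": 50.0}
classes = {
    "Col1a1": "fibrillar collagen",
    "Col1a2": "fibrillar collagen",
    "Des": "cytoskeletal",
    "Vim": "cytoskeletal",
    "Col4a2": "basement membrane",
}

result = class_percentages(quantities, classes)
print(result)
```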

Although the ECM profiles were similar between CNBr and NH2OH groups, we were able to distinguish them from one another using PLS-DA scores plots (Figure 2.6C). Once again, CNBr replicates were found to have more intragroup variance than NH2OH replicates, a finding further supported by comparing the CVs of NH2OH and CNBr.

Liver

The liver is a densely cellular organ whose ECM is primarily characterized by a collagenous capsule (Glisson’s capsule) that acts as a layer of connective tissue surrounding the organ. Human hepatic portal structures are supported by a moderate amount of collagen and thin reticular fibers that form a loose network supporting the sinusoids (149, 155). The liver differs in its extracellular organization from other organs in that it has no physical basement membrane surrounding hepatocytes. Instead, Glisson’s capsule directly surrounds hepatocytes with an ECM primarily consisting of fibronectin, collagen and some select basement membrane proteins (81).

Data-dependent acquisition of the iECM fractions of liver prepared using the two methods generated 796 peptides and 380 protein identifications shared between methods. Col1a1 was found to be the most abundant iECM protein in the liver in both methods. Despite both methods identifying a similar number of exclusive unique peptides for Col1a1, there was some discrepancy in the number of spectra used to make that identification. Scatterplots from global LC-MS/MS experiments comparing TIC of identified proteins from either CNBr or NH2OH did show differences in total signal abundance between digestion methods for the same protein (Figure 2.7A). Many of the identified proteins fall outside of the 95% confidence band that denotes global similarity between the CNBr and NH2OH digestion methods.

The targeted analysis revealed that NH2OH yielded higher protein levels than the CNBr approach across almost every protein class. As such, overall yield differed between methods, with NH2OH having a 1.5-fold higher total protein quantified compared to CNBr. This is primarily due to differences in the abundance of fibrillar collagen, FACIT collagen and cytoskeletal components (Figure 2.7B). Fibrillar collagen represented 73% and 75% of total quantified structural ECM protein for NH2OH and CNBr preparations, respectively. However, fibrillar collagen was quantified at 1.4-fold higher levels in NH2OH samples. Comparison of quantitative proteomics data using multivariate PLS-DA revealed that CNBr- and NH2OH-digested iECMs cluster apart from one another and differ in their degree of intragroup variance (Figure 2.7C). However, this variance is primarily driven by two proteins near the limit of detection, FBLN1 and LGALS1.

Evaluation of Chemical Digestion Cleavage Specificity

NH2OH and CNBr cleavage was evaluated by performing additional database searches using dual cleavage specificity (i.e. trypsin + NH2OH or trypsin + CNBr). Although NH2OH has previously been reported to be specific for NG sites, we evaluated NX sites as well, where X was allowed to be any other amino acid following N. Although we found that NG represented the combination with the highest number of observed cleavage events, all other amino acid combinations (i.e. NX), with the exception of cysteine, were also observed, with a preference toward small amino acids. However, NG cleavage was observed, on average, 10-fold more often than all other combinations. In contrast, we did not observe cleavage of QG or QX bonds. As a negative control, these search conditions were used on a subset of the sECM fractions that had not been exposed to chemical digestion conditions. Cleavage at the C-termini of asparagine residues in this fraction was observed less than 1% of the time, indicating this phenomenon is specific to NH2OH treatment.

CNBr cleavage was specific to the C-termini of methionine residues as reported. In the case of CNBr database searches, we also included two additional variable modifications, homoserine (Met to Hse) and homoserine lactone (Met to Hsl). The average observed missed cleavage percentage for trypsin across all CNBr and NH2OH digested tissues was found to be 8.5% and 8.3%, respectively. Semitryptic peptides with ragged N-termini represented 9.78% (CNBr) and 6.8% (NH2OH) of total peptides. Semitryptic peptides with ragged C-termini or non-tryptic peptides each represented less than 1% of total peptides.
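The cleavage rules evaluated in this section can be expressed as a simple in silico digestion. The sketch below is illustrative only: the regular expressions, function name, and test sequences are assumptions made here, and it does not reproduce the database search engine settings used in this work.

```python
import re

# Cleavage rules as zero-width cut positions (illustrative encodings, not search-engine syntax).
CNBR = r"M"               # cut C-terminal to methionine
NH2OH_NG = r"N(?=G)"      # cut between Asn and Gly
NH2OH_NX = r"N(?=[^C])"   # relaxed rule: cut after Asn unless followed by Cys


def cleave(sequence, pattern):
    """Split `sequence` after every match of `pattern` and return the fragments."""
    fragments, start = [], 0
    for match in re.finditer(pattern, sequence):
        fragments.append(sequence[start:match.end()])
        start = match.end()
    fragments.append(sequence[start:])
    return [f for f in fragments if f]


# Hypothetical sequence for illustration; not a real collagen peptide.
seq = "GPMGPNGAPMNGK"
print(cleave(seq, CNBR))      # fragments ending in Met
print(cleave(seq, NH2OH_NG))  # fragments ending in Asn
```

Counting `len(cleave(...)) - 1` for a protein sequence gives the number of predicted cleavage sites under each rule.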

Error Tolerant Searches

To investigate whether observed differences were a result of side reactions generated during CNBr or NH2OH digestion, we performed error tolerant searches on untargeted data-dependent acquisition data. Several expected abundant modifications were identified, such as deamidation of Asn, oxidation of methionine and proline, and a +6 Da addition that results from the inclusion of our heavy labeled QconCAT SIL peptides. One of the most common modifications identified (+15.99 Da) is primarily derived from the endogenous hydroxyproline residues on fibrillar collagen and, in the case of NH2OH digestion, from the formation of C-terminal hydroxamates, although these were rarely observed (Figure 2.1B). Two modifications that are specific to the CNBr cleavage mechanism were also identified: (1) formation of a homoserine lactone (133) and (2) low-frequency bromination of tyrosine. For example, the collagen alpha-2(I) peptide -GYPGSIGPTGAAGAPGPHGSVGPAGK- was identified in both non-brominated (Figure 2.8A) and brominated forms (Figure 2.8B), and the unique isotopic profile of the brominated peptide can be seen in Figure 2.8C and 2.8D.
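The distinctive isotopic profile of a brominated peptide follows from the two stable bromine isotopes, which occur at nearly equal natural abundance. The sketch below assumes that bromination replaces one tyrosine ring hydrogen with bromine and uses rounded standard isotope masses; the constant and function names are hypothetical and the calculation is illustrative rather than a reproduction of the search parameters used here.

```python
# Rounded standard isotope masses (u) and natural abundances; illustrative constants.
BR79_MASS, BR79_ABUNDANCE = 78.91834, 0.5069
BR81_MASS, BR81_ABUNDANCE = 80.91629, 0.4931
H_MASS = 1.00783


def bromination_shift(bromine_mass):
    """Mass shift when one hydrogen is replaced by a bromine isotope (assumed mechanism)."""
    return bromine_mass - H_MASS


def doublet_spacing(charge):
    """Expected m/z spacing between the 79Br and 81Br peaks at a given charge state."""
    return (BR81_MASS - BR79_MASS) / charge


shift_79 = bromination_shift(BR79_MASS)
shift_81 = bromination_shift(BR81_MASS)
peak_ratio = BR81_ABUNDANCE / BR79_ABUNDANCE  # close to 1, giving a near-equal doublet

print(round(shift_79, 4), round(shift_81, 4), round(peak_ratio, 3), round(doublet_spacing(2), 4))
```

A pair of peaks of nearly equal height separated by about 2 Da (divided by the charge state) is therefore a useful visual check when reviewing candidate brominated spectra.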

Discussion

Here we have expanded upon existing ECM proteomics methods by developing an alternative method for insoluble protein characterization that avoids some of the drawbacks associated with the cyanogen bromide reagent. We utilized global proteomics to monitor gross variations and a matrisome-targeted method to more accurately quantify ECM protein abundance and stoichiometries from five tissues. All samples presented in the study are technical replicates to provide the most direct comparison between the methods. The combination of our global LC-MS/MS and targeted LC-SRM approaches with ECM-specific stable isotope labeled peptides revealed that an abundance of protein is present and available for analysis from tissues after chaotrope extraction. Not surprisingly, collagen I was the most abundant protein in the iECM fraction.

Our previously described approach to solubilize insoluble ECM using CNBr was effective and capable of outperforming orthogonal methods for ECM composition analysis (128). Despite this success, we sought to develop an improved chemical digestion approach using NH2OH that minimizes sample loss (i.e. no pellet transfer step), avoids the reactive and toxic CNBr compound and strong acid used for digestion, and provides similar or improved quantification precision and reproducibility. We evaluated alternative proteases and chemical digestion methods and determined that hydroxylamine digestion was the top candidate for further development. Advantages of this approach include digestions being carried out in the same vessel as chaotrope extraction (no sample transfer to glass vial), with buffered guanidine, and a relatively modest chemical activity that is unlikely to generate extensive side reactions that can hamper absolute quantification. In the case of CNBr, we suspect that liberated reactive species may negatively affect protein digestion with trypsin or account for the observed decrease in MS instrument performance. Over time, a decrease in the resolution of low mass ions and in transfer efficiency (30-40%) of a calibration standard mix was observed after running samples prepared with CNBr. Generally, solid phase extraction (SPE) using C18 resin reduces, but does not eliminate, this effect.

From global LC-MS/MS analysis we determined that CNBr and NH2OH resulted in a similar number of identified ECM proteins irrespective of tissue type. These findings are supported by our LC-SRM data, which showed similarity between methods in the number of quantifiable peptides; however, the methods varied in terms of overall yield (i.e. total quantified protein). The NH2OH method yielded a higher concentration of ECM components and more reproducible measures between replicates across all tissues except for bone. The specificity of the two chemical digestion approaches was evaluated by monitoring the occurrence of non-target cleavage events. While NG was the most preferred site of cleavage, NH2OH is relatively non-specific under the conditions used here. Although we did not observe NX cleavage on fibrillar collagen itself, it was observed in other proteins present in the iECM fraction. As a result, it is recommended that this be considered when performing database searches on data from hydroxylamine treated samples. In addition, quantification of peptides that contain asparagine residues should be used with caution.

A caveat of this work is that we did not perform deglycosylation, which should increase sequence coverage, especially for proteoglycans. In our experience, the majority of proteoglycans and glycoproteins are extracted in the soluble ECM fraction prior to chemical digestion of the insoluble pellet. Further, we have intentionally avoided peptides with known glycosylation sites in our stable isotope labeled standards used for quantification. Additionally, deglycosylation of the prominent glycosaminoglycans (GAGs) would require numerous enzymatic treatments that would introduce a source of analytical variability for this comparison.

When comparing fibrillar collagen abundance across tissues for NH2OH-treated samples, femur showed the highest abundance of structural ECM and fibrillar collagen relative to lung, liver, skin and muscle (Figure 2.9A). The average fibrillar collagen abundance calculated for skin tissue was second only to the abundance of fibrillar collagen in femur. Interestingly, the ratio of Col1a1/Col1a2 varies based on the peptide followed between the CNBr and NH2OH methods across tissues (2.1 and 1.3, respectively, using peptides GSEGPQGVR for Col1a1 and VGAPGPAGAR for Col1a2).

The results highlight how the abundance of functional classes of proteins varies among the tissues analyzed (Figure 2.9B). Overall, fibrillar collagen as a percentage of total quantified insoluble protein in the iECM fraction trends as expected with tissue compliance. Not surprisingly, femur samples displayed the largest amount of structural ECM in the iECM fraction, the overwhelming majority of which was fibrillar collagen. Skin was second in total ECM abundance, followed by muscle, lung, and liver. Other major differences include the abundance of cytoskeletal components. As one might expect, muscle samples had a higher abundance of cytoskeletal components than all other tissues analyzed. In contrast, basement membrane appears to be most abundant in those organs with lower total ECM content and higher cellularity, such as lung, muscle, and liver. The most abundant basement membrane protein across all tissues was found to be collagen IV, as has been previously reported in skeletal muscle (156).

Not surprisingly, basement membrane components were found to be most abundant in lung tissue, owing to the large membranous surface area that supports the endothelial barrier needed for gas exchange. Some of the largest differences between the two methods appeared to be a result of differential solubilization of basement membrane and cytoskeletal components such as Col4a2, Lamc2, Tubb, and Vim.

Initially, we speculated that this was due to differences in the number of CNBr (methionine specific) and NH2OH (Asn-Gly site specific) cleavage sites. Indeed, differences do exist in the number of cleavage sites, but the findings do not support the idea that more cleavage sites lead to more efficient digestion and a resulting higher measured protein abundance in the iECM fraction of the tissues tested. An alternative explanation is that a smaller peptide from chemical digestion is partially lost through the filter-aided sample preparation (FASP) approach.

For example, one peptide that we utilized to follow Col1a1 abundance (GSEGPQGVR) resides in a CNBr cleavage product of 26.2 kDa, while NH2OH digestion results in a Col1a1 fragment of just 9.2 kDa. Thus, we postulated that our proportion of Col1a1 may be under-represented in the NH2OH samples, resulting in a lower Col1a1/Col1a2 ratio due to the smaller NH2OH fragment being partially lost through the 10 kDa MWCO filter employed. However, the original FASP manuscripts by Wisniewski et al. report efficient retention of small proteins (5-10 kDa) using a 10 kDa MWCO filter (157, 158).

Our results provide an updated ECM enrichment methodology for identification of chaotrope insoluble matrix proteins. Based on these findings, the hydroxylamine digestion approach for chaotrope-insoluble matrisome characterization is preferred to the CNBr-based chemical digestion method. With the exception of bone, the hydroxylamine method yielded more reproducible results as demonstrated by coefficients of variation (CVs) and clustering by multivariate analysis. Eliminating the use of a strong acid, along with the reactive and toxic compound CNBr, reduces personal and environmental hazards. Overall, the semi-quantitative and quantitative results yielded more protein identifications and higher peptide levels for the new approach, suggesting equal or improved ability to solubilize the insoluble matrix. For bone analysis, we found the two approaches to be relatively similar in their overall yield, suggesting either approach would be suitable for iECM characterization; however, it is likely that a standard demineralization step will be required to achieve equivalent or superior results for the mineralized matrix using this new method. For all other tissues tested, hydroxylamine performed the same as or better than cyanogen bromide and is the recommended approach to characterize this challenging fraction of the proteome.

Figure 2.1: Cyanogen Bromide (CNBr) and Hydroxylamine (NH2OH) Chemical Digestion Mechanisms. A) Nucleophilic attack of the methionine thioether to CNBr leads to the formation of homoserine or homoserine lactone with release of the adjacent residue as a new N-terminus. B) Asn-Gly (or Asn-X) sites cyclize to an imide, which is then subjected to nucleophilic attack by hydroxylamine. Hydrolysis releases a C-terminal hydroxamate and N-terminal glycinyl residue.

Figure 2.2: Workflow Diagram for Comparative Analysis of CNBr and NH2OH Using Quantitative QconCAT ECM Proteomics. A) Tissues are sequentially extracted to obtain cellular and soluble ECM (sECM) fractions. The pellet left over after chaotrope extraction is subjected to either CNBr or NH2OH digestion. QconCATs are spiked into CNBr or NH2OH iECM fractions and samples are then enzymatically digested prior to data acquisition using an LC-SRM approach. B) Schematic representation of the mature fibrillar collagen fibers of collagen alpha-1(I). CNBr cleavage sites (8 for Col1a1) are denoted by a double black slash while NH2OH cleavage sites (13 for Col1a1) are denoted by a double red (NG) or double orange (NX) slash. Green hexagonal markings denote known sites of crosslinking in the telopeptide region of fibrillar collagen. Transparent purple boxes mark sequence regions that match to stable isotope labeled (SIL) QconCAT peptides (Uniprot IDs: Col1a1: P11087 and Col1a2: Q01149). ‡ denotes cleavage sites that were detected by LC-MS.

Figure 2.3: Quantitative Comparison of Chemical Digestion Methods for Enrichment of Insoluble Femur Matrix. A) Scatterplots derived from total ion current (TIC) from global proteomics compare proteins identified in both CNBr and NH2OH iECM fractions. Results from 3 replicates are averaged to generate a single data point for a given protein identification. B) Functional class abundance plots from LC-SRM data show quantitative comparison of protein class abundance between CNBr and NH2OH. C) Partial least squares-discriminant analysis (PLS-DA) performed on final fmol/mg values from CNBr and NH2OH replicates. Shaded areas represent the 95% confidence interval for the three technical replicates.

Figure 2.4: Quantitative Comparison of Chemical Digestion Methods for Enrichment of Insoluble Skin Matrix. A) Scatterplots derived from total ion current (TIC) from global proteomics compare proteins identified in both CNBr and NH2OH iECM fractions. Results from 3 replicates are averaged to generate a single data point for a given protein identification. B) Functional class abundance plots from LC-SRM data show quantitative comparison of protein class abundance between CNBr and NH2OH. C) Partial least squares-discriminant analysis (PLS-DA) performed on final fmol/mg values from CNBr and NH2OH replicates. Shaded areas represent the 95% confidence interval for the three technical replicates.

Figure 2.5: Quantitative Comparison of Chemical Digestion Methods for Enrichment of Insoluble Lung Matrix. A) Scatterplots derived from total ion current (TIC) from global proteomics compare proteins identified in both CNBr and NH2OH iECM fractions. Results from 3 replicates are averaged to generate a single data point for a given protein identification. B) Functional class abundance plots from LC-SRM data show quantitative comparison of protein class abundance between CNBr and NH2OH. C) Partial least squares-discriminant analysis (PLS-DA) performed on final fmol/mg values from CNBr and NH2OH replicates. Shaded areas represent the 95% confidence interval for the three technical replicates.

Figure 2.6: Quantitative Comparison of Chemical Digestion Methods for Enrichment of Insoluble Muscle Matrix. A) Scatterplots derived from total ion current (TIC) from global proteomics compare proteins identified in both CNBr and NH2OH iECM fractions. Results from 3 replicates are averaged to generate a single data point for a given protein identification. B) Functional class abundance plots from LC-SRM data show quantitative comparison of protein class abundance between CNBr and NH2OH. C) Partial least squares-discriminant analysis (PLS-DA) performed on final fmol/mg values from CNBr and NH2OH replicates. Shaded areas represent the 95% confidence interval for the three technical replicates.

Figure 2.7: Quantitative Comparison of Chemical Digestion Methods for Enrichment of Insoluble Liver Matrix. A) Scatterplots derived from total ion current (TIC) from global proteomics compare proteins identified in both CNBr and NH2OH iECM fractions. Results from 3 replicates are averaged to generate a single data point for a given protein identification. B) Functional class abundance plots from LC-SRM data show quantitative comparison of protein class abundance between CNBr and NH2OH. C) Partial least squares-discriminant analysis (PLS-DA) performed on final fmol/mg values from CNBr and NH2OH replicates. Shaded areas represent the 95% confidence interval for the three technical replicates.

Figure 2.8: CNBr Selectively Brominates Tyrosine-Containing Collagen Peptides. A) Non-brominated form of collagen alpha-2(I) peptide GYPGSIGPTGAAGAPGPHGSVGPAGK. B) Brominated form of collagen alpha-2(I) peptide GYPGSIGPTGAAGAPGPHGSVGPAGK at position Y2. C) Isotopic distribution of non-brominated collagen alpha-2(I) peptide, theoretical pattern inset. D) Isotopic distribution of brominated collagen alpha-2(I) peptide, theoretical pattern inset.

Figure 2.9: Fibrillar Collagen and ECM Abundance Across Tissues. A) Fibrillar collagen abundance across tissues for NH2OH treated samples as a percentage of total quantified protein in the insoluble ECM fraction (iECM) for each tissue. B) Functional class abundance plots represented as a percentage of total quantified protein in the insoluble ECM fraction with fibrillar collagen removed.

CHAPTER III

EXTRACELLULAR MATRIX REMODELING OF MAMMARY GLAND AND LIVER

TISSUE MICROENVIRONMENT DURING THE REPRODUCTIVE CYCLE1

Introduction

Mechanistic studies of rodent mammary gland development reveal requisite roles for ECM in mammary epithelial cell proliferation, differentiation, and cell-death decisions (159-164). In fact, pioneering investigations identified the functional unit of the epithelium as the cell plus its adjacent ECM (165, 166). These findings shifted studies aimed at understanding epithelial cell function from a cell-intrinsic to a cell-stroma focus (167). Investigation of the relationship between tissue ECM and epithelium has also been applied to the study of breast cancer with tremendous gains (168).

The functions of matrix proteins in breast cancer have been assessed primarily using single, purified ECM proteins or by admixing ECM proteins of interest with commercially available Engelbreth-Holm-Swarm (EHS) matrix that is enriched in laminin-111 (169-172). Such studies identified specific ECM protein-integrin interactions, matrix stiffness, and matrix architecture as critical mediators of tumor cell function (173-176). Single ECM molecules, including collagen I, fibronectin, and tenascin-C, display clear roles in promoting tumor cell proliferation, motility, and invasion (176-180). ECM roles in breast cancer risk are also suggested, as high mammographic breast density, indicative of elevated collagen content in the breast, increases epithelial cell transformation by 4- to 6-fold (181, 182). The relationship between fibrillar collagen I and cancer incidence and progression has been corroborated in rodent models, where high collagen I content in the mammary gland results in a ~3-fold increase in tumor formation as well as increased lung metastasis (183). There is also evidence that distinct ECM proteins at secondary sites of metastasis impact metastatic success (184-192). Prominent work out of David Lyden’s laboratory has shown roles for lung and liver fibronectin in supporting disseminated tumor cell seeding and growth in murine models of colon, mammary, and pancreatic adenocarcinomas (189, 190). Cumulatively, these studies implicate ECM in all stages of cancer progression, from initiation to metastatic outgrowth at secondary sites.

1 The work described in this chapter is included in The International Journal of Biochemistry and Cell Biology (ref 93) and has been published here with permission from the editor.

The reductionist approach of investigating single ECM protein-cell interactions in vitro, or manipulating single proteins in vivo, while revealing, does not replicate the complex ECM milieu of an in vivo tissue environment. One example of the importance of tissue-specific ECM is found in rodent models of postpartum breast cancer. In these models, whole tissue mammary ECM, as opposed to a single protein, has been shown to determine metastatic outcomes (193). Relevance to women is implicated, as postpartum breast cancer patients have a ~3-fold increased risk for metastasis and death (194-196), a poor prognosis attributed, in part, to ECM remodeling during postpartum breast involution (197). Specifically, weaning-induced mammary gland involution is characterized by substantial ECM remodeling, including deposition of radially aligned fibrillar collagen, fibronectin, and tenascin-C (193, 198). Evidence that involution-specific mammary ECM promotes metastasis has been demonstrated in xenograft models. Tumor cells co-injected with mammary ECM isolated from involuting glands grew larger tumors within the mammary fat pad and metastasized at a much higher rate compared to tumor cells co-injected with mammary ECM isolated from nulliparous rats (193). These data highlight the need to better understand how physiologic cues as well as disease states impact ECM composition and abundance, and provide a compelling rationale for developing quantitative methodologies for ECM proteomics.

Robust characterization of tissue-specific ECM complexity and abundance has been largely hindered by technical challenges in the field of proteomics. For unbiased biochemical identification of proteins, mass spectrometry provides a highly sensitive approach. However, improvements to proteomic approaches for the study of tissue-specific ECM have been delayed by the proteolytic- and solubilization-resistant properties of ECM proteins that are often high molecular weight, glycosylated, and covalently crosslinked. While significant advances in ECM protein identification have occurred recently (199-202), proteomics approaches still largely fail to adequately quantify many ECM proteins including the fibrillar collagens, despite high abundance in tissues (203). We have recently established methods for improved tissue solubilization and absolute protein quantification to interrogate tissue-specific ECM. This approach permits quantitative assessment of a subset of ECM proteins that represent >99% of spectra matching to core ECM and ECM-affiliated proteins identified in mammary gland and liver by global proteomics (203-205). To gain insight into primary breast cancer and its site-specific metastasis, we utilize our quantitative proteomics approach to compare the rat mammary gland to the liver, which is a common and lethal site of breast cancer metastasis. We also investigate ECM composition and abundance changes in the mammary gland across a reproductive cycle. Our objective was to gain insight into potential roles of ECM in the pro-tumorigenic window of weaning-induced mammary gland involution. Finally, we provide rationale for combining quantitative proteomics with multi-color immunofluorescence to provide spatial information about ECM-tumor cell interactions. Such a combined approach is anticipated to facilitate improved in vivo characterization and in vitro reconstruction of epithelial cell microenvironments for use in cancer biology, stem cell biology, and regenerative medicine.

Materials and Methods

Rodent Studies

The OHSU Institutional Animal Care and Use Committees approved all animal procedures. Sprague-Dawley female rats (Harlan), 70 ± 3 days of age, were bred and tissues collected as described (163, 206). Snap frozen, pulverized lymph node-free mammary gland [n=5/group for nulliparous, late pregnancy (day 18-21), lactation, involution days 2, 4, 6, 8, and 10, and four weeks post-weaning (regressed)] and gallbladder-free liver [n=6 for nulliparous] were used for ECM-based QconCAT proteomic analyses. Pooled samples generated from the above-mentioned biologic replicates were utilized for global proteomics. Balb/c female mice (Jackson Laboratories), aged 10 weeks, were used for portal vein injections.

Cell Lines

D2A1-GFP mouse mammary tumor cells were provided by Jeffrey Green (National Cancer Institute, Bethesda, MD), and cultured as described (180).

Portal Vein Injections

Intraportal injections were performed as described (207). Briefly, Balb/c female mice were injected with 1x10^6 D2A1-GFP cells directly into the portal vein and euthanized 90 minutes post-injection; livers were then collected for formalin fixation and paraffin embedding.

Immunofluorescence and Imaging

Fluorescent immunohistochemistry (IHC) was performed using the Opal™ kit (PerkinElmer) according to the manufacturer’s recommendations. Sections were antigen retrieved using TRS, EDTA, or proteinase K (Dako). Primary antibodies were added for 1 h at RT or overnight at 4°C, in the following order: Fibronectin, MDBiosciences MD24941; Collagen I, Abcam ab34710; GFP, Vector Laboratories BA 0702. Secondary antibodies (ThermoFisher Scientific) were added for 30 min at RT. Images for multi-color immunofluorescence were acquired on a Zeiss ApoTome2 with a 20x/0.8 PlanApo objective and Zeiss AxioCam 506 CCD camera, using Zen 2™ software. Tissue H&E and trichrome stains were scanned on an Aperio ScanScope AT; image analysis was performed using Aperio ImageScope software (Leica Biosystems).

Sample Preparation for Proteomic Analysis

Approximately 5 and 50 mg of fresh frozen mammary gland and liver, respectively, were pulverized in liquid nitrogen and processed as described (203). Briefly, tissue samples were homogenized in CHAPS (3-((3-cholamidopropyl) dimethylammonio)-1-propanesulfonate) buffer with 2 mm glass beads using mechanical agitation (Bullet Blender®, Next Advance) on power 8 for 3 minutes. Following homogenization, tissue samples were sequentially extracted using high-speed centrifugation after vortexing in high salt CHAPS buffer, 6 M urea, and CNBr buffers, resulting in 3 fractions for each sample: (1) cellular fraction, (2) soluble ECM, and (3) insoluble ECM (Figure 1). All fractions were run by liquid chromatography-tandem mass spectrometry (LC-MS/MS) and liquid chromatography-selected reaction monitoring (LC-SRM). LC-SRM analysis was done on n=5 mammary gland/group and n=6 liver samples, with n=7 technical replicates. LC-MS/MS analysis was done on pooled biological replicates for an n=1.

Detergent/Chaotrope Removal & Protein Digestion

Sample cleanup and protein digestion were carried out as described (208). QconCAT standards were spiked into each sample prior to filter assisted sample prep (FASP) to yield values of 100 fmol 13C6 QconCAT/5 µg of protein for LC-SRM injections. Equal volumes of biological replicates were combined for LC-MS/MS analysis.

Liquid Chromatography Tandem Mass Spectrometry & Data Analysis

Samples were analyzed by LC-SRM and LC-MS/MS as described (208). Equal volumes from each post-digestion sample were combined and injected every third run and used to monitor technical reproducibility. Skyline was used for method development and to extract the ratio of endogenous light peptides to heavy internal standards from LC-SRM data for protein quantification as described (209). LC-MS/MS data were processed as previously described (203). Limits of detection, quantification and dynamic range were determined for each peptide as previously described (203).
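The light-to-heavy ratio quantification described above reduces to a simple proportion: the endogenous amount equals the measured light/heavy peak area ratio multiplied by the known amount of spiked heavy standard. The following is a minimal sketch under that assumption; the function names and peak areas are hypothetical, the defaults reflect the 100 fmol per 5 µg spike level stated in the text, and this is not the Skyline workflow itself.

```python
def endogenous_fmol(light_area, heavy_area, heavy_spike_fmol=100.0):
    """Endogenous peptide amount from the light/heavy peak area ratio.

    Assumes the heavy standard and endogenous peptide respond identically.
    """
    if heavy_area <= 0:
        raise ValueError("heavy standard peak area must be positive")
    return light_area / heavy_area * heavy_spike_fmol


def fmol_per_ug(light_area, heavy_area, heavy_spike_fmol=100.0, protein_ug=5.0):
    """Normalize the endogenous amount to the protein mass injected."""
    return endogenous_fmol(light_area, heavy_area, heavy_spike_fmol) / protein_ug


# Hypothetical integrated peak areas (arbitrary units), for illustration only.
amount = endogenous_fmol(light_area=2.0e6, heavy_area=1.0e6)
normalized = fmol_per_ug(light_area=2.0e6, heavy_area=1.0e6)
print(amount, normalized)
```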

Statistics

Statistical analysis was performed using GraphPad Prism 6. Comparison of two groups was done by two-sided Student’s t-test. Comparison of more than two groups was done using one-way ANOVA.
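For readers less familiar with these tests, the test statistics can be sketched in a few lines. This is an illustrative calculation only, assuming equal variances for the t-test; it computes the statistics but not p-values, is not the GraphPad Prism implementation, and uses made-up input values.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student's t statistic using a pooled (equal) variance estimate."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))


def one_way_anova_f(*groups):
    """F statistic for a one-way ANOVA: between-group over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical measurements for three groups (illustrative only).
g1, g2, g3 = [1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]
t_stat = student_t(g1, g3)
f_stat = one_way_anova_f(g1, g2, g3)
print(round(t_stat, 4), round(f_stat, 4))
```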

Results

Development of Quantitative ECM Proteomic Methodology

To better understand the complexity of epithelial cell-ECM microenvironments in vivo, we have developed novel extraction and digestion methods for proteomic characterization of ECM. While these methodological advancements have furthered our understanding of ECM composition, they lacked the ability to accurately quantify ECM protein abundance in the microenvironment. To overcome this barrier, we designed six recombinantly generated Quantitative conCATamers (QconCAT) (203, 210) made of 201 stable isotope labeled (SIL) peptides representing 98 ECM, ECM-associated, and common cellular proteins. Peptides specific to intracellular proteins from different subcellular locations were included to serve as a quality control measure for method development of tissue extraction methods and as a relative measure of cellularity. The reporter peptides are ‘spiked into’ experimental protein lysates at equimolar concentrations to serve as internal quantitation controls. Figure 3.1A shows a schematic representation of the tissue extraction/fractionation and digestion workflows used prior to LC-SRM data acquisition. Our fractionation protocol yields three distinct fractions: cellular, soluble ECM (sECM), and insoluble ECM (iECM). The iECM fraction is then solubilized with cyanogen bromide (CNBr).

Reporter ECM peptides are added to all fractions, and samples are proteolytically digested and analyzed by LC-SRM. The consolidated results yield the quantity of a given protein within a tissue. This targeted mass spectrometry method allows us to measure all 201 SIL QconCAT peptides and their endogenous analogs in a single 30-minute analytical run. Molecular heterogeneity can differentially affect signal intensity during LC-SRM data acquisition, which is why the inclusion of internal standardized controls for each unique protein of interest is essential for determining accurate absolute concentrations. We find the QconCAT generated heavy peptides allow for this level of precise quantification, as the reporter peptides behave identically to the endogenous peptides in terms of mass spectrometry fragmentation and ionization, chromatographic separation, and enzymatic digestion efficiency (Figure 3.1B). To confirm increased detection of rat mammary and liver ECM proteins with this method, the protein identifications within the 3 fractions (cellular, sECM, and iECM) were grouped into functional classifications of cytoskeletal, other cellular, matricellular, and several ECM categories (211). The vast majority of cellular proteins fractionate with CHAPS detergent into the cellular fraction, whereas sECM and iECM fractions were highly enriched for ECM proteins (Figure 3.1C). In the rat liver we again found that the majority of cellular proteins resolved with CHAPS. The liver sECM fraction was enriched for matricellular proteins and the iECM fraction further enriched for fibrillar collagens (Figure 3.1C). Strikingly, 52% of mammary and 83% of liver collagen I was detected in the iECM fraction after CNBr treatment (Figure 3.1D), a fraction not routinely incorporated into traditional proteomic methods. Additionally, other residual ECM proteins failed to completely solubilize with urea (sECM fraction), highlighting the importance of CNBr solubilization and analysis of the iECM fraction (Figure 3.1C). The utility of our targeted reporter peptide approach compared to a global proteomics approach is further realized by analyzing the ratio of collagen alpha-1(I) to collagen alpha-2(I) by LC-SRM. The theoretical stoichiometry between these two chains should be 2:1 (COL1A1/COL1A2) based on the assembly of fibrillar collagen triple helices containing two alpha-1 chains and one alpha-2 chain. We find that the targeted approach with QconCATs more accurately reflects the theoretical ratio of 2:1 than a traditional global proteomics approach (Figure 3.1E).
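The quantification logic described above can be sketched as follows: the light/heavy peak-area ratio is scaled by the known amount of spiked heavy peptide, the per-fraction amounts are summed, and the chain stoichiometry is read from the totals. This is a simplified illustration, not the Skyline workflow itself. The function names and every peak area are hypothetical; only the 100 fmol spike amount is taken from the Methods, and conversion to nmol per gram of tissue is not shown.

```python
# Sketch of light/heavy ratio quantification; all peak areas are hypothetical.
SPIKE_FMOL = 100.0  # heavy QconCAT peptide per injection, as stated in the Methods

def endogenous_fmol(light_area, heavy_area, spike_fmol=SPIKE_FMOL):
    """Scale the light/heavy peak-area ratio by the known heavy spike amount."""
    if heavy_area <= 0:
        raise ValueError("heavy standard peak area must be positive")
    return light_area / heavy_area * spike_fmol

def quantify_fractions(areas):
    """Map each fraction's (light_area, heavy_area) pair to endogenous fmol."""
    return {fraction: endogenous_fmol(light, heavy)
            for fraction, (light, heavy) in areas.items()}

def percent_in_fraction(amounts, fraction):
    """Share of the summed amount that was recovered in one fraction."""
    return 100.0 * amounts[fraction] / sum(amounts.values())

# Hypothetical integrated peak areas: (endogenous light, QconCAT heavy).
col1a1_areas = {"cellular": (2.0e5, 1.0e6), "sECM": (1.8e6, 1.0e6), "iECM": (2.0e6, 1.0e6)}
col1a2_areas = {"cellular": (1.0e5, 1.0e6), "sECM": (0.9e6, 1.0e6), "iECM": (1.0e6, 1.0e6)}

col1a1 = quantify_fractions(col1a1_areas)
col1a2 = quantify_fractions(col1a2_areas)

# Chain stoichiometry from the totals summed over all three fractions.
chain_ratio = sum(col1a1.values()) / sum(col1a2.values())
iecm_share = percent_in_fraction(col1a1, "iECM")
print(f"COL1A1/COL1A2 = {chain_ratio:.2f}; COL1A1 in iECM = {iecm_share:.0f}%")
```

Because each endogenous peptide is referenced to its own heavy analog, peptide-specific differences in ionization cancel in the ratio, which is the property the 2:1 stoichiometry check relies on.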

QconCAT Based Proteomics Reveals Unique and Shared Mammary Gland and Liver ECM Profiles

A major rate-limiting step of metastatic success has been attributed to discordance between the ECM requirements of the seeding tumor cell and the ECM microenvironment at the secondary site (184, 186, 187, 189, 212). In the context of breast cancer, we utilized our quantitative ECM proteomics method to elucidate tissue-specific differences and similarities between the primary site and the liver metastatic site. We focused on the liver as one of three common sites of breast cancer metastasis (213-215) and the one that confers the worst prognosis (216-218). The proteomic results from the nulliparous female rat mammary and liver tissues were grouped into 10 functional classifications of proteins including: basement membrane, ECM regulator, fibril-associated collagens with interrupted triple helices (FACIT) collagen, fibrillar collagen, matricellular, other ECM, secreted ECM, and structural ECM.

Cellular proteins were classified into cytoskeletal and ‘other’ (211). This analysis demonstrated an abundance of fibrillar collagens and matricellular proteins in mammary tissue, and high levels of cytoskeletal and cellular proteins in liver (Figure 3.2A), data consistent with the relative ratios of epithelium to stroma within these organs (Figure 3.2B). Further, these tissues differed markedly in overall ECM abundance, with ~100 nmol of ECM in the mammary gland compared to ~8.5 nmol in the liver, per gram of tissue (Figure 3.2C). To further interrogate ECM complexity and abundance of ECM proteins between mammary gland and liver, we removed cellular proteins from our assessment. This ECM-biased analysis revealed that nulliparous mammary gland ECM is >80% fibrillar collagen, ~9% matricellular proteins, 1.3% basement membrane proteins, and ~5% structural ECM, FACIT collagens, other ECM proteins, ECM regulators and secreted ECM proteins combined (Figure 3.2D). In contrast, in the liver, matricellular proteins dominate at 44%, followed by 26.4% fibrillar collagen, ~10% basement membrane, and 15.3% structural ECM, FACIT collagens, ECM regulators and secreted ECM proteins combined (Figure 3.2D).

Although the absolute concentration of fibrillar collagen is vastly different between mammary gland and liver, fibrillar collagen I remains the most abundant single ECM protein in both tissues (Figure 3.2E), providing further support for an essential role of collagen I in tissue structure and homeostasis (219). Our observed molar concentrations of fibrillar collagens in mammary gland and liver (Figure 3.2F, upper left panel) correlate with relative fibrillar collagen abundance detected by trichrome stain (Figure 3.2F, upper right panel and representative images), reinforcing potential biologic relevance of the QconCAT method.
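The ECM-biased composition described above amounts to dropping the cellular classes and renormalizing the remaining molar abundances to percentages. A minimal sketch follows, assuming per-class abundances in nmol per gram of tissue. The class labels follow the text, but the function name and all numeric values are hypothetical placeholders rather than measured data.

```python
# Sketch of the ECM-biased composition calculation; abundances are hypothetical.
CELLULAR_CLASSES = {"cytoskeletal", "other cellular"}

def ecm_composition(nmol_per_gram):
    """Percent of total ECM per functional class, cellular classes excluded.

    Classes are returned from most to least abundant.
    """
    ecm = {name: amount for name, amount in nmol_per_gram.items()
           if name not in CELLULAR_CLASSES}
    total = sum(ecm.values())
    if total <= 0:
        raise ValueError("no ECM signal to normalize")
    ranked = sorted(ecm.items(), key=lambda item: item[1], reverse=True)
    return {name: 100.0 * amount / total for name, amount in ranked}

# Hypothetical per-class abundances (nmol per gram of tissue).
example_tissue = {
    "fibrillar collagen": 82.0,
    "matricellular": 9.0,
    "structural ECM": 5.0,
    "basement membrane": 4.0,
    "cytoskeletal": 60.0,
    "other cellular": 140.0,
}

for name, percent in ecm_composition(example_tissue).items():
    print(f"{name}: {percent:.1f}%")
```

Normalizing after excluding cellular proteins is what allows a stroma-rich tissue and a cell-rich tissue to be compared on ECM makeup alone, independent of their very different total ECM content.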

To investigate ECM complexity beyond fibrillar collagen I, we stratified QconCAT data based on the next twenty most abundant ECM proteins (Figure 3.2G). Despite the concentration of ECM in the mammary gland dropping significantly with removal of collagen I, ECM concentration of the remaining 20 proteins was still ~4-fold higher in the mammary gland compared to liver (Figure 3.2E & G). While the mammary gland and liver had similar diversity of ECM types (Figure 3.2G), unique ECM profiles between the tissues were evident, with lumican, collagen VI, and collagen XIV being dominant in the mammary gland, and collagen VI and fibronectin dominant in liver (Figure 3.2G). In summary, these analyses demonstrate the ability of the QconCAT method to provide absolute molar concentrations of specific ECM proteins within the mammary gland and liver and highlight tissue-specific ECM complexity. Further, the use of quantitative ECM proteomics identifies candidates, such as collagen VI and fibronectin, to investigate for possible roles in site-specific liver metastasis. The liver is believed to be a favored site of metastasis because of its rich, dual blood supply (the liver receives blood via the hepatic artery and portal vein) (220).


Mammary Gland ECM Proteomics Across the Reproductive Cycle

The microenvironment of the mammary gland can be neutral, tumor-promotional, or tumor-suppressive, dependent upon reproductive state (180, 221).

Specifically, in rodent models of breast cancer, mammary tumor cells grow most robustly in the weaning-induced involuting microenvironment, moderately in the nulliparous mammary microenvironment, and least when transplanted into parous mice, whose mammary glands have completed weaning-induced involution (180, 221). While this dynamic fluctuation in tumor-supportive function is thought to be driven in large part by changes to mammary ECM with reproductive state and history (180, 193, 198), mammary ECM has never been assessed across the reproductive cycle using quantitative proteomics. To this end, we analyzed rat mammary ECM in whole gland lysates from nulliparous, pregnant, lactating, involuting (2, 4, 6, 8 and 10 days post-weaning), and fully involuted (regressed) stages. Principal component analysis (PCA) of the LC-SRM data revealed a cycle of mammary gland ECM remodeling across pregnancy, lactation and involution, after which the gland ultimately returns to an ECM microenvironment similar to, but distinct from, the nulliparous state (Figure 3.3). We observed a >2-fold drop in ECM abundance when comparing nulliparous to pregnancy, lactation, and involution day 2 stages (Figure 3.4A), data consistent with the increased cellularity as well as loss of collagen staining at these reproductive stages (163). Total ECM abundance increased to pre-pregnant levels by involution day 6, consistent with epithelial cell loss upon weaning (222). Somewhat surprisingly, we observed increased abundance of ECM in the fully regressed mammary gland compared to the nulliparous host, data suggestive of unique mammary microenvironments in nulliparous and parous hosts (Figure 3.4A). We next compared the top twenty most abundant ECM proteins in the mammary glands from nulliparous, involution days 2 and 6, and regressed stages. We confirmed that collagen I is the predominant ECM protein in the gland (201, 223), and extended these analyses to demonstrate that collagen abundance is dramatically reduced during pregnancy, and is not detected at high levels again until 6 days post-weaning (Figure 3.4B). To investigate ECM complexity further, we removed collagen I from the analysis and found a high abundance of lumican, collagen VI, and collagen XIV in the nulliparous and regressed rat mammary gland (Figure 3.4C & D). In contrast, involution days 2 and 6 exhibited a prominent abundance of collagen VI, thrombospondin 1, and galectin-3 (Figure 3.4C & D). Two additional ECM proteins not found in the top 20, tenascin-C and collagen XII, increased in abundance during mammary gland involution (Figure 3.4D). We also saw significant increases in the abundance of fibronectin during involution. Importantly, collagen I-containing fibrils do not form in the absence of fibronectin, which highlights the importance of fibronectin during extracellular matrix remodeling (224). Along these same lines, it is not surprising that we see fibronectin abundance spike early in involution (i.e., InvD2). In the context of breast cancer, fibronectin fragments have been shown to induce MMP activity in mouse mammary epithelial cells, thereby providing evidence for the specific role of fibronectin in mammary tissue remodeling (225). Despite this knowledge, we still know very little about the specific splice variants of fibronectin implicated in breast cancer. To address this, we are currently designing a QconCAT aimed at resolving the many splice variants of fibronectin, some of which have been implicated in breast cancer (73). These data highlight how quantitative ECM proteomics can provide prime candidates for the investigation of the roles of ECM in breast cancer progression.

Discussion

This work describes advances made in sample preparation techniques and quantitative proteomics for the study of tissue ECM composition and abundance.

Using this experimental pipeline, we characterized tissue-specific ECM composition of the rodent mammary gland and liver, a lethal site of breast cancer metastasis, to a level not previously accomplished. These analyses identified putative tissue-specific ECM components, including lumican and collagen XIV, which were dominant in the mammary gland, and fibronectin, which was dominant in liver. We also found shared ECM components between these two tissues, including abundant proteins such as collagen types I and VI, as well as less abundant collagen types IV and V. Further, we show that abundance of mammary gland ECM is altered across the reproductive cycle, building upon previous studies that have identified major shifts in ECM with pregnancy, lactation, involution, and regression (163, 198, 206, 226). In particular, we see elevated abundance of known pro-tumorigenic ECM proteins collagen VI, thrombospondin 1, galectin-3, and tenascin-C during weaning-induced mammary gland involution. Further, to the best of our knowledge, we identify collagen XII for the first time as elevated during post-weaning mammary involution. Importantly, potential roles for each of these highlighted ECM proteins in breast cancer metastasis have been elucidated, with the exception of collagen XII (227-230). While collagen XII has been shown to be upregulated in malignant breast cancer cell lines, and has been identified as a prognostic marker in other cancers, its role in breast cancer progression has yet to be established (231, 232). In sum, our data demonstrate tissue-specific ECM complexity as well as ECM protein stoichiometry between the mammary gland and liver, and across a reproductive cycle within the mammary gland; these data are consistent with ECM contributing to differential tissue function. In addition to characterizing tissue-specific ECM composition and protein abundance, this quantitative ECM proteomics pipeline has identified potential new biomarkers of postpartum breast cancer progression and may provide insight into site-specific metastasis.

The ECM-based QconCAT proteomics methodology allows for improved solubilization and detection of insoluble ECM proteins such as collagen I using a CNBr extraction step. The use of in-house generated QconCATs for quantification of tissue ECM proteins has several advantages over traditional relative quantitative approaches, including 1) SIL peptide mimics control for matrix effects during proteomic acquisition, allowing for direct comparison between heterogeneous tissues, 2) inclusion of full-length QconCATs during digestion controls for sample loss or digestion variability, and 3) absolute quantitative values allow for inter-protein and inter-experiment comparisons between samples. This quantitative ECM proteomics approach facilitates in-depth characterization of tissue-specific ECM abundance and composition at a level not previously attained. However, as with any first-generation experimental pipeline, there are limitations to the methodology that need to be addressed in future work. First, targeted proteomics is inherently more specific and therefore will only quantitate peptides/proteins included in our QconCAT library.


However, when comparing QconCAT coverage to samples simultaneously run using global proteomics, only an additional 7 and 8 ECM proteins (primarily Annexins, accounting for 0.43% and 0.11% of total spectral matches) that were not covered by the QconCAT library were identified in mammary gland and liver, respectively. In contrast, QconCAT proteomics quantified 34 and 41 proteins or protein isoforms in the mammary gland and liver, respectively, not identified by global proteomics. These comparisons highlight the benefits of increased sensitivity when applying a targeted proteomics approach, as this level of detection often requires deep fractionation and multiple runs to achieve similar depth using global proteomics. An additional caveat to QconCAT proteomics is that quantification of endogenous peptides with post-translational modifications (PTMs) is not currently possible. We circumvent this problem by designing QconCAT peptides specific to proteins that either have no known PTMs or do not contain a common PTM motif so that the quantified endogenous peptide has a higher probability of representing the molar equivalent of the protein it represents. Furthermore, we attempt to include multiple peptides per protein of interest to account for splice variants, known PTMs, and matricryptic sites; however, a subset of proteins is currently covered by only one peptide. Future generations of this QconCAT library will increase confidence in protein quantification by expanding coverage of ECM, ECM-modifying, and ECM-associated proteins as well as adding peptides for all ECM protein targets.

Importantly, the increased depth of ECM coverage that will be gained by design of additional QconCATs will ultimately facilitate more refined characterization of ECM abundance and composition in tissues.


Proteomics has been widely adopted in the past decade due to its ability to characterize a large number of proteins from biological samples in a single analytical run. Multiple studies have characterized both the mammary gland and liver in a variety of normal and tumorigenic contexts (199, 200, 204, 205, 233-236) (liver-specific datasets outlining collagen identifications are summarized in. However, because these semi-quantitative approaches provide relative, and not absolute, abundance of proteins, it is difficult to compare data across studies. For example, comparison of collagen coverage in the liver across our platform and five published datasets revealed marked variability in both the identification and the quantification of collagen. While the majority of datasets identified the most abundant collagens, collagens I & VI, they varied dramatically in estimated abundance and in the identification of less abundant collagens. Significant differences in estimated abundance of collagen I between studies are likely derived from variability in enrichment strategies and the non-uniform analysis of insoluble collagens in the iECM pellet, a protein fraction not routinely captured in standard proteomics pipelines (203). The quantitative advantage of QconCATs became apparent with comparison of collagen alpha-1(I) to collagen alpha-2(I) ratios, as our targeted approach was able to come closest to recapitulating the theoretical 2:1 ratio.

Additionally, collagen alpha-1/2/3(VI) organizes into a 1:1:1 heterotrimer (82), and again, our study was the only one to reveal such a distribution. These findings further suggest that solubilization with CNBr, along with absolute quantification, may provide additional biological relevance to proteomics pipelines focused on ECM proteins. This level of variability across proteomics datasets also highlights the need for standardization, which quantitative proteomics can provide, in order to fully interpret proteomics data across studies.

To the best of our knowledge, our ECM-based QconCAT proteomics pipeline has provided the most quantitative assessment of ECM proteins and tissue composition in mammary gland and liver to date. In the future, this method can be applied to more fully interrogate breast cancer progression. For example, a comparison of young women’s breast tumors and paired metastases, similar to work done by Naba et al. in colorectal cancer (199), is predicted to reveal widespread ECM differences and further inform our understanding of this aggressive form of breast cancer. Further, studies to understand liver, as well as lung, bone, and brain ECM throughout the reproductive cycle, and patient cohort studies to understand site-specific metastasis of postpartum breast cancer patients are warranted.

In addition, we hope to apply our quantitative approach to additional tissues/organs to better understand broad regenerative capacity and the role of ECM components in disease. For example, the impact of the ECM-biased proteomic pipeline could be further realized in the context of regenerative medicine, where understanding the composition of ECM components and the relative stoichiometry within specific organs would be critical steps in accurately recapitulating endogenous matrices. Finally, the expansion of our concatamer library for more in-depth protein coverage should be applied to these proposed studies. Ultimately, compilation of similar datasets for additional organs would lay the foundation for an ECM Atlas that would be capable of comparing absolute quantitative measurements between all organs.

Note About Use of CNBr

In Chapter II, I conclude that hydroxylamine (NH2OH) is superior to CNBr for digestion of the insoluble ECM for several reasons laid out in detail in that chapter. However, in this study we used CNBr to digest the insoluble ECM pellet because the two methods had not yet been directly compared when this study was published. In future studies, we plan to use NH2OH as the preferred method of chemical digestion to solubilize insoluble mammary gland ECM components.


Figure 3.1: Quantitative QconCAT ECM Proteomics Pipeline. A) Experimental pipeline for quantitative ECM proteomics. Tissues are sequentially extracted to obtain cellular, soluble ECM (sECM), and insoluble ECM (iECM) fractions. QconCATs are spiked into fractions and samples are then proteolytically digested to determine absolute concentration of proteins by mass spectrometry proteomics. B) Table of a subset of the 98 ECM/ECM-associated proteins represented in the Quantitative conCATamers (QconCAT) used to determine absolute concentration of proteins by mass spectrometry proteomics; the first three amino acids of the peptide represented are identified in italics (top). Representative chromatographic elution profile of equal molar concentration of conCATamer peptides detected by LC-SRM mass spectrometry demonstrates peptide-specific spectral profiles. For labeled peaks, darker shading indicates 12C6 peptide (endogenous) and lighter shading indicates 13C6 peptide (QconCAT), which is spiked in at known, equimolar concentrations. Integrated peak areas are used for ratiometric determination of endogenous peptide levels, a surrogate for protein concentration (bottom). C) Percent of protein solubilization based on functional groups from cellular, sECM, and iECM fractions of rat mammary gland (left) and liver (right). D) Percent of fibrillar collagen solubilization from cellular, sECM, and iECM fractions of rat mammary gland (left) and liver (right). E) Ratio of collagen alpha-1(I) to collagen alpha-2(I) for peptide spectral matches vs. QconCAT based quantification in rat mammary gland.


Figure 3.2: QconCAT Based ECM Proteomics Reveals Unique and Shared Mammary Gland and Liver ECM Profiles. A) QconCAT based ECM proteomics of nulliparous rat mammary gland and liver tissues displayed as total abundance of proteins (nmol/gram of tissue) by functional classification; n=5 mammary glands, n=6 livers. B) Representative H&E stained rat mammary gland (MG; left) and liver (right) depicting tissue-specific differences in stromal-epithelial cell composition; scale bar=60 µm (arrow=MG epithelium; liver H&E shows epithelium throughout the tissue). C) Nanomolar concentration of total ECM per gram of tissue from QconCAT proteomics in mammary gland and liver. D) Abundance of ECM and ECM-associated functional groups with cytoskeletal and cellular protein groups excluded. E) Twenty most abundant ECM proteins in the rat MG (left) and liver (right) as detected by QconCAT proteomics; tabular results highlighted in Supplementary Tables 2 & 3. F) Nanomolar concentration of fibrillar collagen in MG and liver from QconCAT proteomic analysis (top left) and trichrome staining quantification in MG and liver (top right). Representative trichrome stained images (blue stain) of rat MG (bottom left) and liver (bottom right); scale bar=250 µm, insert scale bar=60 µm. *=p-value<0.0001, Student’s t-test. G) Twenty most abundant ECM proteins, excluding collagen I, in the rat MG (left) and liver (right) as detected by QconCAT proteomics.


Figure 3.3: Principal Component Analysis Reveals Dynamic and Cyclical Mammary Gland ECM Remodeling Across the Reproductive Cycle. Principal component analysis of quantitative ECM proteomics performed on rat mammary glands across the reproductive cycle (Nullip = nulliparous; Preg = pregnancy days 18-21; Lac = lactation day 10; InvD2-InvD10 = involution days 2, 4, 6, 8, and 10; Reg = regressed, 4 weeks post-weaning); n = 5 rats/grp. Data show that ECM composition in the mammary gland changes in phase with the reproductive cycle in a stepwise, cyclical fashion.


Figure 3.4: Quantitative ECM Proteomics Unravels The Unique Composition and Abundance of ECM Proteins Across the Reproductive Cycle. A) QconCAT based ECM proteomics of rat MG tissues across the reproductive cycle with cellular protein groups removed (Nullip=nulliparous; Preg=pregnancy days 18-21; Lac=lactation day 10; InvD2-InvD10=involution days 2, 4, 6, 8, and 10; Reg=regressed, 4 weeks post-weaning); n=5 mammary glands/grp. B) Twenty most abundant ECM proteins in Nullip, InvD2, InvD6, and Reg stage rat MG. C) Twenty most abundant ECM proteins in Nullip, InvD2, InvD6, and Reg stage rat MG, with collagen I removed from the analysis. D) Biologic replicates of select tumor suppressive (lumican) and tumor-promotional (collagen VI, thrombospondin 1, galectin-3, tenascin-C) ECM proteins, as well as collagen XII with unknown roles in breast cancer, as determined by QconCAT based ECM proteomics of Nullip, InvD2, InvD6, and Reg stage rat MG; *=p-value<0.05, **=p-value<0.01, ***=p-value<0.001, ****=p-value<0.0001, one-way ANOVA.


CHAPTER IV

INCREASED MAMMOGRAPHIC DENSITY IS CORRELATED WITH FIBRILLAR COLLAGEN ABUNDANCE

Introduction

The radiographic appearance of the female breast is heterogeneous among women due to differences in tissue composition (237). These differences stem from variations in the abundance of fat, connective and epithelial tissues within the fibroglandular parenchyma. Variations in the density of breast tissue are referred to as parenchymal patterns and are characterized by x-ray mammograms. Fat is radiographically translucent and appears dark on an x-ray mammogram; however, both epithelium and stroma are radiographically dense and appear light (237). A substantial number of studies have now examined the relationship between density and risk of breast cancer (238-240). Although there is much heterogeneity in the risk estimates reported, studies have shown a significant positive association between dense parenchymal patterns and breast cancer risk (i.e., greater risk with increased densities). It is now generally accepted that women with dense tissue in 75% or more of the breast have a four- to six-fold greater risk of developing breast cancer during their lifetime compared to women with little or no dense tissue (241-243).

Several risk factors influence a woman’s risk of developing breast cancer, including sex, age, breast density, body mass index (BMI) and history of pregnancy. Importantly, breast density is not as strong a risk factor as sex and age; however, it is a stronger risk factor than BMI and parity (244, 245).

Breast densities are graded on a scale from 1 (least dense) to 4 (most dense) and are clinically defined as follows (Figure 4.1A): 1) almost entirely fatty, 2) scattered areas of fibroglandular density, 3) heterogeneously dense, and 4) extremely dense.

Furthermore, extensive mammographic density may make breast cancer more difficult to detect by mammography and thus increases the risk of the development of cancer between mammographic screening tests (i.e., the masking hypothesis) (246). As density will naturally influence the detection of cancer, estimates of the risk that increased density poses to the development of breast cancer may be distorted.

Breast cancer risk is further influenced by age, parity, BMI and menopause (240).

Despite the known association between density and breast cancer, we have yet to comprehensively compare the stromal composition of breasts that vary in density.

The relationship between fibrillar collagen and cancer incidence and progression has been corroborated in rodent models, where high collagen I content in the mammary gland results in a ~3-fold increase in tumor formation as well as increased lung metastasis (183). More recent molecular studies have begun to focus on what constitutes mammographically dense tissue. This initial examination used hematoxylin and eosin (H&E) staining to suggest that the fibrous tissue detected in these samples was collagen. Indeed, even today the chief component of increased mammographic density is often over-generalized as collagen because there have yet to be any definitive studies that characterize and quantify differences in global ECM composition of mammary glands of different densities. Furthermore, we know very little about how collagen is differentially organized in breast tissue of different densities at the molecular level. Some have suggested that increases in density are accompanied by increases in collagen crosslinking, though a direct relationship has not been explored (106, 247).

Through the application of quantitative extracellular matrix proteomics and crosslinking analysis, we set out to answer two major questions about the ECM composition of radiographically dense breast tissue: 1) Is breast density correlated with collagen abundance? 2) Is increased breast density associated with increased collagen crosslinking? Providing insight into the crosslinking status of breast tissues of different densities allows us to infer how collagen is organized in the stroma.

Results

The initial goal in investigating the ECM composition of tissues of different mammographic densities was to quantify fibrillar collagen abundance and to find other extracellular matrix components, if any, that correlate with mammographic density. ECM proteomics offers a means to accurately quantify ECM components and to compare abundances between breasts of different densities.

Preparation of tissues for proteomics was done using the hydroxylamine chemical digestion approach outlined in Chapter II. Briefly, human breast tissues were excised from BRCA-negative, pre-menopausal women undergoing a prophylactic mastectomy (mammographic density was previously determined by their clinicians). From each surgically excised breast, Egan sections of 0.5 cm x 0.5 cm x 1 cm dimension were removed. Fresh tissues were either snap frozen in a vial or embedded in OCT before freezing. ~100 µm sections were cut from the frozen OCT-embedded tissues. ~1 g of snap frozen tissue was cut. For proteomic analysis, embedded tissues were washed with ethanol and water to remove polymeric storage solution and lyophilized. Approximately 2 mg of dried tissue was weighed and sequentially extracted to yield the cellular, sECM and iECM fractions (Figure 4.1B). Cellular, sECM and iECM fractions were trypsin digested with 13C QconCAT standards at equimolar amounts and analyzed by LC-SRM.

Multivariate analysis was performed to determine whether breasts of different densities could be distinguished from one another based on their ECM composition. The scores plot from partial least squares discriminant analysis (PLS-DA) shows that while mammographic densities one and four separate from all other groups, mammographic densities two and three cluster together (Figure 4.1C).

Furthermore, there is some overlap between densities 3 and 4, largely driven by the heterogeneity in the MD 2 group. There are two importance measures in PLS-DA: one is variable importance in projection (VIP) and the other is the weighted sum of absolute regression coefficients. Using this analysis, the top 25 variables are plotted based on how much they contribute to the variance seen in the scores plot (Figure 4.1D). The colored boxes on the right indicate the relative concentrations of the corresponding protein in each group under study. Among all variables analyzed, collagens made up 10 of the top 25 proteins. COL1A1 and COL1A2 appear to contribute the most to the variance in ECM composition across breast densities, and these fibrillar collagens were significantly more abundant in mammographic densities 3 and 4 than in mammographic densities 1 and 2 (Figure 4.1D). Small leucine-rich proteoglycans (SLRPs) such as dermatopontin (DPT), biglycan (BGN), decorin (DCN), and lumican (LUM) all rank high among variables, which is likely a natural result of increased fibrillar collagen synthesis and abundance. Lumican is

significantly more abundant in the highest mammographic density compared to the lowest. Interestingly, we have shown in mammary gland reproductive studies (i.e. VPLIR studies, Chapter III) that LUM is a major component of the nulliparous rodent mammary gland that is tumor suppressive, decreasing during involution, a mammary gland reproductive state at higher risk of oncogenic transformation (93).

Among basement membrane proteins such as COL4A1/2, LAMC1, and LMNA, abundance was highest in mammographic densities of 1 and 2 (Figure 4.1C).

Although most proteins other than collagen exhibited modest changes as a function of density, we also examined other functional classes of ECM components to determine how their abundance was related to the observed increase in density. Protein groups were plotted based on their abundance and separated into high and low abundance macro groups (Figure 4.1D & E). Cytoskeletal components increase among low densities from 1 to 2, but then exhibit a gradual decline in the higher densities of 3 and 4. Fibrillar collagen represents the most abundant functional class in the mammary gland and increases significantly between low and high mammographic densities. Interestingly, patients in the MD 3 group had the highest abundance of fibrillar collagen, followed by MD 4, MD 2 and MD 1.

Similarly, structural ECM components (i.e. COL6A1, COL18A1, FN1, MGP, among others) increase significantly in denser breast tissue. Cellularity of a given tissue is inferred in our assay by following representative cellular proteins such as histone H2A and histone H1. Groups MD 2 and MD 3 also had the highest levels of cellular proteins compared to MD 1 and MD 4, and appear to generally decrease in higher mammographic densities (Figure 4.1D).


Patients in the MD 2 group had the highest basement membrane abundance across all densities tested. FACIT collagens (i.e. COL12A1 and COL14A1) were overall found to decrease significantly in breast tissue of high mammographic density (MD 4) compared to low mammographic density (MD 1) (Figure 4.1E).

However, an increase in the abundance of COL12A1 alone was observed in high-density MD 3 compared to low-density MD 1 (Figure 4.1E).

Due to the important role collagen abundance seems to play in defining density, further analysis was performed to investigate this relationship. All collagens were curated from our full ECM proteomics analysis and multivariate analysis was performed to determine if different breast density groups could be distinguished from one another solely based on their collagen content (Figure 4.2A). Indeed, scores plot from PCA shows that all density groups are differentiated from one another using only collagen abundance. From this analysis, collagen variables were ranked in order to determine the importance of each in defining the separation between groups (Figure 4.2B). Fibrillar collagens (COL1A1, COL1A2, COL5A1) and the

FACIT collagen COL12A1 were among the most important collagens contributing to separation between groups. A Pearson correlation analysis was performed, and it was determined that the same collagens that contribute to the intergroup variance also exhibit a positive correlation with mammographic density (Figure 4.2C). Not surprisingly, fibrillar collagens showed the strongest correlation coefficients with density (Figure 4.2D). However, the relationship between mammographic density and FACIT collagen abundance has not been previously observed. Individual collagens are quantified in Figure 4.2E.
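The correlation step described above can be illustrated with a minimal sketch. It assumes that mammographic density is coded as an ordinal category (1 to 4) and that one abundance value per specimen is available; the function name `pearson_r` and the numbers below are hypothetical illustration values, not data from this study, and the software actually used for Figure 4.2C is not stated here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Covariance and variances (unnormalized; the 1/n factors cancel in r)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical illustration values only (not measurements from this work)
density_category = [1, 1, 2, 2, 3, 3, 4, 4]
col1a1_abundance = [10.0, 12.0, 15.0, 14.0, 22.0, 25.0, 21.0, 24.0]

r = pearson_r(density_category, col1a1_abundance)
print(f"Pearson r = {r:.3f}")
```

Repeating this calculation for each collagen and ranking the resulting coefficients would give a ranked list of the kind shown in Figure 4.2D.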


Interestingly, we found that matrix Gla protein (MGP) abundance increased linearly with mammographic density (Figure 4.2F). At least one study has shown that the MGP gene was among the genes up-regulated in cases where the prognosis was poor, indicating that the mRNA levels of MGP are a potential prognostic marker of breast cancer; however, IHC of breast tumor microarrays did not show a correlation between MGP expression and overall survival (248).

Interestingly, and despite this, others have shown that MGP is down-regulated in the invasive breast cancer stromal transcriptome relative to normal breast stroma (249).

The relationship between density and fibrillar collagen abundance was further explored through the application of crosslinked amino acid analysis (xAAA) (Figure 4.3A). Tissue hydrolysates were enriched using cellulose chromatography and analyzed by LC-MS/MS. We did not find a positive association between breast density and collagen crosslinking. In fact, total collagen crosslinks (normalized to total collagen content and starting (dry) weight) exhibited a downward trend with increased density (Figure 4.3B). This was further corroborated by plotting individual divalent and trivalent xAAs (Figure 4.3C & D). AFM was also performed on these patient biopsies to correlate tissue stiffness and collagen crosslinking with mammographic density (not shown). No correlation was found between stiffness (micro scale) and mammographic density (macro scale) by this measure; however, this is likely due to the significantly heterogeneous AFM measurements between women in the same group.


Discussion

Mammographic density remains an important risk factor for the development of breast cancer in women. Until now, the relationship between specific collagen types and mammographic density has not been clearly identified. While the findings presented here confirm what many have long hypothesized regarding fibrillar collagen and breast density, we found several notable trends in these data. First, basement membrane proteins exhibit a unique phenotype that is inversely proportional to breast density. Of note, during breast tumorigenesis there are often drastic changes in the abundance of basement membrane proteins that are interacting with an invasive breast stroma (160). This is often manifested as a decrease in basement membrane protein abundance as pre-malignant tissue becomes transformed. We also see a decrease in basement membrane protein abundance during involution in the rat mammary gland reproductive cycle ECM remodeling (250) (Chapter III). Commensurate with the observed decreases in basement membrane protein abundance are increases in collagen abundance.

Whether this is a result of increased stromal cell populations or just overproduction of fibrous proteins by existing resident stromal cells is unknown. However, we believe increases in fibrillar collagen are primarily contributing to the observed increase in breast density. As such, we focused our analysis on further dissecting which collagen types may be contributing to this phenotype.

Correlation analysis revealed that COL1A1/2 and COL5A1 are among the biggest contributors to what distinguishes a minimally dense stroma (MD 1 & 2) from a heterogeneously dense and extremely dense stroma (MD 3 & 4). Of note, COL12A1 was also found to weakly correlate with breast density, a finding that has not been previously reported in the literature. While the role of COL12A1 in disease is not well understood, we believe it may be integral to the formation of large, dense collagen networks in the breast. FACIT collagens are characterized by interrupted triple helices allowing for less rigid connections. It is possible that within the breast compartment, where there may be several distinct areas of dense tissue, COL12A1 is providing support to separate regions of dense fibrillar collagen and connecting them to form a tighter spatial network of fibers.

For the first time, we have compared collagen crosslinking profiles in the context of mammographic density. These data demonstrated that there is no significant difference in the abundance of collagen crosslinks in different mammographic densities. Surprisingly, a flat or slightly downward trend was observed suggesting that denser breast tissue has very modest changes in crosslinking. This could be a result of simply having a higher abundance of fibrillar collagen but a normal activity of crosslinking enzymes such as LOX as was confirmed by LC-SRM.

Through the application of quantitative ECM proteomics and collagen crosslinking analysis we have provided a deeper understanding of how ECM component abundance and the organization of collagen is related to differences in breast density. We have also been able to quantitatively confirm that increased breast density is strongly correlated with increased fibrillar collagen abundance.

Interestingly, we also showed a slight positive correlation between density and the FACIT collagen COL12A1. Increases in collagen abundance were not associated with increases in collagen crosslinking or elastic modulus measurements (not shown). Overall, these data suggest that denser breast tissues have more fibrillar collagen but that it is more loosely organized.

One caveat to the data presented here is that we have used tissue samples from pre-menopausal women and have not controlled for their menstrual cycle status. During menstruation, many women experience changes in breast texture with some women reporting that their breasts feel particularly lumpy and/or tender (251).

This is a result of the glands in the breast enlarging to prepare for a possible pregnancy. If pregnancy does not happen, the breasts return to normal size. Once menstruation begins, the cycle begins again. In future studies, I believe it would be more appropriate to analyze post-menopausal women in the same way we have done here to remove this potential confounding variable and to be more representative of the total at risk population. Additionally, controlling for parity, or at least stratifying women based on parity, would reduce any confounding because of pregnancy.

In follow-up studies, we plan to use this approach to investigate the relationship between breast density, BRCA1 mutants and stromal composition and crosslinking. The overarching challenge is to understand how germline mutations in BRCA1 confer a 50–80% lifetime risk of developing breast cancer. There are no effective methods to prevent breast cancer short of bilateral mastectomy in these women. The elevated risk of breast cancer in BRCA1 mutation carriers is attributed to a triad of intrinsic cellular effects that compromise DNA damage repair, increase hormone responsive proliferation, and skew lineage commitment. What is missing from this already complicated picture is stromal context that critically mediates function and cancer risk.

Prior studies have failed to consider the contribution of the mammary stroma to the development of breast cancer in women with BRCA1 mutations. This gap is evident in the wide use of conditional deletion murine models targeted to mammary epithelium. Importantly, analysis of a mouse in which germline Brca1 is dysregulated led to a plausible mechanism of action that correlates stromal production of IGF1, ECM stiffness, mechanosignaling and receptor activator of nuclear factor kappa-B ligand (RANKL) overexpression (252). These studies place dysregulated stroma proximal to breast cancer risk for BRCA1 mutation carriers.

Knowing her risk of breast cancer helps a woman decide when to start screening for breast cancer, how often to get screening mammograms, and what risk-reducing therapy is best for her. If BRCA1 mutations induce a specific stroma that promotes neoplastic progression in mutant epithelial cells, such knowledge will likely provide the conceptual basis for strategies to prevent sporadic disease.


Figure 4.1: Comparing the ECM Composition of Human Breast Tissue at Different Densities. A) Representative X-ray mammograms of breast tissue at four different densities. Photo adapted from UT Austin Diagnostic Clinic (Catherine Young, MD). B) ECM proteomics workflow diagram. Approximately 3 mg of tissue is sequentially extracted to yield cellular, soluble ECM (sECM), and insoluble ECM (iECM) fractions. QconCAT stable isotope labeled peptides are spiked into samples at a known concentration prior to trypsin digestion and peptides are monitored by LC-SRM (n=23 total clinical specimens, MD1 = 6, MD2 = 4, MD3 = 5, MD4 = 8). C) Scores plot from principal component analysis for all ECM and ECM-related proteins identified by LC-SRM. D) Ranked variable of importance (VIP) plot of top 25 variables ranked by p-value from Student's t-test. E) Functional class abundance plot of high abundance ECM groups. F) Functional class abundance plot of low abundance ECM groups. For all plots, values from each group were averaged and plotted with the standard error of the mean (SEM). Subsequent statistical analysis was performed with unpaired two-sided Student's t-tests (*P < 0.05; **P < 0.01, ***P < 0.001, ****P < 0.0001, “ns” not significant).


Figure 4.2: Fibrillar Collagen Abundance Correlates with Increased Mammographic Density. A) Scores plot from partial least squares – discriminant analysis (PLS-DA) of collagen variables quantified by LC-SRM. B) Collagen variables ranked by p-value from Student's t-test. C) Pearson correlation matrix of all identified collagen variables with mammographic density. D) Ranked collagen variables that correlate with mammographic density. E) Individual bar plots of fibrillar collagen (COL1A1, COL1A2, COL5A1) and FACIT collagen (COL12A1). F) Individual bar plots of MGP and ECM1 proteins. For all plots, values from each group were averaged and plotted with the standard error of the mean (SEM). Subsequent statistical analysis was performed with unpaired two-sided Student's t-tests (*P < 0.05; **P < 0.01, ***P < 0.001, ****P < 0.0001, “ns” not significant).


Figure 4.3: Mammographic Density is Not Associated with Significant Changes in Collagen Crosslinking. A) Workflow diagram of crosslinked amino acid analysis with LC-MS/MS. Approximately 10 mg from a punch biopsy of breast tissue is weighed, OCT storage fluid is removed, and tissue is hydrolyzed in 6N HCl. The resulting hydrolysate is enriched using solid phase extraction (SPE) with cellulose and LC-MS/MS xAAA is performed on the enriched hydrolysate. B) Total crosslinks scatter plot. Total values calculated by summing individual xAAs from C. C) Divalent crosslink scatter plots: lysinonorleucine (LNL), hydroxylysinonorleucine (HLNL) and dihydroxylysinonorleucine (DHLNL). D) Trivalent crosslink scatter plots: deoxy lysyl pyridinoline (dPyr), lysyl pyridinoline (Pyr). All crosslink values are normalized to total collagen content (i.e. hydroxyproline abundance) and starting tissue weight, and plotted as log2 transformed normalized peak areas. Individual scatter plot points represent individual animals with horizontal and vertical bars representing the mean and standard error of the mean (SEM), respectively (P < 0.05).


CHAPTER V

HYDROXY LYSINE DERIVED COLLAGEN CROSSLINKS PROMOTE POOR

BREAST CANCER PATIENT PROGNOSIS AND TREATMENT RESISTANCE

Introduction

A defining trait of many advanced solid tumors is an abundant collagen-rich extracellular matrix (ECM) characterized by fibrosis, increased tissue stiffness and increased collagen crosslinking (98, 106, 119). ECM stiffness can enhance cell growth and survival and promote migration away from a primary site, while ECM rigidity disrupts tissue morphogenesis by increasing cell tension. Collagen is the most abundant ECM scaffolding protein in the stroma and contributes to the tensile strength of tissues. It has been shown previously that enhanced matrix crosslinking forces breast tumor progression through enhanced integrin signaling (106).

During biosynthesis and extracellular maturation, collagen acquires a number of post-translational modifications that directly affect its mechanical strength, stability and architecture. This process is primarily driven by collagen-modifying enzymes such as lysyl hydroxylase (LH) and lysyl oxidase (LOX) (253-255), both of which have been shown to promote hypoxia-induced breast and lung tumor metastasis in vivo (110, 256-258). While all three LH family members hydroxylate the collagenous domain of type I procollagen, only LH2 is capable of hydroxylating N telopeptides, thus priming collagen fiber maturation through LOX-mediated crosslinking (259-261). With imposition of these modifications, LOX oxidatively deaminates telopeptide lysine (Lys) and hydroxylysine (HyL) residues, which initiates a spontaneous process of covalent intra- and intermolecular crosslinking of mature collagen, thereby increasing the mechanical strength of fibrils (Figure 1.1 & 1.2) (47, 262). Importantly, the dual action of LH2 and LOX ultimately generates two distinct species of crosslinking substrates: hydroxylysine aldehyde (Hylald)-derived collagen crosslinks (HLCCs) (Figure 1.2) and lysine aldehyde (Lysald)-derived collagen crosslinks (LCCs) (121, 254, 255) (Figure 1.3). We have previously published that specific breast cancer subtypes have enhanced fibrosis and stiffer tumor stroma, particularly HER2+ and Basal-like relative to Luminal A breast cancer (98).

Until now, techniques to analyze LH2- and LOX-derived crosslinks have been low throughput and have relied on multiple assays to measure the full repertoire of crosslinks. Additionally, these approaches have been primarily applied to hard tissues (e.g. bone, connective tissues) in which crosslinks are of naturally high abundance (58, 121). The current approach for measuring collagen crosslinks involves the detection of crosslinked amino acids (xAA) after tissue hydrolysis using a combination of three assays: 1) UV detection of ninhydrin derivatized divalent xAAs, 2) fluorescence detection of trivalent xAAs, and 3) hydroxyproline determination by UV absorbance. Importantly, the identification of xAAs by these methods relies on the use of high performance liquid chromatography (HPLC) retention times to identify which features in a given chromatogram match those of crosslink standards (58). Utilization of this approach is hindered by the lack of commercially available standards for xAAs and by the limited specificity of the HPLC assay. These methodologies have not been widely adopted due to technical hurdles associated with implementing these assays and the assumptions that must be made regarding accurate identification of unknown HPLC features. As such, a need exists for a highly sensitive assay capable of unambiguous and reproducible identification and quantification of crosslinks in soft tissues (e.g. human breast tumors) in which crosslinks are of low abundance.

Here, we modify and streamline traditional HPLC based techniques to develop an ultra-high performance liquid chromatography with tandem mass spectrometry (UHPLC-MS/MS) based approach to separate, detect, and quantify xAAs (Figure 5.1A & B) from tissues that vary in their observed fibrillar collagen properties, as we previously published (98, 107). This MS approach also allows us to validate identified crosslinks with fragmentation spectra (Figure 5.1B). In our initial proof of concept experiment, total collagen and total xAA abundance were quantified from several normal human tissues using the xAAA approach. Stiffer tissues, such as tendon, bone and trachea, had higher levels of total collagen and total crosslinks than softer tissues such as skin, lung, and breast. We can further stratify collagen crosslinking data based on LCC and HLCC abundance. Hard tissues (i.e. tendon, bone, trachea) tend to have a higher percentage of HLCCs than LCCs, with HLCCs representing ≥ 75% of the total crosslink abundance (tendon 98%, trachea 75%, bone 99%). In contrast, soft tissue HLCCs represented ≤ 62% of the total crosslink abundance (skin 42%, lung 62%, breast 60%) (Figure 5.1C). It has been shown previously that these tissues are drastically different in their measured stiffness and likely have different abundances of collagen crosslinks (107).


Materials and Methods

Preparation of Tissue for Hydrolysis

OCT was removed from tissue blocks by first transferring biospecimens to a conical tube and then performing 5X washes with 70% EtOH followed by 5X washes with 18 MΩ H2O (263). Each wash consisted of vortexing the sample for 15 minutes at 4°C and then centrifuging at 18,000 x g for 15 minutes at 4°C. Between 1 and 3 milligrams of tissue was washed with 1X PBS buffer by vortexing for 15 minutes at 4°C and then sonicated on ice for 20 seconds using a Sonic Dismembrator M100 (ThermoFisher, San Jose, CA, USA). The homogenate was then centrifuged at 18,000 x g for 20 minutes at 4°C. The supernatant was removed and the pellet was re-suspended in 1 mg/mL NaBH4 (prepared in 0.1N NaOH) in 1X PBS for 1 hour at 4°C with vortexing. The reaction was then neutralized by adding glacial acetic acid to a final concentration of 0.1% (pH ~3-4) (254). The sample was then centrifuged at 18,000 x g for 20 minutes at 4°C. The supernatant was removed and the pellet was washed three times with 18 MΩ H2O to remove any residual salt or acetic acid that may interfere with downstream LC-MS/MS analysis. The remaining pellet was dried using a lyophilizer system. Normal human tissues and tumor tissue from mouse studies were treated the same except that OCT removal was not required.

Protein Hydrolysis

The dried sample is placed in a glass hydrolysis vessel and hydrolyzed in a volume of 6N HCl, 0.1% phenol. The hydrolysis vessel is flushed with N2 gas, sealed and placed in a 110°C oven for 24 hours. After hydrolysis, the sample is cooled to room temperature and then placed at -80°C for 30 minutes prior to lyophilization. The dried sample is re-hydrated in 100 uL of 18 MΩ H2O for 5 minutes, then 100 uL of glacial acetic acid for 5 minutes and finally 400 uL of butan-1-ol for 5 minutes. Importantly, 10 uL of sample is removed after re-hydration in water and saved for determination of hydroxyproline content.

Preparation of Crosslink Enrichment Column

CF-11 cellulose powder is loaded as a slurry in butan-1-ol:glacial acetic acid:water (4:1:1) solution onto a Nanosep MF GHP 0.45 μm spin column until a settled resin bed volume of approximately 5 mm is achieved (58). The resin is washed with 1.5 mL of the 4:1:1 organic mixture using an in-house vacuum manifold set up. Re-hydrated samples are then loaded onto individual columns, the vacuum is turned on and the sample is pulled through the resin into glass collection vials. The flow through is again passed over the resin to ensure maximal binding of crosslinked amino acids and set aside. The column is then washed with 1.5 mL of fresh 4:1:1 organic mixture. A fresh collection vessel is placed under the column and 750 uL of 18 MΩ H2O is used to elute crosslinked amino acids off of the CF-11 resin. The eluent is then placed in a speed vac and run until complete dryness. Dried eluent is then reconstituted in a buffer appropriate for downstream MS analysis on reversed-phase or amide HILIC UHPLC columns.

UHPLC Analysis

Up to 20 uL of tissue hydrolysate was analyzed on a Vanquish UHPLC system (ThermoFisher, San Jose, CA, USA) using an Acquity UHPLC BEH Amide column (2.1 x 100 mm, 1.7 μm particle size; Waters, Milford, MA, USA). Samples were separated through a 5 minute gradient elution (55% - 40% mobile phase B) at 250 μL/min (mobile phase: (A) 10 mM NH4CH3CO2, pH 10.2, (B) 95% acetonitrile, 5% mobile phase A, pH 10.2; column temperature: 35°C).

MS Data Acquisition

The Vanquish UHPLC system (ThermoFisher, San Jose, CA, USA) was coupled online with a QExactive mass spectrometer (Thermo, San Jose, CA, USA), and operated in two different modes: 1) Full MS mode (2 μscans) at 70,000 resolution from 75 to 600 m/z operated in positive ion mode and 2) PRM mode at 17,500 resolution with an inclusion list of intact crosslinked amino acid masses and an isolation window of 4 m/z. Both modes were operated with 4 kV spray voltage, 15 sheath gas and 5 auxiliary gas. Calibration was performed before each analysis using a positive calibration mix (Piercenet – Thermo Fisher, Rockford, IL, USA).

Limits of detection (LOD) were characterized by determining the smallest injected amount of crosslinked amino acids (LNL, DHLNL, d-Pyr, Desmosine/Isodesmosine) required to provide a signal to noise (S/N) ratio greater than three using < 5 ppm error on the accurate intact mass. Based on a conservative definition for Limit of Quantification (LOQ), these values were calculated to be threefold higher than determined LODs.
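The LOD and LOQ criteria above can be expressed as a short sketch. It is only an illustration under stated assumptions: each injection of a dilution series is summarized as (amount, signal, noise, observed m/z), the LOD is taken as the smallest injected amount passing both the S/N > 3 and < 5 ppm criteria, and the LOQ is three times that amount. The function names, the arbitrary theoretical m/z of 300.0, and all example values are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def passes_detection(signal, noise, observed_mz, theoretical_mz,
                     min_sn=3.0, max_ppm=5.0):
    """True if S/N exceeds min_sn and the absolute mass error is below max_ppm."""
    if noise <= 0:
        raise ValueError("noise must be positive")
    good_sn = signal / noise > min_sn
    good_mass = abs(ppm_error(observed_mz, theoretical_mz)) < max_ppm
    return good_sn and good_mass


def estimate_lod_loq(injections, theoretical_mz, loq_factor=3.0):
    """Return (LOD, LOQ) from (amount, signal, noise, observed_mz) tuples.

    LOD is the smallest injected amount that passes both criteria;
    LOQ is loq_factor times the LOD. Returns None if nothing passes.
    """
    detected = [amount for amount, signal, noise, mz in injections
                if passes_detection(signal, noise, mz, theoretical_mz)]
    if not detected:
        return None
    lod = min(detected)
    return lod, loq_factor * lod


# Hypothetical dilution series (amounts in fmol); not data from this study
theoretical_mz = 300.0  # arbitrary placeholder m/z
series = [
    (10, 20.0, 10.0, 300.0003),   # S/N = 2, fails the S/N criterion
    (25, 40.0, 10.0, 300.0030),   # S/N = 4 but 10 ppm error, fails
    (50, 45.0, 10.0, 300.0003),   # S/N = 4.5 and 1 ppm error, passes
    (100, 90.0, 10.0, 300.0003),  # passes
]
result = estimate_lod_loq(series, theoretical_mz)
print(result)
```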

MS Data acquired from the QExactive were converted from a .raw file format to .mzXML format using MassMatrix (Cleveland, OH, USA). Assignment of crosslinked amino acids was performed using MAVEN (Princeton, NJ, USA) (264).

The MAVEN software platform provides the means to look at data acquired in Full MS and PRM modes and allows the user to import in-house curated peak lists for rapid validation of features. Normalization of crosslinked amino acid peak areas was performed using two parameters: 1) hydroxyproline content and 2) tissue dry weight pre-hydrolysis (in milligrams). Hydroxyproline content is determined by running a 1:10 dilution of the pre-enrichment sample through the Full MS mode (only) described above and exporting peak areas for each run.
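A minimal sketch of this two-parameter normalization follows. The exact arithmetic is not spelled out in the text, so the sketch assumes that each crosslink peak area is divided by the hydroxyproline peak area and by the dry weight in milligrams, and that the result is log2 transformed for plotting. The function names are hypothetical, and the optional `hyp_dilution_factor` argument is an assumption added for illustration; with its default of 1.0 it has no effect.

```python
from math import log2


def normalize_crosslink(peak_area, hyp_peak_area, dry_weight_mg,
                        hyp_dilution_factor=1.0):
    """Normalize a crosslink peak area to collagen content and tissue mass.

    Assumed form: peak_area / (hydroxyproline area * dry weight in mg).
    hyp_dilution_factor optionally rescales the hydroxyproline area,
    for example to account for the 1:10 pre-enrichment dilution.
    """
    if hyp_peak_area <= 0 or dry_weight_mg <= 0 or hyp_dilution_factor <= 0:
        raise ValueError("normalization parameters must be positive")
    collagen_signal = hyp_peak_area * hyp_dilution_factor
    return peak_area / (collagen_signal * dry_weight_mg)


def log2_normalized(peak_area, hyp_peak_area, dry_weight_mg,
                    hyp_dilution_factor=1.0):
    """log2 of the normalized peak area, as used for scatter plots."""
    value = normalize_crosslink(peak_area, hyp_peak_area, dry_weight_mg,
                                hyp_dilution_factor)
    if value <= 0:
        raise ValueError("cannot log-transform a non-positive value")
    return log2(value)


# Hypothetical peak areas (arbitrary units) and a 2 mg dry weight
print(normalize_crosslink(8.0e6, 2.0e6, 2.0))
print(log2_normalized(8.0e6, 2.0e6, 2.0))
```

Because a constant dilution factor shifts every sample by the same amount on the log2 scale, it would not change relative comparisons between groups.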

Quantification of Crosslinked Amino Acids

Relative quantification of crosslinked amino acids was performed by exporting peak areas from MAVEN into GraphPad (La Jolla, CA, USA) and normalizing based on the two parameters described above. Statistical analyses, including t-test and ANOVA (significance threshold for P values < 0.05), were performed on normalized peak areas. Total crosslink plots were generated by summing normalized peak areas for all crosslinks in a given sample and comparing prophylactic to tumor tissue. The ratio of hydroxylysine collagen crosslinks (HLCC) to lysine collagen crosslinks (LCC) was determined by summing normalized peak areas of HLCCs (DHLNL, d-Pyr, Pyr) and taking the ratio to LCCs (LNL). Calibration curves were generated for the standards in the background of an E. coli hydrolysate, which showed linearity over three orders of magnitude and lower limits of quantification (LLOQ) in the hundreds of femtomoles (Appendix A). These characteristics of the assay highlight the capability of this approach to be applied to soft tissues with low crosslink abundance.
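The summation and ratio described above can be sketched as follows. The grouping of DHLNL, d-Pyr and Pyr as HLCCs and LNL as the LCC follows the text; any other crosslink present in the input is counted toward the total only, which is an assumption of this sketch. The function name and example values are hypothetical.

```python
HLCC_NAMES = ("DHLNL", "d-Pyr", "Pyr")  # hydroxylysine aldehyde-derived
LCC_NAMES = ("LNL",)                    # lysine aldehyde-derived


def summarize_crosslinks(normalized_areas):
    """Summarize one sample from a dict of crosslink name -> normalized area.

    Returns the total crosslink signal, the HLCC and LCC sums, the
    HLCC:LCC ratio (None if no LCC signal), and HLCC as a percentage
    of the total (None if the total is zero).
    """
    total = sum(normalized_areas.values())
    hlcc = sum(normalized_areas.get(name, 0.0) for name in HLCC_NAMES)
    lcc = sum(normalized_areas.get(name, 0.0) for name in LCC_NAMES)
    ratio = hlcc / lcc if lcc > 0 else None
    hlcc_percent = 100.0 * hlcc / total if total > 0 else None
    return {
        "total": total,
        "hlcc": hlcc,
        "lcc": lcc,
        "hlcc_to_lcc": ratio,
        "hlcc_percent": hlcc_percent,
    }


# Hypothetical normalized peak areas for one sample (not study data)
sample = {"LNL": 2.0, "DHLNL": 5.0, "d-Pyr": 1.0, "Pyr": 2.0}
summary = summarize_crosslinks(sample)
print(summary)
```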


Human Breast Tissue

Fresh human breast tissue samples from prophylactic mastectomy or breast tumor mastectomy were either embedded in an OCT (Tissue-Tek) aqueous embedding compound within a disposable plastic base mold (Fisher), snap frozen by direct immersion into liquid nitrogen and kept at -80°C until cryo-sectioning for analysis, or formalin fixed and paraffin-embedded. All human breast tissue samples were prospectively collected from patients undergoing surgery at UCSF or Duke University Medical Center between 2010 and 2014. The selected samples were de-identified, stored and analyzed according to the procedures described in Institutional Review Board Protocol #10-03832 and #10-05046, approved by the UCSF Committee of Human Resources and the Duke IRB (Pro00034242).

Picrosirius Red Staining and Quantification

Flash-frozen, OCT-embedded tissues were cryo-sectioned at 5 μm, fixed in 4% neutral buffered formalin, stained using 0.1% picrosirius red (Direct Red 80, Sigma) and counterstained with Weigert’s hematoxylin, as previously described. Polarized light images were acquired using an Olympus IX81 microscope fitted with an analyzer (U-ANT) and a polarizer (U-POT, Olympus) oriented parallel and orthogonal to each other. Images were quantified using ImageJ. Briefly, a minimal intensity threshold was used to eliminate the background and then the fiber density was measured as image % area coverage. For analysis of tumor progression samples, 5 images per tissue were taken for control (n=3) and Lox overexpression (n=4) groups. The results per tissue region type were then pooled, averaged and normalized to adjacent normal tissue values.
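The thresholding and area-coverage measurement can be illustrated with a plain-Python sketch. This is not the ImageJ procedure itself; it assumes each image is available as a 2D grid of pixel intensities, that pixels strictly above a chosen threshold count as fiber signal, and that the per-image percentages are averaged before dividing by the value for adjacent normal tissue. Function names, the threshold, and the pixel values are hypothetical.

```python
def percent_area_coverage(image, threshold):
    """Percentage of pixels whose intensity is strictly above threshold.

    image is a 2D list (rows of pixel intensities).
    """
    pixels = [value for row in image for value in row]
    if not pixels:
        raise ValueError("image contains no pixels")
    above = sum(1 for value in pixels if value > threshold)
    return 100.0 * above / len(pixels)


def normalized_fiber_density(images, threshold, adjacent_normal_percent):
    """Mean % area coverage over images, relative to adjacent normal tissue."""
    if not images:
        raise ValueError("at least one image is required")
    if adjacent_normal_percent <= 0:
        raise ValueError("adjacent normal coverage must be positive")
    per_image = [percent_area_coverage(img, threshold) for img in images]
    mean_coverage = sum(per_image) / len(per_image)
    return mean_coverage / adjacent_normal_percent


# Hypothetical 8-bit intensity grids (not image data from this study)
image_a = [[0, 10, 200],
           [50, 120, 255]]
image_b = [[0, 0],
           [200, 200]]

print(percent_area_coverage(image_a, threshold=100))
print(normalized_fiber_density([image_a, image_b], threshold=100,
                               adjacent_normal_percent=25.0))
```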

Tissue Preparation for AFM Measurements of ECM Stiffness

Human breast tissue samples and mouse mammary tumors were analyzed following cryopreservation. Frozen tissue blocks were cut into 20 µm sections for human tissues and 30 µm sections for mouse tissues using disposable low profile microtome blades (Leica, 819) on a cryostat (Leica, CM1900-3-1). Prior to the AFM measurement, each section was fast thawed by immersion in PBS at room temperature. The samples were maintained in a proteinase inhibitor in PBS (Protease Inhibitor Cocktail Roche Diagnostics, 11836170001), with propidium iodide (SIGMA P4170, 20 mg/ml) during the AFM session. Samples from five patients for each breast cancer subtype were used for AFM quantification of Young’s elastic modulus of the cancer-associated stroma.

AFM Measurements of ECM Stiffness on Tissue Sections

AFM measurements were performed using a similar approach as in Acerbi et al. (98). Briefly, all AFM indentations were performed using an MFP3D-BIO inverted optical AFM (Asylum Research) mounted on a Nikon TE2000-U inverted fluorescent microscope, as previously described (6). We used silicon nitride cantilevers with a spring constant of 0.06 N m-1 and a borosilicate glass spherical tip with 5 µm diameter (Novascan Tech). The cantilever was calibrated using the thermal oscillation method prior to each experiment. Samples were indented at a 20 μm s-1 loading rate, with a maximum force of 2 nN. Ten AFM force maps were typically obtained on each sample, each map as an 80 x 80 μm raster series of indentations utilizing the FMAP function of the IGOR PRO build supplied by Asylum Research. The Hertz model was used to determine the elastic properties of the tissue (E1). Tissue samples were assumed to be incompressible and a Poisson’s ratio of 0.5 was used in the calculation of the Young’s elastic modulus.
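For reference, the textbook Hertz relation for a rigid spherical indenter, F = (4/3) · E/(1 − ν²) · √R · δ^(3/2), can be rearranged for the Young's modulus as in the sketch below. This is a single-point illustration under that assumed form; the fitting routine actually applied to the force curves is not described here, and in practice the modulus is usually obtained by fitting the whole force-indentation curve. The tip radius of 2.5 µm follows from the 5 µm tip diameter and the 2 nN force is the stated maximum, while the 1 µm indentation depth is a hypothetical value.

```python
from math import sqrt


def hertz_youngs_modulus(force_n, indentation_m, tip_radius_m,
                         poisson_ratio=0.5):
    """Young's modulus (Pa) from the Hertz model for a spherical indenter.

    Rearranged from F = (4/3) * E / (1 - nu^2) * sqrt(R) * delta^(3/2).
    All inputs are in SI units (N, m, m).
    """
    if force_n <= 0 or indentation_m <= 0 or tip_radius_m <= 0:
        raise ValueError("force, indentation and radius must be positive")
    numerator = 3.0 * force_n * (1.0 - poisson_ratio ** 2)
    denominator = 4.0 * sqrt(tip_radius_m) * indentation_m ** 1.5
    return numerator / denominator


# 2 nN maximum force and 2.5 um tip radius from the text;
# the 1 um indentation depth is a hypothetical example value.
modulus_pa = hertz_youngs_modulus(force_n=2e-9,
                                  indentation_m=1e-6,
                                  tip_radius_m=2.5e-6,
                                  poisson_ratio=0.5)
print(f"E = {modulus_pa:.0f} Pa")
```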

SHG Image Acquisition

For two-photon imaging, fresh tissue sections were fixed post-AFM indentation in 4% paraformaldehyde. We used custom resonant-scanning instruments based on published designs containing a five-PMT array (Hamamatsu, C7950) operating at video rate (265). The setup was used with two channel simultaneous video rate acquisition via two PMT detectors and an excitation laser (2W MaiTai Ti-Sapphire laser, 710–920 nm excitation range). Second harmonics imaging was performed on a Prairie Technology Ultima System attached to an Olympus BX-51 fixed stage microscope equipped with a 25X (NA 1.05) water immersion objective. Tissue samples were exposed to polarized laser light at a wavelength of 830 nm and emitted light was separated using a filter set (short pass filter, 720 nm; dichroic mirror, 495 nm; band pass filter, 475/40 nm). Images of x–y planes at a resolution of 0.656 mm per pixel were captured using the open-source Micro-Magellan software (266).

LH2 IHC and Prognostic Analyses

Study population. The female Malmö Diet and Cancer Study (MDCS) cohort consists of 17,035 women born between 1923 and 1950 (267, 268). Information on incident breast cancer is retrieved annually from the Swedish Cancer Registry and the South Swedish Regional Tumor Registry. Follow-up until December 31, 2010, identified a total of 1,016 women with incident breast cancer. To arrive at the current study population of 910 breast cancer patients, we excluded patients 1) with in situ only cancers (n=68), 2) who received neo-adjuvant treatments (n=4), 3) with distant metastasis at diagnosis (n=14), 4) who died from breast cancer-related causes ≤ 0.3 years from diagnosis (n=2), and 5) with bilateral cancers (n=17).

In addition, one patient who declined treatment for four years before accepting surgery was excluded. Patient characteristics at diagnosis and pathological tumor data were obtained from medical records. Information on cause of death and vital status was retrieved from the Swedish Causes of Death Registry, with last follow-up

December 31st, 2014. Ethical permission was obtained from the Ethical Committee at Lund University (Dnr 472/2007). All participants originally signed a written informed consent form.
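The exclusion counts given for the study population can be reconciled with a short arithmetic check. This is only a sketch: the category labels are paraphrased, and the count of 2 for patients who died shortly after diagnosis is taken as the value that makes the reported totals agree.

```python
INCIDENT_CASES = 1016  # women with incident breast cancer through 2010

# Exclusion counts as reported in the study population description.
EXCLUSIONS = {
    "in situ only cancer": 68,
    "neo-adjuvant treatment": 4,
    "distant metastasis at diagnosis": 14,
    "breast cancer death shortly after diagnosis": 2,  # assumed count, see lead-in
    "bilateral cancer": 17,
    "declined treatment for four years": 1,
}

study_population = INCIDENT_CASES - sum(EXCLUSIONS.values())
print(study_population)
```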

Tumor evaluation. Tumor samples from incident breast cancer cases in MDCS were collected, and a tissue microarray (TMA) including two 1-mm cores from each tumor was constructed (Beecher, WI, USA). Within the study population (N=910), tumor tissue cores were accessible from 718 patients. Sections of 4 μm, dried for one hour at 60 °C, were automatically pretreated using the Autostainer Plus, DAKO staining equipment with Dako kit K8010 (Dako, DK). A primary mouse monoclonal Lysyl Hydroxylase 2 (LH2) antibody (Cat. No: TA803224, dilution 1:150, Origene) was used for the immunohistochemical staining.

TMA cores were analyzed by a panel of four anatomic pathologists (ACN, AC, JG, AN) using the PathXL digital pathology system (http://www.pathxl.com, PathXL Ltd., UK), blinded to all other clinical and pathologic variables. Immunohistochemistry for LH2 was assessed separately for the stromal and neoplastic epithelial components of the tumors. Stromal LH2 staining was assessed with the semi-quantitative H-score, which combines intensity and proportion-positive assessments into a continuous variable from 0-300 (269). Cellular stromal components were assessed (including fibroblasts, macrophages, endothelial cells, adipocytes, and other stromal cell types), while areas of significant lymphocytic infiltrate were specifically excluded from the percent positive estimation. Neoplastic epithelial LH2 staining intensity was scored 0-3+ based on the predominant intensity pattern in the tumor; invasive tumor cells did not display significant intra-tumoral heterogeneity of LH2 staining within each core. Inter-observer reproducibility of the H-score was verified in a test series of 16 cases evaluated by all study pathologists. Inter-observer agreement in this test series was very high, both when the IHC scores were evaluated as continuous variables (Pearson correlation coefficients ranging from 0.912-0.9566, all p values < 0.0001) and after transformation into categorical data (negative/low, moderate, and high; weighted kappa coefficients ranging from 0.673-0.786). In addition, 50 cases of the study cohort were evaluated blindly by two pathologists to confirm data fidelity; the Pearson correlation coefficient was 0.7507 (p = 5.7E-05), considered a strong level of agreement.
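The H-score described above can be sketched as follows. The thesis cites (269) for the score without writing out the formula, so the conventional definition is assumed here: the percentage of cells staining at each intensity (1+, 2+, 3+) is weighted by that intensity and summed, giving a value between 0 and 300. The function name and the example percentages are hypothetical.

```python
def h_score(pct_weak, pct_moderate, pct_strong):
    """Conventional H-score: 1*%(1+) + 2*%(2+) + 3*%(3+), range 0-300.

    Percentages refer to the assessed (here, stromal) cell population;
    unstained cells make up the remainder and contribute zero.
    """
    percentages = (pct_weak, pct_moderate, pct_strong)
    if any(p < 0 for p in percentages) or sum(percentages) > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    return 1 * pct_weak + 2 * pct_moderate + 3 * pct_strong


# Hypothetical core: 20% weak, 30% moderate, 10% strong, 40% negative.
print(h_score(20, 30, 10))
```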

After exclusion of cases for which LH2 was not evaluable on the TMA, H-scores for 505 total patients were included for statistical associations with clinicopathologic features and patient outcome. Each patient was represented by two cores, and TMA core 1 and core 2 were merged into a joint variable favoring the highest stromal LH2 H-score or epithelial LH2 intensity, because we predicted that a higher H-score would drive patient outcome, in accordance with our gene expression data demonstrating that high PLOD2 expression correlated with poor outcome. The Pearson correlation coefficient between cores was 0.647, demonstrating moderate agreement between the stromal LH2 H-scores of the two cores. In cases with only one TMA core providing an LH2 score, the expression of this core was used. Further, this joint LH2 variable was categorized into tertiles based on the study population with valid LH2 annotation (N=505). The lowest tertile was defined as H-scores from 0 up to and including 120 (N=171), the intermediate tertile as H-scores above 120 and up to and including 230 (N=188), and the highest tertile as H-scores above 230 (N=146).
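The merging of the two cores and the tertile assignment can be sketched as below. The cut-offs (120 and 230) and the rule of favoring the higher core come from the text; the function names, the use of None for a non-evaluable core, and the tertile labels are illustrative assumptions.

```python
def joint_h_score(core1, core2):
    """Merge two TMA cores, favoring the higher H-score.

    A core that was not evaluable is passed as None; if only one core
    has a score, that score is used. Returns None if neither is evaluable.
    """
    scores = [s for s in (core1, core2) if s is not None]
    return max(scores) if scores else None


def lh2_tertile(h_score):
    """Assign the tertile using the cut-offs reported for the cohort."""
    if not 0 <= h_score <= 300:
        raise ValueError("H-score must lie between 0 and 300")
    if h_score <= 120:
        return "low"
    if h_score <= 230:
        return "intermediate"
    return "high"


# Hypothetical patients: (core 1, core 2)
for cores in [(90, 140), (None, 120), (240, 200)]:
    merged = joint_h_score(*cores)
    print(cores, merged, lh2_tertile(merged))
```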

IHC statistical analyses. Patient and tumor characteristics at diagnosis in relation to stromal LH2 expression were categorized and presented as percentages.

Continuous variables are presented as the mean and min/max. The association between LH2 expression and prognosis was examined using breast cancer-specific mortality as the endpoint, defined as the incidence of breast cancer-related death. Follow-up was calculated from the date of breast cancer diagnosis to the date of breast cancer-related death, date of death from another cause, date of emigration or the end of follow-up as of December 31st, 2014. Breast cancer-related mortality (BCM) was compared among patients grouped by stromal or epithelial LH2 expression. Main analyses included the overall population; additional subgroup analyses were stratified by estrogen receptor (ER) or axillary lymph node involvement (ALNI) status. The prognostic impact of stromal LH2 expression was analyzed through Cox proportional hazards analyses, which yielded hazard ratios (HR) and 95% confidence intervals (CI) for crude models, and multivariate models adjusted for age at diagnosis (model 1) and tumor characteristics ER (dichotomized, cut-off 10% stained nuclei), ALNI (none or any positive lymph node involvement), histological grade (Nottingham grade I-III), and tumor size (dichotomized using cut-off 20 mm). Kaplan-Meier curves indicated that LH2 status particularly impacted the first 10 years after diagnosis, and survival variables constructed to capture this effect were used in Cox regression models investigating the effects during the first post-diagnostic decade. All other statistical analyses were performed in SPSS version 22.0 (IBM).
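The text does not specify how the survival variables for the first post-diagnostic decade were constructed. One common way to restrict an analysis to a fixed horizon is administrative censoring, sketched below as an assumption: follow-up beyond 10 years is truncated at 10 years and any later event is treated as censored. The function name and example values are hypothetical.

```python
def truncate_follow_up(follow_up_years, breast_cancer_death, horizon_years=10.0):
    """Censor follow-up at a fixed horizon.

    Returns (time, event). Patients followed beyond the horizon contribute
    exactly `horizon_years` of follow-up and no event, even if they later
    died of breast cancer.
    """
    if follow_up_years < 0:
        raise ValueError("follow-up time cannot be negative")
    if follow_up_years > horizon_years:
        return horizon_years, False
    return follow_up_years, breast_cancer_death


# Hypothetical patients: (years of follow-up, died of breast cancer)
patients = [(3.2, True), (12.5, True), (8.0, False), (15.0, False)]
print([truncate_follow_up(t, e) for t, e in patients])
```

The truncated time and event columns would then be passed to the Cox model in place of the full follow-up.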

Statistical Analysis

Statistical significance comparing different histological regions representing tumor progression within the same patient was assessed either using a two-tailed paired Mann–Whitney nonparametric test or ANOVA. Statistical significance comparing tissue samples from different tumor subtypes was assessed either using a two-tailed unpaired Mann–Whitney non-parametric test, Wilcoxon Rank Sum Test, or ANOVA as appropriate. Means are presented ± standard deviation of multiple measurements and statistical significance was considered at P < 0.05.

Results

Overexpression of Lysyl Oxidase in the Mammary Tumor Stroma Results in Increased Collagen Crosslinking

While the role of lysyl oxidase (Lox) in the formation of crosslinks is well understood, inducible modulation of its activity in genetically engineered mouse models (GEMMs) has proven difficult. Although Lox has been shown to be essential for hypoxia-induced metastasis, whether Lox expression in the epithelium or in the stromal compartment contributes to the observed increases in tissue stiffness remains unclear and is the subject of ongoing debate. Thus, we set out to generate two mammary tumor models that can induce expression of Lox in the epithelium or the stroma using two different promoters. Using a genetically engineered mouse model of luminal breast cancer (MMTV-PyMT; MMTV-rtTA; TetO_mLOX), mammary tumors were induced to overexpress Lox in the tumor epithelium. In order to assess how Lox overexpression alters the stroma, picrosirius (PS) red staining was performed (Figure 5.2A) and quantified under polarized light (Figure 5.2B). The functional outcome (i.e., changes in collagen crosslinking) of Lox overexpression was measured using xAAA on these tumors and tissues, and total crosslinks were quantified. As this is a Tet-ON system, animals treated with DOX will turn on overexpression of Lox (DOX-ind PyMT Lox OX). Animals on water (Water PyMT control) or animals that did not have the correct genotype (DOX-ind PyMT control) served as control groups. In comparing these groups, no significant differences in total crosslinks (Figure 5.2C) or individual LCC or HLCC crosslinks (Figure 5.2 D & E) were observed. We also observed no difference in HLCCs alone (Figure 5.2 F), implying that Lox overexpression by epithelial tumor cells does not functionally change the abundance of collagen crosslinks in the tumor stroma.

Under control of the Col1a1 promoter (MMTV-PyMT+/-; Col1a1-tTA+/-; TetO_mLox+/-), a novel tetracycline (Tet-OFF) inducible model of murine Lox overexpression in the stromal compartment was generated. As this is a Tet-OFF system, animals taken off DOX will turn on overexpression of Lox (PyMT Lox OX) in the stroma, while animals on DOX will retain normal expression of Lox (PyMT Control). H&E staining identified areas of invasion in PyMT tumor models (Figure 5.3A). Stromal Lox overexpression (PyMT Lox OX) in mammary tumors led to an increase in fibrillar collagen compared to normal mammary gland (Normal MG) when assessed by PS red staining under polarized light (Figure 5.3A). Further characterization of the stroma using SHG revealed the formation of dense, linearized fibers in Lox overexpressing mammary tumors relative to PyMT control and normal mammary gland tissue (Figure 5.3A). In line with this, Lox OX mammary tumors had elevated levels of phosphorylation of focal adhesion kinase (FAK), a mechanosignaling modulator, in tumor cells, indicating an increase in integrin signaling (Figure 5.3A).

As Lox overexpression is expected to promote increased collagen crosslinking, it was hypothesized that this would be associated with increased tissue stiffness. Indeed, Lox OX mammary tumors had a significantly stiffer stroma as measured by atomic force microscopy (AFM) indentation. We were also able to determine that Lox OX mammary tumors had a higher median value for the upper 10% of elastic modulus measurements (Figure 5.3B). The final outcome of Lox OX in the mammary tumor stroma was assessed using xAAA. Lox overexpression significantly increased the total number of crosslinks relative to normal mammary gland and PyMT control tumors, suggesting a positive relationship between Lox expression and crosslinking (Figure 5.3C). Upon assessment of individual LCC and HLCC crosslinks, we observed a potential preference for HLCC crosslinks in Lox OX mammary tumors (Figure 5.3D). By grouping individual HLCCs (DHLNL, Pyr) and plotting total HLCC abundance, we found that Lox OX tumors have significantly more HLCC crosslinks than their PyMT counterparts (Figure 5.3E). Correlation analysis confirmed a positive association between collagen crosslinking and tissue stiffness (Figure 5.3F); stiffness correlated specifically with the abundance of the HLCC crosslinks DHLNL and Pyr.
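The two summary statistics used in this comparison, the median of the upper 10% of elastic modulus measurements per tissue and the Pearson correlation between stiffness and crosslink abundance, can be sketched as follows. The rounding rule for how many measurements count as the "upper 10%" is an assumption (rounded up, at least one value), and the data values are invented for illustration.

```python
import math
import statistics


def upper_decile_median(moduli):
    """Median of the stiffest 10% of elastic modulus measurements.

    The number of values kept is ceil(n / 10), with a minimum of one.
    """
    if not moduli:
        raise ValueError("no measurements supplied")
    ordered = sorted(moduli)
    keep = max(1, (len(ordered) + 9) // 10)  # integer ceiling of n / 10
    return statistics.median(ordered[-keep:])


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-animal values: stiffness summary (kPa) and log2 HLCC abundance.
stiffness_kpa = [1.1, 1.4, 2.0, 2.6, 3.1]
hlcc_log2 = [10.2, 10.9, 11.5, 12.4, 12.8]
print(round(pearson_r(stiffness_kpa, hlcc_log2), 3))
```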

Association Between Collagen Crosslinking Abundance and Fiber Organization in Human Breast Tissue

Breast tumors are known to undergo significant ECM remodeling during tumorigenesis and tumor progression (99, 106, 118, 183). Although these architectures have been imaged widely using second harmonic generation (SHG), little is known about how the observed architectural changes are manifested at the molecular level in changes to collagen crosslinks. For example, healthy normal breast tissue is characterized by wavy and relaxed collagen architectures (Figure 5.4A). Traditionally, invasive ductal carcinoma (IDC) ECM is remodeled to form straightened fibers that align with the tumor boundary, a phenomenon termed tumor associated collagen signatures (TACS) (118, 183). It has been hypothesized that this remodeling is associated with increases in collagen crosslinking; however, it remains unclear how fiber density and realignment relate to crosslinking at the molecular level. In recent years, the heterogeneity of ECM architectures in breast tumors has been recognized, and we now know that IDC can present with both curly (IDC – Curly, IDCc) and straightened (IDC – Straight, IDCs) collagen fiber structure (Figure 5.4B). Taking this into account, it was hypothesized that IDCc and IDCs collagen architectures would have different crosslinking profiles.

Interestingly, both IDCc and IDCs groups demonstrated significant increases in total collagen crosslinks relative to healthy normal breast tissue (Figure 5.4C). However, we did not observe differences between IDCc and IDCs groups. Further analysis of individual LCC and HLCC crosslinks revealed that IDCs HLCCs (i.e., DHLNL and Pyr) exhibited the largest changes in crosslink abundance relative to healthy normal breast tissue (Figure 5.4D).

Characterization of Collagen Crosslinking in Human Breast Tumor Subtypes

Based on our data and previous studies that had implicated LOX in human breast cancer (106, 110, 185), we hypothesized that breast tumor subtypes with extensive fibrosis would have a higher abundance of HLCC crosslinks. Using our xAAA approach, we first demonstrated that breast tumors have significantly more collagen crosslinks relative to prophylactic breast tissue (Figure 5.5A). Knowing that breast tumors are significantly more crosslinked (Figure 5.4B), we sought to determine if specific tumor subtypes have distinct collagen crosslinking profiles. We obtained OCT-embedded frozen human breast tumors classified as either ER+/PR+ or -/HER2- (ER+), HER2+, or triple negative (TN), as well as normal breast tissue. Further stratifying breast tumors by subtype, we found that TN tumors exhibited the greatest increase in total crosslinks relative to the other tumor subtypes and normal breast tissue (Figure 5.5B). Individual xAAs were compared between tumor subtypes, and a strong preference for HLCCs in TN tumors relative to ER+ and HER2+ tumors was identified (Figure 5.5C). Further quantification of HLCCs alone demonstrated a significant difference in HLCC crosslink abundance between TN breast tumors and both normal tissue and all other subtypes analyzed. Furthermore, TN tumors were the only subtype where all HLCCs correlated with stroma stiffness (Figure 5.5E). Based on these findings, we hypothesized that the preference for HLCCs is driven by lysyl hydroxylase 2 (LH2), which acts intracellularly to hydroxylate telopeptide lysine residues on collagen fibrils. Intracellular LH2 activity is necessary for priming collagen maturation and stability by promoting HLCC formation (260, 270). Of note, others have investigated a role of LH2 in lung cancer and have shown that its expression correlates with poorly differentiated lung tumors and worse overall survival in patients, although the focus was on tumor cell-associated LH2 (271).

High PLOD2 Expression is Associated with Poor Breast Cancer Prognosis

Because fibrillar collagens are mainly expressed by cancer-associated fibroblasts in the tumor microenvironment (122, 272), we evaluated stromal LH2 in human breast cancer (Figure 5.6A). First, we developed a stroma-specific LH2 H-score to determine the relationship between LH2 expression, tumor grade and survival in breast cancer patients, independent of subtype. Patient biopsies with a high stromal LH2 H-score were enriched for poorly differentiated grade 3 tumors. Remarkably, we uncovered a significant correlation between high stromal LH2 H-score and shorter breast cancer-associated survival when adjusted for age at diagnosis (Figure 5.6B); this correlation was particularly significant when we limited the follow-up to the first 10 years (Figure 5.6C). In support of a pro-metastatic role of LH2 in breast cancer, our data showed that stromal LH2 expression correlated with breast cancer-associated survival in lymph node-positive patients, but not in lymph node-negative patients (Figures 5.6D & E). Once a relationship between stromal LH2 and poor prognosis in breast cancer patients had been established, we were interested in determining whether there is a subtype-specific association. While the number of biopsies was low for HER2+ (n=36) and TN (n=32), we observed a clear enrichment for intermediate and high LH2 H-score compared with ER+/HER2- tumors (n=296). We further solidified the relationship between TN tumors and LH2 by demonstrating elevated gene expression levels of PLOD2 (the gene encoding LH2) in ER-/HER2- breast tumors (basal-like, n=132) relative to HER2+ (n=72) and ER+/HER2- (luminal, n=313) tumors using patient cohorts published in Yau et al. (273) (Figure 5.6F). Moreover, we used the same dataset to show significant correlations between high PLOD2 expression and distant metastasis-free survival (DMFS) in ER-/HER2- and HER2+ tumors, compared with ER+/HER2- tumors (Figure 5.6G & H). Finally, we also demonstrated increased risk for relapse in TN patients with high PLOD2 expression, but not in patients with ER+ or HER2+ tumors, using different datasets from Szasz et al. (274) (Figure 5.6I). Altogether, our results provide evidence that HLCC-driven fibrosis in TN breast cancer elevates the risk for treatment resistance and poor prognosis.

Discussion

While it has been observed time and time again that stiff tumors are poorly differentiated and have a worse prognosis, it has remained unclear what the direct relationship is between fiber architecture, tissue stiffness and collagen crosslinking.

In part, this was due to a lack of genetically engineered mouse models capable of modulating the activity of crosslinking enzymes, and to an outdated approach to the analysis of crosslinks in soft tissues. The updated xAAA approach presented here is broadly applicable to soft and hard tissues, including normal breast tissue. We have focused on characterizing stromal remodeling in the breast as it has been shown to play a role in tumor progression. It was presumed that breast tumor ECM remodeled into denser fibers would naturally have a higher abundance of collagen crosslinks; however, this assumption had yet to be mechanistically explored.

Metastatic breast tumors have previously been shown to have a higher elastic modulus (i.e., a stiffer tissue) than their non-metastatic counterparts (98). This stiffening has been shown to be a result of overexpression of the crosslinking enzyme LOX (106), which increases tumor volume and tumor grade and has been found to be essential for hypoxia-induced breast tumor metastasis (110). Importantly, it has remained unknown whether increased tissue stiffness driven by LOX overexpression is derived from epithelial tumor cells or stromal cells. We set out to answer this question by using two GEMMs that direct Lox overexpression to the epithelium or the stroma. Interestingly, we found no association between epithelial Lox overexpression and increased collagen crosslinking. However, using a Col1a1 promoter and directing overexpression to the stroma, we see a robust increase in total collagen crosslinking and a potential role for HLCCs in driving this phenotype. Thus, we are able to definitively conclude that Lox overexpression in the stroma is essential for the alteration of collagen crosslinking during mammary tumor progression.

Through application of the xAAA approach, the relationship between observable architecture (i.e. SHG) and crosslinking was explored in human breast tissue. We found that although invasive ductal carcinomas (IDC) may have different architectures (i.e. IDCc vs IDCs), their abundance of crosslinks is similar. These data suggest that it is the abundance of crosslinks and not a particular observable architecture that is associated with invasive carcinomas. Of note, IDCc patients exhibited a larger fold change relative to normal mammary gland than IDCs patients.

We next sought to determine if unique crosslinking profiles were associated with specific subtypes of breast cancer. Through the analysis of human breast cancer patient samples, we determined that total crosslinking abundance is positively associated with breast tumor aggression, with triple negative cases having the highest abundance of collagen crosslinks. We stratified LCC and HLCC crosslinks individually and determined a strong preference for HLCCs in triple negative but not HER2+ or ER/PR+ cases. Further examination of the relationship between crosslinks and AFM tissue stiffness measurements revealed that HLCCs, but not LCCs, correlated with tissue stiffness, with the strongest relationship found in triple negative cases.

As these data suggest a requisite role for lysyl hydroxylase 2 (LH2) in the development of stiffer tumors that are more aggressive, we sought to investigate its clinical significance in a larger cohort of breast cancer patient samples, specifically focusing on triple negative disease. To vet LH2 as a potential biomarker for triple-negative disease, we evaluated tumor microarray data from hundreds of patients and found that high LH2 expression in human breast tumors correlates with ER status, histological grade, and survival – underscoring the clinical relevance of how a single feature of fibrosis may alter the course of a disease. Furthermore, these data emphasize the presence of unique molecular features of breast tumor fibrosis, and that the desmoplastic response is likely to differ depending on the subtype.

Ultimately, we argue that these preferences can lead to changes in the mechanical properties of a desmoplastic tumor stroma and drive tumor aggression and treatment resistance (94, 98).

xAAA coupled to LC-MS/MS analysis builds on previous approaches describing the identification of crosslinks, with significant improvements including: 1) a means to profile collagen crosslinks from OCT-embedded patient breast tumor biopsies; 2) detection limits in the hundreds of femtomoles, with the capability of detecting all known crosslinks (divalent, trivalent, tetravalent) without chemical derivatization in a single assay; 3) application of a targeted MS approach allowing for unambiguous identification of crosslinks through acquisition of unique fragmentation spectra for each crosslink, thus drastically improving the specificity of the method; and 4) a means to decisively identify new crosslinks and other crosslink species that remain elusive, such as the mature products of the Lysald crosslinking pathways (i.e., pyrroles). Altogether, this method provides a sensitive, specific and efficient approach to interrogate the molecular phenotype of fibrosis in tumors.

The analyses presented here reveal a strong relationship between LH2 expression, HLCC abundance, patient prognosis and treatment resistance, especially in triple negative subtypes. By using a modern xAAA approach we have provided valuable insight into how collagen crosslinking is defined by different processes among breast cancer pathologies whose stromas have unique mechanical properties. To that end, we have been the first to characterize collagen crosslinks in human breast cancer and demonstrated that a distinct feature of fibrosis in triple negative breast cancer is increased expression of LH2 leading to the formation of HLCCs which act to stiffen the tumor stroma.

Figure 5.1: xAAA Workflow Diagram and Crosslink Analysis of Normal Tissues. A) Approximately 5 mg of dried tissue is hydrolyzed in 6N HCl for 24 hrs and subjected to solid phase extraction (SPE) enrichment. An enriched hydrolysate is achieved through separation based on polarity. B) LC-MS/MS on a hybrid quadrupole orbitrap instrument generates MS2 spectra used to accurately identify xAAs. C) xAAA applied to soft and hard tissues (pooled n=3 each tissue). Collagen abundance is plotted using hydroxy proline measurements derived from pre-enriched hydrolysates. Total xAA abundance is plotted for each tissue and is further stratified by lysine collagen crosslinks (LCC) and hydroxy lysine collagen crosslinks (HLCC). All crosslink values are normalized to total collagen content (i.e. hydroxy proline abundance), starting tissue weight, and plotted as log2 transformed normalized peak areas.
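The normalization stated in the figure legends (crosslink peak area normalized to hydroxy proline abundance and starting tissue weight, then log2 transformed) can be sketched as below. The exact order of operations is not given in the legend, so dividing by both quantities before taking log2, and summing DHLNL and Pyr peak areas before normalizing to obtain total HLCCs, are assumptions; all names and values are hypothetical.

```python
import math


def normalized_log2_abundance(peak_area, hydroxyproline_abundance, tissue_weight_mg):
    """log2 of a crosslink peak area normalized to collagen content and tissue weight."""
    if min(peak_area, hydroxyproline_abundance, tissue_weight_mg) <= 0:
        raise ValueError("all inputs must be positive")
    return math.log2(peak_area / (hydroxyproline_abundance * tissue_weight_mg))


# Hypothetical sample: peak areas for the two HLCCs named in the legends.
dhlnl_area, pyr_area = 6.0e6, 2.0e6
hyp_abundance, weight_mg = 4.0e3, 5.0

total_hlcc = normalized_log2_abundance(dhlnl_area + pyr_area, hyp_abundance, weight_mg)
print(round(total_hlcc, 3))
```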

Figure 5.2: Overexpression of Lysyl Oxidase in the Mammary Tumor Epithelium Does Not Alter Collagen Crosslinking. A) Lysyl oxidase (Lox) expression was induced in the mammary epithelium using a Tet-ON system; animals treated with DOX will turn on overexpression of Lox (DOX-ind PyMT Lox OX). Animals on water (Water PyMT control) or animals that did not have the correct genotype (DOX-ind PyMT control) served as control groups. B) Picrosirius red staining and polarized light imaging of doxycycline treated mice with MMTV-PyMT tumors (Control) or mice with MMTV-PyMT tumors in which Lox was overexpressed (Lox OX). C) Quantification of picrosirius red polarized light images plotted as a percentage of total area. D) Total crosslinks scatter plot. Total values were calculated by summing individual xAAs in E. E) Scatterplots for individual crosslinks identified from xAAA. F) Total HLCCs bar plots. Total HLCCs were calculated by summing DHLNL and Pyr, and are stratified as such. For bar plots, values from each group were averaged and plotted with the SEM. Subsequent statistical analysis was performed with unpaired two-sided Student's t-tests (*P < 0.05; **P < 0.01, ***P < 0.001, ****P < 0.0001). All crosslink values are normalized to total collagen content (i.e. hydroxy proline abundance), starting tissue weight, and plotted as log2 transformed normalized peak areas.

Figure 5.3: Collagen Crosslinking Closely Correlates with Fibrillar Collagen Accumulation and ECM Stiffness in Mammary Tumors Overexpressing Lysyl Oxidase in the Stromal Compartment. A) Immunohistological staining (H&E, picrosirius red with polarized light, second harmonic generation, FAK pY397) and imaging of normal mammary gland, Dox induced PyMT control tumors, and Lox overexpressing (Lox OX) tumors. Lysyl oxidase (Lox) expression was induced in the stromal compartment under the control of the Col1a1 promoter using a Tet-OFF system when MMTV-PyMT mice were taken off doxycycline treatment to induce crosslinking. B) AFM used to measure ECM stiffness in control and Lox OX tumors. C) Total crosslinks scatter plot. Total values were calculated by summing individual xAAs in D. D) Scatterplots for individual crosslinks identified from xAAA. Individual points represent individual animals with horizontal and vertical bars representing the mean and SEM. E) Total HLCCs bar plots. Total HLCCs were calculated by summing DHLNL and Pyr, and are stratified as such. For bar plots, values from each group were averaged and plotted with the SEM. F) Pearson correlation analysis showed a positive correlation between the top 10% of elastic modulus measurements for each tissue and HLCC crosslink abundance. Subsequent statistical analysis was performed with unpaired two-sided Student's t-tests (*P < 0.05; **P < 0.01, ***P < 0.001, ****P < 0.0001). All crosslink values are normalized to total collagen content (i.e. hydroxy proline abundance), starting tissue weight, and plotted as log2 transformed normalized peak areas.

Figure 5.4: Curly and Straightened Invasive Ductal Adenocarcinoma Architectures are Both Associated with Increases in Collagen Crosslinking. A) Second harmonic generation imaging of collagen architectures in normal breast tissue, invasive ductal carcinoma – straight architecture (IDCs), and invasive ductal carcinoma – curly architecture (IDCc). B) Total crosslinks scatter plot. Total values were calculated by summing individual xAAs in C. C) E) Total HLCCs bar plots. Total HLCCs were calculated by summing DHLNL and Pyr, and are stratified as such. For bar plots, values from each group were averaged and plotted with the SEM. For scatter plots, horizontal and vertical bars represent the mean and SEM. Subsequent statistical analysis was performed with unpaired two-sided Student's t-tests (*P < 0.05; **P < 0.01, ***P < 0.001, ****P < 0.0001). All crosslink values are normalized to total collagen content (i.e. hydroxy proline abundance), starting tissue weight, and plotted as log2 transformed normalized peak areas.

Figure 5.5: Triple-Negative Breast Cancer Patients Favor the Formation of Hydroxy Lysine Derived Collagen Crosslinks. A) Quantification of total crosslinks between normal (prophylactic) tissue and tumor, independent of subtype. B) Plot of total crosslinks among various human breast cancer subtypes. C) Scatterplots for individual crosslinks identified from prophylactic breast tissue and breast tumors of varying subtypes. D) Total HLCCs bar plots. Total HLCCs were calculated by summing DHLNL and Pyr, and are stratified as such. For bar plots, values from each group were averaged and plotted with the SEM. E) Correlation maps display the relationships between ECM stiffness, fibrillar collagen content, and HLCC crosslink abundance. Horizontal and vertical bars represent the mean and SEM. All crosslink values are normalized to total collagen content (i.e. hydroxy proline abundance), starting tissue weight, and plotted as log2 transformed normalized peak areas. Subsequent statistical analysis was performed with unpaired two-sided Student's t-tests (*P < 0.05; **P < 0.01, ***P < 0.001, ****P < 0.0001).

Figure 5.6: High LH2 Expression Correlates with Poorly Differentiated Tumors and Cumulative Distant Metastasis-Free Survival in Triple-Negative Patients. A) Tumor samples from incident breast cancer cases in MDCS were collected, and a tissue microarray (TMA) including two 1-mm cores from each tumor was constructed (Beecher, WI, USA). Within the study population (N=910), tumor tissue cores were accessible from 718 patients. Stromal LH2 staining was assessed with the semi-quantitative H-score, which combines intensity and proportion positive assessments into a continuous variable from 0-300. B) Association of stromal LH2 score with tumor differentiation. C) Cumulative breast cancer specific survival (BCSS) was assessed in all breast cancer (BC) patients. D) & E) BCSS curves with patients stratified based on being either lymph node positive or negative. F) PLOD2 expression levels in patients stratified by ER/PR status. G) Distant metastasis free survival (DMFS) curve for ER+ HER2- patients. H) DMFS curve for ER+/- HER2+ patients. I) DMFS curve for ER- HER2- patients. Subsequent statistical analysis was performed with unpaired two-sided Student's t-tests (*P < 0.05; **P < 0.01, ***P < 0.001, ****P < 0.0001).

CHAPTER VI

GENOTYPE TUNES PDAC TENSION TO DRIVE MATRICELLULAR-ENRICHED FIBROSIS AND TUMOR AGGRESSION 2

Introduction

Pancreatic ductal adenocarcinomas (PDACs) are characterized by the development of extensive fibrosis at the primary tumor site. PDAC is the fourth leading cause of cancer-related deaths in the United States and has a 5-year survival rate of less than 5% (275). PDAC fibrosis induces interstitial fluid pressure that disrupts blood vessel integrity and induces hypoxia, which compromises drug delivery and promotes disease aggression and therapy resistance (276-278). Consequently, considerable resources have been expended to develop strategies to reduce the fibrotic burden of PDACs (279). To this end, inhibition of stromal sonic hedgehog (SHH) signaling in a mouse model of PDAC significantly reduced fibrosis and increased intratumoral vascular density to increase drug uptake that, at least transiently, stabilized the disease (280). Similarly, reducing mouse pancreatic tumor hyaluronan, using hyaluronidase, or treating xenografted human pancreatic tumors with an angiotensin inhibitor to reduce tissue tension, decreased interstitial fluid pressure and normalized the vasculature to facilitate chemotherapy response (281, 282). Yet, phase II clinical trials in PDAC patients treated with the SHH inhibitors IPI-926 or GDC-0449, or with a monoclonal antibody against the collagen crosslinking enzyme LOXL2, failed (NCT01472198) (283). Experiments in mouse models of PDAC revealed that, while depletion of proliferating α-smooth muscle actin (αSMA) positive stromal cells reduced fibrosis, the vasculature remained abnormal and the tumor, while reduced, was hypoxic and poorly differentiated, with accelerated mortality (102). Despite a frank reduction in fibrosis and enhancement of tissue vascularity, genetic ablation of SHH or treatment with a smoothened inhibitor induced mouse PDACs that were less differentiated and more, not less, aggressive (284). These data imply that the stroma can both promote and restrain tumor progression, and suggest stromal dependency may be context dependent. Whether such complexity could be explained by distinct tumor genotype - stromal interactions or by the natural evolution of PDACs remains unclear.

2 The work described in this chapter is included in Nature Medicine (ref 94) and has been published here with permission from the editor.

Malignant transformation of an epithelial tissue is universally accompanied by ECM deposition and remodeling (107, 119). Nevertheless, the extent and nature of the fibrosis and the responsiveness of the transformed epithelium to the desmoplastic ECM can vary widely across cancers, amongst tumor subtypes and even within one tumor (285, 286). Indeed, the fibrotic response in patients with aggressive, treatment-resistant, quasi-mesenchymal PDACs (QM-PDA) is less prominent and tumor cells isolated from QM-PDA patients are only marginally anchorage-dependent for their growth and survival (287, 288). By contrast, patients with classical PDACs have a better prognosis and classical PDACs are more differentiated, and tumor cells isolated from these cancers retain Ras dependence and express higher levels of cell adhesion molecules (102).

Although the origins of the QM and classical histophenotypes have yet to be determined, PDAC development has been irrevocably linked to a handful of genetic modifications. Thus, pre-malignant pancreatic lesions (PanINs) frequently possess activating point mutations in the Kras proto-oncogene and PDAC progression correlates with the genetic and/or epigenetic inactivation of the tumor suppressor genes p16INK4a (90%), p53 (75%) and SMAD4 (DPC4, 55%) (289).

Consistently, genetically-engineered mouse models (GEMMs) in which an activated Kras is expressed in the pancreatic ductal epithelium develop PanINs and, when combined with deletion of a single allele of p53, p16INK4a, Smad4 or Tgfbr2, develop PDACs (100, 290-292). Of these genetic modifications, combined Kras mutation and Tgfbr2 deletion yields tumors that are very aggressive and exhibit a mesenchymal-like phenotype following stromal ablation (100, 102). Moreover, the human mesenchymal-like PDAC phenotype most frequently associates with aberrant TGF-β signaling in the epithelium (293). These findings imply that distinct genotypes may dictate unique stromal-epithelial phenotypes. Importantly, as mouse PDACs develop they also increase expression of mesenchymal-like features, as do patients with recurrent PDACs, and ablation of proliferating α-SMA positive cells in Kras/p53 mouse PDAC permits the expansion of mesenchymal-like, aggressive tumors (294-296). These observations suggest that the epithelium likely evolves over time towards a less stromally-dependent phenotype and imply that this evolution may be linked to the engagement of pathways that promote a mesenchymal-like transition.

High grade PDACs express more Sex-determining region Y (SRY)-Box2 (SOX2), a transcription factor that drives an epithelial-to-mesenchymal transition (EMT), with elevated SOX2 levels in PDACs linked to poor patient prognosis (297, 298). Poorly differentiated, mesenchymal-like PDACs also express higher levels and activity of the Hippo transcription factor Yes-associated protein 1 (YAP), and YAP directly induces SOX2 and an EMT (299, 300). These findings suggest QM-PDACs may arise through elevated YAP and SOX2 activity. YAP is exquisitely sensitive to mechanical stimuli such that cells interacting with a stiff ECM activate more ROCK to increase nuclear YAP and induce YAP-dependent gene expression (301, 302). Importantly, PDACs are mechanically-activated tumors composed of a progressively stiffened ECM and high interstitial pressure (276, 281, 282). Thus, the elevated tissue mechanics mediated by the stiffened tissue stroma and high interstitial pressure could eventually activate YAP to drive tumor aggression and induce an EMT. Yet, many oncogenes also induce tissue tension by increasing ROCK-dependent contractility. Indeed, the majority of PDACs have activated Kras, and Kras activity, per se, increases ROCK to drive cell contractility which, in turn, induces ECM remodeling and stiffening to drive integrin-dependent mechanosignaling and malignant transformation (303-305). It is therefore equally plausible that the genotype of the pancreatic tumor epithelium additionally elevates tissue tension to drive tumor progression. Here, we exploit a series of PDAC GEMMs to explore the relationship between tumor genotype, stromal-epithelial interactions, tissue tension and crosslinking in PDAC progression and aggression.


Materials and Methods

Mice Studies

All mice were maintained in accordance with University of California Institutional Animal Care and Use Committee guidelines under protocol number AN105326-01D. Transgenic mouse strains KrasLSL-G12D/+;Ptf1a-Cre (KC) (306), KrasLSL-G12D/+;TGFbR2flox/wt or flox/flox;Ptf1a-Cre (KTC) (100), KrasLSL-G12D/+;TrP53R172H/+;Pdx1-Cre (KPC) (290), LSL-β1V737N (307), Stat3flox/flox (308) and constitutively active Stat3C (309) were described previously. Mice were interbred and maintained on a mixed background. Age-matched littermates not expressing Cre, as well as Ptf1a-Cre and Pdx1-Cre mice, were used as controls.

For FAK inhibition studies, Ptf1a-Cre/β1V737N mice were treated with the FAK inhibitor PND-1186 at 0.5 mg/ml in 5% sucrose in the drinking water; control mice were provided 5% sucrose as drinking water (n = 5 per group). Treatment started at 3 weeks and mice were killed after 3 weeks. For JAK inhibition, KTC mice were treated with the JAK inhibitor Ruxolitinib twice daily by oral gavage at 60 mg/kg body weight in 0.5% methylcellulose in water. Control mice were provided 0.5% methylcellulose in water (n = 5 per group). Treatment started at 3 weeks and mice were killed after 3 weeks of treatment. For orthotopic xenografts, 5 × 10^5 firefly luciferase-mApple expressing cells in Matrigel (BD Biosciences) were injected into the pancreas of 8-week-old nude mice (Simonsen laboratory) (n = 5 per group) and tumor growth was monitored by weekly bioluminescent imaging. For bioluminescent imaging, animals were injected intraperitoneally with 3 mg of D-Luciferin and imaged using an IVIS Spectrum imaging system.


Histology

Paraffin-embedded or fresh-frozen pancreatic tissues were analyzed by H&E, picrosirius red, Masson’s trichrome or alcian blue staining according to the manufacturer’s instructions.

LC-MS/MS and LC-SRM Proteomic Analysis

Proteomic analysis was performed in triplicate on 5 mg of fresh-frozen KC, KPC, and KTC pancreatic tissues as previously described (127). Briefly, tissues were milled in liquid N2 followed by sequential extraction of cellular proteins, soluble ECM proteins, and insoluble ECM proteins. Tryptic digests of all fractions were analyzed by both LC-MS/MS and LC-SRM mass spectrometry. Peptide Spectral Matches (PSMs) from LC-MS/MS were used for relative comparisons of protein abundances between samples. Additionally, Stable Isotope Labeled (SIL) peptides were used for absolute quantification of ECM proteins by LC-SRM.
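The absolute quantification step can be sketched under the usual stable isotope dilution assumption, in which the amount of endogenous (light) peptide equals the light-to-heavy peak-area ratio multiplied by the known amount of spiked SIL (heavy) peptide. The function name, the femtomole unit and the example peak areas are hypothetical; the actual acquisition and processing parameters are those of the cited method (127).

```python
def endogenous_amount_fmol(light_area, heavy_area, heavy_spike_fmol):
    """Estimate the endogenous peptide amount from the light/heavy SRM peak-area ratio."""
    if heavy_area <= 0:
        raise ValueError("heavy (SIL) peak area must be positive")
    if light_area < 0 or heavy_spike_fmol < 0:
        raise ValueError("peak areas and spike amounts cannot be negative")
    # Assumes light and heavy peptides co-elute and ionize identically,
    # so the area ratio equals the molar ratio.
    return (light_area / heavy_area) * heavy_spike_fmol


# Hypothetical transition: light peak twice the size of a 100 fmol heavy spike.
print(endogenous_amount_fmol(light_area=4.0e6, heavy_area=2.0e6, heavy_spike_fmol=100.0))
```

PSM counts, by contrast, support only relative comparisons between samples, which is why both acquisition modes were used.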

Atomic Force Microscopy Measurements

Atomic force microscopy and analysis were performed as previously described (310).

Two-Photon Second Harmonic Microscopy and Analysis

Two-photon imaging of pancreatic tissues and collagen gels was performed and quantified as previously described (311).


Results

To explore the relationship between TGF-β signaling, tissue mechanics, the fibrotic phenotype and tumor aggression, we exploited available PDAC GEMMs. We used a GEMM in which mutant Kras was conditionally expressed in the pancreatic epithelial cells (KrasLSL-G12D/+; Ptf1a-Cre; KC) either alone or in combination with heterozygous mutant P53 (KrasLSL-G12D/+;TrP53R172H/+; Pdx1-Cre; KPC), loss of one allele of P53 (KrasLSL-G12D/+;TrP53flox/wt; Ptf1; KP-Ptf1α-C) or loss of one allele of the TGF-β receptor II (KrasLSL-G12D/+;Tgfbr2flox/wt; Ptf1a-Cre; KTC). As has been previously reported, by 20 weeks KC mice had progressed to PanIN lesions, whereas both the KPC and KTC mice developed frank PDACs (100, 290).

Coincident with tumor formation, the pancreatic tissue of both the KPC and KTC mice was highly fibrotic. PDACs in the KTC mice had significantly thicker collagen bundles that were particularly evident in the region surrounding the tumor epithelium, which atomic force microscopy (AFM) indicated was significantly stiffer than the stroma surrounding the KPC tumor epithelium (Figure 6.1A, B & C). Upon further scrutiny, we noted that the altered fibrillar collagen phenotype and the elevated stromal stiffness in the KTC PDACs were accompanied by a significant increase in levels of matricellular proteins including tenascin C, fibronectin and collagen XII (Figure 6.1D). The epithelium of the KTC PDACs also had higher mechanosignaling, as revealed by more activated β1 integrin and p397FAK, with greater nuclear levels of the mechano-activated transcription factor YAP1 (Figure 6.1A). These findings indicate that loss of pancreatic epithelial TGF-β signaling induces a mechanically-activated, matricellular-enriched fibrotic phenotype.

Importantly, we quantified more pMLC2 and pMyPT1 in the epithelium of the KTC PDACs, implicating loss of TGF-β signaling in the elevated tissue mechanics (Figure 6.1A). Indeed, traction force microscopy (TFM), which quantifies the contractility phenotype of individual cells, revealed that freshly isolated pancreatic KTC tumor cells and cultured pancreatic Kras tumor cells lacking Tgfbr2 expression were significantly more contractile (Figure 6.1E). Further studies confirmed this observation and showed the KTC tumor cells were able to induce more ROCK-dependent collagen gel contraction than isolated cancer cells from KPC or KC mouse tumors (Figure 6.1G). The KTC pancreatic tumor cells also drove more ROCK-dependent collagen gel remodeling and stiffening (Figure 6.1G), more ROCK-dependent fibrosis and tissue stiffening when injected into the pancreas of immune-compromised mice, and activated more Yap as indicated by higher nuclear levels and increased CTGF expression (Figure 6.1G). Not only were the pancreatic tumor cells able to directly promote ROCK-dependent tissue fibrosis, but the phenotype was independent of proliferation, with the non-fibrotic KPC pancreatic tumor cells forming larger, not smaller, tumors than those generated by the control and ROCK-knockdown KTC tumor cells. Thus, although pancreatic transformation is universally accompanied by a progressive fibrosis and stiffening of the ECM, the nature of the fibrotic response and the mechano-phenotype of the cancer can be modified by the genotype of the tumor. In particular, loss of TGF-β signaling in the epithelium can increase cancer cell contractility to "tune" the mechano-fibrotic phenotype of the malignant tissue.


PDACs derived from both KPC and KTC models were characterized by abundant quantities of fibrillar collagen as measured by targeted proteomics (Col1a1, Col1a2, Col5a1). However, total collagen levels of KC, KPC and KTC mice were approximately the same despite KTC tumors showing a higher degree of fibrosis (Figure 6.2A). Further analysis of the ECM proteomics data confirmed that KTC PDAC fibrosis was accompanied by a significant increase in tenascin C, fibronectin and collagen, type XII, alpha 1 (Figure 6.2B, C & D). Multivariate analysis was performed on the targeted proteomics data in order to determine which matricellular proteins contributed to variation in fibrosis between GEMMs (Figure 6.2E). Matricellular proteins were curated from the larger targeted proteomics results and principal component analysis (PCA) was performed. The matricellular scores plot shows that the KC, KPC, and KTC genotypic groups are distinguished from one another based solely on their matricellular protein component. We also asked whether matricellular protein solubility changes. Analysis of individual fractions (cellular, sECM, and iECM) revealed that in KTC lesions, matricellular proteins elute earlier in the extraction process (Figure 6.2F), indicating an increase in protein solubility.
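The solubility comparison can be made concrete with a short sketch that expresses the abundance of a protein in each sequential fraction as a percentage of its total abundance, as described for the solubility plot. The function name and the abundance values are hypothetical; they only illustrate how a shift towards earlier fractions would appear.

```python
def fraction_percentages(cellular, secm, iecm):
    """Percentage of a protein's total abundance recovered in each extraction fraction."""
    amounts = {"cellular": cellular, "sECM": secm, "iECM": iecm}
    if any(value < 0 for value in amounts.values()):
        raise ValueError("fraction abundances must be non-negative")
    total = sum(amounts.values())
    if total == 0:
        raise ValueError("protein was not detected in any fraction")
    return {name: 100.0 * value / total for name, value in amounts.items()}


# Hypothetical abundances of one matricellular protein in two genotypes.
kc_profile = fraction_percentages(cellular=5.0, secm=15.0, iecm=80.0)
ktc_profile = fraction_percentages(cellular=20.0, secm=50.0, iecm=30.0)
print(kc_profile)
print(ktc_profile)
```

In this illustration, a larger share of the protein in the cellular and sECM fractions corresponds to earlier elution, which the text interprets as increased solubility.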

We hypothesized that the increase in stiffness was not necessarily driven by the abundance of fibrillar collagen, per se, but rather resulted from changes in the collagen architecture itself, potentially driven by increased collagen crosslink abundance. Using crosslinked amino acid analysis (xAAA), crosslinking was assessed in KC, KPC, and KTC PDAC lesions. Remarkably, KC (PanIN) lesions displayed the highest abundance of collagen crosslinks, which decreased in KPC and KTC terminal PDACs (Figure 6.2G). While this result was surprising, it was corroborated by solubility analysis of fibrillar collagen which showed a decrease in fibrillar collagen solubility in KPC and KTC PDAC lesions relative to KC lesions (not shown). DHLNL was the most abundant crosslink measured, followed closely by its maturation product, Pyr. Both of the trivalent crosslinks identified, Pyr and dPyr, showed significant decreases in abundance in KTC PDACs relative to KC PanINs.

We also sought to clarify how reduced TGF-β signaling could increase tumor cell tension. G protein coupled receptor (GPCR)-mediated Janus kinase (Jak) activation stimulates activation of both signal transducer and activator of transcription 3 (Stat3) and ROCK and induces actomyosin-mediated cell contractility, and Stat3 has been implicated in PDACs (312, 313). We noted that pStat3 staining was significantly higher in the pancreatic epithelium of the KTC mice compared to KPC or KC mice (Figure 6.3A), even in the eight-week-old PanINs where the amount of infiltrating immune cells is quite low. Although both the stroma and the PDACs in the KPC and KTC mice stained positively for pStat3, coincident with an abundant immune infiltrate (314), pStat3 was significantly higher only in the tumor pancreatic epithelium in KTC mice (Figure 6.3A). We also detected abundant pStat3 in vitro in non-stimulated KTC, but not KPC or KC, cells (Figure 6.3B). KTC tumor cell contraction and remodeling of collagen gels were blocked by treatment with the JAK inhibitor Ruxolitinib (Figure 6.3C & D).


Discussion

While PDACs present with elevated stromal fibrosis, recent findings suggest that collagen abundance may associate with better patient prognosis, with high collagen content correlating with a well-differentiated PDAC phenotype.

Nevertheless, elevated fibrillar collagen has repeatedly been implicated in both PDAC aggression and treatment resistance (102). To address these contradictory perspectives, we examined the relationship between tumor genotype, tissue tension, and fibrosis composition and architecture in PDAC progression and aggression.

Using a series of well-characterized GEMMs, we identified a unique matricellular-stromal signature that associates with PDAC genotypes that elevate pancreatic cell contractility and tumor aggression. This coordination of genetically-induced tumor cell contractility and matricellular-enriched fibrosis "tunes" PDAC tissue tension and ultimately activates Yap to promote tumor aggression. This interplay suggests that bulk collagen content is a poor surrogate for the multifaceted contributions of PDAC fibrosis to cancer aggression. Specifically, genetically-elevated epithelial tension promoted Yap activation and tumor aggression by stimulating Stat3-mediated ECM remodeling and stiffening of the stroma surrounding the developing pancreatic lesions. Consistently, ectopically increasing epithelial mechanosignaling induced a matricellular-rich fibrosis and stiffened the stroma adjacent to the pancreatic epithelium, and activated Yap and Stat3 to induce inflammation and accelerate pancreatic transformation and induce aggression. Mechanistically, genetically increasing epithelial Stat3 activity amplified epithelial tension to drive a matricellular-enriched stiffened phenotype and accelerate malignancy, whereas genetic ablation of epithelial Stat3 reduced epithelial tension, diminished fibrosis and tempered tumor aggression. Our findings provide the first direct evidence that PDAC genotype calibrates tumor cell contractility to modulate the fibrotic phenotype of the tissue and by so doing modifies the pathology of the resultant cancer.

While the magnitude of the fibrotic response derived from each distinct PDAC genotype initially varied, all of the PDAC models used in our studies eventually trended towards the development of fibrosis, albeit with different end-stage fibrotic phenotypes. Thus, although the KPC model demonstrated muted epithelial contractility and initially lower levels of tissue tension compared to the highly fibrotic KTC model, KPC tumor progression was eventually accompanied by an activated stroma that developed into a robust fibrotic response, a progressive stiffening of the ECM and accumulation of high nuclear Yap in at least a subset of pancreatic tumor cells. The data suggest a primary tumor evolution towards the mesenchymal-like features often observed in patients presenting with recurrent PDAC. In fact, mesenchymal-like PDACs express higher levels and activity of YAP which, in turn, feeds back to further promote a pro-fibrotic, mesenchymal-like tumor phenotype both directly and via stimulation of SOX2. Our results thereby suggest that patients presenting with high grade PDAC will exhibit a highly mechanoresponsive epithelial phenotype early in PDAC progression, while patients presenting with low grade disease will exhibit a gradual elevation of tissue fibrosis leading to eventual YAP activation in the epithelial and stromal compartments by end-stage PDAC, as well as in recurring tumors. Given that YAP drives an EMT and mesenchymal-like tumor cells exhibit stromally-independent survival and growth, stromal ablation in PDAC patients with pre-existing disease or aggressive genotypes would predictably fail. Thus, our results may explain why ablation of fibrosis does not always block PDAC progression and, in some instances, actually promotes tumor aggression, and why recent clinical trials using anti-stromal fibrotic agents failed to provide benefit to patients. Indeed, given that Kras-dependency is strongly linked to epithelial status, which diminishes with EMT, and that YAP drives PDAC aggression independently of Kras by activating an EMT-like program (315), most PDAC patients may similarly exhibit resistance to targeted therapies including receptor tyrosine kinases and their downstream effectors and new generation Ras therapies.

Importantly, our studies suggest that novel treatment modalities holistically targeting both epithelial and stromal-driven fibrosis, cellular contractility and YAP activity, such as combinatory FAK/JAK inhibitor cocktails, will likely prove to be a more efficacious therapeutic strategy with which to treat PDAC patients. These observations are consistent with previous implications of YAP in mechanotransduction (316). Additionally, we provide a mechanistic rationale with which to design future therapeutic interventions for patients, since the highly contractile cell phenotype driven by Stat3 signaling exhibits high YAP activity which is reminiscent of the quasi-mesenchymal-like and recurrent patient PDAC phenotype. Taken together with previous findings, our work would then suggest that the use of FAK inhibitors should provide therapeutic benefit for the quasi-mesenchymal and Kras-independent PDAC subtypes with elevated YAP, block progression and recurrence of low grade PDAC, and slow the progression of high grade PDAC.

Human PDAC Pilot Summary

In collaboration with Dr. Wells Messersmith, I have obtained several human PDAC patient samples with matching control tissue. I have performed an initial pilot study of 3 vs 3 (PDAC vs Normal) and have presented the preliminary results here.

Comprehensive proteomic studies with the goal of characterizing differences in ECM composition between normal pancreatic tissue and PDAC lesions have been limited.

In part, this is due to the stroma of PDAC lesions being difficult to solubilize using normal proteomic approaches. Thus, hydroxylamine digestion was applied here to characterize the insoluble matrix (see Chapter 2) with three major goals in mind: 1) comprehensively characterize the stroma of PDAC lesions; 2) assess similarities and differences in ECM composition between human PDAC and murine PDAC lesions; and 3) characterize collagen crosslinking in human PDAC lesions.

In comparing the collagen abundance of PDAC lesions and matching normal tissue, several trends were observed. First, fibrillar collagen (COL1A1, COL1A2, COL3A1, COL5A1/2) abundance is highest in PDAC lesions and represents the most abundant protein group in these samples (Figure 6.4A). Additionally, network collagens (COL6A1/2) are also significantly higher in PDAC lesions compared to normal matching tissue. Perhaps most striking, however, is the drastic increase in the FACIT collagens COL12A1 and COL14A1, which assist in formation of the stromal meshwork, primarily composed of fibrillar collagen. This trend was similar to what was observed in normal pancreas and KTC PDAC lesions in mice. An additional similarity between murine and human data is the decrease in basement membrane collagens in diseased tissue (COL4A1/2). As matricellular proteins were identified as playing a major role in the fibrotic phenotype observed in KTC PDAC lesions, they were also analyzed here. Interestingly, THBS1 and TNC were not detected in normal human pancreatic tissue but were among the most abundant matricellular proteins in PDAC lesions (Figure 6.4B). This was identical to what was observed when comparing normal murine pancreas and KTC PDAC lesions. Based on these findings, we decided to apply multivariate analysis to investigate similarities between the PDAC GEMMs presented earlier in this chapter and human PDAC lesions (Figure 6.4C). The PCA scores plot of all ECM proteins shows several interesting trends. Not surprisingly, normal murine pancreatic tissues at the 5 week and 20 week time points have a similar ECM component abundance and cluster near one another. KC and KPC PanIN and PDAC lesions, respectively, cluster near one another. Interestingly, normal human pancreatic tissue is separated from normal murine tissue based on ECM composition. This may be a result of comparing a true normal pancreas (in mouse) to a normal section of diseased pancreatic tissue (in human). In terms of ECM composition, human and murine PDAC lesions are actually quite similar, as is evident from their spatial relationship in the PCA scores plot (Figure 6.4C). The 20 week KTC PDAC lesions fall right in between the KPC and human PDAC groups. This similarity is primarily driven by the congruent changes that exist in collagen and matricellular protein groups in both murine and human diseases.
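The scores plot discussed above can be illustrated with a minimal sketch of how first-component scores are obtained. This is not the software used for the analysis; it is a dependency-free stand-in that finds the first principal component by power iteration on the sample covariance matrix. The function name, the sample labels and the abundance values are all hypothetical.

```python
import math


def first_principal_component(rows, iterations=500):
    """Return (loadings, scores) for PC1 of mean-centered data via power iteration."""
    n, p = len(rows), len(rows[0])
    if n < 2:
        raise ValueError("at least two samples are required")
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p) of the protein abundances.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Asymmetric start vector, to reduce the chance of starting orthogonal to PC1.
    start_norm = math.sqrt(sum((j + 1) ** 2 for j in range(p)))
    vec = [(j + 1) / start_norm for j in range(p)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break  # no variance in the data
        vec = [x / norm for x in nxt]
    # Scores are the projections of each centered sample onto the loading vector.
    scores = [sum(c[j] * vec[j] for j in range(p)) for c in centered]
    return vec, scores


# Hypothetical abundances (rows = samples, columns = three ECM proteins).
samples = {
    "normal_1": [1.0, 0.5, 4.0],
    "normal_2": [1.2, 0.4, 3.8],
    "pdac_1": [6.0, 5.5, 1.0],
    "pdac_2": [6.3, 5.1, 1.2],
}
loadings, scores = first_principal_component(list(samples.values()))
for name, score in zip(samples, scores):
    print(f"{name}: PC1 score = {score:.2f}")
```

Samples with similar ECM composition receive similar scores and therefore cluster together, which is the sense in which groups are described as lying near one another in the scores plot. The sign of a principal component is arbitrary, so only relative positions are meaningful.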


As we have previously reported that patients with a poor prognosis have increased fibrosis and tissue stiffness, we performed crosslinked amino acid analysis (xAAA) to gain insight into whether increased tissue stiffness is directly associated with increased crosslinking. Overall, human PDAC lesions had significantly more crosslinks than normal control tissue (Figure 6.4D). In particular, hydroxylysine-derived collagen crosslinks (HLCCs), including DHLNL, Pyr and dPyr, exhibited the largest changes. This suggests PDAC lesions favor increases in HLCCs, which are primarily a result of increased lysyl hydroxylase activity. The finding that human PDAC lesions have increased total crosslinks and favor HLCCs is supported by ECM proteomics data, which showed that human PDAC lesions had elevated levels of two enzymes involved in collagen crosslinking, lysyl oxidase (LOX) and lysyl hydroxylase 2 (PLOD2) (Figure 6.4E).

In future studies, I hope to investigate the relationship between hydroxylysine-derived collagen crosslinks and patient outcomes, similar to what was performed in Chapter V. I also plan to analyze additional human PDAC patient samples and their matching control tissue. Furthermore, Dr. Messersmith’s lab has created PDX tumors grown in mice derived from these patients. This is exciting, as I plan to apply species-specific ECM proteomics with QconCATs to interrogate, in the context of PDX tumor growth, the relative contributions of host and tumor to ECM composition. These experiments will provide invaluable information regarding how the stroma is remodeled during PDX tumor progression.


Figure 6.1: PDAC Genotype Tunes Epithelial Tension to Regulate Fibrosis (94). A) SHG images from 20 week KC, KPC or KTC transgenic pancreatic tissues (top); Scale bar, 75 µm. AFM force maps of PDAC ECM (top insert). Immunofluorescence images of p-Mlc2, p-MyPT1 (insert), β1-integrin and p-Ptk2, Yap1 and DAPI; Scale bar, 50 µm. Tenascin C, Fibronectin (insert), Collagen type XII alpha 1 and DAPI; Scale bar, 75 µm. B) Quantification of SHG fibril thickness and distribution around PDAC lesions. C) Distribution of PDAC ECM stiffness measured by AFM. D) Quantification of Tenascin C, Fibronectin and Collagen type XII alpha 1 images shown in (A). E) Quantification of traction force for KC, KPC and KTC cells on 2300 Pa polyacrylamide gels. F) Quantification of mean collagen fiber diameter in three-dimensional collagen gels with KC, KPC or KTC cells or with KTC cells treated with vehicle or ROCK inhibitor Y27632 at 24 hours. G) PR staining of tissue excised from nude mice 3 weeks after injection with KC, KPC, or KTC cells expressing either a control shRNA or an shRNA to Rock1 (top); Scale bar, 75 µm. Immunofluorescence of p-Mlc2, p-MyPT1 (insert), Tenascin C, Yap1 and DAPI; Scale bar, 50 µm. H) Quantification of stiffness of tissue in (E). I) Quantification of total levels of fibrillar collagen. For in vitro bar graphs, 3 technical replicates were performed and results are the mean +/− SEM of 3 independent experiments. For in vivo experiments, n = 5 mice per group. Subsequent statistical analysis was performed with either unpaired two-sided Student’s t-tests or one-way ANOVA with Tukey’s method for multiple comparisons. For survival analysis, the log-rank (Mantel-Cox) test was used (*P < 0.05; **P < 0.01, ***P < 0.001, ****P < 0.0001, “ns” not significant).


Figure 6.2: Targeted Proteomics and Crosslinking Analysis Reveals Changes in Protein Abundance, Solubility and Crosslinking. A, B & C) Characterization of ECM proteins by LC-SRM. Individual bar plots of fibrillar collagen, FACIT collagen, and select structural ECM and matricellular proteins. D) IHC of TNC, FN1, and COL12A1 on KC, KPC, and KTC PDAC lesions (from 6.2). E) Scores plot from principal component analysis (PCA) of matricellular variables quantified by LC-SRM. F) Matricellular protein solubility plot. Plots were generated by analyzing the percentage of a given protein found in the cellular, soluble ECM (sECM) and insoluble ECM (iECM) fractions out of the total abundance. G) Total crosslinks bar plot. Total values were calculated by summing the individual xAAs in (H). H) Scatterplots for individual crosslinks identified from KC, KPC, and KTC lesions, with horizontal and vertical bars representing the mean and SEM, respectively. For bar plots, values from each group were averaged and plotted with the SEM. Subsequent statistical analysis was performed with unpaired two-sided Student’s t-tests (*P < 0.05; **P < 0.01, ***P < 0.001, ****P < 0.0001). All crosslink values are normalized to total collagen content (i.e. hydroxyproline abundance) and starting tissue weight, and plotted as log2 transformed normalized peak areas.
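The normalization stated in the legend can be sketched as follows. The legend does not give the exact formula, so this sketch assumes that each crosslink peak area is divided by both the hydroxyproline peak area and the starting tissue weight before the log2 transform. The function name and all numerical values are hypothetical.

```python
import math


def normalized_crosslink(peak_area, hyp_peak_area, tissue_mg):
    """log2 of a crosslink peak area scaled by hydroxyproline signal and tissue weight."""
    if min(peak_area, hyp_peak_area, tissue_mg) <= 0:
        raise ValueError("all inputs must be positive")
    # Assumed form of the normalization: divide by collagen content (hydroxyproline)
    # and by starting tissue weight, then log2 transform.
    return math.log2(peak_area / (hyp_peak_area * tissue_mg))


# Hypothetical peak areas for one crosslink in two samples of equal starting weight.
sample_a = normalized_crosslink(peak_area=8.0e4, hyp_peak_area=2.0e6, tissue_mg=5.0)
sample_b = normalized_crosslink(peak_area=2.0e4, hyp_peak_area=2.0e6, tissue_mg=5.0)
print(sample_a - sample_b)
```

On a log2 scale, a difference of 1 between two samples corresponds to a two-fold difference in normalized crosslink abundance.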


Figure 6.3: JAK-Stat3 Signaling Drives ECM Remodeling and Stiffening (94). A) Immunofluorescence images and quantification of epithelial and stromal staining in 20 week old KC, KPC and KTC mice stained for pStat3 and DAPI; Scale bar, 50 µm. B) Immunofluorescence images of KC, KPC and KTC tumor cells stained for p-Stat3 and DAPI; Scale bar, 25 µm. C) Quantification of collagen contraction as measured by total collagen gel area in three-dimensional collagen gels with KTC tumor cells for 24 hours either with vehicle or the Jak inhibitor Ruxolitinib. D) Polarized light images and quantification of PR stained color-coded fibrillar collagen diameter on gels from (C). Subsequent statistical analysis was performed with unpaired two-sided Student’s t-tests (*P < 0.05; **P < 0.01, ***P < 0.001, ****P < 0.0001).


Figure 6.4: Human PDAC Lesions are Characterized by Increased Matricellular Protein Abundance and Crosslinking. A) Analysis of collagen abundance (i.e. peptide spectral matches, PSMs) in normal contralateral and PDAC tissue from global LC-MS/MS data. B) Curated list of all matricellular proteins identified in normal pancreas and PDAC tissues from global LC-MS/MS data and plotted based on relative abundance. C) Principal component analysis (PCA) of ECM curated global LC-MS/MS data from human normal and PDAC, KC (PanIN), and KPC and KTC PDACs. D) Total crosslinks and individual xAA scatterplots. Total values were calculated by summing individual xAAs. E) Abundance plots of enzymes involved in collagen crosslinking identified from global LC-MS/MS. For bar plots, values from each group were averaged and plotted with the SEM. Subsequent statistical analysis was performed with unpaired two-sided Student’s t-tests (*P < 0.05; **P < 0.01, ***P < 0.001, ****P < 0.0001). All crosslink values are normalized to total collagen content (i.e. hydroxyproline abundance) and starting tissue weight, and plotted as log2 transformed normalized peak areas.


CHAPTER VII

EXAMINATION OF MATRICELLULAR FIBROSIS AND WOUND HEALING IN A

MODEL OF PANCREATIC DUCTAL ADENOCARCINOMA PROGRESSION

Introduction

The pancreas is an endodermal organ that primarily functions to regulate the digestion of protein and carbohydrates and maintain glucose homeostasis. The exocrine pancreas, which accounts for roughly 80% of the total tissue mass, is composed of a branching network of acinar and duct cells that produce and secrete digestive zymogens into the ductal lumen in response to cues from the stomach and duodenum (317). The endocrine pancreas, which regulates metabolism and glucose homeostasis through the secretion of hormones into the bloodstream, is composed of four specialized endocrine cell types gathered together into clusters called Islets of Langerhans (318). Mirroring the physiologic and cellular diversity of the pancreas is a spectrum of distinct pancreatic malignancies that possess histological and molecular features that recall the characteristics of the various normal cellular constituents. Pancreatic ductal adenocarcinoma (PDAC) is the most common pancreatic neoplasm, accounting for >85% of all pancreatic tumor cases.

Additionally, more than 95% of all pancreatic cancers arise from exocrine elements (i.e. they start in the duct where digestive enzymes are produced), while cancers that arise from endocrine elements (i.e. neuroendocrine tumors and islet cell tumors) account for <5% of cases. Pancreatic cancer is one of the most lethal malignancies, with a median survival of <6 months and a 5-year survival rate of 3-5% (319, 320).


Histologically, a tumor consists of far more than a collection of homogenous cancer cells; it also includes the stroma, which consists of stromal cells and the extracellular matrix (ECM) framework that surrounds and interacts with cancer cells

(119). The fibrosis that develops during PDAC progression compromises drug delivery, impedes immune cell accessibility and promotes disease aggression and therapy resistance (278, 281, 321). Paradoxically, others have shown that the same desmoplastic stroma that confers drug resistance also might reduce the ability of

PDAC cells to invade and metastasize. Adding to the complexity of this disease, phase II clinical trials aimed at reducing tissue fibrosis have not shown an improvement in patient survival (283). In fact, when KTC PDACs in mice were depleted of alpha smooth muscle actin positive myofibroblasts at the early (PanIN) or late stage, cells formed invasive tumors with necrotic regions, and reduced overall survival (102, 284). Taken together, these findings illustrate the complex role of the tumor stroma during tumorigenesis and progression and the need for a deeper understanding of the PDAC ECM microenvironment.

Our inability to eliminate tumors therapeutically by targeting fibrosis is at least in part due to our lack of understanding of the role of tumor microenvironment in tumor pathogenesis and therapy resistance. As such, much of pancreatic cancer research has shifted from investigating the parenchyma to investigating the stroma as evidence has shown that even tumors with identical germline mutations can exhibit diverse stromal phenotypes that can predict tumor aggressiveness (Chapter

VI) (94, 322-324). Indeed, studies on pancreatic cancer genetics and epigenetics have led to the identification of notable genetic alterations, such as Kras, p53,


Smad4, and p16 (317). Importantly, these signature genetic events, combined with accompanying histopathological alterations (i.e. fibrosis), suggest a sequential transformation roadmap of pancreatic cancer from normal pancreatic epithelium to increasing grades of pancreatic intraepithelial neoplasia (PanIN) to, ultimately, invasive PDAC. In support of this, recent studies have demonstrated that the genotype of PDAC actually tunes epithelial tension to regulate fibrosis and accelerate PDAC progression in mice (Chapter VI) (94).

Additionally, other studies have shown that myofibroblasts, whose proliferation is driven by cytokines, are at least partially responsible for the formation of collagen-rich scar tissue, a process that depends on the synthesis and secretion of specific ECM components, such as fibronectin and fibronectin splice variants containing the extra domain-A (ED-A) (95, 325). Fibronectin, fibrinogen, and fibrin are crosslinked together by transglutaminase 2 to form a provisional ECM and a classic wound healing response that perpetuates throughout cancer progression (326). Furthermore, the polymerization of fibronectin into the ECM is required for the accumulation of other types of ECM components, such as Col1 (327). Thus, the temporal and spatial existence of stromal and malignant cell compartments should be considered when evaluating the effect the stromal compartment has on pancreatic cancer progression.

Although the biological impact of pancreatic cancer stroma on tumor cells has been investigated for some time, the molecular mechanisms that underlie stromal formation are not well understood. In part, this is because there is a lack of methods aimed specifically at characterizing a stroma which is highly insoluble and covalently

crosslinked. Here, we have characterized the stroma of normal pancreas and compared it to early and late PDAC stroma to better understand the individual components that contribute to cancer-associated matricellular fibrosis during tumorigenesis and progression in mice. We have also utilized our previously reported mass spectrometry-based approach to characterize collagen crosslinking in normal pancreas and early and late PDAC, providing a more granular view of ECM composition and its implications for architecture.

Materials and Methods

Reagents

Reagents were purchased from Sigma-Aldrich (St. Louis, MO) unless otherwise noted. Sodium chloride was from Acros Organics (). Microcentrifuge tubes and other consumables were from Axygen Inc. (Union City, CA) and RINO Screw

Cap Tubes from Next Advance (Averill Park, NY). Formic acid (FA), trifluoroacetic acid (TFA), and hydroxylamine (NH2OH) hydrochloride were from Fluka (Buchs,

Switzerland). Anhydrous potassium carbonate, guanidine hydrochloride, sodium hydroxide, and acetonitrile (LC-MS grade) were from Fisher Scientific (Pittsburgh,

PA). Trypsin (sequencing grade, TPCK treated) was from Promega (Madison, WI).

Proteomic Sample Preparation

Three biological replicates each of early normal pancreas (5 weeks), late normal pancreas (20 weeks), early KTC PDAC (5 weeks) and late KTC PDAC (20 weeks) were harvested from either normal (C57Bl) mice or genetically engineered KTC (KrasLSL-G12D/+Tgfbr2flox/+Ptf1a-Cre) mice (n=3 each group). Tissues were flash

frozen in liquid nitrogen and powderized using a ceramic mortar and pestle. Tissue was dried overnight in a lyophilizer and weighed tissue (approximately 1 mg of each) was homogenized in freshly prepared high-salt buffer (50 mM Tris-HCl, 3 M NaCl,

25 mM EDTA, 0.25% w/v CHAPS, pH 7.5) containing 1x protease inhibitor (Halt

Protease Inhibitor, Thermo Scientific) at a concentration of 10 mg/mL.

Homogenization took place in a bead beater (Bullet Blender Storm 24, Next

Advance, 1 mm glass beads) for 3 min at 4 ºC. Samples were then spun for 20 min

at 18,000 x g at 4 ºC, and the supernatant was removed and stored as the cellular fraction (Fraction 1).

A fresh aliquot of high-salt buffer was added to the remaining pellet at 10 mg/mL of the starting weight, vortexed at 4 ºC for 15 min, and spun for 15 min. The supernatant was removed and stored as Fraction 2. This high-salt extraction was repeated once more to generate Fraction 3, after which freshly prepared guanidine extraction buffer (6 M guanidinium chloride adjusted to pH 9.0 with NaOH) was added at 10 mg/mL and vortexed for 1 hour at room temperature. The samples were then spun for 15 min, the supernatant removed, and stored as Fraction 4. Fractions

1, 2, & 3 (Cellular) were combined and all fractions were stored at -20 ºC until further analysis. The remaining pellets of each tissue representing insoluble ECM proteins were treated with hydroxylamine.
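
Every extraction buffer above is added at 10 mg/mL of the starting tissue weight, so the buffer volume scales with the weighed dry tissue. The arithmetic can be sketched in Python as follows; the function name and the example mass are illustrative assumptions, not part of the protocol.

```python
def buffer_volume_ul(dry_tissue_mg: float, tissue_conc_mg_per_ml: float = 10.0) -> float:
    """Return the buffer volume (in µL) that gives the target tissue concentration."""
    if dry_tissue_mg <= 0 or tissue_conc_mg_per_ml <= 0:
        raise ValueError("mass and concentration must be positive")
    # mg / (mg/mL) gives mL; multiply by 1000 first to report µL
    return dry_tissue_mg * 1000.0 / tissue_conc_mg_per_ml


# Illustrative only: about 1 mg of lyophilized tissue extracted at 10 mg/mL
print(buffer_volume_ul(1.0))  # 100.0
```

Because each later fraction is prepared relative to the starting weight rather than the remaining pellet, the same volume would apply at every extraction step for a given sample.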

Hydroxylamine (NH2OH) Treatment

Following Fraction 4, pellets were treated with freshly prepared hydroxylamine buffer (1 M NH2OH-HCl, 4.5 M guanidine-HCl, 0.2 M K2CO3, pH adjusted to 9.0 with NaOH) at 10 mg/mL of the starting tissue weight. The samples were briefly vortexed, then incubated at 45 ºC with vortexing for 16 hours. Following

incubation, the samples were spun for 15 min at 18,000 x g, the supernatant removed, and stored as Fraction 5 at -80 ºC until further proteolytic digestion with trypsin. The final pellet was stored at -80 ºC until further analysis.

Trypsin Digestion

100 µL of the Cellular fraction (combined Fractions 1, 2 and 3) and 200 µL of Fractions 4 & 5 of all samples were subsequently subjected to reduction, alkylation, and enzymatic digestion with trypsin. 100 fmol of each SIL peptide (170 peptides total) was spiked into 100 µL of sample to allow for four injections per sample (50 fmols eQ 1-6 per injection). A filter-aided sample preparation (FASP) approach, as well as C18 cleanup, was performed as previously described (42).

LC-SRM Analysis

Samples were analyzed by LC-SRM and LC–MS/MS as described (17).

Equal volumes from each post-digestion sample were combined and injected every third run and used to monitor technical reproducibility.

LC-MS/MS Analysis

Samples were analyzed on a Q Exactive HF mass spectrometer (Thermo

Fisher Scientific) coupled to an EASY-nanoLC 1000 system through a nanoelectrospray source. 8 μL of sample was injected. The analytical column (100

μm i.d. × 150 mm fused silica capillary packed in house with 4 μm 80 Å Synergi

Hydro C18 resin (Phenomenex; Torrance, CA)) was then switched on-line at 600 nL/min for 10 min to load the sample. The flow rate was adjusted to 350 nL/min, and peptides were separated over a 120-min linear gradient of 2–40% ACN with 0.1%


FA. Data acquisition was performed using the instrument supplied Xcalibur™

(version 2.1) software. The mass spectrometer was operated in positive ion mode.

Full MS scans were acquired in the Orbitrap mass analyzer over the 300–1800 m/z range with 60,000 resolution at m/z 400. Automatic gain control (AGC) was set at

5E+05 and the ten most intense peaks from each full scan were fragmented via

HCD with normalized collision energy of 35. MS2 spectra were acquired in the

Orbitrap mass analyzer with 15,000 resolution. All replicates of each tissue were run sequentially and pre-digested yeast alcohol dehydrogenase standard (nanoLCMS

Solutions LLC, Rancho Cordova, CA) was run between tissue groups to monitor drift in analytical performance.

Proteomic Data Analysis

Skyline was used for method development and to extract the ratio of endogenous light peptides to heavy internal standards from targeted LC-SRM data for protein quantification as described (43). Global LC–MS/MS data were processed as previously described (12). Limits of detection, quantification, and dynamic range were determined for each peptide as previously described (11). Principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA) were performed using MetaboAnalyst with sum and range scaling normalizations (44).

xAAA Sample Preparation

Approximately 5 mg of dried tissue was re-suspended in 1mg/mL NaBH4

(prepared in 0.1N NaOH) in 1X PBS for 1 hour at 4°C with vortexing. The reaction was then neutralized by adding glacial acetic acid to a final concentration of 0.1% (pH


~3-4) (254). The sample was then centrifuged at 18,000 x g for 20 minutes at 4°C. The supernatant was removed and the pellet was washed three times with 18 MΩ H2O to remove any residual salt or acetic acid that may interfere with downstream LC-MS/MS analysis. The remaining pellet was dried using a lyophilizer system.

Protein Hydrolysis

The dried sample was placed in a glass hydrolysis vessel and hydrolyzed in a volume of 6N HCl, 0.1% phenol. The hydrolysis vessel was flushed with N2 gas, sealed and placed in a 110°C oven for 24 hours. After hydrolysis, the sample was cooled to room temperature and then placed at -80°C for 30 minutes prior to lyophilization. The dried sample was re-hydrated in 200 µL of 18 MΩ H2O for 5 minutes, then 200 µL of glacial acetic acid for 5 minutes and finally 800 µL of butan-1-ol for 5 minutes. Importantly, 10 µL of sample was removed after re-hydration in water and saved for determination of hydroxy proline content. Samples were enriched with cellulose SPE.

xAAA Data Analysis

Peak areas of detected crosslinked amino acids were normalized to total collagen content (i.e. hydroxy proline abundance) and starting tissue (dry) weight, and plotted as log2 transformed normalized peak areas. After normalizing the crosslinks that were identified by xAAA, the total abundance of all crosslinks in each sample group (n=3 each group) was summed and plotted as total crosslinks per total fibrillar collagen. Further details can be found in Chapter VI and Appendix A, which describes the method in detail.
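
The normalization described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the analysis code used in this work: it assumes that normalization means dividing each peak area by the product of hydroxy proline abundance and dry tissue weight, and that the total is summed before the log2 transform. The function names and all numeric values are hypothetical.

```python
import math


def normalize_crosslinks(peak_areas, hyp_abundance, dry_weight_mg):
    """Divide each crosslink peak area by hydroxy proline abundance and dry weight.

    Assumption: both normalizers are applied as a single product in the denominator.
    """
    if hyp_abundance <= 0 or dry_weight_mg <= 0:
        raise ValueError("normalizers must be positive")
    denominator = hyp_abundance * dry_weight_mg
    return {name: area / denominator for name, area in peak_areas.items()}


def total_crosslinks(normalized):
    """Sum the normalized areas of all individual crosslinks in one sample."""
    return sum(normalized.values())


def log2_transform(normalized):
    """Log2-transform normalized areas for plotting; non-positive values are skipped."""
    return {name: math.log2(v) for name, v in normalized.items() if v > 0}


# Made-up peak areas for one sample (not measured data)
sample = {"LNL": 4.0e6, "HLNL": 8.0e6, "DHLNL": 1.6e7}
norm = normalize_crosslinks(sample, hyp_abundance=2.0e6, dry_weight_mg=5.0)
print(log2_transform(norm))
print(math.log2(total_crosslinks(norm)))
```

Keeping the normalization, summation, and transform as separate steps makes it easy to change the order if the total is instead computed from transformed values.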


Database Searching and Protein Identification

MS/MS spectra were extracted from raw data files and converted into .mgf files using MSConvert (ProteoWizard, Ver. 3.0). Peptide spectral matching was performed with Mascot (Ver. 2.5) against the UniProt mouse database (release 201701). Mass tolerances were ±10 ppm for parent ions and ±0.2 Da for fragment ions. Trypsin specificity was used for cellular and sECM fractions, allowing for 1 missed cleavage. For the iECM fraction, cleavage C-terminal to Asn (N) and trypsin specificity were used. Met oxidation, Pro hydroxylation, protein N-terminal acetylation, and peptide N-terminal pyroglutamic acid formation were set as variable modifications with Cys carbamidomethylation set as a fixed modification. Scaffold (version 4.4.6, Proteome Software, Portland, OR, USA) was used to validate MS/MS based peptide and protein identifications. Peptide identifications were accepted if they could be established at greater than 95.0% probability as specified by the Mascot scoring algorithm. Protein identifications were accepted if they could be established at greater than 99.0% probability and contained at least two identified unique peptides.
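
The acceptance criteria above amount to a simple filter, sketched here in Python. This is not Scaffold's interface or export format; the `Peptide` and `ProteinHit` classes, the function names, and the example accessions are hypothetical. The sketch also assumes that "unique peptides" means distinct peptide sequences.

```python
from dataclasses import dataclass, field


@dataclass
class Peptide:
    sequence: str
    probability: float  # identification probability on a 0-1 scale


@dataclass
class ProteinHit:
    accession: str
    probability: float  # protein identification probability on a 0-1 scale
    peptides: list = field(default_factory=list)


def accepted_sequences(hit, min_peptide_prob=0.95):
    """Distinct peptide sequences whose probability exceeds the peptide threshold."""
    return {p.sequence for p in hit.peptides if p.probability > min_peptide_prob}


def accept_protein(hit, min_protein_prob=0.99, min_peptide_prob=0.95, min_unique=2):
    """Apply the >99% protein, >95% peptide, and >=2 unique peptide criteria."""
    enough_peptides = len(accepted_sequences(hit, min_peptide_prob)) >= min_unique
    return hit.probability > min_protein_prob and enough_peptides


# Hypothetical hits for illustration only
hits = [
    ProteinHit("PROT_A", 0.999, [Peptide("AAK", 0.99), Peptide("GGR", 0.97)]),
    ProteinHit("PROT_B", 0.999, [Peptide("AAK", 0.99), Peptide("AAK", 0.98)]),
    ProteinHit("PROT_C", 0.95, [Peptide("LLK", 0.99), Peptide("VVR", 0.99)]),
]
print([h.accession for h in hits if accept_protein(h)])  # ['PROT_A']
```

In this example PROT_B fails because its two matches share one sequence, and PROT_C fails the protein-level probability threshold.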

H&E Staining

Paraffin-embedded or fresh-frozen pancreatic tissues were analyzed by

H&E, according to the manufacturer’s instructions.

Picrosirius Red Staining and Quantification

Flash frozen FFPE tissues were cryo-sectioned at 5 µm, fixed in 4% neutral buffered formalin and stained using 0.1% picrosirius red (Direct Red 80, Sigma) and counterstained with Weigert’s hematoxylin, as previously described (98). Polarized

light images were acquired using an Olympus IX81 microscope fitted with an analyzer (U-ANT) and a polarizer (U-POT, Olympus) oriented parallel and orthogonal to each other. Images were quantified using ImageJ as previously described (98).

Results

Comparative analysis of normal pancreatic stroma to PDAC stroma has yet to be reported using a semi-quantitative proteomics approach. In part, this is because isolation of tissue stroma requires a non-standard enrichment approach involving decellularization steps to remove bulk cellular components that are solubilized by detergents. Post-decellularization, a chaotrope is often used to facilitate protein denaturation and the solubilization of a fraction of the remaining ECM components.

After this extraction, an insoluble pellet remains that contains a significant amount of protein based on previously reported studies (127). For the solubilization of the post-chaotrope insoluble pellet, we applied our recently published protocol for chemical digestion using hydroxylamine (NH2OH), which cleaves somewhat non-specifically at asparagine residues. The sample preparation protocol yields three distinct fractions: 1) cellular, 2) soluble ECM (sECM), and 3) insoluble ECM (iECM). Fractionating the tissue in this way allows us to 1) individually analyze cellular, soluble ECM and insoluble ECM components, and 2) perform relative solubility analysis, which can indicate changes in matrix architecture or post-translational status.

To infer changes about the abundance of exocrine, endocrine, ductal, and stromal cell populations during tumor progression, pancreatic cell markers were quantified from global proteomics experiments. A shift in the cell populations

between normal pancreas and PDACs was observed in just 5 weeks’ time (Figure

7.1A). First, these data show that exocrine cell population markers (e.g. Cela2a,

Cpa1, Prss2, Ctrb1) are approximately 6-fold more abundant in normal pancreatic tissue relative to PDAC. These cells are integral to normal pancreatic function as they produce pancreatic enzymes for digestion, and make up about 95% of cellular pancreatic tissue. The remaining 5% of cells consists of endocrine cells that produce hormones that regulate blood sugar and pancreatic secretions. Although less abundant than exocrine cell population markers, endocrine cell population markers (e.g. Gcg and Ins2) were present in the normal pancreas at 2-fold higher abundance relative to KTC PDACs. Importantly, even in early (5 week) KTC PDACs, marked decreases in the abundance of exocrine and endocrine markers were observed as pancreatic cancer cells expanded. As exocrine/endocrine populations decrease, cancer cell

(i.e. malignant ductal cells) populations expand and begin to release various factors that stimulate the stroma. Stromal cells, in turn, release mitogenic factors that stimulate tumor growth, invasion and resistance to therapy – thereby creating a positive feedback loop that supports disease progression. Accompanying the sharp decreases in exocrine and endocrine cell markers is a 9-fold increase in ductal (i.e.

Krt20, Tpm2, and Krt7) and a 2.7-fold increase in stromal (i.e. Gfap and Vim) cell markers, relative to normal pancreatic tissue. Interestingly, these data also revealed that KTC tumors retain a relatively high abundance of exocrine cell markers early on

(i.e. 5 weeks), but in later stage KTC tumors (i.e. 20 weeks) these exocrine cell populations have decreased more than 10-fold.
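
The fold changes quoted above compare marker abundance between tissue groups. One way such a value could be computed is sketched below in Python, assuming abundance is summarized as the mean across replicates and then across the markers in a class; that summary, the function name, and the numbers are illustrative assumptions, not the measured data.

```python
from statistics import mean


def marker_fold_change(group_a, group_b, markers):
    """Ratio of mean marker abundance in group_a to that in group_b.

    Each group maps a marker name to a list of replicate abundances.
    """
    mean_a = mean(mean(group_a[m]) for m in markers)
    mean_b = mean(mean(group_b[m]) for m in markers)
    if mean_b == 0:
        raise ZeroDivisionError("reference group has zero mean abundance")
    return mean_a / mean_b


# Made-up replicate values (n=3 per group) for two exocrine markers
normal = {"Cela2a": [60.0, 66.0, 54.0], "Cpa1": [120.0, 114.0, 126.0]}
ktc = {"Cela2a": [10.0, 12.0, 8.0], "Cpa1": [20.0, 18.0, 22.0]}
print(marker_fold_change(normal, ktc, ["Cela2a", "Cpa1"]))  # 6.0
```

Averaging within each marker first keeps a single high-abundance marker with an outlier replicate from dominating the class-level ratio.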


Comparison of semi-quantitative proteomic cell population marker data using partial least squares-discriminant analysis (PLS-DA) revealed that, based on these markers, normal pancreas, early KTC, and late KTC tissues cluster apart from one another, with PC1 accounting for 91.5% of the variance between groups (Figure

7.1B). Not surprisingly, early normal pancreas at 5 weeks overlaps substantially with late normal pancreas at 20 weeks and the little separation that is seen is driven by slight differences in endocrine (i.e. Gcg and Ins2) protein marker abundance.

Generally, the separation that is seen between groups is due to the sharp decreases in endocrine and exocrine cell populations and sharp increases in ductal and stromal cell populations from tumorigenesis and onwards through tumor progression. More specifically, a large degree of separation between early and late KTC tumors is observed. The loadings plots revealed that the retention of exocrine cell markers (i.e.

Ctrb1) in early KTC tumors and progressive increases in the ductal cell population in late KTC tumors are largely responsible for the distinction.
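
The Methods state that the multivariate analyses were run in MetaboAnalyst with sum and range scaling normalizations. The Python sketch below illustrates what those two steps are commonly understood to do, and it is an interpretation rather than MetaboAnalyst's own code: each sample is divided by its total signal, and each variable is then mean-centered and divided by its range. The function names and the toy matrix are hypothetical.

```python
def sum_normalize(matrix):
    """Divide every value in a sample (row) by that sample's total signal."""
    normalized = []
    for row in matrix:
        total = sum(row)
        if total == 0:
            raise ValueError("sample has zero total signal")
        normalized.append([value / total for value in row])
    return normalized


def range_scale(matrix):
    """Mean-center each variable (column) and divide it by its range (max - min)."""
    n_samples = len(matrix)
    scaled_columns = []
    for column in zip(*matrix):
        column_mean = sum(column) / n_samples
        column_range = max(column) - min(column)
        if column_range == 0:
            # A constant variable carries no information; set it to zero
            scaled_columns.append([0.0] * n_samples)
        else:
            scaled_columns.append([(v - column_mean) / column_range for v in column])
    # Transpose back to samples x variables
    return [list(row) for row in zip(*scaled_columns)]


# Toy matrix: 3 samples (rows) x 2 proteins (columns), made-up values
data = [[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]]
print(range_scale(sum_normalize(data)))
```

Range scaling gives every protein the same spread, so low-abundance markers can contribute to group separation as much as highly abundant ones.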

To further investigate the flux from exocrine/endocrine cell populations to more ductal and stromal cell populations, the morphologic changes that form the basis of contemporary PDAC diagnoses were compared. Regions with confirmed malignant transformation were identified on H&E stained sections and revealed stark differences in cellular morphology (Figure 7.1C). Further, because PDACs are characterized by a highly active stroma, we also sought to investigate the relationship between fibrillar collagen and PDAC progression. Evidence of fibrosis in KTC PDACs was indicated by polarized imaging of picrosirius red stained parallel sections, which revealed that these collagens were progressively assembled into thicker fibers between early and late tumor timepoints.

Based on these observations it was hypothesized that accompanying the changes in cell populations would be commensurate changes in the composition of the ECM component of the stroma. Normal pancreas and KTC PDAC tissues were extracted and fractionated as previously described into cellular, soluble ECM, and insoluble ECM fractions (Chapter II). Individual protein fractions were digested with trypsin and resultant peptides were analyzed by fraction using data-dependent acquisition mode on a Q Exactive HF mass spectrometer. After combining the data from individual fractions (i.e. cellular, sECM, and iECM fractions) and curating the list of 3353 identified proteins to separate ECM and ECM-related proteins (~100 proteins), the top 25 most significant changes were plotted in a heatmap (Figure

7.2A). The data show that there are large differences in several classes of proteins.

Most notably, we see a progressive loss of basement membrane proteins (i.e.

Lamb2, Col4a2, and Lamc1) from normal pancreas through KTC PDAC progression.

This is consistent with the observed destruction of the basement membrane as normal cellular organization is lost to facilitate invasion and metastasis. Of note, a stark increase in matricellular (i.e. Tnc, Thbs1), ECM regulator (i.e. Lox, Tgm2,

Plod2), and structural ECM (i.e. Fn1) protein abundance was observed.

Early and late normal pancreas are distinguished from early and late KTC

PDACs based solely on their stromal composition using multivariate analysis

(Figure 7.2B). PC1 accounts for 55.1% of the variance seen between groups. In terms of functional classes of ECM proteins, we see that the variance observed in

the PLS-DA plot is primarily driven by decreases in basement membrane abundance, and increases in matricellular, collagen, and ECM regulator protein abundance (Figure 7.2C). These data fit well with cell population data which indicated large increases in stromal cell populations. Additionally, when comparing early to late KTC PDACs we observe decreases in structural ECM (i.e. Fn1) and matricellular proteins (i.e. Tnc, Thbs1), as well as further increases in collagen and

ECM regulator protein abundance (Figure 7.2D).

KTC PDACs exhibit an approximately 3-fold increase in fibrillar collagen (i.e. Col1a1, Col1a2, Col2a1, Col3a1, and Col5a1/2) after just 5 weeks of tumor progression (Figure 7.2D), a remarkable example of how fast an activated stroma can be remodeled, relative to normal tissue. Notably, fibril-associated collagens with interrupted triple helices (FACITs) were either not detected (Col12a1) or detected at a relatively low abundance (Col14a1) in normal pancreatic tissue; however, their abundance increased dramatically during tumor progression.

Between early and late KTC PDACs, additional ECM remodeling is also evident and highlights the stromal dynamics during disease progression.

Based on the observed changes in collagen during the course of tumor progression, it was hypothesized that these changes in abundance would alter the solubility of the stroma. To answer this question, the percentage of total collagen found in the cellular, sECM, and iECM fractions was compared between normal and KTC groups (Figure 7.2E & F). This approach revealed changes in the solubility profiles of normal pancreas and KTC PDACs. In conjunction with an overall decrease in abundance of basement membrane components, a progressive decrease in basement membrane solubility, from 26.4% present in the iECM fraction in early normal pancreas to 35.3% present in the iECM fraction in early KTC PDAC, was observed. The solubility of matricellular proteins and collagens stays consistent during tumor progression. However, if we stratify fibrillar collagens away from total collagen we observe some unique trends (Figure 7.2D). For instance, in comparing normal early pancreas with normal late pancreas, we observed a 12.9% decrease in the solubility of fibrillar collagen. This is likely due to the formation of a crosslinked stroma during normal growth and development. Interestingly, the same comparison in KTC progression shows a 5.4% decrease in fibrillar collagen solubility in later stage PDACs. These data indicate that more fibrillar collagen is deposited in the stroma in KTC PDACs than in normal pancreatic tissue, but it is in fact more soluble.
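
The relative solubility analysis above rests on the share of a protein class recovered in each fraction. A minimal Python sketch of that calculation follows. It assumes that a change in solubility is reported as the percentage-point change in the iECM share; the function names and all counts are hypothetical and are not the values behind Figure 7.2.

```python
def fraction_percentages(abundance_by_fraction):
    """Percentage of a protein class recovered in each fraction of one sample."""
    total = sum(abundance_by_fraction.values())
    if total <= 0:
        raise ValueError("total abundance must be positive")
    return {name: 100.0 * value / total for name, value in abundance_by_fraction.items()}


def insoluble_shift(reference, comparison, fraction="iECM"):
    """Percentage-point change in the share found in the insoluble fraction.

    A positive value means more of the class sits in the insoluble fraction,
    i.e. the class has become less soluble.
    """
    return fraction_percentages(comparison)[fraction] - fraction_percentages(reference)[fraction]


# Made-up abundances (e.g. summed PSMs) for one protein class
early = {"cellular": 50.0, "sECM": 100.0, "iECM": 50.0}
late = {"cellular": 40.0, "sECM": 80.0, "iECM": 80.0}
print(fraction_percentages(early)["iECM"])  # 25.0
print(insoluble_shift(early, late))         # 15.0
```

Expressing each fraction as a share of the total makes samples with very different overall collagen content comparable.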

Another characteristic of the PDAC stroma that was observed was a classic wound healing response. Fibronectin isoforms and fibrinogen chains increase significantly in abundance during the course of tumor progression (Figure 7.2A) and are classic markers of the early phases of wound healing and the formation of a provisional ECM (95). These proteins are abundant at the wound site and become part of a crosslinked ECM. In part, this allows resident stromal cells, red blood cells, and platelets to adhere and begin to repair the wound. Interestingly, another ECM crosslinking enzyme, Tgm2, increased 5-fold during tumor progression, as did several known substrates (i.e. Fn1, Fga, Fgb, Fgg). Additionally, Tgm2 decreases substantially in solubility (27% decrease) in PDACs relative to normal pancreas tissue. This can be explained by the observation that Tgm2 auto-crosslinks to structural ECM components such as Fn1 (328, 329). The fibrinogens decrease in solubility in early KTC PDACs but return to approximately normal levels in late KTC PDACs, suggesting progression past the provisional ECM stage.

Matricellular-enriched fibrosis has been previously reported to increase tumor cell tension and accelerate PDAC progression in mice (Chapter VI) (94). In this KTC tumor progression study, a strong matricellular phenotype was observed. This phenotype was characterized by major changes in almost all proteins defined as matricellular (85, 91). Matricellular proteins in normal pancreatic tissue (including early and late tissues) were much less abundant and/or below the limit of detection in our assay. However, after just five weeks of tumor progression a strong matricellular ECM phenotype is observed (Figure 7.3A). Tnc abundance is highest in early KTC samples and then decreases approximately 5-fold by the KTC 20 week timepoint. This suggests that Tnc may be integral in the early formation of matricellular fibrosis. Tnc has also previously been identified as a metastatic niche component involved in colonizing the lung during breast cancer metastasis (330,

331). Using PLS-DA to sort variables based on their contribution to the observed variance, we can determine which variables are contributing to the separation seen in the PLS-DA scores plot (Figure 7.3B). The contribution of small leucine-rich proteoglycans (SLRPs), such as biglycan, lumican, and decorin is not surprising as the expression of these proteins is directly related to collagen expression, which increases progressively during KTC tumor progression.

Tenascin-C expression is upregulated in pancreatic cancer and correlates with cell differentiation, cell growth and motility, and affects cell adhesion through activation of integrin signaling (332, 333). It has also been shown to be a major component of matricellular tumor fibrosis in a genotype specific manner (94). Thus, we used our proteomic dataset to determine the strength with which matricellular proteins correlated with Tnc expression (Figure 7.3C). Indeed, almost all matricellular proteins showed a positive correlation with Tnc, with several proteins

(i.e. Fbln2, Thbs1/2, Sparc, Lgals9, and Spp1) having calculated correlation coefficients above 0.75.
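
The correlation screen above can be illustrated with a short Python sketch. The text does not state which correlation coefficient was used, so Pearson's r is assumed here; the function names and abundance values are hypothetical and chosen only to show the mechanics of the 0.75 cutoff.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length abundance profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def correlated_with(reference, profiles, threshold=0.75):
    """Proteins whose profile correlates with the reference above the threshold."""
    ref_profile = profiles[reference]
    result = {}
    for name, values in profiles.items():
        if name == reference:
            continue
        r = pearson_r(ref_profile, values)
        if r > threshold:
            result[name] = r
    return result


# Made-up abundance profiles across four samples
profiles = {
    "Tnc": [1.0, 2.0, 3.0, 4.0],
    "Sparc": [2.0, 4.0, 6.0, 8.0],
    "Lamb2": [4.0, 3.0, 2.0, 1.0],
}
print(sorted(correlated_with("Tnc", profiles)))  # ['Sparc']
```

With only a handful of samples per group, coefficients near the cutoff should be read cautiously; a rank-based coefficient would be a reasonable alternative if the abundance data are skewed.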

In light of the findings regarding the abundance and solubility of fibrillar collagen, we hypothesized that changes in fibrillar collagen deposition and architecture would be accompanied by changes in collagen crosslinking by the LOX family of enzymes. We first analyzed the percentage of insoluble fibrillar collagen found in the iECM fraction of normal pancreatic tissue and KTC PDACs (Figure

7.4A) and found that insoluble fibrillar collagen abundance in the iECM fraction is highest in Normal late pancreas and lowest in KTC late PDAC. These findings supported the observed changes in solubility and provided further evidence for commensurate changes in collagen crosslinking. To assess collagen crosslinking we have applied crosslinked amino acid analysis (xAAA) to determine the abundance of crosslinks in these tissues. Briefly, tissue hydrolysates were enriched using cellulose chromatography and the resultant eluate was analyzed via LC-MS/MS. We quantified hydroxy proline using two measures 1) total abundance of hydroxy proline amino acid in tissue hydrolysates via xAAA and 2) total counts of hydroxy proline modified peptides from proteomics data (Figure 7.4B). Not surprisingly, these data demonstrate that increased fibrillar collagen is accompanied by increased total abundance of hydroxy proline, and hydroxy proline modified collagen peptides


(Figure 7.4C). The data demonstrate that during tumor progression there is a decrease in the total crosslinks per total fibrillar collagen or hydroxyproline between normal pancreatic tissue and PDAC. Similarly, individual divalent crosslinks such as lysinonorleucine (LNL), dihydroxy lysinonorleucine (DHLNL), and hydroxy lysinonorleucine (HLNL) showed a progressive decrease in crosslink abundance, with the most significant differences between normal pancreatic tissue and KTC PDACs (Figure 7.4D). Of note, the trivalent crosslink pyridinoline (Pyr) was not detected in normal early and normal late pancreas groups but was detected in early and late KTC PDACs.

Discussion

Here we have provided characterization of the changes in stromal composition that occur during PDAC progression at an early and late timepoint.

Global proteomics data were used to identify compositional and crosslinking post-translational variations in the matrisome of these tumors relative to normal pancreatic tissue during tumor progression. Combining our ECM proteomics approach with crosslinking analysis provides a more granular view of stromal remodeling during tumor progression. Leveraging these techniques, we have been able to identify a unique provisional matrix in PDAC that is characterized by a wound healing response, increased matricellular protein abundance, increased solubility of fibrillar collagen, and decreased total collagen crosslinking per total collagen.

One advantage to the sequential fractionation approach with hydroxylamine chemical digestion is that we have achieved complete solubilization of the insoluble

stroma based on visual inspection. This is an important point because the insoluble pellet remaining after chaotrope extraction is often overlooked, and therefore underrepresented in proteomics datasets (334, 335). Achieving this level of stromal solubilization allows us to assess total stromal composition across fractions and characterize the amount of protein that is extracted at each step.

Furthermore, this approach allowed us to fully characterize the flux of cell populations between normal pancreatic tissue (i.e. exocrine and endocrine cells) and malignant PDAC tissue (i.e. ductal and stromal cells). For instance, we found that

KTC PDACs at 5 weeks still retained some of their exocrine cell populations; however, by 20 weeks these had been almost completely depleted and replaced with increasing populations of ductal and stromal cells.

Characterization of soluble and insoluble ECM fractions showed that after just

5 weeks of tumor growth the most abundant matrisome component, fibrillar collagen, had increased 3-fold relative to normal pancreatic tissue, owing to the rapid increase in stromal cell populations seen concurrent with tumorigenesis. Remarkably, the fibrosis induced by this excessive fibrillar collagen deposition appears to have a more soluble fibrillar collagen architecture as is evident through solubility analysis.

Instead, other major stromal collagens such as the FACIT collagens, Col12a1 and Col14a1, appear to play a larger role than has previously been appreciated. In normal pancreatic tissue, Col12a1 was undetected, while Col14a1 was detected at a

10-fold lower level than was observed in KTC PDACs. Of note, FACIT collagens do not by themselves form fibrils, but rather associate with the surface of existing collagen fibrils to create bridges between fibrils. This finding has implications about

how the PDAC collagen architecture is organized in a fundamentally different way from that of normal pancreas. This may facilitate invasion and metastasis and has been corroborated by our recent work which characterized the stroma of genotypically distinct PanIN and PDAC mouse models, and identified Col12a1 as a major stromal component in the remodeled tumor microenvironment (Chapter VI).

Perhaps most striking is the change in the matricellular protein phenotype found during tumorigenesis and progression. Matricellular protein abundance is very low or undetectable in normal pancreatic tissue. However, almost all matricellular proteins demonstrated an increased abundance during tumor progression. In particular, SLRPs such as biglycan, decorin and lumican appear to contribute significantly to the matricellular composition of these tumors, a likely result of increased collagen biosynthesis. Additionally, tenascin-C was found to be one of the most abundant matricellular proteins, with the highest abundance found in early KTC PDACs. We found that almost all matricellular proteins analyzed here showed a strong positive correlation with tenascin-C expression. It has been shown previously that tenascin-C expression is upregulated in human PDAC and correlates with differentiation. Furthermore, there is evidence to suggest that tenascin-C is secreted in exosomes and helps to initiate a pre-metastatic niche in the liver (332, 336).

ECM analysis also revealed a classic wound healing response and formation of a provisional matrix involving fibronectin and fibrinogen. It has been shown that a provisional ECM provides the proper microenvironment for resident and invading cells to proliferate and migrate during the repair process (337). It is likely that this process supports the migration of cancer-associated fibroblasts and myofibroblasts to the tumor microenvironment, where they synthesize and secrete other provisional ECM molecules, such as hyaluronan and proteoglycans, which interact with and stabilize the fibrin-enriched ECM. The matrix is likely further stabilized by transglutaminase crosslinks between fibronectin and fibrinogen as a part of the wound healing response. We believe that chronic fibrosis in these tumors further perpetuates the wound healing response and supports PDAC progression in this model. We have previously identified tenascin-C and fibronectin as major players in the formation of matricellular fibrosis in three different genetically engineered mouse models of PDAC formation (Chapter VI) (94).

We also investigated collagen crosslinking in the context of PDAC progression using crosslinked amino acid analysis. Our results suggest that total crosslinking per total collagen is reduced during tumor progression relative to normal pancreatic tissue. There could be several reasons for this observation. While the KTC model of PDAC progression mimics the fibrotic component of human PDACs well, organ fibrosis proceeds more rapidly than what is observed in human disease. It is possible that crosslinking enzymes such as lysyl oxidase (Lox) and procollagen-lysine,2-oxoglutarate 5-dioxygenase 2 (Plod2) cannot keep up with the massive amount of fibrillar collagen that is being synthesized by activated stromal cells like cancer-associated fibroblasts and myofibroblasts. Thus, because there is three-fold more fibrillar collagen in KTC PDACs, we believe these tumors are still likely to have more crosslinks in total, but not per total collagen.
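
The reasoning above, that crosslinks can fall per unit of collagen while rising in total, can be made concrete with a short calculation. The sketch below is illustrative only: the three-fold collagen increase comes from the results above, whereas the crosslink densities (`normal_density`, `ktc_density`) and the function name are hypothetical placeholders, not measured values.

```python
# Illustrative arithmetic only. The crosslink densities below are hypothetical
# placeholders; only the 3-fold collagen increase is taken from the text.

def total_crosslinks(collagen_amount, crosslinks_per_collagen):
    """Total crosslink content = collagen amount x crosslinks per unit collagen."""
    return collagen_amount * crosslinks_per_collagen

normal_collagen = 1.0                  # arbitrary units, normal pancreas
ktc_collagen = 3.0 * normal_collagen   # three-fold more fibrillar collagen

normal_density = 1.0   # hypothetical crosslinks per unit collagen
ktc_density = 0.5      # hypothetical: half the density of normal tissue

normal_total = total_crosslinks(normal_collagen, normal_density)
ktc_total = total_crosslinks(ktc_collagen, ktc_density)

# Density is lower in KTC, yet the total is higher.
print(ktc_total / normal_total)  # 1.5
```

With these placeholder values, any reduction in crosslink density smaller than three-fold would still leave KTC PDACs with more total crosslinks than normal pancreas.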

The proteomic data presented here has provided the most comprehensive assessment of ECM composition during PDAC tumor progression to date.

Assessment of normal pancreatic tissue was performed and was critical to better understand the fibrosis that occurs in KTC PDACs. We have also been able to validate our findings from the KTC model in Chapter VI, and provide a more in-depth analysis of matricellular proteins. We hope that the information presented here will be a valuable resource in the design of 3D cell culture systems that aim to accurately recapitulate the in vivo matrix to assess the relationship between specific matricellular proteins that contribute to disease progression.

Figure 7.1: Characterization of Cell Population Markers in Normal Pancreas and During PDAC Progression. A) Heatmap of cell population marker abundance in normal early (5 weeks), normal late (20 weeks), KTC early (5 weeks), KTC late (20 weeks) (n=3 each group) identified from global LC-MS/MS. B) Principal component analysis (PCA) scores plot from cell population proteomic markers. C) Immunohistological staining (H&E, picrosirius red with polarized light) on normal pancreas, KTC early, and KTC late PDACs. Picrosirius red percent area quantification is plotted with shaded area representing the SEM. Significance is relative to normal pancreas tissue. Subsequent statistical analysis was performed with unpaired two-sided Student's t-tests (*P < 0.05; **P < 0.01, ***P < 0.001, ****P < 0.0001).

Figure 7.2: KTC PDAC Demonstrate a Unique ECM Composition Relative to Normal Pancreatic Tissue. A) Top 25 heatmap of ECM and ECM-related proteins from global LC-MS/MS experiments. Top 25 ranking is based on P-value significance. B) PCA scores plot from ECM curated global proteomics results list with trends denoted by red (increasing abundance) or blue (decreasing abundance) arrows. C) Bar plots of ECM and ECM-related proteins grouped by their functional class. D) Collagen abundance plotted as a fold change relative to normal early tissue and sorted from high to low abundance. E & F) Solubility plots for normal early pancreas and KTC PDAC early. Protein functional group solubility was determined by analyzing cellular, sECM, and iECM fractions and determining the percentage that each group makes up out of the total. Individual * markers denote the protein was not detected in normal tissue.
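
The solubility calculation described for panels E and F can be sketched as follows. This is a minimal illustration, assuming that each fraction's share is its summed abundance divided by the total across the cellular, sECM, and iECM fractions; the function name and the abundance values are hypothetical and are not taken from the dataset.

```python
# Minimal sketch of a per-fraction percentage calculation.
# Abundance values are hypothetical, in arbitrary units.

def fraction_percentages(abundance_by_fraction):
    """Return each fraction's share of the summed abundance, in percent."""
    total = sum(abundance_by_fraction.values())
    if total == 0:
        # Nothing detected in any fraction: report zero everywhere.
        return {name: 0.0 for name in abundance_by_fraction}
    return {
        name: 100.0 * value / total
        for name, value in abundance_by_fraction.items()
    }

# Hypothetical abundances for one protein functional group.
example = {"cellular": 10.0, "sECM": 30.0, "iECM": 60.0}
percentages = fraction_percentages(example)
print(percentages)
```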

Figure 7.3: Matricellular Protein Abundance Increases Progressively During PDAC Progression. A) Heatmap from matricellular protein curated list from global LC-MS/MS data. B) Ranked matricellular variables based on p-value and PCA scores plot created from matricellular curated list of LC-SRM results. C) Pearson correlation analysis of matricellular variables correlated with tenascin-C abundance.

Figure 7.4: Collagen Crosslinking Decreases During KTC PDAC Progression. A) Percentage of fibrillar collagen found in the iECM fraction. The percentage of fibrillar collagens (COL1A1, COL1A2, COL5A1) in each fraction was calculated as in Figure 7.3. B) Quantification of hydroxyproline via the xAAA method (i.e. free hydroxyproline) or by counting the number of hydroxyproline-modified peptides detected via global LC-MS/MS experiments and PTM analysis. C) Total crosslinks bar plot. Total values were calculated by summing the individual xAAs in D. D) Scatterplots for individual crosslinks identified from normal pancreas or KTC PDAC lesions, with horizontal and vertical bars representing the mean and SEM, respectively. For bar plots, values from each group were averaged and plotted with the SEM. Subsequent statistical analysis was performed with unpaired two-sided Student's t-tests (*P < 0.05; **P < 0.01, ***P < 0.001, ****P < 0.0001). All crosslink values are normalized to total collagen content (i.e. hydroxyproline abundance) and starting tissue weight, and are plotted as log2 transformed normalized peak areas.
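
As a rough illustration of the normalization described in the caption, the sketch below divides a crosslink peak area by hydroxyproline abundance and starting tissue weight before log2 transformation. The exact order and units of normalization are not specified here, so the formula, function name, and input values should be read as assumptions.

```python
import math

# Assumed normalization: peak area / (hydroxyproline x tissue weight), then log2.
# All input values are hypothetical and in arbitrary units.

def normalized_log2_peak_area(peak_area, hydroxyproline, tissue_weight_mg):
    """Return the log2 of a crosslink peak area normalized to collagen and weight."""
    if peak_area <= 0 or hydroxyproline <= 0 or tissue_weight_mg <= 0:
        raise ValueError("all inputs must be positive for a log2 transform")
    normalized = peak_area / (hydroxyproline * tissue_weight_mg)
    return math.log2(normalized)

value = normalized_log2_peak_area(
    peak_area=8000.0, hydroxyproline=100.0, tissue_weight_mg=10.0
)
print(value)  # 3.0
```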

CHAPTER VIII

SUMMARY OF FINDINGS AND FUTURE DIRECTIONS

Tools to Interrogate the Extracellular Matrix

It is an exciting time to be studying the role of the extracellular matrix (ECM) in the tumor microenvironment. The ECM is increasingly being recognized as a major player in multiple stages of disease progression and can modulate the hallmarks of cancer through reciprocal interactions between proteins and cells in the stroma. However, the reductionist approach of investigating single ECM protein-cell interactions in vitro, or manipulating single proteins in vivo, while revealing, does not replicate the complex ECM milieu of an in vivo tissue environment. What has been missing from these studies is a comprehensive assessment of changes in the tumor stroma.

The work presented here provides us with a roadmap of how to go about in-depth characterization of the stromal compartment (Figure 8.1). First, high resolution microscopy techniques such as second harmonic generation (SHG) have provided us with the ability to visualize changes in the architecture of the matrix during disease progression. Using these approaches, we now know that particular matrix architectures are associated with tumor progression, but it has remained unknown how global ECM compositions change in congruence with this. Application of mass spectrometry based techniques was a logical step to globally characterize ECM composition. However, traditional proteomic approaches have been poor at characterizing insoluble matrix components. Thus, solubilization of insoluble ECM components (e.g. fibrillar collagen) is a critical step in ECM characterization. As such, we developed a chemical digestion method using hydroxylamine that is capable of solubilizing insoluble ECM components. We validated our approach using several tissues that varied in their overall ECM content, further highlighting the broad applicability of the method (Chapter II). Another important factor in characterizing ECM composition by mass spectrometry is the ability to accurately quantify components from a complex mixture. Typical global proteomics experiments are excellent at providing accurate protein identifications but perform poorly for accurate quantification. This is mainly due to the way the data are acquired, such that the most abundant species are sequenced more often, creating some bias in relative quantification. To circumvent this, we use a targeted proteomics approach with quantitative concatemers (QconCATs) for absolute quantification. During my time in the lab we expanded our standard library of peptide reporters from 200 to over 900 (human and mouse specific), which now cover >99% of all known ECM proteins. Design of additional QconCATs has facilitated more refined characterization of ECM abundance and composition in tissues. Importantly, using these standards with our targeted proteomics platform we have been able to absolutely quantify subtle and stark differences in ECM composition during disease progression (Chapters II and III). By complementing this approach with SHG imaging, we are essentially putting a name to a face by obtaining a readout of what proteins comprise a given architecture. What was missing from this picture was a more granular view of how the changes we observed in matrix organization (i.e. SHG) and ECM composition (i.e. proteomics) were reflected in changes in architecture at the molecular level (i.e. crosslinking). Thus, we set out to develop a mass spectrometry approach to characterize a major architectural component of the matrix, collagen crosslinks. Until now, techniques to analyze crosslinks have been low throughput and have relied on multiple assays to measure the full repertoire of crosslinks. More importantly, these approaches have been primarily applied to hard tissues (e.g. bone, connective tissues) in which crosslinks are of naturally high abundance. However, there is much interest in investigating collagen crosslinking in soft tissues (e.g. tumors) due to mounting evidence for the role biomechanical signaling plays in tumor progression and aggression. The mass spectrometry based xAAA approach is applicable to both hard and soft tissues (Figure 8.2), including breast and pancreatic tumors, and allows for the measurement of all crosslinks in a single LC-MS/MS experiment (Appendix A, Chapter V).
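
For readers unfamiliar with stable isotope dilution, the principle behind absolute quantification with labeled standards such as QconCATs can be sketched as below. It assumes the usual relation in which the endogenous (light) peptide amount equals the light-to-heavy peak area ratio multiplied by the known amount of spiked heavy standard; the function name, peak areas, and spike amount are hypothetical, and this is not the processing pipeline used in this work.

```python
# Sketch of isotope-dilution quantification for a single reporter peptide.
# Peak areas and the spiked amount are hypothetical example values.

def absolute_amount(light_area, heavy_area, heavy_spiked_fmol):
    """Estimate endogenous peptide amount (fmol) from a light/heavy area ratio."""
    if heavy_area <= 0:
        raise ValueError("heavy standard peak area must be positive")
    return (light_area / heavy_area) * heavy_spiked_fmol

# Endogenous peak is twice the heavy standard peak; 50 fmol of standard spiked.
amount = absolute_amount(light_area=2.0e6, heavy_area=1.0e6, heavy_spiked_fmol=50.0)
print(amount)  # 100.0
```

Because the heavy standard is present at a known amount in every sample, the ratio does not depend on how often a peptide happens to be selected for sequencing, which is the bias noted above for global experiments.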

The obvious next step in characterizing collagen crosslinking in the tumor stroma is to identify the specific peptides that are crosslinked in collagen instead of just the amino acid species that form these crosslinks. As such, we have already developed a UPLC-based approach for the enrichment of crosslinked peptide species from complex stromal extracts and optimized an LC-MS/MS method for identification. Application of this approach to tumor fibrosis will provide more granularity regarding how crosslinking is altered during tumor progression.

Taken together, high resolution imaging, ECM proteomics and crosslinking analysis provide a comprehensive diagnostic toolkit to gain new insights into the relationship between stromal composition, organization and disease progression (Figure 8.1).

Mammary Gland ECM Remodeling

Investigation of stromal remodeling in the rodent mammary gland was an important first step in understanding the role the ECM plays in normal mammary gland development and remodeling post-pregnancy. Following pregnancy and lactation, the mammary gland regresses to its pre-pregnant state by a tissue remodeling process known as involution. From our ECM proteomics analysis of the microenvironment of the involuting mammary gland, we can glean several attributes associated with inflammation and wound healing. In particular, we saw elevated abundance of the known pro-tumorigenic matricellular proteins thrombospondin-1, tenascin-C and galectin-3 during weaning-induced mammary gland involution. We also identified collagen XII for the first time as elevated during post-weaning mammary involution. It has been shown previously that thrombospondin-1 regulates tumor cell proliferation and angiogenesis, and promotes a dose-dependent increase in breast cancer cell invasion. Tenascin-C has also been intimately linked with the invasive properties of breast cancer cells. Very low tenascin-C expression is observed in healthy nulliparous mammary glands; however, it is highly upregulated during involution and during breast cancer, especially at the invasive front.

To the best of our knowledge, our ECM-based QconCAT proteomics pipeline has provided the most quantitative assessment of ECM proteins and tissue composition in mammary gland to date. In the future, this method can be applied to more fully interrogate breast cancer progression. Additionally, studies to understand liver, as well as lung, bone, and brain ECM throughout the reproductive cycle, and patient cohort studies to understand site-specific metastasis in postpartum breast cancer patients are warranted. Finally, I hope to apply the xAAA method to measure how changes in ECM composition are associated with collagen crosslinking during mammary gland remodeling, something that has not been assessed previously. If the involuting mammary gland is indeed stiffer than other states, it would be interesting to further investigate the relationship between mammary gland ECM remodeling, stiffness, and mechanosignaling in normal development.

Human Breast Cancer

Mammographic density is a leading risk factor for breast cancer, yet differences in stromal composition have never been assessed comprehensively using a quantitative proteomics approach. We were able to demonstrate a strong correlation between breast density and fibrillar collagen abundance, namely COL1A1, COL1A2, and COL5A1. Based on these findings we hypothesized that denser breast tissue would be more crosslinked. However, collagen crosslinking appeared to be reduced (per total collagen) in denser breast tissue, possibly suggesting a looser fiber organization.

In general, there is a high degree of heterogeneity between patients in the same density groups and we will likely need to greatly expand the power of the study by measuring these subtle differences in more patients. We also plan to use the methods presented here to investigate the relationship between breast density, BRCA1 mutations and stromal composition and crosslinking. The elevated risk of breast cancer in BRCA1 mutation carriers is attributed to cellular effects that compromise DNA damage repair, increase hormone responsive proliferation, and skew lineage commitment. What is missing from this already complicated picture is the stromal context that critically mediates function and cancer risk.

Fibrillar collagen architectures have been shown to be an important part of breast cancer progression, with specific tumor associated collagen signatures (TACS) being associated with poor prognosis. During disease progression, invasive ductal carcinoma (IDC) pathologies are remodeled and can take on both curly and straightened alignments. These features have primarily been defined at the macro scale using SHG and immunohistochemistry; however, the intermolecular features of these signatures have remained elusive. Thus, we sought to investigate crosslinking using crosslinked amino acid analysis (xAAA). We found that despite IDC patients being characterized by both curly and straight fiber orientations, both of these architectures have significant increases in crosslinking relative to native breast tissue. Our data finally answer a question that has plagued the field for some time: whether straightened fibers are more crosslinked than relaxed fibers in the context of IDC architectures. These data suggest that curly fibers in IDC can be significantly more crosslinked than native tissue, but have a similar crosslink abundance to that of straightened fibers in IDC. These findings highlight the different flavors of stromal remodeling and the need to investigate this relationship on an individualized patient-by-patient basis. In future studies, we plan to correlate our crosslinking data with tissue biomechanics in curly and straight IDC pathologies using atomic force microscopy (AFM) to determine elastic modulus measurements, and immunohistochemistry to investigate relevant biomechanical signaling pathways linked to malignancy.

Breast tumors are more fibrotic than normal tissue and triple negative (TN) breast tumors are more fibrotic than luminal breast tumors. While others have suggested that these changes are a result of LOX-driven crosslinking, no one has proven a direct relationship in human breast cancer. Additionally, there is an ongoing debate in the breast cancer field regarding whether epithelial- or stromal-derived LOX regulates tissue stiffness. We have been able to definitively answer this question using two genetically engineered mouse models of LOX overexpression in the mammary epithelium or stroma and have proven that LOX overexpression in the stroma drives collagen crosslinking and tissue stiffness. Investigation of crosslinking in different human breast cancer subtypes highlighted how collagen crosslinking is defined by different processes among unique breast cancer pathologies and that these preferences can lead to changes in the mechanical properties of the stroma.

To that end, we have been the first to characterize collagen crosslinks in human breast cancer and demonstrated that a distinct feature of fibrosis in triple negative breast cancer is a preference for LH2-derived collagen crosslinks. In order to further investigate the role of LH2 in breast cancer survival, we expanded our analysis to a cohort of clinical samples. Tumor microarray data showed that high LH2 expression in human breast tumors correlates with ER(-) status, histological grade, and survival, underscoring the clinical relevance of how a unique feature of fibrosis may alter the course of a disease. This study provides an example of how we can use our findings from xAAA to discover and examine new therapeutic targets.

Based on these findings the logical next step is to begin to modulate LH2 activity during the course of tumor progression using a mouse model that mimics the fibrosis observed in triple negative disease. In doing so we will be able to fine-tune the types of collagen crosslinks that are formed in vivo during disease progression and explore therapeutic intervention strategies. Importantly, both LOX and LH2 are driven by a hypoxic tumor microenvironment. It has been shown previously that hypoxia inducible factor 1 (HIF-1) activates the transcription of genes that control multiple steps in the metastatic process and is highly overexpressed in triple negative breast cancer. Several drugs (acriflavine, digoxin, and ganetespib) have been shown to inhibit HIF-1 in murine mammary tumors and are associated with reduced tumor growth, angiogenesis, pre-metastatic niche formation, and local invasion. Further investigation is needed regarding the effect HIF-1 inhibition has on stromal remodeling. Therapeutics aimed at inhibiting HIF-1 may represent a novel targeted therapy for reduced LOX and LH2 activity in these hard-to-treat breast cancers.

Pancreatic Cancer

Pancreatic ductal adenocarcinomas (PDACs) are characterized by a substantial fibrotic component that promotes malignancy and treatment resistance. It is important to understand the composition and abundance of the ECM in normal pancreas in order to understand the magnitude of fibrosis in different stages of PDAC progression. Thus, we applied ECM proteomics and crosslinking analysis to investigate the ECM composition of the normal pancreas and compare it to KTC PDAC at two different time points during tumor progression. The KTC model is driven by mutant KRAS and loss of Tgfbr2, thus providing a rapid and enhanced fibrosis in these PDACs. Perhaps most striking is the change in the matricellular protein phenotype found during tumorigenesis and progression. Matricellular protein abundance is very low or undetectable in normal pancreatic tissue. In contrast, we found that in just 5 weeks, PDAC lesions had remarkable increases in matricellular protein abundance. ECM analysis also revealed a classic wound healing response and formation of a provisional matrix involving fibronectin and fibrinogen. It has been shown that a provisional ECM provides the proper microenvironment for resident and invading cells to proliferate and migrate during the repair process. For the first time, we suggest that the matrix is likely further stabilized by transglutaminase crosslinks between fibronectin and fibrinogen as a part of the wound healing response.

Comparative analysis of KTC PDACs to clinically normal and PDAC patient samples revealed some interesting trends that have not been described before.

Notably, FACIT collagen abundance is extremely low or not detectable in normal pancreatic tissue; however, during tumor progression these collagens (COL12A1 and COL14A1) increase dramatically in abundance. Little is known about the role of FACIT collagens in PDAC progression; however, studies in colorectal cancer have revealed that COL12A1 is a marker of myofibroblast differentiation (338). Data from that study show that COL12A1 is highly expressed in the desmoplastic stroma by and around alpha-smooth muscle actin positive cancer-associated fibroblasts (CAFs), as well as in cancer cells lining the invasive front of the tumor, in a small cohort of colorectal cancer patients. In future studies, I would like to further investigate COL12A1 as a novel candidate marker for myofibroblasts and/or PDAC-associated tumor cells undergoing differentiation. Ideally, we would be able to pinpoint when myofibroblasts or CAFs begin to express COL12A1 and develop intervention strategies.

In summary, we are able to bring forth quantitative solutions with high granularity to characterize matrix composition, collagen crosslinking and tissue biomechanics in human breast and pancreatic cancers (93). As we provide more color on the complexity of fibrosis, we are motivated to better elucidate the role of desmoplastic stroma in hard-to-treat malignancies and identify new therapeutic targets to reduce the fibrotic burden in solid tumors.

Figure 8.1: Towards Comprehensive Characterization of the Solid Tumor ECM. Crosslinking analysis (i.e. xAAA), ECM proteomics (i.e. LC-SRM) and high resolution imaging (i.e. PS/SHG) provide a triad of techniques that are capable of delivering extensive information about stiffness, composition and architecture.

Figure 8.2: Broad Applicability of xAAA Method to Hard and Soft Tissues. A) Collagen abundance as determined by hydroxyproline measurements from xAAA. B) Total crosslink abundance bar plots. Total crosslink abundance was determined by summing all detected crosslinks from xAAA. All tissues are from pooled n=3 females.

CHAPTER IX

COLORADO CLINICAL AND TRANSLATIONAL SCIENCE FELLOWSHIP

Medical Oncology – Breast Cancer Center

In recent years the importance of preparing and training nationally competitive clinical translational scientists has been realized. The Colorado Clinical and Translational Science Institute (CCTSI) fellowship has broadly allowed me to attain knowledge of translational research methods and techniques in order to train and further prepare myself in clinical research. More specifically, I have focused the majority of my time in the CCTSI program working with breast cancer clinicians. Dr. Virginia Borges has served as my clinical mentor from my acceptance into the program and has consistently provided valuable insight into clinical and translational research. As the head of the University of Colorado Hospital – Young Women’s Breast Cancer Clinic, Dr. Borges focuses on clinical breast cancer research including clinical trials with anti-endocrine therapies, novel biologic drugs and immunotherapies. Dr. Borges has had a longstanding collaboration with my research mentor, Dr. Kirk Hansen, which has focused on the use of cutting edge proteomic technologies to characterize the breast tumor microenvironment in the context of pregnancy associated breast cancer.

My clinical experience with Dr. Borges has been exceptionally fruitful for my development and understanding of clinical research. The clinical aspect of my CCTSI fellowship was spent doing clinical rotations led by Dr. Borges and consisted of several practices that enriched my knowledge about the treatment of breast cancer. I was involved in the review of ongoing cases, discussion about patient treatment plans and direct patient interaction. Through this experience I was able to gain an appreciation of the standard of care that is practiced in the clinic and the uniqueness of each cancer that presents in the clinic. I was also able to apply the technologies that I have developed in the lab to clinical specimens and gain a deeper understanding of how extracellular matrix biology and biochemistry are altered in different disease states and tumor subtypes. Through investigation of the relationship between tissue stiffness, collagen crosslinking and breast tumor aggression, I have been able to identify lysyl hydroxylase 2 as a potential diagnostic biomarker for breast tumor fibrosis in patient samples. Further, application of ECM proteomics to human breast tumors has allowed us to attain a comprehensive understanding of the role ECM remodeling plays during tumor progression. As Dr. Borges is also an expert in pregnancy associated breast cancer, I also investigated the importance of the involution opportunity window during mammary gland remodeling following pregnancy.

It has been a humbling experience to be able to interact with patients across the entire spectrum of disease. The resiliency of the patients that I met is incredible and I count this experience as one of the most rewarding times I had during my time in graduate school.

Autopsy Pathology

In recent years there has been much interest in the characterization of the entire human proteome. This interest has resulted in the publication of several human “draft proteomes”, in which dozens of tissues from a single person were characterized by semi-quantitative proteomics (334, 335). These studies were massive undertakings and have undoubtedly provided the most comprehensive characterization of the human proteome to date. However, ECM proteins, such as fibrillar collagen, are significantly underrepresented in these datasets, despite the fact that collagen is regularly cited as the most abundant protein in the human body. How can this be so? Due to the highly crosslinked and glycosylated nature of fibrillar collagen, specific extraction and digestion procedures, described in detail in this thesis, must be performed in order to solubilize these proteins. Thus, we have set out to create our own ECM Atlas in collaboration with Dr. Carrie Marshall at the University of Colorado Hospital – Autopsy Pathology.

To characterize the extracellular matrix composition of a broad sampling of tissues in the human body, I attended six unrestricted autopsies under the guidance of Dr. Marshall in order to collect tissues that could be used to create the ECM Atlas.

During my rotations in the morgue, I observed the entire autopsy procedure and collected tissues from all major organ systems in the human body. I also observed characterization of normal and diseased tissues and how they are prepared for future histopathology. During this time, I regularly asked questions of the resident pathologists regarding organ system anatomy and patient co-morbidities. Altogether, we were able to collect over 200 tissues that will be used to create the ECM Atlas. I believe that this will be an invaluable resource to the research community with broad appeal across many different fields. Knowing the ECM composition of all organ systems will provide a one-of-a-kind resource that will be used to create more physiologically relevant 3D cell culture matrices and components for regenerative medicine.

My time in autopsy pathology was an incredibly unique experience that required me to face my own mortality through a deeper understanding of human anatomy. I believe that this experience was an important aspect of my development as a clinical and translational scientist and further exposed me to various aspects of clinical research.

Summary

The CCTSI TL1 fellowship has provided me with a critical head-start for the design of research studies that will allow me to translate my research from bench to bedside, and back again. Furthermore, my participation in the CCTSI program has exposed me to a clinical perspective rarely afforded to structural biology and biochemistry graduate students. My experience here will allow me to apply my knowledge of proteomic technologies in a clinical setting to move personalized medicine forward and generate discoveries that improve patients’ lives.

REFERENCES

1. Frantz, C.; Stewart, K. M.; Weaver, V. M., The extracellular matrix at a glance. Journal of cell science 2010, 123, (24), 4195-4200.

2. Mouw, J. K.; Ou, G.; Weaver, V. M., Extracellular matrix assembly: a multiscale deconstruction. Nature Reviews Molecular Cell Biology 2014, 15, (12), 771-785.

3. DuFort, C. C.; Paszek, M. J.; Weaver, V. M., Balancing forces: architectural control of mechanotransduction. Nature reviews Molecular cell biology 2011, 12, (5), 308-319.

4. Wang, F.; Hansen, R. K.; Radisky, D.; Yoneda, T.; Barcellos-Hoff, M. H.; Petersen, O. W.; Turley, E. A.; Bissell, M. J., Phenotypic reversion or death of cancer cells by altering signaling pathways in three-dimensional contexts. Journal of the National Cancer Institute 2002, 94, (19), 1494-1503.

5. Howlett, A. R.; Bailey, N.; Damsky, C.; Petersen, O. W.; Bissell, M. J., Cellular growth and survival are mediated by beta 1 integrins in normal human breast epithelium but not in breast carcinoma. Journal of Cell Science 1995, 108, (5), 1945-1957.

6. Weaver, V. M.; Petersen, O. W.; Wang, F.; Larabell, C.; Briand, P.; Damsky, C.; Bissell, M. J., Reversion of the malignant phenotype of human breast cells in three-dimensional culture and in vivo by integrin blocking antibodies. The Journal of cell biology 1997, 137, (1), 231-245.

7. Bissell, M. J.; Hall, H. G.; Parry, G., How does the extracellular matrix direct gene expression? Journal of theoretical biology 1982, 99, (1), 31-68.

8. Nelson, C. M.; Bissell, M. J., Of extracellular matrix, scaffolds, and signaling: tissue architecture regulates development, homeostasis, and cancer. Annual review of cell and developmental biology 2006, 22, 287.

9. Adams, J. C.; Watt, F. M., Regulation of development and differentiation by the extracellular matrix. DEVELOPMENT-CAMBRIDGE- 1993, 117, 1183-1183.

10. Pander, C. H., Beiträge zur Entwickelungs-Geschichte des Hühnchens im Eye. LP. 1817.

11. Beningo, K. A.; Dembo, M.; Wang, Y.-l., Responses of fibroblasts to anchorage of dorsal extracellular matrix receptors. Proceedings of the National Academy of Sciences 2004, 101, (52), 18024-18029.

12. Ignotz, R. A.; Massague, J., Transforming growth factor-beta stimulates the expression of fibronectin and collagen and their incorporation into the extracellular matrix. Journal of Biological Chemistry 1986, 261, (9), 4337-4345.

13. Sheetz, M. P.; Felsenfeld, D. P.; Galbraith, C. G., Cell migration: regulation of force on extracellular-matrix-integrin complexes. Trends in cell biology 1998, 8, (2), 51-54.

14. Lu, P.; Weaver, V. M.; Werb, Z., The extracellular matrix: a dynamic niche in cancer progression. The Journal of cell biology 2012, 196, (4), 395-406.

15. Vaday, G. G.; Lider, O., Extracellular matrix moieties, cytokines, and enzymes: dynamic effects on immune cell behavior and inflammation. Journal of Leukocyte Biology 2000, 67, (2), 149-159.

16. Hynes, R. O., The extracellular matrix: not just pretty fibrils. Science 2009, 326, (5957), 1216-1219.

17. Taipale, J.; Keski-Oja, J., Growth factors in the extracellular matrix. The FASEB Journal 1997, 11, (1), 51-59.

18. Ricard-Blum, S.; Salza, R., Matricryptins and matrikines: biologically active fragments of the extracellular matrix. Experimental dermatology 2014, 23, (7), 457-463.

19. Davis, G. E.; Bayless, K. J.; Davis, M. J.; Meininger, G. A., Regulation of tissue injury responses by the exposure of matricryptic sites within extracellular matrix molecules. The American journal of pathology 2000, 156, (5), 1489-1498.

20. Hotchin, N. A.; Hall, A., The assembly of integrin adhesion complexes requires both extracellular matrix and intracellular rho/rac GTPases. The Journal of cell biology 1995, 131, (6), 1857-1865.

21. Gimbrone Jr, M. A.; Nagel, T.; Topper, J. N., Biomechanical activation: an emerging paradigm in endothelial adhesion biology. Journal of Clinical Investigation 1997, 99, (8), 1809.

22. Van der Rest, M.; Garrone, R., Collagen family of proteins. The FASEB journal 1991, 5, (13), 2814-2823.

23. Ranvier, L., On the cellular elements of tendons and of loose connective tissue. Quarterly Journal of Microscopical Science 1870, 10, 367-380.

24. Schwann, T., Mikroskopische Untersuchungen über die Uebereinstimmung in der Struktur und dem Wachsthum der Thiere und Pflanzen: mit 4 Kupfertafeln. Reimer: 1839.

25. Flemming, W., Morphologie der zelle. 1897.

26. Mall, F. P., On the development of the connective tissues from the connective tissue syncytium. American Journal of Anatomy 1902, 1, (3), 329-365.

27. Lewis, W. H.; Lewis, M. R., Behavior of cross striated muscle in tissue cultures. American Journal of Anatomy 1917, 22, (2), 169-194.

28. Virchow, R. L. K., Die Cellularpathologie: in ihrer Begründung auf physiologische und pathologische Gewebelehre. Verlag von August Hirschwald, Unter den Linden No. 68: 1871; Vol. 1.

29. Ebner, V., Die Chorda dorsalis der niederen Fische und die Entwicklung des fibrillären Bindegewebes. Zeitschr. f. wiss. Zool 1897, 42, 469.

30. Maximow, A. A.; Bloom, W., Text-book of Histology. The American Journal of the Medical Sciences 1930, 180, (6), 852.

31. Verzár, F., Aging of the collagen fiber. Academic Press: 1964.

32. Gross, J., The behavior of collagen units as a model in morphogenesis. The Journal of biophysical and biochemical cytology 1956, 2, (4), 261.

33. Ramachandran, G., Biochemistry of collagen. Springer Science & Business Media: 2013.

34. Nageotte, J., Über die Überpflanzung von abgetöteten Bindegewebsstücken. Erwiderung an Fr. Weidenreich und A. Busacca. Virchows Archiv 1927, 263, (1), 69-88.

35. Ricard-Blum, S., The collagen family. Cold Spring Harbor perspectives in biology 2011, 3, (1), a004978.

36. Hulmes, D., Collagen diversity, synthesis and assembly. In Collagen, Springer: 2008; pp 15-47.

37. Prockop, D. J.; Kivirikko, K. I.; Tuderman, L.; Guzman, N. A., The biosynthesis of collagen and its disorders. New England Journal of Medicine 1979, 301, (2), 77-85.

38. Orgel, J.; Antipova, O.; Sagi, I.; Bitler, A.; Qiu, D.; Wang, R.; Xu, Y.; San Antonio, J., Collagen fibril surface displays a constellation of sites capable of promoting fibril assembly, stability, and hemostasis. Connective tissue research 2011, 52, (1), 18-24.

39. Bätge, B.; Winter, C.; Notbohm, H.; Acil, Y.; Brinckmann, J.; Müller, P. K., Glycosylation of human bone collagen I in relation to lysylhydroxylation and fibril diameter. The Journal of Biochemistry 1997, 122, (1), 109-115.

40. Bignon, M.; Pichol-Thievend, C.; Hardouin, J.; Malbouyres, M.; Bréchot, N.; Nasciutti, L.; Barret, A.; Teillon, J.; Guillon, E.; Etienne, E., Lysyl oxidase-like protein-2 regulates sprouting angiogenesis and type IV collagen assembly in the endothelial basement membrane. Blood 2011, 118, (14), 3979-3989.

41. Siegel, R. C.; Fu, J. C.; Chang, Y.-H., Collagen cross-linking: the substrate specificity of lysyl oxidase. In Iron and Copper Proteins, Springer: 1976; pp 438-446.

42. Cronlund, A. L.; Smith, B. D.; Kagan, H. M., Binding of lysyl oxidase to fibrils of type I collagen. Connective tissue research 1985, 14, (2), 109-119.

43. Bailey, A. J.; Paul, R. G.; Knott, L., Mechanisms of maturation and ageing of collagen. Mechanisms of ageing and development 1998, 106, (1), 1-56.

44. Avery, N.; Bailey, A., Restraining cross-links responsible for the mechanical properties of collagen fibers: natural and artificial. In Collagen, Springer: 2008; pp 81-110.

45. Sricholpech, M., Mechanisms and Functions of Collagen Glycosylation in Bone. Carolina Digital Repository 2010.

46. Mercer, D. K.; Nicol, P. F.; Kimbembe, C.; Robins, S. P., Identification, expression, and tissue distribution of the three rat lysyl hydroxylase isoforms. Biochemical and biophysical research communications 2003, 307, (4), 803-809.

47. Eyre, D. R.; Paz, M. A.; Gallop, P. M., Cross-linking in collagen and elastin. Annual review of biochemistry 1984, 53, (1), 717-748.

48. Kagan, H. M., Intra-and extracellular enzymes of collagen biosynthesis as biological and chemical targets in the control of fibrosis. Acta tropica 2000, 77, (1), 147-152.

49. Myllylä, R.; Wang, C.; Heikkinen, J.; Juffer, A.; Lampela, O.; Risteli, M.; Ruotsalainen, H.; Salo, A.; Sipilä, L., Expanding the lysyl hydroxylase toolbox: new insights into the localization and activities of lysyl hydroxylase 3 (LH3). Journal of cellular physiology 2007, 212, (2), 323-329.

50. Yamauchi, M., Collagen biochemistry: an overview. Advances in tissue banking 2002.

51. Bailey, A.; Shimokomaki, M., Age related changes in the reducible cross-links of collagen. FEBS letters 1971, 16, (2), 86-88.

52. Robins, S. P.; Shimokomaki, M.; Bailey, A. J., The chemistry of the collagen cross-links. Age-related changes in the reducible components of intact bovine collagen fibres. Biochemical Journal 1973, 131, (4), 771-780.

53. Fujimoto, D.; Akiba, K.-y.; Nakamura, N., Isolation and characterization of a fluorescent material in bovine achilles tendon collagen. Biochemical and biophysical research communications 1977, 76, (4), 1124-1129.

54. Eyre, D.; Dickson, I.; Van Ness, K., Collagen cross-linking in human bone and articular cartilage. Age-related changes in the content of mature hydroxypyridinium residues. Biochem. J 1988, 252, 495-500.

55. Eyre, D., Crosslink maturation in bone collagen. Dev Biochem 1981, 22, 51-55.

56. Ogawa, T.; Ono, T.; Tsuda, M.; Kawanishi, Y., A novel fluor in insoluble collagen: a crosslinking moiety in collagen molecule. Biochemical and biophysical research communications 1982, 107, (4), 1252-1257.

57. Hanson, D. A.; Eyre, D. R., Molecular site specificity of pyridinoline and pyrrole cross-links in type I collagen of human bone. Journal of Biological Chemistry 1996, 271, (43), 26508-26516.

58. Avery, N. C.; Sims, T. J.; Bailey, A. J., Quantitative determination of collagen cross-links. Extracellular Matrix Protocols: Second Edition 2009, 103-121.

59. Horgan, D. J.; King, N. L.; Kurth, L. B.; Kuypers, R., Collagen crosslinks and their relationship to the thermal properties of calf tendons. Archives of biochemistry and biophysics 1990, 281, (1), 21-26.

60. Brown-Augsburger, P.; Tisdale, C.; Broekelmann, T.; Sloan, C.; Mecham, R. P., Identification of an elastin cross-linking domain that joins three peptide chains possible role in nucleated assembly. Journal of Biological Chemistry 1995, 270, (30), 17778-17783.

61. Vrhovski, B.; Weiss, A. S., Biochemistry of tropoelastin. The FEBS Journal 1998, 258, (1), 1-18.

62. Uitto, J., Biochemistry of the elastic fibers in normal connective tissues and its alterations in diseases. Journal of Investigative Dermatology 1979, 72, (1), 1-10.

63. Akagawa, M.; Suyama, K., Mechanism of formation of elastin crosslinks. Connective tissue research 2000, 41, (2), 131-141.

64. Partridge, S.; Elsden, D.; Thomas, J., Constitution of the cross-linkages in elastin. Nature 1963, 197, (4874), 1297-1298.

65. van Eldijk, M. B.; McGann, C. L.; Kiick, K. L.; van Hest, J. C., Elastomeric polypeptides. In Peptide-Based Materials, Springer: 2011; pp 71-116.

66. Mosher, D., Fibronectin. Elsevier: 2012.

67. Pankov, R.; Yamada, K. M., Fibronectin at a glance. Journal of cell science 2002, 115, (20), 3861-3863.

68. Clement, B.; Grimaud, J. A.; Campion, J. P.; Deugnier, Y.; Guillouzo, A., Cell types involved in collagen and fibronectin production in normal and fibrotic human liver. Hepatology 1986, 6, (2), 225-234.

69. Patel, R. S.; Odermatt, E.; Schwarzbauer, J.; Hynes, R., Organization of the fibronectin gene provides evidence for exon shuffling during evolution. The EMBO journal 1987, 6, (9), 2565.

70. van der Straaten, H. M.; Canninga-van Dijk, M. R.; Verdonck, L. F.; Castigliego, D.; Borst, H. E.; Aten, J.; Fijnheer, R., Extra-domain-A fibronectin: a new marker of fibrosis in cutaneous graft-versus-host disease. Journal of investigative dermatology 2004, 123, (6), 1057-1062.

71. Schwarzbauer, J. E., Alternative splicing of fibronectin: three variants, three functions. Bioessays 1991, 13, (10), 527-533.

72. Tamkun, J. W.; Schwarzbauer, J. E.; Hynes, R. O., A single rat fibronectin gene generates three different mRNAs by alternative splicing of a complex exon. Proceedings of the National Academy of Sciences 1984, 81, (16), 5140-5144.

73. Nam, J.-M.; Onodera, Y.; Bissell, M. J.; Park, C. C., Breast cancer cells in three-dimensional culture display an enhanced radioresponse after coordinate targeting of integrin α5β1 and fibronectin. Cancer research 2010, 70, (13), 5238-5248.

74. Karnoub, A. E.; Dash, A. B.; Vo, A. P.; Sullivan, A.; Brooks, M. W.; Bell, G. W.; Richardson, A. L.; Polyak, K.; Tubo, R.; Weinberg, R. A., Mesenchymal stem cells within tumour stroma promote breast cancer metastasis. Nature 2007, 449, (7162), 557.

75. Knudson, C. B.; Knudson, W. In Cartilage proteoglycans, Seminars in cell & developmental biology, 2001; Elsevier: 2001; pp 69-78.

76. Davis, D. A. S.; Parish, C. R., Heparan sulfate: a ubiquitous glycosaminoglycan with multiple roles in immunity. Frontiers in immunology 2013, 4.

77. Zhang, L., Glycosaminoglycans in development, health and disease. Academic Press: 2010; Vol. 93.

78. Iozzo, R. V., The biology of the small leucine-rich proteoglycans Functional network of interactive proteins. Journal of Biological Chemistry 1999, 274, (27), 18843-18846.

79. Mercurio, A. M.; Shaw, L. M., Laminin binding proteins. Bioessays 1991, 13, (9), 469-473.

80. Timpl, R.; Brown, J. C., The laminins. Matrix biology 1994, 14, (4), 275-281.

81. Hahn, E.; Wick, G.; Pencev, D.; Timpl, R., Distribution of basement membrane proteins in normal and fibrotic human liver: collagen type IV, laminin, and fibronectin. Gut 1980, 21, (1), 63-71.

82. Chu, M. L.; Pan, T. C.; Conway, D.; Saitta, B.; Stokes, D.; Kuo, H. J.; Glanville, R. W.; Timpl, R.; Mann, K.; Deutzmann, R., The structure of type VI collagen. Ann N Y Acad Sci 1990, 580, 55-63.

83. Vanacore, R.; Ham, A.-J. L.; Voehler, M.; Sanders, C. R.; Conrads, T. P.; Veenstra, T. D.; Sharpless, K. B.; Dawson, P. E.; Hudson, B. G., A sulfilimine bond identified in collagen IV. Science 2009, 325, (5945), 1230-1234.

84. Barsky, S.; Siegal, G.; Jannotta, F.; Liotta, L., Loss of basement membrane components by invasive tumors but not by their benign counterparts. Laboratory investigation; a journal of technical methods and pathology 1983, 49, (2), 140-147.

85. Bornstein, P., Matricellular proteins: an overview. Journal of cell communication and signaling 2009, 3, (3-4), 163.

86. Chiodoni, C.; Colombo, M. P.; Sangaletti, S., Matricellular proteins: from homeostasis to inflammation, cancer, and metastasis. Cancer and Metastasis Reviews 2010, 29, (2), 295-307.

87. Chong, H. C.; Tan, C. K.; Huang, R.-L.; Tan, N. S., Matricellular proteins: a sticky affair with cancers. Journal of oncology 2012, 2012.

88. Podhajcer, O. L.; Benedetti, L. G.; Girotti, M. R.; Prada, F.; Salvatierra, E.; Llera, A. S., The role of the matricellular protein SPARC in the dynamic interaction between the tumor and the host. Cancer and Metastasis Reviews 2008, 27, (4), 691.

89. Bornstein, P.; Armstrong, L. C.; Hankenson, K. D.; Kyriakides, T. R.; Yang, Z., Thrombospondin 2, a matricellular protein with diverse functions. Matrix Biology 2000, 19, (7), 557-568.

90. Bornstein, P.; Sage, E. H., Matricellular proteins: extracellular modulators of cell function. Current opinion in cell biology 2002, 14, (5), 608-616.

91. Murphy-Ullrich, J. E.; Sage, E. H., Revisiting the matricellular concept. Matrix Biology 2014, 37, 1-14.

92. Wight, T. N.; Potter-Perigo, S., The extracellular matrix: an active or passive player in fibrosis? American Journal of Physiology-Gastrointestinal and Liver Physiology 2011, 301, (6), G950-G955.

93. Goddard, E. T.; Hill, R. C.; Barrett, A.; Betts, C.; Guo, Q.; Maller, O.; Borges, V. F.; Hansen, K. C.; Schedin, P., Quantitative extracellular matrix proteomics to study mammary and liver tissue microenvironments. The International Journal of Biochemistry & Cell Biology 2016, 81, 223-232.

94. Laklai, H.; Miroshnikova, Y. A.; Pickup, M. W.; Collisson, E. A.; Kim, G. E.; Barrett, A. S.; Hill, R. C.; Lakins, J. N.; Schlaepfer, D. D.; Mouw, J. K., Genotype tunes pancreatic ductal adenocarcinoma tissue tension to induce matricellular fibrosis and tumor progression. Nature medicine 2016, 22, (5), 497-505.

95. Gabbiani, G., The myofibroblast in wound healing and fibrocontractive diseases. The Journal of pathology 2003, 200, (4), 500-503.

96. Midwood, K. S.; Williams, L. V.; Schwarzbauer, J. E., Tissue repair and the dynamics of the extracellular matrix. The international journal of biochemistry & cell biology 2004, 36, (6), 1031-1037.

97. Cassereau, L.; Mouw, J.; Barnes, M.; Lakins, J.; Weaver, V. In ECM Stiffness Enhances Integrin Signaling to Alter Tumor Cell Metabolism, Molecular Biology of the Cell, 2014; Amer Soc Biology 8120 Woodmont Ave, STE 750, Bethesda, MD 20814-2755 USA: 2014.

98. Acerbi, I.; Cassereau, L.; Dean, I.; Shi, Q.; Au, A.; Park, C.; Chen, Y.-Y.; Liphardt, J.; Hwang, E.; Weaver, V. M., Human breast cancer invasion and aggression correlates with ECM stiffening and immune cell infiltration. Integrative Biology 2015.

99. Schedin, P.; Keely, P. J., Mammary gland ECM remodeling, stiffness, and mechanosignaling in normal development and tumor progression. Cold Spring Harbor perspectives in biology 2011, 3, (1), a003228.

100. Ijichi, H.; Chytil, A.; Gorska, A. E.; Aakre, M. E.; Fujitani, Y.; Fujitani, S.; Wright, C. V.; Moses, H. L., Aggressive pancreatic ductal adenocarcinoma in mice caused by pancreas-specific blockade of transforming growth factor-β signaling in cooperation with active Kras expression. Genes & development 2006, 20, (22), 3147-3160.

101. Bierie, B.; Chung, C. H.; Parker, J. S.; Stover, D. G.; Cheng, N.; Chytil, A.; Aakre, M.; Shyr, Y.; Moses, H. L., Abrogation of TGF-β signaling enhances chemokine production and correlates with prognosis in human breast cancer. The Journal of clinical investigation 2009, 119, (6), 1571.

102. Özdemir, B. C.; Pentcheva-Hoang, T.; Carstens, J. L.; Zheng, X.; Wu, C.-C.; Simpson, T. R.; Laklai, H.; Sugimoto, H.; Kahlert, C.; Novitskiy, S. V., Depletion of carcinoma-associated fibroblasts and fibrosis induces immunosuppression and accelerates pancreas cancer with reduced survival. Cancer Cell 2014, 25, (6), 719-734.

103. Coussens, L. M.; Raymond, W. W.; Bergers, G.; Laig-Webster, M.; Behrendtsen, O.; Werb, Z.; Caughey, G. H.; Hanahan, D., Inflammatory mast cells up-regulate angiogenesis during squamous epithelial carcinogenesis. Genes & development 1999, 13, (11), 1382-1397.

104. Whatcott, C. J.; Watanabe, A.; LoBello, J.; Von Hoff, D.; Han, H., Desmoplasia in primary tumors and metastatic lesions of pancreatic cancer. Cancer Research 2014, 74, (19 Supplement), 191-191.

105. Decitre, M.; Gleyzal, C.; Raccurt, M.; Peyrol, S.; Aubert-Foucher, E.; Csiszar, K.; Sommer, P., Lysyl oxidase-like protein localizes to sites of de novo fibrinogenesis in fibrosis and in the early stromal reaction of ductal breast carcinomas. Laboratory investigation; a journal of technical methods and pathology 1998, 78, (2), 143-151.

106. Levental, K. R.; Yu, H.; Kass, L.; Lakins, J. N.; Egeblad, M.; Erler, J. T.; Fong, S. F.; Csiszar, K.; Giaccia, A.; Weninger, W., Matrix crosslinking forces tumor progression by enhancing integrin signaling. Cell 2009, 139, (5), 891-906.

107. Butcher, D. T.; Alliston, T.; Weaver, V. M., A tense situation: forcing tumour progression. Nature reviews. Cancer 2009, 9, (2), 108.

108. Erler, J. T.; Bennewith, K. L.; Cox, T. R.; Lang, G.; Bird, D.; Koong, A.; Le, Q.-T.; Giaccia, A. J., Hypoxia-induced lysyl oxidase is a critical mediator of bone marrow cell recruitment to form the premetastatic niche. Cancer cell 2009, 15, (1), 35-44.

109. Perryman, L.; Erler, J. T., Lysyl oxidase in cancer research. Future Oncology 2014, 10, (9), 1709-1717.

110. Erler, J. T.; Bennewith, K. L.; Nicolau, M.; Dornhöfer, N.; Kong, C.; Le, Q.-T.; Chi, J.-T. A.; Jeffrey, S. S.; Giaccia, A. J., Lysyl oxidase is essential for hypoxia-induced metastasis. Nature 2006, 440, (7088), 1222-1226.

111. Xiong, J.; Balcioglu, H. E.; Danen, E. H., Integrin signaling in control of tumor growth and progression. The international journal of biochemistry & cell biology 2013, 45, (5), 1012-1015.

112. Wang, H.-B.; Dembo, M.; Hanks, S. K.; Wang, Y.-l., Focal adhesion kinase is involved in mechanosensing during fibroblast migration. Proceedings of the National Academy of Sciences 2001, 98, (20), 11295-11300.

113. Hynes, R. O.; Naba, A., Overview of the matrisome—an inventory of extracellular matrix constituents and functions. Cold Spring Harbor perspectives in biology 2012, 4, (1), a004903.

114. Gullino, P. M.; Grantham, F. H., The influence of the host and the neoplastic cell population on the collagen content of a tumor mass. Cancer research 1963, 23, (4 Part 1), 648-653.

115. Foot, N. C., The Masson trichrome staining methods in routine laboratory use. Stain technology 1933, 8, (3), 101-110.

116. Foot, N. C., Useful methods for the routine examination of brain tumors. The American journal of pathology 1938, 14, (2), 245.

117. Brown, E.; McKee, T.; Pluen, A.; Seed, B.; Boucher, Y.; Jain, R. K., Dynamic imaging of collagen and its modulation in tumors in vivo using second-harmonic generation. Nature medicine 2003, 9, (6), 796-800.

118. Provenzano, P. P.; Eliceiri, K. W.; Campbell, J. M.; Inman, D. R.; White, J. G.; Keely, P. J., Collagen reorganization at the tumor-stromal interface facilitates local invasion. BMC medicine 2006, 4, (1), 38.

119. Pickup, M. W.; Mouw, J. K.; Weaver, V. M., The extracellular matrix modulates the hallmarks of cancer. EMBO Rep 2014, 15, (12), 1243-53.

120. Eyre, D., Collagen cross-linking amino acids. Methods in enzymology 1987, 144, 115-139.

121. Eyre, D. R.; Weis, M. A.; Wu, J.-J., Advances in collagen cross-link analysis. Methods 2008, 45, (1), 65-74.

122. Xing, F.; Saidou, J.; Watabe, K., Cancer associated fibroblasts (CAFs) in tumor microenvironment. Frontiers in bioscience: a journal and virtual library 2010, 15, 166.

123. Pickup, M. W.; Mouw, J. K.; Weaver, V. M., The extracellular matrix modulates the hallmarks of cancer. EMBO reports 2014, e201439246.

124. King, T. E.; Pardo, A.; Selman, M., Idiopathic pulmonary fibrosis. The Lancet 2011, 378, (9807), 1949-1961.

125. Schuppan, D. In Structure of the extracellular matrix in normal and fibrotic liver: collagens and glycoproteins, Seminars in liver disease, 1990; © 1990 by Thieme Medical Publishers, Inc.: 1990; pp 1-10.

126. Heeneman, S.; Cleutjens, J. P.; Faber, B. C.; Creemers, E. E.; van Suylen, R. J.; Lutgens, E.; Cleutjens, K. B.; Daemen, M. J., The dynamic extracellular matrix: intervention strategies during heart failure and atherosclerosis. The Journal of pathology 2003, 200, (4), 516-525.

127. Hill, R. C.; Calle, E. A.; Dzieciatkowska, M.; Niklason, L. E.; Hansen, K. C., Quantification of Extracellular Matrix Proteins from a Rat Lung Scaffold to Provide a Molecular Readout for Tissue Engineering. Molecular & Cellular Proteomics 2015, 14.4, 961-973.

128. Krasny, L.; Paul, A.; Wai, P.; Howard, B. A.; Natrajan, R. C.; Huang, P. H., Comparative proteomic assessment of matrisome enrichment methodologies. Biochemical Journal 2016, 473, (21), 3979-3995.

129. Calle, E. A.; Hill, R. C.; Leiby, K. L.; Le, A. V.; Gard, A. L.; Madri, J. A.; Hansen, K. C.; Niklason, L. E., Targeted proteomics effectively quantifies differences between native lung and detergent-decellularized lung extracellular matrices. Acta Biomaterialia 2016, 46, 91-100.

130. Goddard, E. T.; Hill, R. C.; Nemkov, T.; D'Alessandro, A.; Hansen, K. C.; Maller, O.; Mongoue-Tchokote, S.; Mori, M.; Partridge, A. H.; Borges, V. F., The Rodent Liver Undergoes Weaning-Induced Involution and Supports Breast Cancer Metastasis. Cancer Discovery 2016, CD-16-0822.

131. Hill, R. C.; Wither, M. J.; Nemkov, T.; Barrett, A.; D'Alessandro, A.; Dzieciatkowska, M.; Hansen, K. C., Preserved proteins from extinct Bison latifrons identified by tandem mass spectrometry; hydroxylysine glycosides are a common feature of ancient collagen. Molecular & Cellular Proteomics 2015, 14, (7), 1946-1958.

132. Johnson, T. D.; Hill, R. C.; Dzieciatkowska, M.; Nigam, V.; Behfar, A.; Christman, K. L.; Hansen, K. C., Quantification of decellularized human myocardial matrix: a comparison of six patients. PROTEOMICS-Clinical Applications 2016, 10, (1), 75-83.

133. Gross, E., The cyanogen bromide reaction. Methods in enzymology 1967, 11, 238-255.

134. ToxNet, HSDB: CYANOGEN BROMIDE. US National Library of Medicine 2014, 708.

135. Antorini, M.; Breme, U.; Caccia, P.; Grassi, C.; Lebrun, S.; Orsini, G.; Taylor, G.; Valsasina, B.; Marengo, E.; Todeschini, R., Hydroxylamine-induced cleavage of the asparaginyl–glycine motif in the production of recombinant proteins: the case of insulin-like growth factor I. Protein expression and purification 1997, 11, (1), 135-147.

136. Park, H.-B.; Pyo, S.-H.; Hong, S.-S.; Kim, J.-H., Optimization of the hydroxylamine cleavage of an expressed fusion protein to produce a recombinant antimicrobial peptide. Biotechnology letters 2001, 23, (8), 637-641.

137. Bornstein, P.; Balian, G., Cleavage at Asn-Gly bonds with hydroxylamine. Methods in enzymology 1977, 47, 132-145.

138. Dzieciatkowska, M.; D'Alessandro, A.; Hill, R. C.; Hansen, K. C., Plasma QconCATs reveal a gender-specific proteomic signature in apheresis platelet plasma supernatants. Journal of Proteomics 2015, 120, 1-6.

139. Simpson, R. J., Cleavage of asn-gly bonds by hydroxylamine. CSH protocols 2007, 2007, pdb.prot4697.

140. Wither, M. J.; Hansen, K. C.; Reisz, J. A., Mass Spectrometry-Based Bottom-Up Proteomics: Sample Preparation, LC-MS/MS Analysis, and Database Query Strategies. Current Protocols in Protein Science 2016, 16.4.1-16.4.20.

141. MacLean, B.; Tomazela, D. M.; Shulman, N.; Chambers, M.; Finney, G. L.; Frewen, B.; Kern, R.; Tabb, D. L.; Liebler, D. C.; MacCoss, M. J., Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics 2010, 26, (7), 966-968.

142. Xia, J.; Sinelnikov, I. V.; Han, B.; Wishart, D. S., MetaboAnalyst 3.0—making metabolomics more meaningful. Nucleic acids research 2015, 43, (W1), W251-W257.

143. Bern, M.; Kil, Y. J.; Becker, C., Byonic: advanced peptide and protein identification software. Current Protocols in Bioinformatics 2012, 13.20.1-13.20.14.

144. Bornstein, P., The nature of a hydroxylamine-sensitive bond in collagen. Biochemical and Biophysical Research Communications 1969, 36, (6), 957-964.

145. Clarke, B., Normal bone anatomy and physiology. Clinical journal of the American Society of Nephrology 2008, 3, (Supplement 3), S131-S139.

146. Del Fattore, A.; Cappariello, A.; Teti, A., Genetics, pathogenesis and complications of osteopetrosis. Bone 2008, 42, (1), 19-29.

147. Wysocki, A. B., Skin anatomy, physiology, and pathophysiology. The Nursing Clinics of North America 1999, 34, (4), 777-97, v.

148. Quan, T.; Fisher, G. J., Role of age-associated alterations of the dermal extracellular matrix microenvironment in human skin aging: a mini-review. Gerontology 2015, 61, (5), 427-434.

149. Treuting, P. M.; Dintzis, S. M., Comparative Anatomy and Histology: A Mouse and Human Atlas (Expert Consult). Academic Press: 2011.

150. Parker, M. W.; Rossi, D.; Peterson, M.; Smith, K.; Sikström, K.; White, E. S.; Connett, J. E.; Henke, C. A.; Larsson, O.; Bitterman, P. B., Fibrotic extracellular matrix activates a profibrotic positive feedback loop. The Journal of clinical investigation 2014, 124, (4), 1622-1635.

151. White, E. S., Lung extracellular matrix and fibroblast function. Annals of the American Thoracic Society 2015, 12, (Supplement 1), S30-S33.

152. Wong, J. Y.; Velasco, A.; Rajagopalan, P.; Pham, Q., Directed movement of vascular smooth muscle cells on gradient-compliant hydrogels. Langmuir 2003, 19, (5), 1908-1913.

153. Beighton, P.; De Paepe, A.; Steinmann, B.; Tsipouras, P.; Wenstrup, R. J., Ehlers-Danlos syndromes: revised nosology. Am J Med Genet 1998, 77, (1), 31-7.

154. Beighton, P.; Horan, F., Orthopaedic aspects of the Ehlers-Danlos syndrome. J Bone Joint Surg Br 1969, 51, (3), 444-453.

155. Ohtani, O., Three-dimensional organization of the collagen fibrillar framework of the human and rat livers. Archives of histology and cytology 1988, 51, (5), 473-488.

156. Sanes, J. R., The basement membrane/basal lamina of skeletal muscle. Journal of Biological Chemistry 2003, 278, (15), 12601-12604.

157. Wisniewski, J. R.; Zougman, A.; Nagaraj, N.; Mann, M., Universal sample preparation method for proteome analysis. Nature methods 2009, 6, (5), 359.

158. Wiśniewski, J. R.; Zougman, A.; Mann, M., Combination of FASP and StageTip-based fractionation allows in-depth analysis of the hippocampal membrane proteome. Journal of proteome research 2009, 8, (12), 5674-5678.

159. Aggeler, J.; Park, C. S.; Bissell, M. J., Regulation of milk protein and basement membrane gene expression: the influence of the extracellular matrix. J Dairy Sci 1988, 71, (10), 2830-42.

160. Streuli, C. H.; Bailey, N.; Bissell, M. J., Control of mammary epithelial differentiation: basement membrane induces tissue-specific gene expression in the absence of cell-cell interaction and morphological polarity. J Cell Biol 1991, 115, (5), 1383-95.

161. Werb, Z.; Sympson, C. J.; Alexander, C. M.; Thomasset, N.; Lund, L. R.; MacAuley, A.; Ashkenas, J.; Bissell, M. J., Extracellular matrix remodeling and the regulation of epithelial-stromal interactions during differentiation and involution. Kidney Int Suppl 1996, 54, S68-74.

162. Fata, J. E.; Werb, Z.; Bissell, M. J., Regulation of mammary gland branching morphogenesis by the extracellular matrix and its remodeling enzymes. Breast cancer research : BCR 2004, 6, (1), 1-11.

163. Schedin, P.; Mitrenga, T.; McDaniel, S.; Kaeck, M., Mammary ECM composition and function are altered by reproductive state. Mol Carcinog 2004, 41, (4), 207-20.

164. Nelson, C. M.; Vanduijn, M. M.; Inman, J. L.; Fletcher, D. A.; Bissell, M. J., Tissue geometry determines sites of mammary branching morphogenesis in organotypic cultures. Science 2006, 314, (5797), 298-300.

165. Bissell, M. J.; Barcellos-Hoff, M. H., The influence of extracellular matrix on gene expression: is structure the message? J Cell Sci Suppl 1987, 8, 327-43.

166. Barcellos-Hoff, M. H.; Aggeler, J.; Ram, T. G.; Bissell, M. J., Functional differentiation and alveolar morphogenesis of primary mammary cultures on reconstituted basement membrane. Development 1989, 105, (2), 223-35.

167. Howlett, A. R.; Bissell, M. J., The influence of tissue microenvironment (stroma and extracellular matrix) on the development and function of mammary epithelium. Epithelial Cell Biol 1993, 2, (2), 79-89.

168. Weigelt, B.; Ghajar, C. M.; Bissell, M. J., The need for complex 3D culture models to unravel novel pathways and identify accurate biomarkers in breast cancer. Adv Drug Deliv Rev 2014, 69-70, 42-51.

169. Shaw, K. R.; Wrobel, C. N.; Brugge, J. S., Use of three-dimensional basement membrane cultures to model oncogene-induced changes in mammary epithelial morphogenesis. J Mammary Gland Biol Neoplasia 2004, 9, (4), 297-310.

170. Lee, G. Y.; Kenny, P. A.; Lee, E. H.; Bissell, M. J., Three-dimensional culture models of normal and malignant breast epithelial cells. Nat Methods 2007, 4, (4), 359-65.

171. Fischbach, C.; Chen, R.; Matsumoto, T.; Schmelzle, T.; Brugge, J. S.; Polverini, P. J.; Mooney, D. J., Engineering tumors with 3D scaffolds. Nat Methods 2007, 4, (10), 855-60.

172. Krause, S.; Maffini, M. V.; Soto, A. M.; Sonnenschein, C., A novel 3D in vitro culture model to study stromal-epithelial interactions in the mammary gland. Tissue Eng Part C Methods 2008, 14, (3), 261-71.

173. Schmeichel, K. L.; Weaver, V. M.; Bissell, M. J., Structural cues from the tissue microenvironment are essential determinants of the human mammary epithelial cell phenotype. J Mammary Gland Biol Neoplasia 1998, 3, (2), 201-13.

174. DuFort, C. C.; Paszek, M. J.; Weaver, V. M., Balancing forces: architectural control of mechanotransduction. Nature reviews. Molecular cell biology 2011, 12, (5), 308-19.

175. Schedin, P.; Keely, P. J., Mammary gland ECM remodeling, stiffness, and mechanosignaling in normal development and tumor progression. Cold Spring Harb Perspect Biol 2011, 3, (1), a003228.

176. Levental, K. R.; Yu, H.; Kass, L.; Lakins, J. N.; Egeblad, M.; Erler, J. T.; Fong, S. F.; Csiszar, K.; Giaccia, A.; Weninger, W.; Yamauchi, M.; Gasser, D. L.; Weaver, V. M., Matrix crosslinking forces tumor progression by enhancing integrin signaling. Cell 2009, 139, (5), 891-906.

177. Hancox, R. A.; Allen, M. D.; Holliday, D. L.; Edwards, D. R.; Pennington, C. J.; Guttery, D. S.; Shaw, J. A.; Walker, R. A.; Pringle, J. H.; Jones, J. L., Tumour-associated tenascin-C isoforms promote breast cancer cell invasion and growth by matrix metalloproteinase-dependent and independent mechanisms. Breast Cancer Res 2009, 11, (2), R24.

178. Maity, G.; Choudhury, P. R.; Sen, T.; Ganguly, K. K.; Sil, H.; Chatterjee, A., Culture of human breast cancer cell line (MDA-MB-231) on fibronectin-coated surface induces pro-matrix metalloproteinase-9 expression and activity. Tumour Biol 2011, 32, (1), 129-38.

179. Nguyen-Ngoc, K. V.; Cheung, K. J.; Brenot, A.; Shamir, E. R.; Gray, R. S.; Hines, W. C.; Yaswen, P.; Werb, Z.; Ewald, A. J., ECM microenvironment regulates collective migration and local dissemination in normal and malignant mammary epithelium. Proc Natl Acad Sci U S A 2012, 109, (39), E2595-604.

180. Maller, O.; Hansen, K. C.; Lyons, T. R.; Acerbi, I.; Weaver, V. M.; Prekeris, R.; Tan, A. C.; Schedin, P., Collagen architecture in pregnancy-induced protection from breast cancer. J Cell Sci 2013, 126, (Pt 18), 4108-10.

181. Boyd, N. F.; Lockwood, G. A.; Martin, L. J.; Knight, J. A.; Byng, J. W.; Yaffe, M. J.; Tritchler, D. L., Mammographic densities and breast cancer risk. Breast Dis 1998, 10, (3-4), 113-26.

182. Li, T.; Sun, L.; Miller, N.; Nicklee, T.; Woo, J.; Hulse-Smith, L.; Tsao, M. S.; Khokha, R.; Martin, L.; Boyd, N., The association of measured breast tissue characteristics with mammographic density and other risk factors for breast cancer. Cancer Epidemiol Biomarkers Prev 2005, 14, (2), 343-9.

183. Provenzano, P. P.; Inman, D. R.; Eliceiri, K. W.; Knittel, J. G.; Yan, L.; Rueden, C. T.; White, J. G.; Keely, P. J., Collagen density promotes mammary tumor initiation and progression. BMC Med 2008, 6, 11.

184. Barkan, D.; El Touny, L. H.; Michalowski, A. M.; Smith, J. A.; Chu, I.; Davis, A. S.; Webster, J. D.; Hoover, S.; Simpson, R. M.; Gauldie, J.; Green, J. E., Metastatic growth from dormant cells induced by a col-I-enriched fibrotic environment. Cancer Res 2010, 70, (14), 5706-16.

185. Erler, J. T.; Bennewith, K. L.; Cox, T. R.; Lang, G.; Bird, D.; Koong, A.; Le, Q. T.; Giaccia, A. J., Hypoxia-induced lysyl oxidase is a critical mediator of bone marrow cell recruitment to form the premetastatic niche. Cancer Cell 2009, 15, (1), 35-44.

186. Oskarsson, T.; Acharyya, S.; Zhang, X. H.; Vanharanta, S.; Tavazoie, S. F.; Morris, P. G.; Downey, R. J.; Manova-Todorova, K.; Brogi, E.; Massague, J., Breast cancer cells produce tenascin C as a metastatic niche component to colonize the lungs. Nat Med 2011, 17, (7), 867-74.

187. Malanchi, I.; Santamaria-Martinez, A.; Susanto, E.; Peng, H.; Lehr, H. A.; Delaloye, J. F.; Huelsken, J., Interactions between cancer stem cells and their niche govern metastatic colonization. Nature 2012, 481, (7379), 85-9.

188. Ghajar, C. M.; Peinado, H.; Mori, H.; Matei, I. R.; Evason, K. J.; Brazier, H.; Almeida, D.; Koller, A.; Hajjar, K. A.; Stainier, D. Y.; Chen, E. I.; Lyden, D.; Bissell, M. J., The perivascular niche regulates breast tumour dormancy. Nat Cell Biol 2013, 15, (7), 807-17.

189. Costa-Silva, B.; Aiello, N. M.; Ocean, A. J.; Singh, S.; Zhang, H.; Thakur, B. K.; Becker, A.; Hoshino, A.; Mark, M. T.; Molina, H.; Xiang, J.; Zhang, T.; Theilen, T. M.; Garcia-Santos, G.; Williams, C.; Ararso, Y.; Huang, Y.; Rodrigues, G.; Shen, T. L.; Labori, K. J.; Lothe, I. M.; Kure, E. H.; Hernandez, J.; Doussot, A.; Ebbesen, S. H.; Grandgenett, P. M.; Hollingsworth, M. A.; Jain, M.; Mallya, K.; Batra, S. K.; Jarnagin, W. R.; Schwartz, R. E.; Matei, I.; Peinado, H.; Stanger, B. Z.; Bromberg, J.; Lyden, D., Pancreatic cancer exosomes initiate pre-metastatic niche formation in the liver. Nat Cell Biol 2015, 17, (6), 816-26.

190. Kaplan, R. N.; Riba, R. D.; Zacharoulis, S.; Bramley, A. H.; Vincent, L.; Costa, C.; MacDonald, D. D.; Jin, D. K.; Shido, K.; Kerns, S. A.; Zhu, Z.; Hicklin, D.; Wu, Y.; Port, J. L.; Altorki, N.; Port, E. R.; Ruggero, D.; Shmelkov, S. V.; Jensen, K. K.; Rafii, S.; Lyden, D., VEGFR1-positive haematopoietic bone marrow progenitors initiate the pre-metastatic niche. Nature 2005, 438, (7069), 820-7.

191. Burnier, J. V.; Wang, N.; Michel, R. P.; Hassanain, M.; Li, S.; Lu, Y.; Metrakos, P.; Antecka, E.; Burnier, M. N.; Ponton, A.; Gallinger, S.; Brodt, P., Type IV collagen-initiated signals provide survival and growth cues required for liver metastasis. Oncogene 2011, 30, (35), 3766-83.

192. Goto, R.; Nakamura, Y.; Takami, T.; Sanke, T.; Tozuka, Z., Quantitative LC-MS/MS Analysis of Proteins Involved in Metastasis of Breast Cancer. PLoS One 2015, 10, (7), e0130760.

193. McDaniel, S. M.; Rumer, K. K.; Biroc, S. L.; Metz, R. P.; Singh, M.; Porter, W.; Schedin, P., Remodeling of the mammary microenvironment after lactation promotes breast tumor cell metastasis. Am J Pathol 2006, 168, (2), 608-20.

194. Callihan, E. B.; Gao, D.; Jindal, S.; Lyons, T. R.; Manthey, E.; Edgerton, S.; Urquhart, A.; Schedin, P.; Borges, V. F., Postpartum diagnosis demonstrates a high risk for metastasis and merits an expanded definition of pregnancy-associated breast cancer. Breast Cancer Res Treat 2013, 138, (2), 549-59.

195. Johansson, A. L.; Andersson, T. M.; Hsieh, C. C.; Cnattingius, S.; Lambe, M., Increased mortality in women with breast cancer detected during pregnancy and different periods postpartum. Cancer Epidemiol Biomarkers Prev 2011, 20, (9), 1865-72.

196. Stensheim, H.; Moller, B.; van Dijk, T.; Fossa, S. D., Cause-specific survival for women diagnosed with cancer during pregnancy or lactation: a registry-based cohort study. J Clin Oncol 2009, 27, (1), 45-51.

197. Schedin, P., Pregnancy-associated breast cancer and metastasis. Nat Rev Cancer 2006, 6, (4), 281-91.

198. Lyons, T. R.; O'Brien, J.; Borges, V. F.; Conklin, M. W.; Keely, P. J.; Eliceiri, K. W.; Marusyk, A.; Tan, A. C.; Schedin, P., Postpartum mammary gland involution drives progression of ductal carcinoma in situ through collagen and COX-2. Nat Med 2011, 17, (9), 1109-15.

199. Naba, A.; Clauser, K. R.; Whittaker, C. A.; Carr, S. A.; Tanabe, K. K.; Hynes, R. O., Extracellular matrix signatures of human primary metastatic colon cancers and their metastases to liver. BMC Cancer 2014, 14, 518.

200. Naba, A.; Clauser, K. R.; Lamar, J. M.; Carr, S. A.; Hynes, R. O., Extracellular matrix signatures of human mammary carcinoma identify novel metastasis promoters. Elife 2014, 3, e01308.

201. Hansen, K. C.; Kiemele, L.; Maller, O.; O'Brien, J.; Shankar, A.; Fornetti, J.; Schedin, P., An in-solution ultrasonication-assisted digestion method for improved extracellular matrix proteome coverage. Mol Cell Proteomics 2009, 8, (7), 1648-57.

202. Didangelos, A.; Yin, X.; Mandal, K.; Baumert, M.; Jahangiri, M.; Mayr, M., Proteomics characterization of extracellular space components in the human aorta. Mol Cell Proteomics 2010, 9, (9), 2048-62.

203. Hill, R. C.; Calle, E. A.; Dzieciatkowska, M.; Niklason, L. E.; Hansen, K. C., Quantification of Extracellular Matrix Proteins from a Rat Lung Scaffold to Provide a Molecular Readout for Tissue Engineering. Mol Cell Proteomics 2015.

204. Baiocchini, A.; Montaldo, C.; Conigliaro, A.; Grimaldi, A.; Correani, V.; Mura, F.; Ciccosanti, F.; Rotiroti, N.; Brenna, A.; Montalbano, M.; D'Offizi, G.; Capobianchi, M. R.; Alessandro, R.; Piacentini, M.; Schinina, M. E.; Maras, B.; Del Nonno, F.; Tripodi, M.; Mancone, C., Extracellular Matrix Molecular Remodeling in Human Liver Fibrosis Evolution. PLoS One 2016, 11, (3), e0151736.

205. Geiger, T.; Velic, A.; Macek, B.; Lundberg, E.; Kampf, C.; Nagaraj, N.; Uhlen, M.; Cox, J.; Mann, M., Initial quantitative proteomic map of 28 mouse tissues using the SILAC mouse. Mol Cell Proteomics 2013, 12, (6), 1709-22.

206. Bemis, L. T.; Schedin, P., Reproductive state of rat mammary gland stroma modulates human breast cancer cell migration and invasion. Cancer Res 2000, 60, (13), 3414-8.

207. Goddard, E.; Fischer, J.; Schedin, P., A portal vein injection model to study liver metastasis of breast cancer. Journal of Video Experiments 2016, Accepted.

208. Johnson, T. D.; Hill, R. C.; Dzieciatkowska, M.; Nigam, V.; Behfar, A.; Christman, K. L.; Hansen, K. C., Quantification of decellularized human myocardial matrix: A comparison of six patients. Proteomics Clin Appl 2016, 10, (1), 75-83.

209. MacLean, B.; Tomazela, D. M.; Shulman, N.; Chambers, M.; Finney, G. L.; Frewen, B.; Kern, R.; Tabb, D. L.; Liebler, D. C.; MacCoss, M. J., Skyline: an open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics 2010, 26, (7), 966-8.

210. Pratt, J. M.; Simpson, D. M.; Doherty, M. K.; Rivers, J.; Gaskell, S. J.; Beynon, R. J., Multiplexed absolute quantification for proteomics using concatenated signature peptides encoded by QconCAT genes. Nat Protoc 2006, 1, (2), 1029-43.

211. Dennis, G., Jr.; Sherman, B. T.; Hosack, D. A.; Yang, J.; Gao, W.; Lane, H. C.; Lempicki, R. A., DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol 2003, 4, (5), P3.

212. Luzzi, K. J.; MacDonald, I. C.; Schmidt, E. E.; Kerkvliet, N.; Morris, V. L.; Chambers, A. F.; Groom, A. C., Multistep nature of metastatic inefficiency: dormancy of solitary cells after successful extravasation and limited survival of early micrometastases. Am J Pathol 1998, 153, (3), 865-73.

213. Berman, A. T.; Thukral, A. D.; Hwang, W. T.; Solin, L. J.; Vapiwala, N., Incidence and patterns of distant metastases for patients with early-stage breast cancer after breast conservation treatment. Clin Breast Cancer 2013, 13, (2), 88-94.

214. Savci-Heijink, C. D.; Halfwerk, H.; Hooijer, G. K.; Horlings, H. M.; Wesseling, J.; van de Vijver, M. J., Retrospective analysis of metastatic behaviour of breast cancer subtypes. Breast Cancer Res Treat 2015, 150, (3), 547-57.

215. Harrell, J. C.; Prat, A.; Parker, J. S.; Fan, C.; He, X.; Carey, L.; Anders, C.; Ewend, M.; Perou, C. M., Genomic analysis identifies unique signatures predictive of brain, lung, and liver relapse. Breast Cancer Res Treat 2012, 132, (2), 523-35.

216. Wyld, L.; Gutteridge, E.; Pinder, S. E.; James, J. J.; Chan, S. Y.; Cheung, K. L.; Robertson, J. F.; Evans, A. J., Prognostic factors for patients with hepatic metastases from breast cancer. Br J Cancer 2003, 89, (2), 284-90.

217. Tarhan, M. O.; Demir, L.; Somali, I.; Yigit, S.; Erten, C.; Alacacioglu, A.; Ellidokuz, H.; Seseogullari, O.; Kucukzeybek, Y.; Can, A.; Dirican, A.; Bayoglu, V.; Akyol, M., The clinicopathological evaluation of the breast cancer patients with brain metastases: predictors of survival. Clin Exp Metastasis 2013, 30, (2), 201-13.

218. Tseng, L. M.; Hsu, N. C.; Chen, S. C.; Lu, Y. S.; Lin, C. H.; Chang, D. Y.; Li, H.; Lin, Y. C.; Chang, H. K.; Chao, T. C.; Ouyang, F.; Hou, M. F., Distant metastasis in triple-negative breast cancer. Neoplasma 2013, 60, (3), 290-4.

219. Mouw, J. K.; Ou, G.; Weaver, V. M., Extracellular matrix assembly: a multiscale deconstruction. Nat Rev Mol Cell Biol 2014, 15, (12), 771-85.

220. Zetter, B. R., The cellular basis of site-specific tumor metastasis. New England Journal of Medicine 1990, 322, (9), 605-612.

221. Martinson, H. A.; Jindal, S.; Durand-Rougely, C.; Borges, V. F.; Schedin, P., Wound healing-like immune program facilitates postpartum mammary gland involution and tumor progression. Int J Cancer 2014.

222. Lund, L. R.; Romer, J.; Thomasset, N.; Solberg, H.; Pyke, C.; Bissell, M. J.; Dano, K.; Werb, Z., Two distinct phases of apoptosis in mammary gland involution: proteinase-independent and -dependent pathways. Development 1996, 122, (1), 181-93.

223. O'Brien, J. H.; Vanderlinden, L. A.; Schedin, P. J.; Hansen, K. C., Rat mammary extracellular matrix composition and response to ibuprofen treatment during postpartum involution by differential GeLC-MS/MS analysis. J Proteome Res 2012, 11, (10), 4894-905.

224. Kadler, K. E.; Hill, A.; Canty-Laird, E. G., Collagen fibrillogenesis: fibronectin, integrins, and minor collagens as organizers and nucleators. Current opinion in cell biology 2008, 20, (5), 495-501.

225. Schedin, P.; Strange, R.; Mitrenga, T.; Wolfe, P.; Kaeck, M., Fibronectin fragments induce MMP activity in mouse mammary epithelial cells: evidence for a role in mammary tissue remodeling. J Cell Sci 2000, 113, (5), 795-806.

226. O'Brien, J.; Lyons, T.; Monks, J.; Lucia, M. S.; Wilson, R. S.; Hines, L.; Man, Y. G.; Borges, V.; Schedin, P., Alternatively activated macrophages and collagen remodeling characterize the postpartum involuting mammary gland across species. Am J Pathol 2010, 176, (3), 1241-55.

227. Iyengar, P.; Espina, V.; Williams, T. W.; Lin, Y.; Berry, D.; Jelicks, L. A.; Lee, H.; Temple, K.; Graves, R.; Pollard, J.; Chopra, N.; Russell, R. G.; Sasisekharan, R.; Trock, B. J.; Lippman, M.; Calvert, V. S.; Petricoin, E. F., 3rd; Liotta, L.; Dadachova, E.; Pestell, R. G.; Lisanti, M. P.; Bonaldo, P.; Scherer, P. E., Adipocyte-derived collagen VI affects early mammary tumor progression in vivo, demonstrating a critical interaction in the tumor/stroma microenvironment. J Clin Invest 2005, 115, (5), 1163-76.

228. Ioachim, E.; Charchanti, A.; Briasoulis, E.; Karavasilis, V.; Tsanou, H.; Arvanitis, D. L.; Agnantis, N. J.; Pavlidis, N., Immunohistochemical expression of extracellular matrix components tenascin, fibronectin, collagen type IV and laminin in breast cancer: their prognostic value and role in tumour invasion and progression. Eur J Cancer 2002, 38, (18), 2362-70.

229. Yee, K. O.; Connolly, C. M.; Duquette, M.; Kazerounian, S.; Washington, R.; Lawler, J., The effect of thrombospondin-1 on breast cancer metastasis. Breast Cancer Res Treat 2009, 114, (1), 85-96.

230. Zhang, H.; Luo, M.; Liang, X.; Wang, D.; Gu, X.; Duan, C.; Gu, H.; Chen, G.; Zhao, X.; Zhao, Z.; Liu, C., Galectin-3 as a marker and potential therapeutic target in breast cancer. PLoS One 2014, 9, (9), e103482.

231. Karagiannis, G. S.; Petraki, C.; Prassas, I.; Saraon, P.; Musrap, N.; Dimitromanolakis, A.; Diamandis, E. P., Proteomic signatures of the desmoplastic invasion front reveal collagen type XII as a marker of myofibroblastic differentiation during colorectal cancer metastasis. Oncotarget 2012, 3, (3), 267-85.

232. Yen, T. Y.; Haste, N.; Timpe, L. C.; Litsakos-Cheung, C.; Yen, R.; Macher, B. A., Using a cell line breast cancer progression system to identify biomarker candidates. J Proteomics 2014, 96, 173-83.

233. Lai, K. K.; Kolippakkam, D.; Beretta, L., Comprehensive and quantitative proteome profiling of the mouse liver and plasma. Hepatology 2008, 47, (3), 1043-51.

234. Lai, K. K.; Shang, S.; Lohia, N.; Booth, G. C.; Masse, D. J.; Fausto, N.; Campbell, J. S.; Beretta, L., Extracellular matrix dynamics in hepatocarcinogenesis: a comparative proteomics study of PDGFC transgenic and Pten null mouse models. PLoS Genet 2011, 7, (6), e1002147.

235. Da Costa, G. G.; Gomig, T. H.; Kaviski, R.; Santos Sousa, K.; Kukolj, C.; De Lima, R. S.; De Andrade Urban, C.; Cavalli, I. J.; Ribeiro, E. M., Comparative Proteomics of Tumor and Paired Normal Breast Tissue Highlights Potential Biomarkers in Breast Cancer. Cancer Genomics Proteomics 2015, 12, (5), 251-61.

236. Moreira, J. M.; Cabezon, T.; Gromova, I.; Gromov, P.; Timmermans-Wielenga, V.; Machado, I.; Llombart-Bosch, A.; Kroman, N.; Rank, F.; Celis, J. E., Tissue proteomics of the human mammary gland: towards an abridged definition of the molecular phenotypes underlying epithelial normalcy. Mol Oncol 2010, 4, (6), 539-61.

237. Johns, P. C.; Yaffe, M. J., X-ray characterisation of normal and neoplastic breast tissues. Physics in medicine and biology 1987, 32, (6), 675.

238. Clemons, M.; Goss, P., Estrogen and the risk of breast cancer. New England Journal of Medicine 2001, 344, (4), 276-285.

239. Chen, Z.; Wu, A. H.; Gauderman, W. J.; Bernstein, L.; Ma, H.; Pike, M. C.; Ursin, G., Does mammographic density reflect ethnic differences in breast cancer incidence rates? American journal of epidemiology 2004, 159, (2), 140-147.

240. Byrne, C.; Schairer, C.; Wolfe, J.; Parekh, N.; Salane, M.; Brinton, L. A.; Hoover, R.; Haile, R., Mammographic features and breast cancer risk: effects with time, age, and menopause status. JNCI: Journal of the National Cancer Institute 1995, 87, (21), 1622-1629.

241. Boyd, N. F.; Lockwood, G. A.; Byng, J. W.; Tritchler, D. L.; Yaffe, M. J., Mammographic densities and breast cancer risk. Cancer Epidemiology Biomarkers & Prevention 1998, 7, (12), 1133-1144.

242. Boyd, N. F.; Rommens, J. M.; Vogt, K.; Lee, V.; Hopper, J. L.; Yaffe, M. J.; Paterson, A. D., Mammographic breast density as an intermediate phenotype for breast cancer. The lancet oncology 2005, 6, (10), 798-808.

243. Wolfe, J. N.; Saftlas, A. F.; Salane, M., Mammographic parenchymal patterns and quantitative evaluation of mammographic densities: a case-control study. American Journal of Roentgenology 1987, 148, (6), 1087-1092.

244. Dupont, W. D.; Page, D. L., Risk factors for breast cancer in women with proliferative breast disease. New England Journal of Medicine 1985, 312, (3), 146-151.

245. Dawson, D. A.; Thompson, G. B., Breast cancer risk factors and screening; United States, 1987. 1990.

246. Whitehead, J.; Carlile, T.; Kopecky, K. J.; Thompson, D. J.; Gilbert, F. I.; Present, A. J.; Threatt, B. A.; Krook, P.; Hadaway, E., Wolfe mammographic parenchymal patterns. A study of the masking hypothesis of Egan and Mosteller. Cancer 1985, 56, (6), 1280-1286.

247. Ng, M. R.; Brugge, J. S., A stiff blow from the stroma: collagen crosslinking drives tumor progression. Cancer cell 2009, 16, (6), 455-457.

248. Yoshimura, K.; Takeuchi, K.; Nagasaki, K.; Ogishima, S.; Tanaka, H.; Iwase, T.; Akiyama, F.; Kuroda, Y.; Miki, Y., Prognostic value of matrix Gla protein in breast cancer. Molecular medicine reports 2009, 2, (4), 549-553.

249. Casey, T.; Bond, J.; Tighe, S.; Hunter, T.; Lintault, L.; Patel, O.; Eneman, J.; Crocker, A.; White, J.; Tessitore, J., Molecular signatures suggest a major role for stromal cells in development of invasive breast cancer. Breast cancer research and treatment 2009, 114, (1), 47-62.

250. Martinson, H. A.; Lyons, T. R.; Giles, E. D.; Borges, V. F.; Schedin, P., Developmental windows of breast cancer risk provide opportunities for targeted chemoprevention. Experimental cell research 2013, 319, (11), 1671-1678.

251. Moos, R. H., Typology of menstrual cycle symptoms. American Journal of Obstetrics and Gynecology 1969, 103, (3), 390-402.

252. Toriola, A. T.; Dang, H. X.; Hagemann, I. S.; Appleton, C. M.; Colditz, G. A.; Luo, J.; Maher, C. A., Increased breast tissue receptor activator of nuclear factor-κB ligand (RANKL) gene expression is associated with higher mammographic density in premenopausal women. Oncotarget 2017.

253. Mammoto, T.; Ingber, D. E., Mechanical control of tissue and organ development. Development 2010, 137, (9), 1407-1420.

254. Yamauchi, M.; Shiiba, M., Lysine hydroxylation and cross-linking of collagen. Post-translational Modifications of Proteins 2008, 95-108.

255. Yamauchi, M.; Sricholpech, M., Lysine post-translational modifications of collagen. Essays in biochemistry 2012, 52, 113-133.

256. Gilkes, D. M.; Bajpai, S.; Wong, C. C.; Chaturvedi, P.; Hubbi, M. E.; Wirtz, D.; Semenza, G. L., Procollagen lysyl hydroxylase 2 is essential for hypoxia-induced breast cancer metastasis. Molecular cancer research 2013, 11, (5), 456-466.

257. Schietke, R.; Warnecke, C.; Wacker, I.; Schödel, J.; Mole, D. R.; Campean, V.; Amann, K.; Goppelt-Struebe, M.; Behrens, J.; Eckardt, K.-U., The lysyl oxidases LOX and LOXL2 are necessary and sufficient to repress E-cadherin in hypoxia: insights into cellular transformation processes mediated by HIF-1. Journal of Biological Chemistry 2010, 285, (9), 6658-6669.

258. Chen, Y.; Terajima, M.; Yang, Y.; Sun, L.; Ahn, Y. H.; Pankova, D.; Puperi, D. S.; Watanabe, T.; Kim, M. P.; Blackmon, S. H.; Rodriguez, J.; Liu, H.; Behrens, C.; Wistuba, II; Minelli, R.; Scott, K. L.; Sanchez-Adams, J.; Guilak, F.; Pati, D.; Thilaganathan, N.; Burns, A. R.; Creighton, C. J.; Martinez, E. D.; Zal, T.; Grande-Allen, K. J.; Yamauchi, M.; Kurie, J. M., Lysyl hydroxylase 2 induces a collagen cross-link switch in tumor stroma. J Clin Invest 2015, 125, (3), 1147-62.

259. Takaluoma, K.; Lantto, J.; Myllyharju, J., Lysyl hydroxylase 2 is a specific telopeptide hydroxylase, while all three isoenzymes hydroxylate collagenous sequences. Matrix biology 2007, 26, (5), 396-403.

260. van der Slot, A. J.; van Dura, E. A.; de Wit, E. C.; DeGroot, J.; Huizinga, T. W.; Bank, R. A.; Zuurmond, A.-M., Elevated formation of pyridinoline cross-links by profibrotic cytokines is associated with enhanced lysyl hydroxylase 2b levels. Biochimica et Biophysica Acta (BBA)-Molecular Basis of Disease 2005, 1741, (1), 95-102.

261. van der Slot, A. J.; Zuurmond, A.-M.; van den Bogaerdt, A. J.; Ulrich, M. M.; Middelkoop, E.; Boers, W.; Ronday, H. K.; DeGroot, J.; Huizinga, T. W.; Bank, R. A., Increased formation of pyridinoline cross-links due to higher telopeptide lysyl hydroxylase levels is a general fibrotic phenomenon. Matrix biology 2004, 23, (4), 251-257.

262. Pickup, M. W.; Laklai, H.; Acerbi, I.; Owens, P.; Gorska, A. E.; Chytil, A.; Aakre, M.; Weaver, V. M.; Moses, H. L., Stromally derived lysyl oxidase promotes metastasis of transforming growth factor-beta-deficient mouse mammary carcinomas. Cancer Res 2013, 73, (17), 5336-46.

263. Weston, L. A.; Hummon, A. B., Comparative LC-MS/MS analysis of optimal cutting temperature (OCT) compound removal for the study of mammalian proteomes. Analyst 2013, 138, (21), 6380-6384.

264. Clasquin, M. F.; Melamud, E.; Rabinowitz, J. D., LC-MS Data Processing with MAVEN: A Metabolomic Analysis and Visualization Engine. Current Protocols in Bioinformatics 2012, 14.11.1-14.11.23.

265. Chang, J. M.; Moon, W. K.; Cho, N.; Yi, A.; Koo, H. R.; Han, W.; Noh, D.-Y.; Moon, H.-G.; Kim, S. J., Clinical application of shear wave elastography (SWE) in the diagnosis of benign and malignant breast diseases. Breast cancer research and treatment 2011, 129, (1), 89-97.

266. Pinkard, H.; Stuurman, N.; Corbin, K.; Vale, R.; Krummel, M. F., Micro-Magellan: open-source, sample-adaptive, acquisition software for optical microscopy. Nature methods 2016, 13, (10), 807-809.

267. Berglund, G.; Elmståhl, S.; Janzon, L.; Larsson, S., Design and feasibility. Journal of internal medicine 1993, 233, (1), 45-51.

268. Manjer, J.; Carlsson, S.; Elmståhl, S.; Gullberg, B.; Janzon, L.; Lindström, M.; Mattisson, I.; Berglund, G., The Malmö diet and cancer study: representativity, cancer incidence and mortality in participants and non-participants. European Journal of Cancer Prevention 2001, 10, (6), 489-499.

269. Cohen, D. A.; Dabbs, D. J.; Cooper, K. L.; Amin, M.; Jones, T. E.; Jones, M. W.; Chivukula, M.; Trucco, G. A.; Bhargava, R., Interobserver agreement among pathologists for semiquantitative hormone receptor scoring in breast carcinoma. American journal of clinical pathology 2012, 138, (6), 796-802.

270. Oxlund, H.; Barckman, M.; Ørtoft, G.; Andreassen, T., Reduced concentrations of collagen cross-links are associated with reduced strength of bone. Bone 1995, 17, (4), S365-S371.

271. Chen, Y.; Terajima, M.; Yang, Y.; Sun, L.; Ahn, Y.-H.; Pankova, D.; Puperi, D. S.; Watanabe, T.; Kim, M. P.; Blackmon, S. H., Lysyl hydroxylase 2 induces a collagen cross-link switch in tumor stroma. The Journal of clinical investigation 2015, 125, (3), 1147.

272. Kalluri, R.; Zeisberg, M., Fibroblasts in cancer. Nature Reviews Cancer 2006, 6, (5), 392-401.

273. Yau, C.; Esserman, L.; Moore, D. H.; Waldman, F.; Sninsky, J.; Benz, C. C., A multigene predictor of metastatic outcome in early stage hormone receptor-negative and triple-negative breast cancer. Breast cancer research 2010, 12, (5), R85.

274. Szász, A. M.; Lánczky, A.; Nagy, Á.; Förster, S.; Hark, K.; Green, J. E.; Boussioutas, A.; Busuttil, R.; Szabó, A.; Győrffy, B., Cross-validation of survival associated biomarkers in gastric cancer using transcriptomic data of 1,065 patients. Oncotarget 2016, 7, (31), 49322-49333.

275. Ryan, D. P.; Hong, T. S.; Bardeesy, N., Pancreatic adenocarcinoma. New England Journal of Medicine 2014, 371, (11), 1039-1049.

276. Chauhan, V. P.; Boucher, Y.; Ferrone, C. R.; Roberge, S.; Martin, J. D.; Stylianopoulos, T.; Bardeesy, N.; DePinho, R. A.; Padera, T. P.; Munn, L. L., Compression of pancreatic tumor blood vessels by hyaluronan is caused by solid stress and not interstitial fluid pressure. Cancer cell 2014, 26, (1), 14.

277. Swartz, M. A.; Lund, A. W., Lymphatic and interstitial flow in the tumour microenvironment: linking mechanobiology with immunity. Nature reviews. Cancer 2012, 12, (3), 210.

278. Yu, M.; Tannock, I. F., Targeting tumor architecture to favor drug penetration: a new weapon to combat chemoresistance in pancreatic cancer? Cancer cell 2012, 21, (3), 327-329.

279. Neesse, A.; Krug, S.; Gress, T. M.; Tuveson, D. A.; Michl, P., Emerging concepts in pancreatic cancer medicine: targeting the tumor stroma. OncoTargets and therapy 2014, 7, 33.

280. Olive, K. P.; Jacobetz, M. A.; Davidson, C. J.; Gopinathan, A.; McIntyre, D.; Honess, D.; Madhu, B.; Goldgraben, M. A.; Caldwell, M. E.; Allard, D., Inhibition of Hedgehog signaling enhances delivery of chemotherapy in a mouse model of pancreatic cancer. Science 2009, 324, (5933), 1457-1461.

281. Chauhan, V. P.; Martin, J. D.; Liu, H.; Lacorre, D. A.; Jain, S. R.; Kozin, S. V.; Stylianopoulos, T.; Mousa, A. S.; Han, X.; Adstamongkonkul, P., Angiotensin inhibition enhances drug delivery and potentiates chemotherapy by decompressing tumour blood vessels. Nature communications 2013, 4.

282. Provenzano, P. P.; Cuevas, C.; Chang, A. E.; Goel, V. K.; Von Hoff, D. D.; Hingorani, S. R., Enzymatic targeting of the stroma ablates physical barriers to treatment of pancreatic ductal adenocarcinoma. Cancer cell 2012, 21, (3), 418-429.

283. Rosow, D. E.; Liss, A. S.; Strobel, O.; Fritz, S.; Bausch, D.; Valsangkar, N. P.; Alsina, J.; Kulemann, B.; Park, J. K.; Yamaguchi, J., Sonic Hedgehog in pancreatic cancer: from bench to bedside, then back to the bench. Surgery 2012, 152, (3 Suppl 1), S19.

284. Rhim, A. D.; Oberstein, P. E.; Thomas, D. H.; Mirek, E. T.; Palermo, C. F.; Sastra, S. A.; Dekleva, E. N.; Saunders, T.; Becerra, C. P.; Tattersall, I. W., Stromal elements act to restrain, rather than support, pancreatic ductal adenocarcinoma. Cancer cell 2014, 25, (6), 735-747.

285. Cheung, K. J.; Gabrielson, E.; Werb, Z.; Ewald, A. J., Collective invasion in breast cancer requires a conserved basal epithelial program. Cell 2013, 155, (7), 1639-1651.

286. Hoadley, K. A.; Yau, C.; Wolf, D. M.; Cherniack, A. D.; Tamborero, D.; Ng, S.; Leiserson, M. D.; Niu, B.; McLellan, M. D.; Uzunangelov, V., Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin. Cell 2014, 158, (4), 929-944.

287. Collisson, E. A.; Sadanandam, A.; Olson, P.; Gibb, W. J.; Truitt, M.; Gu, S.; Cooc, J.; Weinkle, J.; Kim, G. E.; Jakkula, L., Subtypes of pancreatic ductal adenocarcinoma and their differing responses to therapy. Nature medicine 2011, 17, (4), 500-503.

288. Cancer Genome Atlas Research Network, Comprehensive molecular profiling of lung adenocarcinoma. Nature 2014, 511, (7511), 543.

289. Pérez-Mancera, P. A.; Guerra, C.; Barbacid, M.; Tuveson, D. A., What we have learned about pancreatic cancer from mouse models. Gastroenterology 2012, 142, (5), 1079-1092.

290. Hingorani, S. R.; Wang, L.; Multani, A. S.; Combs, C.; Deramaudt, T. B.; Hruban, R. H.; Rustgi, A. K.; Chang, S.; Tuveson, D. A., Trp53R172H and KrasG12D cooperate to promote chromosomal instability and widely metastatic pancreatic ductal adenocarcinoma in mice. Cancer cell 2005, 7, (5), 469-483.

291. Aguirre, A. J.; Bardeesy, N.; Sinha, M.; Lopez, L.; Tuveson, D. A.; Horner, J.; Redston, M. S.; DePinho, R. A., Activated Kras and Ink4a/Arf deficiency cooperate to produce metastatic pancreatic ductal adenocarcinoma. Genes & development 2003, 17, (24), 3112-3126.

292. Bardeesy, N.; Cheng, K.-h.; Berger, J. H.; Chu, G. C.; Pahler, J.; Olson, P.; Hezel, A. F.; Horner, J.; Lauwers, G. Y.; Hanahan, D., Smad4 is dispensable for normal pancreas development yet critical in progression and tumor biology of pancreas cancer. Genes & development 2006, 20, (22), 3130-3146.

293. Kabashima, A.; Higuchi, H.; Takaishi, H.; Matsuzaki, Y.; Suzuki, S.; Izumiya, M.; Iizuka, H.; Sakai, G.; Hozawa, S.; Azuma, T., Side population of pancreatic cancer cells predominates in TGF-β-mediated epithelial to mesenchymal transition and invasion. International journal of cancer 2009, 124, (12), 2771-2779.

294. Izeradjene, K.; Combs, C.; Best, M.; Gopinathan, A.; Wagner, A.; Grady, W. M.; Deng, C.-X.; Hruban, R. H.; Adsay, N. V.; Tuveson, D. A., KrasG12D and Smad4/Dpc4 haploinsufficiency cooperate to induce mucinous cystic neoplasms and invasive adenocarcinoma of the pancreas. Cancer cell 2007, 11, (3), 229-243.

295. Izumchenko, E.; Chang, X.; Michailidi, C.; Kagohara, L.; Ravi, R.; Paz, K.; Brait, M.; Hoque, M. O.; Ling, S.; Bedi, A., The TGFβ-miR200-MIG6 pathway orchestrates the EMT-associated kinase switch that induces resistance to EGFR inhibitors. Cancer research 2014, 74, (14), 3995-4005.

296. Rhim, A. D.; Mirek, E. T.; Aiello, N. M.; Maitra, A.; Bailey, J. M.; McAllister, F.; Reichert, M.; Beatty, G. L.; Rustgi, A. K.; Vonderheide, R. H., EMT and dissemination precede pancreatic tumor formation. Cell 2012, 148, (1), 349-361.

297. Herreros-Villanueva, M.; Zhang, J.; Koenig, A.; Abel, E.; Smyrk, T. C.; Bamlet, W.; De Narvajas, A. A.; Gomez, T.; Simeone, D.; Bujanda, L., SOX2 promotes dedifferentiation and imparts stem cell-like features to pancreatic cancer cells. Oncogenesis 2013, 2, (8), e61.

298. Jiang, J.; Li, Z.; Yu, C.; Chen, M.; Tian, S.; Sun, C., MiR-1181 inhibits stem cell-like phenotypes and suppresses SOX2 and STAT3 in human pancreatic cancer. Cancer letters 2015, 356, (2), 962-970.

299. Shao, D. D.; Xue, W.; Krall, E. B.; Bhutkar, A.; Piccioni, F.; Wang, X.; Schinzel, A. C.; Sood, S.; Rosenbluh, J.; Kim, J. W., KRAS and YAP1 converge to regulate EMT and tumor survival. Cell 2014, 158, (1), 171-184.

300. Yimlamai, D.; Christodoulou, C.; Galli, G. G.; Yanger, K.; Pepe-Mooney, B.; Gurung, B.; Shrestha, K.; Cahan, P.; Stanger, B. Z.; Camargo, F. D., Hippo pathway activity influences liver cell fate. Cell 2014, 157, (6), 1324-1338.

301. Dupont, S.; Morsut, L.; Aragona, M.; Enzo, E.; Giulitti, S.; Cordenonsi, M.; Zanconato, F.; Le Digabel, J.; Forcato, M.; Bicciato, S., Role of YAP/TAZ in mechanotransduction. Nature 2011, 474, (7350), 179.

302. Yu, F.-X.; Zhao, B.; Panupinthu, N.; Jewell, J. L.; Lian, I.; Wang, L. H.; Zhao, J.; Yuan, H.; Tumaneng, K.; Li, H., Regulation of the Hippo-YAP pathway by G-protein-coupled receptor signaling. Cell 2012, 150, (4), 780-791.

303. Paszek, M. J.; Zahir, N.; Johnson, K. R.; Lakins, J. N.; Rozenberg, G. I.; Gefen, A.; Reinhart-King, C. A.; Margulies, S. S.; Dembo, M.; Boettiger, D., Tensional homeostasis and the malignant phenotype. Cancer cell 2005, 8, (3), 241-254.

304. Calvo, F.; Ege, N.; Grande-Garcia, A.; Hooper, S.; Jenkins, R. P.; Chaudhry, S. I.; Harrington, K.; Williamson, P.; Moeendarbary, E.; Charras, G., Mechanotransduction and YAP-dependent matrix remodelling is required for the generation and maintenance of cancer-associated fibroblasts. Nature cell biology 2013, 15, (6).

305. Samuel, M. S.; Lopez, J. I.; McGhee, E. J.; Croft, D. R.; Strachan, D.; Timpson, P.; Munro, J.; Schröder, E.; Zhou, J.; Brunton, V. G., Actomyosin-mediated cellular tension drives increased tissue stiffness and β-catenin activation to induce epidermal hyperplasia and tumor growth. Cancer cell 2011, 19, (6), 776-791.

306. Hingorani, S. R.; Petricoin, E. F.; Maitra, A.; Rajapakse, V.; King, C.; Jacobetz, M. A.; Ross, S.; Conrads, T. P.; Veenstra, T. D.; Hitt, B. A., Preinvasive and invasive ductal pancreatic cancer and its early detection in the mouse. Cancer cell 2003, 4, (6), 437-450.

307. Mouw, J. K.; Yui, Y.; Damiano, L.; Bainer, R. O.; Lakins, J. N.; Acerbi, I.; Ou, G.; Wijekoon, A. C.; Levental, K. R.; Gilbert, P. M., Tissue mechanics modulate microRNA-dependent PTEN expression to regulate malignant progression. Nature medicine 2014, 20, (4), 360-367.

308. Musteanu, M.; Blaas, L.; Mair, M.; Schlederer, M.; Bilban, M.; Tauber, S.; Esterbauer, H.; Mueller, M.; Casanova, E.; Kenner, L., Stat3 is a negative regulator of intestinal tumor progression in Apc min mice. Gastroenterology 2010, 138, (3), 1003-1011.e5.

309. Li, R.; Mitra, N.; Gratkowski, H.; Vilaire, G.; Litvinov, R.; Nagasami, C.; Weisel, J. W.; Lear, J. D.; DeGrado, W. F.; Bennett, J. S., Activation of integrin αIIbβ3 by modulation of transmembrane helix associations. Science 2003, 300, (5620), 795-798.

310. Lopez, J. I.; Kang, I.; You, W.-K.; McDonald, D. M.; Weaver, V. M., In situ force mapping of mammary gland transformation. Integrative Biology 2011, 3, (9), 910-921.

311. Pickup, M. W.; Laklai, H.; Acerbi, I.; Owens, P.; Gorska, A. E.; Chytil, A.; Aakre, M.; Weaver, V. M.; Moses, H. L., Stromally derived lysyl oxidase promotes metastasis of transforming growth factor-β-deficient mouse mammary carcinomas. Cancer research 2013, 73, (17), 5336-5346.

312. Sanz-Moreno, V.; Gaggioli, C.; Yeo, M.; Albrengues, J.; Wallberg, F.; Viros, A.; Hooper, S.; Mitter, R.; Féral, C. C.; Cook, M., ROCK and JAK1 signaling cooperate to control actomyosin contractility in tumor cells and stroma. Cancer cell 2011, 20, (2), 229-245.

313. Fukuda, A.; Wang, S. C.; Morris, J. P.; Folias, A. E.; Liou, A.; Kim, G. E.; Akira, S.; Boucher, K. M.; Firpo, M. A.; Mulvihill, S. J., Stat3 and MMP7 contribute to pancreatic ductal adenocarcinoma initiation and progression. Cancer cell 2011, 19, (4), 441-455.

314. Ijichi, H.; Chytil, A.; Gorska, A. E.; Aakre, M. E.; Bierie, B.; Tada, M.; Mohri, D.; Miyabayashi, K.; Asaoka, Y.; Maeda, S., Inhibiting Cxcr2 disrupts tumor-stromal interactions and improves survival in a mouse model of pancreatic ductal adenocarcinoma. The Journal of clinical investigation 2011, 121, (10), 4106.

315. Kapoor, A.; Yao, W.; Ying, H.; Hua, S.; Liewen, A.; Wang, Q.; Zhong, Y.; Wu, C.-J.; Sadanandam, A.; Hu, B., Yap1 activation enables bypass of oncogenic Kras addiction in pancreatic cancer. Cell 2014, 158, (1), 185-197.

316. Moroishi, T.; Hansen, C. G.; Guan, K.-L., The emerging roles of YAP and TAZ in cancer. Nature reviews. Cancer 2015, 15, (2), 73.

317. Hezel, A. F.; Kimmelman, A. C.; Stanger, B. Z.; Bardeesy, N.; DePinho, R. A., Genetics and biology of pancreatic ductal adenocarcinoma. Genes & development 2006, 20, (10), 1218-1249.

318. Baetens, D.; Malaisse-Lagae, F.; Perrelet, A.; Orci, L., Endocrine pancreas: three-dimensional reconstruction shows two types of islets of Langerhans. Science 1979, 206, (4424), 1323-1325.

319. Siegel, R. L.; Miller, K. D.; Jemal, A., Cancer statistics, 2015. CA: a cancer journal for clinicians 2015, 65, (1), 5-29.

320. DeSantis, C. E.; Lin, C. C.; Mariotto, A. B.; Siegel, R. L.; Stein, K. D.; Kramer, J. L.; Alteri, R.; Robbins, A. S.; Jemal, A., Cancer treatment and survivorship statistics, 2014. CA: a cancer journal for clinicians 2014, 64, (4), 252-271.

321. Swartz, M. A.; Lund, A. W., Lymphatic and interstitial flow in the tumour microenvironment: linking mechanobiology with immunity. Nature Reviews Cancer 2012, 12, (3), 210-219.

322. Erkan, M.; Hausmann, S.; Michalski, C. W.; Fingerle, A. A.; Dobritz, M.; Kleeff, J.; Friess, H., The role of stroma in pancreatic cancer: diagnostic and therapeutic implications. Nature Reviews Gastroenterology and Hepatology 2012, 9, (8), 454-467.

323. Erkan, M.; Reiser-Erkan, C.; Michalski, C. W.; Deucker, S.; Sauliunaite, D.; Streit, S.; Esposito, I.; Friess, H.; Kleeff, J., Cancer-stellate cell interactions perpetuate the hypoxia-fibrosis cycle in pancreatic ductal adenocarcinoma. Neoplasia 2009, 11, (5), 497-508.

324. Xie, D.; Xie, K., Pancreatic cancer stromal biology and therapy. Genes & diseases 2015, 2, (2), 133-143.

325. Serini, G.; Bochaton-Piallat, M.-L.; Ropraz, P.; Geinoz, A.; Borsi, L.; Zardi, L.; Gabbiani, G., The fibronectin domain ED-A is crucial for myofibroblastic phenotype induction by transforming growth factor-β1. The Journal of cell biology 1998, 142, (3), 873-881.

326. Nielsen, M. F. B.; Mortensen, M. B.; Detlefsen, S., Key players in pancreatic cancer-stroma interaction: Cancer-associated fibroblasts, endothelial and inflammatory cells. World journal of gastroenterology 2016, 22, (9), 2678.

327. Sottile, J.; Hocking, D. C., Fibronectin polymerization regulates the composition and stability of extracellular matrix fibrils and cell-matrix adhesions. Molecular biology of the cell 2002, 13, (10), 3546-3559.

328. Cardoso, I.; Østerlund, E. C.; Stamnaes, J.; Iversen, R.; Andersen, J. T.; Jørgensen, T. J.; Sollid, L. M., Dissecting the interaction between transglutaminase 2 and fibronectin. Amino acids 2017, 49, (3), 489-500.

329. Akimov, S. S.; Krylov, D.; Fleischman, L. F.; Belkin, A. M., Tissue transglutaminase is an integrin-binding adhesion coreceptor for fibronectin. The Journal of cell biology 2000, 148, (4), 825-838.

330. Sethi, T.; Rintoul, R. C.; Moore, S. M.; MacKinnon, A. C.; Salter, D.; Choo, C.; Chilvers, E. R.; Dransfield, I.; Donnelly, S. C.; Strieter, R., Extracellular matrix proteins protect small cell lung cancer cells against apoptosis: a mechanism for small cell lung cancer growth and drug resistance in vivo. Nature medicine 1999, 5, (6).

331. Oskarsson, T.; Acharyya, S.; Zhang, X. H.; Vanharanta, S.; Tavazoie, S. F.; Morris, P. G.; Downey, R. J.; Manova-Todorova, K.; Brogi, E.; Massagué, J., Breast cancer cells produce tenascin C as a metastatic niche component to colonize the lungs. Nature medicine 2011, 17, (7), 867-874.

332. Juuti, A.; Nordling, S.; Louhimo, J.; Lundin, J.; Haglund, C., Tenascin C expression is upregulated in pancreatic cancer and correlates with differentiation. Journal of clinical pathology 2004, 57, (11), 1151-1155.

333. Paron, I.; Berchtold, S.; Vörös, J.; Shamarla, M.; Erkan, M.; Höfler, H.; Esposito, I., Tenascin-C enhances pancreatic cancer cell growth and motility and affects cell adhesion through activation of the integrin pathway. PloS one 2011, 6, (6), e21684.

334. Wilhelm, M.; Schlegl, J.; Hahne, H.; Gholami, A. M.; Lieberenz, M.; Savitski, M. M.; Ziegler, E.; Butzmann, L.; Gessulat, S.; Marx, H., Mass-spectrometry-based draft of the human proteome. Nature 2014, 509, (7502), 582.

335. Kim, M.-S.; Pinto, S. M.; Getnet, D.; Nirujogi, R. S.; Manda, S. S.; Chaerkady, R.; Madugundu, A. K.; Kelkar, D. S.; Isserlin, R.; Jain, S., A draft map of the human proteome. Nature 2014, 509, (7502), 575-581.

336. Costa-Silva, B.; Aiello, N. M.; Ocean, A. J.; Singh, S.; Zhang, H.; kumar Thakur, B.; Becker, A.; Hoshino, A.; Mark, M. T.; Molina, H., Pancreatic cancer exosomes initiate pre-metastatic niche formation in the liver. Nature cell biology 2015, 17, (6), 816.

337. DiPietro, L. A., Wound healing: the role of the macrophage and other immune cells. Shock 1995, 4, (4), 233-240.

338. Karagiannis, G. S.; Petraki, C.; Prassas, I.; Saraon, P.; Musrap, N.; Dimitromanolakis, A.; Diamandis, E. P., Proteomic signatures of the desmoplastic invasion front reveal collagen type XII as a marker of myofibroblastic differentiation during colorectal cancer metastasis. Oncotarget 2012, 3, (3), 267.

APPENDIX A

CROSSLINKED AMINO ACID STANDARD CHARACTERIZATION

Appendix Table A.1: Summary of xAA Standard Characterization. DHLNL and LNL standards were provided by Dr. Valerie Weaver (UCSF). dPyr, desmosine, and isodesmosine were purchased from Toronto Research Chemicals. Serial dilutions of the standards were made to determine the lowest limit of detection (LLOD) and the lowest limit of quantification (LLOQ). The LLOD is the analyte concentration required to produce a signal three times the noise level. The LLOQ is the analyte concentration required to produce a signal three times the signal at the LLOD.
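The two limits defined in this caption can be illustrated with a minimal Python sketch. It assumes a linear detector response that passes through the origin (signal = slope × concentration) and takes the LLOQ signal as three times the signal at the LLOD; the function name, the parameter names, and the example numbers are hypothetical and are not taken from the thesis data.

```python
def detection_limits(noise_level, slope):
    """Return (LLOD, LLOQ) as concentrations.

    Assumes a linear response through the origin:
        signal = slope * concentration
    Both arguments are illustrative inputs, not measured values.
    """
    if slope <= 0:
        raise ValueError("slope must be positive")
    if noise_level < 0:
        raise ValueError("noise_level must be non-negative")

    llod_signal = 3.0 * noise_level   # signal three times the noise level
    lloq_signal = 3.0 * llod_signal   # signal three times the LLOD signal

    # Convert the signal thresholds back to concentrations.
    return llod_signal / slope, lloq_signal / slope


# Hypothetical example: noise of 2.0 signal units, slope of 4.0 signal units per nM.
llod, lloq = detection_limits(noise_level=2.0, slope=4.0)
```

Under these assumptions the LLOQ concentration is always three times the LLOD concentration; a calibration curve with a nonzero intercept would need the intercept subtracted before dividing by the slope.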

Appendix Figure A.1: xAA Standard Characterization by MS

Appendix Figure A.1: xAA Standard Characterization by MS. A) MS2 fragmentation spectra of commercially available standards. Partial assignments of MS2 fragment ions are shown above the ions where they have been assigned. B) Calibration curves from the 15-minute amide HILIC high-pH method on a Q Exactive mass spectrometer. R-squared (RSQ), limit of detection (LOD), and limit of quantification (LOQ) are inset in each plot.

APPENDIX B

PUBLICATIONS

Alexander S. Barrett, Ryan C. Hill, Matthew Wither, Julie Haines, Monika Dzieciatkowska and Kirk C. Hansen “Hydroxylamine Chemical Digestion for Insoluble Extracellular Matrix Characterization” Journal of Proteome Research (Accepted with revisions) - Chapter II. ASB developed the method, performed proteomic experiments on all tissues, analyzed data, made figures, and wrote the manuscript.

Erica Goddard*, Ryan C. Hill*, Alexander S. Barrett, Courtney Betts, Qiuchen Guo, Ori Maller, Virginia F. Borges, Kirk C. Hansen, and Pepper Schedin “Quantitative extracellular matrix proteomics to study mammary and liver tissue microenvironments” International Journal of Biochemistry & Cell Biology, October 2016 - Chapter III. ASB performed proteomic experiments on mammary gland, analyzed data, generated figures, and contributed to the manuscript.

Alexander S. Barrett*, Ori Maller*, Andrew Nelson, Jonathon N. Lakins, Irene Acerbi, J. Matthew Barnes, Catharina Hagerling, Aestha Chauhan, Aqsa Nasir, Signe Borgquist, Jonas Manjer, Yunn-Yi Chen, E. Shelley Hwang, Zena Werb, Valerie M. Weaver, Kirk C. Hansen “Hydroxy lysine derived collagen crosslinks promote poor breast cancer patient prognosis and treatment resistance” (In Preparation) - Chapter V. ASB developed the method, performed proteomic and crosslinking experiments on all tissues, analyzed data, made figures, and wrote the manuscript.

Hanane Laklai, Yekaterina Miroshnikova, Michael Pickup, Eric Collisson, Grace Kim, Alexander S. Barrett, Ryan Hill, Johnathon Lakins, David D. Schlaepfer, Janna Mouw, Valerie LeBleu, Sergey Novitskiy, Julia Johansen, Valeria Poli, Kirk C Hansen, Raghu Kalluri, Harold Moses, Matthias Hebrok, Christine Iacobuzio-Donahue, Laura Wood, Nilotpal Roy, and Valerie Weaver “Genotype tunes pancreatic ductal adenocarcinoma tissue tension to induce matricellular-fibrosis and tumor progression” Nature Medicine, March 2016 - Chapter VI. ASB performed global and targeted proteomic experiments and xAAA, analyzed data, generated figures, and contributed to the manuscript.

Alexander S. Barrett, Michael Pickup, Ori Maller, Jon Lakins, Valerie M. Weaver and Kirk C. Hansen “Targeted proteomics and cross-linking analysis reveal alterations in the microenvironment of pancreatic tumors” (In Preparation) - Chapter VII. ASB performed proteomic and crosslinking experiments on all tissues, analyzed data, made figures, and wrote the manuscript.

Alexander S. Barrett*, Robert Chalkley*, Aaron Issaian and Kirk C. Hansen “Identification of in vivo cross-linked NTX and CTX collagen peptides via strong cation exchange enrichment and LC-MS/MS” (In Preparation) - Not included. ASB developed the method, performed proteomic experiments, analyzed data, and contributed to the manuscript.

Lucas A. Tomko, Ryan C. Hill, Alexander S. Barrett, Joseph M. Szulczewski, Matthew C. Conklin, Kevin W. Eliceiri, Kirk C. Hansen and Patricia J. Keely “Matricellular proteins Tenascin-C and Thrombospondin-2 co-localize with aligned collagen fibers in invasive ductal carcinoma patient samples” Matrix Biology (Under Review) - Not included. ASB performed data analysis and made figures.

J. Matthew Barnes, Laralynne Przybyla, Russell O. Bainer, FuiBoon Kai, Elliot C. Woods, Jason C. Tung, Alexander S. Barrett, Kan V. Lu, Jonathon N. Lakins, Kirk C. Hansen, Gabriele Bergers, Joanna Phillips, Carolyn R. Bertozzi, and Valerie M. Weaver “A glycoprotein-mediated mechanical switch promotes glioblastoma aggression” Nature Cell Biology (Under Review) - Not included. ASB performed proteomic experiments on glioblastomas, analyzed data, and made figures.

Jovylyn Gatchalian, Muzaffar Ali, Forest H Andrews, Yi Zhang, Alexander S. Barrett and Tatiana G. Kutateladze “Structural insight into recognition of methylated histone H3K4 by Set3” Journal of Molecular Biology, September 2016 - Not included. ASB performed structural characterization experiments during 1st year rotation.

Ryan C. Hill, Matthew Wither, Travis Nemkov, Alexander S. Barrett, Angelo D’Alessandro, Monika Dzieciatkowska, and Kirk C. Hansen “Preserved Proteins from Extinct Bison latifrons Identified by Tandem Mass Spectrometry; Hydroxylysine Glycosides are a Common Feature of Ancient Collagen” Molecular & Cellular Proteomics, May 2015 - Not included. ASB performed data analysis and edited the manuscript during 1st year rotation.
