Global Analysis of Kidney Glomerular Extracellular Matrix in Mice of Different Strains and Genders

A thesis submitted to the University of Manchester for the degree of MPhil in the Institute of Human Development, Faculty of Medical and Human Sciences

September 2013

Thomas Robert Denny

School of Medicine


Table of Contents

List of Figures

List of Tables

Abstract

Details and Declaration

Copyright

Acknowledgements

Part 1: General Introduction

The glomerulus

Albuminuria and glomerular pathologies

Different urinary albumin excretion rates in humans of different genetic backgrounds and sexes

Urinary albumin excretion rates vary in inbred mouse strains

Part 2: Glomerular Filtration Barrier Structure and Function

Podocytes

Fenestrated Endothelial Cells

Glomerular Mesangial Cells

Glomerular extracellular matrix

Laminins

Collagens

Nidogens

Heparan Sulphate

Part 3: Methodologies

Proteomics

Mass Spectrometry

Simplification of glomerular ECM Sample

RNA Array

Bioinformatic Analysis

Aims and Hypotheses

Aims

Hypotheses

Methods

Antibodies

Isolation of mouse glomeruli by sieving

Isolation of mouse glomeruli by magnetic dynabeads

Quantification of glomerular isolation method effectiveness

Enrichment of glomerular ECM

Western blotting

In-gel proteolytic digestion

Liquid chromatography–tandem mass spectrometry analysis

Bioinformatic Analysis

RNA Microarray analysis of isolated glomeruli

Analysis of RNA Microarray results

Statistical analysis of RNA Microarray data

Comparison of Mass Spectrometric and RNA Microarray data sets

Results Part 1: Comparison of glomerular isolation methodologies

Discussion

Results Part 2: Enrichment of Isolated Glomerular Extracellular Matrix Proteins

Discussion

Results Part 3: Analysis of Mass Spectrometric Data

Discussion

Results Part 4: Comparison of Mass Spectrometric and RNA Array Data Sets

Discussion

Overall Discussion

Genes of interest

Genes coding for proteins expressed in the female gender only

Umod

Fras1

Genes coding for proteins expressed in the FVB strain only

Fgf2

Serpina3k and Agrin

Comparison of Mass Spectrometric and RNA data

Main Conclusions

Future Directions of Investigation

References

Appendices

Appendix 1: Tables of glomerular extracellular matrix proteins detected by mass spectrometry in male FVB, female FVB, male C57 and female C57 mice

Appendix 2: Visual comparison of data (MS NSC/RNA RFU)

Words: 27,079


List of Figures

Figure 1 Graphs showing glomerular number per kidney (A) and urinary albumin excretion rate (B) in male (M) and female (F) mice from the C57 and FVB strains.

Figure 2 The glomerulus

Figure 3 Coomassie stained gel and Western blots showing enrichment of glomerular ECM proteins

Figure 4 Light microscope image of a relatively pure glomerular isolate

Figure 5 Mean numbers of whole glomeruli from sieving and Dynabead based isolation methods

Figure 6 A Workflow of sieving-based glomerular isolation method

Figure 6 B Workflow of Dynabead-based glomerular isolation method

Figure 7 Coomassie stained gel and Western blots showing enrichment of glomerular ECM proteins

Figure 8 Venn diagram of MS analysis of enriched glomerular isolates from male and female mice from the C57 and FVB strains

Figure 9 Bar charts summarizing MS data values for products of selected genes characteristic of one sex or one strain

Figure 10 Total numbers of glomerular ECM proteins that are unique to each group and classes of glomerular ECM protein that are unique to each mouse group

Figure 11 Bar Relative sizes of the four protein classes in the common matrisome

Figure 12 Percentages that each of the protein classes contribute to the glomerular matrisomes

Figure 13 Mean Normalised Spectral count values

Figure 14 Correlations between RNA array and MS data sets


List of Tables

Table 1 Lists of proteins within each category of the Venn Diagram

Table 2 Summary of ranking comparison

Table 3 Summary of frequencies of different types of product relationships

Table of Abbreviations

Abbreviation Full name

ECM Extracellular Matrix

GBM Glomerular basement membrane

GFR Glomerular filtration rate

C57 C57BL/6J mouse strain

FVB FVB mouse strain

MS Mass spectrometry

LC-MS/MS liquid chromatography–tandem mass spectrometry

PBS Phosphate buffered saline

SDS Sodium dodecyl sulfate

TB Triton-based buffer

EB Extraction buffer

ACN acetonitrile

dH2O Deionised H2O

RFU relative fluorescence units

NSC Normalised spectral count

Abstract

University: The University of Manchester

Name: Thomas Robert Denny

Degree: Biomedical Sciences Master of Philosophy

Thesis Title: Global Analysis of Kidney Glomerular Extracellular Matrix in Mice of Different Strains and Genders

Date: September 2013

Research to date has shown that human populations of different racial backgrounds and sexes have differing levels of albuminuria, suggesting that the genetics of an individual influences the function and integrity of their glomerular filtration barrier (GFB). In addition, different strains and sexes of mice are known to have variable patterns of albuminuria comparable with those seen amongst human races and sexes.

The aim of this research project was to use mass spectrometry (MS)-based proteomics to characterise the effect of sex and strain on the protein composition of the glomerular extracellular matrix (ECM) in male and female mice from the C57BL/6J (C57) and FVB inbred strains, and to identify any specific glomerular ECM components that vary in these mice and relate to their degree of albuminuria. The central hypothesis for this research project was that global proteomic analysis of the glomerular ECM would produce novel data regarding sex- and strain-related differences in protein composition in the murine glomerular basement membrane (GBM) and mesangial matrix.

This report presents the investigations in four related areas: 1: whether techniques relying on either sieving or magnetic Dynabeads are more effective for the isolation of murine glomeruli; 2: the enrichment of the murine glomerular ECMs using buffered washes; 3: the analysis of ECM components via liquid chromatography–tandem mass spectrometry; and 4: a comparison of analogous data sets relating to RNA expression and protein abundance in the murine glomerulus.

The results have shown that the Dynabead-based glomerular isolation method was superior because it produced an isolate containing nearly five times as many glomeruli as its sieving-based alternative. A sample suitable for analysis by MS was produced through the successful enrichment of glomerular ECM proteins, as assessed by Western blotting. Analysis of the samples by MS provided a list of 120 ECM proteins that make up the glomerular matrisome, or the ensemble of ECM proteins and associated factors, in the four mouse groups studied. The different patterns of abundance observed in these proteins across the mouse groups have shown that the glomerular ECMs of male and female mice from the C57 and FVB strains are compositionally different. Additionally, the presence or absence of certain proteins can be associated with increased levels of albuminuria. Comparative analysis of RNA array and MS datasets has revealed previously unreported patterns relating to the mouse groups studied.

This work provides a basis for an MS analysis of glomerular ECMs and for future investigations aimed at determining the roles of specific components and mechanisms that control sex and strain-associated changes in the structure and function of the glomerular ECM.


Details and Declaration

No portion of the work referred to in the thesis has been submitted in support of an application for another degree or qualification of this or any other university or other institute of learning.


Copyright

1. The author of this thesis (including any appendices and/or schedules to this thesis) owns certain copyright or related rights in it (the “Copyright”) and s/he has given The University of Manchester certain rights to use such Copyright, including for administrative purposes.

2. Copies of this thesis, either in full or in extracts and whether in hard or electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it or, where appropriate, in accordance with licensing agreements which the University has from time to time. This page must form part of any such copies made.

3. The ownership of certain Copyright, patents, designs, trademarks and other intellectual property (the “Intellectual Property”) and any reproductions of copyright works in the thesis, for example graphs and tables (“Reproductions”), which may be described in this thesis, may not be owned by the author and may be owned by third parties. Such Intellectual Property and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property and/or Reproductions.

4. Further information on the conditions under which disclosure, publication and commercialisation of this thesis, the Copyright and any Intellectual Property and/or Reproductions described in it may take place is available in the University IP Policy (see http://www.campus.manchester.ac.uk/medialibrary/policies/intellectualproperty.pdf), in any relevant Thesis restriction declarations deposited in the University Library, The University Library’s regulations (see http://www.manchester.ac.uk/library/aboutus/regulations) and in The University’s policy on presentation of Theses.


Acknowledgements

This report could not have been produced without the funding of the Biotechnology and Biological Sciences Research Council (BBSRC) as well as the support and guidance of my University of Manchester supervisors Dr. Rachel Lennon and Professor Adrian S. Woolf, and my academic advisor Professor Martin Humphries. Thanks must also be given to fellow laboratory members Dr. Hellyeh Hamidi and Mr. Michael Randles for their invaluable assistance in learning numerous experimental techniques, Dr. Corina Anders for her general advice and Dr. Tony Starborg for his help with the staining and preparation of samples for electron microscopic imaging carried out as part of an additional line of enquiry, the data from which has not been included in this report.

Thanks also go to Dr. David A. Long and Ms. Jennifer Huang of University College London for sharing their RNA array data and for their assistance in the process of glomerular isolation using Dynabead perfusion.

The assistance given by all these individuals has been invaluable and gratefully received.


Part 1: General Introduction

The kidneys are multifunctional organs that are required for the maintenance of biochemical balance in numerous systems throughout the body. Their primary role is to regulate both the volume and ionic composition of the internal environment. They achieve this by controlling the removal of water, organic and inorganic solutes and H+ ions, in addition to waste products such as urea, uric acid, and creatinine from muscles. The kidney also functions as an endocrine gland, secreting erythropoietin, renin and 1,25-dihydroxyvitamin D3, which are involved in the regulation of the rate of red blood cell production, blood pressure and calcium metabolism respectively (Vander et al., 2001).

The glomerulus

The functional unit of the kidney is the nephron. Stereological analysis has demonstrated that a healthy human adult has a median of 1,429,200 nephrons per kidney (Keller et al., 2003). The glomerulus is a ball-like structure comprising a tuft of tightly wrapped capillary loops, which is surrounded by the renal or Bowman’s capsule. It can be described as the 'first' part of the nephron because it is where the blood capillaries of the vascular system first abut onto the nephron epithelia, allowing the process of ultrafiltration to take place as the water and small ion components of the blood pass through the three layers of the glomerular filtration barrier under the hydrostatic pressure exerted by blood pressure.

The glomerular filtration barrier is most commonly divided into three highly specialised components: i) the fenestrated endothelial cells of the capillaries that make up the glomerular tuft, ii) the extracellular matrix (ECM) layer of the glomerular basement membrane (GBM) and iii) the podocyte epithelial cells, whose foot processes are bridged by specialised intercellular junctions known as slit diaphragms. The capillary loops are supported by pericyte-like mesangial cells, and together these structures form the most complex biological membrane so far encountered. The main function of the glomerular filtration barrier is to prevent the cellular and macromolecular constituents of the blood from passing into the tubules of the nephrons during the urine-forming process of ultrafiltration.

A key measurement in determining an individual’s kidney excretory function is the glomerular filtration rate (GFR), defined as the volume of ultrafiltrate (i.e. liquid and small molecular components that have passed through the glomerular filtration barrier) generated per unit time. It is classically determined by measuring the concentration of a marker substance, which can be any one that is both found at stable concentrations in the blood and filtered freely by the kidneys (the most commonly used is creatinine).

The average GFR of a healthy human adult of 70 kg is 180 L/day (Vander et al., 2001). Some studies have reported differences between the GFRs of males and females in numerous species, including humans and mice; for example, the GFR values for male and female C57BL/6J mice were on average 236.69 and 140.20 µl/min respectively (Qi et al., 2004). However, these sex-based differences disappeared when GFR was normalised to body weight (Long et al., 2013).
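The marker-based determination of GFR described above can be sketched as a short calculation. This is a minimal illustration only, assuming the standard renal clearance relation (urinary marker concentration multiplied by urine flow rate, divided by plasma marker concentration); the function names and the example marker measurements are hypothetical and are not taken from the studies cited.

```python
def clearance_ml_per_min(urine_conc, plasma_conc, urine_flow_ml_per_min):
    """Marker clearance (an estimate of GFR) in mL/min.

    urine_conc and plasma_conc must be expressed in the same units.
    """
    if plasma_conc <= 0:
        raise ValueError("plasma concentration must be positive")
    return urine_conc * urine_flow_ml_per_min / plasma_conc


def litres_per_day_to_ml_per_min(litres_per_day):
    """Convert a flow in L/day to mL/min."""
    return litres_per_day * 1000 / (24 * 60)


# 180 L/day, the figure quoted for a healthy 70 kg adult
human_gfr_ml_per_min = litres_per_day_to_ml_per_min(180)  # 125.0 mL/min

# Male-to-female ratio of unnormalised GFR in C57BL/6J mice (Qi et al., 2004)
sex_ratio = 236.69 / 140.20

# Hypothetical marker measurements, for illustration only
example_clearance = clearance_ml_per_min(
    urine_conc=100.0, plasma_conc=1.0, urine_flow_ml_per_min=1.0
)
```

Expressed this way, the human value corresponds to 125 mL/min, and the unnormalised mouse values differ by a factor of roughly 1.7 between the sexes before body weight is taken into account.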

The composition of the ultrafiltrate, such as how much protein passes through the glomerular filtration barrier, is a key indicator of renal function and health. Defects in the glomerular filtration barrier lead to a number of pathologies that all manifest themselves by increasing the amount of protein in the urine, which can sometimes progress to life-threatening conditions.

Albuminuria and glomerular pathologies

Studies have shown that major disruption of any of the layers of the glomerular filtration barrier, such as can occur through injury or the alteration of a protein component through the mutation of its gene, results in increased leakage of blood proteins into the glomerular ultrafiltrate (Kestilä et al., 1998; Boute et al., 2000). Albumin, the most abundant blood protein, typically makes up the largest percentage of lost protein and is used as an indicator of urinary protein levels. This increased urinary protein loss is defined as macroalbuminuria, and has been identified as a sign of glomerular disease. When there is massive albuminuria, nephrotic syndrome develops. This is characterised by albuminuria >300 mg/24 hrs together with low levels of serum albumin (<25 g/l) and consequent oedema (Molitch et al., 2004).

However, lower levels of albuminuria, between 30-300 mg/24 hrs or 20-200 µg/min, known as microalbuminuria, may also have clinical significance. Studies have shown that microalbuminuria is an independent risk factor for cardiovascular mortality and other pathological events (Weir, 2007). Additionally, individuals with diabetes mellitus who have microalbuminuria are 24 times more likely to progress to overt, or clinical, proteinuria (>0.5 g protein/24 hrs). In addition, albuminuria is a risk factor for generalised cardiovascular disease (Viberti et al., 1982). However, the mechanism behind this progression to pathology is so far unknown.
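The numerical thresholds quoted in this section can be collected into a small sketch. It is illustrative only: the function names are hypothetical, the handling of values falling exactly on a boundary is an assumption, and treating everything above 300 mg/24 hrs as macroalbuminuria is a simplification of the definitions given above. Oedema, the third feature of nephrotic syndrome, is a clinical finding and is not encoded.

```python
def classify_albuminuria(mg_per_24h):
    """Return a coarse category for a 24-hour urinary albumin excretion value."""
    if mg_per_24h < 0:
        raise ValueError("albumin excretion cannot be negative")
    if mg_per_24h < 30:
        return "below microalbuminuria range"
    if mg_per_24h <= 300:  # boundary handling is an assumption
        return "microalbuminuria"
    return "macroalbuminuria"


def meets_nephrotic_thresholds(albuminuria_mg_per_24h, serum_albumin_g_per_l):
    """Check the two numerical criteria quoted for nephrotic syndrome."""
    return albuminuria_mg_per_24h > 300 and serum_albumin_g_per_l < 25
```

For example, `classify_albuminuria(120)` falls in the microalbuminuria band, whereas a value of 450 mg/24 hrs combined with a serum albumin of 20 g/l satisfies both numerical criteria for nephrotic syndrome.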

Studies of albuminuria in humans have shown that its incidence is not equally distributed throughout Western populations. These same studies have begun to determine risk factors for the disease, both environmental and genetic.

Different urinary albumin excretion rates in humans of different genetic backgrounds and sexes

African-American, Mexican, and non-Hispanic white racial groups living in the United States were shown to have differing levels of albuminuria (Hanevold et al., 2008). Levels of microalbuminuria were shown to be greater in Black and Mexican populations than in those of non-Hispanic white heritage (Jones et al., 2002). These studies have also indicated that the male sex in humans is associated with a higher prevalence of microalbuminuria compared with females. Additionally, ‘normal’ male albuminuria levels are higher than those of females even after risk factors such as age, body mass index and plasma glucose levels have been taken into account (Verhave, 2003). These higher levels of albuminuria have been associated with the higher levels of cardiovascular disease prevalent in male populations as well as, interestingly, a short-statured phenotype (Gould et al., 1993; Isles et al., 1992). This evidence is consistent with the hypothesis that genetic background and sex are involved in the higher levels of albuminuria observed in African-American and Mexican-originating populations in addition to the male sex generally. However, the underlying factors that result in the increased albuminuria have remained unidentified to date.

Urinary albumin excretion rates vary in inbred mouse strains

As in humans, both genetic background and sex have been shown to be associated with consistent differences in levels of albuminuria in inbred mouse strains. An investigation of albuminuria in males and females from 30 inbred mouse strains aged 12, 18 and 24 months showed that there was up to a 100-fold difference in albuminuria between the groups (Tsaih et al., 2010). Interestingly, this study also showed that not all inbred strains show the same sex-related pattern as observed in humans, because the females of some strains show higher urinary albumin excretion than their male counterparts.

Long et al. (2013) used a more focused approach by studying male and female mice from the C57BL/6J (C57) and FVB strains, two of the 30 inbred mouse strains used by Tsaih et al. (2010). They found that at the age of 18 weeks C57BL/6J mice were less albuminuric than FVB mice. Although the mice used by Tsaih et al. were much older (three age groups were used, at 12, 18 and 24 months) than those used by Long et al., the findings are comparable because glomerulogenesis ceases around a week after birth in mice, meaning that both studies use models that have a mature renal phenotype (Vernier and Birch-Andersen, 1962). These mouse models appear to be particularly relevant to human studies because both strains demonstrate higher levels of urinary albumin loss in the males compared with females. Additionally, the levels of albuminuria at this age were shown to be within the range of what would be considered ‘microalbuminuria’ when factoring in the differences in body mass between mice and humans. A mouse is around 2000 times lighter than a human, so the equivalent ranges are 15-150 µg/24 hrs for mice and 30-300 mg/24 hrs for humans.
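The body-mass scaling behind the mouse range can be written out explicitly. The sketch below assumes the human microalbuminuria range of 30-300 mg/24 hrs given earlier in this chapter and the approximate 2000-fold difference in body mass; the constant and function names are hypothetical, and simple linear scaling by mass is itself an assumption of the comparison.

```python
HUMAN_MICROALBUMINURIA_MG_PER_24H = (30.0, 300.0)
HUMAN_TO_MOUSE_MASS_RATIO = 2000  # approximate figure used in the text


def scale_range_to_mouse_ug(human_range_mg, mass_ratio):
    """Scale a human range in mg/24 hrs to a mouse range in µg/24 hrs."""
    # 1 mg = 1000 µg, then divide by the body-mass ratio
    return tuple(value_mg * 1000 / mass_ratio for value_mg in human_range_mg)


mouse_range_ug = scale_range_to_mouse_ug(
    HUMAN_MICROALBUMINURIA_MG_PER_24H, HUMAN_TO_MOUSE_MASS_RATIO
)  # (15.0, 150.0)
```

The result, 15-150 µg/24 hrs, is the mass-adjusted mouse equivalent of the human microalbuminuria range.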

Long et al.'s results show that the relative rates of albuminuria detected in the mouse models were as follows: FVB male > FVB female > C57 male > C57 female. This rank order of albuminuria negatively correlates with the numbers of glomeruli measured in the kidneys of each strain, suggesting that glomerular number is an indicator of albuminuria. Additionally, testosterone was shown to induce an increase in the level of albuminuria in female C57 mice. As this albuminuric response to testosterone was not observed in FVB females, further investigations were undertaken that demonstrated that a negative correlation exists between glomerular number and levels of albuminuria in male and female mice of both strains studied (Figure 1). RNA-array analysis identified a list of 39 transcripts, including Aqp1, Bhmt2 and Gnmt, which were associated with increased urinary albumin excretion and five genes, including Cldn11 and Pln, that were negatively associated with albuminuria. Many of these genes are associated with metabolic processes (Long et al., 2013).

Figure 1 Graphs showing glomerular number per kidney (A) and urinary albumin excretion rate (B) in male (M) and female (F) mice from the C57 and FVB strains. Together, the two graphs demonstrate a negative correlation between glomerular number and albuminuria (Long et al., 2013).

Thus, it has been established that variations in rates of albuminuria are associated with different mouse sexes and strains. The work of Long et al. (2013) has demonstrated that not only certain genotypes but also lower glomerular numbers are associated with higher albuminuria. They suggest that genetic variation (i.e. polymorphisms) and/or androgen hormones may be causative factors for higher rates of urinary albumin excretion. They have also identified two mouse strains (C57 and FVB) as strong candidates for use as models for future investigations into this area of research.

However, these studies have not demonstrated compositional differences at the protein level between these groups and this is therefore a logical step in the investigatory process.


Part 2: Glomerular Filtration Barrier Structure and Function

The main function of the glomerular filtration barrier is to act as a sieve for the blood passing through the glomerular capillaries. It inhibits the passage of cellular and macromolecular components while allowing the water and small solutes through into the nephron in the process known as ultrafiltration. The effectiveness of the ultrafiltration process maintains the osmotic potential of the blood and, indirectly, helps to control blood pressure.

The glomerular filtration barrier comprises fenestrated endothelial cells, a podocyte epithelial layer and an intervening basement membrane. Glomerular extracellular matrix components are secreted by the surrounding cells, which include the podocytes, endothelial cells and glomerular mesangial cells.


[Figure 2 panel labels: parietal cells of the Bowman’s capsule; podocytes (note large cell body and interdigitation of foot processes); renal arterioles.]

Figure 2 The glomerulus: (A) An artistic representation of the glomerulus showing the afferent and efferent arterioles entering and leaving the ball of glomerular capillaries [left]; parietal cells of the Bowman’s capsule form a circular structure around the glomerulus; podocyte cell bodies are shown as yellow bulbous structures, with foot processes curled round the glomerular capillaries. Cross sections of distal convoluted tubules are visible around the central glomerulus (Stanis, 2006); (B) Scanning electron micrograph of an enlarged glomerular podocyte as seen from the urinary space. Thick primary processes, which radiate out from the central cell body, further divide into fine secondary (foot) processes that interdigitate with foot processes from neighbouring podocytes. The glomerular basement membrane that surrounds the glomerular capillary, though not visible, is located under the foot processes (Smoyer and Mundel, 1998, p. 174).


Podocytes

A podocyte is a highly specialised kidney epithelial cell that comprises a large cell body and multiple elongated ‘foot processes’ that interdigitate to form the outer-most layer of the glomerular filtration barrier (Figure 2A and 2B). The individual foot processes are separated by gaps of 25-60 nm, which are covered by structures known as slit diaphragms. These features form a network of often non-linear interconnecting channels, irregular in both size and shape (Wartiovaara et al., 2004), the largest of which have a size slightly smaller than that of albumin (80 × 80 × 30 Å/3.6 nm) (Carter et al., 1989; Sugio et al., 1999). In addition, podocytes are surrounded by their own sialic acid-rich glycocalyx, which contributes to the net negative charge of the glomerular filtration barrier (Economou et al., 2004).

Fenestrated Endothelial Cells

The cells that make up the capillaries of the glomerular endothelium are characterised by irregularly shaped and sized pores or 'fenestrae' (Bulger et al., 1983; Fujita et al., 1976; Rostgaard and Qvortrup, 2002). Histological examination of the fenestrated capillaries has shown that they possess two non-cellular surface layers: the apical glycocalyx, a thin coat of glycoproteins found on all capillaries, and the thicker, luminal endothelial cell surface layer, which comprises glycoproteins, glycosaminoglycans and proteoglycans such as hyaluronic acid, heparin and chondroitin (Haraldsson and Nyström, 2012). Both of these structures have been suggested to contribute to the mechanical and charge barriers of the glomerular filtration barrier. Their loss has been associated with the development of proteinuria and renal disease (Salmon et al., 2012; Rostgaard and Qvortrup, 2002).

Glomerular Mesangial Cells

A population of specialised pericytes known as the glomerular mesangial cells is situated within the ‘ball’ of the glomerulus. The primary role of these cells is to maintain the structural and functional integrity of the glomerular tuft. Mesangial cells have smooth muscle-like contractile capabilities controlled by molecules including angiotensin II, arginine vasopressin and histamine (Ausiello et al., 1980). They produce the mesangial ECM and secrete inflammatory, vasoactive and growth factors (Schlondorff, 1987). These features allow mesangial cells not only to modulate the rate of glomerular filtration but also to regulate local responses to injury in relation to inflammation, cell proliferation and ECM remodelling.

These cellular components are essential to the functioning of the glomerulus. As a result of their interconnected nature, the presence of these cells needs to be taken into account in an investigation of the glomerulus as a whole or into any one individual part.

Glomerular extracellular matrix

ECMs comprise structured masses of material that exist outside of and adjacent to the plasma membrane. ECMs maintain the shape and structure of associated cells. The importance of the roles played by ECMs is demonstrated by the significant reduction or gain of cellular organisation, activity and function with the respective removal or addition of ECM materials in an in vitro setting (Karp, 2007).

There are four distinct ECMs that exist within the glomerulus: the GBM, the mesangial matrix, the Bowman's capsule ECM and the vascular pole ECM. All these types of ECM play essential roles in the functioning and maintenance of the glomerulus.

The GBM is situated between and secreted by both the endothelial and podocyte layers of the glomerular filtration barrier. However, the endothelial layer only secretes during development (Haraldsson et al., 2008). Studies have shown that the principal role of the GBM is in the restriction of fluid and, possibly, solute flux (Deen et al., 2001).

With a width of 240-370 nm in mice, the GBM is around 3-6 times thicker than the basement membranes of capillary beds in other tissues such as muscle (40-80 nm by comparison) (Danielli, 1939; Osawa et al., 2003). Like all basement membranes it is a fibrous network of proteins including collagen IV (Miner et al., 1994; Sanes et al., 1990), glycoproteins such as laminin (Noakes et al., 1995) and entactin, along with proteoglycans such as agrin and perlecan (Hassell et al., 1980; Kanwar et al., 1980). The specific isoforms of most of these components, such as laminin-521 and collagen IV α3α4α5, are unique to the GBM (even when compared with the BMs of neighbouring regions of the nephron) (Abrahamson, 1985; Miner et al., 1997; Khoshnoodi et al., 2006).

Studies in mice have demonstrated that damage to or alteration of the GBM, such as loss of a component as in a knock-out study or disease such as Pierson syndrome, can result in proteinuria, which can be detected prior to alterations in podocyte morphology (Miner et al., 2006). Data such as these strongly suggest that the GBM is an essential component of a healthy glomerular filtration barrier and that its specific functions are dependent on composition.

The mesangial ECM occupies a space within the glomerular tuft of capillaries and provides structural support as well as fulfilling regulatory roles for the surrounding cells by secreting cytokines such as interleukin-1 (Abboud et al., 1987), platelet-derived growth factor (Aron et al., 1989) and insulin-like growth factor (Zoja et al., 1991). It is predominantly composed of collagen IV, laminin, fibronectin and several heparan sulphate proteoglycans. This ECM compartment is known to undergo changes in several glomerular diseases such as IgA nephritis (Julian and Novak, 2004) and diabetic nephropathy (Schena and Gesualdo, 2005) that result in excessive accumulation of both normal and abnormal ECM material (including collagens I and III).

The glycoproteins (such as laminins), collagens, heparan sulphate proteoglycans and nidogens are considered to be some of the most abundant components of the glomerular ECM. These classes of components make up the majority of the mass of the GBM (an element of the GFB) and the mesangial matrix (that supports the cellular components of the glomerulus). Therefore, these classes of protein are potential candidates for involvement in the mechanisms underlying the increased albuminuria observed in male mice and in the FVB strain by Tsaih et al. (2010) and Long et al. (2013).

Laminins

Laminins are a family of over 15 large, heterotrimeric glycoproteins that are named according to the component subunits that form the triple helix. The laminin composition of the GBM is known to change during glomerular development. During maturation the immature laminins LM-111 and LM-511 are gradually replaced by LM-521, the adult laminin associated with the GBM (Miner et al., 1997; Miner, 1994). Deficiencies in LM-521 lead to pathologies such as the autosomal recessive Pierson syndrome. This disease is caused by mutations in the gene LAMB2, which codes for the laminin β2 subunit. It results in a severe phenotype characterised by massive proteinuria and nephrotic syndrome (Chen et al., 2011).
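The subunit-based naming of laminins lends itself to a short illustration. The sketch below assumes the widely used three-digit convention in which the digits give the α, β and γ chain numbers in that order (so that LM-521 denotes α5β2γ1); the function names are hypothetical and the parser accepts only names of the form used in this report.

```python
def laminin_chains(name):
    """Split a name such as 'LM-521' into its alpha, beta and gamma chain numbers."""
    prefix, _, digits = name.partition("-")
    if prefix.upper() != "LM" or len(digits) != 3 or not digits.isdigit():
        raise ValueError(f"unrecognised laminin name: {name!r}")
    alpha, beta, gamma = digits  # one character per chain
    return {"alpha": int(alpha), "beta": int(beta), "gamma": int(gamma)}


def changed_chains(before, after):
    """List which chains differ between two laminin isoforms."""
    first, second = laminin_chains(before), laminin_chains(after)
    return [chain for chain in ("alpha", "beta", "gamma") if first[chain] != second[chain]]


adult_gbm_laminin = laminin_chains("LM-521")
late_switch = changed_chains("LM-511", "LM-521")  # only the beta chain changes
```

Read this way, the final developmental switch from LM-511 to LM-521 involves only the β chain, which is consistent with the β2 subunit encoded by LAMB2 being the component affected in Pierson syndrome.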

Collagens

Collagens are a family of 28 helical proteins that form the major constituent of ECMs in animals; they act as scaffolds for the other matrix components and are involved in adhesion, cell organisation and differentiation, and tissue development (Khoshnoodi et al., 2006). Collagen gene products are known as α-chains (at least 43 of which have been identified), which undergo trimerisation to form heterotrimeric structures called protomers; these are in turn secreted into the extracellular space to form complete collagens via dimerisation. Although there are numerous types of collagen, it is the three network-forming collagen IV heterotrimers α1α2α1, α3α4α5 and α5α6α5 that act as the primary structural components in basement membranes (reviewed in Khoshnoodi et al., 2006).

Nidogens

Nidogens (N-1 and N-2) are a family of dumbbell-shaped proteins that are components of animal basement membranes, including those located within the glomerulus. Unlike the nidogen isoforms found in mice, which share similar binding affinities for other BM components, the human N-2 isoform shows reduced binding affinity for laminins and fibulins. This interspecific variation has direct relevance to research involving ECMs because loss of either one of the murine nidogens (such as in a knock-out mouse model) does not produce any noticeable proteinuric phenotype. However, the loss of both nidogen proteins has been associated with reduced membrane integrity and perinatal lethality (Bader et al., 2005). This functional redundancy has not been observed in humans, where the loss of either nidogen results in proteinuria (Salmivirta, 2002).

Heparan Sulphate Proteoglycans

Heparan sulphate proteoglycans (HSPGs) are a family of proteoglycans that are found in several tissue types including cartilage and all basement membranes. Like all proteoglycans, HSPGs are composed of a central protein core surrounded by numerous glycosaminoglycan (GAG) chains that are covalently attached via serine residues. Agrin, a 210 kDa HSPG (Rupp et al., 1991, 1992), together with perlecan and collagen XVIII, is one of the major HSPG constituents of the glomerular ECM. Due to the negative charge of their sulphated residues, HSPGs are thought by some to make up a large percentage of the glomerular charge barrier (Kanwar et al., 1991). However, absence of agrin or perlecan is not associated with a significant glomerular phenotype (Miner, 2012).

In conclusion, the glomerulus is a highly complex structure that is dependent upon numerous ECM and cellular components to function correctly. As a result of the many interactions that occur, not only within the glomerular filtration barrier but between glomerular cellular and extracellular tissues, investigations of a single structure or cell type must take these other interactions and components into account. This consideration is important in the design of studies, such as this one, that aim to identify specific components of the glomerular ECM; for example, knowledge of interactions between components could be used to ensure that the desired component(s) are isolated successfully. However, these contextual considerations will also be important in any subsequent investigative stages, in which the aim is to establish the roles of components and complexes that have been identified.

Part 3: Methodologies

Proteomics

The aim of proteomics is to determine wide-scale gene and cellular function through investigation of proteins. It deals with the proteome: “the entire repertoire of proteins expressed at a given level of complexity within an organism” (Aebersold and Mann, 2003). The given level can be an organism as a whole, an organ (such as the kidney) or a specific cell type (such as a podocyte). Like those of the other global sciences, such as genomics and transcriptomics, proteomic investigations are designed with an unbiased approach rather than with the aim of answering a specific question. To accomplish this objective, global investigations frequently forgo the prior formulation of hypotheses, allowing the initial data gathered to act as an aid to the design of follow-up studies that utilise the hypothesis-driven approach.

The Matrisome Project (Hynes and Naba, 2012) was produced with the aim of improving understanding of the molecular basis of cell adhesion and the roles that it plays in cell behaviour and disease states. It represents a complementary form of proteomic investigation, which uses bioinformatic analysis of the human and mouse proteomes rather than techniques such as MS. As part of their research, the group compiled a series of lists of the approximately one thousand genes, together with their protein products, that contribute to the formation of ECM tissues in the human and mouse genomes. These lists comprise around 300 core proteins, which are the primary structural components of ECM tissue such as collagens, glycoproteins (such as the laminins) and various proteoglycans. The rest of the list is concerned with the matrix-associated proteins, which include ECM-affiliated proteins such as annexins, lectins and mucins; ECM regulators, which include the matrix metalloprotease enzymes that allow matrix turnover; and secreted factors, which include small molecules such as growth factors or chemokines.

The original intention was to collate these lists using a method based on isolation of the ECM followed by analysis by mass spectrometry. However, the ontological tools and databases used by the group at the time were shown to be unreliable; for example, proteins that were known to exist in the intracellular compartment were mis-annotated as being extracellular. Therefore, the group decided to use a bioinformatics approach based on search engines identifying domains common to ECM proteins, along with numerous rounds of negative selection of non-ECM proteins and manual corrections of the list. This approach has yielded a more inclusive, comprehensive overview of the human and mouse matrisomes than other, similar databases. It promises to be a powerful tool for the initial analysis and screening of large data sets, such as those generated through proteomic or transcriptomic means, in order to identify the extracellular components present (Hynes and Naba, 2012; Naba et al., 2012). Consequently, the approach chosen for this investigation involved the use of two principal databases, DAVID (http://david.abcc.ncifcrf.gov/) and the Matrisome Project database (Hynes and Naba, 2012), to minimise the chance of any ECM proteins being missed. Additionally, to ensure that the list of glomerular ECM proteins did not include any false positives, all putative ECM proteins identified through the use of the two databases were manually confirmed to be known (or potential) ECM components using online literature searches.

Mass Spectrometry

MS encompasses a range of techniques in which the masses of an analyte's ionised component particles are measured, as mass-to-charge ratios, once they have been converted into the gaseous phase.

MS was first used in biological research in 1958 by Carl-Ove Andersson to analyse amino acids from bile salt samples (Andersson, 1958). These techniques, which were further refined by numerous groups such as Hunt et al. in their studies of major histocompatibility complex I (Hunt et al., 1992), have formed the basis for all further MS analysis of biological materials.

MS is less sensitive than other protein-detecting methodologies such as Western blotting. Therefore, unless suitably large samples (>50 mg of protein), which are not always available, are used, there is always a chance that some of a sample's least abundant constituents will fall below the limit of detection (Gygi et al., 2000).

Consequently, a major weakness of MS analysis is that it does not necessarily provide a read-out detailing 100% of a given sample's composition. As shown by Gygi et al. (2000), the signals from the more abundant components in a complex sample can mask less abundant components.

Despite these caveats, MS is the most powerful tool available to science for determining the composition of biological samples. It allows complex samples to be classified faster and without the inherent bias of other composition-determining techniques, such as Western blotting, that only detect the presence of pre-specified components. MS can generate representative datasets far larger than any other biochemical or histological method can produce. Using the correct software packages, databases, and statistical tests these datasets can be used to generate further hypotheses that help to focus and direct researchers towards specific targets rather than having to rely on the comparatively random approach of testing for the presence and/or interactions of proteins on an individual basis.

Simplification of glomerular ECM Sample

To minimise the chance of the less abundant components of the glomerular ECM not being detected during MS analysis, two procedures have been used to simplify the samples and enrich the ECM components. The first step involves the isolation of individual glomeruli from the surrounding renal tissue, which can be achieved through several methods. One of the first to be used was the sieve-based method described by Kirkwood and Nagi (1970) and more recently employed by Lennon et al. (manuscript in press), which involves passing homogenised kidney through a series of sieves to catch the glomeruli; these are then re-suspended and repeatedly washed in phosphate buffer solution (PBS). The second method involves intravascular perfusion of living, anaesthetised mice with magnetic Dynabeads that become lodged in glomeruli. These glomeruli can be isolated by pressing pieces of bead-perfused kidney through a coarse sieve to homogenise the tissue and then washing the remaining tissue through with PBS. The glomeruli can then be removed from the resultant suspension using a magnet stand. This technique, although technically more demanding, has tended to provide samples with higher numbers of glomeruli and fewer fragments of non-glomerular tissue per kidney used in animal studies (Takemoto et al., 2002; Long et al., 2013).

Figure 3 Coomassie stained gel and Western blots showing enrichment of glomerular ECM proteins: Coomassie stained gel ('Coomassie') and sections of Western blots ('Collagen IV', 'Laminin', 'Nephrin', 'Actin' and 'Lamin B1') showing enrichment of glomerular ECM proteins. Fractions numbered 1-3 were derived from supernatants of the buffered washes described in the methods section, and the ECM fraction contains the remaining protein (Lennon et al., in press).

The second step in the simplification process involves removing the unwanted cellular components from the glomeruli themselves through the application of a series of lysis buffers (Lennon et al., in press). This method used four such buffers to remove cytoplasmic proteins, disrupt cell–ECM interactions, remove nuclear proteins and finally to solubilise and collect ECM components. These protein fractions are presented in a Coomassie stain (demonstrating the protein contained in each fraction) and Western blots (demonstrating the presence, in each fraction, of specific proteins known to be intracellular or extracellular) (Figure 3). Fractions 1-3 contain mainly cellular and nuclear proteins, as demonstrated by the strong signals for nephrin and actin (cellular proteins) and lamin B1 (a nuclear protein) shown by the Western blot sections. The ECM fraction contains the strongest signals for collagen IV and laminin (protein components of the ECM discussed above), while showing the weakest signals for nephrin, actin and lamin B1.

The Western blots show that the unwanted cellular (nephrin and actin) and nuclear (lamin B1) proteins, which are most abundant in fractions 1-3, have been removed by these washes, while the ECM proteins (collagen IV and laminin) are present in the final ECM fraction and are therefore 'enriched' (Lennon et al., in press). These steps are designed to reduce the number of proteins present in the glomerular samples from the tens of thousands to the ~300 that are listed in the Hynes group's Matrisome Project (Naba et al., 2012) and the online resource DAVID (http://david.abcc.ncifcrf.gov/).

RNA Array

RNA arrays are a powerful and frequently employed research tool used in transcriptomics. The transcriptome is the set of RNA transcripts that exists within a single cell or a specified population of cells at a given developmental stage and physiological condition (Wang et al., 20010). There are numerous types of RNA array, both solid-phase and bead-based, that can be broad spectrum, i.e. detect target sequences representing the whole of an organism's transcriptome, or restricted to a defined set of RNAs. The array comprises a solid surface or chip, most commonly a glass slide or nylon membrane, onto which a series of gene-specific nucleic acids are added in a specific pattern of dots. In an RNA array, the RNA from a sample (such as isolated tissue) is extracted and then converted to cDNA (usually incorporating fluorescent tags) by reverse transcription. This sample cDNA is applied, in suspension, to the surface of the chip and allowed to hybridise with the complementary bound sequences. After the required length of time the unbound sample is washed off and the hybridised sequences are detected by measuring the strength of the signal for each of the original bound dots. This procedure allows not only the detection of specific nucleic acid sequences but also their quantification, based on the strength of the signal from each individual dot.

There are numerous interchangeable terms used in the literature for array components. The term target, for example, can be used both for those nucleic acid sequences originally bound to the chip and for those suspended in solution that are being tested, which are also known as probes in some papers (Schena et al., 1995).

RNA arrays provide data on the entire transcriptome of the cellular sample being analysed. They are far more efficient at identifying genes undergoing transcription than previous methods, which only detected single transcripts. A comparison of the data from the MS analysis of a tissue sample with the data provided by RNA arrays can be used to further verify the presence of post-translational control mechanisms, or to determine the roles they play in the abundance of specific proteins.

Bioinformatic Analysis

The term bioinformatics originated in the early 1970s and describes an interdisciplinary field covering the storage, retrieval, organisation and analysis of biological data (Hogeweg, 2011). The goal of bioinformatics is to facilitate greater understanding of biological processes that are too complex to model without the help of computer software. The types of data that can be modelled in this way include DNA, RNA and amino acid sequences, as well as the domains or overall structures of proteins (Attwood et al., 2011). The process of analysing biological data in this manner is known as computational biology. The other major part of bioinformatics is the development of new software tools and algorithms to aid in these roles, as well as their application.

There are two principal approaches used in the modelling of biological systems: static and dynamic. The static approach typically models interaction data relating to proteins, nucleic acids and peptides using data from microarrays, protein or metabolite networks. The dynamic modelling approach aims to take into account not only the many components of a biological system but also the fluxes and variation that exert influence upon the interactions and individual components during reactions (Feiglin et al., 2012).

Bioinformatics has become a key tool in global investigatory approaches such as proteomics and genomics as a result of the large volumes of data generated.


Aims and Hypotheses

Aims

This investigation involved the following four aims:

Aim 1:

To compare the two methods of isolating glomeruli, namely glomerular sieving and Dynabead perfusion followed by sieving. This comparison was to determine which is the most effective method based on the yield (number of glomeruli isolated) and the purity (numbers of glomeruli compared with tubule fragments) of the sample.

Aim 2:

To determine whether the buffer-based ECM enrichment method used previously for human samples, in the laboratory of my primary supervisor Dr Rachel Lennon, is suitable for use on isolated murine glomeruli.

Aim 3:

To determine whether and what differences exist between the glomerular ECM proteomes of different sexes and strains of mice by analysing MS data gathered from enriched glomerular ECM fractions from male and female mice from the C57 and FVB strains.

Aim 4:

To compare MS and RNA array datasets focusing on the glomerular ECM proteomes of male and female mice of the C57 and FVB strains to determine how the relative patterns of abundance relating to corresponding mRNA and protein products of specific genes compare across the different mouse groups studied.

Hypotheses

This investigation tested the following two hypotheses:

1. There are differences between the ECM proteomes of mice of different sexes and strains.

2. There are similarities between the patterns of relative abundance of mRNAs and their encoded proteins associated with the glomerular ECM.

Methods

This investigation involved the use of complex techniques and delicate equipment that require specialised skills to operate effectively. The following lists clarify which of the methods I carried out myself and which were implemented by others:

Techniques I carried out:

• Isolation of mouse glomeruli by sieving
• Isolation of mouse glomeruli by magnetic Dynabeads
• Quantification of glomerular isolation method effectiveness
• Enrichment of glomerular ECM proteins
• Western blotting
• Bioinformatic Analysis
• Statistical analysis of LC-MS/MS data
• Comparison and Statistical Analysis of MS and RNA Array Data Sets

Techniques carried out by Mr Michael Randles:

• In-gel proteolytic digestion
• LC-MS/MS analysis

Techniques carried out by Dr. David Long:

• Global microarray of isolated glomeruli
• Statistical analysis of RNA Microarray

Antibodies

Monoclonal antibodies used were against collagen IV (ab6588; Abcam), heat shock protein 70 (HSP70) (MA3-028; Thermo Scientific), CD2-associated protein (CD2AP) (sc-9137; Santa Cruz), laminin (ECM domain specific) (ab11575; Abcam) and alpha-tubulin (T9026; Sigma). Secondary antibodies conjugated to Alexa Fluor 680 (Life Technologies, Paisley, UK) or IRDye 800 (Rockland Immunochemicals, Gilbertsville, PA, USA) were used in Western blotting.

Isolation of mouse glomeruli by sieving

Isolation of murine glomeruli was carried out using the sieving-based method (Lennon et al., in press), and both kidneys from each mouse/replicate were used. The kidneys were dissected at 4°C to remove the ureters and diced into approximately 1 mm³ pieces. These pieces were pressed and washed through a 100 μm sieve (Scientific Laboratory Supplies Ltd, Nottingham, UK) using the rubberised plunger from a 10 ml syringe and 30 ml of cold PBS. This suspension was passed through two 70 μm sieves, each washed with 10 ml of cold PBS. The glomeruli were collected by pouring the suspension through a 40 μm sieve, which was gently washed with 10 ml of cold PBS. The glomeruli caught in the final sieve were retrieved by inverting it over a new Falcon tube and washing with 30 ml of cold PBS. These isolated glomeruli in suspension were washed three times by centrifuging at 3893 × g for 10 min, removing the supernatant containing tubule fragments and resuspending in 5 ml of cold PBS.

Isolation of mouse glomeruli by magnetic dynabeads

The Dynabead-based glomerular isolation method used in this investigation was based on that originally presented by Takemoto et al. (2002) and subsequently employed by Long et al. (2013). Takemoto et al. showed that, compared with the sieve-based isolation method, the Dynabead method isolated larger numbers of glomeruli and yielded an extract containing fewer non-glomerular components.

Mice under terminal anaesthesia had the skin over the ventral surface of the thorax and abdomen removed and the sternum carefully dissected away. Each mouse was then perfused via injection with approximately 1 × 10⁸ Dynabeads in a suspension of PBS (pH 7.4) through the left ventricle of the heart following incision of the right cardiac atrium. These surgical steps were carried out by an experienced technician in the University College London (UCL) animal facilities with full Home Office approval (PPL no. 70/7478). Following this step the mouse kidneys were removed and, at 4°C, dissected to remove the ureter and finely diced. They were then pushed through a 100 μm sieve (Scientific Laboratory Supplies Ltd, Nottingham, UK) using a 10 ml syringe plunger and washed through a second 100 μm sieve with 5 ml of PBS (pH 7.4). The suspension was passed through a 40 μm sieve to catch the glomeruli and remove free Dynabeads. The sieves were then inverted and washed into clean Petri dishes using 40 ml of PBS (pH 7.4). This suspension was centrifuged at 1730 × g for 10 minutes, the supernatant discarded and the pellet resuspended in 1 ml of PBS. The entire suspension was transferred to an Eppendorf tube mounted in a magnet stand to pull out the Dynabead-containing glomeruli, which were washed three times and finally resuspended in 1 ml of PBS.

Quantification of glomerular isolation method effectiveness

Total numbers of glomeruli harvested were calculated, and the purity of the product was expressed as the percentage of all fragments (i.e. glomeruli plus fragments of unwanted renal tubule) that were glomeruli. These numbers were calculated by extrapolation from the average numbers of glomeruli and tubular fragments counted in three 5 µl volumes taken from the final 1 ml isolate volumes. These average numbers were divided by 5 × 10⁻³ (equivalent to multiplying by 200) to give values representative of the total glomerular and tubular numbers present. The purity calculation can be shown as:

(Glomeruli x 100) / (Glomeruli + tubule fragments) = % purity
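The extrapolation and purity calculation described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name, argument names and example counts are hypothetical and are not taken from the data gathered in this investigation.

```python
def estimate_isolate(glomeruli_counts, tubule_counts,
                     aliquot_ul=5.0, isolate_ul=1000.0):
    """Extrapolate totals and % purity from counts made in small aliquots.

    glomeruli_counts / tubule_counts: one count per aliquot (three 5 µl
    aliquots in the text). The scale factor isolate_ul / aliquot_ul equals
    200, which is the same as dividing by 5 x 10^-3.
    """
    if not glomeruli_counts or len(glomeruli_counts) != len(tubule_counts):
        raise ValueError("need matching, non-empty lists of counts")
    mean_glomeruli = sum(glomeruli_counts) / len(glomeruli_counts)
    mean_tubules = sum(tubule_counts) / len(tubule_counts)
    scale = isolate_ul / aliquot_ul
    total_glomeruli = mean_glomeruli * scale
    total_tubules = mean_tubules * scale
    total_fragments = total_glomeruli + total_tubules
    if total_fragments == 0:
        raise ValueError("no fragments counted; purity is undefined")
    purity = 100.0 * total_glomeruli / total_fragments
    return total_glomeruli, total_tubules, purity


# Hypothetical counts from three 5 µl aliquots of a 1 ml isolate.
total_g, total_t, purity = estimate_isolate([40, 50, 60], [10, 10, 10])
print(f"{total_g:.0f} glomeruli, {total_t:.0f} tubule fragments, "
      f"{purity:.1f}% purity")
```

Because the same scale factor is applied to both counts, the percentage purity is unaffected by the extrapolation; only the yield estimate depends on it.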

Enrichment of glomerular ECM proteins

This method was developed by my primary supervisor Dr Rachel Lennon to process protein samples destined for MS analysis. The removal of unwanted cellular components and consequent enrichment of ECM proteins reduces the impact of contaminants and enhances the potential for the detection of a higher proportion of target proteins, including those present in small quantities. All steps were carried out at 4°C to minimise proteolysis.

Suspensions of isolated glomeruli, produced from either the sieving or Dynabead-based glomerular isolation protocols, were incubated for 30 minutes in Triton-based buffer 1 (TB) (200 µl 5x Tris/NaCl stock, 200 µl 5% Triton, 10 µl 1 M EDTA, 2.5 µl Leupeptin, 2.5 µl Aprotinin and 1 µl AEBSF + 584 µl dH2O) to solubilise cellular proteins and thereby allow the glomerular cellular components to diffuse away from the ECM structures. Samples were then centrifuged for 10 min at 14000 × g and the supernatant containing the cellular components was removed (fraction 1).
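As a check on the recipe for buffer 1, the final concentrations implied by the listed volumes can be worked out with the usual dilution relationship (stock concentration × stock volume / final volume). The short sketch below does this for the three components whose stock concentrations are stated above; the stock concentrations of the protease inhibitors are not given in the text, so they are not computed. The variable and function names are illustrative only.

```python
# Volumes (µl) of each component of buffer 1, as listed in the text.
volumes_ul = {
    "5x Tris/NaCl stock": 200.0,
    "5% Triton": 200.0,
    "1 M EDTA": 10.0,
    "Leupeptin": 2.5,
    "Aprotinin": 2.5,
    "AEBSF": 1.0,
    "dH2O": 584.0,
}
total_ul = sum(volumes_ul.values())


def final_concentration(stock_concentration, stock_volume_ul, final_volume_ul):
    """Dilution relationship: C_final = C_stock * V_stock / V_final."""
    return stock_concentration * stock_volume_ul / final_volume_ul


tris_nacl_x = final_concentration(5.0, volumes_ul["5x Tris/NaCl stock"], total_ul)
triton_percent = final_concentration(5.0, volumes_ul["5% Triton"], total_ul)
edta_mm = final_concentration(1000.0, volumes_ul["1 M EDTA"], total_ul)

print(f"total volume: {total_ul} µl")
print(f"Tris/NaCl: {tris_nacl_x}x, Triton: {triton_percent}%, EDTA: {edta_mm} mM")
```

The listed volumes sum to 1 ml, giving a working buffer of 1x Tris/NaCl, 1% Triton and 10 mM EDTA.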

The remaining pellet was incubated for 30 minutes in extraction buffer 2 (EB) (20 mM NH4OH, 0.5% Triton-X in PBS) to solubilise any remaining cellular proteins attached to the glomerular ECM. Samples were again centrifuged for 10 minutes at 14000 × g and the supernatants (fraction 2) removed and stored. The remaining pellet was incubated with buffer 3 (DNase; 1:1000 dilution) for 30 minutes to remove DNA and any associated nuclear proteins. The sample was centrifuged for the last time for 10 minutes at 14000 × g and the supernatant removed (fraction 3).

Lastly, the final pellet was resuspended in reducing SDS sample buffer 4 (50 mM Tris-HCl, pH 6.8, 10% (w/v) glycerol, 4% (w/v) sodium dodecylsulfate (SDS), 0.004% (w/v) bromophenol blue, 8% (v/v) β-mercaptoethanol) to yield an enriched-ECM fraction that was heat-denatured at 70°C for 30 minutes.

Western blotting

Enriched glomerular ECM samples suspended in reducing SDS sample buffer underwent SDS-gel electrophoresis through a 4-12% SDS-gel (Life Technologies, Carlsbad, CA, USA) in NuPAGE MES running buffer (Invitrogen) diluted to a 1x solution in dH2O, at 200 V for 45 minutes.

The protein was transferred from the gel, using immunoblotting, to a nitrocellulose membrane (Scientific Laboratory Supplies) in transfer buffer (187 mM glycine, 25 mM Tris, 0.347 mM SDS (Fisher Scientific, Waltham, MA, USA) in 1 L dH2O) at 30 V for 90 minutes. Following transfer the membranes were blocked in a 1:10 solution of Blocking Buffer and TBST (TBS + 3% Tween-20) for 90 min before incubation overnight at 4°C with the appropriate primary antibodies.

The following day the membranes were washed for an hour in TBST, changing the liquid five times, before incubation with the appropriate fluorescent-labelled secondary antibodies (either Alexa Fluor 680 (Life Technologies, Paisley, UK) or IRDye 800 (Rockland Immunochemicals, Gilbertsville, PA, USA)) in the dark at room temperature for one hour. The membranes were again washed for an hour in TBST, changing the liquid five times, before being visualised using the Odyssey imaging system (http://www.licor.com/bio/products/imaging_systems/odyssey/odyssey_imager.jsp).

In-gel proteolytic digestion

In-gel digestion with trypsin was carried out as described by Shevchenko et al. (2006) and adapted to enable high-throughput processing in 96-well plates. These experiments were performed by Mr. Michael Randles (PhD student, University of Manchester).

The gels first underwent Coomassie staining (1 hour of staining in 3% Coomassie blue in dH2O followed by a 2 hour wash in acetic acid and an overnight wash in dH2O) to allow visualisation of the protein bands they contained, after which they were washed with distilled H2O. Each of the lanes was divided into 30 slices, which were subsequently chopped into ~1 mm³ pieces. The cubes from each slice were moved into the wells of a perforated V-bottomed 96-well plate (Proxeon, Odense, Denmark). The cubes were dehydrated in acetonitrile (ACN) for 5 min, after which centrifugation was used to remove the ACN. The gel samples were dried in a vacuum centrifuge and rehydrated for 1 hour at 56°C in reduction buffer (10 mM dithiothreitol, 25 mM NH4HCO3), after which the supernatant was removed by centrifugation. The gel cubes were then incubated in the dark at room temperature for 45 min in alkylation buffer (55 mM iodoacetamide, 25 mM NH4HCO3).

The gel cubes were then washed for 10 min in 25 mM NH4HCO3 and dehydrated for 5 min in ACN, and this wash and dehydration cycle was then repeated for 5 min in each case. The cubes were dried in a vacuum centrifuge in the presence of trypsin-containing digestion buffer (25 mM NH4HCO3 with 12.5 ng/μl sequencing-grade modified trypsin (Promega, Southampton, UK)) at 4°C for 45 min. After this the samples were incubated at 37°C for 16 hours.

Peptides were collected into a V-bottomed 96-well storage plate (Thermo Fisher Scientific, Waltham, MA, USA) via centrifugation. Those peptides still in the digestion buffer were removed through three isolation steps, one using 20 mM NH4HCO3 and the others using 5% (v/v) formic acid in 50% (v/v) ACN (all incubated for 20 min), each followed by centrifugation. The peptide extracts were pooled, the ACN was removed and the extracts were concentrated to a volume of 20 μl via vacuum centrifugation. These peptide samples were stored at −20°C until analysis by liquid chromatography–tandem mass spectrometry (LC-MS/MS).

Liquid chromatography–tandem mass spectrometry analysis

LC-MS/MS analysis was performed by Mr. Michael Randles in accordance with the methodology described by Lennon et al. (manuscript in press). A nanoACQUITY UltraPerformance LC system (Waters, Elstree, UK) coupled online to a LTQ Orbitrap analyzer (Applied Biosystems, Framingham, MA, USA) was used.

Peptide sample volumes of 5 μl, prepared using the in-gel proteolytic digestion protocol described above, were desalted and concentrated using a Symmetry C18 preparative column (20 mm length, 180 μm inner diameter, 5 μm particle size, 100 Å pore size; Waters). The peptides were then separated on an ACQUITY UltraPerformance LC bridged ethyl hybrid C18 analytical column (100 mm length, 75 μm inner diameter, 1.7 μm particle size, 130 Å pore size; Waters) using a 40-min linear gradient from 1% to 30% (v/v) ACN in 0.1% (v/v) formic acid at a flow rate of 300 nl/min at 50°C.

The mass spectrometer was set to acquire enhanced-resolution and product ion scans for peptides that had ion counts of >250,000 counts per second and were within a precursor ion mass-to-charge ratio (m/z) selection window of m/z 400–1600. Information-dependent acquisition (Analyst, version 1.4.1; Applied Biosystems) was used to acquire tandem mass spectra over the range m/z 140–1400 for the two most intense peaks, which were excluded for 12 seconds after two occurrences.

Bioinformatic Analysis

The Mascot Search script (mascot.dll, version 1.6b9; Matrix Science, London, UK) plug-in for Analyst software was first used to process detected spectra through a sequence of extraction, charge-state deconvolution and deisotoping steps. The files produced in this manner were searched against a modified version of the IPI mouse database. Searches, taking only tryptic peptides with a maximum of one missed cleavage into account, were submitted to the in-house Mascot server. Mass tolerances were set at 1.5 Da for precursor ions and 0.5 Da for fragment ions. The search results were validated using Scaffold (version Scaffold_2_00_05; Proteome Software, Portland, OR, USA). Mascot-generated results were imported into Scaffold for further analysis using the X! Tandem search engine (version 2007.01.01.1). Identifications were accepted if: 1) the peptides could be established with at least 90% probability as determined by the PeptideProphet algorithm (Nesvizhskii et al., 2003); and 2) the corresponding proteins were assigned a >99% probability as determined by the ProteinProphet algorithm and were supported by two or more unique, validated peptides.

The data were sorted to give a subset of proteins that were detected in a minimum of two out of the three biological replicates originally analysed. This step was used to confirm whether the given proteins were present in the sample (those that did not occur at the required frequency were considered artefacts). The percentage spectral count of each of these proteins was calculated by dividing the number of identified spectra for that protein by the total identified spectral count for the replicate it belonged to and multiplying by 100. The % spectral counts were further normalised by dividing them by the molecular weights (kDa) of the proteins they represent. The mean normalised spectral count for each protein was calculated from the values of each biological replicate in a given group (e.g. C57 female mean value = (A+B+C)/3).
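The normalisation steps described above can be illustrated with a minimal Python sketch. The function name, protein labels and counts below are hypothetical. Two details are assumptions made for the sake of the example rather than statements of the procedure actually followed: the replicate totals are taken over all identified spectra before the two-out-of-three filter is applied, and a replicate in which a retained protein was not detected contributes a value of zero to the mean.

```python
from statistics import mean


def normalised_spectral_counts(replicates, mol_weight_kda, min_replicates=2):
    """Mean molecular-weight-normalised % spectral count per protein.

    replicates: list of dicts, one per biological replicate, mapping a
        protein identifier to its identified spectral count.
    mol_weight_kda: dict mapping each protein identifier to its molecular
        weight in kDa.
    """
    totals = [sum(replicate.values()) for replicate in replicates]
    if any(total <= 0 for total in totals):
        raise ValueError("every replicate needs a positive total spectral count")
    proteins = set().union(*replicates)
    result = {}
    for protein in proteins:
        detected_in = sum(1 for r in replicates if r.get(protein, 0) > 0)
        if detected_in < min_replicates:
            continue  # treated as an artefact, as in the text
        values = []
        for replicate, total in zip(replicates, totals):
            percent = 100.0 * replicate.get(protein, 0) / total
            values.append(percent / mol_weight_kda[protein])
        result[protein] = mean(values)  # e.g. (A + B + C) / 3
    return result


# Hypothetical spectral counts for three replicates of one group.
replicates = [
    {"protein_A": 10, "protein_B": 10},
    {"protein_A": 30, "protein_B": 10},
    {"protein_A": 20, "protein_C": 20},
]
weights = {"protein_A": 100.0, "protein_B": 50.0, "protein_C": 10.0}
print(normalised_spectral_counts(replicates, weights))
```

In this example protein_C is discarded because it appears in only one replicate, while protein_A and protein_B are retained and averaged across all three.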

The conversion tools available on the DAVID (http://david.abcc.ncifcrf.gov/) or UniProt (www.uniprot.org/) websites were used to determine the UniProt accession numbers, gene IDs, Ensembl gene IDs and gene names of each of the proteins identified, starting from the IPI numbers provided by Scaffold. OpenOffice software was used to ensure these new lists of gene names and identifiers were correctly matched up with the original MS data.

The DAVID and Matrisome databases were then used to identify ECM proteins. In DAVID the ontological classification 'extracellular region' was used, and in the Matrisome database proteins were matched via UniProt accession numbers using OpenOffice.

The list of ECM proteins was categorised into structural proteins (subcategorised into collagens, glycoproteins and proteoglycans) and ECM-associated proteins. Duplications in the data set, such as those entries from the same gene that shared UniProt accession numbers, were removed.
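The matching logic described above (flagging a protein as a candidate ECM component if it is annotated as extracellular in DAVID or listed in the Matrisome database, and removing duplicate accession numbers) was carried out in OpenOffice. Purely as an illustration, the same logic can be sketched in Python as follows. The accession numbers, gene names and category labels in the example are placeholders, and the subsequent manual confirmation by literature search is not represented.

```python
def identify_ecm_candidates(identified, david_extracellular, matrisome):
    """Flag candidate ECM proteins by accession number.

    identified: list of (accession, gene_name) tuples from the MS data.
    david_extracellular: set of accessions annotated 'extracellular region'.
    matrisome: dict mapping accession -> matrisome category.
    Entries sharing an accession number are reported only once.
    """
    seen = set()
    candidates = []
    for accession, gene_name in identified:
        if accession in seen:
            continue  # duplicate entry for the same accession
        seen.add(accession)
        in_david = accession in david_extracellular
        category = matrisome.get(accession)
        if in_david or category is not None:
            candidates.append({
                "accession": accession,
                "gene": gene_name,
                "in_david": in_david,
                "matrisome_category": category,
            })
    return candidates


# Placeholder identifiers, for illustration only.
identified = [
    ("ACC001", "gene_a"),
    ("ACC002", "gene_b"),
    ("ACC001", "gene_a"),  # duplicate
    ("ACC003", "gene_c"),
]
david = {"ACC001", "ACC003"}
matrisome = {"ACC001": "collagen", "ACC002": "glycoprotein"}

for row in identify_ecm_candidates(identified, david, matrisome):
    print(row)
```

Taking the union of the two databases, rather than their intersection, reflects the stated aim of minimising the chance of any ECM protein being missed.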

RNA Microarray analysis of isolated glomeruli

Microarray profiling of the glomerular transcriptome of male and female C57 and FVB mice was carried out by Long et al. and shared with this investigation. The glomeruli of eighteen-week-old male and female C57 and FVB/N mice (n=3 in each group) were isolated through the Dynabead-based isolation method described above. RNA was prepared using an RNeasy kit (Qiagen, Crawley, UK). RNA quality was assessed on the Bioanalyser 2100 (Agilent Technologies, Palo Alto, CA); cDNA and cRNA synthesis was subsequently performed and the product hybridised to mouse MOE430 2.0 GeneChips.

Functional categories for genes were assigned using the Database for Annotation, Visualization and Integrated Discovery software (http://david.abcc.ncifcrf.gov) (Huang et al., 2008, 2009). For transcripts up- or down-regulated with albuminuria, the ClustalW tool (http://www.ebi.ac.uk/Tools/clustalw/index.html) was used to determine whether the 1000 bp upstream of the transcription start site, in addition to either the complete 5′-untranslated region or the first exon of the 5′-untranslated region when this is split over several exons, contained sequences with at least 75% homology to the consensus sequence of the androgen response element (AGAACAnnnTGTTCT) or estrogen response element (AGGTCAnnnTGACCT). The data were compared with prior studies that have identified QTLs related to albumin excretion in the mouse and candidate genes implicated in albuminuria using a genome-wide association approach (Long et al., 2013).

Analysis of RNA Microarray results

The following procedure was performed in order to validate the RNA microarray data obtained from Long et al. and so allow an accurate comparison between the transcriptomes and proteomes of male and female FVB and C57 mice. The analysis was performed by Dr Leo Zeef, Faculty of Life Sciences, University of Manchester. Signal values for transcripts on the Affymetrix array were calculated using the MAS 5.0 algorithm to generate .chp files, which were exported to GeneSpring 9.0 (Agilent Technologies, Wokingham, Berkshire, UK) for further analysis. The MAS 5.0-generated values were log2 transformed and normalized to the median within each array (to control for array loading); these values were then baseline transformed to the median value of each transcript. Transcripts were filtered to exclude genes whose expression did not reach a threshold value for reliable detection (based on the relaxed Affymetrix MAS 5.0 probability of detection; P≤0.1) in at least 1 of the 12 chips assessed.
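The three transformation steps applied to the signal values (log2 transformation, per-array median normalisation and per-transcript baseline transformation) can be illustrated with a short Python sketch. This is not the GeneSpring implementation: the function name, chip names, transcript names and signal values are hypothetical, and it is assumed that "normalising to the median" on log2 data means subtracting the relevant median. The detection filter is not covered.

```python
import math
from statistics import median

def normalise_arrays(signals):
    """signals: dict array name -> dict transcript -> raw signal (> 0)."""
    # Step 1: log2 transform every signal value.
    logged = {a: {t: math.log2(v) for t, v in vals.items()}
              for a, vals in signals.items()}

    # Step 2: centre each array on its own median (controls for array loading).
    centred = {}
    for array, vals in logged.items():
        array_median = median(vals.values())
        centred[array] = {t: v - array_median for t, v in vals.items()}

    # Step 3: baseline transform each transcript to its median across arrays.
    transcripts = list(next(iter(centred.values())))
    baseline = {t: median(centred[a][t] for a in centred) for t in transcripts}
    return {a: {t: v - baseline[t] for t, v in vals.items()}
            for a, vals in centred.items()}

# Hypothetical raw signals for three chips and three transcripts.
signals = {
    "chip1": {"g1": 4, "g2": 8, "g3": 16},
    "chip2": {"g1": 8, "g2": 16, "g3": 128},
    "chip3": {"g1": 2, "g2": 4, "g3": 8},
}
normalised = normalise_arrays(signals)
```

After these steps a value of 0 means that a transcript sits at its own median level, so in this toy data set only g3 on chip2 stands out (two log2 units, i.e. four-fold, above its baseline).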

Statistical analysis of RNA Microarray data

The following procedure was performed so that pre-existing RNA array values for particular genes could be compared with the corresponding MS protein data.

This approach was implemented because it enabled two otherwise incompatible data sets to be visually compared. To determine genes modified by strain and sex, a two-way analysis of variance was performed. Transcripts that were shown to be significantly altered by both strain and sex (P<0.05 after applying the Benjamini and Hochberg false discovery multiple testing correction (Pounds, 2006)) were used in subsequent analysis. Expression levels in female FVB/N versus female B6 mice, male FVB/N versus male B6 mice, male versus female B6 mice, and male versus female FVB/N mice were then examined. Genes were excluded if mean normalized expression levels did not vary by >1.5-fold in all of these individual comparisons.
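The multiple testing correction and the fold-change filter can be sketched as follows. This is a minimal illustration under stated assumptions rather than the GeneSpring procedure: the two-way analysis of variance itself is not reproduced, the p-values and group means are hypothetical, the group means are assumed to be on a log2 scale, and the exclusion rule is read here as requiring a >1.5-fold difference in every one of the four comparisons.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted

def passes_fold_filter(log2_means, comparisons, threshold=1.5):
    """True if every pairwise comparison differs by more than `threshold`-fold."""
    return all(2 ** abs(log2_means[a] - log2_means[b]) > threshold
               for a, b in comparisons)

# Hypothetical raw p-values for four transcripts.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
significant = [p < 0.05 for p in adjusted]

# The four comparisons named in the text, applied to hypothetical log2 means.
comparisons = [("FVB_F", "B6_F"), ("FVB_M", "B6_M"),
               ("B6_M", "B6_F"), ("FVB_M", "FVB_F")]
kept = passes_fold_filter(
    {"FVB_F": 3.0, "B6_F": 2.0, "FVB_M": 1.0, "B6_M": 0.0}, comparisons)
dropped = passes_fold_filter(
    {"FVB_F": 1.0, "B6_F": 0.9, "FVB_M": 1.0, "B6_M": 0.0}, comparisons)
```

In the toy p-value list only the first transcript remains below 0.05 after correction, which shows how the adjustment removes borderline results that would pass an uncorrected threshold.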

The OpenOffice software package was used to create a single list of genes whose products met the required levels of abundance (as described in the methods above) in both the list of ECM proteins produced from the MS analysis of glomerular ECM and the RNA array data set.

The remaining transcripts were clustered according to whether the values they represented were indicative of male/female (i.e. sex) or FVB/C57 (i.e. strain) comparisons.

An alternative approach would have been to determine corresponding gene products in the RNA and MS data sets by using OpenOffice to match gene names (instead of UniProt accession numbers) in the two data sets. However, while this method would have produced a much longer list of matched gene products, it would not have taken into account that multiple transcript variants can be produced from one gene, and so would have provided a less accurate comparison.

Comparison of Mass Spectrometric and RNA Microarray data sets

An analysis was carried out to compare how closely the pattern of protein abundance in the MS data matched the pattern of expression revealed by the RNA array data. So that the relationship between the relative levels of RNA and protein could be investigated on a gene-by-gene basis, a series of paired bar charts showing corresponding RNA array and MS NSC values was plotted using Microsoft Excel.

To provide an objective evaluation of the closeness of the association between array and MS values corresponding to matched gene products, a correlation coefficient was employed. The Pearson product-moment correlation coefficient (R) was used to determine the degree of match, i.e. correlation, between the corresponding pairs of MS and RNA array values for each of the four mouse groups for each of the individual genes. The value of R is a statistical measure of how well the trend line approximates the real data points. An R value of 1 indicates that the trend line fits the data perfectly, with one variable increasing in direct proportion to the other. A value of -1 reflects a perfect inverse relationship, and a value of zero indicates that there is no simple linear relationship between the two variables at all. However, a statistically significant outcome for the Pearson test does not necessarily indicate that there is a functionally close relationship between the two variables. Consequently, the Coefficient of Determination (R²) is preferable (Dytham, 2011). It gives an indication of the proportion of the variability in the data determined by the factor of interest, e.g. an R² value of 0.88 indicates that 88% of the variability in one factor is associated with the second factor (the remaining 12% of the variability in values being associated with other factors including random variation). The default measure of correlation reported by Microsoft Excel is the Coefficient of Determination, R², rather than the raw R value.
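For illustration, the relationship between R and R² can be made concrete with a short Python sketch of the Pearson calculation for a single gene. The function name and the four pairs of RNA array and NSC values are hypothetical; the analysis itself was carried out in Excel.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient for paired values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for one gene in the four mouse groups
# (FVBM, FVBF, C57M, C57F).
rna_values = [1.0, 2.0, 3.0, 4.0]
nsc_values = [1.0, 3.0, 2.0, 4.0]

r = pearson_r(rna_values, nsc_values)
r_squared = r ** 2
```

Because R² is the square of R it always lies between 0 and 1 and no longer carries the sign of the relationship, so the direction (direct or inverse) has to be read from R itself or from the slope of the trend line.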

Some degree of caution should be employed when interpreting the results of Pearson product-moment correlation analysis because it is based on the assumption that the data are normally distributed, and so could give an inaccurate indication of the strength of an association if the data analyzed are not so distributed. However, in this case the test was only used to provide an objective means of allocating each set of RNA array and MS values corresponding to matched gene-product pairs to one of three classes depending on the nature of the relationship: 1) directly proportional, 2) inversely proportional, 3) random relationship. This correlation analysis was only carried out in those cases where both the RNA and the corresponding protein were detected in all four mouse groups. A fourth category (protein absent) included all of those genes for which there was a mismatch, in that RNA was present in all four mouse groups but the protein product was not detected in one or more of them. Correlation analysis is not valid in this case.
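The allocation of each gene to one of the four categories could be expressed as a simple rule, sketched below in Python. The cut-off separating a "random" relationship from a directional one is not stated in the text, so the value of 0.5 used here is purely a hypothetical placeholder, as are the function name, the use of None to mark an undetected protein and the example values.

```python
import math

def classify_gene(rna, protein, cutoff=0.5):
    """Allocate one gene to a relationship category.

    rna, protein: values for the four mouse groups, in the same order.
    protein entries are None where the protein was not detected.
    cutoff: hypothetical |R| threshold (not taken from the thesis).
    """
    if any(value is None for value in protein):
        return "protein absent"  # correlation analysis is not valid here

    n = len(rna)
    mean_rna = sum(rna) / n
    mean_protein = sum(protein) / n
    sxy = sum((x - mean_rna) * (y - mean_protein) for x, y in zip(rna, protein))
    sxx = sum((x - mean_rna) ** 2 for x in rna)
    syy = sum((y - mean_protein) ** 2 for y in protein)
    if sxx == 0 or syy == 0:
        return "random"  # no variation, so R is undefined

    r = sxy / math.sqrt(sxx * syy)
    if r >= cutoff:
        return "directly proportional"
    if r <= -cutoff:
        return "inversely proportional"
    return "random"

# Hypothetical examples, one per category.
examples = {
    "gene_a": classify_gene([1, 2, 3, 4], [2, 4, 6, 8]),
    "gene_b": classify_gene([1, 2, 3, 4], [8, 6, 4, 2]),
    "gene_c": classify_gene([1, 2, 3, 4], [2, 1, 1, 2]),
    "gene_d": classify_gene([1, 2, 3, 4], [2, None, 6, 8]),
}
```

Checking for missing protein values before any correlation is computed mirrors the point made above that the analysis is only meaningful when both RNA and protein were detected in all four groups.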

Results Part 1: Comparison of glomerular isolation methodologies

The first stage of this project was to determine the most effective method of producing the purest possible murine glomerular isolate. The aim of this stage was to produce a sample in which whole glomeruli are isolated from their surrounding Bowman’s capsules (decapsulated) and not accompanied by fragments of tubules or other impurities that could affect the compositional readout given by MS analysis (Figure 4).

Two methods were available for comparison in this investigation: the sieving-based method documented by Kirkwood et al. (1970) and the Dynabead perfusion-based approach used by Takemoto et al. (2002) and more recently by Long et al. (2013).

These two isolation techniques were used to prepare samples of glomeruli from kidneys from mice of the C57 strain. The samples were compared in terms of both number of glomeruli isolated and the percentage purity of the product.

[Light microscope image. Image annotations: 'Glomeruli'; '70 µm'.]

Figure 4: Light microscope image of a relatively pure glomerular isolate: An isolation of glomeruli showing four glomeruli obtained through the sieve-based glomerular isolation method. Note the lack of cell fragment impurities.

The average numbers of glomeruli and tubule fragments and the percentage purities (glomerular number as a percentage of the combined glomerular and tubular fragment counts) were then calculated.

These values were then used to determine the numbers of glomeruli present in the final isolates, thereby quantifying the effectiveness of the two methodologies. These data are summarised in Figure 5. The percentage purity of the glomerular isolations (produced through either the sieving or Dynabead-based isolation methods) was calculated using the following formula: percentage purity = (glomerular number × 100) / (glomerular number + tubular fragment number).

[Bar chart. Vertical axis: glomerular/tubular fragment no. Plotted means, sieve-based isolation method: 1813.3 glomeruli/ml and 586.7 tubule fragments/ml; Dynabead-based isolation method: 8955.6 glomeruli/ml and 1061.1 tubule fragments/ml.]

Figure 5 Mean numbers of whole glomeruli from sieving and Dynabead based isolation methods: Bar chart showing mean numbers of whole (i.e. unfragmented) glomeruli (to 1 decimal place), calculated from three replicates in each case, and tubule fragments present in isolates produced using the sieving and Dynabead perfusion-based isolation methods using two mouse kidneys per replicate in both cases. Bars show standard error of the mean values for glomerular numbers/tubular fragments.

It was found that the sieving-based method isolates an average of 1813 glomeruli and 587 'fragments' (to the nearest whole number) for every two kidneys processed, while the Dynabead-based method produces an average of 8956 glomeruli and 1061 fragments, i.e. the Dynabead-based method produces isolates that contain, on average, 4.94 times as many glomeruli as the sieving method.

The data show that the Dynabead method produces samples with an average purity of 89.4%, compared with only 75.6% for the sieving-based method. The average percentage purity of the isolates produced via the Dynabead-based method is therefore 13.8 percentage points higher than that of the sieve-based method.
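As a worked check of the figures quoted above, the purity formula can be applied to the mean counts plotted in Figure 5. The short Python sketch below uses those means only; the function and variable names are illustrative, and it is an assumption that the reported purities were derived from the mean counts rather than averaged across individual replicates.

```python
def percentage_purity(glomeruli, tubule_fragments):
    """Glomerular number x 100 / (glomerular + tubular fragment numbers)."""
    return 100.0 * glomeruli / (glomeruli + tubule_fragments)

# Mean counts taken from Figure 5.
sieve_purity = percentage_purity(1813.3, 586.7)
dynabead_purity = percentage_purity(8955.6, 1061.1)

# Relative yield of glomeruli, Dynabead method versus sieving method.
fold_yield = 8955.6 / 1813.3

print(round(sieve_purity, 1), round(dynabead_purity, 1), round(fold_yield, 2))
```

Rounded to the precision used in the text, these give 75.6% and 89.4% purity and a 4.94-fold difference in glomerular yield.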

Discussion

The Dynabead-based glomerular isolation methodology generates isolates of greater purity that contain far greater numbers of glomeruli than the sieve-based method. The much higher numbers of glomeruli isolated (nearly five times that of the sieve-based method) and the greater purity of the samples make the Dynabead-based method preferable, because higher yields of less contaminated glomeruli will generate more reliable, robust and representative data through MS analysis of the glomerular ECM.

However, when the workflows summarising the methodologies (Figures 6A and B) are compared, key differences between the two approaches are apparent, which have implications for their application in practice. The Dynabead-based method is more complex, and therefore more time-consuming, than the sieving-based method. Additionally, it requires surgical skill to perfuse live mice effectively through the left ventricle while the animal is under terminal anaesthesia. This procedure also has more demanding Home Office licensing requirements.

Consequently, the optimal technique to use depends not only on the efficacy of the isolation methods but also on the availability of resources, including access to appropriately experienced technical expertise.

Figure 6A Workflow of sieving-based glomerular isolation method: Workflow showing key stages of the sieving-based glomerular isolation procedure.

Figure 6B Workflow of Dynabead-based glomerular isolation method: Workflow showing key stages of the dynabead based glomerular isolation protocol.

Results Part 2: Enrichment of Isolated Glomerular Extracellular Matrix Proteins

The second aim of the project was to determine whether the buffer-based enrichment method used for human samples in the research laboratory of my primary supervisor, Dr Rachel Lennon, is suitable for use on murine renal tissue. After the most effective method for isolating murine glomeruli had been established, the triple-buffer based ECM enrichment procedure (as previously employed by Lennon et al. (in press)) was used to remove the unwanted glomerular cellular components and produce an enriched ECM protein suspension. The effectiveness of this procedure was assessed using Coomassie staining and Western blotting to ensure that each buffer was removing the desired glomerular cellular components. Successful enrichment would be demonstrated by the presence of cellular proteins in buffer fractions 1-3 (fractions 1 and 2 would be expected to contain cytoplasmic proteins, while fraction 3 would contain nuclear proteins), while ECM proteins would remain in the fourth ECM fraction. 20 µl samples of fractions 1-4 (in addition to a protein marker) and the final enriched protein suspension were run side by side (Figure 7).

Figure 7 Coomassie stained gel and Western blots showing enrichment of glomerular ECM proteins: Coomassie stained gels (top) and Western blots (bottom five panels) demonstrating successful enrichment of glomerular ECM proteins. Fractions 1, 2 and 3 were derived from successive buffer washes designed to remove cellular protein components; leaving enriched ECM proteins within the final extract (Enriched ECM). Western blots: Collagen IV and Laminin are components of the ECM; HSP70, tubulin and CD2AP are cellular components.

A stepwise decline in the intensity of the Coomassie staining of the protein bands for Fractions 1, 2 and 3 in the gel shows that protein was successfully removed by each of the buffer washes. Densely stained bands in the final glomerular extract indicate that proteins were retained. The Western blots confirm that enrichment of the glomerular ECM was achieved, as the signals corresponding to the ECM components (collagen IV and laminin) were detected most strongly in the ECM fraction, which indicates a strong presence of these proteins in the final extract. Fainter bands within the buffer washes, Fractions 1, 2 and 3, show that there was only a small loss of laminin during the enrichment process and negligible loss of collagen IV. Intracellular proteins, represented by HSP70, CD2AP and tubulin, are potential impurities, which it is desirable to remove from the final ECM extract. Tubulin and HSP70 are present at progressively lower concentrations in fractions 1-3, as indicated by increasingly faint staining of the corresponding bands. In contrast, they are absent or at comparatively low concentrations in the final enriched fraction.

However, removal of CD2AP appears to be less successful. This protein is also a cellular component; it shows a fainter band in fraction 1 than in the ECM fraction, indicating negligible presence in the buffer washes, and some level of retention within the final ECM extract. However, it is not possible to remove all cellular components and the purpose was to create an enriched ECM fraction.

Discussion

These results confirm that enrichment of the glomerular ECM was achieved. While not all cellular proteins (e.g. CD2AP) were fully removed from the final isolate, the relative abundance of unwanted cellular protein components was visibly reduced. This purification and enrichment should make analysis of the samples by MS more effective, because it lowers the probability both of the ECM proteins, the focus of further investigation, being masked by an abundance of unwanted proteins, and of less abundant proteins falling below the limits of detection. It was therefore concluded that the buffer-based enrichment method developed by my primary supervisor, Dr Rachel Lennon, was suitable for use with isolated mouse glomeruli.

Results Part 3: Analysis of Mass Spectrometric Data

The third aim of the project was to analyse MS data gathered from enriched glomerular ECM from male and female mice of the C57 and FVB strains in order to determine whether differences exist between the glomerular ECM proteomes of different sexes and strains of mice, and to characterise any differences that are identified. Enriched glomerular ECM samples were analysed using an LTQ Orbitrap analyser. Proteins were identified with >99% confidence and their relative abundance was measured through their spectral counts. Those proteins known to be associated with the ECM compartment were identified using the online database DAVID. For the purposes of this investigation, proteins were considered 'present' in a mouse group if >2 of their constituent peptides were detected in at least two of the three biological replicates of that group. The four lists of glomerular ECM proteins that correspond to each mouse group are appended (Appendix 1). A Venn diagram was constructed from these data to indicate the patterns of occurrence of the proteins amongst the mouse strain and sex groups (Figure 8). The proteins within each category (set) of the Venn diagram are listed in a separate table together with the gene that codes for them and the functional category of ECM protein to which the protein belongs (Table 1).
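The way in which four per-group protein lists are turned into Venn diagram categories can be sketched in Python as follows. This is an illustrative sketch only: the function name is hypothetical, and the input is a three-gene excerpt (Col4a1, Umod and Agrn, placed according to Table 1) rather than the full lists in Appendix 1.

```python
from collections import Counter

def venn_regions(groups):
    """Count proteins in each Venn region.

    groups: dict mapping mouse group -> set of gene names detected in it.
    Returns a Counter keyed by the frozenset of groups sharing a protein.
    """
    membership = {}
    for group, proteins in groups.items():
        for protein in proteins:
            membership.setdefault(protein, set()).add(group)
    return Counter(frozenset(found_in) for found_in in membership.values())

# Small excerpt for illustration, not the complete data set.
groups = {
    "FVBM": {"Col4a1", "Agrn"},
    "FVBF": {"Col4a1", "Agrn", "Umod"},
    "C57M": {"Col4a1"},
    "C57F": {"Col4a1", "Umod"},
}
regions = venn_regions(groups)

common = regions[frozenset({"FVBM", "FVBF", "C57M", "C57F"})]
female_only = regions[frozenset({"FVBF", "C57F"})]
fvb_only = regions[frozenset({"FVBM", "FVBF"})]
```

Each protein is assigned to exactly one region, namely the precise combination of groups in which it was detected, so the region counts always sum to the total number of distinct proteins.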

Figure 8 Venn diagram of MS analysis of enriched glomerular isolates from male and female mice from the C57 and FVB strains: Venn Diagram of MS analysis of enriched glomerular ECM isolates from male and female mice from the C57 and FVB strains; A total of 120 ECM proteins were detected. Bracketed numerals on the diagram indicate the number of proteins falling within each category (set), which are identified by upper case letters. Lists of the proteins within each category are given in the associated table (Table 1).

A (59)
Protein name | Gene name | Category
Collagen alpha-1(XII) chain | Col12a1 | Collagens
Procollagen type XV | Col15a1 | Collagens
Collagen alpha-1(XVIII) chain | Col18a1 | Collagens
Collagen alpha-1(I) chain | Col1a1 | Collagens
Collagen alpha-1(IV) chain | Col4a1 | Collagens
Collagen, type IV, alpha 2 | Col4a2 | Collagens
Collagen alpha-3(IV) chain | Col4a3 | Collagens
Collagen alpha-4(IV) chain | Col4a4 | Collagens
Col4a5 protein | Col4a5 | Collagens
Procollagen type IV alpha 6 | Col4a6 | Collagens
Collagen alpha-1(VI) chain | Col6a1 | Collagens
Collagen alpha-2(VI) chain | Col6a2 | Collagens
Protein Col6a3 | Col6a3 | Collagens
EMILIN-1 | Emilin1 | ECM Glycoproteins
Fibrillin 1 | Fbn1 | ECM Glycoproteins
Fibrinogen beta chain | Fgb | ECM Glycoproteins
Putative uncharacterized protein | Fn1 | ECM Glycoproteins
Laminin subunit alpha-1 | Lama1 | ECM Glycoproteins
Laminin subunit alpha-3 | Lama3 | ECM Glycoproteins
Laminin subunit alpha-4 | Lama4 | ECM Glycoproteins
Laminin, alpha 5 | Lama5 | ECM Glycoproteins
Laminin subunit beta-1 | Lamb1 | ECM Glycoproteins
Laminin, beta 2 | Lamb2 | ECM Glycoproteins
Laminin subunit gamma-1 | Lamc1 | ECM Glycoproteins
Lactadherin | Mfge8 | ECM Glycoproteins
Multimerin-2 | Mmrn2 | ECM Glycoproteins
Nidogen-1 | Nid1 | ECM Glycoproteins
Putative uncharacterized protein | Nid2 | ECM Glycoproteins
Netrin-4 | Ntn4 | ECM Glycoproteins
Papilin, -like sulfated glycoprotein | Papln | ECM Glycoproteins
Peroxidasin homolog (Drosophila) | Pxdn | ECM Glycoproteins
Tubulointerstitial nephritis antigen | Tinag | ECM Glycoproteins
Tubulointerstitial nephritis antigen-like 1 | Tinagl1 | ECM Glycoproteins
Tenascin | Tnc | ECM Glycoproteins
Vitronectin | Vtn | ECM Glycoproteins
von Willebrand factor A domain-containing protein 8 | Vwa8 | ECM Glycoproteins
von Willebrand factor | Vwf | ECM Glycoproteins
Basement membrane-specific proteoglycan core protein | Hspg2 | Proteoglycans
Serum albumin | Alb | ECM-associated
Alkaline phosphatase | Alpl | ECM-associated
Annexin A2 | Anxa2 | ECM-associated
Complement C3 | C3 | ECM-associated
Complement C4-B | C4b | ECM-associated
Clusterin | Clu | ECM-associated
Ceruloplasmin | Cp | ECM-associated
Dipeptidylpeptidase 4 | Dpp4 | ECM-associated
Glutathione peroxidase 3 | Gpx3 | ECM-associated
Gelsolin, isoform CRA_c | Gsn | ECM-associated
Heat shock 70kD protein 5 (Glucose-regulated protein) | Hspa5 | ECM-associated
Hyaluronidase-2 | Hyal2 | ECM-associated
Ig mu chain C region secreted form | Igh-6 | ECM-associated
Inter-alpha trypsin inhibitor, heavy chain 1, isoform CRA_a | Itih1 | ECM-associated
Meprin A subunit alpha | Mep1a | ECM-associated
Meprin A subunit beta | Mep1b | ECM-associated
Nephronectin | NPNT | ECM-associated
MCG23455, isoform CRA_e | Rpl13a | ECM-associated
Putative uncharacterized protein | Serpinh1 | ECM-associated
Protein-glutamine gamma-glutamyltransferase K | Tgm1 | ECM-associated
Protein-glutamine gamma-glutamyltransferase 2 | Tgm2 | ECM-associated

B (9)
Protein name | Gene name | Category
Collagen, type XIV, alpha 1 | Col14a1 | Collagens
Collagen alpha-2(I) chain | Col1a2 | Collagens
Putative uncharacterized protein | Col3a1 | Collagens
Protein Col6a3 | Col6a3 | Collagens
Transforming growth factor, beta induced | Tgfbi | ECM Glycoproteins
Prolargin | Prelp | Proteoglycans
Apolipoprotein E | Apoe | ECM-associated
Inter-alpha-trypsin inhibitor heavy chain H5 | Itih5 | ECM-associated
Ladinin-1 | Lad1 | ECM-associated

C (1)
Protein name | Gene name | Category
Inter alpha-trypsin inhibitor, heavy chain 4 | Itih4 | ECM-associated

D (1)
Protein name | Gene name | Category
Deoxyribonuclease-1 | Dnase1 | ECM-associated

E (4)
Protein name | Gene name | Category
Fibrinogen gamma chain | Fgg | ECM Glycoproteins
Apolipoprotein O-like | Apool | ECM-associated
Murinoglobulin-1 | Mug1 | ECM-associated
Alpha-1-antitrypsin 1-1 | Serpina1a | ECM-associated

F (3)
Protein name | Gene name | Category
von Willebrand factor A domain-containing protein 1 | Vwa1 | ECM Glycoproteins
Serine protease HTRA1 | Htra1 | ECM-associated
Superoxide dismutase | Sod3 | ECM-associated

G (10)
Protein name | Gene name | Category
Collagen alpha-5(VI) chain | Col6a5 | Collagens
Hyaluronan and proteoglycan link protein 1 | Hapln1 | Proteoglycans
Basement membrane-specific heparan sulfate proteoglycan core protein | Hspg2 | Proteoglycans
Agrin | Agrn | ECM Glycoproteins
C4b-binding protein | C4bpa | ECM-associated
Heparin-binding growth factor 2 | Fgf2 | ECM-associated
Protein Gm20425 | Gm20425 | ECM-associated
Ig gamma-1 chain C region, membrane-bound form | Ighg1 | ECM-associated
Inter alpha-trypsin inhibitor, heavy chain 4 | Itih4 | ECM-associated
Kielin/chordin-like protein | Kcp | ECM Glycoproteins
Serine protease inhibitor A3K | Serpina3k | ECM-associated

H (5)
Protein name | Gene name | Category
Extracellular matrix protein FRAS1 | Fras1 | ECM Glycoproteins
Biglycan | Bgn | Proteoglycans
GLI pathogenesis-related 2 | Glipr2 | ECM-associated
Inter-alpha-trypsin inhibitor heavy chain H2 | Itih2 | ECM-associated
Uromodulin | Umod | ECM-associated

I (3)
Protein name | Gene name | Category
Probable carboxypeptidase PM20D1 | Pm20d1 | ECM Glycoproteins
Dehydrogenase/reductase (SDR family) member 8, isoform CRA_b | HSD17B11 | ECM-associated
MCG140784 | Try10 | ECM-associated

J (18)
Protein name | Gene name | Category
Collagen alpha-2(V) chain | Col5a2 | Collagens
Alpha-1B-glycoprotein | A1bg | ECM Glycoproteins
EMI domain-containing protein 1 | Emid1 | ECM Glycoproteins
Fibulin-5 | Fbln5 | ECM Glycoproteins
Insulin-like growth factor-binding protein complex acid labile subunit | Igfals | ECM Glycoproteins
Thrombospondin type-1 domain-containing protein 4 | Thsd4 | ECM Glycoproteins
Tenascin-X | Tnxb | ECM Glycoproteins
Alpha-2-macroglobulin | A2m | ECM-associated
A disintegrin and metalloproteinase with thrombospondin motifs 13 | Adamts13 | ECM-associated
Aminoacyl tRNA synthase complex-interacting multifunctional protein 1 | Aimp1 | ECM-associated
Angiopoietin-related protein 2 | Angptl2 | ECM-associated
Apolipoprotein A-I | Apoa1 | ECM-associated
Complement C1q subcomponent subunit C | C1qc | ECM-associated
Calreticulin | Calr | ECM-associated
Carboxypeptidase N subunit 2 | Cpn2 | ECM-associated
Ecm1 protein | Ecm1 | ECM-associated
Mannose-binding protein C | Mbl2 | ECM-associated
Brain cDNA, clone MNCb-5810, similar to Mus musculus tissue inhibitor of metalloproteinase 3 (Timp3), mRNA | Timp3 | ECM-associated

K (3)
Protein name | Gene name | Category
Coatomer subunit alpha | Copa | ECM-associated
Glypican-1 | Gpc1 | ECM-associated
14-3-3 protein sigma | Sfn | ECM-associated

L (4)
Protein name | Gene name | Category
Arginase-1 | Arg1 | ECM-associated
Complement component C9 | C9 | ECM-associated
Igh protein | Gm16844 | ECM-associated
Igh protein | Gm16844 IGH | ECM-associated

Table 1 Lists of proteins within each category of the Venn Diagram: Lists of proteins relating to protein categories shown in the Venn diagram (Figure 8). Categories are indicated by letters, and numbers of proteins are shown in brackets, e.g. proteins in category H (5) of the Venn diagram (i.e. proteins present in both FVB females and C57 females) are shown in the table under H (5). The table lists the names of the proteins present in each of the lists (column 1), the gene that codes for them (column 2) and the functional category of ECM protein the protein belongs to (column 3).

There are 59 proteins, named the common matrisome for the purposes of this investigation, which are present in all four sex/strain groups i.e. occur in both sexes of both strains. In terms of sex, there are twenty six proteins associated exclusively with the female sex; of these five occur in females of both strains, three only occur in C57 females, but by far the largest set is the 18 proteins that occur exclusively in females from the FVB strain. Seven proteins are restricted to males, but none of them occur in males of both strains; four are found only in C57 males and three are restricted to FVB males.

In terms of strain, only seven proteins are restricted to the C57 strain, but none of them occur in both sexes; four are found only in C57 males and three are restricted to C57 females. In contrast to the C57 strain, 31 proteins are associated solely with the FVB strain; ten occur in both sexes, three occur only in FVB males, and 18 in FVB females.

Conversely, there are five proteins that are present in female mice from both the C57 and FVB strains while there are no proteins that are expressed exclusively by male mice of both strains.

Fifteen proteins occur in three of the groups i.e. they are absent from just one of the groups. Only one of the 15 is missing from FVB females, and similarly only one is missing from C57 males. Four proteins are absent from C57 females. The most deficient group in this respect is FVB males apparently lacking nine proteins found in all other groups.

The 10 FVB-only and 5 female-only proteins are of key importance to this investigation as they form the only two qualitative protein groupings that are clearly strain or sex specific. Five well characterised ECM proteins, which could potentially be involved in influencing the rates of albuminuria in the mouse groups investigated, are shown below (Figure 9).

[Bar charts, panels (A) Umod, (B) Fras1, (C) Fgf2, (D) Serpina3k and (E) Agrn. Vertical axis: NSC. Horizontal axis: mouse strain/sex group (FVBM, FVBF, C57M, C57F).]

Figure 9 Bar charts summarizing MS data values for protein products of selected genes characteristic of one sex or one strain: Bar charts summarizing MS data values for protein products of genes restricted to females of both strains, (A) Umod (Uromodulin) and (B) Fras1 (Extracellular matrix protein FRAS1), and protein products of selected genes restricted to males and females of strain FVB only, (C) Fgf2 (fibroblast growth factor 2), (D) Serpina3k and (E) Agrn (Agrin). Bar charts show mean NSC values for each strain/sex mouse group.

Uromodulin (Umod) and Extracellular matrix protein FRAS1 (Fras1) are examples of proteins that are only found in female mice (Figure 9 A and B). Uromodulin is present in greater quantities in C57 females than FVB females. In contrast, Fras1 is present in greater quantities in FVB than C57 females. They illustrate the point that the relative amount of protein in the two strains is not consistent for all genes in that category.

Fibroblast growth factor 2 (Fgf2), Serpina3k and Agrin (Agrn) (Figure 9 C, D and E) are examples of genes for which the protein products were found only in the FVB strain, in both males and females. In these cases there is always more of the protein in the females than in the males. The majority of the proteins found only in the FVB strain follow this trend, the exceptions being Col6a5, Ighg1 and Kcp.

The total numbers of proteins present in each mouse group, as well as the numbers of proteins unique to each group, are shown below (Figure 10A). In addition, these proteins were characterised, based on their functions, as belonging either to one of the three subgroups of structural proteins (collagens, glycoproteins or proteoglycans) or to the ECM-associated proteins (which include regulators, immunogenic and other, typically globular, proteins secreted into the ECM). To identify whether there were any broad consistencies between the roles of the unique-to-group proteins, their relative numbers and the categories to which they belong are shown below (Figure 10B).

[Bar charts. (A) Vertical axis: no. of proteins; horizontal axis: mouse groups (FVBM, FVBF, C57M, C57F); series: total ECM protein no. and unique ECM protein no. (B) Vertical axis: no. of proteins; horizontal axis: mouse groups (FVBM, FVBF, C57M, C57F); series: collagens (structural), glycoproteins (structural), proteoglycans (structural) and ECM-associated.]

Figure 10 Total numbers of glomerular ECM proteins that are unique to each group and classes of glomerular ECM protein that are unique to each mouse group: (A) Bar chart showing total numbers of glomerular ECM protein counts and numbers of these proteins that are unique to each group. (B) Bar chart showing the classes of glomerular ECM protein that are unique to each mouse group as well as showing the number of proteins that belong to each class.

FVB males, C57 males and C57 females show very similar total ECM protein numbers (79, 80 and 79 respectively). The FVB female group differs from the other three groups in having the largest total glomerular matrisome, consisting of 110 proteins. The FVBM, C57F and C57M groups are also very similar in that three proteins are unique to each of them, while the FVBF group is once again exceptional in that it expresses 18 unique ECM proteins.

There are around 30 more proteins in the ECM of FVB female mice than in the other groups in terms of total protein count (Figure 10A). In FVB males and C57 mice of both sexes, 3.8% of the proteins detected in each group are unique to that group. In contrast, 16.4% (a proportion 4.3 times as high) of the glomerular ECM proteins in the FVB female mice are not expressed by other groups.
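These percentages follow directly from the totals and unique counts given above for Figure 10A, as the short Python check below illustrates. The variable names are illustrative only.

```python
# Total and unique ECM protein numbers per mouse group (Figure 10A).
total_proteins = {"FVBM": 79, "FVBF": 110, "C57M": 80, "C57F": 79}
unique_proteins = {"FVBM": 3, "FVBF": 18, "C57M": 3, "C57F": 3}

# Percentage of each group's ECM proteins that are unique to that group.
percent_unique = {group: 100.0 * unique_proteins[group] / total_proteins[group]
                  for group in total_proteins}

# How much larger the unique fraction is in FVB females than in FVB males.
ratio_fvbf_to_fvbm = percent_unique["FVBF"] / percent_unique["FVBM"]

for group, value in percent_unique.items():
    print(group, round(value, 1))
```

The FVB female value comes to 16.4% and the FVB male value to 3.8%, a ratio of about 4.3, in agreement with the figures quoted in the text.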

Differences between the mouse groups can also be observed in terms of the classes of unique proteins (Figure 10B). The unique-to-group proteins found in both sexes of FVB mice include those from both the ECM-associated and structural classes. However, the only class unique to males and females of the C57 strain is ECM-associated proteins, i.e. neither C57 males nor females have any structural proteins that are not found in any other group. While FVB females have considerably more unique proteins in their glomerular ECMs than males of the same strain both FVB groups seem to have roughly half as many glycoproteins as those from the larger ECM-associated class.

The FVB female group is the only one to have any collagens that are unique to it. No proteoglycans were restricted to a single mouse group.

The largest class of unique-to-group proteins are ECM-associated proteins. This class includes proteins responsible for the regulation of ECM formation and organisation.

While males and females of the C57 strain uniquely have only ECM-associated proteins, the unique proteins in male and female FVB mice include both ECM-associated and structural proteins. The largest class of unique-to-group proteins in both male and female FVB mice is ECM-associated, while glycoproteins are the most abundant structural subgroup.

The proteins that make up the common matrisome were also categorised and plotted (Figure 11).

[Figure 11: bar chart. Vertical axis: No. of proteins (0 to 30). Horizontal axis: Collagens (structural), Glycoproteins (structural), Proteoglycans (structural), ECM Associated.]

Figure 11 Relative sizes of the four protein classes in the common matrisome: Bar chart summarizing the relative sizes of the four classes into which the 59 proteins of the common matrisome fall: each column indicates the number of proteins that fall into that class.


The majority of the proteins that make up the common matrisome (38 out of 59) are structural. The largest class within the core matrisome, containing 24 proteins, is the glycoproteins (structural proteins that include the laminins). The second largest class, containing 21 proteins, is the ECM-associated proteins, which include both regulatory proteins and other proteins such as immunoglobulins. Collagens, with 13 proteins, and especially proteoglycans, with just one representative, are the least frequent types of protein in the group of 59 proteins common to all mouse groups.

[Figure 12: stacked bar chart. Vertical axis: % total protein count (0 to 100). Horizontal axis: Mouse groups (FVBM, FVBF, C57M, C57F). Data labels in the order FVBM, FVBF, C57M, C57F: ECM-Associated 38.5, 41.8, 44.5, 42.8; Proteoglycans (structural) 3.8, 3.8, 4.5, 4.8; ECM Glycoproteins (structural) 33.3, 36.7, 33.6, 34.3; Collagens (structural) 24.4, 17.7, 17.3, 18.1.]

Figure 12 Percentages that each of the protein classes contribute to the glomerular matrisomes: Bar chart showing the percentages that each of the protein classes contribute to the glomerular matrisomes of each of the mouse groups studied as part of this investigation (male and female mice from the FVB and C57 strains).


While the total numbers of proteins that make up the glomerular matrisomes differ between the mouse groups studied (Figure 10A), the percentage sizes of the categories present in each group show little variation, differing by no more than 6% (Figure 12). Around 42% are ECM-associated, 4% are proteoglycans, 34% are glycoproteins and 18% are collagens.

This finding would suggest that, irrespective of the complexity (i.e. numbers of different proteins present) of the glomerular ECM of a given mouse group, ECMs may require consistent ratios of regulatory, affiliated and secreted proteins to structural proteins in order to function effectively.

The analysis of the MS data described above involved an essentially qualitative examination of the patterns of presence and absence of the various proteins detected in the four mouse strain/sex groups.

The Venn diagram and the other, associated, figures above show that there are qualitative differences between the patterns of proteins expressed by male and female mice from the C57 and FVB strains.

MS spectral count values relating to different proteins cannot be accurately compared to each other, preventing quantitative analysis of different proteins. However, the relative amounts of the same protein present in different mouse groups can be compared with each other. The data revealed that there is considerable variability between mouse groups in terms of the relative amounts of the protein they contained.

In addition, the patterns of relative protein abundance between the mouse groups also appear to vary greatly between genes. The data for four genes, Nid2, Vtn, Alb and Hspg2 (presented in graph form), are included to demonstrate some of this variability (Figure 13).

[Figure 13: four bar charts, panels A to D. Vertical axis: NSC x 1000. Horizontal axis: Mouse strain/sex group (FVBM, FVBF, C57M, C57F).]

Figure 13 Mean Normalised Spectral count values: Bar charts showing Mean Normalised Spectral count values (NSC) from MS data for proteins coded for by the genes Nid2 (A), Vtn (B), Alb (C) and Hspg2 (D). These proteins are shown here to illustrate the existence of quantitative differences in relative levels of proteins between male and female mice from the C57 and FVB strains.


There are quantitative differences in the relative abundance of proteins between the mouse groups studied as part of this investigation (Figure 13). Nid2 is more abundant in females than males of both strains (Figure 13A); additionally, while FVBF has more of the protein than C57F, C57M has more than FVBM. There are greater amounts of Vtn in mice from the FVB strain than from the C57 strain, while females of both strains express slightly more of the protein than males (Figure 13B). Alb shows a similar pattern of abundance to that of Vtn in that there is more of its protein in the FVB strain, but differs from it in that FVBF shows higher levels of the protein than FVBM while C57M has more than C57F mice (Figure 13C). Hspg2 shows no such patterns of expression relating to sex or strain; the protein is most abundant in C57F mice, followed by FVBM, FVBF and then is least abundant in the C57F group (Figure 13D).

In order to see whether there were any consistent quantitative differences relating to sex or strain that affected the patterns of protein associated with each mouse group, a ranking comparison was carried out. The 59 proteins chosen for this analysis were the ones present in the four mouse groups. The MS values for each group were ranked on a protein by protein basis, from one to four, with one being the lowest value and four the highest. The mean rank was then calculated for each group across all proteins included in the comparison (Table 2).

Mean Ranks for Samples

Sample | A (FVB males) | B (FVB females) | C (C57 males) | D (C57 females)
Mean rank | 2.4 | 2.8 | 2.4 | 2.5

Table 2: Summary of ranking comparison demonstrating a quantitative analysis of the relative protein content in the common matrisome of the four sex/strain mouse groups
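The ranking comparison can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used: the function names are hypothetical, the example values are invented purely for demonstration, and tied values are given the average of the ranks they span (the text does not say how ties were handled).

```python
def rank_values(values):
    """Rank one protein's values across groups: 1 = lowest, len(values) = highest.

    Tied values share the average of the ranks they span (an assumption).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of the 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def mean_ranks(table):
    """table maps protein name -> list of values in a fixed group order."""
    per_protein = [rank_values(v) for v in table.values()]
    n_groups = len(per_protein[0])
    return [sum(r[g] for r in per_protein) / len(per_protein) for g in range(n_groups)]


# Invented values for illustration only (group order: FVBM, FVBF, C57M, C57F).
example = {
    "protein_1": [0.10, 0.40, 0.20, 0.30],
    "protein_2": [0.50, 0.20, 0.30, 0.10],
    "protein_3": [0.20, 0.30, 0.10, 0.40],
}
result = mean_ranks(example)
print(result)
```

Because each protein contributes the ranks 1 to 4 exactly once, the four mean ranks always average 2.5, which is why 2.5 serves as the reference value when judging whether any one group is consistently higher or lower.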


A summary of the analysis (Table 2) reveals that the mean ranks range in value from 2.4 to 2.8. The ranks run from 1, the lowest, to 4, the highest amount of protein detected in each set of four values. All of these values are close to the expected value of 2.5 that would result if equal numbers of each rank occurred in the dataset for each mouse group. It appears that amongst the four mouse groups there is very little bias in terms of the relative amounts of protein produced. However, the mean ranks show that there are slightly higher relative amounts of protein associated with the core matrisomes of female mice from the C57 and FVB strains (the FVB female mice producing greater amounts of protein than their counterparts from the C57 strain).

Therefore, for those proteins that occur in all four groups, it appears, from this very simple analysis, that no one sex, strain or individual group consistently produces larger or smaller quantities of the protein than the other groups, because there is considerable variation in the rank order of amounts of protein across the four groups.


Discussion

As shown by the Venn diagram (Figure 8) there are qualitative differences between the glomerular ECM proteomes of male and female mice from the C57 and FVB strains.

The proteins that are expressed in all four groups, are restricted to a single group, or are absent from a single group are the least interesting in terms of understanding the processes involved in causing albuminuria. It is the proteins that are shared by one sex of both strains or both sexes of a single strain that are most likely to include the strongest candidates for involvement in the mechanisms of albuminuria.

The fact that there are no proteins that are shared only by the male sex or only by the C57 strain, but there are proteins that are restricted to the female sex and to the FVB strain, is significant because, according to Long et al.'s (2013) findings regarding sex- and strain-based patterns of albuminuria, they represent proteins that are expressed in either the less albuminuric females or the more albuminuric FVB strain. Following a review of the literature associated with these genes, a list of five proteins that may be involved in influencing rates of albuminuria was collated. The genes of interest in the female sex datasets are Umod (uromodulin) and Fras1 (Extracellular Matrix Protein FRAS1); the FVB strain datasets were associated with Fgf2 (fibroblast growth factor 2), Serpina3k (Serpin Peptidase Inhibitor, Clade A (Alpha-1 Antiproteinase, Antitrypsin), Member 3) and Agrn (Agrin) (Figure 9). These genes of interest are discussed below.

FVB males, C57 females, C57 males and FVB females express glomerular ECM proteomes of 82, 82, 83 and 110 proteins respectively. These proteins fall into a series of groupings, including a common matrisome of 59 proteins expressed by all four groups. The majority of the core matrisome are structural proteins (Figure 11), which account for 65% of the group. These account for 46% of all structural proteins that were detected in one or more of the mouse groups investigated via MS analysis.

FVB female mice express greater total and unique numbers (4.3 times as many in this second case) of glomerular ECM proteins than any other group. Additionally, both FVB groups express unique-to-group structural as well as ECM-associated proteins. From these observations it could be concluded that glomerular ECM from male and female FVB mice is compositionally more complex than that from mice of the C57 strain. The presence of so many more uniquely expressed protein components, which belong to both the structural and ECM-associated protein categories, in the glomerular ECM of FVB females may explain the resistance of this mouse group to testosterone-induced glomerular ECM functional decline, as shown by Long et al. (2013).

The relative percentage sizes of the protein classes in the glomerular ECMs of the four mouse groups investigated remain fairly consistent, irrespective of the numbers of proteins that make up the specific matrisomes. This finding could indicate relationships between the numbers of regulators 'required' to synthesise and maintain glomerular ECM tissue.

At the level of the individual protein, there is quantitative variation in the amounts of protein in each mouse group. Proteins are not detected in identical quantities in all four mouse groups. However, while there are clear differences in the amounts of proteins found in one or more mouse groups at the level of the individual proteins, there is no simple pattern to this variation: no one sex or strain consistently produces more or less protein than the other groups. A more sophisticated technique, such as principal component analysis, could be used to analyse the patterns of protein distribution amongst the mouse groups. In summary, there are clearly qualitative and quantitative differences between the glomerular matrisomes of male and female mice from the C57 and FVB strains, based on the components that form these ECMs. Hypothesis one can therefore be accepted. It can be concluded that there are differences between the ECM proteomes of mice of different sexes and strains.

Results Part 4: Comparison of Mass Spectrometric and RNA Array Data Sets

The fourth aim of the study is to compare the MS and RNA array datasets for the glomerular ECM proteomes of male and female mice of the C57 and FVB strains, in order to determine how the patterns of corresponding mRNA and protein products (which I will refer to collectively as gene products) relating to specific genes compare across the different mouse groups studied.

Hypothesis 2 states that: there are similarities between the patterns of relative abundance of mRNAs and their encoded proteins associated with the glomerular ECM.

This implies that the pattern of relative amounts of RNA recorded in the array data in the four strain/sex mouse groups will closely match the pattern of relative amounts of proteins recorded in the MS data.

The relationship between amounts of corresponding gene products has not been previously investigated in murine glomerular tissue using a comparison of two global investigatory approaches. Therefore, as part of this investigation, matched data sets from RNA array and MS, representing the transcriptome and proteome respectively, were compared to determine whether a correlation exists between the relative amounts of corresponding gene products in male and female mice from the C57 and FVB strains. The OpenOffice software package was used to produce a list of genes with corresponding RNA and protein products. UniProt accession numbers (http://www.uniprot.org/), rather than gene names, were used as search terms for this to account for splice variants. All 37 genes for which matching RNA array and MS data existed were included in the comparison (Appendix 2).
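The matching step amounts to keeping only those accession numbers that occur in both data sets. The original matching was done in a spreadsheet; the Python sketch below is only an illustration of the same idea, and the function name, the row layout and the placeholder accession strings are all assumptions (the placeholders are not real UniProt accessions and the values are invented).

```python
def match_by_accession(rna_rows, ms_rows):
    """Pair RNA array and MS values that share an accession number.

    Each input is a list of (accession, values) tuples, where values holds one
    number per mouse group in the order FVBM, FVBF, C57M, C57F.
    """
    rna = {acc: values for acc, values in rna_rows}
    ms = {acc: values for acc, values in ms_rows}
    shared = sorted(set(rna) & set(ms))  # accessions present in both data sets
    return {acc: (rna[acc], ms[acc]) for acc in shared}


# Placeholder accessions and invented values, for illustration only.
rna_rows = [
    ("ACC_A", [120.0, 150.0, 130.0, 160.0]),
    ("ACC_B", [22.0, 23.0, 26.0, 25.0]),
    ("ACC_C", [11.0, 11.2, 11.4, 11.1]),
]
ms_rows = [
    ("ACC_B", [1.4e-4, 1.5e-4, 0.9e-4, 1.0e-4]),
    ("ACC_C", [1.2e-4, 1.0e-4, 0.0, 0.0]),
    ("ACC_D", [2.0e-5, 2.5e-5, 1.5e-5, 1.0e-5]),
]

matched = match_by_accession(rna_rows, ms_rows)
print(sorted(matched))
```

Only entries found in both lists survive the intersection, so a transcript and a protein from the same gene that carry different accession numbers would not be paired by this approach.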

[Figure 14: seven rows of paired bar charts, one row per gene (Col4a6, Tinag, Vtn, Col3a1, Umod, Fras1, Serpina3k). Left panel of each row: RNA array data, vertical axis RFU. Right panel of each row: MS data, vertical axis NSC. Horizontal axis of every panel: Mouse strain/sex group (FVBM, FVBF, C57M, C57F).]

Figure 14 Correlations between RNA array and MS data sets: Data relating to the RNA and protein products of seven sample genes used to illustrate the degree of match between RNA array data and MS data across the four strain/sex mouse groups. Each row represents a different protein: Row 1 - Col4a6, Row 2 - Tinag, Row 3 - Vtn, Row 4 - Col3a1, Row 5 - Umod, Row 6 - Fras1 and Row 7 - Serpina3k. Column 1 - bar charts summarizing RNA array data with relative fluorescence units (RFU) scores for each strain/sex mouse group; Column 2 - bar charts summarizing MS data with mean NSC scores for each strain/sex mouse group.


There is a marked lack of consistency in the relationships between the gene products (Figure 14). A visual comparison of a sample of the results for seven pairs of gene products illustrates the lack of a consistent relationship between RNA array values and MS values; the bar charts for both RNA array and MS data reveal that considerable variation exists in the relative amounts of both RNA and protein, both between the strain/sex mouse groups and between genes. The array data for Col4a6 show greater amounts of RNA in the females than in the males and more in the C57 strain than in the FVB strain. In contrast, the MS data show that the females have less of the protein than the males, and that FVB males have much smaller amounts of protein than the other three groups, which are fairly similar.

The array data for Tinag show that the males express more RNA than the females and that the FVB strain expresses more RNA than the C57 strain. In contrast, the MS data show that the order for the FVB groups is reversed, with FVB females having the most protein of all groups and the FVB males having the least. In addition, the C57 females, which had the overall second highest amount of RNA, had the lowest amount of protein of the four groups.

Vtn shows differences between the relative abundances of its gene products. RNA for the gene is much less abundant in the FVB than the C57. However, the inverse pattern is present in relation to the protein product.

A different situation is apparent in Col3a1: RNA appears to be present in all four mouse groups, with FVB females having by far the greatest amounts, while the MS reveals that the protein is apparently absent from FVB males and it is the C57 males that have the greatest amount.

Uromodulin and Extracellular Matrix Protein FRAS1 (coded for by Umod and Fras1 respectively) are specific to the female sex and Serpin Peptidase Inhibitor, Clade A (Alpha-1 Antiproteinase, Antitrypsin), Member 3 (coded for by Serpina3k) is specific to the FVB strain.

Data relating to the gene products of Umod, Fras1 and Serpina3k were included because in all three cases they illustrate even greater levels of disparity than those seen in the other four comparisons. RNA for all three genes is present in all four of the mouse groups, whereas the proteins coded for by Umod and Fras1 are only found in female mice and the protein coded for by Serpina3k is restricted to the FVB strain. (Note: the other two 'genes of interest' identified in the analysis of the MS data presented above did not have direct matches in the RNA array data set and so could not be shown in this comparison.)

These seven examples illustrate a range of different types of relationship. In order to give a general impression of the variability in the types of relationship that occurred within the full dataset of 37 matched gene product values, each gene was allocated to one of four categories based on the type of relationship between the four pairs of RNA and MS values (Table 3). The first category included all those genes for which RNA was present in all four mouse groups but the corresponding proteins were absent from at least one mouse group. The remaining genes were allocated to one of three classes on the basis of their Coefficient of Determination (R²) values. The bar charts can be compared by eye, but any judgement of the closeness of the match between the RNA array and MS values made in this way is subjective. Therefore, the use of a correlation coefficient to determine a threshold for allocating the genes to classes was considered to be more objective. The use of a correlation coefficient should not be taken to imply that a simple linear relationship is the definitive measure of the closeness of the relationship between the RNA array and MS data, but it is one straightforward way of evaluating the relationships. Where there are four pairs of values, the Pearson Product Moment Correlation Coefficient (R) is statistically significant (P < 0.05) when R > 0.8114. Therefore, an R² value of 0.64 was taken as the threshold for allocating genes to a category. The relationships were categorised as either directly proportional due to a significant positive correlation (R positive, R² > 0.64), inversely proportional due to a significant negative correlation (R negative, R² > 0.64), or showing no proportional relationship due to a non-significant correlation coefficient (R² < 0.64).
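The allocation rule described above can be sketched as follows. This is a minimal illustration, not the original workflow: the function names are hypothetical, the R² threshold of 0.64 is simply the value stated in the text, and treating an MS value of zero as "protein absent" is an assumption about how absence is encoded.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation coefficient for paired values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return sxy / math.sqrt(sxx * syy)


def classify(rna, ms, r2_threshold=0.64):
    """Allocate one gene to a category from its four RNA and four MS values."""
    if any(value == 0 for value in ms):  # assumption: zero means not detected
        return "protein absent in at least one group"
    r = pearson_r(rna, ms)
    if r * r > r2_threshold:
        return "significant positive" if r > 0 else "significant negative"
    return "non-significant"


# Invented values for illustration only (group order: FVBM, FVBF, C57M, C57F).
print(classify([1, 2, 3, 4], [2, 4, 6, 8]))
print(classify([1, 2, 3, 4], [8, 6, 4, 2]))
print(classify([1, 2, 3, 4], [1, 3, 2, 1]))
print(classify([1, 2, 3, 4], [0, 1, 2, 3]))
```

Checking for absent proteins before computing the coefficient mirrors the order of the rule in the text, where the first category is assigned before any correlation is considered.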

Type of relationship between RNA and MS values | Number of genes | Names of genes in each category
Protein absent in ≥1 mouse group | 21 | Apoa1, Apoe, Arg1, Bgn, Col3a1, Col5a2, Col6a5, C9, Cpn2, Ecm1, Fras1, Glipr2, Hspg2, Itih2, Itih4, Lad1, Mbl2, Pm20d1, Sfn, Thsd4, Umod
Significant positive correlation | 1 | Col1a1
Significant negative correlation | 3 | Alpl, Col15a1, Vtn
Non-significant relationship | 12 | Alb, Col4a6, Emilin1, Hspg2, Itih1, Lama5, Mmrn2, Nid2, Pxdn, Serpinh1, Serpina3k, Tinag

Table 3: Summary of frequencies of different types of gene product relationships: numbers of RNA/protein relationships within the set of 37 pairs of matched gene products that fall within each of four categories: 1) RNA present in all groups but protein absent from at least one group, 2) directly proportional relationship due to significant positive correlation, 3) inversely proportional relationship due to significant negative correlation, 4) no proportional relationship due to non-significant correlation coefficient.

Visual analysis of the relative abundances of RNA and protein relating to the 37 genes with matched products identified in this investigation shows that the majority of the gene products (21 of the 37) are obviously unmatched because, while their RNA is expressed in all four of the mouse groups, their protein products are not. The second largest grouping comprises twelve RNA/protein pairings with non-significant correlation coefficients, which indicate an absence of any simple linear relationship between the RNA array and MS data. Three of the paired gene products show statistically significant negative correlations, while only one gene (Col1a1) shows a statistically significant (R² > 0.64) positive relationship between its corresponding RNA and protein data sets. This low frequency of significant correlations is roughly consistent with what might be expected to occur by chance; at the 5% level of significance a spurious significant correlation would be expected to occur with a frequency of one in twenty tests. Therefore, there is negligible evidence for the occurrence of any kind of simple linear relationship between RNA array values and the amounts of protein detected by MS.


Discussion

The reduction from 120 to 37 in the list of gene products encountered as a result of cross-referencing the lists of ECM proteins derived from MS data and transcripts obtained via RNA array requires explanation. This apparent lack of proteins is most likely to be a consequence of the selection methods. There are a number of possible contributory sources of error. There could be loss of material during the practical implementation of the methods. Another possibility is that, while corresponding peptides and transcripts may have been detected by the two methods, the procedures and software used to analyse the raw data may have attributed different identification codes to products of the same gene, which has the effect of artificially inflating the total number of genes. Additionally, the initial analysis of the RNA array data set may have resulted in the allotting of older (and therefore redundant) UniProt accession numbers to that data set. This is a known weakness of the UniProt identification system, which, in many cases, features multiple identification numbers for a single protein.

With the exception of the statistically significant positive correlation shown by Col1a1 and the statistically significant negative correlations shown by Alpl, Col15a1 and Vtn, the matched RNA and protein products of the 37 glomerular ECM protein genes identified show neither consistent nor statistically significant correlations, as is illustrated by Col3a1, Umod, Fras1 and Serpina3k. Consequently, there is negligible evidence for the occurrence of any kind of simple linear relationship between RNA array values and the amounts of protein detected by MS.

The key observation that can be drawn from this analysis is probably that, while the RNAs for all the genes included in this part of the investigation are present in all four of the mouse groups studied, the corresponding proteins are frequently undetectable in one or more of the mouse groups. These observations not only demonstrate that the relationship between relative RNA and protein levels is inconsistent between different genes, but also indicate that the patterns of expression observed in the transcriptome cannot be reliably used to determine the abundance of corresponding proteins.

Analysis of the overall relationship between the gene product data sets shows that there is no consistent relationship between the relative amounts of RNA and protein that form the glomerular ECMs of the mouse groups studied in this investigation.

Therefore, Hypothesis Two can be rejected, and it is concluded that in general there is very little similarity between the patterns of relative abundance of mRNAs and their encoded proteins associated with the glomerular ECM. There appears to be considerable variation in the ratios of RNA array values and MS values both between genes and between mouse groups.

Overall Discussion

In the first stage of this investigation the two glomerular isolation methods, based on either sieving or magnetic Dynabead perfusion, were compared in terms of the number of whole glomeruli successfully isolated and the percentage purity of the isolates produced. In both cases the Dynabead-based method proved superior, producing isolates that, on average, contained nearly five times as many glomeruli/ml and showed 13.9% greater purity. Despite the increased technical difficulties in carrying out this isolation method, the superior yield and purity of the isolate make this method preferable, provided that the resources and experienced technical assistance are available.

Following identification of a method that enabled the effective and consistent isolation of whole murine glomeruli to be achieved, a method by which glomerular ECM could be enriched was investigated. The buffer-based enrichment method was successfully used to enrich and purify murine glomerular ECM tissue.

Analysis of MS data derived from enriched glomerular ECMs shows that, for the purposes of this investigation, there are a total of 120 proteins that make up the total glomerular ECM in male and female mice from the C57 and FVB strains. Each of the mouse groups studied had a number of unique proteins (mainly belonging to the ECM-associated category) while sharing a (predominantly structural) common matrisome.

The MS data obtained as part of this investigation show that these shared proteins are present in different mouse groups at different relative abundances.

The consistent presence of 59 glomerular ECM proteins in the common matrisome in all four mouse groups used in this investigation shows that there are qualitative consistencies between the glomerular matrisomes irrespective of strain or sex. The majority of these core proteins (38 out of 59) were structural proteins: the most abundant structural sub-category was the glycoproteins, containing 24 proteins, followed by the collagens, containing 13 proteins; the proteoglycans, containing one protein (Hspg2), were the least abundant sub-category. Further analysis of the 21 ECM-associated proteins within the common matrisome, through cross-referencing with the Matrisome database (Hynes and Naba, 2012), has shown that 9 of these 21 proteins have known functions in ECM regulation.

In contrast, the majority of proteins uniquely expressed by each of the mouse groups are from the ECM-associated class and so are involved in either regulatory or supporting roles. It is possible that the differences between albuminuria levels of the mouse groups are partly due to quantitative regulation (i.e. varying the amount of protein) of the common matrisome by these unique-to-group proteins. Lastly, it is also possible that one or more of the comparatively greater number of uniquely expressed proteins observed in FVB female mice contribute to this group's resistance to the damaging effects of testosterone on the glomerular filtration barrier. However, this assertion is only a hypothesis at present.

Genes of interest

Two groups of proteins that exist outside of the core matrisome are those that are commonly expressed by the female sex (i.e. FVB females and C57 females) and the FVB strain (both males and females of this strain). There are no such groups that are associated with the male sex or the C57 strain. These groups of proteins are both associated with factors that are known to correlate with differing levels of albuminuria, i.e. gender and strain (Long et al., 2013). The proteins that are contained within these two protein groups are potential candidates for causing the higher levels of albuminuria observed in the FVB compared with the C57 strain and the lower rates of albuminuria associated with the female compared with the male sex.


Genes coding for proteins expressed in the female gender only

Umod

Umod is a gene of interest because its protein product, known to be involved in renal ECMs, is only expressed in the female mouse models investigated, which makes it a candidate for playing a role in stabilising the glomerular ECM, thereby explaining the lower levels of albuminuria in the female sex.

Also known as Tamm–Horsfall glycoprotein, Uromodulin (Umod) is the most abundant protein in urine under normal conditions and is known to be produced and excreted from the thick ascending limb of Henle’s loop (Bachmann et al., 2005), possibly as part of an immunological defence mechanism that protects against bacterial infection of the urological tract. Defects in Umod result in Uromodulin-associated kidney disease (UAKD), which can manifest as one of three conditions: medullary cystic disease type 2, familial juvenile hyperuricemic nephropathy and glomerulocystic kidney disease (Lorember and Vehaskari, 2013). UAKD is an autosomal-dominant, hereditary disease that can be caused by over 70 characterised mutations (Kemter et al., 2013). UAKD is typically a slowly progressive disorder, characterised by hypouricosuric hyperuricemia, gout and mild defects in urine concentrating ability that lead to renal failure.

Although Umod has been previously identified as being localised in the ascending limb of the loop of Henle, MS data obtained through this investigation have shown Umod to be part of the glomerular ECM of female mice from both the FVB and C57 strains. This finding is consistent with the recent identification of glomerulocystic kidney disease as one of the presentations of UAKD. In contrast to the general trend, Umod is more abundant in the glomerular ECM of female C57 mice than of female FVB mice. As higher glomerular levels of Umod correlate with lower levels of albuminuria, the role of Umod in the glomerular ECM deserves further investigation, as it could have roles in reducing albuminuria through stabilisation of the glomerular ECM.

Fras1

Fras1 was chosen as a gene of interest as it codes for a protein with known roles in renal development (Pitera et al., 2012). Additionally, as it is only expressed in the glomerular ECM of female mice from the C57 and FVB strains, it may play a role in reducing the levels of albuminuria in these mouse groups.

This gene encodes an extracellular matrix protein that appears to function in the regulation of epidermal-basement membrane adhesion and organogenesis during development. Mutations in this gene cause Fraser syndrome, an autosomal recessive disease, which is characterised by multisystem craniofacial, urogenital and respiratory system malformations (http://www.ncbi.nlm.nih.gov/gene/80144).

Targeted deletion of Fras1 in adult kidney podocytes has provided information on the post-developmental roles of the gene as knockout mouse models suffer from skin blistering, renal agenesis, and early death. Fras1 expression was downregulated in maturing glomeruli, which then became sclerotic. The data are consistent with the hypothesis that locally produced Fras1 has roles in glomerular maturation and integrity (Pitera et al., 2012).

Fras1 signalling is therefore essential for maturation of the kidneys and other organs and is associated with maintaining glomerular ECM integrity in adult mice; the loss of this gene results in glomerulosclerosis. Fras1 is associated with females and not males in the mouse models studied. An explanation for the higher levels of urinary albumin in males could be that the lower Fras1 signalling results in a reduced glomerular integrity, leading to protein leakage.

Additionally, the higher levels of Fras1 in the glomerular ECM of FVB female mice may be part of a coping mechanism, which results in the more complex and protein-rich glomerular ECM observed in this model. Eisner et al. (2010) showed that up to half of the creatinine in urine results from leakage of this compound from the walls of the nephron tubule. If FVB mice have high tubular creatinine excretion, the expression of Fras1 (as well as the proteins unique to FVB females detected in this study) may be part of an attempt to reduce the loss of blood proteins during glomerular filtration, thereby compensating for losses from the tubular portion of the nephron. However, this hypothesis is purely conjecture and is presented here as one possible avenue of future research.

Genes coding for proteins expressed in the FVB strain only

Fgf2

Fgf2 is a member of the fibroblast growth factor (FGF) family that has been detected exclusively in the glomerular matrisomes of FVB male and female mice, indicating that its expression may be associated with increased rates of albuminuria. FGF proteins bind to heparin and demonstrate a broad range of mitogenic and angiogenic activities. Fgf2 has been linked to numerous biological processes including limb and nervous system development, wound healing, and tumour growth. The mRNA transcribed from the Fgf2 gene can be translated into five different protein isoforms due to its multiple polyadenylation sites as well as its ability to be alternatively translated from non-AUG (CUG) and AUG initiation codons. Those isoforms initiated by a CUG codon are typically nuclear proteins and are involved in intracrine effects, while those initiated by an AUG codon are predominantly cytosolic proteins that play roles in autocrine and paracrine signalling (Touriol et al., 2000). Studies in mouse models have shown that signalling involving FGFs, including Fgf2, is required for podocytes in vitro to undergo epithelial to mesenchymal transition as part of terminal differentiation (Davidson et al., 2001). However, renal failure is very rare in Fgf2 mutant mice, suggesting functional compensation from other FGFs such as Fgf7 and Fgf10, which are both expressed in adult and developing renal tissue.

Epithelial-mesenchymal transition of tubular cells is a widely recognised mechanism that sustains interstitial fibrosis in diabetic nephropathy (Masola et al., 2012). Fgf2 expression was detected in the glomerular ECM of male and female mice from the FVB strain, which present higher levels of albuminuria than mice from the C57 strain. As albuminuria can be caused by fibrotic changes of the glomerular ECM, the higher expression of Fgf2 in these mice could be an indication of a trend towards fibrotic changes in FVB mice, involving a similar but less severe mechanism to that encountered in diabetic nephropathy.

Serpina3k and Agrin

SERPINA3K is an ECM protein that acts as a serine proteinase inhibitor and also has roles as an antiangiogenic (Gao et al., 2003), anti-inflammatory (Zhang et al., 2009) and antifibrogenic (Zhang et al., 2010) factor. It is known to be expressed in renal, hepatic, pancreatic and retinal tissues. While less is known about its renal functions, the retinal antifibrogenic activity of Serpina3k is known to act through blocking of the Wnt signalling pathway. Decreased levels of Serpina3k expression are associated with fibrosis in diabetic retinopathy.

Agrin (Agrn) is a key protein component of the GBM and a member of the HSPG family. The known physiological roles of Agrn in the glomerular ECM have been discussed previously. As with Serpina3k, loss of Agrn from the GFB is associated with higher levels of albuminuria.

The pattern of expression shown by both Agrn and Serpina3k is at odds with the findings of Long et al. (2013), in that glomerular ECM expression of both proteins in 18-week-old mice is associated with the comparatively higher, rather than lower, levels of albuminuria observed in male and female mice from the FVB strain. A possible explanation for this observation is that these two proteins (and other anti-albuminuric proteins with unique or relatively higher expression in mice of the FVB strain compared to the C57 strain) show increased expression as part of a biological adaptation to high levels of tubular protein loss. The hypothetical mechanism behind this adaptation would upregulate available glomerular ECM-stabilising proteins (e.g. Agrn and Serpina3k) in an attempt to compensate for the lack of other key glomerular ECM components, such as Fras1 in the case of FVB males, or for the presence of high levels of proteins that lead to greater levels of albuminuria, such as Fgf2.

Comparison of Mass Spectrometric and RNA data

The comparison of MS and RNA array data has shown that there is no consistent relationship between the relative proportions of coding RNAs and their corresponding ECM proteins. This is of key importance, as it demonstrates that the relative amounts of RNA detected using RNA array protocols cannot be used as a reliable indicator of the relative abundance of the corresponding protein products. In other words, the pattern of gene expression amongst the four mouse strain/sex groups for any particular gene as indicated by RNA array data is a poor predictor of the pattern indicated by the protein products detected by MS. Indeed, the presence of RNA for a particular gene cannot be taken as an indication that the protein it codes for is present at all.
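A per-gene comparison of this kind can be illustrated with a short sketch. The Python below computes a Spearman rank correlation between RNA array and MS values across the four strain/sex groups for each gene. It is not the analysis performed in this investigation: the gene labels (GeneA, GeneB) and all numerical values are hypothetical, chosen only to show one concordant and one discordant pattern, and the use of Spearman's coefficient is an assumption made for illustration.

```python
from statistics import mean

# Order of the four strain/sex groups in every value list below.
GROUPS = ["FVB_male", "FVB_female", "C57_male", "C57_female"]


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (var_x * var_y) ** 0.5


# Hypothetical relative abundances (NOT data from this study).
rna = {"GeneA": [1.0, 2.0, 3.0, 4.0], "GeneB": [1.0, 2.0, 3.0, 4.0]}
protein = {"GeneA": [0.1, 0.2, 0.3, 0.4], "GeneB": [0.4, 0.3, 0.2, 0.1]}

rho = {gene: spearman(rna[gene], protein[gene]) for gene in rna}
for gene, value in rho.items():
    print(gene, round(value, 2))
```

With only four groups per gene, such a coefficient is descriptive rather than a basis for significance testing; a mixture of positive and negative values across genes would correspond to the lack of a consistent relationship described above.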

These differing patterns of corresponding RNA and protein amongst the mouse groups could be caused either by translational control mechanisms, such as RNA interference by microRNAs or the inhibition of effective chaperoning of mRNAs to translational sites (Bartel et al., 2004), or by differing rates of degradation of synthesised proteins between the sexes and strains of mice used in this investigation. However, further investigation will be required to confirm the underlying cause of this lack of correlation.

Consequently, RNA arrays should not be seen as effective tools for the analysis of protein expression in the glomerular ECM of the mouse groups used in this study. Indeed, these findings have implications for the wider application of RNA arrays as a predictive tool, at least until the mechanisms underlying the relationship between RNA and protein concentrations are better understood.

More information relating to the control mechanisms that influence glomerular ECM composition, as well as the impact of individual components on the function of the glomerular filtration barrier, will need to be obtained through targeted follow-up studies focusing on individual protein function.

It should be noted that artefacts stemming from any one of the multiple steps in the various methods used in this investigation could cause significant changes to the final data. These could include, but are not limited to: loss of protein in the initial isolation and enrichment stages; insufficient enrichment of the ECM fraction, leading to masking of certain less abundant components; and gaps in the software databases used to identify proteins from the raw MS or RNA array data. Lastly, the analysis of the final data sets made in this report should not be seen as unequivocal but as an interpretation of large data sets with the aim of directing future investigations.

Main Conclusions

Before the glomerular ECM of male and female mice from the C57 and FVB strains could be analysed by MS, two initial investigations were undertaken to allow the production of samples suitable for this form of analysis.

The first of these initial investigations was to identify a method by which murine glomeruli could be successfully isolated from surrounding tissues. The second was to confirm that the method used to produce enriched samples of human glomerular ECM was also applicable to mouse tissue. The results show that the Dynabead method is preferable to the sieving method for isolating glomeruli from mouse kidneys, as it allows purer and more glomeruli-rich samples to be produced.

Western blotting and Coomassie staining have demonstrated that the buffer-based enrichment method can be used on murine tissue to produce enriched ECM samples.

Qualitative differences in proteins have been identified between mouse groups for 120 genes confirmed as being present within the glomerular ECM. Therefore, Hypothesis One is supported and it is confirmed that there are differences between the ECM proteomes of mice of different sexes and strains. There is also considerable variability in the quantities of protein present in the four mouse groups at the level of an individual gene. The results of the qualitative analysis have been used to identify a number of genes that are the strongest candidates for underlying the mechanisms of albuminuria, and it is recommended that these genes, Umod, Fras1, Fgf2, Agrn and Serpina3k, should be the subject of further investigation.
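The qualitative comparison between groups can be expressed as simple set operations. The following Python sketch encodes only the detection pattern of the five candidate genes as described in the Discussion; it is not the full 120-gene data set, and the group labels and the function name exclusive_to are illustrative assumptions rather than part of the analysis pipeline used here.

```python
# Detection pattern of the five candidate genes, as summarised in the
# Discussion. Simplified illustration only, not the full data set.
detected = {
    "FVB_male": {"Fgf2", "Agrn", "Serpina3k"},
    "FVB_female": {"Fgf2", "Agrn", "Serpina3k", "Umod", "Fras1"},
    "C57_male": set(),
    "C57_female": {"Umod", "Fras1"},
}


def exclusive_to(groups, detected):
    """Genes detected in every listed group and in none of the others."""
    inside = set.intersection(*(detected[g] for g in groups))
    others = [detected[g] for g in detected if g not in groups]
    outside = set.union(*others) if others else set()
    return inside - outside


female_only = exclusive_to(["FVB_female", "C57_female"], detected)
fvb_only = exclusive_to(["FVB_male", "FVB_female"], detected)

print(sorted(female_only))  # ['Fras1', 'Umod']
print(sorted(fvb_only))     # ['Agrn', 'Fgf2', 'Serpina3k']
```

The same function could be applied to any other combination of groups, for example a single strain/sex group, to list the proteins unique to it.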

There is no consistent similarity, in terms of a simple linear proportionality, between the patterns of relative abundance of RNA and protein products of murine genes associated with the glomerular ECM when four strain/sex mouse groups are compared to each other. Furthermore, there is also considerable variation in the nature of the relationship between the RNA array and protein values across mouse groups at the level of the individual gene, and there appears to be no consistent relationship between the two variables based on strain or sex. Therefore, Hypothesis Two can be rejected, and it is concluded that there are no consistent similarities between the patterns of relative abundance of mRNAs and their encoded proteins associated with the glomerular ECM. This finding has implications for the use of RNA array data for predicting protein content or activity in individual subjects.

Future Directions of Investigation

This investigation has identified a number of proteins that could be responsible for influencing the rates of albuminuria observed in male and female FVB and C57 mice. The roles these proteins play within the glomerular ECM could be investigated through several means, such as knock-down or knockout in in vitro cell cultures or in vivo mouse models. The in vivo approach would be doubly appropriate, as it would also capture the effect of sex on the role of the protein.

Another line of investigation relevant to the findings reported here would be to investigate the differing rates of degradation of glomerular ECM proteins in the four mouse groups. This approach would provide greater understanding of the pre- and/or post-translational mechanisms governing the lack of correlation between RNA and MS data.

In addition, more complex statistical analyses, such as principal component analysis, could be performed to reveal underlying patterns and trends in the data. It is possible that manual re-analysis of the RNA array data set could result in the identification of matching RNA data for all 120 ECM proteins identified. The use of complex statistical approaches such as hierarchical clustering could identify relationships between glomerular ECM components that less sophisticated methods would not.
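As an indication of what such an analysis might look like, the sketch below performs a small average-linkage agglomerative (hierarchical) clustering of abundance profiles across the four mouse groups using only the Python standard library. The protein labels, the abundance values, the choice of Euclidean distance and the function name agglomerate are all hypothetical; in practice an established statistics package would be the more likely choice.

```python
from itertools import combinations
from math import dist


def agglomerate(profiles, n_clusters):
    """Average-linkage agglomerative clustering of named profiles.

    profiles maps a name to a list of values (one per mouse group).
    Clusters are merged pairwise until n_clusters remain.
    """
    clusters = [[name] for name in profiles]

    def linkage(a, b):
        # Mean Euclidean distance over all cross-cluster pairs.
        pairs = [(p, q) for p in a for q in b]
        return sum(dist(profiles[p], profiles[q]) for p, q in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return [sorted(c) for c in clusters]


# Hypothetical profiles: values for FVB male, FVB female, C57 male,
# C57 female (NOT data from this study).
profiles = {
    "protein_A": [1.0, 1.1, 0.1, 0.2],  # higher in FVB
    "protein_B": [0.9, 1.0, 0.2, 0.1],  # higher in FVB
    "protein_C": [0.1, 1.0, 0.1, 1.1],  # higher in females
    "protein_D": [0.2, 0.9, 0.0, 1.0],  # higher in females
}

result = sorted(agglomerate(profiles, 2))
print(result)
```

In this toy example the strain-associated and sex-associated profiles fall into separate clusters; whether comparable structure exists in the real data set is exactly what such an analysis would need to establish.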

The successful application of MS analytical techniques in this and Lennon et al.'s investigations (in press) indicates that this method is suitable for compositional analysis of glomerular ECMs. MS analysis could therefore be expected to yield reliable results in similar studies, examples of which could be in the determination of the compositional changes that occur as part of human pathologies such as Alport syndrome or diabetic nephropathy.


References

Abboud H. E., Poptic E. and DiCorleto P. (1987) Production of platelet-derived growth factor like protein by rat mesangial cells in culture, Journal of Clinical Investigation, 80: 675–683.

About Cancer (2012), Kidney (Renal Cell) Carcinoma: [Online], Available at http://www.aboutcancer.com/kidney_basic.htm (Accessed 17 July 2013).

Abrahamson D. R. (1985) Origin of the glomerular basement membrane visualized after in vivo labeling of laminin in newborn rat kidneys, Journal of Cell Biology, 100: 1988-2000.

Aebersold R. and Mann M. (2003) Mass spectrometry-based proteomics, Nature, 422: 198-207.

Andersson C. O. (1958) Mass Spectrometric Studies on Amino Acid and Peptide Derivatives, Acta Chemica Scandinavica, 12: 6.

Aron D. C., Rosenzweig J. L. and Abboud H. E. (1989) Synthesis and binding of insulin-like growth factor I by human glomerular mesangial cells, Journal of Clinical Endocrinology and Metabolism, 68: 585–591.

Attwood T. K., Gisel A., Eriksson N-E. and Bongcam-Rudloff E. (2011) ‘Concepts, Historical Milestones and the Central Place of Bioinformatics in Modern Biology: A European Perspective’ in Mahdavi M. A. (Ed.) Bioinformatics - Trends and Methodologies, InTech, [Online], Available at http://www.intechopen.com/books/bioinformatics-trends-and-methodologies/concepts-historical-milestones-and-the-central-place-of-bioinformatics-in-modern-biology-a-european- (Accessed 3 August 2013).

Ausiello D. A., Kreisberg J. I., Roy C. and Karnovsky M. J. (1980) Contraction of cultured rat glomerular cells of apparent mesangial origin after stimulation with angiotensin II and arginine vasopressin, Journal of Clinical Investigation, 65: 754-60.

Bachmann S., Mutig K., Bates J., Welker P., Geist B., Gross V., Luft F. C., Alenina N., Bader M., Thiele B. J., Prasadan K., Raffi H. S. and Kumar S. (2005) Renal effects of Tamm-Horsfall protein (uromodulin) deficiency in mice, American Journal of Physiology - Renal Physiology, 288: 559–567.

Bader B. L., Smyth N., Nedbal S., Miosge N., Baranowsky A., Mokkapati S., Murshed M. and Nischt R. (2005) Compound Genetic Ablation of Nidogen 1 and 2 Causes Basement Membrane Defects and Perinatal Lethality in Mice, Molecular and Cellular Biology, 25: 6846-6856.

Bartel D. P., Lee R. and Feinbaum R. (2004) MicroRNAs: Genomics, Biogenesis, Mechanism, and Function, Cell, 116: 281–297.

Boots WebMD: [Online], Available at http://www.webmd.boots.com/urinary-incontinence/guide/kidneys-picture; last updated 13/1/2012 (Accessed 20th February 2013).

Boute N., Gribouval O., Roselli S., Benessy F., Lee H., Fuchshuber A., Dahan K., Gubler M. C., Niaudet P. and Antignac C. (2000) NPHS2, encoding the glomerular protein podocin, is mutated in autosomal recessive steroid-resistant nephrotic syndrome, Nature Genetics, 24: 349–54.

Bulger R. E., Eknoyan G., Purcell D. J. and Dobyan D. C. (1983) Endothelial characteristics of glomerular capillaries in normal, mercuric chloride-induced, and gentamicin-induced acute renal failure in the rat, Journal of Clinical Investigation, 7: 128-41.

Carter D. C., He X. M., Munson S. H., Twigg P. D., Gernert K. M., Broom M. B. and Miller T. Y. (1989) Three-dimensional structure of human serum albumin, Science, 244: 1195-8.

Chen Y. M., Kikkawa Y. and Miner J. H. (2011) A missense LAMB2 mutation causes congenital nephrotic syndrome by impairing laminin secretion, Journal of the American Society of Nephrology, 22: 849-58.

Davidson G., Dono R. and Zeller R. (2001) FGF signalling is required for differentiation-induced cytoskeletal reorganisation and formation of actin-based processes by podocytes, Journal of Cell Science, 114: 3359–66.

Deen W. M., Lazzara M. J. and Myers B. D. (2001) Structural determinants of glomerular permeability, American Journal of Physiology. Renal Physiology, 281: F579-96.

Dytham C. (2011) Choosing and Using Statistics, A Biologist’s Guide, 3rd edn, Chichester, Wiley-Blackwell.

Economou C. G., Kitsiou P. V., Tzinia A. K., Panagopoulou E., Marinos E., Kershaw D. B., Kerjaschki D. and Tsilibary E. C. (2004) Enhanced podocalyxin expression alters the structure of podocyte basal surface, Journal of Cell Science, 117: 3281–94.

EMBL-EBI Bioinformatic Services (2013): [Online], Available at http://www.ebi.ac.uk/services (Accessed 14th April 2013).

Eisner C., Faulhaber-Walter R., Wang Y., Leelahavanichkul A., Yuen P. S. T., Mizel D., Star R. A., Briggs J. P., Levine M. and Schnermann J. (2010) Major contribution of tubular secretion to creatinine clearance in mice, Kidney International, 77: 519–26.

Feiglin A., Hacohen A., Sarusi A., Fisher J., Unger R. and Ofran Y. (2012) Static network structure can be used to model the phenotypic effects of perturbations in regulatory networks. Bioinformatics (Oxford, England), 28: 2811–8.

Fujita T., Tokunaga J. and Edanaga M. (1976) Scanning electron microscopy of the glomerular filtration membrane in the rat kidney, Cell and Tissue Research, 166: 299-314.

Gao G., Shao C., Zhang S. X., Dudley A., Fant J. and Ma J. X. (2003) Kallikrein-binding protein inhibits retinal neovascularization and decreases vascular leakage, Diabetologia, 46: 689-98.

Gene Cards (2013) Fgf2: [Online], Available at http://www.genecards.org/cgi-bin/carddisp.pl?gene=FGF2 (Accessed 12 July 2013).

Gould M. M., Mohamed-Ali V., Goubet S. A., Yudkin J. S. and Haines A. P. (1993) Microalbuminuria: associations with height and sex in non-diabetic subjects, BMJ (Clinical research ed.), 306: 240–2.

Gygi S. P., Corthals G. L., Zhang Y., Rochon Y. and Aebersold R. (2000) Evaluation of two-dimensional gel electrophoresis-based proteome analysis technology, Proceedings of the National Academy of Sciences of the United States of America, 97: 9390-9395.

Haraldsson B. and Nyström J. (2012) The glomerular endothelium: new insights on function and structure, Current Opinion in Nephrology and Hypertension, 21: 1062-4821.

Haraldsson B., Nyström J. and Deen W. M. (2008) Properties of the Glomerular Barrier and Mechanisms of Proteinuria, Physiological Reviews, 88: 451–487.

Hassell J. R., Robey P. G., Barrach H. J., Wilczek J., Rennard S. I. and Martin G. R. (1980) Isolation of a heparan sulfate-containing proteoglycan from basement membrane, Proceedings of the National Academy of Sciences of the United States of America, 77: 4494-4498.

Hanevold C. D., Pollock J. S. and Harshfield G. A. (2008) Racial differences in microalbumin excretion in healthy adolescents, Hypertension, 51: 334–8.

Hogeweg P. (2011) The Roots of Bioinformatics in Theoretical Biology, PLoS Computational Biology, 7(3): e1002021, [Online], Available at http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002021 (Accessed 12 February 2013).

Huang D. W., Sherman B. T. and Lempicki R. A. (2008) Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources, Nature Protocols, 4: 44-57.

Huang D. W., Sherman B. T. and Lempicki R. A. (2009) Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists, Nucleic Acids Research, 37: 1–13.

Hunt D. F., Shabanowitz J., Sakaguchi K., Michel H., Sevilir N., Cox A. L., Appella E., Engelhard V. H. and Henderson R. A. (1992) Characterization of Peptides Bound to the Class I MHC Molecule HLA-A2.1 by Mass Spectrometry, Science, 255: 1-3.

Hynes R. O. and Naba A. (2012) Overview of the matrisome--an inventory of extracellular matrix constituents and functions, Cold Spring Harbor Perspectives in Biology, 4: a004903, [Online], Available at http://cshperspectives.cshlp.org/content/4/1/a004903.full (Accessed 16 March 2013).

Iorember F. M. and Vehaskari V. M. (2013) Uromodulin: old friend with new roles in health and disease, Pediatric Nephrology, [Online], Available at http://link.springer.com/article/10.1007%2Fs00467-013-2563-z (Accessed 16 March 2013).

Life Technologies (2013) The Basics: What is a Gene Array?, [Online], Available at http://www.invitrogen.com/site/us/en/home/References/Ambion-Tech-Support/rna-gene-expression/general-articles/the-basics-what-is-a-gene-array.html (Accessed 23 August 2013).

Isles C. G., Hole D. J., Hawthorne V. M. and Lever A. F. (1992) Relation between coronary risk and coronary mortality in women of the Renfrew and Paisley survey: comparison with men, Lancet, 339: 702–6.

The Jackson Laboratory (2013) Strain name: C57BL/6J: [Online], Available at http://jaxmice.jax.org/strain/000664.html (Accessed 23 August 2013).

The Jackson Laboratory (2013) Strain name: FVB/NJ: [Online], Available at http://jaxmice.jax.org/strain/001800.html (Accessed 23 August 2013).

Jones C. A., Francis M. E. and Eberhardt M. S. (2002) Microalbuminuria in the US population: third National Health and Nutrition Examination Survey, American Journal of Kidney Disease; 39: 445-459.

Julian B. A. and Novak J. (2004) IgA nephropathy: an update, Current Opinion in Nephrology and Hypertension: Clinical Nephrology, 13: 171-179.

Kanwar Y. S., Liu Z. Z., Kashihara N. and Wallner E. I. (1991) Current status of the structural and functional basis of glomerular filtration and proteinuria, Seminars in Nephrology, 11: 390–413.

Kanwar Y. S., Linker A. and Farquhar M. G. (1980) Increased permeability of the glomerular basement membrane to ferritin after removal of glycosaminoglycans (heparan sulfate) by enzyme digestion, Journal of Cell Biology, 86: 688-93.

Karp G. (2007) Cell and Molecular Biology: Concepts and Experiments, Chichester, Wiley-Blackwell.

Keller G., Zimmer G., Mall G., Ritz E. and Amann K. (2003) Nephron Number in Patients with Primary Hypertension, New England Journal of Medicine, 348: 101–108.

Kemter E., Prueckl P., Sklenak S., Rathkolb B., Habermann F. A., Hans W., Gailus-Durner V., Fuchs H., Hrabe de Angelis M., Wolf E., Aigner B. and Wanke R. (2013) Type of uromodulin mutation and allelic status influence onset and severity of uromodulin-associated kidney disease in mice, Human Molecular Genetics, [Online], Available at http://hmg.oxfordjournals.org/content/early/2013/06/06/hmg.ddt263.full.pdf+html (Accessed 5th August 2013).

Kestilä M., Lenkkeri U., Männikkö M., Lamerdin J., McCready P., Putaala H., Ruotsalainen V., Morita T., Nissinen M., Herva R., Kashtan C. E., Peltonen L., Holmberg C., Olsen A. and Tryggvason K. (1998) Positionally cloned gene for a novel glomerular protein--nephrin--is mutated in congenital nephrotic syndrome, Molecular Cell, 1: 575–82.

Khoshnoodi J., Cartailler J.-P., Alvares K., Veis A. and Hudson B. G. (2006) Molecular recognition in the assembly of collagens: terminal noncollagenous domains are key recognition modules in the formation of triple helical protomers, Journal of Biological Chemistry, 281: 38117-38121.

Kirkwood W. and Nagi A. H. (1970) A quick method for the isolation of glomeruli from human kidney, Technical Methods, 27: 361-362.

Lennon R., Byron A., Humphries J. D., Randles M. R., Carisey A., Murphy S., Knight D., Brenchley P. E., Zent R. and Humphries M. J. (2013, in press) Global analysis reveals the complexity of the human glomerular extracellular matrix, Journal of the American Society of Nephrology.

Life Technologies, The Basics: What is a Gene Array? [Online], Available at http://www.invitrogen.com/site/us/en/home/References/Ambion-Tech-Support/rna-gene-expression/general-articles/the-basics-what-is-a-gene-array.html (Accessed 24 July 2013).

Long D. A., Price K. L., Kolatsi-Joannou M., Dessapt-Baradez C., Huang J. L., Papakrivopoulou J., Korstanje R., Gnudi L. and Woolf A. S. (2013) Identification of novel glomerular molecules implicated in albuminuria, Kidney International, 83: 1118-1129.

Masola V., Onisto M., Zaza G., Lupo A. and Gambaro G. (2012) A new mechanism of action of sulodexide in diabetic nephropathy: inhibits heparanase-1 and prevents FGF-2-induced renal epithelial-mesenchymal transition, Journal of Translational Medicine, 10: 213.

Miner J. H. (2012) The glomerular basement membrane, Experimental Cell Research, 318: 973–978.

Miner J. H., Go G., Cunningham J., Patton B. L. and Jarad G. (2006) Transgenic isolation of skeletal muscle and kidney defects in laminin beta2 mutant mice: implications for Pierson syndrome, Development, 133: 967–75.

Miner J. H., Patton B. L., Lentz S. I., Gilbert D. J., Snider W. D., Jenkins N. A., Copeland N. G. and Sanes J. R. (1997) The Laminin Alpha Chains: Expression, Developmental Transitions, and Chromosomal Locations of Alpha 1-5, Identification of Heterotrimeric Laminins 8–11, and Cloning of a Novel Alpha 3 Isoform, Journal of Cell Biology, 137: 685-701.

Miner J. H. and Sanes J. R. (1994) Collagen IV α3, α4, and α5 Chains in Rodent Basal Laminae: Sequence, Distribution, Association with Laminins, and Developmental Switches, Journal of Cell Biology, 127: 879-891.

Molitch E. M., DeFronzo A. R., Franz J. M., Keane F. W., Mogensen E. C. and Parving H. H. (2004) Nephropathy in Diabetes, Diabetes Care, 27, [Online], Available at http://care.diabetesjournals.org/content/27/suppl_1/s79.full (Accessed 25 July 2013).

Naba A., Clauser K. R., Hoersch S., Liu H., Carr S. A. and Hynes R. (2012) The matrisome: in silico definition and in vivo characterization by proteomics, Molecular and Cellular Proteomics, 11(4): M111.014647, [Online], Available at http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3322572/ (Accessed 18 June 2013).

Nature Reviews Cancer; Type IV collagen network formation (2012); [Online], Available at http://www.nature.com/nrc/journal/v3/n6/box/nrc1094_BX1.html (Accessed 19 July 2013).

NCBI (2013) CCL28 chemokine (C-C motif) ligand 28 [Homo sapiens (human)], [Online], Available at http://www.ncbi.nlm.nih.gov/gene/56477 (Accessed 18 June 2013).

Nesvizhskii A. I., Keller A., Kolker E. and Aebersold R. (2003) A Statistical Model for Identifying Proteins by Tandem Mass Spectrometry, Analytical Chemistry, 75: 4646–4658.

Noakes P. G., Miner J. H., Gautam M., Cunningham J. M., Sanes J. R. and Merlie J. P. (1995) The renal glomerulus of mice lacking s-laminin/laminin β2: nephrosis despite molecular compensation by laminin β1, Nature Genetics, 10: 400–406.

Osawa T., Onodera M., Feng X.-Y. and Nozaka Y. (2003) Comparison of the thickness of basement membranes in various tissues of the rat, Journal of Electron Microscopy, 52: 435-40.

Pounds S. B. (2006) Estimation and control of multiple testing error rates for microarray studies, Briefings in Bioinformatics, 7: 25–36.

Pitera J. E., Turmaine M., Woolf A. S. and Scambler P. J. (2012) Generation of mice with a conditional null Fraser syndrome 1 (Fras1) allele, Genesis, 50: 892–8.

Qi Z., Whitt I., Mehta A., Jin J., Zhao M., Harris R. C., Fogo A. B. and Breyer M. D. (2004) Serial determination of glomerular filtration rate in conscious mice using FITC-inulin clearance, American Journal of Physiology. Renal Physiology, 286(3): F590, [Online], Available at http://ajprenal.physiology.org/content/286/3/F590 (Accessed 21 June 2013).

Rostgaard J. and Qvortrup K. (2002) Sieve plugs in fenestrae of glomerular capillaries--site of the filtration barrier?, Cells, Tissues, Organs, 170: 132-8.

Rupp F., Ozçelik T., Linial M., Peterson K., Francke U. and Scheller R. (1992) Structure and chromosomal localization of the mammalian agrin gene, The Journal of Neuroscience: the Official Journal of the Society for Neuroscience, 12: 3535-44.

Rupp F., Payan D. G., Magill-Solc C., Cowan D. M. and Scheller R. H. (1991) Structure and expression of a rat agrin, Neuron, 6: 811-23.

Salmivirta K. (2002) Binding of Mouse Nidogen-2 to Basement Membrane Components and Cells and Its Expression in Embryonic and Adult Tissues Suggest Complementary Functions of the Two Nidogens, Experimental Cell Research, 279: 188–201.

Salmon A. H. J., Ferguson J. K., Burford J. L., Gevorgyan H., Nakano D., Harper S. J., Bates D. O. and Peti-Peterdi J. (2012) Loss of the Endothelial Glycocalyx Links Albuminuria and Vascular Dysfunction, Journal of the American Society of Nephrology, 23: 1–12.

Sanes J. R., Engvall E., Butkowski R. and Hunter D. D. (1990) Molecular heterogeneity of basal laminae: isoforms of laminin and collagen IV at the neuromuscular junction and elsewhere, The Journal of Cell Biology, 111: 1685–1699.

Schena M., Shalon D., Davis R. W. and Brown P. O. (1995) Quantitative monitoring of gene expression patterns with a complementary DNA microarray, Science, 270: 467–470.

Schena F. P. and Gesualdo L. (2005) Pathogenetic mechanisms of diabetic nephropathy, Journal of the American Society of Nephrology, 16: 30–33.

Schlondorff D. (1987) The glomerular mesangial cell: an expanding role for a specialized pericyte, Journal of the Federation of American Societies for Experimental Biology, 1: 272–81.

Shevchenko A., Tomas H., Havlis J., Olsen J. V. and Mann M. (2006) In-gel digestion for mass spectrometric characterization of proteins and proteomes, Nature Protocols, 1: 2856–60.

Smoyer W. E. and Mundel P. (1998) Regulation of podocyte structure during the development of nephrotic syndrome, Journal of Molecular Medicine, 76: 172–83, p. 174 figure, [Online], Available at http://www.ncbi.nlm.nih.gov/pubmed/9535550 (Accessed 5th May 2012).

Stanis J. (2006) Glomerulus [Photoshop], Medical Illustrations and Fine Art, [Online], Available at http://www.jimstanis.com/gloum.htm (Accessed 15th May 2012).

Sugio S., Kashima A., Mochizuki S., Noda M. and Kobayashi K. (1999) Crystal structure of human serum albumin at 2.5 A resolution, Protein Engineering, 12: 439-46.

Takemoto M., Asker N., Gerhardt H., Lundkvist A., Johansson B. R., Saito Y. and Betsholtz C. (2002) A new method for large scale isolation of kidney glomeruli from mice, The American Journal of Pathology, 161: 799-805.

Tsaih S.-W., Pezzolesi M. G., Yuan R., Warram J. H., Krolewski A. S. and Korstanje R. (2010) Genetic analysis of albuminuria in aging mice and concordance with loci for human diabetic nephropathy found in a genome-wide association scan, Kidney International, 77: 201-10.

Touriol C., Roussigne M., Gensac M. C., Prats H. and Prats A. C. (2000) Alternative translation initiation of human fibroblast growth factor 2 mRNA controlled by its 3’-untranslated region involves a Poly(A) switch and a translational enhancer, The Journal of Biological Chemistry, 275: 19361–19367.

Vander A. J., Sherman J. H. and Luciano D. S. (2001) Human Physiology: the

Mechanisms of Body Function, 8th edn; Singapore, McGraw-Hill.

Verhave, J. C. (2003) Cardiovascular Risk Factors Are Differently Associated with

Urinary Albumin Excretion in Men and Women, Journal of the American Society of

Nephrology, 14: 1330–1335.

Vernier R. L. and Birch-Andersen A. (1962) Studies of the human fetal kidney, The

Journal of Pediatrics, 60: 754–776.

Viberti G. C., Jarrett R. J., Hill R. D., Mahmud U. and Argyropoulos A. K. H., (1982)

Microalbuminuria as a predictor of clinical nephropathy in insulin-dependent diabetes mellitus, Lancet, 1430–1432.

Wang, Z., Gerstein, M., and Snyder, M. (2010) RNA-Seq: a revolutionary tool for transcriptomics, Nature Reviews Genetics, 10: 57–63.

Wartiovaara J., Öfverstedt L.-göran, Khoshnoodi J., Zhang J., Mäkelä E., Sandin S.,

Ruotsalainen V., Cheng R.H., Jalanko H., Skoglund U. And Tryggvason K. (2004)

Nephrin strands contribute to a porous slit diaphragm scaffold as revealed by electron tomography, Filtration, 114: 1475-1483.

Weir M. R. (2007) Microalbuminuria and cardiovascular disease, Clinical journal of the

American Society of Nephrology, 2: 581–90.

Zhang, B., Hu, Y. and Ma, J. (2009) Anti-inflammatory and antioxidant effects of

SERPINA3K in the retina, Investigative ophthalmology and visual science, 50: 3943–

3952.

Zhang B., Zhou K. K. and Ma J. (2010) Inhibition of Connective Tissue Growth Factor Overexpression in Diabetic Retinopathy by SERPINA3K via Blocking the WNT/β-Catenin Pathway, Diabetes, 59: 1809–1816.

Zoja C., Wang J. M., Bettoni S., Sironi M., Renzi D., Chiaffarino F., Abboud H. E., Van Damme J., Mantovani A. and Remuzzi G. (1991) Interleukin-1 beta and tumor necrosis factor-alpha induce gene expression and production of leukocyte chemotactic factors, colony-stimulating factors, and interleukin-6 in human mesangial cells, American Journal of Pathology, 138: 991–1003.

Appendices

Appendix 1: Tables of glomerular extracellular matrix proteins detected by mass spectrometry in male FVB, female FVB, male C57 and female C57 mice

Appendix 1.1: FVB male glomerular ECM. 79 ECM proteins were identified in the ECM collected from FVB male mice. Protein name, gene name, Uniprot identifier, Entrez gene ID, molecular weight (MW) and protein abundance measured by normalised spectral counts (NSC) are indicated.

| Protein name and category | Gene name | Uniprot ID | Entrez gene ID | MW | Abundance (NSC) |
|---|---|---|---|---|---|
| Collagens | | | | | |
| Collagen alpha-1(XII) chain | Col12a1 | Q60847 | 12816 | 340 | 1.193E-05 |
| Procollagen type XV | Col15a1 | A2AJY2 | 12819 | 138 | 1.057E-04 |
| Collagen alpha-1(XVIII) chain | Col18a1 | P39061 | 12822 | 182 | 7.000E-05 |
| Collagen alpha-1(I) chain | Col1a1 | P11087 | 12842 | 138 | 1.900E-05 |
| Collagen alpha-1(IV) chain | Col4a1 | P02463 | 12826 | 161 | 6.100E-05 |
| Collagen, type IV, alpha 2 | Col4a2 | B2RQQ8 | 12827 | 167 | 1.357E-04 |
| Collagen alpha-3(IV) chain | Col4a3 | Q9QZS0 | 12828 | 162 | 7.600E-05 |
| Collagen alpha-4(IV) chain | Col4a4 | Q9QZR9 | 12829 | 164 | 1.600E-04 |
| Collagen alpha-5(IV) chain | Col4a5 | Q63ZW6 | 12830 | 162 | 5.382E-05 |
| Procollagen type IV alpha 6 | Col4a6 | B1AVK5 | 94216 | 164 | 1.053E-05 |
| Collagen alpha-1(VI) chain | Col6a1 | Q04857 | 12833 | 108 | 1.800E-04 |
| Collagen alpha-2(VI) chain | Col6a2 | Q02788 | 12834 | 110 | 2.100E-04 |
| Collagen alpha-3(VI) chain | Col6a3 | E9PWQ3 | 1293 | 287 | 2.867E-04 |
| Collagen alpha-5(VI) chain | Col6a5 | A6H584 | 665033 | 290 | 1.358E-05 |
| ECM Glycoproteins | | | | | |
| Agrin | Agrn | A2ASQ1 | 11603 | 208 | 7.259E-06 |
| EMILIN-1 | Emilin1 | Q99K41 | 100952 | 108 | 9.001E-05 |
| Fibrillin 1 | Fbn1 | A2AQ53 | 14118 | 312 | 2.779E-05 |
| Fibrinogen beta chain | Fgb | Q8K0E8 | 110135 | 55 | 1.999E-04 |
| Fibrinogen gamma chain | Fgg | Q8VCM7 | 99571 | 49 | 1.157E-04 |
| Fibronectin 1 | Fn1 | Q3UH17 | 14268 | 250 | 2.100E-04 |
| Kielin/chordin-like protein | Kcp | Q3U492 | 375616 | 167 | 9.633E-06 |
| Laminin subunit alpha-1 | Lama1 | P19137 | 16772 | 338 | 3.500E-05 |
| Laminin subunit alpha-3 | Lama3 | E9PUR4 | 3909 | 366 | 6.330E-05 |
| Laminin subunit alpha-4 | Lama4 | P97927 | 16775 | 202 | 5.900E-05 |
| Laminin subunit alpha-5 | Lama5 | Q61001 | 16776 | 404 | 2.000E-04 |
| Laminin subunit beta-1 | Lamb1 | P02469 | 3912 | 343 | 5.606E-05 |
| Laminin subunit beta-1 | Lamb1 | P02469 | 3912 | 197 | 1.689E-04 |
| Laminin, beta 2 | Lamb2 | Q61292 | 16779 | 197 | 3.100E-04 |
| Laminin subunit gamma-1 | Lamc1 | P02468 | 226519 | 177 | 2.500E-04 |
| Lactadherin | Mfge8 | P21956 | 17304 | 51 | 1.763E-04 |
| Multimerin-2 | Mmrn2 | A6H6E2 | 105450 | 105 | 6.200E-05 |
| Nidogen-1 | Nid1 | P10493 | 18073 | 137 | 2.209E-04 |
| Nidogen 2 | Nid2 | Q3TPN0 | 18074 | 154 | 1.581E-04 |
| Netrin-4 | Ntn4 | Q9JI33 | 59277 | 70 | 6.626E-05 |
| Papilin | Papln | Q9EPX2 | 170721 | 139 | 4.619E-05 |
| Probable carboxypeptidase PM20D1 | Pm20d1 | Q8C165 | 212933 | 56 | 4.900E-05 |
| Peroxidasin homolog (Drosophila) | Pxdn | B2RX13 | 69675 | 165 | 5.727E-05 |
| Tubulointerstitial nephritis antigen | Tinag | Q91XG7 | 26944 | 54 | 1.684E-04 |
| Tubulointerstitial nephritis antigen-like 1 | Tinagl1 | Q99JR5 | 94242 | 53 | 1.270E-04 |
| Tenascin | Tnc | Q80YX1 | 21923 | 232 | 3.100E-05 |
| Vitronectin | Vtn | P29788 | 22370 | 55 | 1.300E-04 |
| von Willebrand factor A domain-containing protein 8 | Vwa8 | Q8CC88 | 219189 | 213 | 4.000E-05 |
| von Willebrand factor | Vwf | E9QPU1 | 7450 | 309 | 1.867E-05 |
| Proteoglycans | | | | | |
| Hyaluronan and proteoglycan link protein 1 | Hapln1 | Q9QUP5 | 12950 | 40 | 5.400E-05 |
| Basement membrane-specific heparan sulfate proteoglycan core protein | Hspg2 | Q05793 | 15530 | 398 | 6.500E-06 |
| Basement membrane-specific heparan sulfate proteoglycan core protein | Hspg2 | E9PZ16 | 3339 | 470 | 1.772E-04 |
| ECM-associated | | | | | |
| Serum albumin | Alb | P07724 | 117586 | 69 | 1.179E-04 |
| Alkaline phosphatase | Alpl | B7XGA6 | 249 | 57 | 1.466E-04 |
| Annexin A2 | Anxa2 | P07356 | 12306 | 39 | 1.821E-04 |
| Apolipoprotein O-like | Apool | Q78IK4 | 68117 | 29 | 1.948E-04 |
| Complement C3 | C3 | P01027 | 12266 | 186 | 2.143E-04 |
| Complement C4-B | C4b | P01029 | 721 | 193 | 1.783E-04 |
| C4b-binding protein | C4bpa | P08607 | 12269 | 52 | 3.530E-05 |
| Clusterin | Clu | Q06890 | 12759 | 52 | 2.164E-04 |
| Ceruloplasmin | Cp | Q61147 | 12870 | 121 | 1.091E-04 |
| Deoxyribonuclease-1 | Dnase1 | P49183 | 13419 | 32 | 7.062E-05 |
| Dipeptidylpeptidase 4 | Dpp4 | A2AS84 | 13482 | 87 | 1.364E-04 |
| Heparin-binding growth factor 2 | Fgf2 | P15655 | 14173 | 17 | 1.141E-04 |
| Protein Gm20425 | Gm20425 | E9Q035 | | 108 | 1.946E-05 |
| Glutathione peroxidase 3 | Gpx3 | P46412 | 14778 | 25 | 3.300E-04 |
| Gelsolin | Gsn | Q6PAC1 | 227753 | 81 | 8.400E-05 |
| Dehydrogenase/reductase (SDR family) member 8 | HSD17B11 | Q9EQ06 | 114664 | 33 | 1.986E-04 |
| Heat shock 70kD protein 5 (Glucose-regulated protein) | Hspa5 | P20029 | 14828 | 72 | 1.630E-04 |
| Hyaluronidase-2 | Hyal2 | O35632 | 15587 | 54 | 6.689E-05 |
| Ig mu chain C region secreted form | Igh-6 | P01872 | 16019 | 50 | 6.129E-05 |
| Ig gamma-1 chain C region | Ighg1 | P01869 | 16017 | 43 | 5.256E-05 |
| Inter-alpha trypsin inhibitor, heavy chain 1, isoform CRA_a | Itih1 | Q61702 | 16424 | 101 | 5.100E-05 |
| Meprin A subunit alpha | Mep1a | P28825 | 17287 | 84 | 7.742E-05 |
| Meprin A subunit beta | Mep1b | Q61847 | 17288 | 80 | 1.007E-04 |
| Murinoglobulin-1 | Mug1 | P28665 | 17836 | 165 | 8.700E-05 |
| Nephronectin | NPNT | D3YTX1 | 255743 | 50 | 8.734E-05 |
| Ribosomal Protein L13a | Rpl13a | Q5M9M0 | 14256 | 23 | 1.056E-04 |
| Alpha-1-antitrypsin 1-1 | Serpina1a | P07758 | 20700 | 46 | 5.170E-05 |
| Serine protease inhibitor A3K | Serpina3k | P07759 | 20714 | 47 | 1.100E-04 |
| Serpin Peptidase Inhibitor, Clade H (Heat Shock Protein 47), Member 1 | Serpinh1 | P19324 | 12406 | 47 | 2.590E-04 |
| Protein-glutamine gamma-glutamyltransferase K | Tgm1 | Q9JLF6 | 21816 | 90 | 4.775E-05 |
| Protein-glutamine gamma-glutamyltransferase 2 | Tgm2 | P21981 | 21817 | 77 | 1.600E-04 |
| Trypsin 10 | Try10 | Q792Z1 | 436522 | 26 | 5.807E-05 |

Appendix 1.2: FVB female glomerular ECM. 110 ECM proteins were identified in the ECM collected from FVB female mice. Protein name, gene name, Uniprot identifier, Entrez gene ID, molecular weight (MW) and protein abundance measured by normalised spectral counts (NSC) are indicated.

| Protein name and category | Gene name | Uniprot ID | Entrez gene ID | MW | Abundance (NSC) |
|---|---|---|---|---|---|
| Collagens | | | | | |
| Collagen alpha-1(XII) chain | Col12a1 | Q60847 | 12816 | 340 | 5.08476E-05 |
| Collagen alpha-1(XIV) chain | Col14a1 | B7ZNH7 | 12818 | 193 | 4.29284E-05 |
| Collagen alpha-1(XV) chain | Col15a1 | A2AJY2 | 12819 | 138 | 8.3084E-05 |
| Collagen alpha-1(XVIII) chain | Col18a1 | P39061 | 12822 | 182 | 0.00011923 |
| Collagen alpha-1(I) chain | Col1a1 | P11087 | 12842 | 138 | 0.000095 |
| Collagen alpha-2(I) chain | Col1a2 | Q01149 | 12843 | 130 | 6.93434E-05 |
| Collagen alpha-1(III) chain | Col3a1 | Q8BLW4 | 12825 | 139 | 0.000027 |
| Collagen alpha-1(IV) chain | Col4a1 | P02463 | 12826 | 161 | 0.000062 |
| Collagen alpha-2(IV) chain | Col4a2 | B2RQQ8 | 12827 | 167 | 0.000124404 |
| Collagen alpha-3(IV) chain | Col4a3 | Q9QZS0 | 12828 | 162 | 0.000067 |
| Collagen alpha-4(IV) chain | Col4a4 | Q9QZR9 | 12829 | 164 | 0.00014 |
| Collagen alpha-5(IV) chain | Col4a5 | Q63ZW6 | 12830 | 162 | 5.35847E-05 |
| Collagen alpha-6(IV) chain | Col4a6 | B1AVK5 | 94216 | 164 | 3.41028E-05 |
| Collagen alpha-2(V) chain | Col5a2 | Q3U962 | 12832 | 145 | 0.000021 |
| Collagen alpha-1(VI) chain | Col6a1 | Q04857 | 12833 | 108 | 0.00022 |
| Collagen alpha-2(VI) chain | Col6a2 | Q02788 | 12834 | 110 | 0.00022 |
| Collagen alpha-3(VI) chain | Col6a3 | D3YWD1 | 1293 | 186 | 1.20256E-05 |
| Collagen alpha-3(VI) chain | Col6a3 | E9PWQ3 | 1293 | 287 | 0.00031118 |
| Collagen alpha-5(VI) chain | Col6a5 | A6H584 | 665033 | 290 | 1.307E-05 |
| ECM Glycoproteins | | | | | |
| Alpha-1B-glycoprotein | A1bg | Q19LI2 | 13722 | 57 | 0.000162383 |
| Agrin | Agrn | A2ASQ1 | 11603 | 208 | 8.92584E-06 |
| EMI domain-containing protein 1 | Emid1 | Q91VF5 | 140703 | 46 | 3.3942E-05 |
| Elastin Microfibril Interfacer 1 | Emilin1 | Q99K41 | 100952 | 108 | 0.000158271 |
| Fibulin-5 | Fbln5 | Q9WVH9 | 23876 | 50 | 0.000038 |
| Fibrillin 1 | Fbn1 | A2AQ53 | 14118 | 312 | 2.59054E-05 |
| Fibrinogen beta chain | Fgb | Q8K0E8 | 110135 | 55 | 0.000231764 |
| Fibrinogen gamma chain | Fgg | Q8VCM7 | 99571 | 49 | 0.000196892 |
| Fibronectin 1 | Fn1 | Q3UH17 | 14268 | 250 | 0.00019 |
| Extracellular matrix protein FRAS1 | Fras1 | Q80T14 | 80144 | 442 | 7.09081E-06 |
| Insulin-like growth factor-binding protein complex acid labile subunit | Igfals | P70389 | 16005 | 67 | 6.88719E-05 |
| Kielin/chordin-like protein | Kcp | Q3U492 | 375616 | 167 | 7.53484E-06 |
| Laminin subunit alpha-1 | Lama1 | P19137 | 16772 | 338 | 0.000057 |
| Laminin subunit alpha-3 | Lama3 | E9PUR4 | 3909 | 366 | 7.1992E-05 |
| Laminin subunit alpha-4 | Lama4 | P97927 | 16775 | 202 | 0.000092 |
| Laminin subunit alpha-5 | Lama5 | Q61001 | 16776 | 404 | 0.00018 |
| Laminin subunit beta-1 | Lamb1 | P02469 | 3912 | 343 | 5.12579E-05 |
| Laminin subunit beta-1 | Lamb1 | P02469 | 3912 | 197 | 0.000176455 |
| Laminin subunit beta-2 | Lamb2 | Q61292 | 16779 | 197 | 0.00027 |
| Laminin subunit gamma-1 | Lamc1 | P02468 | 226519 | 177 | 0.00023 |
| Milk Fat Globule-EGF Factor 8 Protein | Mfge8 | P21956 | 17304 | 51 | 0.000133883 |
| Multimerin-2 | Mmrn2 | A6H6E2 | 105450 | 105 | 0.000077 |
| Nidogen-1 | Nid1 | P10493 | 18073 | 137 | 0.000217164 |
| Nidogen-2 | Nid2 | Q3TPN0 | 18074 | 154 | 0.000171299 |
| Netrin-4 | Ntn4 | Q9JI33 | 59277 | 70 | 8.89969E-05 |
| Papilin, proteoglycan-like sulfated glycoprotein | Papln | Q9EPX2 | 170721 | 139 | 6.73384E-05 |
| Peroxidasin homolog (Drosophila) | Pxdn | B2RX13 | 69675 | 165 | 7.93256E-05 |
| Transforming growth factor, beta induced | Tgfbi | A1L353 | 21810 | 75 | 8.3012E-05 |
| Thrombospondin type-1 domain-containing protein 4 | Thsd4 | Q3UTY6 | 207596 | 113 | 0.000011 |
| Tubulointerstitial nephritis antigen | Tinag | Q91XG7 | 26944 | 54 | 0.000229516 |
| Tubulointerstitial nephritis antigen-like 1 | Tinagl1 | Q99JR5 | 94242 | 53 | 0.000175805 |
| Tenascin | Tnc | Q80YX1 | 21923 | 232 | 0.000046 |
| Tenascin-X | Tnxb | O54796 | 81877 | 447 | 0.000011 |
| Vitronectin | Vtn | P29788 | 22370 | 55 | 0.00014 |
| von Willebrand factor A domain-containing protein 1 | Vwa1 | Q8R2Z5 | 246228 | 45 | 0.000061 |
| von Willebrand factor A domain-containing protein 8 | Vwa8 | Q8CC88 | 219189 | 213 | 0.0000099 |
| von Willebrand factor | Vwf | E9QPU1 | 7450 | 309 | 2.28281E-05 |
| Proteoglycans | | | | | |
| Biglycan | Bgn | P28653 | 12111 | 42 | 5.01322E-05 |
| Hyaluronan and proteoglycan link protein 1 | Hapln1 | Q9QUP5 | 12950 | 40 | 0.000078 |
| Basement membrane-specific heparan sulfate proteoglycan core protein | Hspg2 | Q05793 | 15530 | 398 | 0.000007 |
| Basement membrane-specific heparan sulfate proteoglycan core protein | Hspg2 | E9PZ16 | 3339 | 470 | 0.000171729 |
| Prolargin | Prelp | Q9JK53 | 116847 | 43 | 9.29515E-05 |
| ECM-associated | | | | | |
| Alpha-2-macroglobulin | A2m | Q61838 | 11287 | 166 | 9.40561E-06 |
| ADAM Metallopeptidase With Thrombospondin Type 1 Motif, 13 | Adamts13 | Q769J6 | 279028 | 155 | 0.0000081 |
| Aminoacyl tRNA synthase complex-interacting multifunctional protein 1 | Aimp1 | P31230 | 11657 | 34 | 3.70093E-05 |
| Serum albumin | Alb | P07724 | 117586 | 69 | 0.000156993 |
| Alkaline phosphatase | Alpl | B7XGA6 | 249 | 57 | 0.000194754 |
| Angiopoietin-related protein 2 | Angptl2 | Q9R045 | 26360 | 57 | 7.5775E-05 |
| Annexin A2 | Anxa2 | P07356 | 12306 | 39 | 0.000150089 |
| Apolipoprotein A-I | Apoa1 | Q00623 | 11806 | 31 | 0.00015 |
| Apolipoprotein E | Apoe | P08226 | 11816 | 36 | 0.000105179 |
| Apolipoprotein O-like | Apool | Q78IK4 | 68117 | 29 | 7.27393E-05 |
| Complement C1q subcomponent subunit C | C1qc | Q02105 | 12262 | 26 | 0.000083 |
| Complement C3 | C3 | P01027 | 12266 | 186 | 0.000279629 |
| Complement C4-B | C4b | P01029 | 721 | 193 | 0.000155443 |
| C4b-binding protein | C4bpa | P08607 | 12269 | 52 | 3.57033E-05 |
| Calreticulin | Calr | B2MWM9 | 12317 | 48 | 2.50892E-05 |
| Clusterin | Clu | Q06890 | 12759 | 52 | 0.000245135 |
| Ceruloplasmin | Cp | Q61147 | 12870 | 121 | 0.00011286 |
| Carboxypeptidase N subunit 2 | Cpn2 | Q9DBB9 | 71756 | 60 | 0.000046 |
| Dipeptidylpeptidase 4 | Dpp4 | A2AS84 | 13482 | 87 | 0.000160494 |
| Ecm1 protein | Ecm1 | B7ZNR0 | 1893 | 63 | 2.86117E-05 |
| Heparin-binding growth factor 2 | Fgf2 | P15655 | 14173 | 17 | 0.000202414 |
| GLI pathogenesis-related 2 | Glipr2 | Q9CYL5 | 384009 | 17 | 0.000128852 |
| Protein Gm20425 | Gm20425 | E9Q035 | | 108 | 2.86994E-05 |
| Glutathione peroxidase 3 | Gpx3 | P46412 | 14778 | 25 | 0.00046 |
| Gelsolin, isoform CRA_c | Gsn | Q6PAC1 | 227753 | 81 | 0.00012 |
| Heat shock 70kD protein 5 (Glucose-regulated protein) | Hspa5 | P20029 | 14828 | 72 | 0.00012019 |
| Serine protease HTRA1 | Htra1 | Q9R118 | 56213 | 44 | 7.06164E-05 |
| Hyaluronidase-2 | Hyal2 | O35632 | 15587 | 54 | 5.76832E-05 |
| Ig mu chain C region secreted form | Igh-6 | P01872 | 16019 | 50 | 8.08636E-05 |
| Ig gamma-1 chain C region, membrane-bound form | Ighg1 | P01869 | 16017 | 43 | 4.31761E-05 |
| Inter-alpha trypsin inhibitor, heavy chain 1, isoform CRA_a | Itih1 | Q61702 | 16424 | 101 | 0.000065 |
| Inter-alpha-trypsin inhibitor heavy chain H2 | Itih2 | Q61703 | 16425 | 106 | 1.47295E-05 |
| Inter alpha-trypsin inhibitor, heavy chain 4 | Itih4 | A6X935 | 16427 | 100 | 2.45872E-05 |
| Inter-alpha-trypsin inhibitor heavy chain H5 | Itih5 | Q8BJD1 | 209378 | 107 | 0.000052 |
| Ladinin-1 | Lad1 | P57016 | 16763 | 59 | 8.97241E-05 |
| Mannose-binding protein C | Mbl2 | P41317 | 17195 | 26 | 0.000119654 |
| Meprin A subunit alpha | Mep1a | P28825 | 17287 | 84 | 1.7944E-05 |
| Meprin A subunit beta | Mep1b | Q61847 | 17288 | 80 | 2.6368E-05 |
| Murinoglobulin-1 | Mug1 | P28665 | 17836 | 165 | 0.000019 |
| Nephronectin | NPNT | D3YTX1 | 255743 | 50 | 0.000118535 |
| Ribosomal Protein L13a | Rpl13a | Q5M9M0 | 14256 | 23 | 0.000257348 |
| Serine (Or Cysteine) Proteinase Inhibitor, Clade A, Member 1 | Serpina1a | P07758 | 20700 | 46 | 5.39532E-05 |
| Serine (Or Cysteine) Proteinase Inhibitor, Clade A, Member 3 | Serpina3k | P07759 | 20714 | 47 | 0.00012 |
| Serpin Peptidase Inhibitor, Clade H (Heat Shock Protein 47), Member 1 | Serpinh1 | P19324 | 12406 | 47 | 0.000277743 |
| Superoxide dismutase | Sod3 | O88592 | 20657 | 27 | 7.99847E-05 |
| Protein-glutamine gamma-glutamyltransferase K | Tgm1 | Q9JLF6 | 21816 | 90 | 2.3395E-05 |
| Protein-glutamine gamma-glutamyltransferase 2 | Tgm2 | P21981 | 21817 | 77 | 0.0002 |
| TIMP Metallopeptidase Inhibitor 3 | Timp3 | Q54AE5 | 21859 | 24 | 0.000116198 |
| Uromodulin | Umod | Q91X17 | 22242 | 71 | 0.000017 |

Appendix 1.3: C57 male glomerular ECM. 80 ECM proteins were identified in the ECM collected from C57 male mice. Protein name, gene name, Uniprot identifier, Entrez gene ID, molecular weight (MW) and protein abundance measured by normalised spectral counts (NSC) are indicated.

| Protein name and category | Gene name | Uniprot ID | Entrez gene ID | MW | Abundance (NSC) |
|---|---|---|---|---|---|
| Collagens | | | | | |
| Collagen alpha-1(XII) chain | Col12a1 | Q60847 | 12816 | 340 | 2.18551E-05 |
| Collagen alpha-1(XIV) chain | Col14a1 | B7ZNH7 | 12818 | 193 | 1.92244E-05 |
| Collagen alpha-1(XV) chain | Col15a1 | A2AJY2 | 12819 | 138 | 9.05127E-05 |
| Collagen alpha-1(XVIII) chain | Col18a1 | P39061 | 12822 | 182 | 0.000108351 |
| Collagen alpha-1(I) chain | Col1a1 | P11087 | 12842 | 138 | 0.00013 |
| Collagen alpha-2(I) chain | Col1a2 | Q01149 | 12843 | 130 | 0.000100053 |
| Collagen alpha-1(III) chain | Col3a1 | Q8BLW4 | 12825 | 139 | 0.000067 |
| Collagen alpha-1(IV) chain | Col4a1 | P02463 | 12826 | 161 | 0.000068 |
| Collagen alpha-2(IV) chain | Col4a2 | B2RQQ8 | 12827 | 167 | 0.00013766 |
| Collagen alpha-3(IV) chain | Col4a3 | Q9QZS0 | 12828 | 162 | 0.000081 |
| Collagen alpha-4(IV) chain | Col4a4 | Q9QZR9 | 12829 | 164 | 0.00014 |
| Collagen alpha-5(IV) chain | Col4a5 | Q63ZW6 | 12830 | 162 | 4.52005E-05 |
| Collagen alpha-6(IV) chain | Col4a6 | B1AVK5 | 94216 | 164 | 3.78908E-05 |
| Collagen alpha-1(VI) chain | Col6a1 | Q04857 | 12833 | 108 | 0.0002 |
| Collagen alpha-2(VI) chain | Col6a2 | Q02788 | 12834 | 110 | 0.0002 |
| Collagen alpha-3(VI) chain | Col6a3 | D3YWD1 | 1293 | 186 | 1.08955E-05 |
| Collagen alpha-3(VI) chain | Col6a3 | E9PWQ3 | 1293 | 287 | 0.00027419 |
| ECM Glycoproteins | | | | | |
| Elastin Microfibril Interfacer 1 | Emilin1 | Q99K41 | 100952 | 108 | 0.00010693 |
| Fibrillin 1 | Fbn1 | A2AQ53 | 14118 | 312 | 2.23815E-05 |
| Fibrinogen beta chain | Fgb | Q8K0E8 | 110135 | 55 | 0.000134151 |
| Fibrinogen gamma chain | Fgg | Q8VCM7 | 99571 | 49 | 8.2717E-05 |
| Fibronectin 1 | Fn1 | Q3UH17 | 14268 | 250 | 0.00016 |
| Laminin subunit alpha-1 | Lama1 | P19137 | 16772 | 338 | 0.000052 |
| Laminin subunit alpha-3 | Lama3 | E9PUR4 | 3909 | 366 | 3.4271E-05 |
| Laminin subunit alpha-4 | Lama4 | P97927 | 16775 | 202 | 0.000044 |
| Laminin subunit alpha-5 | Lama5 | Q61001 | 16776 | 404 | 0.00018 |
| Laminin subunit beta-1 | Lamb1 | P02469 | 3912 | 343 | 4.21013E-05 |
| Laminin subunit beta-1 | Lamb1 | P02469 | 3912 | 197 | 0.000169606 |
| Laminin subunit beta-2 | Lamb2 | Q61292 | 16779 | 197 | 0.00028 |
| Laminin subunit gamma-1 | Lamc1 | P02468 | 226519 | 177 | 0.00025 |
| Milk Fat Globule-EGF Factor 8 Protein | Mfge8 | P21956 | 17304 | 51 | 0.000146504 |
| Multimerin-2 | Mmrn2 | A6H6E2 | 105450 | 105 | 0.000074 |
| Nidogen-1 | Nid1 | P10493 | 18073 | 137 | 0.000212962 |
| Nidogen-2 | Nid2 | Q3TPN0 | 18074 | 154 | 0.000165028 |
| Netrin-4 | Ntn4 | Q9JI33 | 59277 | 70 | 2.89991E-05 |
| Papilin, proteoglycan-like sulfated glycoprotein | Papln | Q9EPX2 | 170721 | 139 | 2.67657E-05 |
| Peroxidasin homolog (Drosophila) | Pxdn | B2RX13 | 69675 | 165 | 6.55869E-05 |
| Transforming growth factor, beta induced | Tgfbi | A1L353 | 21810 | 75 | 4.05988E-05 |
| Tubulointerstitial nephritis antigen | Tinag | Q91XG7 | 26944 | 54 | 0.000179542 |
| Tubulointerstitial nephritis antigen-like 1 | Tinagl1 | Q99JR5 | 94242 | 53 | 0.000136461 |
| Tenascin | Tnc | Q80YX1 | 21923 | 232 | 0.000052 |
| Vitronectin | Vtn | P29788 | 22370 | 55 | 0.000061 |
| von Willebrand factor A domain-containing protein 1 | Vwa1 | Q8R2Z5 | 246228 | 45 | 0.000045 |
| von Willebrand factor A domain-containing protein 8 | Vwa8 | Q8CC88 | 219189 | 213 | 0.000053 |
| von Willebrand factor | Vwf | E9QPU1 | 7450 | 309 | 2.0291E-05 |
| Proteoglycans | | | | | |
| Basement membrane-specific heparan sulfate proteoglycan core protein | Hspg2 | E9PZ16 | 3339 | 470 | 0.000165948 |
| Prolargin | Prelp | Q9JK53 | 116847 | 43 | 4.70511E-05 |
| ECM-associated | | | | | |
| Gelsolin, isoform CRA_c | Gsn | Q6PAC1 | 227753 | 81 | 0.000087 |
| Serum albumin | Alb | P07724 | 117586 | 69 | 0.000103464 |
| Alkaline phosphatase | Alpl | B7XGA6 | 249 | 57 | 0.000184561 |
| Annexin A2 | Anxa2 | P07356 | 12306 | 39 | 0.000161818 |
| Apolipoprotein E | Apoe | P08226 | 11816 | 36 | 0.000139605 |
| Apolipoprotein O-like | Apool | Q78IK4 | 68117 | 29 | 0.000137839 |
| Arginase-1 | Arg1 | Q61176 | 11846 | 35 | 0.000054 |
| Complement C3 | C3 | P01027 | 12266 | 186 | 0.000195313 |
| Complement C4-B | C4b | P01029 | 721 | 193 | 0.000124054 |
| Complement component C9 | C9 | P06683 | 12279 | 62 | 0.000097 |
| Clusterin | Clu | Q06890 | 12759 | 52 | 0.000181845 |
| Ceruloplasmin | Cp | Q61147 | 12870 | 121 | 0.000134371 |
| Deoxyribonuclease-1 | Dnase1 | P49183 | 13419 | 32 | 6.92743E-05 |
| Dipeptidylpeptidase 4 | Dpp4 | A2AS84 | 13482 | 87 | 0.000163158 |
| Igh protein | Gm16844 | Q58E61 | 634338 | 53 | 5.73239E-05 |
| Glutathione peroxidase 3 | Gpx3 | P46412 | 14778 | 25 | 0.00043 |
| Heat shock 70kD protein 5 (Glucose-regulated protein) | Hspa5 | P20029 | 14828 | 72 | 0.000176834 |
| Serine protease HTRA1 | Htra1 | Q9R118 | 56213 | 44 | 3.07056E-05 |
| Hyaluronidase-2 | Hyal2 | O35632 | 15587 | 54 | 3.1243E-05 |
| Ig mu chain C region secreted form | Igh-6 | P01872 | 16019 | 50 | 9.71936E-05 |
| Inter-alpha trypsin inhibitor, heavy chain 1, isoform CRA_a | Itih1 | Q61702 | 16424 | 101 | 0.000086 |
| Inter-alpha-trypsin inhibitor heavy chain H5 | Itih5 | Q8BJD1 | 209378 | 107 | 0.000022 |
| Ladinin-1 | Lad1 | P57016 | 16763 | 59 | 0.000123758 |
| Meprin A subunit alpha | Mep1a | P28825 | 17287 | 84 | 9.64068E-05 |
| Meprin A subunit beta | Mep1b | Q61847 | 17288 | 80 | 0.000125904 |
| Murinoglobulin-1 | Mug1 | P28665 | 17836, 640530 | 165 | 0.000022 |
| Nephronectin | NPNT | D3YTX1 | 255743 | 50 | 8.01486E-05 |
| Ribosomal Protein L13a | Rpl13a | Q5M9M0 | 14256 | 23 | 0.000184347 |
| Serine (Or Cysteine) Proteinase Inhibitor, Clade A, Member 1 | Serpina1a | P07758 | 20700 | 46 | 3.67498E-05 |
| Serpin Peptidase Inhibitor, Clade H (Heat Shock Protein 47), Member 1 | Serpinh1 | P19324 | 12406 | 47 | 0.000278146 |
| Superoxide dismutase | Sod3 | O88592 | 20657 | 27 | 7.50581E-05 |
| Protein-glutamine gamma-glutamyltransferase K | Tgm1 | Q9JLF6 | 21816 | 90 | 4.92617E-05 |
| Protein-glutamine gamma-glutamyltransferase 2 | Tgm2 | P21981 | 21817 | 77 | 0.00015 |

Appendix 1.4: C57 female glomerular ECM. 79 ECM proteins were identified in the ECM collected from C57 female mice. Protein name, gene name, Uniprot identifier, Entrez gene ID, molecular weight (MW) and protein abundance measured by normalised spectral counts (NSC) are indicated.

| Protein name and category | Gene name | Uniprot ID | Entrez gene ID | MW | Abundance (NSC) |
|---|---|---|---|---|---|
| Collagens | | | | | |
| Collagen alpha-1(XII) chain | Col12a1 | Q60847 | 12816 | 340 | 5.08646E-05 |
| Collagen alpha-1(XIV) chain | Col14a1 | B7ZNH7 | 12818 | 193 | 1.81453E-05 |
| Collagen alpha-1(XV) chain | Col15a1 | A2AJY2 | 12819 | 138 | 9.9942E-05 |
| Collagen alpha-1(XVIII) chain | Col18a1 | P39061 | 12822 | 182 | 0.000111428 |
| Collagen alpha-1(I) chain | Col1a1 | P11087 | 12842 | 138 | 0.00011 |
| Collagen alpha-2(I) chain | Col1a2 | Q01149 | 12843 | 130 | 7.26974E-05 |
| Collagen alpha-1(III) chain | Col3a1 | Q8BLW4 | 12825 | 139 | 0.00002 |
| Collagen alpha-1(IV) chain | Col4a1 | P02463 | 12826 | 161 | 0.000072 |
| Collagen alpha-2(IV) chain | Col4a2 | B2RQQ8 | 12827 | 167 | 0.000135013 |
| Collagen alpha-3(IV) chain | Col4a3 | Q9QZS0 | 12828 | 162 | 0.000068 |
| Collagen alpha-4(IV) chain | Col4a4 | Q9QZR9 | 12829 | 164 | 0.00014 |
| Collagen alpha-5(IV) chain | Col4a5 | Q63ZW6 | 12830 | 162 | 4.99715E-05 |
| Collagen alpha-6(IV) chain | Col4a6 | B1AVK5 | 94216 | 164 | 3.65644E-05 |
| Collagen alpha-1(VI) chain | Col6a1 | Q04857 | 12833 | 108 | 0.00018 |
| Collagen alpha-2(VI) chain | Col6a2 | Q02788 | 12834 | 110 | 0.00023 |
| Collagen alpha-3(VI) chain | Col6a3 | D3YWD1 | 1293 | 186 | 1.69584E-05 |
| Collagen alpha-3(VI) chain | Col6a3 | E9PWQ3 | 1293 | 287 | 0.000305845 |
| ECM Glycoproteins | | | | | |
| EMILIN-1 | Emilin1 | Q99K41 | 100952 | 108 | 0.000119261 |
| Fibrillin 1 | Fbn1 | A2AQ53 | 14118 | 312 | 1.45688E-05 |
| Fibrinogen beta chain | Fgb | Q8K0E8 | 110135 | 55 | 0.000109028 |
| Fibronectin 1 | Fn1 | Q3UH17 | 14268 | 250 | 0.00019 |
| Extracellular matrix protein FRAS1 | Fras1 | Q80T14 | 80144 | 442 | 3.2017E-06 |
| Laminin subunit alpha-1 | Lama1 | P19137 | 16772 | 338 | 0.000076 |
| Laminin subunit alpha-3 | Lama3 | E9PUR4 | 3909 | 366 | 5.40672E-05 |
| Laminin subunit alpha-4 | Lama4 | P97927 | 16775 | 202 | 0.000065 |
| Laminin subunit alpha-5 | Lama5 | Q61001 | 16776 | 404 | 0.00018 |
| Laminin subunit beta-1 | Lamb1 | P02469 | 3912 | 343 | 7.21329E-05 |
| Laminin subunit beta-1 | Lamb1 | P02469 | 3912 | 197 | 0.000198283 |
| Laminin subunit beta-2 | Lamb2 | Q61292 | 16779 | 197 | 0.00031 |
| Laminin subunit gamma-1 | Lamc1 | P02468 | 226519 | 177 | 0.00025 |
| Milk Fat Globule-EGF Factor 8 Protein | Mfge8 | P21956 | 17304 | 51 | 0.000143451 |
| Multimerin-2 | Mmrn2 | A6H6E2 | 105450 | 105 | 0.000054 |
| Nidogen-1 | Nid1 | P10493 | 18073 | 137 | 0.000219551 |
| Nidogen-2 | Nid2 | Q3TPN0 | 18074 | 154 | 0.00016728 |
| Netrin-4 | Ntn4 | Q9JI33 | 59277 | 70 | 3.04967E-05 |
| Papilin, proteoglycan-like sulfated glycoprotein | Papln | Q9EPX2 | 170721 | 139 | 3.31318E-05 |
| Peroxidasin homolog (Drosophila) | Pxdn | B2RX13 | 69675 | 165 | 4.77528E-05 |
| Transforming growth factor, beta induced | Tgfbi | A1L353 | 21810 | 75 | 7.34034E-05 |
| Tubulointerstitial nephritis antigen | Tinag | Q91XG7 | 26944 | 54 | 0.000182341 |
| Tubulointerstitial nephritis antigen-like 1 | Tinagl1 | Q99JR5 | 94242 | 53 | 0.000126494 |
| Tenascin | Tnc | Q80YX1 | 21923 | 232 | 0.000071 |
| Vitronectin | Vtn | P29788 | 22370 | 55 | 0.000071 |
| von Willebrand factor A domain-containing protein 8 | Vwa8 | Q8CC88 | 219189 | 213 | 0.000034 |
| von Willebrand factor | Vwf | E9QPU1 | 7450 | 309 | 3.06234E-05 |
| Proteoglycans | | | | | |
| Biglycan | Bgn | P28653 | 12111 | 42 | 5.05412E-05 |
| Basement membrane-specific heparan sulfate proteoglycan core protein | Hspg2 | E9PZ16 | 3339 | 470 | 0.000179386 |
| Prolargin | Prelp | Q9JK53 | 116847 | 43 | 4.15569E-05 |
| ECM-associated | | | | | |
| Serum albumin | Alb | P07724 | 117586 | 69 | 9.14281E-05 |
| Alkaline phosphatase | Alpl | B7XGA6 | 249 | 57 | 0.000171275 |
| Annexin A2 | Anxa2 | P07356 | 12306 | 39 | 0.000189432 |
| Apolipoprotein E | Apoe | P08226 | 11816 | 36 | 5.99634E-05 |
| Complement C3 | C3 | P01027 | 12266 | 186 | 0.00020672 |
| Complement C4-B | C4b | P01029 | 721 | 193 | 9.37691E-05 |
| Clusterin | Clu | Q06890 | 12759 | 52 | 0.000190275 |
| Coatomer subunit alpha | Copa | Q8CIE6 | 12847 | 138 | 1.83364E-05 |
| Ceruloplasmin | Cp | Q61147 | 12870 | 121 | 0.00012082 |
| Deoxyribonuclease-1 | Dnase1 | P49183 | 13419 | 32 | 6.59625E-05 |
| Dipeptidylpeptidase 4 | Dpp4 | A2AS84 | 13482 | 87 | 0.000179756 |
| GLI pathogenesis-related 2 | Glipr2 | Q9CYL5 | 384009 | 17 | 8.32443E-05 |
| Glypican-1 | Gpc1 | Q9QZF2 | 14733 | 61 | 2.92942E-05 |
| Glutathione peroxidase 3 | Gpx3 | P46412 | 14778 | 25 | 0.00032 |
| Gelsolin, isoform CRA_c | Gsn | Q6PAC1 | 227753 | 81 | 0.0001 |
| Heat shock 70kD protein 5 (Glucose-regulated protein) | Hspa5 | P20029 | 14828 | 72 | 0.000152246 |
| Hyaluronidase-2 | Hyal2 | O35632 | 15587 | 54 | 5.81905E-05 |
| Ig mu chain C region secreted form | Igh-6 | P01872 | 16019 | 50 | 7.02805E-05 |
| Inter-alpha trypsin inhibitor, heavy chain 1, isoform CRA_a | Itih1 | Q61702 | 16424 | 101 | 0.000065 |
| Inter-alpha-trypsin inhibitor heavy chain H2 | Itih2 | Q61703 | 16425 | 106 | 1.35766E-05 |
| Inter alpha-trypsin inhibitor, heavy chain 4 | Itih4 | A6X935 | 16427 | 100 | 1.43912E-05 |
| Inter-alpha-trypsin inhibitor heavy chain H5 | Itih5 | Q8BJD1 | 209378 | 107 | 0.000026 |
| Ladinin-1 | Lad1 | P57016 | 16763 | 59 | 6.52528E-05 |
| Meprin A subunit alpha | Mep1a | P28825 | 17287 | 84 | 4.25457E-05 |
| Meprin A subunit beta | Mep1b | Q61847 | 17288 | 80 | 6.64112E-05 |
| Nephronectin | NPNT | D3YTX1 | 255743 | 50 | 7.02805E-05 |
| Ribosomal Protein L13a | Rpl13a | Q5M9M0 | 14256 | 23 | 0.000138182 |
| Serpin Peptidase Inhibitor, Clade H (Heat Shock Protein 47), Member 1 | Serpinh1 | P19324 | 12406 | 47 | 0.000269465 |
| Stratifin | Sfn | O70456 | 55948 | 28 | 0.000139207 |
| Protein-glutamine gamma-glutamyltransferase K | Tgm1 | Q9JLF6 | 21816 | 90 | 2.37197E-05 |
| Protein-glutamine gamma-glutamyltransferase 2 | Tgm2 | P21981 | 21817 | 77 | 0.00016 |
| Uromodulin | Umod | Q91X17 | 22242 | 71 | 0.000025 |

Appendix 2: Visual comparison of data (MS NSC/RNA RFU)

[Figures not reproduced. For each protein listed below, the original shows a pair of bar charts across the four mouse strain/sex groups (FVBM, FVBF, C57M, C57F): RNA array data in relative fluorescence units (RFU) and MS data in normalised spectral counts (NSC).]

Structural Proteins:

Collagens: Col15a1, Col1a1, Col3a1, Col4a6, Col5a2, Col6a5

ECM Glycoproteins: Emilin1, Fras1, Lama5, Mmrn2, Nid2, Pm20d1, Pxdn, Thsd4, Tinag, Vtn

Proteoglycans: Hspg2, Bgn, Hspg2

ECM-associated Proteins: Alb, Alpl, Apoa1, Apoe, Arg1, C9, Cpn2, Ecm1, Glipr2, Itih1, Itih2, Itih4, Lad1, Mbl2, Serpinh1, Serpina3k, Sfn, Umod