Structural changes in phosphorylated triple helical collagen-like peptides and requirements for in vitro enzymatic phosphorylation

Erik Poppleton

Committee: Stephen Fuchs and Barbara Brodsky

Tufts University Department of Biology

Abstract

Recent surveys of post-translational modifications of proteins observed that 19 instances of phosphorylation occur within the human type-1 α1 collagen chain. However, the purpose, localization and timing of these modifications are unknown. In this study, short collagen-like peptides based on known phosphorylation sites were used to explore the structural effects of phosphoserine on collagen stability and to identify requirements for in vitro phosphorylation of serine residues within the collagen chain. Circular dichroism spectroscopy of the peptides revealed that phosphoserine slightly decreases the thermal stability of the triple helix through electrostatic interactions. Kinase incubations showed that the triple-helical structure resists kinase activity, as the peptides were only phosphorylated when denatured prior to incubation. This research has implications for understanding the role of phosphorylation in collagen secretion and activity, and directs the search for the physiological enzyme responsible for collagen phosphorylation.

Introduction

Collagen Structure, Stability and Secretion

Collagen is a key structural molecule and the most abundant protein in the human body. The 28 members of the collagen family are involved in a wide variety of processes including cell mobility and adhesion, extracellular matrix (ECM) stability and immune system function. The defining feature of all collagens is that they contain at least one triple helical domain, in which three linear collagen monomers, each in a left-handed polyproline II (PPII) coil, wind in a right-handed helix with a one-residue stagger between chains (Shoulders and Raines, 2010; Ricard-Blum, 2011). This structure arises from the repetitive Gly-Xaa-Yaa motif, wherein the Xaa and Yaa positions are commonly the imino acids proline and 4-hydroxyproline, respectively, which aid in the development of the dihedral bond angles required for the triple helix to form (Boudko et al, 2002). Hydrogen bonding between the amino group on the glycine and the carbonyl oxygen on the X-position residue along the helix creates a rigid structure with a 20.0-28.6 Å axial repeat (Shoulders and Raines, 2010). The glycine residues are required for a stable helix as their hydrogen atoms pack tightly within the internal portion of the molecule, while the R groups of the X and Y-position residues are presented to the outside, providing a readable sequence that allows for specific interactions with other proteins.

Members of the collagen family are involved in many key extracellular biological processes, and as such are secreted by cells or presented on the surface of cells. Some examples of collagens and their functions include collagen I, the most common fibrillar collagen, which is the major structural component of skin, , and the cornea (Shoulders and Raines, 2010); collagen IV, a network collagen which is the most abundant collagen in the basement membrane and is required for cellular adhesion (LeBleu et al, 2007); and collagen XXVIII, a collagen with frequent disruptions in its triple helical domain which is important in nervous system development (Veit et al, 2006). Once secreted, fibrillar and membrane-forming collagens self-assemble into superstructures. In , fibrils are further strengthened by covalent crosslinking between lysine residues in the telopeptide sequences flanking the triple helical domain and hydroxylysine residues within the helix (Orgel et al, 2005). These fibrils can reach up to 1 cm in length, despite being only 500 nm in diameter (Shoulders and Raines, 2010).

To function as a structural component, collagens must both be structurally competent against physical forces, as well as resistant to chemical degradation and proteases. The molecular structure of the triple helix plays a key role in these factors. As a relatively unique motif among proteins in general, the triple helix is not recognized as a substrate and therefore prevents interaction with proteases and other binding partners that are not specific to collagens. However, there are a variety of binding partners which are specific to collagen and responsible for cell adhesion, collagen structure and collagen remodeling, such as fibronectin and matrix metalloproteinase. In many cases, the recognition sequence within the collagen sequence contains large and charged amino acids, which contribute to a partial localized destabilization and opening of the triple helix, which are separated fully during interactions.

Point mutations in the collagen chain sequence often result in changes in chain strength leading to abnormal openings or rigidity within the triple helix that are implicated in many pathologies (Arnold and Fertala, 2013).

Understanding the structural implications of changes in residue identity is key for the understanding of protein interactions, disease phenotypes and considerations for the development of engineered tissues. The structural influences on helix strength of the 20 common amino acids as well as hydroxyproline are well-characterized in collagen (Persikov et al, 2005), though the effects on protein binding are still being investigated; however, relatively little is known about the structural ramifications of post-translational modifications (PTMs) of amino acids besides hydroxylation of proline residues. The most abundant and well-characterized PTM of collagen is proline and lysine hydroxylation. Proline hydroxylation is required for folding of the triple helix and is carried out by the enzyme prolyl 4-hydroxylase (P4H). P4H is localized within the cisternae of the cell’s rough endoplasmic reticulum at the site of collagen translation. Here, procollagen has not yet nucleated into trimers and is in the unfolded monomer conformation (Olsen et al, 1973). The rate-limiting step of helix self-assembly is the nucleation of strands from single strands to an unstable dimer and the addition of a third strand (Boudko et al, 2002). Hydroxylation of proline residues results in a rigid upward pucker of the proline ring, as opposed to the unconstrained conformation of unmodified proline residues. This enforced ring pucker creates a φ angle of -60° in the Y-position, which is also the angle preferred by the triple helix and therefore promotes nucleation (Vitagliano et al, 2000). Once hydroxylation has occurred, the kinetic barrier to self-assembly is overcome, resulting in the rapid formation of trimers, and the triple helix propagates from the trimerization domains. It should also be noted that once trimerized, procollagen is no longer recognized as a substrate for P4H. When prolyl hydroxylase is inhibited in cells, the collagen is never assembled into the triple helix and is not secreted from the ER (Uitto and Prockop, 1974).

Phosphoproteomics and In Vivo Phosphorylation

Another, less well-characterized PTM that has been observed in collagen is phosphorylation, in which a phosphate group is added to serine, threonine, or tyrosine residues, with serine being the most prevalent in collagen. In general, phosphorylation is one of the most common PTMs, involved in regulation of protein activity through direct or allosteric modification of enzyme active sites. In the case of collagen, phosphate, as a large and highly charged group, has the potential to modify the folding and rigidity of the triple helix through electrostatic repulsion and induce loose regions in sequences that would normally be stable. It also modifies the profile of the R-groups of the Xaa and Yaa positions presented to the outside of the cells involved in protein recognition. In summary, collagen phosphorylation may modify the physical and biological profile of the collagen helix in a non-sequence-dependent manner and would provide a mechanism of regulation for collagen function.

Recent advances in mass spectrometry and computing technologies have led to the burgeoning field of phosphoproteomics, which takes a shotgun approach to categorizing phosphorylated proteins in a cell culture or tissue. Recent phosphoproteome surveys have revealed many sites within collagens; however, these sites have not been studied in detail, as the wide scope of these surveys results in data sets of many tens of thousands of phosphorylated proteins (Lundby et al, 2012; Mertins et al, 2014; Sharma et al, 2014; Yalak and Olsen, 2015; Mertins et al, 2016).

In most cases, the localization of the phosphorylation event and the kinases responsible are currently unknown. In addition, the specific biological purpose of collagen phosphorylation is unclear, as it has been found in secreted collagen fibers during phosphoproteome studies, as well as implicated as a degradation signal for cellular proteases (Olsen et al, 2005; Lundby et al, 2012). Understanding the structural consequences of phosphorylation is a key first step in elucidating the underlying purpose of collagen phosphorylation. As previously mentioned, native collagen sequences require several chaperone proteins for proper folding as the helix is not thermodynamically favored at physiological temperatures, though kinetics prevent unfolding once formed (Ricard-Blum, 2005). Given the potential for phosphate groups to affect the structure of the collagen helix, the efficiency of folding of phosphorylated collagen is an open question.

Slower rates of folding may contribute to improper folding or to protease interaction with the collagen prior to secretion, and therefore to degradation. Collagen folding rates have also been implicated as a potential cause for disease phenotypes, particularly in the inherited bone mineralization disorder osteogenesis imperfecta, which is caused by missense mutations, most often at Gly residues (Arnold and Fertala, 2013). Phosphate may also play a role in binding of other physiological proteins and molecules, though no direct evidence of changes in interactions as a result of phosphorylation has been observed at this point.

This research focuses on type I collagen due to its abundance in vertebrates as well as its simple and well characterized structure and binding domains (Marini et al, 2007; Di Lullo et al, 2001). Within the alpha-1 chain, 19 phosphorylation sites have been identified, four within the telopeptides, three within the propeptides, and 12 within the triple helix. Serine is the most common residue identity at these sites, accounting for 16 out of the 19 reported sites (see Table 1 for complete list). The most commonly phosphorylated triplet is Gly-Ser-Pro, representing 42% of sites. In most cases, the phosphorylation sites are clustered, with two sites within a three-triplet span. Furthermore, these clustered sites are often found in the same phosphoproteomics surveys, implying that they are phosphorylated together. It is also interesting to note that all sites found in rat and chicken tissues are in sequences with a high degree of homology to human sequences, and only one of the four known rat phosphoserine sites is not also found in humans. This may imply evolutionary pressure to maintain these phosphorylation sites; however, phosphoproteomics data from a more diverse group of species would be required to verify this.

Table 1

Residue | Flanking Sequence | Structure | Found in
109 (Chicken; homologous to human residue 271) | KGHRGFSGLDGAK | Triple Helix | Glass and May, 1984
161 (Rat; homologous to human residue 171) | YGYDEKSAGVSVP | N-Telopeptide | Lundby et al, 2014
171 | YGYDEKSTGGISV | N-Telopeptide | Mertins et al, 2014; Olsen et al, 2010; Mertins et al, 2016
172 | GYDEKSTGGISVP | N-Telopeptide | Olsen et al, 2010
173 (Rat; homologous to human residue 184) | PGPMGPSGPRGLP | Triple Helix | Lundby et al, 2014
176 | KSTGGISVPGPMG | N-Telopeptide | Mertins et al, 2014; Olsen et al, 2010; Mertins et al, 2016
260 (Rat; homologous to human residue 271) | KGHRGFSGLDGAK | Triple Helix | Lundby et al, 2014
271 | KGRHGFSGLDGAK | Triple Helix | Mertins et al, 2016; see also Glass and May, 1984
291 | PKGEPGSPGENGA | Triple Helix | Mertins et al, 2016
445 | PGSKGDTGAKGEP | Triple Helix | Mertins et al, 2014
513 | PAGERGSPGPAGP | Triple Helix | Sharma et al, 2014
522 | PAGPKGSPGEAGR | Triple Helix | Sharma et al, 2014
543 | AKGLTGSPGSPGP | Triple Helix | Sharma et al, 2014; Mertins et al, 2014; Olsen et al, 2005
546 | LTGSPGSPGPDGK | Triple Helix | Sharma et al, 2014; Mertins et al, 2014; Olsen et al, 2005
766 | DGVRGLTGPIGPP | Triple Helix | Olsen et al, 2010
776 (Rat; homologous to human residue 787) | KGETGPSGPAGPT | Triple Helix | Lundby et al, 2014
787 | KGESGPSGPAGPT | Triple Helix | Predicted by homology from Lundby et al, 2014
1023 | APGAEGSPGRDGS | Triple Helix | Sharma et al, 2014
1029 | SPGRDGSPGAKGD | Triple Helix | Sharma et al, 2014
1125 | PPGPPGSPGEQGP | Triple Helix | Sharma et al, 2014; Mertins et al, 2014; Mertins et al, 2016
1199 | SAGFDFSFLPQPP | C-Telopeptide | Mertins et al, 2016
1247 | QIENIRSPEGSR | C-Terminal NC | Sharma et al, 2014; Mertins et al, 2014
1271 | CHSDWKSGEYWI | C-Terminal NC | Mertins et al, 2016
1393 | ALLLQGSNEIEI | C-Terminal NC | Mertins et al, 2016

Table 1: The distribution of experimentally observed phosphorylation sites within the alpha-1 chain of type 1 collagen. All sites are from the human sequence unless otherwise noted. Data comes from a variety of tumors, healthy tissues and recombinant expressions.

Kinases recognize highly specific sequences within their substrates. Prediction of kinases targeting a motif is done using sequence homologies to characterized kinase-substrate interaction sites listed in databases. Based on the phosphorylation information contained in the Human Protein Reference Database, GSK-3, ERK1, ERK2, and CDK5 are known to phosphorylate the motif Xaa-Ser-Pro (Amanchy et al, 2007). Unfortunately, none of these kinases are believed to localize to the endoplasmic reticulum, Golgi apparatus, or the extracellular space, implying that the kinase interaction responsible for collagen phosphorylation has not yet been characterized (Bechard and Dalton, 2009; Marchi et al, 2010; Catania et al, 2001).
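The site statistics quoted for Table 1 (16 of 19 sites on serine, Gly-Ser-Pro at 42% of sites) and the Xaa-Ser-Pro motif used for kinase prediction can be checked with a short tally. The sketch below is only an illustration: the dictionary is transcribed from the human rows of Table 1, it assumes the modified residue sits at index 6 of each flanking sequence, and the helper names are hypothetical.

```python
# Human alpha-1(I) phosphorylation sites transcribed from Table 1.
# Assumption: the modified residue is at index 6 of each flanking sequence.
HUMAN_SITES = {
    171: "YGYDEKSTGGISV", 172: "GYDEKSTGGISVP", 176: "KSTGGISVPGPMG",
    271: "KGRHGFSGLDGAK", 291: "PKGEPGSPGENGA", 445: "PGSKGDTGAKGEP",
    513: "PAGERGSPGPAGP", 522: "PAGPKGSPGEAGR", 543: "AKGLTGSPGSPGP",
    546: "LTGSPGSPGPDGK", 766: "DGVRGLTGPIGPP", 787: "KGESGPSGPAGPT",
    1023: "APGAEGSPGRDGS", 1029: "SPGRDGSPGAKGD", 1125: "PPGPPGSPGEQGP",
    1199: "SAGFDFSFLPQPP", 1247: "QIENIRSPEGSR", 1271: "CHSDWKSGEYWI",
    1393: "ALLLQGSNEIEI",
}
CENTER = 6  # index of the phosphorylated residue within the flank


def summarize(sites):
    """Tally residue identity and motif membership for a table of sites."""
    total = len(sites)
    serine = sum(1 for flank in sites.values() if flank[CENTER] == "S")
    # Gly-Ser-Pro: the residue before is Gly, the site is Ser, the next is Pro.
    gsp = sorted(p for p, f in sites.items() if f[CENTER - 1:CENTER + 2] == "GSP")
    # Xaa-Ser-Pro: any residue before, the site is Ser, the next is Pro.
    xsp = sorted(p for p, f in sites.items() if f[CENTER:CENTER + 2] == "SP")
    return {
        "total": total,
        "serine": serine,
        "gsp_sites": gsp,
        "gsp_percent": 100.0 * len(gsp) / total,
        "xsp_sites": xsp,
    }


summary = summarize(HUMAN_SITES)
print(summary)
```

Under these assumptions the only Ser-Pro site that is not also Gly-Ser-Pro is residue 1247 in the C-terminal non-collagenous domain.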

Studies on kinases localized to the Golgi apparatus have recently suggested that Fam20C may play a role in collagen phosphorylation (J. Dixon, 2016, personal communication) due to the number of Ser-Xaa-Glu consensus sequences present within the collagen chain.

Despite the presence of such motifs, three within the triple helix and one in the N-telopeptide, none have been observed to be phosphorylated in vivo. It has been noted, however, that phosphorylated Ser-Xaa-Glu motifs play a role in bone formation through the binding of calcium ions in other extracellular matrix proteins (Liu et al, 2014; Tagliabracci et al, 2012; Yalak and Olsen, 2015). The effect of Fam20C on bone is particularly apparent when it is inactivated in type I collagen-producing cells, which leads to poor development of teeth and bone as a result of porous mineral deposition over disorganized collagen fibrils (Liu et al 2014). Despite this compelling evidence, it is yet to be demonstrated that Fam20C associates with collagen and that these defects do not arise from phosphorylation deficiencies in other collagen-associated proteins that Fam20C is known to phosphorylate, specifically SIBLINGs and caseins (Tagliabracci et al, 2012).

Experimental Rationale

This experiment seeks to determine the structural consequences of phosphate addition to collagen sequences, as well as to determine requirements, both sequential and structural, for the phosphorylation of collagen. The specific sequence chosen for the experiments corresponds to residue Ser546 of the α1 chain of human type-1 collagen. Ser546 was chosen as a representation of the Gly-Ser-Pro motif that is commonly phosphorylated. Preliminary results are also discussed involving residue S271, which explores a phosphorylation site in the Yaa position of the triple helix motif. The different bond angles of the Xaa and Yaa positions may result in different spacing of the phosphate groups, which may affect the potential for electrostatic repulsion between the chains. These two sites are also the only sites within the type 1 collagen sequence that have been verified both by phosphoproteomics and site-specific studies (Glass and May, 1984; Olsen et al, 2005; Sharma et al, 2014; Mertins et al, 2014; Mertins et al, 2016).

Short peptides were chosen as the phosphorylation substrate due to the ease of detecting folding disruptions through circular dichroism spectroscopy. While longer collagens and collagen-like molecules, such as native type I collagen or the bacterial collagen-like protein Scl2, are more physiologically relevant as a model system, phosphates are likely to cause only local folding disruptions which would be difficult to visualize due to the stability of the rest of the chain. Short synthetic peptides, around 30 residues in length, on the other hand, are highly sensitive to changes in even single residues (Persikov et al 2005), and this sensitivity can be quantified through changes in melting temperature and mean residue ellipticity at the triple helix peak around 225 nm. Additionally, the low molecular weight of the peptides allows for detecting phosphorylated peptides through mass spectrometry, which has advantages in terms of simplicity over other methods of detecting kinase activity.
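Mean residue ellipticity normalizes the raw CD signal by concentration, path length and chain length, which is what allows peptides to be compared directly. The sketch below uses the standard conversion from millidegrees to deg cm² dmol⁻¹; the function name and all numerical inputs are illustrative assumptions, not values from this study.

```python
def mean_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert observed ellipticity (mdeg) to MRE in deg cm^2 dmol^-1.

    Standard relation: MRE = theta_mdeg / (10 * c * l * n), with c in mol/L,
    l in cm and n the number of residues (some authors use n - 1 instead).
    """
    if conc_molar <= 0 or path_cm <= 0 or n_residues <= 0:
        raise ValueError("concentration, path length and residue count must be positive")
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)


# Hypothetical reading: 10 mdeg at 225 nm, 100 uM peptide, 1 mm cuvette, 34 residues.
example_mre = mean_residue_ellipticity(10.0, 100e-6, 0.1, 34)
print(round(example_mre, 1))
```

Because the concentration appears in the denominator, any error in the measured peptide concentration carries over proportionally into the reported MRE.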

Understanding of the effect of phosphorylation will improve prediction of protein interactions that may be affected by phosphorylation. In addition, understanding the process of phosphorylation in vitro will illuminate conformational requirements, specifically whether kinases can act on an assembled triple helix, which is important in understanding whether phosphorylation can act as a post-secretion mechanism of modulation regulated by cellular secretion of collagen kinases. Furthermore, structural studies of phosphorylated peptides will provide data on the mechanism of chain modification by phosphate, particularly whether it is electrostatic or steric in nature.

Results

Peptide Design and Synthesis

The peptides used in this study were required to meet three criteria:

1. Contain a phosphorylation site that has been confirmed in multiple studies and matches the consensus sequence for a commercially available enzyme;

2. Have a thermal stability between 30° and 37° C to allow the enzyme assay to be carried out on both the native triple helix and the denatured random coil state. This was predicted using the Collagen Stability Calculator (Persikov et al 2005);

3. Be short enough to be synthetically produced with high yield and purity.

A peptide was successfully synthesized based on sequences surrounding residue Ser546 from the human type I collagen α1 chain, denoted Peptide Ser546. The sequence was first observed during the expression of a recombinant gelatin in yeast by Olsen et al (2005) and later confirmed in phosphoproteomics surveys by Sharma et al (2014) and Mertins et al (2014).

Olsen et al observed that Ser546 and Ser543, both of which are contained within Gly-Ser-Pro triplets, were phosphorylated by an unknown yeast kinase. Through studies on modified sequences, it was shown that Ser546 phosphorylation was required for Ser543 phosphorylation. Due to the dependent nature of Ser543 phosphorylation and to maintain simple stoichiometry of the reaction, the peptides used in this study were synthesized with an alanine at the residue corresponding to Ser543 in the human collagen sequence (S543A). In order to test the effects of collagen structure on kinase activity, sufficient Gly-Pro-Hyp repeats were added to the sequence spanning Gly542 to Asp550 to promote triple-helix formation. The peptide had a predicted triple-helix stability of 35.6° C. Finally, a tyrosine residue was added to the C-terminus to allow for concentration determination through UV absorption.

To produce a fully phosphorylated control peptide for structural characterization and as a standard for enzymatic phosphorylation, Peptide Ser546 was also synthesized with pSer as the residue within the GSP triplet, as a model of phosphorylated collagen; this peptide is identified as Peptide pSer546. Table 2 shows the sequences of the homologous pair of peptides which were synthesized based on residues 542-550.
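The peptide sequences in Table 2 are written in a compact shorthand in which, for example, (GPO)3 stands for three Gly-Pro-Hyp repeats. The sketch below expands that shorthand so the length, the Gly-Xaa-Yaa register and the single substituted position can be inspected; the helper name is hypothetical, and O and J are the one-letter codes for Hyp and pSer used in Table 2.

```python
import re


def expand(shorthand):
    """Expand '(GPO)3'-style repeats into a flat one-letter sequence."""
    return re.sub(
        r"\(([A-Z]+)\)(\d+)",
        lambda m: m.group(1) * int(m.group(2)),
        shorthand,
    )


SER546 = expand("PO(GPO)3GAPGSPGPD(GPO)4GY")
PSER546 = expand("PO(GPO)3GAPGJPGPD(GPO)4GY")

# 0-based positions at which the two peptides differ.
differences = [i for i, (a, b) in enumerate(zip(SER546, PSER546)) if a != b]
# Gly must occupy every third position for the triple helix to pack.
gly_register_ok = all(SER546[i] == "G" for i in range(2, len(SER546), 3))

print(SER546, len(SER546), differences, gly_register_ok)
```

Expanded this way, both peptides are 34 residues long, keep glycine at every third position, and differ only at the residue corresponding to Ser546.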

Table 2

Peptide | Sequence
Ser546 | PO(GPO)3GAPGSPGPD(GPO)4GY
pSer546 | PO(GPO)3GAPGJPGPD(GPO)4GY

Table 2: Sequences of peptides produced to investigate the effects of serine phosphorylation on collagen structure and stability. O is the one-letter abbreviation of 4-hydroxyproline (Hyp) and J is the one-letter abbreviation of phosphoserine (pSer).

Molecular Dynamics

Molecular dynamics simulations of Peptides Ser546 and pSer546 were performed in collaboration with Hongtao Yu and Yu-Shan Lin of the Tufts University Chemistry Department.

The simulations suggested that phosphoserine R-groups were exposed to the outside of the triple helix in a roughly symmetrical format (Figure 1-B). They also suggested that, contrary to initial hypotheses, the phosphate groups have very little effect on the hydrogen bonding profile of the triple helix (Figure 1-C).

Figure 1: Molecular dynamics simulations of peptides Ser546 and pSer546 showing orientation of phosphoserine residues (B). Blue arrows in (C) represent stable interchain H-bonds, implying minimal disruption of the triple helix by phosphoserine. Simulations and figure courtesy of Hongtao Yu, Tufts University Chemistry Department.

Peptide Stability

Both Peptide Ser546 and Peptide pSer546 formed normal triple helices as determined by circular dichroism (CD) spectroscopy at 0° and 25° C. Their spectra fit the normal triple helix profile (Figure 2), with a maximum at 225 nm and a minimum near 198 nm. Peptide pSer546 showed a slightly lower Tm than Peptide Ser546 (34° C compared with 37° C), as determined by measuring the CD signal at 225 nm as the temperature of the solution was raised from 0° to 70° C (Figure 3).
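In this work the Tm is taken as the point halfway between the native and denatured CD signals. One way to extract that midpoint from a melting curve is sketched below, assuming flat baselines estimated from the first and last few points and linear interpolation between readings; the function name and the data are hypothetical, not measurements from this study.

```python
def melting_temperature(temps, signal, n_baseline=3):
    """Temperature at which the signal crosses the native/denatured midpoint.

    Assumes the first and last `n_baseline` points sit on flat native and
    denatured baselines, and interpolates linearly between readings.
    """
    native = sum(signal[:n_baseline]) / n_baseline
    denatured = sum(signal[-n_baseline:]) / n_baseline
    midpoint = (native + denatured) / 2.0
    for i in range(len(temps) - 1):
        s0, s1 = signal[i], signal[i + 1]
        # A sign change (or exact hit) means the midpoint lies in this interval.
        if s0 != s1 and (s0 - midpoint) * (s1 - midpoint) <= 0:
            return temps[i] + (midpoint - s0) * (temps[i + 1] - temps[i]) / (s1 - s0)
    raise ValueError("signal never crosses the midpoint")


# Hypothetical melting curve: MRE at 225 nm versus temperature in degrees C.
temps = [0, 10, 20, 30, 35, 40, 50, 60, 70]
mre = [6000, 6000, 6000, 5200, 3400, 1000, 0, 0, 0]
tm = melting_temperature(temps, mre)
print(round(tm, 2))
```

Real melting curves usually have sloping baselines, in which case a two-state fit would be a more robust choice than this flat-baseline midpoint.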

[Figure 2 plot. Series: Peptide Ser546, Peptide pSer546. x-axis: Wavelength (nm); y-axis: MRE ([Θ], deg cm² dmole⁻¹).]

Figure 2: CD Spectra from 260-190 nm of Peptide Ser546 and Peptide pSer546 carried out at 0° C. The spectra show the characteristic peptide triple helix with a maximum around 225 nm. Spectra recorded at lower concentrations also showed a minimum near 198 nm (data not shown).

[Figure 3 plot. Series: Peptide Ser546, Peptide pSer546. x-axis: Temperature (°C); y-axis: MRE ([Θ], deg cm² dmole⁻¹).]

Figure 3: CD melting of Peptide Ser546 and Peptide pSer546 obtained by measuring the CD signal at the triple helix peak (225 nm) while the temperature was raised from 0° to 70° C. Triple helices undergo a cooperative unfolding process, resulting in a rapid decrease of the triple-helix CD maximum. The Tm is taken to be the point halfway between the native and the denatured CD signals.

Molecular dynamics simulations of the peptides with a phosphate charge of -2 compared with -1 suggested that any destabilizing effect of phosphate groups was due to electrostatic interactions and not steric hindrance. To test this prediction, the melting curves at neutral pH, where phosphate has a -2 charge, were compared with the melting curves of peptides dissolved in 0.1 M acetic acid, where phosphate has a -1 charge. The acidic solution stabilized the phosphorylated peptide and both peptides had a Tm of 37° C (Figure 4).
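The charge states quoted for the two solvent conditions follow from the acid dissociation equilibria of the phosphate monoester. The sketch below estimates the mean phosphate charge with a two-pKa model. The pKa values (1.2 and 5.8), the neutral pH of 7.0, and the weak-acid approximation for the pH of 0.1 M acetic acid (pKa 4.76) are all assumptions made for illustration; they were not measured in this study.

```python
import math


def mean_phosphate_charge(ph, pka1=1.2, pka2=5.8):
    """Average charge of a phosphate monoester from a two-step dissociation model.

    pka1 and pka2 are assumed, illustrative values for phosphoserine.
    """
    h = 10.0 ** (-ph)
    ka1 = 10.0 ** (-pka1)
    ka2 = 10.0 ** (-pka2)
    denom = h * h + ka1 * h + ka1 * ka2
    frac_minus1 = ka1 * h / denom      # singly deprotonated fraction
    frac_minus2 = ka1 * ka2 / denom    # doubly deprotonated fraction
    return -(frac_minus1 + 2.0 * frac_minus2)


def weak_acid_ph(conc_molar, pka):
    """Approximate pH of a weak monoprotic acid: pH = (pKa - log10 C) / 2."""
    return 0.5 * (pka - math.log10(conc_molar))


acid_ph = weak_acid_ph(0.1, 4.76)            # 0.1 M acetic acid
charge_acid = mean_phosphate_charge(acid_ph)
charge_neutral = mean_phosphate_charge(7.0)  # assumed neutral pH
print(round(acid_ph, 2), round(charge_acid, 2), round(charge_neutral, 2))
```

With these assumed constants the phosphate carries close to one negative charge in the acidic solution and close to two at neutral pH, consistent with the charges stated above.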

[Figure 4 plot. Series: Peptide Ser546 (Acid), Peptide pSer546 (Acid), Peptide Ser546 (Neutral), Peptide pSer546 (Neutral).]

Figure 4: Comparison of the thermal stabilities of both the phosphorylated and non-phosphorylated peptides at neutral and acidic pH. The shift in the melting curve of the phosphorylated peptide to match the non-phosphorylated peptide indicates that the charge on the phosphate group was the primary driver of instability.

Refolding of Collagen peptides

Refolding experiments determine the rate at which the trimers are able to re-nucleate and propagate after being denatured at high temperatures. Rates of triple helix refolding are a function of substrate concentration and temperature. Refolding experiments were performed by denaturing the peptide for 30 minutes at 70° C followed by immediate introduction to the CD chamber, which was pre-equilibrated to the experimental temperature. The CD signal was then observed at 225 nm, revealing the recovery of the triple helix structure. Refolding results at 0° C were not reproducible, showing varying rates of refolding between repeated experiments. The lack of reproducibility may be due to difficulties in determining the precise concentration of the solution: since refolding is a 2nd or 3rd order process, small errors in concentration can result in large changes in refolding rate. However, the phosphorylated peptides were consistently slower in refolding than the unphosphorylated peptides. This suggests that the pSer may slow down triple-helix folding. Different concentration measurement techniques or iterative dilutions will need to be employed to obtain more accurate refolding results.
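The sensitivity of a 2nd or 3rd order process to concentration errors can be made concrete with a simple power-law estimate. The sketch below assumes the initial refolding rate scales as concentration raised to the reaction order; this is an idealization for illustration, and the 10% error is a hypothetical figure.

```python
def relative_rate_change(conc_error, order):
    """Fractional change in rate for a fractional concentration error.

    Assumes rate is proportional to concentration ** order.
    """
    return (1.0 + conc_error) ** order - 1.0


# Effect of a hypothetical 10% overestimate of peptide concentration.
change_second_order = relative_rate_change(0.10, 2)
change_third_order = relative_rate_change(0.10, 3)
print(f"{change_second_order:.1%} {change_third_order:.1%}")
```

Under this assumption a 10% concentration error shifts the expected rate by roughly a fifth for a 2nd order process and by about a third for a 3rd order process.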

[Figure 5 plot. Series: Peptide Ser546, Peptide pSer546. x-axis: Time (s); y-axis: MRE ([Θ], deg cm² dmole⁻¹).]

Figure 5: Refolding of the peptides at 0° C obtained by denaturing the peptide for 30 minutes at 70° C, then introducing it to the 0° C CD chamber and measuring the CD signal at 225 nm. Both peptides refold into a triple helix, with peptide pSer546 refolding slower than peptide Ser546.

High temperature refolding at 30°, 34° and 37° C showed that peptide Ser546 regained some structure, though refolding was much slower than at 0° C and the signal stabilized at a much lower MRE. Post-refolding high temperature CD spectra suggested the peptides were not trimerizing, but regaining some polyproline II character, characterized by a similar curve with a maximum around 230 nm (data not shown). These results imply that the peptides remained in monomer conformation over the course of the kinase assays described below.

[Figure 6 plot. Series: 37° C, 34° C, 0° C. x-axis: Time (s); y-axis: MRE ([Θ], deg cm² dmole⁻¹).]

Figure 6: Refolding of peptide Ser546 at varying temperatures using the same method as refolding at 0° C and a spectrum taken at the end of the refolding process at 37° C. These results show that higher temperature incubation prevents significant triple helix formation but not polyproline II helix formation, which appears to fold in a temperature-dependent manner.

Kinase Activity

Since no kinases have yet been identified that are colocalized with collagen and are theorized to phosphorylate the GSP consensus sequences, enzymes that are unlikely to be physiologically relevant but have well-characterized properties and are commercially available were chosen for in vitro phosphorylation studies. For Peptide Ser546, Erk1 was the chosen enzyme. Erk1 is a cellular signaling enzyme involved in the Ras-MAP kinase cascade and is responsible for translocating signal from the cytoplasm to the nucleus. The consensus sequence for Erk1 is PXSP, which matches the sequence PGSP found in residues 544-547 (Amanchy et al 2007).

When the peptide was denatured at 70° C for 30 minutes prior to incubation, Peptide Ser546 was successfully phosphorylated by Erk1. The ATP concentration, time, and other conditions were optimized and efficient phosphorylation was observed at 27 ng/µl Erk1, 200 µM ATP, 24 µM Peptide Ser546 in Promega 1X Kinase Buffer D (40 mM Tris, 20 mM MgCl2, 0.1 mg/ml BSA, 50 µM DTT, 0.25% DMSO) at 34° C for three hours. Matrix-assisted laser desorption/ionization-time of flight (MALDI-ToF) mass spectrometry results showed that over half of the substrate was phosphorylated at maximum yield (Figure 7). All parameters in the incubation mixture had an impact on the amount of phosphorylation. In all conditions tested, maximum phosphorylation was achieved after three hours, and allowing incubation to continue overnight did not significantly change the amount of phosphorylated peptide.
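Phosphorylation adds HPO3 to the peptide, so the product appears in a MALDI spectrum roughly 80 Da above the unmodified peak. The sketch below shows one way such a spectrum could be summarized as an apparent phosphorylated fraction, along with the ATP excess implied by the stated concentrations. The peak list and parent m/z are hypothetical, the helper names are illustrative, and MALDI peak intensities are only semi-quantitative because phosphorylated and unmodified peptides can ionize with different efficiencies.

```python
PHOSPHO_SHIFT = 79.96633  # Da, monoisotopic mass of HPO3 added per phosphate


def fraction_phosphorylated(peaks, parent_mz, tol=0.5):
    """Apparent phosphorylated fraction from (m/z, intensity) pairs.

    Assumes singly charged ions and sums intensity within `tol` of each target.
    """
    def intensity_near(target):
        return sum(inten for mz, inten in peaks if abs(mz - target) <= tol)

    unmodified = intensity_near(parent_mz)
    phospho = intensity_near(parent_mz + PHOSPHO_SHIFT)
    if unmodified + phospho == 0:
        raise ValueError("no peaks found near the expected masses")
    return phospho / (unmodified + phospho)


# Hypothetical spectrum; 3000.0 is a placeholder, not the measured peptide mass.
peaks = [(3000.1, 400.0), (3022.0, 50.0), (3080.0, 600.0)]
apparent_fraction = fraction_phosphorylated(peaks, parent_mz=3000.0)

# Molar excess of ATP over peptide at the stated conditions (200 uM vs 24 uM).
atp_excess = 200.0 / 24.0
print(round(apparent_fraction, 2), round(atp_excess, 1))
```

The roughly eightfold molar excess of ATP over peptide suggests that ATP supply alone is unlikely to be what limits the yield, though that interpretation is an inference rather than a measured result.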

Experiments were carried out to investigate whether the triple-helix state of the peptide could also be phosphorylated. No phosphorylation was observed when the peptide was not denatured prior to the incubation, unless the incubation was carried out at 37° C, implying that the triple helix is resistant to Erk1 interaction.

Figure 7: MALDI readouts for (A) Peptide Ser546, (B) Peptide pSer546, (C) native triple-helical Peptide Ser546 incubated with Erk1, and (D) denatured Peptide Ser546 incubated with Erk1. These results show that the denatured peptide can be phosphorylated by Erk1 while the native peptide is resistant to kinase interaction.

Phosphorylation of Ser in the Yaa position of Gly-Xaa-Yaa

In an early study of collagen peptide phosphorylation, Glass and May (1984) found that the serine at residue 109 in a peptide spanning residues 98-110 in the type I α1 chain of chick collagen, corresponding to Ser271 in the human α1 chain (sequence homology of chick residues 98-110 to human residues 260-272: 92.3%), could be phosphorylated by PKA1. Furthermore, PKA1 has been observed to be secreted extracellularly, raising the possibility that this kinase could interact with collagen molecules in the extracellular space (Wang et al, 2007). This site was verified by a phosphoproteomics survey by Mertins et al (2016). To compare the effects of phosphorylation on the Xaa and Yaa positions, a peptide based on residues 266-273, denoted Peptide Ser271, was designed using the same criteria used for Peptide Ser546 (see Table 3 for sequence). Peptide Ser271 had a predicted Tm of 34.1° C and required two additional Gly-Pro-Hyp repeats surrounding the insert, implying that this likely represents an unstable region in the collagen chain. Because of the numerous differences from the Glass and May sequence, particularly the shorter N-terminal region, and because the longer sequence may have had a negative impact on synthesis, synthesis of a control phosphorylated peptide was deferred until in vitro phosphorylation of Peptide Ser271 was verified.

Table 3

Peptide | Sequence
Ser271 | PO(GPO)4GHRGFSGLO(GPO)5GY

Table 3: The sequence of Peptide Ser271, based on residues 266-273 of the human type I α1 chain. Prediction of thermal stability by the Collagen Stability Calculator suggests that this is an unstable region of the chain, as two additional GPO repeats had to be added to obtain a sequence with a predicted Tm in the desired range.

Peptide S271 also had a normal triple helical structure with a peak at 225 nm. The Tm was found to be 40° C, substantially higher than the Collagen Stability Calculator prediction and outside the ideal range, though it still may not refold at 37° C.

Peptide Ser271 was not phosphorylated to the same extent as Peptide Ser546. PKA1 incubation with denatured peptide produced only small amounts of phosphorylated peptide under the same conditions as successful Erk1 phosphorylation of Peptide Ser546. No phosphorylation of Peptide Ser271 by PKA1 was observed in the triple helical state. It is possible that the lysine at position 265 is required for PKA1 recognition. Lys265 was excluded from the peptides as the Stability Calculator predicted that it would have a significant impact on stability and might not allow a stable triple helix at the incubation temperature. Because the Stability Calculator underestimated the experimentally measured Tm, future research using these model peptides should include Lys265 to verify sequence compatibility and reduce the thermal stability to the desired range.

Testing Sequence Compatibility with Non-Helical Peptides

In their 1984 paper on peptide phosphorylation, Glass and May observed stoichiometric phosphorylation of their synthetic peptide. The sequence used, due to the lack of stabilizing Gly-Pro-Hyp triplets, would never form a triple helix (Collagen Stability Calculator predicted Tm: -41.2° C). In an effort to demonstrate enzyme activity and verify their results of stoichiometric phosphorylation, peptides containing the insert sequence plus an additional N-terminal triplet, with no Gly-Pro-Hyp sequences, were produced and called Short Peptide Ser546 and Short Peptide Ser271. These peptides did not demonstrate more than minimal levels of phosphorylation under any assay conditions. These peptides were unable to form triple helices, even at 0° C, implying that the structure was not responsible for the lack of significant phosphorylation. It is possible that these peptides are too short to interact with the kinases, being only 10 residues each, which is smaller than those used in many other peptide studies.

Table 4

Peptide                 Sequence
Short Peptide Ser546    GLTGSPGSPGPDY
Short Peptide Ser271    GMKGHRGFSGLDY

Table 4: The sequences of the short, non-helical peptides produced to test sequence compatibility with the chosen enzymes. The sequences were designed based on the wild-type sequence of the α1 chain of human type I collagen, and contain one more N-terminal triplet than the triple helical peptides, from which that triplet was omitted due to stability predictions.

Discussion

Implications of structural characteristics of pSer incorporated within collagen chain

Studies of the phosphorylated peptide pSer546 show that phosphorylated collagen is capable of folding into a stable triple helix. This implies that phosphorylated collagen could be secreted from the ER and incorporated into extracellular structures, which lends weight to the hypothesis that collagen phosphorylation may regulate protein interaction or be involved in processes such as bone mineralization.

Though phosphorylated collagen peptides were capable of folding, it should be noted that at physiological pH, phosphorylation lowers the thermal stability of the peptide helix by about 3° C. This could translate into a small destabilization of type I collagen, especially if multiple clustered sites are phosphorylated. As collagen is kinetically, not thermodynamically, stabilized at physiological temperatures (Shoulders and Raines, 2009), this seemingly small change in thermal stability may have an impact in natural systems. It has been noted that mutations which decrease the thermostability of collagen trimers often cause delays in folding, which lead to a buildup of collagen monomers inside the ER (Arnold and Fertala 2013).

This buildup is a significant stressor on the ER, and may lead to increased degradation of collagen, resulting in less secreted to the ECM. Olsen et al (2005), in their identification of Ser546 as a phosphorylated residue, noted that when they replaced the serine residues at 543 and 546 with alanine, there was increased secretion of their product. It should also be recognized, however, that unlike these peptides, type I collagen is a long molecule and that single point modifications are more likely to affect local stability and are unlikely to affect the thermostability of the molecule as a whole unless many phosphates are incorporated.

It has also been observed in disease phenotypes from Osteogenesis Imperfecta (OI) patients that delayed folding can lead to over-modification of collagen molecules, which further changes the structural and interactive profiles of the molecules. In particular, additional lysine residues can be hydroxylated, leading to increased glycosylation and linkage of chains (Raghunath et al 1994). Although there is not yet experimental evidence for it, it is also possible that phosphorylation could beget additional phosphorylation. It took 3 hours for the enzyme assay in this study to reach maximum phosphorylation, whereas native collagen folding occurs within 30 minutes (Raghunath et al 1994). This comparison, however, assumes that the kinases responsible for collagen phosphorylation in vivo behave similarly to Erk1 and PKA1, which is a major assumption given that the enzymes have not yet been identified.

Collagen refolding and the potential windows for incorporation

The most intriguing result of the kinase assay experiments was the resistance of the native triple helix to the incorporation of phosphate groups (Figure 8). This implies that unless there exists a yet-unknown kinase specific for triple helical molecules, collagens must become phosphorylated in the ER immediately upon translation. The existence of a triple-helix-specific kinase is not an impossible suggestion, as triple-helix-specific proteases and binding proteins are known to exist, such as matrix metalloproteinases and fibronectin. However, if phosphorylation occurs prior to folding, the localization of P4H within the cisternae of the ER and the rapid trimerization once a threshold of hydroxylation is attained suggest that collagen phosphorylation must be synchronous with P4H activity.

Figure 8

Figure 8: The results of this study suggest that while collagen can be phosphorylated while denatured, it cannot act as a substrate for kinases in the triple helical state. Therefore, we propose that collagen phosphorylation occurs in the ER prior to triple helix formation. Figure courtesy of Hongtao Yu and Yu-Shan Lin, Tufts University Chemistry Department.

The resistance of the triple helix to kinase activity also implies that Fam20C is not the kinase responsible for the phosphorylation of secreted collagen. Fam20C is localized to the Golgi and may be secreted to the extracellular space (Tagliabracci et al, 2012; Olsen et al, 2012) and therefore never has the opportunity to interact with non-helical collagen. This fits with the dearth of observations of phosphorylated Ser-Xaa-Glu motifs in phosphoproteomics surveys. With so many other sites found in multiple surveys, it is relatively implausible that none of the Ser-Xaa-Glu motifs would be observed if they were indeed being phosphorylated in vivo.

Unlike other examples of phosphorylation, which are usually involved in protein regulation, collagen phosphorylation does not appear to be a transient state. Since the triple helix is able to fold with a phosphate group incorporated, and the triple helix resists interactions with non-specific proteins, phosphatases would likely be unable to remove the phosphate groups. Even if phosphatase action were possible, kinases appear unable to add phosphates back once they have been removed. This paradigm of phosphorylation represents a departure from the usual manner considered in a cellular context: rather than a transient modification, collagen phosphorylation is an intrinsic characteristic of the sequence.

Further Directions of Study

To understand the effects of phosphorylation on collagen structure, the obvious next step is to translate the small-scale results of this study to large collagen molecules. Studies on the structure and folding rates of longer collagen molecules will determine whether phosphorylated serine indeed delays the propagation of the triple helix around phosphorylated residues. This may also play a role in disease phenotypes, as Gly to Ser mutations are among the most common missense mutations implicated in the bone disease Osteogenesis Imperfecta (OI). The potential introduction of rogue phosphorylation sites may further exacerbate the destabilizing effects of serine residues in the Gly position. Other research in this lab has demonstrated sequence homologies between OI mutations and known phosphorylation sites within the triple helical domain of type I collagen (P. Chhum, 2017, unpublished data).

The identity of the responsible kinase is a key unanswered question in the physiological role of collagen phosphorylation. We hypothesize that collagen phosphorylation most likely takes place in the cisternae of the ER prior to propagation of the triple helix. Identification of the kinase would lead to a better understanding of the regulation and therefore the physiological purpose of collagen phosphorylation.

The extracellular effects of collagen phosphorylation also remain to be explored. The size and charge of phosphate groups can affect protein binding, and phosphorylation of other proteins has been shown to play a role in bone mineralization (Tagliabracci et al, 2012; Liu et al, 2014). Collagen phosphorylation has been observed both in tumors (Mertins et al, 2014; Mertins et al, 2016; Sharma et al 2014) and in normal tissues (Lundby et al, 2012; Olsen et al 2010). Many sites, including Ser546, are observed in both normal and diseased tissues, though others are found only in cancer tissues. This may imply that the rapid secretion of collagen observed in cancers allows for less specific phosphorylation of the collagen strands, or that cancer phenotypes upregulate the responsible kinases.

In conclusion, collagen phosphorylation remains a nascent field in collagen research with many unknowns regarding both the development and function of phosphorylated molecules. The research presented here represents an initial step in understanding the structural properties of phosphorylated collagen and the requirements for kinases to target collagen as a substrate.

Materials and Methods

1: Peptide Design and Synthesis

Design of collagen-like peptides was performed as previously discussed. The peptides were synthesized using FastMoc chemistry at the 0.1 mmol scale by the Tufts University Core Facility. Phosphoserine was incorporated in place of serine during the synthesis of Peptide pSer546. The lyophilized peptides were dissolved in phosphate buffered saline (Sigma; 0.01 M phosphate, 0.154 M NaCl, pH 7.4) or 0.1 M acetic acid (pH 2.87) to a concentration of approximately 1 mg/ml. The solution was further diluted to 0.5 mg/ml for CD spectrometry. Concentration was determined by UV absorption at 180 nm on an Aviv UV/Vis Model 14 Spectrophotometer.

2: Structural Characterization

Circular dichroism spectroscopy was performed on an Aviv Model 430 CD Spectrometer to characterize the quality of the triple helix in each peptide. Scans were performed from 260-190 nm at 0° C. Melting curves were obtained by measuring the CD signal at 225 nm while the samples were heated from 0-70° C. Refolding times were determined by denaturing the peptides at 70° C for 30 minutes and then immediately cooling them to 0° C while measuring the CD signal at 225 nm. Additionally, refolding was measured at 30, 34 and 37° C to determine whether the peptides would refold under kinase reaction conditions.
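The text does not state how Tm was extracted from the melting curves. One common approach, sketched below as an assumption rather than as the procedure used in this study, is to normalize the 225 nm signal between the folded and unfolded levels and take the temperature at which the folded fraction crosses one half. The function names and the synthetic curve are illustrative only and are not data from this work.

```python
import math


def fraction_folded(signal, folded_level, unfolded_level):
    """Scale a CD reading so the folded level maps to 1 and the unfolded level to 0."""
    return (signal - unfolded_level) / (folded_level - unfolded_level)


def estimate_tm(temps, signals):
    """Return the temperature at which the folded fraction crosses 0.5.

    Simplifying assumption: the first reading is fully folded and the last
    reading is fully unfolded (flat baselines).
    """
    folded, unfolded = signals[0], signals[-1]
    fractions = [fraction_folded(s, folded, unfolded) for s in signals]
    for i in range(len(temps) - 1):
        f0, f1 = fractions[i], fractions[i + 1]
        if f0 >= 0.5 > f1:
            # Linear interpolation between the two bracketing points.
            return temps[i] + (f0 - 0.5) * (temps[i + 1] - temps[i]) / (f0 - f1)
    raise ValueError("melting curve never crosses the half-folded point")


# Synthetic two-state melting curve sampled every 1 degree from 0 to 70 C.
# Illustrative values only, not measurements from this study.
SYNTHETIC_TM = 40.0
temps = [float(t) for t in range(0, 71)]
signals = [1.0 / (1.0 + math.exp((t - SYNTHETIC_TM) / 2.0)) for t in temps]

tm = estimate_tm(temps, signals)
print(f"Estimated Tm: {tm:.2f} C")
```

With real data, sloping pre- and post-transition baselines would usually be fitted rather than assumed flat, so the midpoint from this sketch should be read as a first estimate.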

3: Kinase Assays

In vitro phosphorylation of the unphosphorylated peptides was attempted using a variety of conditions. The Erk1 assay was modified from the Kinase Enzyme System Protocol (Hsiao, 2004; Promega, 2011). Modifications were made as the original protocol does not prescribe sufficient ATP to induce complete phosphorylation in a small substrate. The most successful phosphorylation incubation was carried out with the following solutions. The substrate solution was prepared using 25% 4x Kinase Buffer D (4x Kinase Buffer A (Promega) + 200 µM DTT + 4% DMSO), 50% 100 µM ATP, and 25% substrate in PBS. The kinase solution was prepared using 8.3% water, 25% 4x Kinase Buffer D (as described above), and 66.7% 100 ng/µl Erk1 (Promega).

The enzyme and substrate solutions were mixed in a 2:3 ratio and incubated for 1 hr, 2 hr, 3 hr, or overnight at room temperature, 30° C, 34° C, or 37° C. To determine the optimal reaction conditions, ATP concentrations were varied from 0 to 200 µM, and Erk1 mass in the reaction volume was varied from 0 to 150 ng. To determine the effect of collagen conformation on kinase activity, phosphorylation of the collagen-like peptides was performed on samples taken directly from 4° C storage as well as on samples denatured at 70° C for 30 minutes. Peptide Ser271 was incubated with activated PKA1 (Promega) under the same parameters as the Erk1 assay above.
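The final concentrations in the mixed reaction follow from the volume fractions given above. The sketch below works through that arithmetic for the recipe described as most successful; the 5 µl total reaction volume is a hypothetical value chosen only to illustrate the Erk1 mass calculation and is not stated in this section.

```python
def final_concentration(stock, fraction_in_solution, solution_parts, total_parts):
    """Concentration after diluting a stock into one solution and then mixing solutions.

    stock: stock concentration (any unit)
    fraction_in_solution: volume fraction of the stock in its own solution
    solution_parts / total_parts: share of the final mixture taken by that solution
    """
    return stock * fraction_in_solution * solution_parts / total_parts


# Enzyme and substrate solutions are mixed 2:3 (from the text).
ENZYME_PARTS = 2
SUBSTRATE_PARTS = 3
TOTAL_PARTS = ENZYME_PARTS + SUBSTRATE_PARTS

# ATP: 100 uM stock at 50% of the substrate solution.
atp_final_um = final_concentration(100.0, 0.50, SUBSTRATE_PARTS, TOTAL_PARTS)

# Erk1: 100 ng/ul stock at 66.7% of the kinase solution.
erk1_final_ng_per_ul = final_concentration(100.0, 0.667, ENZYME_PARTS, TOTAL_PARTS)

# 4x Kinase Buffer D is 25% of both solutions, so both contribute to the mixture.
buffer_final_x = (
    final_concentration(4.0, 0.25, SUBSTRATE_PARTS, TOTAL_PARTS)
    + final_concentration(4.0, 0.25, ENZYME_PARTS, TOTAL_PARTS)
)

# Hypothetical total reaction volume (assumption, not from the text).
ASSUMED_REACTION_VOLUME_UL = 5.0
erk1_mass_ng = erk1_final_ng_per_ul * ASSUMED_REACTION_VOLUME_UL

print(f"ATP: {atp_final_um:.1f} uM")
print(f"Erk1: {erk1_final_ng_per_ul:.2f} ng/ul ({erk1_mass_ng:.1f} ng in {ASSUMED_REACTION_VOLUME_UL} ul)")
print(f"Kinase buffer: {buffer_final_x:.1f}x")
```

Under these fractions the ATP is at 30 µM and the buffer at 1x in the final mixture; the Erk1 mass scales with whatever reaction volume is actually used.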

Presence of phosphorylated peptide was then determined by MALDI-TOF mass spectrometry, using the Bruker Microflex Mass Spectrometer in the Tufts University Department of Chemistry. The reaction mixture was mixed with MALDI matrix (40% water, 10% trifluoroacetic acid, 50% acetonitrile, saturated with sinapinic acid) in a 3:10 ratio, and 2 µl of the peptide-containing matrix was pipetted onto a MALDI-TOF target slide and allowed to dry. Analysis was performed in positive ion mode at 70% amplitude over 200 shots at a rate of 60 shots per second. It was noted that each peptide produced multiple peaks, at the expected mass of the peptide as well as at masses that corresponded to the peptide plus an associated cation with a molecular weight matching that of either sodium or potassium. This is an expected behavior of the MALDI setup when used in positive ion mode (Ashcroft, n.d.). Negative ion mode was not used because phosphate ions from the PBS present in the mixture could become associated with the peptide and give false positives.
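The peak pattern described above can be tabulated ahead of time. The sketch below lists the m/z values expected for a peptide with and without one phosphate group as protonated, sodium and potassium adducts, and assigns an observed peak to the nearest expected one. The mass shifts are standard monoisotopic values; the neutral peptide mass of 3000.0 Da and the 1.0 Da tolerance are placeholders chosen for illustration and are not values from this study.

```python
# Monoisotopic mass shifts in Da (standard values, rounded to 5 decimals).
PROTON = 1.00728            # [M+H]+
SODIUM_ION = 22.98922       # [M+Na]+
POTASSIUM_ION = 38.96316    # [M+K]+
PHOSPHATE_SHIFT = 79.96633  # net gain of HPO3 per phosphorylated serine

ADDUCTS = {"H": PROTON, "Na": SODIUM_ION, "K": POTASSIUM_ION}


def expected_peaks(neutral_mass, max_phosphates=1):
    """Map (number of phosphates, adduct label) to the expected singly charged m/z."""
    peaks = {}
    for n in range(max_phosphates + 1):
        for label, ion_mass in ADDUCTS.items():
            peaks[(n, label)] = neutral_mass + n * PHOSPHATE_SHIFT + ion_mass
    return peaks


def assign_peak(observed_mz, neutral_mass, tolerance=1.0, max_phosphates=1):
    """Return the closest (phosphates, adduct) assignment, or None if outside tolerance."""
    peaks = expected_peaks(neutral_mass, max_phosphates)
    best = min(peaks, key=lambda key: abs(peaks[key] - observed_mz))
    return best if abs(peaks[best] - observed_mz) <= tolerance else None


# Placeholder neutral mass (assumption), not the mass of any peptide in this study.
PLACEHOLDER_MASS = 3000.0
for (n, label), mz in sorted(expected_peaks(PLACEHOLDER_MASS).items()):
    print(f"{n} phosphate(s), [M+{label}]+: {mz:.3f}")
```

A phosphorylated product therefore appears roughly 80 Da above each of the unphosphorylated peaks, which is well separated from the sodium (about 22 Da) and potassium (about 38 Da) adduct spacing.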

Acknowledgments

Special thanks to Yimin Qiu and Panharith Chhum for guidance and help in the laboratory, Hongtao Yu and Yu-Shan Lin from the Tufts University Chemistry Department for molecular dynamics simulation, and Michael Berne at Tufts University Core Facility for peptide synthesis.

References

1. Amanchy, R., Periaswamy, B., Mathivanan, S., Reddy, R., Tattikota, S. G., & Pandey, A. (2007). A curated compendium of phosphorylation motifs. Nature Biotechnology, 25(3), 285–286. https://doi.org/10.1038/nbt0307-285

2. Arnold, W. V., & Fertala, A. (2013). Skeletal diseases caused by mutations that affect collagen structure and function. The International Journal of Biochemistry & Cell Biology, 45(8), 1556–1567. https://doi.org/10.1016/j.biocel.2013.05.017

3. Ashcroft, A. (n.d.). An Introduction to Mass Spectrometry, 1–18. Retrieved from http://www.astbury.leeds.ac.uk/facil/MStut/mstutorial.htm

4. Bechard, M., & Dalton, S. (2009). Subcellular Localization of Glycogen Synthase Kinase 3β Controls Embryonic Stem Cell Self-Renewal. Molecular and Cellular Biology, 29(8), 2092–2104. https://doi.org/10.1128/MCB.01405-08

5. Boudko, S., Frank, S., Kammerer, R. A., Stetefeld, J., Schulthess, T., Landwehr, R., … Engel, J. (2002). Nucleation and propagation of the collagen triple helix in single-chain and trimerized peptides: transition from third to first order kinetics. Journal of Molecular Biology, 317(3), 459–70. https://doi.org/10.1006/jmbi.2002.5439

6. Catania, A., Urban, S., Yan, E., Hao, C., Barron, G., & Allalunis-Turner, J. (2001). Expression and localization of cyclin-dependent kinase 5 in apoptotic human glioma cells. Neuro-Oncology, 3(2), 89–98. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1920606&tool=pmcentrez&rendertype=abstract

7. Di Lullo, G. A., Sweeney, S. M., Körkkö, J., Ala-Kokko, L., & San Antonio, J. D. (2002). Mapping the ligand-binding sites and disease-associated mutations on the most abundant protein in the human, type I collagen. Journal of Biological Chemistry, 277(6), 4223–4231. https://doi.org/10.1074/jbc.M110709200

8. Glass, D. B., & May, J. M. (1984). In vitro phosphorylation of a synthetic collagen peptide by cyclic AMP-dependent protein kinase. Collagen and Related Research, 4(1), 63–74. https://doi.org/10.1016/S0174-173X(84)80029-1

9. Hsiao, K., Zegzouti, H., Vidugiriene, J., & Goueli, S. A. (2004). Kinase Assay, 1–2.

10. Marini, J. C., Forlino, A., Cabral, W. A., Barnes, A. M., San Antonio, J. D., Milgrom, S., Hyland, J. C., Körkkö, J., Prockop, D. J., De Paepe, A., Coucke, P., Symoens, S., Glorieux, F. H., Roughley, P. J., Lund, A. M., Kuurila-Svahn, K., Hartikka, H., Cohn, D. H., Krakow, D., Mottes, M., Schwarze, U., Chen, D., Yang, K., Kuslich, C., Troendle, J., Dalgleish, R., & Byers, P. H. (2007). Consortium for Osteogenesis Imperfecta Mutations in the Helical Domain of Type I Collagen: Regions Rich in Lethal Mutations Align with Collagen Binding Sites for Integrins and Proteoglycans. Human Mutation, 28(3), 209–221. https://doi.org/10.1002/humu

11. Liu, P., Zhang, H., Liu, C., Wang, X., Chen, L., & Qin, C. (2014). Inactivation of FAM20C in cells expressing type I collagen causes periodontal disease in mice. PLoS ONE, 9(12), 1–18. https://doi.org/10.1371/journal.pone.0114396

12. Lundby, A., Secher, A., Lage, K., Nordsborg, N. B., Dmytriyev, A., Lundby, C., & Olsen, J. V. (2012). Quantitative maps of protein phosphorylation sites across 14 different rat organs and tissues. Nature Communications, 3, 876. https://doi.org/10.1038/ncomms1871

13. Matilde Marchi, Riccardo Parra, Mario Costa, and G. M. R. (2010). Localization and Trafficking of Fluorescently Tagged ERK1 and ERK2. In MAP Kinase Signaling Protocols (Vol. 661, pp. 287–301). https://doi.org/10.1007/978-1-60761-795-2

14. Mertins, P., Yang, F., Liu, T., Mani, D. R., Petyuk, V. A., Gillette, M. A., … Carr, S. A. (2014). Ischemia in tumors induces early and sustained phosphorylation changes in stress kinase pathways but does not affect global protein levels. Mol Cell Proteomics, 13(7), 1690–1704. https://doi.org/10.1074/mcp.M113.036392

15. Mertins, P., Mani, D. R., Ruggles, K. V., Gillette, M. A., Clauser, K. R., Wang, P., … Nci, C. (2016). Proteogenomics connects somatic mutations to signalling in breast cancer. Nature, 534(7605), 55–62. https://doi.org/10.1038/nature18003

16. Olsen, B. R., Berg, R. A., Kishida, Y., & Prockop, D. J. (1973). Collagen synthesis: localization of prolyl hydroxylase in tendon cells detected with ferritin-labeled antibodies. Science (New York, N.Y.), 182(4114), 825–827. https://doi.org/10.1126/science.182.4114.825

17. Olsen, D., Jiang, J., Chang, R., Duffy, R., Sakaguchi, M., Leigh, S., … Polarek, J. W. (2005). Expression and characterization of a low molecular weight recombinant human gelatin: Development of a substitute for animal-derived gelatin with superior features. Protein Expression and Purification, 40(2), 346–357. https://doi.org/10.1016/j.pep.2004.11.016

18. Olsen, J. V., Vermeulen, M., Santamaria, A., Kumar, C., Miller, M. L., Jensen, L. J., Gnad, F., Cox, J., Jensen, T. S., Nigg, E. A., Brunak, S., & Mann, M. (2010). Quantitative phosphoproteomics reveals widespread full phosphorylation site occupancy during mitosis. Science Signaling, 3(104), ra3. https://doi.org/10.1126/scisignal.2000475

19. Orgel, J. P., Wess, T. J., & Miller, A. (2000). The in situ conformation and axial location of the intermolecular cross-linked non-helical telopeptides of type I collagen. Structure, 8(2), 137–142. https://doi.org/10.1016/S0969-2126(00)00089-7

20. Persikov, A. V., Ramshaw, J. A. M., & Brodsky, B. (2005). Prediction of collagen stability from amino acid sequence. Journal of Biological Chemistry, 280(19), 19343–19349. https://doi.org/10.1074/jbc.M501657200

21. Uitto, J., & Prockop, D. J. (1974). Intracellular Hydroxylation of Non-Helical Protocollagen to Form Triple-Helical Procollagen and Subsequent Secretion of the Molecule, 230, 221–230.

22. Promega. (2011). Kinase Enzyme System Protocol. Protocols & Applications Guide, 1–7.

23. Promega. (2009). ADP-Glo™ Kinase Assay. Luminescence, 1–21.

24. Raghunath, M., Bruckner, P., & Steinmann, B. (1994). Delayed Triple Helix Formation of Mutant Collagen from Patient with Osteogenesis Imperfecta. Journal of Molecular Biology. https://doi.org/10.1006/jmbi.1994.1199

25. Ricard-Blum, S. (2011). The Collagen Family. Cold Spring Harbor Perspectives in Biology, 3(1), 1–19. https://doi.org/10.1101/cshperspect.a004978

26. Sharma, K., D’Souza, R. C. J., Tyanova, S., Schaab, C., Wiśniewski, J., Cox, J., & Mann, M. (2014). Ultradeep Human Phosphoproteome Reveals a Distinct Regulatory Nature of Tyr and Ser/Thr-Based Signaling. Cell Reports, 8(5), 1583–1594. https://doi.org/10.1016/j.celrep.2014.07.036

27. Shoulders, M. D., & Raines, R. T. (2010). Collagen Structure and Stability. Annu Rev Biochem, 78, 929–958. https://doi.org/10.1146/annurev.biochem.77.032207.120833

28. Tagliabracci, V. S., Engel, J. L., Wen, J., Wiley, S. E., Carolyn, A., Kinch, L. N., … Dixon, J. E. (2012). Secreted Kinase Phosphorylates Extracellular Proteins That Regulate Biomineralization. Science, 336(6085), 1150–1153. https://doi.org/10.1126/science.1217817

29. Veit, G., Kobbe, B., Keene, D. R., Paulsson, M., Koch, M., & Wagener, R. (2006). Collagen XXVIII, a novel von Willebrand factor A domain-containing protein with many imperfections in the collagenous domain. Journal of Biological Chemistry, 281(6), 3494–3504. https://doi.org/10.1074/jbc.M509333200

30. Vitagliano, L., Berisio, R., Mazzarella, L., & Zagari, A. (2001). Structural bases of collagen stabilization induced by proline hydroxylation. Biopolymers, 58(5), 459–464. https://doi.org/10.1002/1097-0282(20010415)58:5<459::AID-BIP1021>3.0.CO;2-V

31. Wang, H., Li, M., Lin, W., Wang, W., Zhang, Z., Rayburn, E. R., … Zhang, R. (2007). Extracellular activity of cyclic AMP-dependent protein kinase as a biomarker for human cancer detection: distribution characteristics in a normal population and cancer patients. Cancer Epidemiology, Biomarkers & Prevention, 16(4), 789–795. https://doi.org/10.1158/1055-9965.EPI-06-0367

32. Yalak, G., & Olsen, B. R. (2015). Proteomic database mining opens up avenues utilizing extracellular protein phosphorylation for novel therapeutic applications. Journal of Translational Medicine, 13(1), 125. https://doi.org/10.1186/s12967-015-0482-4