Multi-protease analysis of Pleistocene bone proteomes

Lanigan, Liam T; Mackie, Meaghan; Feine, Susanne; Hublin, Jean-Jacques; Schmitz, Ralf W; Wilcke, Arndt; Collins, Matthew J; Cappellini, Enrico; Olsen, Jesper V.; Taurozzi, Alberto J; Welker, Frido

Published in: Journal of Proteomics

DOI: 10.1016/j.jprot.2020.103889

Publication date: 2020

Document version Publisher's PDF, also known as Version of record

Document license: CC BY-NC-ND

Citation for published version (APA): Lanigan, L. T., Mackie, M., Feine, S., Hublin, J-J., Schmitz, R. W., Wilcke, A., Collins, M. J., Cappellini, E., Olsen, J. V., Taurozzi, A. J., & Welker, F. (2020). Multi-protease analysis of Pleistocene bone proteomes. Journal of Proteomics, 228, [103889]. https://doi.org/10.1016/j.jprot.2020.103889

Download date: 26. sep.. 2021 Journal of Proteomics 228 (2020) 103889

Contents lists available at ScienceDirect

Journal of Proteomics

journal homepage: www.elsevier.com/locate/jprot

Multi-protease analysis of Pleistocene bone proteomes T Liam T. Lanigana, Meaghan Mackiea,b, Susanne Feinec, Jean-Jacques Hublind,e, Ralf W. Schmitzc, ⁎ Arndt Wilckef, Matthew J. Collinsa,g, Enrico Cappellinia, Jesper V. Olsenb, Alberto J. Taurozzia, , ⁎ Frido Welkera,d, a Evolutionary Genomics Section, Globe Institute, University of Copenhagen, Copenhagen, Denmark b Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark c LVR-LandesMuseum, Bonn, d Department of Human Evolution, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany e Collège de , Paris, France f Fraunhofer Institute for Cell Therapy and Immunology, Leipzig, Germany g Department of Archaeology, University of Cambridge, Cambridge,

ABSTRACT

Ancient protein analysis is providing new insights into the evolutionary relationships between hominin fossils across the Pleistocene. Protein identification commonly relies on the proteolysis of a protein extract using a single protease, trypsin. As with modern proteome studies, alternative or additional proteases have the potential to increase both proteome size and protein sequence recovery. This could enhance the recovery of phylogenetic information from ancient proteomes. Here we identify 18 novel hominin bone specimens from the Kleine Feldhofer Grotte using MALDI-TOF MS peptide mass fingerprinting of collagen type I. Next, we use one of these hominin bone specimens and three Late Pleistocene Equidae specimens identified in a similar manner and present a comparison of the bone proteome size and protein sequence recovery obtained after using nanoLC-MS/MS and parallel proteolysis using six different proteases, including trypsin. We observe that the majority of the preserved bone proteome is inaccessible to trypsin. We also observe that for proteins recovered consistently across several proteases, protein sequence coverage can be increased significantly by combining peptide identifications from two or more proteases. Our results thereby demonstrate that the proteolysis of Pleistocene proteomes by several proteases has clear advantages when addressing evolutionary questions in palaeoproteomics. Significance: Maximizing proteome and protein sequence recovery of ancient skeletal proteomes is important when analyzing unique hominin fossils. As with modern proteome studies, palaeoproteomic analysis of Pleistocene bone and dentine samples has almost exclusively used trypsin as its only protease, despite the demon- strated advantages of alternative proteases to increase proteome recovery in modern proteome studies. We demonstrate that Pleistocene bone proteomes can be significantly expanded by using additional proteases beside trypsin, and that this also improves sequence coverage of individual proteins. The use of several alternative proteases beside trypsin therefore has major benefits to maximize the phylogenetic information retrieved from ancient skeletal proteomes.

1. Introduction and Y are frequently proline and hydroxyproline, respectively [12]. The maintenance of these G-X-Y repeats is required for proper collagen fibril The study of ancient proteins, or palaeoproteomics, has recently formation [12,13]. Despite these structural and functional constraints seen its first applications to human origin studies. These range from the on the COL1 triple helical regions, COL1 is a phylogenetically in- identification of faunal complexes associated with hominin occupancy formative biomolecule that has been used in a number of studies on through MALDI-TOF MS peptide mass fingerprinting of collagen type I, extinct mammals [14–16]. However, its utility to disentangle the phy- to the study of hominin phylogeny based on entire ancient skeletal logenetic relationships between closely related species or populations is proteomes [1–9]. Such palaeoproteomic studies primarily analyze the limited due to its slow rate of evolution, a consequence of the restric- proteins preserved in the skeletal tissues (bone, dentine, enamel) from tions on sequence variability within G-X-Y repeats. hominin fossils directly. In addition to collagens, the bone and dentine proteomes also Bone and dentine proteomes are dominated by collagen type I contain a large number of non-collagenous proteins (NCPs; [10,17,18]). (COL1), which comprises roughly 90% of the total organic content of As an assemblage the NCPs have a heterogenous origin, including bone. This dominance of COL1 is maintained in Pleistocene bone pro- proteins deriving from the bone extracellular matrix, cellular proteins teomes [10,11]. The COL1A1 and COL1A2 chains that compose the and plasma proteins. NCPs are commonly identified by fewer peptides mature COL1 protein are largely composed of G-X-Y repeats, in which X when compared to COL1, and hence have lower sequence coverage.

⁎ Corresponding authors at: Evolutionary Genomics Section, Globe Institute, University of Copenhagen, Copenhagen, Denmark. E-mail addresses: [email protected] (A.J. Taurozzi), [email protected] (F. Welker). https://doi.org/10.1016/j.jprot.2020.103889 Received 28 April 2020; Received in revised form 8 June 2020; Accepted 25 June 2020 Available online 09 July 2020 1874-3919/ © 2020 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/BY-NC-ND/4.0/). L.T. Lanigan, et al. Journal of Proteomics 228 (2020) 103889

However, they are more numerous in absolute protein numbers, num- 1990 (excavations led by F. Lévêque). No recent radiocarbon ages are bering several hundreds to a few thousand different proteins in fresh available for the site or its Châtelperronian layers, but the specimen can tissues [19]. Some of these NCPs are also less evolutionarily conserved be expected to date between 40 and 50 ka BP [28]. Equid2 represents a when compared to COL1 [20]. Therefore, theoretically, NCPs provide non-diagnostic bone specimen from the Châtelperronian layer 6 at La access to a larger, more diverse set of phylogenetic information. Ferrassie, France, recovered during recent excavations (directed by As with most modern proteome studies, palaeoproteomic studies of Alain Turq). It is identified as Equidae as part of ongoing COL1 peptide bone proteomes almost exclusively use trypsin as the single digestive mass fingerprinting of the bone assemblage deriving from layer 6. This protease [21]. It has recently been demonstrated that the use of addi- layer is dated to approximately 42 ± 3 ka BP [29, 30]. Finally, Equid3 tional proteases, or different combinations of proteases, has the po- and the Homo sp. specimens derive from sediment attributed to the tential to access different “extractomes” of the same proteome, thereby Kleine Feldhofer Grotte, Germany. They were identified as part of a significantly increasing both the total number of proteins identified and larger COL1 peptide mass fingerprinting study of 76 bone and dentine the combined protein sequence coverage obtained [22–24]. Therefore, specimens that could represent potential hominin bone specimens (re- the use of multiple proteases could have major benefits to the phylo- ported below). The deposits from the Kleine Feldhofer Grotte were in- genetic study of ancient proteomes. Alternative proteolysis strategies itially removed during August 1856, resulting in the discovery and have so far been explored to a limited extent in palaeoproteomics. description of Homo neanderthalensis [31–33]. Archival research and Examples include, among others, the cleavage of collagen with col- archaeological excavations led by R.W. Schmitz and J. Thissen resulted lagenese prior to trypsin proteolysis [11], parallel proteolysis with in the rediscovery of these sediments in 1997, and subsequent ex- elastase and trypsin [14,25], and consecutive proteolysis using Lys-C cavations in 1997 and 2000 also provided additional faunal bone and trypsin [26]. Despite their potential for improving protein identi- fragments, bone fragments, and Palaeolithic archaeology fications and/or sequence coverage in palaeoproteomic studies, a sys- [34,35]. These specimens are dated to around 40 ka BP [35]. tematic study of the effects of alternative or additional proteases has so far not been formally investigated. 2. Methods Here, our primary aim is to explore the use of five alternative pro- teases in addition to trypsin applied to four Late Pleistocene bone 2.1. Protein extraction and proteolysis for MALDI-TOF MS samples. We selected these four bone specimens (three horses and one hominin) randomly after MALDI-TOF MS peptide mass fingerprinting A total of 76 specimens (75 bone, 1 dentine) recovered from the conducted at three different Late Pleistocene archaeological sites lo- backdirt excavation from the Kleine Feldhofer Grotte sediments [35] cated in western Europe (Quinçay, , the Kleine Feldhofer were analyzed using collagen fingerprinting via MALDI-TOF MS ana- Grotte; Fig. 1). Equid1 was identified as part of previous MALDI-TOF lysis (“ZooMS”)[36]. All bone specimens could either not be identified MS COL1 peptide mass fingerprinting of morphologically unidentifiable morphologically or were classified as potential hominins. Bone spe- bone specimens from Quinçay, France [27]. It derives from the Châ- cimen maximum dimensions were in the range of 15 to 61 mm. Protein telperronian layer Ej at the site, and was recovered between 1968 and extraction proceeded in a two-step approach. First, soluble collagen was extracted using an ammonium-bicarbonate (Ambic; 50 mM) protocol without demineralizing the bone matrix [37]. Bone samples that failed to provide a conclusive taxonomic assignment (n = 15) or that were identified as potential hominins (n = 17) using the ammonium-bi- carbonate protocol were subsequently demineralized in hydrochloric acid (0.6 M; [37]). In both protocols, solubilized proteins were digested overnight using trypsin (Thermo Scientific), acidified using an aqueous 5% v/v trifluoroacetic acid (TFA) solution, and purified using C18 ZipTips (Thermo Scientific) according to the manufacturer's instruction. Samples were eluted using an aqueous solution comprising 0.5% v/v TFA and 50% v/v acetonitrile. One negative extraction control was processed alongside each extraction protocol to monitor the presence of protein contamination in the laboratory environment. COL1 peptides were not observed in these negative extraction controls. We note that peptide mass fingerprinting using MALDI-TOF MS is entirely based around trypsin digestion. In our study we maintained this approach and decided to not explore the use of alternative proteases for peptide mass fingerprinting purposes.

2.2. MALDI-TOF MS and spectra identification

1 μl of the eluted peptide mixture was spotted on a 384 stainless steel MALDI Target Plate together with 1 μl of CHCA matrix. Each sample was spotted in triplicate. The plate was air-dried and analyzed on a Autoflex LRF MALDI-TOF MS (Bruker) in reflector mode, positive polarity, matrix suppression of 590Da, and collected in the mass-to- – Fig. 1. Location of archaeological sites and specimens included in this study. charge range 800 4000m/z. After spectra acquisition, the triplicates of Horse specimens from Quinçay (Equid1) and La Ferrassie (Equid2) derive from each sample were merged and analyzed manually in mMass [38]. the Châtelperronian layers (layer Ej and layer 6, respectively). Equid3 and the Spectral taxonomic identification proceeded via comparison against Homo sp. bone specimen derive from the re-excavated archaeological deposits reference peptide markers that cover most medium- and larger mammal associated with the Neanderthal type specimen from the Kleine Feldhofer species in existence of the Late Pleistocene of western Europe [36,37]. Grotte (Germany). All specimens likely date to between 40 and 50 ka BP. The Glutamine deamidation was calculated following Wilson et al. [39] for base map was generated using public domain data from http://www. the peptides P1105 and P1706. Both peptides contain a single gluta- naturalearthdata.com/. mine (Q) and no asparagines (N).

2 L.T. Lanigan, et al. Journal of Proteomics 228 (2020) 103889

An equid (Equid3, ZooMS specimen number F-35) and a hominin other samples and proteases were suspended in Ambic. A summary of specimen (ZooMS specimen number F-15) were selected randomly from extraction and proteolysis conditions can be found in Table S1. All the identified specimens. Here, the only criterion for selection was the samples were then immobilised on in-house made C18 StageTips in presence of a MALDI-TOF MS spectrum allowing taxonomic identifi- triplicate according to the parameters from [41], and sent for LC-MS/ cation. We added two Equidae specimens identified through peptide MS analysis at the Novo Nordisk Center for Protein Research. mass fingerprinting screening ongoing at La Ferrassie (Equid2, ZooMS specimen number LF-59) and Quinçay (Equid1, ZooMS specimen 2.4. LC-MS/MS number Q-12497, [27]). The protein extraction, MALDI-TOF MS pro- cedure, and spectral identifications were identical for these two speci- Samples were eluted from StageTips using 30 μl 40% ACN 0.1% mens and the two Kleine Feldhofer Grotte specimens. formic acid (FA) into a 96 well MS plate. Samples were adjusted to contain approximately 2 μg protein, based on concentrations achieved 2.3. Protein extraction and proteolysis for LC-MS/MS by Nanodrop (Thermo, Wilmington, DE, USA) measurement at A205 nm, and all samples were brought to the same volume using 40% ACN Protein extraction was based on existing protocols for palaeopro- 0.1% FA. Samples were then placed in a vacuum centrifuge at 40 °C teomics [40], modified to ensure compatibility between the six dif- until approximately 3 μl of solution was left and then rehydrated with ferent proteases. For each of the four selected bone specimens, around 10 μl of 0.1% TFA and 5% ACN solution. 60 mg of bone was weighed out and placed in a 1.5 ml microcentrifuge 5 μl of each sample was then separated on an in-house made 15 cm tube. To reduce contamination, 300 μl of 50 mM Ambic (pH 8) was column (75 μm inner diameter) packed with 1.9 μm C18 beads (Dr. added and incubated at 25 °C for 2.5 h, with 20 s of mixing every Maisch, Germany) on an EASY-nLC 1200 (Proxeon, Odense, Denmark). 20 min. A negative extraction control was generated for each proteo- The column was connected to a Q-Exactive HF-X orbitrap mass spec- lysis condition. Following this, the supernatant was removed, and re- trometer (Thermo Scientific, Bremen, Germany) on a 77 min gradient. placed with 1800 μl of 0.5 M EDTA. The samples and negative controls Buffer A was milliQ water, and the peptides were separated with in- were then rotated end-over-end until demineralisation was complete creasing buffer B (80% ACN and 0.1% formic acid), going from 5% to (1–3 days). They were then incubated for 30 min at 60 °C to ensure that 30% in 50 min, 30% to 45% in 10 min, 45% to 80% in 2 min, held at the proteins were solubilized. 80% for 5 min before dropping back down to 5% in 5 min and held for The samples were centrifuged at 10,000 x g for 10 min, and the 5 min. Flow rate was 250 nl/min. The column temperature was kept at EDTA removed and split between four Amicon Ultra 3 kDa cut-offfilters 40 °C with an integrated oven. (Sigma-Aldrich, Denmark). The filters were prepared with a pre-rinse of The Q-Exactive HF-X was operated in data dependent top 10 mode. 0.1 M NaOH suspended in molecular biology grade water (Accugene), Spray voltage was 3.7 kV, S-lens RF level was 40, and capillary was with a subsequent rinse of molecular biology grade water. The sample heated at 320 °C. Full scan mass spectra were recorded at a resolution of was passed through the filters by centrifugation at 12,000 x g for 120,000 at m/z 200 over the m/z range 350–1400 with a target value of 20–30 min. A buffer exchange was performed by washing each filter 3e6 and a maximum injection time of 25 ms. MS/MS HCD-generated with 400 μl of 50 mM Ambic (pH 8) three times. After buffer exchange product ions were recorded with a maximum ion injection time set to the upper fraction was eluted from each of the four filters for each 108 ms and a target value set to 2e5 and recorded at a resolution of sample and was then combined into a single collection tube for each 60,000. Normalized collision energy was set at 28% and the isolation sample. This was performed by sequentially removing and inverting window was 1.2 m/z with the dynamic exclusion set to 20 s. each of the four Amicon filters and placing them in the collection tube A wash-blank method using 0.1% TFA and 5% ACN solution was followed by centrifugation at 1000 x g for 3 min. The same procedure run in between each sample to reduce cross-contamination. was followed for the negative controls. The protein concentration was determined by BCA assay (Pierce Biotechnology, Rockford, IL, USA) 2.5. MaxQuant database searching according to the manufacturer's instructions. Pellets left over from demineralization were resuspended in 100 μl Raw mass spectrometry data were searched in MaxQuant (version of 50 mM Tris and incubated at 60 °C for 30 min, with 10 s mixing using 1.6.3.4) against the horse (UP000002281) and human (UP000005640) a vortex every 10 min. Once the incubation was completed, the protein proteomes (downloaded from UniProt on 07/07/2018 and 11/09/ concentration of the supernatant was determined by BCA assay ac- 2018, respectively). Sequences of common protein contaminants were cording to the manufacturer's instructions. Results from this analysis added to the database search in MaxQuant. For the Equidae, searches showed comparable protein concentrations; therefore the filtered EDTA were performed separately for each protease. For the hominin, a single and Tris fractions were combined. search was performed including all data files, grouped per protease. A 0.5 M stock solution of TCEP (Sigma Aldrich) was added to each Injection replicates were kept separate during MQ data analysis. Each sample to reach a final concentration of 10 mM and incubated at 56 °C search was performed twice, once using semi-specific proteolysis mode for 30 min. Subsequently, a 0.5 M stock solution of CAA (Sigma and once using specific proteolysis mode. For each search, default set- Aldrich) was added to reach a final concentration of 20 mM, and in- tings of 20 ppm for the first search and 4.5 ppm for the final search cubated in the dark at room temperature for another 30 min. The su- were used, and a fragment mass tolerance of 20 ppm. Protease-specific pernatant of each sample and the extraction controls were then ali- cleavage settings can be found in Table S1. As ancient proteomes are quoted into six equal fractions (ranging from 23 μlto83μl) to commonly highly deamidated, we modified Glu-C specificity to also correspond with the six protease conditions. The divided samples were include C-terminal cleavage after glutamine (Q), alongside C-terminal then heated to 60 °C for 30 min to promote solubilisation of the pro- cleavage of glutamic acid (E) and aspartic acid (D). This strategy teins. worked, as approximately one quarter of the Glu-C-specific peptides All proteases were added in a protease:protein ratio of 1:50, and derived from C-terminal cleavage of glutamines (Fig. S1; [42]). As proteolysis was performed at 37 °C overnight, except for chymotrypsin variable post-translational modifications (PTMs), we selected oxidation which was incubated at 25 °C. Zinc sulphate (ZnSO4) was added to Lys- (M), deamidation (N and Q), proline hydroxylation (P), and carbami- N samples to a final concentration of 10 mM. Elastase, chymotrypsin, domethylation (C). No fixed modifications were selected. Peptide and pepsin were vacuum concentrated (Eppendorf, Denmark) to dry- matches were accepted with a minimum score of 40. Peptide and pro- ness and resuspended in 50 mM Tris-HCl at a pH of 8.5–9.5; 100 mM tein FDR were both set to 1.0%.

Tris-HCl and 10 mM calcium-chloride (CaCl2) at pH 8.0; and 50 mM of Subsequently, during data analysis post-MQ protein groups were Ambic (NH4HCO3) adjusted to a pH of one using HCl, respectively. All accepted as confidently identified for each sample only when at least

3 L.T. Lanigan, et al. Journal of Proteomics 228 (2020) 103889

Fig. 2. New hominin bone specimens from the Kleine Feldhofer Grotte identified through MALDI-TOF MS. a, F-2. b, F-3. c, F-4. d, F-6. e + f, F-25. g, F-15. This bone was selected for subsequent LC-MS/MS analysis. h, F-26. i, F-34, j, F-38. k, F-40. l, F-42. m, F-51. n, F-53, o, F-54. p, F-55. q, F-63. r, F-72. s, F-76. Photos by J. Vogel (LVR-LandesMuseum Bonn). two replicates contained one unique+razor peptide, or when at least protein groups together using 1000 bootstraps [43]. Presented Venn one replicate contained two or more unique+razor peptides. The diagrams were created using http://bioinformatics.psb.ugent.be/ UniProt accession numbers of protein groups considered possible con- webtools/Venn/. The raw proteomic data files as well as MaxQuant taminants and those present with at least two unique+razor peptides search results are available in ProteomeXchange under accession across all negative controls can be found in Table S2. These, alongside number PXD018264 [44]. reverse protein entries, were removed from subsequent analysis. Sample-level deamidation rates were calculated for all remaining

4 L.T. Lanigan, et al. Journal of Proteomics 228 (2020) 103889

Fig. 3. Comparison of the number of proteins identified for six different proteases. a, proteome sizes after semi-specific searches. b, proteome sizes after specific searches. Note differences in y-axis scale. NCPs = Non-collagenous proteins. Chymo = Chymotrypsin.

3. Results and discussion 3.2. Proteolysis with alternative proteases allows access to larger bone- proteomes 3.1. Collagen fingerprinting identifies novel hominin bone specimens A homogenized bone sample from each specimen was digested Collagen fingerprinting of 75 bone specimens and 1 dental specimen using six proteases separately. Each digest was injected in triplicate from the Kleine Feldhofer Grotte sediments identified 18 novel hominin (technical replicates). Peptide and proteome identification was then bone specimens (Fig. 2; Table S3). The newly identified hominin bone performed twice on the resulting datasets, first via semi-specific pro- specimens do not directly refit to any of the two Neanderthal in- tease settings and a second time using specific protease settings. Overall dividuals known from the Kleine Feldhofer Grotte hominin assemblage proteome deamidation indicates greater deamidation for asparagine (N) [35], but might represent parts of the missing skeletal elements from compared to glutamine (Q; Fig. S3, S4), and above that observed for our the Neanderthal type specimen or of the additional female Neanderthal. negative extraction controls (Fig. S5). Multiple t-testing for a difference In addition, specimens belonging to 13 additional taxa were also in deamidation for each extraction control-sample pair supports this identified. These include extinct megafauna such as Elephantidae conclusion (Table S4). Deamidation is therefore, given differences in (likely Mammuthus sp.) and Rhinocerotidae, and a wider variety of protein and peptide numbers and proteome composition, relatively medium- and large-sized herbivore specimens. A few specimens could comparable across proteases for each bone specimen and suggests the be identified as carnivores. Extinct megafauna and the identified ho- recovery of endogenous, ancient bone proteomes from all four bone minin specimens have slightly more advanced deamidation signatures specimens. compared to some, but not all, herbivores (Fig. S2) suggesting that the We observe a significant difference (p < .000001) in extractome analyzed assemblage is potentially composed of more recent and Late sizes between specific protease settings and semi-specific protease set- Pleistocene bone specimens. tings (means of 111.3 ± 66.5 proteins and 19.3 ± 7.4 proteins, re- Extraction controls produced MALDI-TOF MS spectra devoid of spectively; Fig. 3; Table S5). For semi-specific searches, trypsin pro- collagen peptides, indicating that contamination by collagen did not teolysis resulted in larger extractomes than any other protease tested occur in the laboratory environment. In addition, ammonium-bicarbo- (mean of 33 proteins for trypsin and 17 proteins for the other five nate and acid extractions from the hominin bone specimens produced proteases). In contrast, chymotrypsin (mean = 178 proteins) and Glu-C identical taxonomic identifications. These results add to a growing (mean = 177 proteins) yielded extractomes larger than with trypsin number of studies that demonstrate the utility of proteomic screening of proteolysis (mean = 131 proteins) in specific searches. Regardless of potential hominin and morphologically unidentifiable specimens in identification approach, proteolysis with elastase (mean = 51 pro- Pleistocene contexts [6,37,45]. teins), pepsin (mean = 114 proteins), and particularly Lys-N

5 L.T. Lanigan, et al. Journal of Proteomics 228 (2020) 103889

Fig. 4. Proteome overlap between protease digests. Venn diagrams of the number of protein groups shared and uniquely identified by each protease digest per sample. a, Equid1. b, Equid2. c, Equid3. d, Homo sp. The total number of protein groups included in each Venn diagram is given in the top-left corner of each panel. Lys-N was excluded due to small proteome sizes (see Figs. 3 and 5). A significant portion of the bone proteome (65–79%) is not identified via trypsin. Chymo = Chymotrypsin. Data from specific search only.

(mean = 16 proteins) resulted in smaller extractomes. The majority of identified by a single protease. This observation is true for all four bone the proteins identified in the specific search are NCPs, while in the specimens that we analyzed. When comparing the protein composition semi-specific searches approximately 50% of the identified proteins are of our Equidae specimens for each of the utilized proteases, we also collagens. This was observed in all conditions and samples. observe that the majority of the proteins identified in each sample are Despite the smaller number of identified proteins, semi-specific shared between the three specimens. Therefore, protein composition searches resulted in the identification of some proteins not observed in identified using each protease is generally reproducible across bone specific searches. For the Equidae samples, these include decorin specimens (Fig. 5). Although the majority of proteins are represented by (DCN), which is involved in collagen fibril formation [46], and lac- low peptide counts, they are consistently identified by specific pro- toylglutathione lyase (GLO1), which is necessary for normal osteo- teases across Pleistocene bone specimens. clastogenesis [58]. For the hominin specimen, a similar example is re- Strikingly, the trypsin extractome provided access to only a small presented by integrin alpha-9 (ITGA9), which together with integrin portion of the entire proteome per specimen (22.1%, 21.2%, 21.3%, beta-1 (ITGB1) interacts with osteopontin (SPP1; [47]). In turn, os- and 34.6%, respectively; Fig. 4). Examples of proteins not identified in teopontin is a common bone NCP that, like other small integrin-binding the Homo sp. trypsin digests include fibroblast growth factor-binding ligand, N-linked glycoproteins (SIBLINGs), binds to hydroxyapatite protein 1 (FGFBP1, identified via Lys-N only) and von Willebrand factor [48]. Therefore, despite their smaller size, semi-specific searches might A domain-containing protein 1 (VWA1, identified via Glu-C and Lys-N). provide protein identifications of interest that could be missed during FGFBP1 is an extracellular protein that controls the release of fibroblast protease-specific search settings. Additionally, protein and peptide- binding factors (FGFs), which play a role in skeletal growth and identifications only recovered via semi-specific searches have the po- maintenance [49,50]. VWA1 is expressed in chondrocytes and poten- tential to enlarge the phylogenetic information recovered from ancient tially interacts with collagens, perlecan, as well as FGFs [51,52]. proteomes. Our results imply that the majority of the ancient bone proteome is Alternative proteases also provide access to distinct extractomes potentially inaccessible to trypsin proteolysis. This observation is in (Fig. 4). We observe that the majority of proteins are uniquely general agreement with previous modern proteomic studies of bone

6 L.T. Lanigan, et al. Journal of Proteomics 228 (2020) 103889

Fig. 5. Proteome overlap between Equidae specimens. a, Chymotrypsin. b, Elastase. c, Glu-C. d, Lys-N. e, Pepsin. f, Trypsin. The majority of proteins identified by each protease are identified in two or three of the Equidae specimens in all cases, with the exception of Lys-N. Data from specific search only.

[53,54] and non-bone tissues [21–24], that also demonstrate the re- osteocalcin (OCN). OCN is thought to be one of the more abundant covery of a significant number of additional proteins through the use of NCPs in bone and dentine proteomes but is variably detected in non-trypsin proteases. Future work could potentially provide further (trypsin-digested) ancient bone proteomes [56]. We identify no OCN clarification as to what determines the accessibility of the ancient bone peptides in our trypsin-digest either, but do retrieve peptide sequences proteome to trypsin proteolysis. spanning the C-terminal region of the mature protein in chymotrypsin, pepsin, and Glu-C digests (Fig. 7c). Again, this shows that (several) alternative proteases can demonstrate the presence of proteins not 3.3. Alternative proteases yield increased protein sequence coverage identified by trypsin proteolysis of the same protein extract. Previous proteomic mass spectrometry studies have suggested that In addition to proteome size and composition, the second determi- highly dominant proteins might mask the presence of low-abundance nant of the phylogenetic utility of ancient bone proteomes concerns proteins [57]. Similarly, in skeletal proteomes, the dominance of COL1 protein sequence coverage. We therefore investigated protein sequence might mask the presence of low-abundance proteins [54]. Indeed, we coverage for a subset of proteins consistently identified across bone observe that COL1 has the highest sequence coverage of all identified specimens and proteases. proteins across all search conditions and bone specimens (Fig. 6a, b). As We observe that protein sequence coverage of non-collagenous expected, sequence coverage is restricted to the mature, triple-helical proteins (NCPs) identified across specimens and proteases is variable. In regions of COL1A1 and COL1A2 (Fig. 7a). Previous attempts to deplete some cases the coverage retrieved from the alternative proteases is ancient collagen by collagenase proteolysis into peptide fragments too greater than the sequence coverage observed after trypsin proteolysis of small for common nanoLC-MS/MS analysis proved unsuccessful in re- the same bone specimen (Fig. 6c–f). More importantly, we observe moving this dominant protein [11]. An alternative approach would be examples where protease digests provide access to alternative sequence to avoid proteotypic cleavage of COL1. We observe that proteolysis regions of the same protein. For example, alternative proteases almost with alternative proteases accesses the same mature, triple-helical re- double the sequence coverage of collagen type 5, alpha-2 (COL5A2, gion (Fig. 7a) but with lower sequence coverage (Fig. 6a, b). This is from 16 to 29%), biglycan (BGN, from 7 to 13%), and pigment epi- particularly true for chymotrypsin, which on average reduced COL1 thelium-derived factor (SERPINF1, from 6 to 12%) in our Homo sp. sequence coverage by ≈50% (54.4% mature COL1 sequence coverage, sample (Fig. 7b, e, f). A similar observation can be made for proteins compared to 94.2% mature COL1 sequence coverage for trypsin). This not identified by trypsin but present in several other digests of this is unsurprising, as chymotrypsin primarily cleaves after large and aro- sample. For example, ficolin-1 (FCN1), an extracellular lectin expressed matic amino acid residues (Y, W, F, M, L) which are relatively un- in bone marrow [55], is absent in trypsin digests of our Homo sp. common in COL1. Chymotrypsin proteomes are among the largest sample but has up to 20% sequence coverage across chymotrypsin, generated in our comparative set-up (Fig. 3). Enzymatic strategies that pepsin, and Glu-C extractomes (Fig. 7d). Another example concerns

7 L.T. Lanigan, et al. Journal of Proteomics 228 (2020) 103889

Fig. 6. Sequence coverage for selected protein groups. a, Collagen type 1, alpha-1 (COL1A1, UniProt accession F6SSG3). b, Collagen type 1, alpha-2 (COL1A2, F6RUA6). Horizontal line indicates average trypsin-derived sequence coverage for COL1A1 and COL1A2, respectively. Note that sequence coverage for COL1 is calculated here for the entire protein, including signal and pro-peptide regions. c, Biglycan (BGN, O46403). d, Osteomodulin (OMD, F6ZRQ3). e, Alpha-2-HS- glycoprotein (AHSG, F7C450). f, Chondroadherin (CHAD, F6WD70). Only data deriving from the three Equidae samples is included, with the three replicates kept separate. Note differences in y-axis scale between panels. Open symbols: specific search. Closed symbols: semi-specific search. Chymo = Chymotrypsin. avoid or minimize the proteolytic cleavage of COL1 might therefore be effectively were Glu-C and chymotrypsin. Additionally, combining se- a favourable route towards enhancing the detection of non-collagenous quence coverage generated via alternative proteases improved overall proteins in skeletal proteomics. protein sequence coverage. Glu-C and chymotrypsin were found to be the most useful additional proteases for this application. For specific 4. Conclusion applications or targeted studies other alternative proteases might be better suited. These observations are in agreement with previously re- We identified 18 novel hominin bone specimens from the Kleine ported modern proteome studies on bone and non-bone tissues, and Feldhofer Grotte, potentially belonging to the Neanderthal type spe- pave the way towards more complex ancient proteomes with improved cimen, via MALDI-TOF MS peptide mass fingerprinting. We utilized one phylogenetic value. of these hominin bone specimens and three additional Pleistocene Equidae bone specimens and explored extractome size and composition Funding sources as well as protein sequence coverage generated through the use of six proteases and nanoLC-MS/MS. We observed differences in size and This research was supported by generous funding from the Leakey composition for each of these extractomes. Importantly, we determined Foundation. FW was supported by the European Commission through a that the proteolysis of skeletal proteomes using only trypsin resulted in Marie Skłodowska-Curie (MSCA) Individual Fellowship (795569). E.C. a systematic underestimation of bone proteome size. In our study, the is supported by the VILLUM FONDEN (grant number 17649). M.M., proteases that complimented protein identification by trypsin most L.T.L., and A.J.T. are supported by the Danish National Research

8 L.T. Lanigan, et al. Journal of Proteomics 228 (2020) 103889

Fig. 7. Sequence coverage obtained for six different proteins using six different proteases. a, Collagen type 1, alpha-1 (COL1A1, UniProt accession P02452). The mature region of COL1A1 is located between amino acid positions 162 and 1218. b, Collagen type 5, alpha-2 (COL5A2, P05997). c, Osteocalcin (BGLAP, P02818). d, Ficolin-1 (FCN1, O00602). e, Biglycan (BGN, P21810). f, Pigment epithelium-derived factor (SERPINF1, P36955). Data included from the Homo sp. specimen only, with specific searches, semi-specific searches, and technical replicates combined.

Foundation award PROTEIOS (DNRF128). F.W. and J.-J.H. thank the L.T.L., A.J.T. and M.M. wrote the manuscript. All authors discussed the Max Planck Society for support. Work at the Novo Nordisk Foundation results and commented on the manuscript. Center for Protein Research is funded in part by a donation from the Novo Nordisk Foundation (NNF14CC0001). Funders had no role in Declaration of Competing Interest study design or any other aspect of the presented research. The authors declare there is no conflict of interest. Author contributions Acknowledgements F.W. and A.J.T designed the study. S.F., J.-J.H. and R.W.S. provided samples. A.W., M.J.C., E.C. and J.V.O. provided technical infra- M. Soressi, M. Roussel (Leiden University, Leiden, NL), S.P. structure. L.T.L. and A.J.T. performed the experiments. M.M. operated McPherron (MPI-EVA, Leipzig, DE), T.E. Steele (UC Davis, USA), A. the nanoLC-MS/MS. F.W., L.T.L. and A.J.T. analyzed the data. F.W., Turq (Musée national de Préhistoire, Les Eyzies, France), and S.

9 L.T. Lanigan, et al. Journal of Proteomics 228 (2020) 103889

Madelaine (Musée national de Préhistoire, Les Eyzies, France and jprot.2018.11.014. UMR5199 PACEA, Pessac, France) are thanked for providing access to [17] S.M. Lyon, A. Mayampurath, M.R. Rogers, D.J. Wolfgeher, S.M. Fisher, S.L. Volchenboum, T.-C. He, R.R. Reid, A method for whole protein isolation from the Quinçay and La Ferrassie bone specimens. human cranial bone, Anal. Biochem. 515 (2016) 33–39. [18] C.R. Salmon, D.M. Tomazela, K.G.S. Ruiz, B.L. Foster, A.F. Paes Leme, E.A. Sallum, Appendix A. Supplementary data M.J. Somerman, F.H. Nociti Jr., Proteomic analysis of human dental cementum and alveolar bone, J. Proteome 91 (2013) 544–555. [19] X. Jiang, M. Ye, X. Jiang, G. Liu, S. Feng, L. Cui, H. Zou, Method development of Supplementary data to this article can be found online at https:// efficient protein extraction in bone tissue for proteome analysis, J. Proteome Res. 6 doi.org/10.1016/j.jprot.2020.103889. (2007) 2287–2294. [20] F. Welker, Palaeoproteomics for human evolution studies, Quat. Sci. Rev. 190 (2018) 137–147. References [21] L. Tsiatsiani, A.J.R. Heck, Proteomics beyond trypsin, FEBS J. 282 (2015) 2612–2626. [22] X. Guo, D.C. Trudgian, A. Lemoff, S. Yadavalli, H. Mirzaei, Confetti: a multiprotease [1] F. Welker, M. Soressi, W. Rendu, J.-J. Hublin, M.J. Collins, Using ZooMS to identify map of the HeLa proteome for comprehensive proteomics, Mol. Cell. Proteomics 13 fragmentary bone from the late middle/early upper Palaeolithic sequence of les (2014) 1573–1584. Cottés, France, J. Archaeol. Sci. 54 (2015) 279–286. [23] D.L. Swaney, C.D. Wenger, J.J. Coon, Value of using multiple proteases for large- [2] V. Sinet-Mathiot, G.M. Smith, M. Romandini, A. Wilcke, M. Peresani, J.-J. Hublin, scale mass spectrometry-based proteomics, J. Proteome Res. 9 (2010) 1323–1329. F. Welker, Combining ZooMS and zooarchaeology to study late Pleistocene hominin [24] J.R. Wiśniewski, M. Mann, Consecutive proteolytic digestion in an enzyme reactor behaviour at Fumane (), Sci. Rep. 9 (2019) 12350. increases depth of proteomic and phosphoproteomic analysis, Anal. Chem. 84 [3] G.P. Bouchard, J. Riel-Salvatore, F. Negrino, M. Buckley, Archaeozoological, ta- (2012) 2631–2637. phonomic and ZooMS insights into the protoaurignacian faunal record from riparo [25] B. Demarchi, S. Hall, T. Roncal-Herrero, C.L. Freeman, J. Woolley, M.K. Crisp, bombrini, Quat. Int. (2020), https://doi.org/10.1016/j.quaint.2020.01.007. J. Wilson, A. Fotakis, R. Fischer, B.M. Kessler, R. Rakownikow Jersie-Christensen, [4] F. Chen, F. Welker, C.-C. Shen, S.E. Bailey, I. Bergmann, S. Davis, H. Xia, H. Wang, J.V. Olsen, J. Haile, J. Thomas, C.W. Marean, J. Parkington, S. Presslee, J. Lee- R. Fischer, S.E. Freidline, T.-L. Yu, M.M. Skinner, S. Stelzer, G. Dong, Q. Fu, Thorp, P. Ditchfi eld, J.F. Hamilton, M.W. Ward, C.M. Wang, M.D. Shaw, G. Dong, J. Wang, D. Zhang, J.-J. Hublin, A late middle pleistocene denisovan T. Harrison, M. Domínguez-Rodrigo, R.D.E. MacPhee, A. Kwekason, M. Ecker, L. mandible from the Tibetan plateau, Nature. 569 (2019) 409–412. Kolska Horwitz, M. Chazan, R. Kröger, J. Thomas-Oates, J.H. Harding, [5] F. Welker, J. Ramos-Madrigal, P. Gutenbrunner, M. Mackie, S. Tiwary, E. Cappellini, K. Penkman, M.J. Collins, Protein sequences bound to mineral sur- R. Rakownikow Jersie-Christensen, C. Chiva, M.R. Dickinson, M. Kuhlwilm, M. de faces persist into deep time, Elife 5 (2016) e17092, , https://doi.org/10.7554/eLife. Manuel, P. Gelabert, M. Martinón-Torres, A. Margvelashvili, J.L. Arsuaga, 17092. E. Carbonell, T. Marques-Bonet, K. Penkman, E. Sabidó, J. Cox, J.V. Olsen, [26] R. Sawafuji, E. Cappellini, T. Nagaoka, A.K. Fotakis, R.R. Jersie-Christensen, D. Lordkipanidze, F. Racimo, C. Lalueza-Fox, J.M. Bermúdez de Castro, J.V. Olsen, K. Hirata, S. Ueda, Proteomic profiling of archaeological human bone, R. E. Willerslev, E. Cappellini, The dental proteome of Homo antecessor, Nature. 580 Soc. Open Sci. 4 (2017) 161004. (2020) 235–238. [27] F. Welker, M.A. Soressi, M. Roussel, I. van Riemsdijk, J.-J. Hublin, M.J. Collins, [6] S. Brown, T. Higham, V. Slon, S. Pääbo, M. Meyer, K. Douka, F. Brock, D. Comeskey, Variations in glutamine deamidation for a Châtelperronian bone assemblage as N. Procopio, M. Shunkov, A. Derevianko, M. Buckley, Identification of a new ho- measured by peptide mass fingerprinting of collagen, STAR. 3 (2017) 15–27. minin bone from Denisova , Siberia using collagen fingerprinting and mi- [28] M. Roussel, M. Soressi, J.-J. Hublin, The Châtelperronian conundrum: blade and tochondrial DNA analysis, Sci. Rep. 6 (2016) 23559. bladelet lithic technologies from Quinçay, France, J. Hum. Evol. 95 (2016) 13–32. [7] E. Cappellini, F. Welker, L. Pandolfi, J. Ramos-Madrigal, D. Samodova, P.L. Rüther, [29] G. Guérin, M. Frouin, S. Talamo, V. Aldeias, L. Bruxelles, L. Chiotti, H.L. Dibble, A.K. Fotakis, D. Lyon, J.V. Moreno-Mayar, M. Bukhsianidze, R. Rakownikow Jersie- P. Goldberg, J.-J. Hublin, M. Jain, C. Lahaye, S. Madelaine, B. Maureille, Christensen, M. Mackie, A. Ginolhac, R. Ferring, M. Tappen, E. Palkopoulou, S.J.P. McPherron, N. Mercier, A.S. Murray, D. Sandgathe, T.E. Steele, K.J. Thomsen, M.R. Dickinson, T.W. Stafford Jr., Y.L. Chan, A. Götherström, S.K.S.S. Nathan, A. Turq, A multi-method luminescence dating of the Palaeolithic sequence of La P.D. Heintzman, J.D. Kapp, I. Kirillova, Y. Moodley, J. Agusti, R.-D. Kahlke, Ferrassie based on new excavations adjacent to the La Ferrassie 1 and 2 skeletons, J. G. Kiladze, B. Martínez-Navarro, S. Liu, M. Sandoval Velasco, M.-H.S. Sinding, Archaeol. Sci. 58 (2015/6) 147–166. C.D. Kelstrup, M.E. Allentoft, L. Orlando, K. Penkman, B. Shapiro, L. Rook, L. Dalén, [30] M. Frouin, G. Guérin, C. Lahaye, N. Mercier, S. Huot, V. Aldeias, L. Bruxelles, M.T.P. Gilbert, J.V. Olsen, D. Lordkipanidze, E. Willerslev, Early Pleistocene enamel L. Chiotti, H.L. Dibble, P. Goldberg, S. Madelaine, S.J.P. McPherron, D. Sandgathe, proteome from Dmanisi resolves Stephanorhinus phylogeny, Nature 574 (2019) T.E. Steele, A. Turq, New luminescence dating results based on polymineral fine 103–107. grains from the middle and Upper Palaeolithic site of La Ferrassie (Dordogne, SW [8] F. Welker, J. Ramos-Madrigal, M. Kuhlwilm, W. Liao, P. Gutenbrunner, M. de France), Quat. Geochronol. 39 (2017) 131–141, https://doi.org/10.1016/j.quageo. Manuel, D. Samodova, M. Mackie, M.E. Allentoft, A.-M. Bacon, M.J. Collins, J. Cox, 2017.02.009. C. Lalueza-Fox, J.V. Olsen, F. Demeter, W. Wang, T. Marques-Bonet, E. Cappellini, [31] H. Schaaffhausen, Vortrag vom 2. Juni zu den Menschenknochen aus dem Enamel proteome shows that Gigantopithecus was an early diverging pongine, , Verh. Naturhist. Ver. Preuss. Rheinl. 14 (1857) 50–52. Nature. 576 (2019) 262–265. [32] J.C. Fuhlrott, Theile des menschlichen Skelettes im Neanderthal bei Hochdahl, [9] E. Cappellini, A. Prohaska, F. Racimo, F. Welker, M.W. Pedersen, M.E. Allentoft, Verh. Naturhist. Ver. Preuss. Rheinl. 14 (1857) 50. P. de Barros Damgaard, P. Gutenbrunner, J. Dunne, S. Hammann, M. Roffet-Salque, [33] W. King, On the Neanderthal skull, or reasons for believing it to belong to the M. Ilardo, J.V. Moreno-Mayar, Y. Wang, M. Sikora, L. Vinner, J. Cox, R.P. Evershed, Clydian period and to a species different from that represented by man, Notices E. Willerslev, Ancient biomolecules and evolutionary inference, Annu. Rev. Abstr. Br. Assoc. Adv. Sci. 1863 (1864) 81–82. Biochem. 87 (2018) 1029–1060. [34] F.H. Smith, M.O. Smith, R.W. Schmitz, Human skeletal remains from the 1997 and [10] E. Cappellini, L.J. Jensen, D. Szklarczyk, A. Ginolhac, R.A.R. da Fonseca, 2000 excavations of cave deposits derived from Kleine Feldhofer Grotte in the T.W. Stafford, S.R. Holen, M.J. Collins, L. Orlando, E. Willerslev, M.T.P. Gilbert, Neander Valley, Germany, in: R.W. Schmitz (Ed.), Neanderthal 1856–2006., Mainz, J.V. Olsen, Proteomic analysis of a pleistocene mammoth femur reveals more than 2006, pp. 187–246. one hundred ancient bone proteins, J. Proteome Res. 11 (2012) 917–926. [35] R.W. Schmitz, D. Serre, G. Bonani, S. Feine, F. Hillgruber, H. Krainitzki, S. Pääbo, [11] C. Wadsworth, M. Buckley, Proteome degradation in fossils: investigating the F.H. Smith, The Neandertal type site revisited: interdisciplinary investigations of longevity of protein survival in ancient bone, Rapid Commun. Mass Spectrom. 28 skeletal remains from the Neander Valley, Germany, Proc. Natl. Acad. Sci. U. S. A. (2014) 605–615. 99 (2002) 13342–13347. [12] J.K. Mouw, G. Ou, V.M. Weaver, Extracellular matrix assembly: a multiscale de- [36] M. Buckley, M.J. Collins, J. Thomas-Oates, J.C. Wilson, Species identification by construction, Nat. Rev. Mol. Cell Biol. 15 (2014) 771–785. analysis of bone collagen using matrix-assisted laser desorption/ionisation time-of- [13] B. Brodsky, A.V. Persikov, Molecular structure of the collagen triple helix, Adv. flight mass spectrometry, Rapid Commun. Mass Spectrom. 23 (2009) 3843–3854. Protein Chem. 70 (2005) 301–339. [37] F. Welker, M. Hajdinjak, S. Talamo, K. Jaouen, M. Dannemann, F. David, M. Julien, [14] F. Welker, M.J. Collins, J.A. Thomas, M. Wadsley, S. Brace, E. Cappellini, M. Meyer, J. Kelso, I. Barnes, S. Brace, P. Kamminga, R. Fischer, B.M. Kessler, S.T. Turvey, M. Reguero, J.N. Gelfo, A. Kramarz, J. Burger, J. Thomas-Oates, J.R. Stewart, S. Pääbo, M.J. Collins, J.-J. Hublin, Palaeoproteomic evidence iden- D.A. Ashford, P.D. Ashton, K. Rowsell, D.M. Porter, B. Kessler, R. Fischer, tifies archaic hominins associated with the Châtelperronian at the , C. Baessmann, S. Kaspar, J.V. Olsen, P. Kiley, J.A. Elliott, C.D. Kelstrup, V. Mullin, Proc. Natl. Acad. Sci. U. S. A. 113 (2016) 11162–11167. M. Hofreiter, E. Willerslev, J.-J. Hublin, L. Orlando, I. Barnes, R.D.E. MacPhee, [38] M. Strohalm, D. Kavan, P. Novák, M. Volný, V. Havlícek, mMass 3: a cross-platform Ancient proteins resolve the evolutionary history of Darwin/’s South American software environment for precise analysis of mass spectrometric data, Anal. Chem. ungulates, Nature 522 (2015) 81–84. 82 (2010) 4648–4651. [15] S. Presslee, G.J. Slater, F. Pujos, A.M. Forasiepi, R. Fischer, K. Molloy, M. Mackie, [39] J. Wilson, N.L. van Doorn, M.J. Collins, Assessing the extent of bone degradation J.V. Olsen, A. Kramarz, M. Taglioretti, F. Scaglia, M. Lezcano, J.L. Lanata, using glutamine deamidation in collagen, Anal. Chem. 84 (2012) 9041–9048. J. Southon, R. Feranec, J. Bloch, A. Hajduk, F.M. Martin, R. Salas Gismondi, [40] C. Warinner, J.F.M. Rodrigues, R. Vyas, C. Trachsel, N. Shved, J. Grossmann, M. Reguero, C. de Muizon, A. Greenwood, B.T. Chait, K. Penkman, M. Collins, A. Radini, Y. Hancock, R.Y. Tito, S. Fiddyment, C. Speller, J. Hendy, S. Charlton, R.D.E. MacPhee, Palaeoproteomics resolves sloth relationships, Nat. Ecol. Evol. 3 H.U. Luder, D.C. Salazar-García, E. Eppler, R. Seiler, L.H. Hansen, J.A.S. Castruita, (2019) 1121–1130, https://doi.org/10.1038/s41559-019-0909-z. S. Barkow-Oesterreicher, K.Y. Teoh, C.D. Kelstrup, J.V. Olsen, P. Nanni, T. Kawai, [16] M. Buckley, C. Lawless, N. Rybczynski, Collagen sequence analysis of fossil camels, E. Willerslev, C. von Mering, C.M. Lewis Jr., M.J. Collins, M.T.P. Gilbert, F. Rühli, Camelops and c.f. Paracamelus, from the Arctic and sub-Arctic of Plio-Pleistocene E. Cappellini, Pathogens and host immunity in the ancient human oral cavity, Nat. North America, J. of Proteomics 194 (2019) 218–225, https://doi.org/10.1016/j. Genet. 46 (2014) 336–344.

10 L.T. Lanigan, et al. Journal of Proteomics 228 (2020) 103889

[41] R.R. Jersie-Christensen, L.T. Lanigan, D. Lyon, M. Mackie, D. Belstrøm, [50] E. Tassi, A. Al-Attar, A. Aigner, M.R. Swift, K. McDonnell, A. Karavanov, C.D. Kelstrup, A.K. Fotakis, E. Willerslev, N. Lynnerup, L.J. Jensen, E. Cappellini, A. Wellstein, Enhancement of fibroblast growth factor (FGF) activity by an FGF- J.V. Olsen, Quantitative metaproteomics of medieval dental calculus reveals in- binding protein, J. Biol. Chem. 276 (2001) 40247–40253. dividual oral health status, Nat. Commun. 9 (2018) 4744. [51] J.M. Allen, J.F. Bateman, U. Hansen, R. Wilson, P. Bruckner, R.T. Owens, T. Sasaki, [42] O. Wagih, ggseqlogo: a versatile R package for drawing sequence logos, R. Timpl, J. Fitzgerald, WARP is a novel multimeric component of the chondrocyte Bioinformatics 33 (2017) 3645–3647. pericellular matrix that interacts with perlecan, J. Biol. Chem. 281 (2006) [43] M. Mackie, P. Rüther, D. Samodova, F. Di Gianvincenzo, C. Granzotto, D. Lyon, 7341–7349. D.A. Peggie, H. Howard, L. Harrison, L.J. Jensen, J.V. Olsen, E. Cappellini, [52] U. Hansen, J.M. Allen, R. White, C. Moscibrocki, P. Bruckner, J.F. Bateman, Palaeoproteomic profiling of conservation layers on a 14th century Italian wall J. Fitzgerald, WARP interacts with collagen VI-containing microfibrils in the peri- painting, Angew. Chem. Int. Ed. Eng. 57 (2018) 7369–7374, https://doi.org/10. cellular matrix of human chondrocytes, PLoS One 7 (2012) e52793. 1002/anie.201713020. [53] P.A. Bell, N. Solis, J. Kizhakkedathu, I. Matthew, C.M. Overall, Proteomic and N- [44] E.W. Deutsch, N. Bandeira, V. Sharma, Y. Perez-Riverol, J.J. Carver, D.J. Kundu, terminomic TAILS analyses of human alveolar bone proteins: improved protein D. García-Seisdedos, A.F. Jarnuczak, S. Hewapathirana, B.S. Pullman, J. Wertz, extraction methodology and LysargiNase digestion strategies increase proteome Z. Sun, S. Kawano, S. Okuda, Y. Watanabe, H. Hermjakob, B. Maclean, coverage and missing protein identification, J. Proteome Res. 18 (12) (2019) M.J. MacCoss, Y. Zhu, Y. Ishihama, J.A. Vizcaíno, The ProteomeXchange con- 4167–4179, https://doi.org/10.1021/acs.jproteome.9b00445. sortium in 2020: enabling “big data”approaches in proteomics, Nucleic Acids Res. [54] M. Kuljanin, C.F.C. Brown, M.J. Raleigh, G.A. Lajoie, L.E. Flynn, Collagenase 48 (D1) (2020) D1145–D1152. treatment enhances proteomic coverage of low-abundance proteins in decellular- [45] T. Devièse, I. Karavanić, D. Comeskey, C. Kubiak, P. Korlević, M. Hajdinjak, ized matrix bioscaffolds, Biomaterials 144 (2017) 130–143, https://doi.org/10. S. Radović, N. Procopio, M. Buckley, S. Pääbo, T. Higham, Direct dating of 1016/j.biomaterials.2017.08.012. Neanderthal remains from the site of and implications for the middle [55] Y. Liu, Y. Endo, D. Iwaki, M. Nakata, M. Matsushita, I. Wada, K. Inoue, to upper Paleolithic transition, Proc. Natl. Acad. Sci. U. S. A. 114 (2017) M. Munakata, T. Fujita, Human M-ficolin is a secretory protein that activates the 10606–10611. lectin complement pathway, J. Immunol. 175 (2005) 3150–3156. [46] Y. Mochida, D. Parisuthiman, S. Pornprasertsuk-Damrongsri, P. Atsawasuwan, [56] T.P. Cleland, C.J. Thomas, C.M. Gundberg, D. Vashishth, Influence of carboxylation M. Sricholpech, A.L. Boskey, M. Yamauchi, Decorin modulates collagen matrix on osteocalcin detection by mass spectrometry, Rapid Commun. Mass Spectrom. 30 assembly and mineralization, Matrix Biol. 28 (2009) 44–52. (2016) 2109–2115. [47] Y. Takada, X. Ye, S. Simon, The integrins, Genome Biol. 8 (2007) 215. [57] C. Wu, J. Duan, T. Liu, R.D. Smith, W.-J. Qian, Contributions of immunoaffinity [48] L.W. Fisher, D.A. Torchia, B. Fohr, M.F. Young, N.S. Fedarko, Flexible structures of chromatography to deep proteome profiling of human biofluids, J. Chromatogr. B SIBLING proteins, bone sialoprotein, and osteopontin, Biochem. Biophys. Res. Anal. Technol. Biomed. Life Sci. 1021 (2016) 57–68. Commun. 280 (2001) 460–465. [58] M. Kawatani, H. Okumura, K. Honda, N. Kanoh, M. Muroi, N. Dohmae, M. Takami, [49] J.R. Chapman, O. Katsara, R. Ruoff, D. Morgenstern, S. Nayak, C. Basilico, M. Kitagawa, Y. Futamura, M. Imoto, H. Osada, The identification of an osteo- B. Ueberheide, V. Kolupaeva, Phosphoproteomics of fibroblast growth factor 1 clastogenesis inhibitor through the inhibition of glyoxalase I, Proc. Natl. Acad. Sci. (FGF1) signaling in chondrocytes: identifying the signature of inhibitory response, 105 (33) (2008) 11691–11696. Mol. Cell. Proteomics 16 (2017) 1126–1137.

11