www.nature.com/scientificreports

OPEN Possible proteomic biomarkers for the detection of pancreatic cancer in oral fuids O. Deutsch1,7, Y. Haviv2,7, G. Krief1, N. Keshet2, R. Westreich3,4, S. M. Stemmer5, B. Zaks1, S. P. Navat1, R. Yanko2, O. Lahav1, D. J. Aframian2,7 & A. Palmon1,6,7*

The 80% mortality rate of pancreatic-cancer (PC) makes early diagnosis a challenge. Oral fuids (OF) may be considered the ultimate body fuid for non-invasive examinations. We have developed techniques to improve visualization of minor OF thereby overcoming major barriers to using OF as a diagnostic fuid. The aim of this study was to establish a short discriminative panel of OF biomarkers for the detection of PC. Unstimulated OF were collected from PC patients and controls (n = 30). High-abundance-proteins were depleted and the remaining proteins were analyzed by two- dimensional-gel-electrophoresis and quantitative dimethylation-liquid-chromatography-tandem mass-spectrometry. Label-free quantitative-mass-spectrometry analysis (qMS) was performed on 20 individual samples (n = 20). More than 100 biomarker candidates were identifed in OF samples, and 21 had a highly diferential expression profle. qMS analysis yielded a ROC-plot AUC value of 0.91 with 90.0% sensitivity and specifcity for a combination of fve biomarker candidates. We found a combination of fve biomarkers for PC. Most of these proteins are known to be related to PC or other gastric cancers, but have never been detected in OF. This study demonstrates the importance of novel OF depletion methodologies for increased visibility and highlights the clinical applicability of OF as a diagnostic fuid.

Abbreviations OF Oral fuids PC Pancreatic cancer

Pancreatic cancer (PC) ofen remains undetected until the late stages of the disease. Each year approximately 37,000 Americans are diagnosed with PC, furthermore, 33,000 Americans and more than 42,000 Europeans die from pancreatic cancer ­annually1. PC was the 4th leading caner type for estimated deaths in the USA in 2012 and ­20132–4. Te median survival time for PC is nine to 12 months with an overall 5-year survival rate of 3%. Te high mortality rate is due in part to the fact more than 50% of patients with PC have metastatic disease at the time of diagnosis. Te 50% recurrence of PC following surgical removal, suggests that PC is relatively refractory to current treatments. No specifc tumor marker for the diagnosis of PC has been identifed, complicating early diagnosis. Terefore, extensive genomic, transcriptomic, and proteomic studies are being performed to identify candidate markers by employing high-throughput systems capable of large cohort screening. Currently, early detection of pancreatic cancer in high-risk patients is done using highly invasive means (Endoscopic ultrasound combined with fne- needle-aspiration). Tese methods cause discomfort, require an expert team and are very expensive, making them useless as screening tools. Te lack of a single diagnostic marker suggests that only a combination of biomarkers

1Institute of Dental Sciences, Faculty of Dental Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel. 2Salivary Gland Clinic and Saliva Diagnostic Laboratory, Department of Oral Medicine, Sedation and Maxillofacial Radiology, Sjogren’s Syndrome Center, Hadassah Medical Center, The Hebrew University of Jerusalem, Jerusalem, Israel. 3Department of Internal Medicine B, Soroka Medical Center, Beer‑Sheva, Israel. 4Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer‑Sheva, Israel. 5Rabin Medical Center, Davidof Center, Petach Tiqwa, Afliated to the Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel. 6Department of Oral Medicine, Sedation and Maxillofacial Imaging, Faculty of Dental Medicine, Hebrew University – Hadassah, Jerusalem, Israel. 7These authors contributed equally: O. Deutsch, Y. Haviv, D. J. Aframian and A. Palmon *email: [email protected]

Scientifc Reports | (2020) 10:21995 | https://doi.org/10.1038/s41598-020-78922-x 1 Vol.:(0123456789) www.nature.com/scientificreports/

will be able to provide the appropriate combination of high sensitivity and specifcity. Biomarker discovery using novel technologies can improve prognostic upgrading and pinpoint new molecular targets for innovative therapy. Over the last decade OF have been recognized as a "diagnostic window to the body"5. Tis is due to the fact that despite the apparent low degree of overlap between OF and plasma, the distribution found across Ontological categories, such as molecular function, biological processes, and cellular components, is very ­similar6. Many centers, including our department, have taken advantage of the non-invasive access to this readily available body fuid. Furthermore the composition of OF is known and therefore fuctuations can be used to monitor diseases and physiological changes­ 7. Te positive aspects of OF compared to serum as a diagnostic fuid for practitioners include simple collection (of adequate volumes), storage and shipment. Procurement is also safer than venipuncture, limiting exposure to infectious agents. Te non-invasive, painless collection reduces fear and enhances compliance when repeated samples are needed over time. Te non-clotting nature of the fuid makes it ideal for diagnostic purposes­ 8. Analysis of OF using proteomics has been hindered by the presence of high abundant proteins such as salivary alpha amylases (sAA)9–12 albumins (alb)13 and immunoglobulins (Ig)12,13 which conceal or reduce the separation sensitivity of other proteins. Tere are two main advantages to high abundant protein depletion followed by 2DE: (i) gel resolution is increased because the levels of the low abundant proteins in the proteomic map are relatively higher and (ii) important low abundant proteins are revealed when the overlapping high abundant protein spots are removed. Low abundant proteins can also be exposed by using qMS. We have developed and successfully used techniques to remove the high abundant proteins in OF, thereby improving protein visualization­ 14–16. We hypothesize that OF composition will be altered by pancreatic cancer. Te similarity of the structure of the salivary and pancreatic glands may cause the salivary glands to function as a biological amplifer and to produce proteins in response to PC which will be detectable in OF. Tis phe- nomenon has been reported in breast cancer patients, (the structure of the mammary glands is also similar to the salivary glands). C-erb-b2 a breast cancer marker was produced by the salivary glands and detected in the saliva of breast cancer patients­ 17. Te aim of this study was to identify and develop an early detection assay for PC based on OF, to characterize OF proteins following removal of the high abundance proteins, and to identify candidate biomarkers for PC. Materials and methods Ethical approval. Te OF accumulation protocol was approved by the Ethical Committee, Rabin Medi- cal Center, Beilinson Hospital, Request No. 0053-09-RMC. Informed consent was obtained according to the instructions of the Ethical Committee. All procedures were carried out in accordance with relevant guidelines and regulations.

OF collection, patients and healthy volunteers. Unstimulated OF fow was collected for 5 min using the spitting method­ 18 into pre-calibrated tubes. All participants refrained from eating, drinking and brushing their teeth 1 h prior to saliva collection. Patients did not take their medications, including sialagogues, before saliva collection. Volunteers rested for 10 min before saliva collection, sitting in an upright position and in a quiet room and were asked not to speak or leave the room until afer the saliva was collected. Saliva samples were immediately placed on ice and then centrifuged at 14,000 g for 20 min at 4 °C to remove insoluble materials, cell debris and food remnants. Te supernatant of each sample was collected and protein concentration was determined using the Bio-Rad Bradford protein assay (Bio-Rad, Hercules, CA, USA) as previously described­ 19. OF were collected from 31 males; 15 PC patients and 16 healthy, age matched controls. Controls did not take any medications known to cause xerostomia (supplementary data A), had no complaints of oral dryness and no evidence oral mucosal diseases was detected following examination. 2 patients in the PC group were undergoing chemotherapy at the time of collection and were therefore excluded from the OF pool. Salivary fow rate was calculated. OF samples were divided into two groups: (1) for to 2DE and Demethylation MS analysis (described below), samples were pooled according to the amount of total protein in each individual sample. 2) For label-free qMS, individual samples were used.

sAA afnity removal. Amylase was removed from the pooled OF using an amylase removing device. 600 µL of water was hand pressed (20 s) through the device to moisturize the substrate. Tereafer, 1 mL of pooled OF (in two aliquots of 500 µL) was hand pressed and fltered (120 s) through the amylase removing device. Te resultant 1 mL of fltrated OF was amylase-free, as previously ­described14.

Alb and IgGs removal, capturing and elution. In order to remove alb and IgGs the ProteoPrep Immu- noafnity alb and IgG Depletion Kit (Sigma-Aldrich, St Louis, MO, USA) were used as previously ­described15 Protein concentration was measured again as before, using the Bio-Rad Bradford protein assay (Bio-Rad, Her- cules, CA, USA)19. Te triple depleted OF were divided to 2 tubes for 2DE and quantitative MS analysis and frozen at − 80 °C and lyophilized overnight. Sediments (products (deposits) of lyophilization processes) for 2DE were dissolved in 7M urea, 2M thiourea and 4% 3-[(3-cholamidopropyl) dimethylammonio]-1-propane-sulfonate (CHAPS) and stored at − 20 °C until analysis.

Two‑dimensional sodium dodecyl sulfate polyacrylamide gel electrophoresis (2DE). For ana- lytical gels, 100 µg of protein were rehydrated then subjected to isoelectrofocusing in 18 cm long second dimen-

Scientifc Reports | (2020) 10:21995 | https://doi.org/10.1038/s41598-020-78922-x 2 Vol:.(1234567890) www.nature.com/scientificreports/

sion gels, pH 3–10 NL as previously ­described20. To prepare the gel strips for separation in the second dimension they were soaked twice for 15 min in an SDS-PAGE equilibration bufer as previously ­described14. For the second dimension, strips were embedded in 0.5% w/v agarose containing a trace of bromophenol blue and loaded onto hinged spacer plates (20 cm × 20.5 cm; Bio-Rad, Hercules, CA, USA) using 9.5–16.5% SDS polyacrylamide gradient gel electrophoresis. Te same running and staining apparatus at a constant current of 30 mA per gel at 10 °C was used for all samples. Gels were silver stained with SilverQuest kit (Invitrogen, Carlsbad, CA, USA).

Imaging and statistical analysis. Gels were scanned using a computer GS-800 calibrated densitometer (Bio-Rad, Hercules, CA, USA) and spots were detected and quantifed using PDQuest sofware V 6.2.0 (Bio- Rad, Hercules, CA, USA). In order to overcome several of the known limitations of 2D gel analysis that occur as a result of gel to gel variation, and also variability in ­staining14, all samples were run simultaneously for the frst and second dimensions. Normalization with PDQuest was performed using the total density in image method to semi-quantify spot intensities and to minimize staining variation between gels­ 14.

2DE Mass‑spectrometry (MS) identifcation. For MS identifcation, a 2DE containing 100 µg of pro- tein was prepared and fxed in 50% (v/v) ethanol, 12% (v/v) acetic acid for 2 h. Proteins were visualized by staining with a SilverQuest staining kit for MS compatible silver staining (SilverQuest, Invitrogen, Carlsbad, CA, USA). Electrophoretically separated spots were excised from the gels, and in-gel reduced (10 mM Dith-9 iothreitol, incubated at 6 °C for 30 min), alkylated (10 mM iodoacetamide, at room temperature for 30 min) and proteolyzed with trypsin (overnight at 37 °C using modifed trypsin, Promega at a 1:100 enzyme-to-substrate ratio). Te resulting tryptic peptides were resolved by reversed-phase chromatography on 0.1·200-mm fused silica capillaries (J&W, 100 µm ID) packed with Everest reversed phase material (Grace Vydac, CA, USA). Te peptides were eluted with a 45 min gradient of 5 to 95% (v/v) of acetonitrile with 0.1% (v/v) formic acid in water at fow rates of 0.4 ll min. Mass spectrometry was performed by an ion-trap MS (Orbitrap; Termo) in a posi- tive mode using a repetitively full MS scan followed by collision induced dissociation (CID) of the fve most dominant ions selected from the frst MS scan. Te MS data were clustered and analyzed using Sequest sofware (version 3.31; J. Eng and J. Yates, University of Washington and Finnegan, San Jose, USA) and Pep-Miner21 searching against the human part of the Uniprot database (2014_03, https​://www.unipr​ot.org/). Te results were fltered according to the Xcorr value (1.5 for singly charged peptides, 2.2 for doubly charged peptides and 3 for triply charged peptides).

Quantitative mass‑spectrometry (MS). Protein extraction and proteolysis. Te proteins in 8M Urea were reduced with 2.8 mM DTT (60 °C for 30 min), modifed with 8.8 mM iodoacetamide in 100 mM ammo- nium bicarbonate (room temperature for 30 min) and digested in 2M Urea, 25 mM ammonium bicarbonate with modifed trypsin (Promega) at a 1:50 enzyme-to-substrate ratio, overnight at 37 °C. In order to achieve full cleavage, a second 4 h digestion was performed at 37 °C.

Demethylation MS analysis. As described previously by Krief et al.7, the resulting peptides were desalted using C18 Stage tips, dried and re-suspended in 50 mM Hepes (pH 6.4). Labeling by Dimethylation was done in the presence of 100 mM NaCBH­ 3 (Sterogene cat#9704 1M), by adding Light Formaldehyde (35% Frutarom cat#5551810, 12.3M ) to the pooled control sample, and Heavy Formaldehyde (20% w/w, Cambridge Isotope laboratories cat#CDLM-4599-16.5M) to the pooled PC sample to a fnal concentration of 200 mM. Follow- ing 1 h of incubation at room temperature the pH was raised to 8 and the reaction was incubated for another hour at room temperature. Neutralization was done with 25 mM ammonium bicarbonate for 30 min, and equal amounts of the light and heavy peptides were mixed, cleaned on a C18 stage tip, dried and re-suspended in 0.1% formic acid. Peptides were resolved by reverse-phase chromatography on 0.075 × 200-mm fused silica capillaries (J&W) packed with Reprosil reverse phase material (Dr. Maisch GmbH, Germany). Te peptides were eluted with linear 215 min gradients of 7 to 40% and then for 8 min at 95% acetonitrile with 0.1% formic acid in water at fow rates of 0.25 μl/min. Mass spectrometry was performed using an ion-trap mass spectrometer (Orbitrap, Termo) in a positive mode using a repetitively full MS scan followed by collision induced dissociation (CID) of the 7 most dominant ions selected from the frst MS scan. Te MS data was analyzed using Sequest 3.31 sofware (J. Eng and J. Yates, University of Washington and Finnegan, San Jose) searching the human part of the NCBI-NR database. Quantitation was performed using the PepQuant algorithm of Bioworks and "in house" sofware.

Label free MS analysis. 20 individual samples (from 10 PC patients and 10 healthy volunteers) were analyzed using Label free analysis following the depletion of high abundance proteins. Te tryptic peptides were desalted using C18 tips, dried and re-suspended in 0.1% formic acid. Te peptides were resolved by reverse-phase chro- matography on 0.075 × 200-mm fused silica capillaries (J&W) packed with Reprosil reversed phase material (Dr Maisch GmbH, Germany). Te peptides were eluted as described above. A wash run and one blank injection were performed between the samples to make sure there was no cross contamination­ 7. Te MS data was analyzed using MaxQuant 1.2.2.5 sofware (Mathias Mann’s group) searching against the human section of the Uniprot database and quantifed by label free analysis using the same sofware. Statistical analysis was done using Perseus sofware (Mathias Mann’s group).

Scientifc Reports | (2020) 10:21995 | https://doi.org/10.1038/s41598-020-78922-x 3 Vol.:(0123456789) www.nature.com/scientificreports/

Bio‑statistical analysis. Dr. Yoav Smith (Head of the Genomic Data Analysis Unit, Te Hebrew University, Jerusalem) was our consultant for the analysis. Briefy, label-free qMS results were initially analyzed utilizing Matlab sofware R2013a (Te MathWorks, Inc. USA). Data was then presented in a Volcano plot using the verti- cal axis for the p-values and the horizontal axis for the log 2 ratio values. By using a threshold of less than 0.05 for the p-values, and a fold change of + or − 2 for the absolute log 2 ratios, proteins with the largest statistically signifcant expression change were chosen. Furthermore, for the combined protein group the predicted prob- ability for each subject was obtained and was used to construct receiver operating characteristic (ROC) curves. Te standard error of the area under the curve (AUC) value and the 95% confdence interval (CI) for the ROC curve were computed as previously described­ 22. Te sensitivity and specifcity for the combined biomarkers were estimated by identifying the cutof-point of the predicted probability that yielded the highest sum of sensitivity and specifcity. Results Te mean age of the 15 PC patients was 65.7 ± 13.24 years, and the mean age of the 16 healthy age-matched controls was 56.5 ± 3.3 years. Te average time from PC diagnosis to OF collection was ~ 7 months. 72% of the patients were diagnosed with stage IV and the rest with stage III. All the PC patients took medi- cations regularly, and their tendency to cause xerostomia was checked (supplementary data A), only 2 patients used medicines known to cause dry mouth in more than 10% of individuals. Te study was divided into sections: 1. Proteomic analysis on pooled samples using 2DE and dimethylation- qMS. 2. Analysis of individual samples using label-free qMS.

Dimethylation MS analysis of pooled PC and control samples. Dimethylation followed by LC– MS/MS of PC and control OF samples exposed 182 proteins (supplementary data B). 21 proteins showed an extended diferential profle with a 3 to 50-fold change in expression. 37 proteins had a 2 to threefold expression change (see Table 1 for details). Table 1A refers to publications implicating 19 of our 21 identifed proteins as bio- marker candidates for PC or other cancers. None of these proteins has ever been detected in OF of PC patients.

2DE and MS analysis of pooled PC and control samples. 2DE of pooled triple-depleted OF samples from healthy controls (Fig. 1A) and PC patients (Fig. 1B) was performed. PDQuest analysis revealed 360 protein spots, and 72 had an expression change of more than threefold. 15 spots with expression changes greater than fvefold were chosen for MS analysis. Only spots identifed in both maps were further analyzed by MS (sup- plementary data C). Of the twenty proteins identifed, 12 were newly identifed; Ig kappa chain V-I region AG (P01593), Ig kappa chain V–I region DEE (P01597), Polymeric immunoglobulin receptor P01833, Ig alpha-1 chain C region (P01876), Cystatin-B (P04080), Protein disulfde-isomerase (P07237), Leukocyte elastase inhibi- tor (P30740), Beta-2-microglobulin (P61769), Fatty acid-binding protein, epidermal (Q01469), Serpin(Q9UIV8), Tumor necrosis factor ligand superfamily member 13B (Q9Y275), IgGFc-binding protein (Q9Y6R7). Of the 8 proteins also found in the qMS results, 5 had a similar trend; Ig kappa chain C region (P01834), Ig mu chain C region (P01871), Serum albumin (P02768), Leukotriene A-4 hydrolase (P09960), Hemoglobin subunit beta (P68871). Te remaining 3 showed an opposite trend; Ig kappa chain V-III region SIE (P01620), Zinc-alpha- 2-glycoprotein (P25311), Hemoglobin subunit beta (P68871) and Lipocalin-1 (P31025).

Label free qMS on individual samples. Tis extensive examination led to the identifcation of 480 pro- teins. MS results show the relative expression profle of the proteins in each sample. An average expression ratio was calculated for each protein. 71 proteins were down regulated by more than twofold in PC samples, among them 34 by more than threefold. 92 proteins were up regulated by more than twofold, out of them 46 by more than threefold. Te subsequent statistical analysis (t test, p value < 0.05), showed 39 proteins with an average change in expression profle of more than twofold. Te proteins were grouped according to the number of subjects in which they were found; less than 6 subjects and more than 6 subjects. For example, S100-A9 was found in OF samples of all subjects, and decreased signifcantly (p < 0.05) by more than threefold in PC patients [Table 2, Fig. 2A]. From the 39 statistically signifcant highly diferentiated proteins, 8 had similar trends to those noted in the pooled sample results, including; Glyceraldehyde-3-phosphate dehydrogenase (P04406), S100-A8 (P05109), S100-A9 (P06702), Disulfde-isomerase (P07237), Zinc-alpha-2-glycoprotein (P25311), Cornulin (Q9UBG3), Apolipoprotein A-I (P02647), L-lactate dehydrogenase B chain (P07195). Interestingly, Zinc-alpha-2-glycoprotein (P25311), showed an increased expression profle in the individual qMS whereas the in the results of the qMS of pooled samples it showed an opposite trend. Another controversial protein was Lipocalin-1 (P31025) in which the individual MS supported the results of the 2DE showing an aver- age increase of more than 3.5-fold in PC patients, but the changes in the MS were not statistically signifcant.

Bio‑statistical analysis. In order to determine a short panel of discriminative biomarkers, label free qMS results were bio-statistical analyzed utilizing Matlab sofware R2013a (Te MathWorks, Inc, USA). Data was presented in a Volcano plot using the vertical axis (Fig. 2B). Te Biostatistical analysis revealed fve highly discriminative proteins; -14 (P02533), Lactoper- oxidase (P22079), Cytokeratin-16 (P08730), Cytokeratin-17 (Q04695) and Peptidyl-prolyl cis–trans isomerase B (P23284). To further examine the clinical utility of this combination of biomarkers for PC detection, an ROC curve was built. Tis model yielded a ROC-plot AUC value of 0.910 (95% CI, 0.714 to 1.000; p < 0.000001) with

Scientifc Reports | (2020) 10:21995 | https://doi.org/10.1038/s41598-020-78922-x 4 Vol:.(1234567890) www.nature.com/scientificreports/

Previously identifed Protein Matched Average Ratio PC Other cancer Serial no identifcation Accession no MW (Da) peptides PC/C Sample origin Not Identifed Not Identifed biomarkers biomarkers A HNSCC*, 1 Histone H4 P62805 11,360 3 0.02 [43, 44] Saliva Histone H2B Pancreatic 2 P33778 13,942 2 0.03 [45] type 1-B tumor tissue 6-phospho- gluconate 3 dehydrogenase, P52209 53,106 2 0.04 FNA of PTC** [46] decarboxylat- ing Basic salivary proline-rich 4 P02812 40,775 2 0.05 Saliva [44] protein 2 precursor Histone H2B 5 Q96A08 14,159 2 0.06 HNSCC* [43] type 1-A Azurocidin 6 P20160 26,869 3 0.07 x x precursor Apolipoprotein Biological 7 P02647 30,759 7 0.08 [27] A-I precursor sample Alpha-amylase 8 P04745 57,731 31 0.16 Saliva [44] 1 precursor Myeloperoxi- 9 P05164 83,815 9 0.16 Blood [28] dase precursor Human Protein 10 P05109 10,828 6 0.19 Pancreatic [30] S100-A8 Cell-line Transthyretin 11 P02766 15,877 6 0.22 Serum [29] precursor Human Lipocalin-1 12 P31025 19,238 12 0.23 Pancreatic [30] precursor Cell-line Protein Biological 13 P06702 13,234 7 0.24 [47] [44] S100-A9 sample, Saliva Short palate, lung and nasal epithelium 14 carcinoma- Q96DR5 26,995 6 0.24 Saliva [48] associated protein 2 precursor Pancre- Hemoglobin 15 P69905 15,248 9 0.25 atic tumor [45] [44] subunit alpha tissue,Saliva Small proline- 16 P35326 7960 3 0.25 x x rich protein 2A Hemoglobin Tissue and 17 P02042 16,045 2 0.26 [49] subunit delta Serum pancreatic 18 Transketolase P29401 67,835 11 3.18 [31] ductal tissue , type I Pancreatic 19 P13645 59,475 2 4.57 [50] cytoskeletal 10 cancer tissue Hemopexin 20 P02790 51,643 13 4.99 Plasma, Saliva [32] [44] precursor Alpha-2-mac- 21 roglobulin P01023 163,174 41 8.06 Plasma, Saliva [44,51] precursor avg Ratio Protein (Patient/ Serial No Identifcation Accession No MW (Da) Peptides No Healthy) B Fibrinogen 1 alpha chain P02671 94,914 5 0.31 precursor Serum albumin 2 P02768 69,322 14 0.31 precursor Hemoglobin 3 P68871 15,988 20 0.33 subunit beta Continued

Scientifc Reports | (2020) 10:21995 | https://doi.org/10.1038/s41598-020-78922-x 5 Vol.:(0123456789) www.nature.com/scientificreports/

avg Ratio Protein (Patient/ Serial No Identifcation Accession No MW (Da) Peptides No Healthy) Vitamin 4 D-binding pro- P02774 52,929 5 0.34 tein precursor Complement 5 P01024 187,029 27 0.35 C3 precursor Alpha-1B-gly- 6 coprotein P04217 54,239 5 0.35 precursor Alpha-1-acid 7 glycoprotein 1 P02763 23,497 8 0.37 precursor , cytoplas- 8 P60709 41,710 12 0.38 mic 1 L-lactate 9 dehydrogenase P07195 36,615 2 0.39 B chain Leukotriene 10 P09960 69,241 2 0.40 A-4 hydrolase Fibrinogen 11 gamma chain P02679 51,479 6 0.40 precursor 12 Involucrin P07476 68,427 2 0.44 Metalloprotein- 13 ase inhibitor 1 P01033 23,156 2 0.46 precursor Glyceral- dehyde- 14 P04406 36,030 8 0.48 3-phosphate dehydrogenase Fibrinogen beta 15 P02675 55,892 5 0.49 chain precursor Protein- glutamine 16 gamma-gluta- Q08188 76,584 6 0.49 myltransferase E precursor Beta-2-gly- 17 coprotein 1 P02749 38,273 3 0.49 precursor Keratin, type I 18 P13646 49,555 3 0.49 cytoskeletal 13 Ig alpha-1 19 P01876 37,631 27 0.54 chain C region Serotransferrin 20 P02787 77,000 56 0.54 precursor 21 P08670 53,619 5 0.54 Alpha-1-acid 22 glycoprotein 2 P19652 23,588 2 0.54 precursor Desmoglein-1 23 Q02413 113,644 3 0.57 precursor Ig kappa chain 24 P01834 11,602 14 0.58 C region Zinc-alpha- 25 2-glycoprotein P25311 33,851 32 0.58 precursor 26 Cornulin Q9UBG3 53,502 4 0.58 Phosphoglycer- 27 P18669 28,786 2 0.58 ate mutase 1 Ig gamma-1 28 P01857 36,083 11 0.58 chain C region Ig heavy chain V-III region 29 P01764 12,574 6 0.58 VH26 precur- sor Complement 30 factor B pre- P00751 85,479 2 0.59 cursor Continued

Scientifc Reports | (2020) 10:21995 | https://doi.org/10.1038/s41598-020-78922-x 6 Vol:.(1234567890) www.nature.com/scientificreports/

avg Ratio Protein (Patient/ Serial No Identifcation Accession No MW (Da) Peptides No Healthy) Aldehyde dehydrogenase, 31 P30838 50,347 3 0.60 dimeric NADP- preferring Cystatin-S 32 P01036 16,204 10 1.91 precursor Ig kappa chain 33 V-III region P01620 11,768 6 1.94 SIE Keratin, type II 34 P04264 65,978 4 2.15 cytoskeletal 1 Keratin, type II 35 cytoskeletal 2 P35908 65,825 4 2.29 epidermal Alpha- 36 P12814 102,993 4 2.49 -1 Prolactin- 37 inducible pro- P12273 16,562 10 2.59 tein precursor

Table 1. Proteins identifed by Dimethylation MS analysis of pooled PC and control samples. A. Highly diferentiated expression profle (above threefold). B 2 to threefold expression profle diferences. *NSCC— Human Head-and-Neck Squamous Cell Carcinomas tissue; * FNA of PTC—Fine Needle Aspiration of Papillary Tyroid Cancer.

Figure 1. Silver-stained 2DE gels of pooled oral fuid samples afer triple depletion (100 µg). (A) Control group and (B) PC group. Numbered spots were found with an OD change above fvefold (PDQuest sofware, Bio-Rad, USA) and identifed by MS.

90.0% sensitivity and 90.0% specifcity in diferentiating PC patients from healthy subjects (Fig. 2C). In other words, 18 out of 20 OF samples showed true positive or true negative results, based on the combined biomarker examination. Discussion Pancreatic cancer (PC) is an aggressive cancer and ranks third in cancer mortality in Israel and 8th worldwide­ 2,23,24. Most PC are diagnosed at a late stage demonstrating the need to establish a simpler, non-invasive, cost efective screening tool for PC such as oral fuids (OF).

Scientifc Reports | (2020) 10:21995 | https://doi.org/10.1038/s41598-020-78922-x 7 Vol.:(0123456789) www.nature.com/scientificreports/

Serial no Protein description Accesion no MW (kDa) PC/H P value No. of PC samples analysed No. of Healthy samples analysed A 1 P19013 63.91 0.10 0.04 6 8 2 Keratin, type I cytoskeletal 17 Q04695 48.11 0.25 0.03 8 10 3 Protein S100-A8 P05109 10.83 0.25 0.04 10 10 4 S100-A9 P06702 13.24 0.28 0.05 10 10 5 Glyceraldehyde-3-phosphate dehydrogenase P04406 36.05 0.35 0.05 9 10 6 Cornulin O Q9UBG3 53.53 0.38 0.02 9 9 7 Keratin, type I cytoskeletal 16 P08779 51.27 0.42 0.01 10 10 8 Ubiquitin thioesterase Q9UGI0 80.97 0.45 0.02 7 7 9 Keratin, type I cytoskeletal 14 P02533 51.62 0.50 0.00 10 10 10 keratin complex 1, acidic A2A5Y0 47.12 0.50 0.05 5 7 11 Keratin, type II cytoskeletal 5 P13647 62.38 0.60 0.04 10 10 12 Zinc-alpha-2-glycoprotein P25311 34.26 1.63 0.03 10 10 13 Ig mu heavy chain disease protein P04220 43.06 2.63 0.03 6 7 14 Leucine-rich alpha-2-glycoprotein P02750 38.18 3.63 0.05 10 9 15 Protein disulfde-isomerase P07237 57.12 4.63 0.04 8 9 16 Cystatin-C P01034 15.8 5.63 0.05 10 9 17 Kallikrein-6 Q92876 26.86 6.63 0.03 8 8 18 Tioredoxin domain-containing protein Q9BRA2 13.94 7.63 0.02 8 8 19 Lactoperoxidase P22079 80.29 8.63 0.01 10 9 20 Zymogen granule protein 16 homolog B Q96DA0 22.74 9.63 0.03 10 10 B 1 Beta-actin-like protein 2 Q562R1 42 0.10 0.04 2 4 2 Apolipoprotein A-I P02647 30.78 0.15 0.05 4 6 3 Purine nucleoside phosphorylase P00491 32.12 0.20 0.05 5 4 4 Annexin A1 P04083 38.71 0.30 0.03 5 6 5 L-lactate dehydrogenase B chain P07195 36.64 0.35 0.04 4 6 6 Keratin, type II cytoskeletal 75 O95678 59.5 0.40 0.04 5 6 7 Ig lambda chain V-I region NEW P01701 11.45 0.45 0.04 4 5 8 Keratin, type II cytoskeletal 1b Q6IFZ6 61.36 0.57 0.05 2 3 Neuroblast diferentiation-associated protein 9 Q09666 629.1 1.94 0.00 4 6 AHNAK 10 Cathepsin S P25774 37.5 2.14 0.04 4 3 11 Cation channel sperm-associated protein 3 Q86XQ3 46.42 2.31 0.01 5 5 12 -specifc chaperone A O75347 12.86 2.41 0.05 3 4 13 Ribonuclease T2 O00584 29.48 2.42 0.03 5 5 14 Costars family protein Q9P1F3 9.056 2.92 0.05 5 4 15 Dipeptidyl peptidase 1 P53634 51.85 3.12 0.03 5 4 16 Ig kappa chain V-III region HAH P18135 14.07 3.34 0.03 5 5 17 heavy chain 10, axonemal Q8IVF4 514.8 4.67 0.04 4 5 18 Calcium-activated chloride channel regulator 4 Q14CN2 101.3 5.59 0.04 5 4 19 Proline-rich protein 4 Q16378 15.1 9.60 0.04 4 3

Table 2. Individual sample analysis (n = 20) by label free qMS. A. Proteins identifed in at least 6 control and PC subjects with an average diferential expression (P < 0.05). B. Proteins with an average diferential expression (P < 0.05), with no minimum number of subjects.

Proteomic analysis of pooled OF samples. Tis is the frst study (to our knowledge) characterizing the OF proteome of PC patients. Te biomarker candidates identifed in our pooled OF samples were compared to previous proteomic studies from other tissues or body fuids. Table 1A summarizes 19 proteins out of 21 with more than threefold changes in expression that were considered as potential biomarkers, details of seven of these proteins are presented below:

i. Histones (P62805, P33778, Q96A08) are strongly alkaline proteins which package and organize the DNA into structural units called nucleosomes. Autoantibodies to this protein found in the serum of PC patients have been suggested as potential ­biomarkers25,26.

Scientifc Reports | (2020) 10:21995 | https://doi.org/10.1038/s41598-020-78922-x 8 Vol:.(1234567890) www.nature.com/scientificreports/

Figure 2. (A) Graphical illustrations of 20 proteins with signifcantly increased expression (p < 0.05) afer normalization, found in at least 6 subjects per group. (B). Volcano plot. Red asterisks represent fve proteins with the largest statistically signifcant changes in expression. (C). ROC curve utilizing fve biomarkers (P02533, P22079, P08730, Q04695 and P23284) yielded an AUC value of 0.910, with 90.0% sensitivity and 90.0% specifcity.

ii. Apolipoprotein A-I precursor has a specifc role in lipid metabolism. It is the major component of high- density lipoprotein in plasma and has recently been patented for early diagnosis, screening, therapeutic follow-up and prognosis, as well as diagnosis of relapse of colorectal cancer­ 27. iii. Myeloperoxidase is an important factor infuencing oxygen dependent mechanisms of pathogen destruc- tion. A signifcant decrease in the activity of myeloperoxidase has been found in the neutrophils of PC ­patients28. iv. Transthyretin precursor is a serum and cerebrospinal fuid carrier of the thyroid hormone thyroxine (T4) and retinol. Its expression was signifcantly lower (7.9-fold) in the serum of PC patients­ 29. v. Lipocalin-1 and Protein S100-A8 were down regulated in PC versus non-neoplastic ductal cells by stable isotope labeling with amino acids in cell ­culture30.

Scientifc Reports | (2020) 10:21995 | https://doi.org/10.1038/s41598-020-78922-x 9 Vol.:(0123456789) www.nature.com/scientificreports/

vi. Transketolase is up regulated in PC cells compared to healthy pancreatic ducts (3.66-fold increase com- pared to the 3.18-fold increase we found in OF)31. vii. Hemopexin is the highest afnity heme binding protein, protecting the body from the oxidative damage that free heme can cause. Tis protein has been consistently associated with tumors­ 30.

Partial overlap between the two-proteomic screening approaches; 2DE and dimethylation qMS demonstrated the importance of employing diferent proteomic strategies to maximize identifcation abilities. Te disadvan- tages of 2DE as a proteomic method including: spots containing more than one protein; limited dynamic range imposed by the gel method; difculty with hydrophobic proteins; inability to detect proteins with extreme molecular weights and pI values, have been previously ­described30. In order to overcome these limitations, multiple detection methods were used. Furthermore, when a discrepancy was noted between the methods, the label-free qMS on individual samples supported the results of the 2DE upon dimethylation qMS. Nevertheless, the need for extensive individual proteomic analyses and validation is clear.

Bioinformatic analysis. Up and down regulated biomarker candidates were analyzed and clustered accord- ing to their molecular and biological functions using David-Kegg Bioinformatics ­Resources32. Te expression of 32 proteins increased and 65 had lower levels (> twofold change). Te main functional and molecular groups included; signal peptides, glycosylation processes and protease activity (Fig. 3A). Tese fnding are in accordance with extensive bioinformatic analysis of PC biomarker candidates from tumor tissue or patient serum ­samples33. Further analysis utilizing "String" bioinformatics website (http://strin​g-db.org/) to explore protein–protein inter- action strength revealed four clustered functional groups, including; tissue homeostasis, regulation of biological quality, peptidase regulation activity and extra cellular exosome (Fig. 3B). In this study 25 out of 32 candidate biomarkers were exosomal proteins. Tis, most interestingly, is in full agreement with a study by Lau et al. discussing the role of tumor-derived exosomes in OF biomarker development­ 34. Te authors, however, focused on the infuence of pancreatic exosomes on OF biomarker devel- opment, while the role of the exosomes in the targeted organs remained ambiguous. A partial explanation may be that exosomes not only transport messenger molecules from the pancreas to the salivary glands, but also deliver biomarkers to OF. Whether these are the original pancreatic exosomes or newly secreted vesicles from the salivary glands, should be examined further. Similarly, an in vitro examination showed that breast cancer derived exosomes interact with the salivary glands and alter the composition of salivary gland cell-derived exosome-like macrovesicles in the transcriptome and ­proteome35. Because a solitary biomarker is unlikely to detect a particular cancer with high specifcity and sensitivity, we evaluated combinations of the identifed biomarkers using an ROC analysis. We calculated high ROC AUC values indicating that the predictive utility increased substantially, enabling the identifcation of a group of fve biomarker candidates. Tree Cytokeratin types (14, 16 and 17), involved in the regulation of cellular properties and functions, including apico-basal polarization, motility, cell size, protein synthesis and membrane trafc and signaling were selected. In many cases, their presence or absence has prognostic signifcance for cancer ­patients36. Te role of in pancreatic cancer and the ability to utilize them as biomarkers is widely discussed in the ­literature37,38. For example was proven to be a novel negative prognostic biomarker for pancreatic ­cancer39. Te remaining two proteins with elevated levels in OF of PC patients and included in our biomarker com- bination were Lactoperoxidase and Peptidyl-prolyl cis–trans isomerase B. Te latter is also called Cyclophilin B (CypB) and is a 21-kDa protein belonging to the cyclophilin family of peptidyl-prolyl cis–trans isomerase. It promotes alterations in protein conformation and infuences cell growth, proliferation, and motility­ 40. Enhanced expression of CypB in malignant breast epithelium may contribute to the pathogenesis of the disease­ 41. Moreover, elevated levels of CypB have been found in sera of PC patients and this protein has been suggested as a serum biomarker for ­PC42. Te comparison of pooled sample results to individual qMS analysis showed partial overlap. Approximately 33% of the proteins with the highest expression fold change and lowest p-value identifed in the individual sam- ples presented similar expression trends in pooled samples. Furthermore, CypB, one of the fve discriminative biomarkers found in the individual qMS analysis, was related to the down regulation of two S100 proteins. Both the pooled and individual qMS analysis showed decreased expression levels in these proteins. It was previously claimed that pooling serum samples may cause a ~ 50% loss of potential ­biomarkers43. Te results of the current study support this argument; yet also show the advantages of the pooling strategy as an initial step before performing extensive examinations on individual sam- ples. Pooled sample analysis enabled a relatively low-cost and rapid "proof of concept" examination. Clearly, vali- dation using individual samples is required to understand the diagnostic potential of the biomarker combination. Concluding remarks Enhanced proteomic characterization of the oral fuids of PC patients revealed a profle of diferentially expressed proteins. Bioinformatic analysis of OF was in accordance with previous studies of proteins expressed in PC in tissues, pancreatic juice or serum. Moreover, an extensive label free qMS analysis revealed a group of proteins, which may be used as a highly specifc, and sensitive OF based test for PC test. A larger study is required for A. Exploring the accuracy of the combined 5 biomarkers that were found in this study, utilizing diferent proteomic technology (e.g. Elisa, Western blot, lateral fow immunoassay etc.). B. validation and identifying high-risk groups in order to enable an early diagnosis, screening, therapeutic follow-up and prognosis and diagnosis of relapse in relation to PC using OF.

Scientifc Reports | (2020) 10:21995 | https://doi.org/10.1038/s41598-020-78922-x 10 Vol:.(1234567890) www.nature.com/scientificreports/

Figure 3. (A) David-Kegg Bioinformatics ­Resources32. Classifcation of proteins with increased expression according to their biological functions. Proteins with more than one biological function were counted multiple times. (B) "String" online database (http://string-db.org/​ ). Association network of overexpressed proteins in OF of PC patients.

Scientifc Reports | (2020) 10:21995 | https://doi.org/10.1038/s41598-020-78922-x 11 Vol.:(0123456789) www.nature.com/scientificreports/

Received: 1 March 2020; Accepted: 24 November 2020

References 1. Malvezzi, M. et al. European cancer mortality predictions for the year 2016 with focus on leukaemias. Ann. Oncol. 27, 725–731. https​://doi.org/10.1093/annon​c/mdw02​2mdw0​22 (2016). 2. Parkin, D. M., Bray, F., Ferlay, J. & Pisani, P. Global cancer statistics, 2002. CA Cancer J. Clin. 55, 74–108. https​://doi.org/10.3322/ canjc​lin55​/2/74 (2005). 3. Siegel, R., Naishadham, D. & Jemal, A. Cancer statistics, 2012. CA Cancer J. Clin. 62, 10–29. https​://doi.org/10.3322/caac.20138​ (2012). 4. Siegel, R., Naishadham, D. & Jemal, A. Cancer statistics, 2013. CA Cancer J. Clin. 63, 11–30. https​://doi.org/10.3322/caac.21166​ (2013). 5. Greabu, M. et al. Saliva–a diagnostic window to the body, both in health and in disease. J. Med. Life 2, 124–132 (2009). 6. Loo, J. A., Yan, W., Ramachandran, P. & Wong, D. T. Comparative human salivary and plasma proteomes. J. Dent Res. 89, 1016– 1023. https​://doi.org/10.1177/00220​34510​38041​4 (2010). 7. Krief, G. et al. Proteomic profling of whole-saliva reveals correlation between Burning Mouth Syndrome and the neurotrophin signaling pathway. Sci. Rep. 9, 4794. https​://doi.org/10.1038/s4159​8-019-41297​-9 (2019). 8. Segal, A. & Wong, D. T. Salivary diagnostics: enhancing disease detection and making medicine better. Eur. J. Dent. Educ. 12(Suppl 1), 22–29. https​://doi.org/10.1111/j.1600-0579.2007.00477​.x (2008). 9. Vitorino, R. et al. Identifcation of human whole saliva protein components using proteomics. Proteomics 4, 1109–1115. https​:// doi.org/10.1002/pmic.20030​0638 (2004). 10. Hu, S., Loo, J. A. & Wong, D. T. Human saliva proteome analysis. Ann. N. Y. Acad. Sci. 1098, 323–329. https​://doi.org/10.1196/ annal​s.1384.015 (2007). 11. Oppenheim, F. G., Salih, E., Siqueira, W. L., Zhang, W. & Helmerhorst, E. J. Salivary proteome and its genetic polymorphisms. Ann. N. Y. Acad. Sci. 1098, 22–50. https​://doi.org/10.1196/annal​s.1384.030 (2007). 12. Hu, S. et al. Salivary proteomics for oral cancer biomarker discovery. Clin. Cancer Res. 14, 6246–6252. https://doi.org/10.1158/1078-​ 0432.CCR-07-5037 (2008). 13. Hu, S. et al. Large-scale identifcation of proteins in human salivary proteome by liquid chromatography/mass spectrometry and two-dimensional gel electrophoresis-mass spectrometry. Proteomics 5, 1714–1728. https://doi.org/10.1002/pmic.20040​ 1037​ (2005). 14. Deutsch, O. et al. An approach to remove alpha amylase for proteomic analysis of low abundance biomarkers in human saliva. Electrophoresis 29, 4150–4157. https​://doi.org/10.1002/elps.20080​0207 (2008). 15. Krief, G. et al. Improved visualization of low abundance oral fuid proteins afer triple depletion of alpha amylase, albumin and IgG. Oral. Dis. 17, 45–52. https​://doi.org/10.1111/j.1601-0825.2010.01700​.x (2011). 16. Krief, G. et al. Comparison of diverse afnity based high-abundance protein depletion strategies for improved bio-marker discovery in oral fuids. J. Proteom. 75, 4165–4175. https​://doi.org/10.1016/j.jprot​.2012.05.012 (2012). 17. Streckfus, C. & Bigler, L. Te use of soluble, salivary c-erbB-2 for the detection and post-operative follow-up of breast cancer in women: the results of a fve-year translational research study. Adv. Dent. Res. 18, 17–24 (2005). 18. Aframian, D. J., Davidowitz, T. & Benoliel, R. Te distribution of oral mucosal pH values in healthy saliva secretors. Oral Dis. 12, 420–423. https​://doi.org/10.1111/j.1601-0825.2005.01217​.x (2006). 19. Bradford, M. M. A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal. Biochem. 72, 248–254. https​://doi.org/10.1016/0003-2697(76)90527​-3 (1976). 20. Bjellqvist, B., Pasquali, C., Ravier, F., Sanchez, J. C. & Hochstrasser, D. A nonlinear wide-range immobilized pH gradient for two- dimensional electrophoresis and its defnition in a relevant pH scale. Electrophoresis 14, 1357–1365 (1993). 21. Beer, I., Barnea, E., Ziv, T. & Admon, A. Improving large-scale proteomics by clustering of mass spectrometry data. Proteomics 4, 950–960. https​://doi.org/10.1002/pmic.20030​0652 (2004). 22. Zhang, L. et al. Salivary transcriptomic biomarkers for detection of resectable pancreatic cancer. Gastroenterology 138, 949–957. https​://doi.org/10.1053/j.gastr​o.2009.11.010 (2010). 23. Jemal, A. et al. Cancer statistics, 2009. CA Cancer J. Clin. 59, 225–249. https​://doi.org/10.3322/caac.20006​ (2009). 24. Jemal, A., Siegel, R., Xu, J. & Ward, E. Cancer statistics, 2010. CA Cancer J. Clin. 60, 277–300. https​://doi.org/10.3322/caac.20073​ (2010). 25. Patwa, T. H. et al. Te identifcation of phosphoglycerate kinase-1 and histone H4 autoantibodies in pancreatic cancer patient serum using a natural protein microarray. Electrophoresis 30, 2215–2226. https​://doi.org/10.1002/elps.20080​0857 (2009). 26. Kamei, M. et al. Serodiagnosis of cancers by ELISA of anti-histone H2B antibody. Biotherapy 4, 17–22 (1992). 27. 27Ataman-Onal, Y., Charrier, J. P., Choquet-Kastylevsky, G. & Poirier, F. (Google Patents, 2014). 28. Geetha, A., Jeyachristy, S. A. & Surendran, R. Assessment of immunity status in patients with pancreatic cancer. J. Clin. Biochem. Nutr. 39, 18–26 (2006). 29. Yu, K. H., Rustgi, A. K. & Blair, I. A. Characterization of proteins in human pancreatic cancer serum using diferential gel electro- phoresis and tandem mass spectrometry. J. Proteome Res. 4, 1742–1751. https​://doi.org/10.1021/pr050​174l (2005). 30. Gronborg, M. et al. Biomarker discovery from pancreatic cancer secretome using a diferential proteomic approach. Mol. Cell Proteom. 5, 157–171. https​://doi.org/10.1074/mcp.M5001​78-MCP20​0 (2006). 31. Cui, Y. et al. Proteomic and tissue array profling identifes elevated hypoxia-regulated proteins in pancreatic ductal adenocarci- noma. Cancer Invest. 27, 747–755. https​://doi.org/10.1080/07357​90080​26727​46 (2009). 32. Huang, D. W., Brad, S. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID Bioinformatics Resources. Nat. Protoc. 4, 44–57 (2009). 33. Grutzmann, R. et al. Meta-analysis of microarray data on pancreatic cancer defnes a set of commonly dysregulated . Oncogene 24, 5079–5088. https​://doi.org/10.1038/sj.onc.12086​96 (2005). 34. Lau, C. et al. Role of pancreatic cancer-derived exosomes in salivary biomarker development. J. Biol. Chem. https://doi.org/10.1074/​ jbc.M113.45245​8 (2013). 35. Lau, C. S. & Wong, D. T. Breast cancer exosome-like microvesicles and salivary gland cells interplay alters salivary gland cell-derived exosome-like microvesicles in vitro. PLoS ONE 7, e33037. https​://doi.org/10.1371/journ​al.pone.00330​37 (2012). 36. Karantza, V. in health and cancer: more than mere epithelial cell markers. Oncogene 30, 127–138 (2011). 37. Goldstein, N. S. & Bassi, D. Cytokeratins 7, 17, and 20 reactivity in pancreatic and ampulla of vater adenocarcinomas. Percentage of positivity and distribution is afected by the cut-point threshold. Am. J. Clin. Pathol. 115, 695–702. https://doi.org/10.1309/1NCM-​ 46QX-3B5T-7XHR (2001). 38. Poruk, K. E. et al. Circulating tumor cells expressing markers of tumor-initiating cells predict poor survival and cancer recurrence in patients with pancreatic ductal adenocarcinoma. Clin. Cancer Res Ofc. J. Am. Assoc. Cancer Res. 23, 2681–2690. https​://doi. org/10.1158/1078-0432.CCR-16-1467 (2017). 39. Roa-Pena, L. et al. Keratin 17 identifes the most lethal molecular subtype of pancreatic cancer. Sci. Rep. 9, 11239. https​://doi. org/10.1038/s4159​8-019-47519​-4 (2019).

Scientifc Reports | (2020) 10:21995 | https://doi.org/10.1038/s41598-020-78922-x 12 Vol:.(1234567890) www.nature.com/scientificreports/

40. Price, E. R. et al. Human cyclophilin B: a second cyclophilin gene encodes a peptidyl-prolyl isomerase with a signal sequence. Proc. Natl. Acad. Sci. USA 88, 1903–1907 (1991). 41. Fang, F., Flegler, A. J., Du, P., Lin, S. & Clevenger, C. V. Expression of cyclophilin B is associated with malignant progression and regulation of genes implicated in the pathogenesis of breast cancer. Am. J. Pathol. 174, 297–308. https​://doi.org/10.2353/ajpat​ h.2009.08075​3 (2009). 42. Ray, P., Rialon-Guevara, K. L., Veras, E., Sullenger, B. A. & White, R. R. Comparing human pancreatic cell secretomes by in vitro aptamer selection identifes cyclophilin B as a candidate pancreatic cancer biomarker. J. Clin. Invest. 122, 1734–1741. https​://doi. org/10.1172/JCI62​385 (2012). 43. Sadiq, S. T. & Agranof, D. Pooling serum samples may lead to loss of potential biomarkers in SELDI-ToF MS proteomic profling. Proteome Sci. 6, 16 (2008). Acknowledgements Tere were no funding and no publishable conficts of interest for this work. Author contributions D.O, K.G, Z.B, W.R, L.O, N.S, Y.R and P.A made substantial contributions to the study’s conception and design, acquisition of data, lab working and analysis and interpretation of data; S.S, D.O and W.R were in charge on sample collection. H.Y, A.JD, D,O and L.O drafed the submitted article and provided critical interpretation. H.Y, D.O and K.N wrote, revised and approved the manuscript. All authors discussed the results and commented on the manuscript.

Competing interests Te authors declare no competing interests. Additional information Supplementary Information Te online version contains supplementary material available at (https​://doi. org/10.1038/s4159​8-020-78922​-x). Correspondence and requests for materials should be addressed to A.P. Reprints and permissions information is available at www.nature.com/reprints. Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional afliations. Open Access Tis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. Te images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat​iveco​mmons​.org/licen​ses/by/4.0/.

© Te Author(s) 2020

Scientifc Reports | (2020) 10:21995 | https://doi.org/10.1038/s41598-020-78922-x 13 Vol.:(0123456789)