Research Article

Identification of Novel Alternative Splice Isoforms of Circulating Proteins in a Mouse Model of Human Pancreatic Cancer

Rajasree Menon,1 Qing Zhang,3 Yan Zhang,1 Damian Fermin,1 Nabeel Bardeesy,4 Ronald A. DePinho,5 Chunxia Lu,2 Samir M. Hanash,3 Gilbert S. Omenn,1 and David J. States1

1Center for Computational Medicine and Biology and 2Pediatric Endocrinology, University of Michigan, Ann Arbor, Michigan; 3Fred Hutchinson Cancer Research Center, Seattle, Washington; and 4Center for Cancer Research, Massachusetts General Hospital; 5Center for Applied Cancer Science, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts

Abstract

To assess the potential of tumor-associated, alternatively spliced gene products as a source of biomarkers in biological fluids, we have analyzed a large data set of mass spectra derived from the plasma proteome of a mouse model of human pancreatic ductal adenocarcinoma. MS/MS spectra were interrogated for novel splice isoforms using a nonredundant database containing an exhaustive three-frame translation of Ensembl transcripts and gene models from ECgene. This integrated analysis identified 420 distinct splice isoforms, of which 92 did not match any previously annotated mouse protein sequence. We chose seven of those novel variants for validation by reverse transcription–PCR. The results were concordant with the proteomic analysis. All seven novel peptides were successfully amplified in pancreas specimens from both wild-type and mutant mice. Isotopic labeling of cysteine-containing peptides from tumor-bearing mice and wild-type controls enabled relative quantification of the proteins. Differential expression between tumor-bearing and control mice was notable for peptides from novel variants of muscle pyruvate kinase, malate dehydrogenase 1, glyceraldehyde-3-phosphate dehydrogenase, proteoglycan 4, minichromosome maintenance, complex component 9, high mobility group box 2, and hepatocyte growth factor activator. Our results show that, in a mouse model for human pancreatic cancer, novel and differentially expressed alternative splice isoforms are detectable in plasma and may be a source of candidate biomarkers. [Cancer Res 2009;69(1):300–9]

Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).
Requests for reprints: David J. States, University of Michigan, 100 Washtenaw Avenue, Palmer Commons, Ann Arbor, MI 48109. Phone: 734-615-5510; Fax: 734-615-6553; E-mail: [email protected].
©2009 American Association for Cancer Research. doi:10.1158/0008-5472.CAN-08-2145
Cancer Res 2009;69(1). January 1, 2009. www.aacrjournals.org
6 http://tools.proteomecenter.org

Introduction

Alternative splicing plays an important role in protein diversity without significantly increasing genome size. Aberrations in alternative splice variants contribute to a number of diseases, including cancers (1, 2). For example, Thorsen and colleagues (2) identified cancer-specific splicing events in colon, bladder, and prostate tissues, with diagnostic and prognostic implications. The several alternatively spliced sequence databases now publicly available differ in their annotation and modeling methods and contain many transcripts not present in reference resources like Ensembl or Refseq (3). The ECgene database is one of the largest, alternatively spliced sequence databases (4). Entries in this database are scored as high, medium, or low confidence, reflecting the amount of cumulative evidence in support of the existence of a particular alternatively spliced sequence. Evidence is collected from clustering of ESTs, mRNA sequences, and gene model predictions. We modified the ECgene database to include three-frame translations of the cDNA sequences (5) to determine the occurrence of novel splice variant proteins. An important development in recent years is the substantial improvement in tandem mass spectrometry instrumentation for proteomics, allowing in-depth analysis and confident identifications even for proteins coded by mRNA transcript sequences expressed at low levels (6–8).

Pancreatic ductal adenocarcinoma (PDAC) is among the most lethal of human cancers due to absence of methods for early diagnosis and chemoresistance of advanced disease. The KrasG12D/Ink4a/Arf mouse model of PDAC was engineered with signature mutations that recapitulate the histopathologic progression of the human disease in a highly reproducible and synchronous fashion (9, 10). Here, we exploit this model to test the hypothesis that cancer-specific alternative splice variants can be identified by in-depth mass spectrometric analysis of plasma proteins from this mouse model of pancreatic cancer. In this study, we have interrogated our modified ECgene database to identify both novel and known splice variants among circulating proteins. In addition, our analysis of quantitative expression ratios reveals variant proteins that are differentially expressed in pancreatic cancer.

Materials and Methods

Mass Spectrometry Data

Our search for alternative spliced forms used data from extensive proteomic analysis of plasma from 7-wk-old male wild-type mice and KRasG12D/Ink4a-Arf mouse model of PDAC (10). Approximately one third of the Kras Ink4a/Arf mice present with the most common pathology observed in human cases: glandular. Plasma from Kras Ink4a/Arf mice with exclusively glandular tumor areas were used in this study. The samples were processed by the Intact Protein Fractionation and Analysis System (11) protocol, which incorporates immunodepletion to eliminate the most abundant plasma proteins, thus removing 90% of the protein mass. Immunodepletion was followed by isotopic labeling of protein cysteine residues with acrylamide, heavy (D3) for mutant and light (D0) for wild-type samples. The mass difference between a D3-labeled and a D0-labeled cysteine residue is 3.01884 Da. Samples from wild-type and PDAC-bearing mice were then pooled, and the intact proteins were fractionated into 12 anion exchange (AX) fractions followed by 13 or more reverse phase (RP) fractions for each AX fraction, yielding a total of 163 fractions. Individual fractions were digested with trypsin and analyzed using a ThermoFinnigan LTQ-FT mass spectrometer. Mass spectra from the LTQ-FT experiment were acquired as RAW files. The mzXML files containing the spectral information were extracted from RAW files using ReAdW.exe program.6 The mzXML files were then searched against a modified ECgene database (for alternate splice variant analysis) using X!Tandem (12) software.

Figure 1. Flow chart of multistep analysis of X!Tandem search results from intact protein analysis system (IPAS) MS/MS combining TPP and MPPI, leading to 92 novel alternative splice variants.

Modified ECgene Database

Alternative splicing can generate multiple transcripts from the same gene, which are translated to splice isoforms (also known as splice variant proteins). Our target alternative splice variant protein database, the modified ECgene database, was constructed by combining Ensembl 40 and ECgene database (mm8, build 1). Taking alternative splicing events into specific consideration, ECgene combines genome-based EST clustering and the transcript assembly procedure to construct gene models that encompass all alternative splicing events. The reliability of each isoform is assessed from the nature of cluster members and from the minimum number of clones required to reconstruct all exons in the transcript (13). The ECgene database contains a total of 417,643 splice variants. In the Ensembl 40 database, there is a total of 21,839 mouse genes with 28,110 transcripts, of which 10,922 alternative transcripts are derived from 4,651 genes.

Transcript sequences from the ECgene and Ensembl 40 databases were translated in three reading frames. Within each data set, the first instance of each protein sequence longer than 14 amino acids was recorded. The resulting proteins from both database translations were then combined and filtered for redundancy. For this [...] transcripts were preferentially recorded over those generated from ECgene records. A collection of common protein contaminant sequences was added to this set. Lastly, all sequences were reversed and appended to the set of forward sequences as an internal control for false identifications. This last step resulted in doubling the total number of entries in the modified ECgene database with a final total of 10,381,156 protein sequences.

Postsearch Analyses

Trans proteomic pipeline. For statistical purposes, the X!Tandem search results were postprocessed with PeptideProphet and ProteinProphet software using Trans proteomic pipeline (TPP; version 3.2).6 First, we analyzed search results from each fraction separately using PeptideProphet and then processed all PeptideProphet files together using ProteinProphet. Relative quantification of isotopically labeled peptides was performed using the Q3Ratio (14) and XPRESS (15) applications in TPP. The relative abundance of a peptide can be calculated by reconstructing the light and heavy elution profiles of the precursor ions and by determining the elution areas of each peak. From Q3Ratio and XPRESS analyses, we obtained the average expression ratio (mutant/wild type) for those peptides that were unique to the splice variant protein identified and had labeled cysteine
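As a rough illustration of the database construction described under Modified ECgene Database, the following Python sketch translates cDNA sequences in three forward frames, keeps the first instance of each protein sequence longer than 14 amino acids, and appends reversed sequences as decoys. The function names, the choice to split translations at stop codons, and the use of X for unrecognized codons are assumptions made for this example; they are not taken from the authors' pipeline. The contaminant sequences and the rule governing precedence between Ensembl and ECgene records are omitted.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

MIN_LENGTH = 15  # "longer than 14 amino acids"


def translate_frame(cdna, frame):
    """Translate one forward reading frame; '*' marks a stop codon."""
    cdna = cdna.upper().replace("U", "T")
    codons = (cdna[i:i + 3] for i in range(frame, len(cdna) - 2, 3))
    # Assumption: codons with ambiguous bases are written as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_proteins(cdna):
    """Yield stop-free segments of at least MIN_LENGTH residues."""
    for frame in range(3):
        # Assumption: translations are split at stop codons.
        for segment in translate_frame(cdna, frame).split("*"):
            if len(segment) >= MIN_LENGTH:
                yield segment


def build_target_decoy(transcripts):
    """Return forward sequences followed by their reversed decoys."""
    seen = set()
    targets = []
    for cdna in transcripts:
        for protein in three_frame_proteins(cdna):
            if protein not in seen:  # keep only the first instance
                seen.add(protein)
                targets.append(protein)
    decoys = [protein[::-1] for protein in targets]
    return targets + decoys
```

Because every retained forward sequence gets exactly one reversed counterpart, the returned list is twice as long as the set of forward sequences, consistent with the doubling of entries described in the text.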