Supplemental Experimental Procedures and Metabolite Spectra
Total Page:16
File Type:pdf, Size:1020Kb
SUPPLEMENTAL EXPERIMENTAL PROCEDURES AND METABOLITE SPECTRA Tissue Collection and Human Breast Cancer Cell Lines. Patients were recruited in Baltimore hospitals between 1993 and 2003, as previously described (1). They were identified through surgery lists and enrolled into the study prior to surgery. Tumor and adjacent tissue was collected at time of surgery under an established protocol. Samples of macrodissected tumor tissue and adjacent non‐cancerous tissue were prepared by a pathologist immediately after surgery and were frozen within 15 min. Tissue quality of the frozen samples, as judged by RNA integrity analysis, was excellent. Specimens were further evaluated on H/E slides and classification as tumor and non‐cancerous tissue was confirmed. Clinical and pathological information (e.g., estrogen and progesterone receptor status) was obtained from medical records and pathology reports. Tumor estrogen receptor (ER) status was determined at the Department of Pathology, University of Maryland, consistent with guidelines for clinical laboratories to evaluate semi‐quantitatively receptor expression in formalin‐fixed, paraffin‐ embedded tissue (“CONFIRM Estrogen Receptor” assay by Ventana Medical Systems, Tucson, AZ). The immunohistochemical staining protocol for HER2 followed the DAKO HercepTestTM protocol. Tumors were classified as HER2‐positive when the HercepTest™ immunohistochemistry score was 3 or when the test score was 2 and the tumor contained a HER2‐enriched gene signature based on gene expression profiling according to published criteria (2;3). In the validation cohort, the tumor HER2 status was determined as previously described (4). Triple‐negative tumors were negative for estrogen, progesterone, and HER2 receptor expression. Tumors were classified as basal‐like based on their gene expression profiles using the PAM50 classifier (2) and/or immunohistochemistry (ER‐negative, HER2‐ negative, cytokeratin 5/6‐positive or EGFR‐positive) according to published criteria (5). All patients signed a consent form and completed an interviewer‐administered questionnaire that evaluated socio‐economic variables as part of a larger survey. Combined annual household income before taxes and deductions was collected as unknown, under $15,000, between $15,000 and $60,000, and above $60,000. Education information was collected as highest grade or level of schooling and how many years of school were completed. Self‐reported race/ethnicity was collected as Black (not of Hispanic origin) for African‐Americans and White (not of Hispanic origin) for European‐Americans. Disease staging was performed according to the tumor–node–metastasis (TNM) system of the American Joint Committee on Cancer/ the Union Internationale Contre le Cancer (AJCC/UICC). The Nottingham system was used to determine the tumor grade. Genetic Ancestry Estimation. Genomic DNA was isolated from either non‐cancerous tissue or a blood sample and was genotyped for ancestry informative markers (AIMs). In the discovery cohort, 105 AIMs were genotyped to determine the proportion of European, West African, and Native American genetic ancestry (6). Genotyping was performed using the Sequenom MassARRAY iPLEX platform (Sequenom, San Diego, CA). Individual ancestry estimates were obtained from the genotype data using the Markov Chain Monte Carlo (MCMC) method implemented in the program STRUCTURE 2.1 (7). Our model assumed admixture using prior population information and independent allele frequencies. The MCMC model was run using K = 3 populations (58 Europeans, 67 Native Americans and 62 West Africans) and a burn‐in length of 30,000 iterations followed by 70,000 replications. Ancestry estimates were also available for 2 39 of the African‐Americans in the validation cohort. Here, African ancestry was determined with a set of 29 AIMs, as described previously (8). Both methods yielded very similar average African ancestry estimates for the African‐American women in this study (83.2% and 81.8%). Metabolome Analysis. Metabolomic profiling of fresh‐frozen bulk human breast tissues was performed for two tissue sets. The amount of sample ranged from 33 mg to 109 mg and 47 mg to 109 mg for each set. The discovery set consisted of 67 tumors and 65 non‐cancerous tissues (65 tissue pairs). Thirty‐four of the tumors were ER‐positive and 33 were ER‐negative. Patient characteristics for this tissue set are described in Supplementary Table 1. The second tissue set consisted of 70 ER‐negative tumors and 36 non‐cancerous tissues (36 tissue pairs). Patient characteristics for this validation set are described in Supplementary Table 3. Frozen tissue from 27 tumors and 19 adjacent non‐cancerous tissues (total 19 tissue pairs) were analyzed in both the discovery and validation set. Examples of metabolite spectra from these analyses are provided following the reference list for the experimental procedures. Untargeted metabolic profiling of known and unknown metabolites in the discovery set was performed by Metabolon, Inc., as previously described (9‐12). Here, sample preparation was conducted using a proprietary series of organic and aqueous extractions for maximum recovery of small molecules. Median relative standard deviation (RSD) as a measure of instrument and total process variability was 6% and 12%, respectively, for our profiling study. Sample preparation at Metabolon: Samples were prepared using the automated MicroLab STAR® system from Hamilton Company (Reno, NV). A recovery standard was added prior to the first step in the extraction process for quality control (QC) purposes. Sample preparation was 3 conducted using aqueous methanol extraction process to remove the protein fraction while allowing maximum recovery of small molecules. The resulting extract was divided into four fractions: one for analysis by Ultrahigh Performance Liquid Chromatography/Mass Spectroscopy (UPLC/MS/MS) (positive mode), one for UPLC/MS/MS (negative mode), one for Gas Chromatography/Mass Spectroscopy (GC/MS), and one for backup. Samples were placed briefly on a TurboVap® (Zymark) to remove the organic solvent. Each sample was then frozen and dried under vacuum. Samples were then prepared for the appropriate instrument, either UPLC/MS/MS or GC/MS. UPLC/MS/MS‐based analysis: The LC/MS portion of the platform was based on a Waters ACQUITY UPLC and a Thermo‐Finnigan linear trap quadrupole mass spectrometer, which consisted of an electrospray ionization source and linear ion‐trap mass analyzer. The sample extract was dried then reconstituted in acidic or basic LC‐compatible solvents, each of which contained 8 or more injection standards at fixed concentrations to ensure injection and chromatographic consistency. One aliquot was analyzed using acidic positive ion optimized conditions and the other using basic negative ion optimized conditions in two independent injections using separate dedicated columns. Extracts reconstituted in acidic conditions were gradient eluted using water and methanol containing 0.1% formic acid, while the basic extracts, which also used water/methanol, contained 6.5 mM ammonium bicarbonate. The MS analysis alternated between MS and data‐dependent MS2 scans using dynamic exclusion. GC/MS‐based analysis: The samples destined for GC/MS analysis were re‐ dried under vacuum desiccation for a minimum of 24 hrs prior to being derivatized under dried nitrogen using bistrimethyl‐silyl‐triflouroacetamide. The GC column was 5% phenyl and the temperature ramp was from 40° to 300° C in a 16 minute period. Samples were analyzed on a 4 Thermo‐Finnigan Trace DSQ fast‐scanning single‐quadrupole mass spectrometer using electron impact ionization. The instrument was tuned and calibrated for mass resolution and mass accuracy on a daily basis. The information output from the raw data files was automatically extracted as discussed below. Quality assurance/quality control: For QA/QC purposes, additional samples were included with each day’s analysis. These samples included extracts of a pool of well‐characterized human plasma, extracts of a pool created from a small aliquot of the experimental samples, and process blanks. QC samples were spaced evenly among the injections and all experimental samples were randomly distributed throughout the run. A selection of QC compounds was added to every sample for chromatographic alignment, including those under test. These compounds were carefully chosen so as not to interfere with the measurement of the endogenous compounds. Data extraction and compound identification: Raw data was extracted, peak‐identified and QC processed using Metabolon’s hardware and software. These systems are built on a web‐service platform utilizing Microsoft’s NET technologies, which run on high‐performance application servers and fiber‐channel storage arrays in clusters to provide active failover and load‐balancing. Compounds were identified by comparison to library entries of purified standards or recurrent unknown entities. Metabolic profiling in the validation set was performed for a total of 108 metabolites at the Alkek Center for Molecular Discovery, Baylor College of Medicine and for 6 metabolites at the SAIC‐NCI Laboratory of Proteomics and Analytical Technologies. Here, tissues were homogenized in phosphate buffered saline (100 mg tissue/ml PBS) using a micro‐homogenizer from Omni International (Kennesaw, GA), and centrifuged (14,000 g/15 minutes). The supernatant was aliquoted and stored at ‐40° C until use. The pellet was collected for extraction 5 of lipid‐based metabolites. Extract aliquots