Supplemental material, Ann Rheum Dis

BMJ Publishing Group Limited (BMJ) disclaims all liability and responsibility arising from any reliance placed on this supplemental material which has been supplied by the author(s).

Franks JM, et al. Ann Rheum Dis 2020; 79:1608–1615. doi: 10.1136/annrheumdis-2020-217033

Table of contents

List of Investigators
Supplemental Methods
Supplemental Figures
Supplemental Tables

List of Investigators

Steering Committee: Keith Sullivan, M.D. (Principal Investigator and NIH Contract Holder), Daniel Furst, M.D. and Peter McSweeney, M.D. (Protocol Co-chairs), Ellen Goldmuntz, M.D., Ph.D. (Medical Monitor), Lynette Keyes-Elstein, Dr. P.H. (Senior Statistical Scientist), Leslie Crofford, M.D., Richard Nash, M.D., Maureen Mayes, M.D.

DAIT, NIAID Program: Ellen Goldmuntz, M.D., Ph.D. (Medical Officer)

Statistical and Clinical Coordinating Center: Lynette Keyes-Elstein, Dr. P.H. (SACCC Principal Investigator), Ashley Pinckney, MS (Statistician)
Supplemental Methods

Study participants

SCOT trial participants fulfilled the 1980 American College of Rheumatology classification criteria for SSc in addition to the following criteria: age 18-69 years, disease onset within 5 years (from first non-Raynaud’s symptom), diffuse cutaneous involvement, and internal organ involvement of either pulmonary disease (DLco or FVC <70%) or prior scleroderma renal crisis. More detailed inclusion criteria are provided in the primary clinical manuscript (Sullivan et al 2018). Healthy controls were enrolled in the UT Houston Divisional Repository, did not have a systemic autoimmune disease, and were not first-degree relatives of an individual with SSc.

Collection of Peripheral Blood Mononuclear Cells

Whole blood samples were collected in Tempus tubes (Applied Biosystems, Foster City, California, USA) and stored at −80°C. RNA was extracted at Fisher according to the protocol attached at the end of the supplementary materials. RNA quality was assessed using a Bioanalyzer (Agilent Genomics, Santa Clara, California, USA), and samples with RNA integrity numbers >7 were examined.

Gene Expression Data Preprocessing

RNA was purified from baseline and longitudinal PBMC samples collected from SCOT trial participants. cRNA was hybridized to Agilent (Santa Clara, CA, USA) 8x60k SurePrint G3 Human Gene Expression Microarrays. Agilent Feature Extraction Image Analysis Software (Version 10.7.3) was used to extract data from raw microarray image files. Microarray data were log2-lowess normalized and filtered for probes with intensity ≥1.5-fold over local background in Cy3 or Cy5 channels. Expression values were multiplied by −1 to convert them to log2(Cy3/Cy5) ratios.
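As a minimal illustration of the background filter and sign flip described above, the following Python sketch may help. It is not the authors' pipeline. It assumes that the normalized values arrive as log2(Cy5/Cy3) ratios (implied by the sign flip), that a probe is kept when either channel passes the 1.5-fold threshold, and that failing probes are recorded as missing values. The function and argument names are hypothetical.

```python
import math


def filter_and_flip(log2_cy5_over_cy3, cy3, cy3_background,
                    cy5, cy5_background, min_fold=1.5):
    """Mask weak probes, then negate the ratio to obtain log2(Cy3/Cy5).

    All arguments are equal-length sequences with one entry per probe.
    Assumption: a probe passes if EITHER channel is at least `min_fold`
    times its local background; failing probes become NaN (missing).
    """
    result = []
    for ratio, g, g_bg, r, r_bg in zip(log2_cy5_over_cy3, cy3, cy3_background,
                                       cy5, cy5_background):
        passes = (g >= min_fold * g_bg) or (r >= min_fold * r_bg)
        # log2(Cy3/Cy5) = -log2(Cy5/Cy3), so multiplying by -1 swaps the
        # numerator and denominator of the ratio.
        result.append(-ratio if passes else math.nan)
    return result


# Hypothetical values for three probes (not taken from the study data).
flipped = filter_and_flip(
    log2_cy5_over_cy3=[1.0, -2.0, 0.5],
    cy3=[300, 100, 90], cy3_background=[100, 100, 100],
    cy5=[100, 120, 200], cy5_background=[100, 100, 100],
)
```

Recording failed probes as missing rather than dropping them keeps the probe-by-sample layout intact, which is what a later per-probe missingness filter would need.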
Probes were filtered to include only those with <20% missing values across samples. For downstream statistical analyses, the missing probe values were imputed using the k-nearest-neighbors imputation function in GenePattern. Importantly, no subjects or timepoints were imputed. Because multiple probes are used to measure the expression of a single gene, probe values were next collapsed to unique genes, selecting the maximum value, using GenePattern. The CHIP file used for collapsing probes to genes is available with the gene expression data (GSE134310). Gene expression data were median-centered across genes using Cluster 3.0 and visualized using Java TreeView.

Weighted Gene Co-expression Network Analyses

We used weighted gene co-expression network analysis (WGCNA), an unsupervised clustering technique that identifies groups of co-expressed genes, to identify gene modules by constructing a signed network using the default soft-thresholding power of 12. We identified a total of 58 modules and calculated the correlation of the eigengene of each module to the intrinsic subset labels generated by the machine learning classifier (Fig. S3). A module that is significantly positively correlated to an intrinsic subset represents a group of genes that is upregulated in that intrinsic subset compared to the other intrinsic subsets. We selected the top positively correlated module to each intrinsic subset for further analysis. To show that the module is significantly upregulated in the intrinsic subset, we used pairwise Wilcoxon rank-sum tests to compare the module eigengenes between the intrinsic subsets.
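The pairwise comparison described above can be sketched in Python as follows. This is a standard-library illustration under stated assumptions, not the authors' code: it takes one per-sample module summary value (a module eigengene in WGCNA's terminology) for each sample, grouped by intrinsic subset, and applies a two-sided Wilcoxon rank-sum test with a normal approximation and no tie correction. The function names and example values are hypothetical, and the software the authors used for this step is not stated here.

```python
import math
from itertools import combinations


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no tie correction).

    Returns (z, p), where z < 0 means values in x tend to rank below those in y.
    """
    pooled = sorted((value, index) for index, value in enumerate(list(x) + list(y)))
    ranks = [0.0] * len(pooled)
    start = 0
    while start < len(pooled):
        end = start
        # Extend the run over tied values so they share an average rank.
        while end + 1 < len(pooled) and pooled[end + 1][0] == pooled[start][0]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[pooled[k][1]] = mean_rank
        start = end + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(ranks[:n1])
    expected = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (rank_sum_x - expected) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p


def pairwise_rank_sum(values_by_subset):
    """Run the rank-sum test for every pair of subsets (keys sorted by name)."""
    return {
        (a, b): rank_sum_test(values_by_subset[a], values_by_subset[b])
        for a, b in combinations(sorted(values_by_subset), 2)
    }


# Hypothetical eigengene values for one module (not taken from the study data).
example = {
    "inflammatory": [0.30, 0.25, 0.40, 0.35, 0.28],
    "fibroproliferative": [-0.10, 0.05, -0.20, 0.00, -0.05],
    "normal-like": [-0.15, -0.02, 0.02, -0.12, -0.08],
}
results = pairwise_rank_sum(example)
```

With three subsets this yields three comparisons per module, so in practice the resulting p-values would typically be adjusted for multiple testing; whether and how that was done in the study is not stated in this section.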
For each module associated with an intrinsic subset, we used the list of genes for that module and performed gene ontology analyses (g:Profiler) to identify the molecular processes upregulated in that intrinsic subset. GO terms with p<0.05 corrected for multiple testing via the default g:SCS method were treated as significant and reported in Figure 1.

Supplemental Figures

Supplemental Figure S1. PCA of all gene expression data colored by collection site. Principal components were calculated using all gene expression data. PC1 captured 26.94% of the variation in the data and PC2 captured 9.63%. Colors are intermixed, indicating that there is no strong collection site bias in the gene expression data. Each point on the graph represents a single gene expression sample and is colored according to collection site. Some patients may have received treatment at multiple sites, but we did not have access to this information.

Supplemental Figure S2. Differential Gene Expression Analysis between Intrinsic Subsets. Here, we performed a differential gene expression analysis using an unpaired t-test in SAM (Statistical Analysis of Microarrays). We identified genes differentially expressed at FDR 0% between healthy controls and each of the intrinsic subsets: fibroproliferative (A), inflammatory (B), and normal-like (C).
There were 814 up-regulated genes and 2898 down-regulated genes in the fibroproliferative subset compared to controls. There were 884 up-regulated genes and 1278 down-regulated genes in the inflammatory subset compared to controls. There were 33 up-regulated genes and 329 down-regulated genes in the normal-like subset compared to controls. The descriptions on the right-hand side of the heatmap correspond to significant biological processes identified by g:Profiler (GO terms with p<0.05 corrected for multiple testing via the default g:SCS method). The full gene lists are included as Supplemental Table S7.

Supplemental Figure S3. Gene modules identified with WGCNA from SCOT baseline samples. Heatmap displays the Spearman correlation of module eigengenes with intrinsic subsets.

Supplemental Figure S4. Event-free survival (EFS) by intrinsic subset and treatment arm. Comparison between the intrinsic subsets in terms of EFS within transplantation (A) and cyclophosphamide (B) treatment arms.

Supplemental Figure S5. General gene expression trends in SCOT treatment arms.
Principal Component Analysis was performed on all genes between the paired samples from patients at baseline and 48/54 months in cyclophosphamide (A) and transplantation (B) arms, respectively. Each point is a sample, each color represents a timepoint (pre/post-treatment), and ellipses represent 95% confidence intervals. There is no difference between pre- and post-treatment in the cyclophosphamide arm. However,