bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Immuno-genomic PanCancer Landscape Reveals Diverse Immune Escape Mechanisms and Immuno-Editing Histories

Shinichi Mizuno1*, Rui Yamaguchi2*, Takanori Hasegawa3*, Shuto Hayashi2*, Masashi Fujita4*, Fan Zhang5*, Youngil Koh6, Su-Yeon Lee7, Sung-Soo Yoon6, Eigo Shimizu2, Mitsuhiro Komura2, Akihiro Fujimoto4, Momoko Nagai8, Mamoru Kato8, Han Liang9, Satoru Miyano2,3, Zemin Zhang5,10, Hidewaki Nakagawa4,10, and Seiya Imoto3,10, on behalf of the PCAWG Mitochondrial Genome and Immunogenomics Working Group and The PCAWG Network

1Center for Advanced Medical Innovation, Kyushu University, Fukuoka, Japan 2Human Genome Center, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan 3Health Intelligence Center, The Institute of Medical Science, The University of Tokyo, Tokyo, Japan 4Laboratory for Genome Sequencing Analysis, RIKEN Center for Integrative Medical Sciences, Tokyo, Japan 5BIOPIC and College of Life Sciences, Academy for Advanced Interdisciplinary Studies, Beijing Advanced Innovation Centre for Genomics, Peking University, Beijing, China 6Department of Internal Medicine, Seoul National University Hospital, Seoul, Korea 7Samsung SDS, Seoul, Korea 8National Cancer Research Center, Tokyo, Japan 9Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA 10Correspondence to: Seiya Imoto, Hidewaki Nakagawa, and Zemin Zhang *Equally contributed

Abstract Immune reactions in the tumor micro-environment are one of the cancer hallmarks and emerging immune therapies have been proven effective in many types of cancer. To investigate cancer genome- immune interactions and the role of immuno-editing or immune escape mechanisms in cancer development, we analyzed 2,834 whole genomes and RNA-seq datasets across 31 distinct tumor types from the PanCancer Analysis of Whole Genomes (PCAWG) project with respect to key immuno- genomic aspects. We show that selective copy number changes in immune-related could contribute to immune escape. Furthermore, we developed an index of the immuno-editing history of each tumor sample based on the information of mutations in exonic regions and . Our immuno-genomic analyses of pan-cancer analyses have the potential to identify a subset of tumors with immunogenicity and diverse background or intrinsic pathways associated with their immune status and immuno-editing history.

bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Genomic instability and inflammation or immune responses in the tumor microenvironment are major underlying hallmarks of cancer (Hanahan and Weinberg, 2011), and the interaction between the cancer genome and immune reactions could have important implications for the early and late phases of cancer development. The immune system is a large source of genetic diversity in humans and tumors (Lefranc et al., 1999). Human leukocyte antigen (HLA), the vast number of unique T- and B- cell receptor genes, and somatic alterations in tumor cell genomes enable the differentiation between self and non-self (tumor) via neoantigen (NAG) presentation, which contributes to positive or negative immune reactions related to cancer (Linnemann et al., 2015; Tran et al., 2014; Kreiter et al., 2015; Yarchoan et al., 2017). A variety of immune cells can infiltrate tumor tissues and suppress or promote tumor growth and expansion after the initial oncogenic process (Grivennikov et al., 2010). Such cancer immuno-editing processes (Schreiber et al., 2011) sculpt the tumor genome via the detection and elimination of tumor cells in the early phase and are also related to the phenotype and biology of developed cancer. However, it is not clear how the immune microenvironment helps tumor cells with or without genetic alterations of immune molecules escape immuno-editing, and methods to observe the immuno-editing history in clinical human tumors are needed.

Emerging therapies targeting immune checkpoint or immune-escape molecules are effective against several types of advanced cancer (Sharma et al., 2011; Pardoll, 2012; Mahoney et al., 2015; Zaretsky et al., 2016; Anagnostou et al., 2017); however, most cancers are still resistant to these immunotherapies. Even after successful treatment, tumors often acquire resistance via another immune escape mechanism or by acquiring genomic mutations in instinct signaling pathways, such as the IFN gamma pathway or MHC (HLA) presentation pathway, related to NAG (Gao et al., 2016; Shin et al., 2017). Tumor aneuploidy is also correlated with immune escape and the response to immunotherapy (Dovoli et al., 2017); hence, a comprehensively understand cancer immunology and its diversity by whole genome analyses is necessary. We here analyzed the whole genome sequencing (WGS) of 2,834 donors and RNA-seq data from PanCancer Analysis of Whole Genomes (PCAWG) project (Campbell et al., PCAWG marker paper, and Yung et al., PCWG Tech paper) with respect to key immunogenomic aspects using computational approaches (Hackl et al., 2016). Our results demonstrate that diverse genomic alterations in specific tumor types, variation in immune microenvironments, and variation in oncogenic pathways are related to immune escape, and we further observed immune editing during cancer development. To illustrate the history of immuno- editing history for each cancer genome and to explore underlying molecular pathways involved, we defined immuno-editing indexes (IEIs) by comparing exonic NAGs to virtual NAGs in pseudogenes.

Somatic alterations in immune-related genes may contribute to cancer development and progression or immune escape in certain solid tumors and hematopoietic tumors. To investigate the extent of such genomic alterations, we compiled a list of 267 immune-related genes (Supplementary Table 1) that could be assigned to four categories: the immune escape pathway, antigen presentation pathways for HLA class I and HLA class II, and the signaling and apoptotic pathways, including genes involved in the IFN gamma pathway. An analysis of PCAWG consensus variant calls (Yung et al., PCWG Tech paper) demonstrated that most tumor samples have at least one somatic alteration in these immune-related genes (Figure 1a). Although copy number alterations (CNAs) were the most frequently detected type of somatic alteration, many point mutations and structural variants (SVs) were also detected in the immune-related genes. We first examined the relationship between the somatic mutations in immune-related genes and the total number of somatic mutations, especially for nonsynonymous mutations. Consistent with previous observations (Marty et al., 2017), we observed that tumors with B2M mutations carried more nonsynonymous mutations than tumors without B2M mutations (p = 2.6E-12). Furthermore, this trend was more profound for samples with either B2M

bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

mutations or HLA mutations (p = 1.6E-21) (Figure 1b). We also analyzed the number of nonsynonymous mutations in tumor samples showing reduced copy numbers of genes in the antigen presentation pathway, i.e., HLA-A, HLA-B, HLA-C, B2M, TAP1, TAP2, and TAPBP. Tumors with copy number losses also had more nonsynonymous mutations (p = 0.0162), further supporting the notion that the loss of functions of genes involved in antigen presentation genes leads to an increase in nonsynonymous mutations, presumably due to compromised immune pressure.

We also investigated SVs in immune-related genes. Although SVs are relatively rare compared to CNAs, they may have a large impact on the expression and function of affected genes, as exemplified by a recent report of the 3′-untranslated region of CD274/PD-L1 (Kataoka et al., 2016). For each immune-related , we compared mRNA expression levels between SV-positive and SV-negative cases. For ten immune-related genes (CD274/PD-L1, PDCD1LG2/PD-L2, MARCH9, IL22, SEC61G, CCND1, CCT2, INHBC, AKT3, and SOCS7), we detected a statistically significant association between the occurrence of SVs and the upregulation of expression (q-value < 0.05; Figure 1c and Supplementary Figure 1). PDCD1LG2/PD-L2 can interact with PD-1 and PD-L1, resulting in inhibitory signals that modulate the magnitude of T-cell responses (Latchman et al., 2001; Rozali et al., 2012). MARCH9, an E3 ubiquitin ligase, downregulates MHC class II molecules in the plasma membrane (Janke et al., 2012), and SEC61G is involved in the translocation of HLA class I to the endoplasmic reticulum for clearance (Albring et al., 2004). These findings indicate that SVs could affect HLA complexes and their expression or activity/clearance as well as immune checkpoint molecules, which may facilitate the immune escape of tumor cells.

As shown in Figure 1a, CNAs are the most frequently observed alterations in immune-related genes. Cancers harboring many CNAs tended to show less immune involvement and worse responses to immunotherapies (Davoli et al., 2017), and this can potentially be explained by CNAs in immune- related genes. We next compared the copy numbers of immune-related genes with the ploidy levels of tumors to differentiate between selective increases in copy number or changes in ploidy or averaged changes of . We first focused on -10 (IL10), an immune suppressor gene (Itakura et al., 2011). IL10 expresses not only immune cells, but also tumors; the functions of IL10 produced from tumor cells were mainly reported in melanoma (Wiguna and Walden, 2015). We then examined the differences between copy number of IL10 and ploidy level for each donor of multiple tumor types (Figure 1d). In Liver-HCC, Breast-AdenoCA, Skin-Melanoma, and Lung-AdenoCA samples, the IL10 copy number was specifically increased, rather than the ploidy level, in almost all tumors. Since IL10 functions as a repressor of immune cells, the amplification or gain of IL10 is possibly related, in part, to the immune escape mechanism. However, in Kidney-ChRCC, no significant selective amplification was observed.

We analyzed other immune-related genes and tumor types, including MSI (microsatellite instability)-positive tumors (Fujimoto, PCAWG-7, et al., in preparation) with strong immunogenicity (Le et al., 2015) due to high numbers of NAGs. For each immune-related gene, we used t-tests to evaluate whether the copy number differences from the ploidy level are significant or not in each tumor type. The results are summarized as a landscape of selective copy number changes in Figure 1e (showing the mean copy number changes against the ploidy value) and Supplementary Figure 2 (showing the statistical significance of selective copy number changes). TGFB2 and IL10 are located on 1q and both function as suppressors of immune cells (Wiguna and Walden, 2015; Yang et al., 2015), and the selective copy number gains for these immune genes are likely to be related to tumor-immune system interactions (Figure 1e). Recently, a molecule inhibiting TGFB2 and PD-L1 simultaneously is reported its efficacy (Lan et al., 2018; Strauss et al., 2018), it might be

bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

important to know an immune escape mechanism related to TGFB2. SEC61G and MARCH9, both of which exhibited significant overexpression related to SVs (Figure 1c), showed different patterns from those of TGFB2 and IL10. MARCH9 showed statistical significance in some tumor types; considering the mean value of the differences in each tumor type, selective copy number gain was detected in CNS-GBM and Bone-Leiomyo. Additionally, SEC61G was selectively amplified in CNS-GBM and Head-SCC. Interestingly, donors with SV-related overexpression and donors with selective copy number gains were highly correlated; however, selective copy number gain could only partially explain the overexpression of these genes for the donors without SVs (Supplementary Figure 3).

In Skin-Melanoma, the copy numbers of genes on chromosome 6, including HLAs, were significantly greater than the ploidy level (p = 2.26E-10 for HLA-A), which could paradoxically enforce immune pressure. However, the copy number of IL10 was also significantly (p = 8.1E-10) and selectively increased, potentially contributing to escape from immune pressure. By contrast, in Kidney-ChRCC and Panc-Endocrine samples, the copy numbers of HLAs compared with the ploidy level show the opposite tendency, and IL10 follows this. Since HLAs are not selectively increased, the copy number gain for IL10 may be unnecessary for immune escape. In Lymph-NOS and Myeloid- MDS, copy numbers of almost all immune genes were consistent with ploidy and were not selectively changed (minimum p = 0.498 and 0.184 for Lymph-NOS and Myeloid-MDS, respectively). MSI tumors showed weak selective copy number increases for genes in the cluster including IL10 (p = 0.000644); however, significant results were not obtained for other immune-related genes. In these tumor types, there may exist different immune escape systems, other than the selective copy number gain of these immune genes.

We identified two possible explanations for copy number gains, i.e., they occurred during the process of ploidy formation or they occurred by a selective process during tumorigenesis. We analyzed the differences between copy number and ploidy and, interestingly, found that genomic regions containing genes that function as suppressors of the immune system, such as TGFB2 and IL10, are selectively increased in many types of tumors. Copy number gains of these immune-related genes could arise and be selected during the establishment of immune escape. Therefore, selective copy number gains may be involved in the history of immune escape. Since TGFB2 and IL10 could play important roles in immune escape based on their function, our findings indicate that selective copy number gain is a remarkable system in the mechanism of immune escape. However, no selective copy number gains were observed in immune checkpoint genes, i.e., PD-L1 and PD-L2, which function as part of the immune escape mechanism, further supporting the diversity of immune escape mechanisms.

During tumorigenesis, mutant peptides derived from nonsynonymous somatic mutations are presented by HLA molecules to T cells (Figure 2a) (Robbins et al., 2013; Carreno et al., 2015). Although these NAGs serve to eliminate tumor cells, some cells escape this immune surveillance and eventually contribute to the formation of clinical tumors (Figure 2b) (Burnet, 1970; Dunn et al., 2002). To estimate the strength of immune surveillance or immune pressure experienced by tumor cells in each sample, we developed a novel approach to measure the strength of immune pressure using pseudogenes as an internal control (Figure 2a) (see Methods). First, we identified predicted NAGs from somatic substitutions in exonic regions of whole genome sequences and compared them to those similarly derived from pseudogenes (Supplementary Figure 4). In this process, we used the HLA types (class I and II, shown in Supplementary Figure 5) determined by our new pipeline, referred to as ALPHLARD (see Methods). The accumulation of somatic mutations in exonic regions versus virtual somatic mutations in pseudogenes during tumorigenesis is schematically represented in

bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Figure 2b. If tumor cells grew under strong immune pressure, the difference between predicted NAGs in exonic and regions would be large. This difference is expected to be small if tumor cells immediately escape from immune pressure in the carcinogenic process (Figure 2c). We defined the immuno-editing index (IEI) according to this concept (see Methods). The virtual

neoantigen ratio RP for mutations in pseudogene regions and the neoantigen ratio RE for exonic regions can be plotted (Figure 2d) to determine the immune pressure for each tumor sample. IEI is

defined as the log-ratio of RP to RE. We used IEI to characterize the histories of different donors, including immuno-edited and immuno-editing-resistant tumors.

In subsequent analyses, we investigated the history of immune pressures for multiple tumor types, as revealed by IEI. The distributions of immune pressure for four cancers are shown in Figure 2e. The percentage of IEI-positive samples, i.e., immune-editing-resistant tumors, in each tumor is shown in Figure 2f. MSI-positive tumors show immuno-edited tumor characteristics, suggesting that MSI- positive tumors continuously fought against immune pressure and eventually escaped. Bladder-TCC, Stomach-AdenoCA, Lymph-BNHL and Head-SCC samples showed immuno-editing-resistant tendencies, indicating that mutations generating NAGs were removed by negative selection during tumorigenesis.

We compared the IEI values with the ploidies using pan-cancer data and observed a significant negative correlation (Pearson's correlation coefficient, r = -0.13, p = 0.0051) (Figure 2g). Among the tumor types, the strongest correlation was observed in Lung-AdenoCA (r = -0.66, p = 0.00028), and multiple tumor types, including ColoRect-AdenoCA, Eso-AdenoCA, and Skin-Melanoma, showed weak negative correlations, although these were not statistically significant. The negative correlation between IEI and ploidy can likely be attributed to the scenario in which a copy number gain leads to high expression of NAGs and thus high immune pressure.

We next examined the immune characteristics or signatures related to the difference in immune escape histories (as determined by IEI). Differentially expressed genes between IEI-positive and - negative tumors were analyzed to find acquired phenotypes or micro-environmental characteristics that promote tumor cell escape from immune pressure. Using four signatures related to immune characteristics, i.e., HLA class I, cytotoxic, immune checkpoint, and cell component, we divided samples into two groups, referred to as hot (inflamed) and cold (non-inflamed) tumors, and we further used gene set enrichment analyses (GSEA) to elucidate the pathways associated with IEI stratification (Supplementary Figure 6). The genes related to each of above four signatures are listed in Supplementary Table 2. Interestingly, we found distinct patterns of gene set enrichment in the high- expression and low-expression groups (Supplementary Figure 7). In the high-expression groups, multiple gene sets, e.g., gamma response and inflammatory response genes, were commonly enriched in most tumor types (Supplementary Figure 7). In contrast, in the low- expression groups, most gene sets were differentially enriched in a tumor-specific manner. Thus, immune escape pathways preventing immune-cell infiltration are diverse and specific to each tumor type.

Furthermore, we specifically examined the degree of enrichment of four specific gene sets with respect to IEI (for gene sets of response, EMT (epithelial to mesenchymal transition) (Terry et al., 2017), TGF beta signaling (Yang et al., 2015), and WNT/β-catenin signaling (Spranger et al., 2014; Pai et al., 2017)) (Figure 3a). The interferon gamma response gene set was enriched in inflamed tumors of all tumor types, as expected, as the expression levels of these genes are higher in inflamed tumors than in non-inflamed tumors. Using IEI, this trend was maintained in

bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Head-SCC, Lung-AdenoCA, Lung-SCC, and Lymph-BNHL samples; in these tumor types, those genes are more highly expressed in the IEI-positive group than in the IEI-negative group. However, in Skin-Melanoma, the enrichment of this gene set was not significant, while these genes were significantly underrepresented in four tumor types (Bladder-TCC, ColoRect-AdenoCA, Stomach- AdenoCA, and Uterus-AdenoCA). This suggests that the diversity of immuno-editing histories depends on the tumor type. For the EMT gene set, the four immune signatures are also highly consistent in some types of tumors, such as Bladder-TCC, Lung-SCC, and Skin-Melanoma, and EMT may have important roles in the immune microenvironment in these tumors (Hugo et al., 2016; Chae et al., 2018). For the TGFβ signaling gene set, diverse associations with the four immune signatures were detected. WNT/β-catenin signaling was inversely related to these immune signatures and IEI in several tumor types, such as Skin-Melanoma. In Lung-AdenoCA, Lung-SCC, Lymph-BNHL, and Skin-Melanoma, the trends in IEI seemed to be consistent with the four immune signatures.

Further investigations of infiltrated immune cells are important to understand immune escape mechanisms. Based on the predicted composition of infiltrated immune cells and the expression of CD45, a pan-lymphocyte marker, we evaluated the activity of infiltrated immune cells (Supplementary Figure 8). We focused on the activity of M2 macrophages (y-axis) as immune suppressive cells and CD8+ T-cells (x-axis) as immune effector cells, and obtained a flow cytometry- like plot for each tumor type (Figure 3b). Lung-AdenoCA and Lung-SCC with selective copy number gains of TGFB2 (red circles in the upper panels of Figure 3b) showed high activity of M2 macrophages and low activity of CD8+ T-cells (statistical significance for repression of CD8+ T-cell in selective copy number gain tumors: p = 0.0291 and 0.0244 for the lung AdenoCA and SCC, respectively); in ColoRect-AdenoCA, the selective copy number gain of TGFB2 was not observed in most samples, and only a small fraction of CD8+ T-cell infiltrated tumors with a selective copy number gain of TGFB2 (p = 0.00453). By contrast, in ColoRect-AdenoCA, infiltrating CD8+ T-cells seemed to be repressed in IEI-positive tumors (lower panels of Figure 3b, p = 6.58E-4 for IEI- positive tumors’ CD8+ T-cell repression). These analyses indicated that selective copy number changes and immune escape histories (IEI) of each tumor can reflect the immune cell composition and immune microenvironment within tumors.

Finally, we performed a survival analysis of donors partitioned by IEI values and found that Lung- AdenoCA cancer donors with IEI-positive tumors (immuno-editing resistant tumor) exhibited a much worse overall survival than that of donors with IEI-negative tumors. In Lung-AdenoCA, IEI showed significant separation (p = 0.011, Figure 3c), whereas those for the other aforementioned gene set signatures were not significant. We also analyzed the relationship between selective copy number gain (IL10 and TGFB2) and overall survival and showed three examples using Liver-HCC, Lung- AdenoCA, and Cervix-SCC. For these tumor types, tumors with selective copy number gains of IL10 or TGFB2 showed worse overall survival than that of the tumors without these copy number gains (p = 0.0551 for the liver, p = 0.1 for the lung, and p = 0.0202 for the cervix).

CONCLUSION We derived immuno-genomic profiles, including somatic mutations in immune genes, HLA genotypes, NAGs, and immune micro-environmental landscapes, from pan-cancer whole genome and RNA sequence data. We observed that tumors acquired many types of immune escape mechanisms by selective copy number gains of immune-related genes, failure of the antigen presentation system, and alterations in immune checkpoint molecules in a tumor-specific manner. The history of immuno- editing, as estimated using pseudogenes as sites free of immune pressure, indicated associations between tumorigenesis and immune escape across various tumor types. Furthermore, the micro-

bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

environmental landscape related to immune characteristics revealed diverse background or intrinsic pathways controlling the non-inflamed subset of each tumor type. This provides essential information for identifying therapeutic targets. These analyses revealed the impact of the immune micro- environment on the immune resistance and/or immune escape of tumors.

ACKNOWLEDGEMENTS This study was supported in party by the Japan Agency for Medical Research and Development (AMED) Project for Cancer Research and Therapeutic Evolution (P-CREATE) (to H.N.) and a Grant- in-Aid for Scientific Research (B) (to S.I.) from the JSPS. The super-computing resource ‘SHIROKANE’ was provided by Center, The University of Tokyo (http://sc.hgc.jp/shirokane.html).

COMPETING FINANCIAL INTERESTS The authors declare no competing financial interests.

REFERENCES 1. Hanahan D, Weinberg RA. (2011) Hallmarks of cancer: The next generation. Cell 144(5), 646–674. 2. Lefranc MP, Giudicelli V, Ginestoux C, Bodmer J, Müller W, Bontrop R, Lemaitre M, Malik A, Barbié V, Chaume D. (1999) IMGT, the international ImMunoGeneTics database. Nucleic Acids Res 27, 209–212. 3. Linnemann C, van Buuren MM, Bies L, Verdegaal EM, Schotte R, Calis JJ, Behjati S, Velds A, Hilkmann H, Atmioui DE, Visser M, Stratton MR, Haanen JB, Spits H, van der Burg SH, Schumacher TN. (2015) High-throughput epitope discovery reveals frequent recognition of neo-antigens by CD4+ T cells in human melanoma. Nat Med 21, 81–85. 4. Tran E, Turcotte S, Gros A, Robbins PF, Lu YC, Dudley ME, Wunderlich JR, Somerville RP, Hogan K, Hinrichs CS, Parkhurst MR, Yang JC, Rosenberg SA. (2014) Cancer immunotherapy based on mutation-specific CD4+ T cells in a patient with epithelial cancer. Science 344, 641–645. 5. Kreiter S, Vormehr M, van de Roemer N, Diken M, Löwer M, Diekmann J, Boegel S, Schrörs B, Vascotto F, Castle JC, Tadmor AD, Schoenberger SP, Huber C, Türeci Ö, Sahin U. (2015) Mutant MHC class II epitopes drive therapeutic immune responses to cancer. Nature 520, 692–696. 6. Yarchoan M, Johnson BA 3rd, Lutz ER, Laheru DA, Jaffee EM. (2017) Targeting neoantigens to augment antitumour immunity. Nat Rev Cancer 17, 209–222. 7. Grivennikov SI, Greten FR, Karin M. (2010) Immunity, inflammation, and cancer. Cell 140(6):883–899. 8. Schreiber RD, Old LJ, Smyth MJ. (2011) Cancer immunoediting: integrating immunity's roles in cancer suppression and promotion. Science 331(6024), 1565–1570 9. Sharma P, Wagner K, Wolchok JD, Allison JP. (2011) Novel cancer immunotherapy agents with survival benefit: recent successes and next steps. Nat Rev Cancer 11, 805-812. 10. Pardoll DM. (2012) The blockade of immune checkpoints in cancer immunotherapy. Nat Rev Cancer 12, 252–264. 11. Mahoney KM, Rennert PD, Freeman GJ. (2015) Combination cancer immunotherapy and new immunomodulatory targets. Nat Rev Drug Discov, 14, 561-584. 12. Zaretsky JM, Garcia-Diaz A, Shin DS, Escuin-Ordinas H, Hugo W, Hu-Lieskovan S, Torrejon DY, Abril-Rodriguez G, Sandoval S, Barthly L, Saco J, Homet Moreno B, Mezzadra

bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

R, Chmielowski B, Ruchalski K, Shintaku IP, Sanchez PJ, Puig-Saus C, Cherry G, Seja E, Kong X, Pang J, Berent-Maoz B, Comin-Anduix B, Graeber TG, Tumeh PC, Schumacher TN, Lo RS, Ribas A. (2016) Mutations associated with acquired resistance to PD-1 blockade in melanoma. N Engl J Med, 375, 819-829. 13. Anagnostou V, Smith KN, Forde PM, Niknafs N, Bhattacharya R, White J, Zhang T, Adleff V, Phallen J, Wali N, Hruban C, Guthrie VB, Rodgers K, Naidoo J, Kang H, Sharfman W, Georgiades C, Verde F, Illei P, Li QK, Gabrielson E, Brock MV, Zahnow CA, Baylin SB, Scharpf RB, Brahmer JR, Karchin R, Pardoll DM, Velculescu VE. (2017) Evolution of neoantigen landscape during immune checkpoint blockade in non-small cell lung cancer. Cancer Discov, 7, 264-276. 14. Gao J, Shi LZ, Zhao H, Chen J, Xiong L, He Q, Chen T, Roszik J, Bernatchez C, Woodman SE, Chen PL, Hwu P, Allison JP, Futreal A, Wargo JA, Sharma P. (2016) Loss of IFN-γ pathway genes in tumor cells as a mechanism of resistance to anti-CTLA-4 therapy. Cell 167, 397–404. 15. Shin DS, Zaretsky JM, Escuin-Ordinas H, Garcia-Diaz A, Hu-Lieskovan S, Kalbasi A, Grasso CS, Hugo W, Sandoval S, Torrejon DY, Palaskas N, Rodriguez GA, Parisi G, Azhdam A, Chmielowski B, Cherry G, Seja E, Berent-Maoz B, Shintaku IP, Le DT, Pardoll DM, Diaz LA Jr, Tumeh PC, Graeber TG, Lo RS, Comin-Anduix B, Ribas A. (2017) Primary resistance to PD-1 blockade mediated by JAK1/2 mutations. Cancer Discov 7, 188–201. 16. Davoli T, Uno H, Wooten EC, Elledge SJ. (2017) Tumor aneuploidy correlates with markers of immune evasion and with reduced response to immunotherapy. Science 355(6322), eaaf8399. 17. Itakura E, Huang RR, Wen DR, Paul E, Wünsch PH, Cochran AJ. (2011) IL-10 expression by primary tumor cells correlates with melanoma progression from radial to vertical growth phase and development of metastatic competence. Mod Pathol, 24, 801-809. 18. Wiguna AP and Walden P. (2015) Role of IL-10 and TGF-β in melanoma. Exp Dermatol, 24, 209-214. 19. Yang L, Pang Y, Moses HL. (2015) TGF-beta and immune cells: an important regulatory axis in the tumor microenvironment and progression. Trends Immunol, 31, 220-227. 20. Lan Y, Zhang D, Xu C, Hance KW, Marelli B, Qi J, Yu H, Qin G, Sircar A, Hernández VM, Jenkins MH, Fontana RE, Deshpande A, Locke G, Sabzevari H, Radvanyi L, Lo KM. (2018) Enhanced preclinical antitumor activity of M7824, a bifunctional fusion simultaneously targeting PD-L1 and TGF-β. Sci Transl Med, 10, pii: eaan5488. 21. Strauss J, Heery CR, Schlom J, Madan RA, Cao L, Kang Z, Lamping E, Marté JL, Donahue RN, Grenga I, Cordes L, Christensen O, Mahnke L, Helwig C, Gulley JL. (2018) Phase I trial of M7824 (MSB0011359C), a bifunctional fusion protein targeting PD-L1 and TGFβ, in advanced solid tumors. Clin Cancer Res, 24, 1287-1295. 22. Campbell P, et al., PCAWG marker paper. in submission. 23. Yung C, et al. PCAWG Tech paper. in submission. 24. Hackl H, Charoentong P, Finotello F, Trajanoski Z. (2016) Computational genomics tools for dissecting tumour-immune cell interactions. Nat Rev Genet 17(8):441–458. 25. Marty R, Kaabinejadian S, Rossell D, Slifker MJ, van de Haar J, Engin HB, de Prisco N, Ideker T, Hildebrand WH, Font-Burgada J, Carter H. (2017) MHC-I genotype restricts the oncogenic mutational landscape. Cell 171, 1272–1283. 26. Kataoka K, Shiraishi Y, Takeda Y, Sakata S, Matsumoto M, Nagano S, Maeda T, Nagata Y, Kitanaka A, Mizuno S, Tanaka H, Chiba K, Ito S, Watatani Y, Kakiuchi N, Suzuki H, Yoshizato T, Yoshida K, Sanada M, Itonaga H, Imaizumi Y, Totoki Y, Munakata W, Nakamura H, Hama N, Shide K, Kubuki Y, Hidaka T, Kameda T, Masuda K, Minato N,

bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Kashiwase K, Izutsu K, Takaori-Kondo A, Miyazaki Y, Takahashi S, Shibata T, Kawamoto H, Akatsuka Y, Shimoda K, Takeuchi K, Seya T, Miyano S, and Ogawa S. (2016) Aberrant PD-L1 expression through 3′-UTR disruption in multiple cancers. Nature 534(7607), 402– 406. 27. Latchman Y, Wood CR, Chernova T, Chaudhary D, Borde M, Chernova I, Iwai Y, Long AJ, Brown JA, Nunes R, Greenfield EA, Bourque K, Boussiotis VA, Carter LL, Carreno BM, Malenkovich N, Nishimura H, Okazaki T, Honjo T, Sharpe AH, Freeman GJ. (2001) PD-L2 is a second ligand for PD-1 and inhibits T cell activation. Nat Immunol 2(3), 261–268. 28. Rozali EN, Hato SV, Robinson BW, Lake RA, Lesterhuis WJ. (2012) Programmed death ligand 2 in cancer-induced immune suppression. Clin Dev 2012, 656340. 29. Jahnke M, Trowsdale J, Kelly AP (2012). Structural requirements for recognition of major histocompatibility complex class II by membrane-associated ring-ch (march) protein e3 ligases. J Biol Chem 287(34), 28779–28789. 30. Albring J, Koopmann JO, Hämmerling GJ, Momburg F. (2004) Retrotranslocation of mhc class i heavy chain from the endoplasmic reticulum to the cytosol is dependent on ATP supply to the ER lumen. Mol Immunol 40(10), 733–741. 31. Le DT, Uram JN, Wang H, Bartlett BR, Kemberling H, Eyring AD, Skora AD, Luber BS, Azad NS, Laheru D, Biedrzycki B, Donehower RC, Zaheer A, Fisher GA, Crocenzi TS, Lee JJ, Duffy SM, Goldberg RM, de la Chapelle A, Koshiji M, Bhaijee F, Huebner T, Hruban RH, Wood LD, Cuka N, Pardoll DM, Papadopoulos N, Kinzler KW, Zhou S, Cornish TC, Taube JM, Anders RA, Eshleman JR, Vogelstein B, Diaz LA Jr. (2015) PD-1 blockade in tumors with mismatch-repair deficiency. N Engl J Med 372(26), 2509–2520. 32. Fujimoto A, et al., (2017) PCAWG marker paper. in preparation. 33. Robbins PF, Lu YC, El-Gamil M, Li YF, Gross C, Gartner J, Lin JC, Teer JK, Cliften P, Tycksen E, Samuels Y, Rosenberg SA. (2013) Mining exomic sequencing data to identify mutated antigens recognized by adoptively transferred tumor-reactive T cells. Nat Med 19(6), 747–752. 34. Carreno BM, Magrini V, Becker-Hapak M, Kaabinejadian S, Hundal J, Petti AA, Ly A, Lie WR, Hildebrand WH, Mardis ER, Linette GP. (2015) Cancer immunotherapy. A dendritic cell vaccine increases the breadth and diversity of melanoma neoantigen-specific T cells. Science 348(6236), 803–808. 35. Burnet FM. (1970) The concept of immunological surveillance. Prog Exp Tumor Res 13, 1– 27. 36. Dunn GP, Bruce AT, Ikeda H, Old LJ, Schreiber RD. (2002) Cancer immunoediting: from immunosurveillance to tumor escape. Nat Immunology 3(11), 991–998. 37. Terry S, Savagner P, Ortiz-Cuaran S, Mahjoubi L, Saintigny P, Thiery JP, Chouaib S. New insights into the role of EMT in tumor immune escape. Mol Oncol, 11, 824-846. 38. Pai SG, Carneiro BA, Mota JM, Costa R, Leite CA, Barroso-Sousa R, Kaplan JB, Chae YK, Giles FJ. (2017) Wnt/beta-catenin pathway: modulating anticancer immune response. J Hematol Oncol, 10, 101. 39. Hugo W, Zaretsky JM, Sun L, Song C, Moreno BH, Hu-Lieskovan S, Berent-Maoz B, Pang J, Chmielowski B, Cherry G, Seja E, Lomeli S, Kong X, Kelley MC, Sosman JA, Johnson DB, Ribas A, Lo RS. (2016) Genomic and transcriptomic features of response to anti-PD-1 therapy in metastatic melanoma. Cell 165, 35–44. 40. Chae YK, Chang S, Ko T, Anker J, Agte S, Iams W, Choi WM, Lee K, Cruz M. (2018) Epithelial-mesenchymal transition (EMT) signature is inversely associated with T-cell infiltration in non-small cell lung cancer (NSCLC). Sci Rep 8, 2918.

bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

41. Spranger S, Bao R, Gajewski TF. (2015) Melanoma-intrinsic β-catenin signalling prevents anti-tumour immunity. Nature 523, 231–235. 42. Newman AM, Liu CL, Green MR, Gentles AJ, Feng W, Xu Y, Hoang CD, Diehn M, Alizadeh AA. (2015) Robust enumeration of cell subsets from tissue expression profiles. Nat Methods 12(5), 453–457.

bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Figure Legends

Figure 1: Mutation landscape of immune-related genes. (a) Frequency and types of somatic mutations in immune genes. Single nucleotide variants (SNVs), insertions and deletions (indels), structural variants (SVs), and copy number alterations (CNAs) were examined in immune genes and donors for multiple types of tumors, where ‘truncation’ represents ‘stop gain SNV’ and ‘frameshift indel,’ and ‘in-frame’ means ‘nonsynonymous SNV’ and ‘in frame indel.’ (b) Comparison of the numbers of nonsynonymous mutations in tumors with or without somatic mutations in the antigen presentation complex (HLA and B2M). (c) Overexpression of immune-related genes and its association with SVs in each tumor type. Red and blue dots represent tumor samples with and without SVs, respectively. (d) Copy number of IL10 offset by tumor ploidy. Tumor samples are colored red and blue to indicate whether the copy number is above or below the ploidy level, respectively. (e) Selective copy number changes of immune-related genes in each tumor type. Red and blue represent an excess or deficiency in the gene copy number compared to the tumor ploidy level, respectively. The color of the element represents the mean value of the differences between copy number and ploidy.

Figure 2: Analysis of immuno-editing history. (a) Overview of the presentation of neoantigens generated from nonsynonymous mutations in exonic regions. Pseudogene regions are not translated and mutations that accumulate in pseudogenes are not presented by the HLA complex. (b) Relationship between accumulated mutations in exonic regions and pseudogenes in the immuno- editing history. Although CTLs (cytotoxic T-cells) eliminate tumor cells by recognizing these NAGs, some tumor cells escape this immune surveillance mechanism and eventually contribute to the formation of a clinical tumor. (c) In immuno-editing-resistant tumors, the tumor cells immediately escaped from immune pressure in the carcinogenic process, and the difference between NAGs in exonic and pseudogene regions was expected to be small. (d) Immuno-pressure plot of (virtual)

neoantigens in exonic regions and psueodogenes. The x-axis represents the virtual neoantigen ratio RP

for mutations in pseudogene regions and the y-axis shows the neoantigen ratio RE in exonic regions.

IEI (immuno-editing index) was defined as the log ratio of RP to RE and was used to characterize the immune-editing history of each donor, with immuno-edited and immuno-editing-resistant tumors. (e) Immuno-pressure plots of four cancer types. MSI-positive tumors show the most immuno-edited tumor characteristics; in other cancers, many tumors showed an immuno-editing-resistant tendency. (f) The proportion of immuno-editing-resistant tumors. (g, h) Tumor ploidy and IEI for a pan-cancer analysis (g; n = 433) and lung adenocarcinoma (h; n = 25). Each dot represents a tumor sample.

Figure 3: Immune signatures and their associations with genomic alterations and immuno- editing history. (a) GSEA in four gene sets (‘interferon gamma response,’ ‘EMT,’ TGF-beta signaling,’ and ‘WNT/β-catenin signaling’) was used to determine the degree of enrichment of the four immune signatures and IEI. The color of each pair of tumor type and gene set represents the GSEA score (Supplementary Figure 6) (b) Flow cytometry-like plots representing the estimated activity of infiltrated CD8+ T-cells (x-axis) and M2 macrophages (y-axis). The dotted red line and circle represent the mean value for each axis and a sample, respectively. (c) Kaplan–Meier curves for overall survival show a high IEI and selected copy number gains of IL10 and TGFB2.

bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Supplementary Table 1: List of analyzed immune-related genes.

Supplementary Table 2: Genes involved in four immune signatures.

Supplementary Figure 1: Ten immune-related genes with structural variant-related overexpression.

Supplementary Figure 2: Statistical significance of selective copy number changes. The color of each element represents the score of the statistical test, defined by –sign(t-statistic)*log10(p-value). The function sign(a) takes +1 if a is positive, otherwise -1.

Supplementary Figure 3: Selective copy number gain and structural variation can explain RNA overexpression.

Supplementary Figure 4: Distributions of neoantigens for each tumor type. (a) Class I and (b) class II.

Supplementary Figure 5: Distributions of determined HLA types from whole genome sequence data.

Supplementary Figure 6: Overview of GSEA based on immune signatures, using the cytotoxic signature and lung cancer samples. The samples are clustered based on the expression of genes listed in Supplementary Table 2 and the expression of genes in two groups of samples were compared using two-sided t-tests. The enrichment of gene sets defined by MSigDB was evaluated by GSEA. The score is defined in the same way as selective copy number changes, using the sign of the enrichment score and its p-value.

Supplementary Figure 7: For IEI (positive and negative) and four immune signatures for tumor immuno-types (hot and cold), GSEA results for all gene sets and tumor types are summarized.

Supplementary Figure 8: Analysis of infiltrated cells and their predicted activities. Using the results of CIBERSORT and the expression of CD45 for each sample, we estimated the activity of infiltrated immune cells, including CD8+ T-cells, CD4+ T-cells, NK-cells, M2 macrophages, B-cells, etc. An example of a scatter plot (x-axis and y-axis indicate the predicted activity of CD8+ T-cells and M2 macrophages, respectively) is shown at the bottom, where a circle represents a sample.

Supplementary Figure 9: CIBERSORT deconvolution for the comparison between microarray data and RNA-Seq data using 166 TCGA LAML-US samples. Pearson’s correlation coefficients were used to measure concordance.

bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

METHODS

Genomic alterations in immune-related genes in pan-cancer datasets Datasets of somatic point mutations, structural variants (SVs), and copy number alterations were generated as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) project. Overall, 2834 samples with whole genome data are represented in the PCAWG datasets, spanning a range of cancer types (bladder, sarcoma, breast, liver-biliary, cervix, leukemia, colorectal, lymphoma, prostate, esophagus, stomach, central nervous system, head/neck, kidney, lung, melanoma, ovary, pancreas, thyroid, and uterus). The consensus somatic SVs, CNAs, and SNVs in PCAWG samples were determined by three different data centers using different algorithms; calls made by at least two algorithms were used in downstream analyses. To determine copy number, the calls made by the Sanger group were used (Yang et al., PCAWG Tech paper)

HLA genotyping and mutations from whole genome sequences For HLA genotyping using whole genome sequencing data, a Bayesian method known as ALPHLARD was used; this method was designed to perform accurate HLA genotyping from short- read data and to predict the HLA sequences of the sample. The latter function enables the identification of somatic mutations by comparisons of the HLA sequences of the tumor sample with those of the matched-normal sample. The statistical formulation for the posterior probability can be described as follows: �(�, �, �|�) ∝ �(�|�, �)�(�)�(�, �),

where R = (R1, R2) is the pair of HLA types (reference sequences), S = (S1,S2) is the pair of HLA

sequences of the samples, X = (x1, x2,...) is the set of sequence reads, and I = (I1, I2,...) is the set of

variables taking 1 or 2 (the jth element, Ij indicates the jth read xj is generated from). On the right- hand side of the above equation, the left term indicates the likelihood of the sequence reads when the HLA sequences and the reference sequences are fixed. The middle and the right terms are the priors. The parameters, HLA sequences, and HLA types, were determined using the MCMC procedure with parallel tempering.

Immune signatures from RNA-seq data To investigate the microenvironment related to the immune characteristics of tumors, the following immune-related signatures were prepared (Supplementary Table 2): · Cytotoxic (Rooney et al., 2015) · Immune checkpoints (Mahoney et al., 2015; Smyth et al., 2015) · HLA pathway class I (Neefjes et al., 2011) · Cell component Using a signature, two subsets of samples were defined, a subgroup of samples with immune characteristics indicating the focused signature, and a subgroup lacking these characteristics. By comparing RNA expression levels in these subgroups, enriched gene sets or pathways were identified as related microenvironments.

Immuno-signature-based GSEA For each cancer type, the samples were divided into two groups based on gene expression patterns of an immuno-signature set, e.g., cytotoxic signature set, and a GSEA was conducted for gene sets using MSigDB by comparing whole gene expression values between the two groups of samples. To obtain the two groups, hierarchical clustering was applied to the gene expression matrix for immuno- signature genes of the samples and the dendrogram for the samples was cut at the root. The group with a higher mean expression value for immuno-signature genes than that in the other group was

bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

labeled “High,” while the other was labeled “Low.” In this study, as described above, we considered four immuno-signature sets. For an immuno-signature set, the above GSEA was applied for each cancer type and the enrichment results for MSigDB gene sets were compiled into heatmaps. In a heatmap, each cell corresponds to a pair of an MSigDB gene set and cancer type, and has the value of, where the nominal p-value and is an indicator variable; if the gene set is enriched in “High” group and otherwise (Supplementary Figure 6).

Immune cell components For CIBERSORT implementation, FPKM values were used after upper-quartile normalization as input gene expression values (FPKMs are in linear space, without log-transformation) and the default LM22 was used as the signature gene matrix. Twenty-two leukocyte fractions were imputed from CIBERSORT. Originally, CIBERSORT was proposed for RNA expression data obtained by microarray. However, it has been reported that CIBERSORT can be applied to bulk tumor RNA-seq (Tuong et al., 2016; Mehnert et al., 2016) and single-cell RNA-seq (Baron et al., 2016). The correlation between results obtained using microarray data and RNA-seq data from 166 LAML-US tumors was independently evaluated; the observed correlation coefficient was 0.93, which was significantly high. Therefore, CIBERSORT was applied to RNA-seq data (Supplementary Figure 9).

Neo-antigen prediction From PCAWG preliminary consensus files, 2,786 annotated .tsv files were generated using ANNOVAR and exclusion samples were removed according to release_may2016.v1.3.tsv. Next, focusing on nonsynonymous mutations in exonic regions, the corresponding mutant/wild-type peptides of length 8–11-mer including an amino-acid substitution were constructed using the UCSC RefSeq mRNA and refFlat data (http://hgdownload.soe.ucsc.edu/downloads.html). Next, binding

affinities (IC50) of all generated peptides were predicted using netMHCpan3.0 (Nielsen and Andreatta, 2016) for HLA class I and netMHCIIpan3.1 (Andreatta et al., 2015) for HLA class II. Finally,

neoantigens were counted for each patient by considering that mutant peptides with IC50 values of less than 500 as neoantigens. Here, neoantigens were counted as the number of mutations that can generate neoantigens; thus, each mutation was counted once, even if it generated more than one neoantigen for one or more HLAs. Note that mutations in which annotated information was not consistent with UCSC RefSeq mRNA and refFlat data were skipped as database mismatches. The ratio of the number of non-skipped nonsynonymous mutations to the number of all observed nonsynonymous mutations was defined as the concordance rate. Although this value was nearly 1 in all cases (greater than 0.99, on average), it was used as a tuning parameter, as described below.

Immuno-editing index To evaluate the sample-specific immuno-editing history, an immuno-editing index (IEI) describing the degree of accumulated immune suppression was established. IEI compares the ratio of the number of neoantingens to the number of nonsynonymous mutations in exonic regions and in the control regions, which are not affected by immune pressure. Pseudogene regions were used as internal controls for a tumor and only pseudogene mutations whose genomic positions were downstream of the stop codon were extracted according to PseudoPipe v.74 (http://www.pseudogene.org/pseudopipe/). In this concept, the following assumptions were made: (i) nonsynonymous mutations in exonic regions can be suppressed by immune pressure if their mutant peptides can bind to HLAs and (ii) synonymous mutations in exonic regions and nonsynonymous/synonymous mutations in pseudogene regions are not affected by immune pressure. Under these assumptions, the number of nonsynonymous mutations in exonic regions can be lower

bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

than the number of ideal nonsynonymous mutations in exonic regions, indicating the hypothetical number of nonsynonymous mutations under non-immune pressure. Several quantities were defined as follows: ・ Number of nonsynonymous mutations used to evaluate neoantigens (not skipped by database mismatch) in exonic regions = #nonsynE ・ Number of synonymous mutations in exonic regions = #synE ・ Number of predicted neoantigens in exonic regions = #NagE ・ Number of nonsynonymous mutations used to evaluate neoantigens (not skipped by database mismatch) in pseudogene regions = #nonsynP ・ Number of synonymous mutations in pseudogene regions = #synP ・ Number of predicted neoantigens in pseudogene regions = #NagP

・ Concordance rate of mutation annotations in exonic regions = �

・ Concordance rate of mutation annotations in pseudogene regions = � The number of nonsynonymous mutations in exonic region was adjusted to obtain the number of ideal nonsynonymous mutations (#InonsynE) using the above quantities as follows: � #������� #�������� = × #���� × . � #���� Here, #InonsynE was set to #NagE if #InonsynE was less than #NagE. IEI was calculated as the modified log ratio in terms of the numbers of neoantingens and nonsynonymous mutations, and is equal to the sum of the numbers of neoantingens and non- neoantingens between exonic and pseudogene regions as follows: #���� + � #�������� + � ��� = log , #���� + � #������� + � where C is a regularized constant, set to 0.5 for the analysis.

Pseudogene selection PseudoPipe (build 74) (Zhang et al., 2006) was used as a pseudogene database for the following analysis, which includes the region and the parental gene of each pseudogene, among other information. First, pseudogene mutations in each sample were extracted from the VCF file based on pseudogene regions described in PseudoPipe. Next, each pseudogene in PseudoPipe was aligned to the parental gene using Clustal Omega (version 1.2.1) (Sievers et al., 2011) with default settings. Each pseudogene mutation was converted to a parental gene mutation located at the same position as that of the pseudogene mutation in the alignment. Note that pseudogene mutations were excluded in the following neo-antigen analysis if the position corresponded to an intron of the parental gene or if the bases differed at the position in the alignment of the pseudogene and the parental gene. Thus, except for the above cases, pseudogene mutations were treated as if they were exonic mutations. An immuno-editing history analysis was applied to the converted mutations and the results were used as an internal control. Mutations in pseudogene regions were used directly, without information for parental genes. However, the amino acid composition in pseudogene regions with parental genes is considered similar to that in exonic regions. Additionally, in pseudogene regions, many stop codons are present and a method was determined to handle these. Therefore, pseudogene regions with parental genes were used as a suitable internal control to evaluate the strength of immune pressure.

bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

REFERENCES in METHOD SECTION 1. Rooney MS, Shukla SA, Wu CJ, Getz G, Hacohen N. (2015) Molecular and genetic properties of tumors associated with local immune cytolytic activity. Cell, 160, 48-61. 2. Smyth MJ, Ngiow SF, Ribas A, Teng MW. (2015) Combination cancer immunotherapies tailored to the tumour microenvironment. Nat Rev Clin Oncol, 13, 143-158. 3. Neefjes J, Jongsma ML, Paul P, Bakke O. (2011) Towards a systems understanding of MHC class I and MHC class II antigen presentation. Nat Rev Immunol, 11, 823-836. 4. Tuong ZK, Fitzsimmons R, Wang SM, Oh TG, Lau P, Steyn F, Thomas G, Muscat GE. (2016) Transgenic adipose-specific expression of the nuclear receptor RORα drives a striking shift in fat distribution and impairs glycemic control. EbioMedicine, 11, 101–117. 5. Mehnert JM, Panda A, Zhong H, Hirshfield K, Damare S, Lane K, Sokol L, Stein MN, Rodriguez-Rodriquez L, Kaufman HL, Ali S, Ross JS, Pavlick DC, Bhanot G, White EP, DiPaola RS, Lovell A, Cheng J, Ganesan S. (2016) Immune activation and response to pembrolizumab in POLE-mutant endometrial cancer. J Clin Invest, 126, 2334–23140. 6. B Baron M, Veres A, Wolock SL, Faust AL, Gaujoux R, Vetere A, Ryu JH, Wagner BK, Shen- Orr SS, Klein AM, Melton DA, Yanai I. (2016) A single-cell transcriptomic map of the human and mouse pancreas reveals inter- and intra-cell population structure, Cell Syst, 3, 346–360.e4. 7. Nielsen M, Andreatta M. (2015) NetMHCpan-3.0; improved prediction of binding to MHC class I molecules integrating information from multiple receptor and peptide length datasets. Genome Med, 8, 33. 8. Andreatta M, Karosiene E, Rasmussen M, Stryhn A, Buus S, and Nielsen M. (2015) Accurate pan-specific prediction of peptide-MHC class II binding affinity with improved binding core identification. Immunogenetics, 67, 641-650. 9. Zhang Z, Carriero N, Zheng D, Karro J, Harrison PM, Gerstein M. (2006) PseudoPipe: an automated pseudogene identification pipeline. Bioinformatics, 22, 1437-1439. 10. Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, Söding J, Thompson JD, Higgins DG. (2011) Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol, 7, 539.

Figure 1 a b 100

p = 6.8E-11 HLA class I p = 2.6E-12 B2M p = 1.6E-21 HLA or B2M non mutated non mutated non mutated 50 mutated mutated mutated genes (%) Cases with

mutated immune mutated 0 1000 1000 1000 100 sv in−frame 50 truncation Mutation

types (%) cna.loss cna.gain 0 # nonsynonymous SNVs # nonsynonymous SNVs # nonsynonymous SNVs # nonsynonymous 10 10 10 y−RCC er−HCC vix−SCC y−ChRCC eloid−AML ymph−CLL Liv eloid−MPN Lung−SCC CNS−GBM CNS−Oligo Head−SCC Bone−Epith y−AdenoCA y−AdenoCA y−AdenoCA L

bioRxiv preprint doi:Cer https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was ymph−BNHL Kidne us−AdenoCA My Bladder−TCC CNS−Medullo L My anc−AdenoCA Th Eso−AdenoCA Bone−Leio myo CNS−PiloAstro not certified by peer review) is the author/funder.anc−Endoc rine All rights reserved. No reuse allowed without permission. Kidne Lung−AdenoCA P Prost−AdenoCA Skin−Melanoma P Bone−Osteosarc Ovar Biliar Breast−AdenoCA Uter Without With Without With Without With Stomach−AdenoCA ColoRect−AdenoCA HLA mutation HLA mutation B2M mutation B2M mutation HLA mutation HLA mutation and B2M mutation or B2M mutation c CD274/PD-L1 PDCD1LG2/PD-L2 MARCH9 SEC61G 200 200 1000 20000

100 100 500 10000 FPKM FPKM FPKM FPKM

0 0 0 0 Skin Skin Skin Skin CNS CNS CNS CNS Liver Liver Liver Liver Lung Lung Lung Lung Ovary Ovary Ovary Ovary Biliary Biliary Biliary Biliary Cervix Cervix Cervix Cervix Breast Breast Breast Breast Uterus Uterus Uterus Uterus Kidney Kidney Kidney Kidney Thyroid Thyroid Thyroid Thyroid Myeloid Myeloid Myeloid Myeloid Bladder Bladder Bladder Bladder Prostate Prostate Prostate Prostate Stomach Stomach Stomach Stomach Pancreas Pancreas Pancreas Pancreas Lymphoid Lymphoid Lymphoid Lymphoid Head/Neck Head/Neck Head/Neck Head/Neck Esophagus Esophagus Esophagus Esophagus Colon/Rectum Colon/Rectum Colon/Rectum Colon/Rectum Bone/SoftTissue Bone/SoftTissue Bone/SoftTissue Bone/SoftTissue SV(+) SV( ) SV(+) SV( ) SV(+) SV( ) SV(+) SV( )

d Liver−HCC; IL10 Breast−AdenoCA; IL10 Skin−Melanoma; IL10 Lung−AdenoCA; IL10 Kidney−ChRCC; IL10 6 6 6 6 6 ploidy ploidy ploidy ploidy 4 4 4 4 ploidy 4 − − − − − 2 2 2 2 2

0 0 0 0 0 Copy number Copy Copy number Copy Copy number Copy Copy number Copy −2 −2 −2 −2 number Copy −2

Tumor samples Tumor samples Tumor samples Tumor samples Tumor samples

e −1 10

CNS−GBM Bone−Leiomyo Kidney−ChRCC Panc−Endocrine CNS−Medullo Bone−Epith Kidney−RCC CNS−Oligo Lymph−NOS MSI Lymph−BNHL Prost−AdenoCA CNS−PiloAstro Lymph−CLL Myeloid−AML Bone−Cart Thy−AdenoCA Myeloid−MPN Myeloid−MDS Cervix−SCC Lung−SCC Head−SCC Bladder−TCC Lung−AdenoCA Eso−AdenoCA Breast−DCIS Bone−Osteosarc Cervix−AdenoCA Breast−LobularCA Ovary−AdenoCA Skin−Melanoma Breast−AdenoCA Uterus−AdenoCA Liver−HCC ColoRect−AdenoCA Stomach−AdenoCA Panc−AdenoCA Biliary−AdenoCA SEC61G IFNG MARCH9 TNFRSF1B CASP9 TNFRSF25 TNFRSF9 DFFA DFFB TNFRSF14 TNFRSF4 TNFRSF18 BCL10 JAK1 NRDC CSF1 NRAS VTCN1 ULBP1 RAET1G ULBP2 RAET1E RAET1L ULBP3 IFNGR1 ARG1 NT5E RFXAP TPP2 PDCD1 MARCH4 STAT1 CASP8 CASP10 CFLAR CD28 ICOS CTLA4 IFNA8 IFNA2 IFNA1 IFNA17 IFNA7 IFNB1 PTEN FAS ENTPD1 CASP7 PRF1 VSIR MARCH8 CXCL12 SEC61A2 TNFRSF10D TNFRSF10A TNFRSF10C TNFRSF10B ALOX15B ALOX12B TNFSF12 PSMB10 TRADD MLKL CCL22 NLRC5 MARCH1 TDO2 CASP6 LAP3 CASP3 CD226 BCL2 LNPEP ERAP2 ERAP1 NLN TSLP GZMK GZMA IL4 IL13 CSF2 CD74 HAVCR2 CANX B2M PDIA3 HRAS BAD CTNNB1 B3GAT1 CRTAM CADM1 BID ADORA2A LGALS1 CD276 MFGE8 FURIN CTSL SEC61B TNFSF15 ENDOG PTGS1 TRAF2 GZMM THOP1 TMIGD2 TNFSF9 CD70 TNFSF14 TGFB3 ARG2 LGMN LGALS3 GZMB GZMH RIPK3 PSME2 PSME1 IL33 PDCD1LG2 CD274 JAK2 IDO2 IDO1 IFNAR2 IFNAR1 ICOSLG CALR RFXANK JAK3 IFI30 CXCL5 CXCL8 ALB APAF1 NOS1 TUBA1A SOCS1 CIITA CASP5 CASP1 CASP4 CASP12 SCAF11 BIRC3 BIRC2 BCL6 TNFSF10 FADD MYC CD244 CD48 PTGS2 TGFB2 IL10 FASLG TNFSF4 TNFSF18 MCL1 CD160 RFX5 CTSS BIRC6 TNFRSF21 VEGFA RIPK1 SERPINB9 HLA−DRA BTNL2 HLA−DRB1 HLA−DQA1 HLA−DQB1 LTA TNF HLA−C MICB HLA−B MICA BTN1A1 BTN2A2 BTN3A1 HLA−E HLA−A HLA−G HLA−F TAPBP HLA−D PA1 HLA−DPB1 HLA−DQA2 HLA−D OA HLA−DMA HLA−DMB TAP2 HLA−DOB PSMB8 TAP1 PSMB9 PVR NECTIN2 TGFB1 CXCL17 CEACAM1 BAX KIR3DL1 KIR3DL2 KIR2DL4 KIR2DL1 BIRC8 LILRB2 KIR2DL3 KIR3DL3 LILRB1 TNFRSF12A CSF3 MPO BIRC5 PSME3 NOS2 LGALS9 CCL3 CCL2 CCL5 SEC61A1 CD80 CD86 HHLA2 CD47 TIGIT NECTIN3 CD96 CD200R1 CD200 BTLA GAPDH LAG3 TNFRSF1A TAPBPL CD27 KLRK1 KLRC2 KLRC1 KLRF1 KLRG1 KLRD1 KLRB1 CASP2 NOS3 PVRIG CYCS IL6 KRAS SIRPA CCL28 EBAG9 BCL2L1 CD40 BIRC7 TNFRSF6B Figure 2 a Exonic region Pseudogene b MutatedbioRxiv peptides preprint are doi: processed https://doi.org/10.1101/285338 No translation; this version of mutated posted March 20, 2018. The copyright holder for this preprint (which was Immuno-editting history not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. and presented on HLAs. sequence at pseudogene. Tumor cells Neoantigen Not presented HLA by HLAs Cell

Protein Abberant Accumurated CTL CTL Mutated peptide STOP STOP Protein degree of somatic mRNA AAAA mutations in STOP Exonic region mRNA AAAA Abberant Immune surveillance STOP STOP Pseudogene region Mutated peptide is presented Mutation at exonic region STOP by HLAs and the cells are Mutation NOT presented by HLAs Pseudogene eliminated by CTL. region Mutation at Mutation presented by HLAs Coding region pseudogene d c RP CTL CTL CTL tumor Immuno-edited R IEI = log E RP #Neoantigen/#Mutations at pseudogene Immuno-editing resistant tumor

Immuno-microenvironment RE #Neoantigen/#Estimated mutations at exonic region e ColoRect-AdenoCA Lung-AdenoCA Skin-Melanoma MSI on pseudogene on pseudogene on pseudogene on pseudogene #Neoantigens / #Nonsynonymous Mutations #Neoantigens / #Nonsynonymous #Neoantigens / #Nonsynonymous Mutations #Neoantigens / #Nonsynonymous Mutations #Neoantigens / #Nonsynonymous Mutations #Neoantigens / #Nonsynonymous 0.0 0.2 0.4 0.6 0.8 1.0 1.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 0.0 0.2 0.4 0.6 0.8 1.0 1.2 #Neoantigens / #Estimated Nonsynonymous Mutations #Neoantigens / #Estimated Nonsynonymous Mutations #Neoantigens / #Estimated Nonsynonymous Mutations #Neoantigens / #Estimated Nonsynonymous Mutations on exonic region on exonic region on exonic region on exonic region f g h Percentages of the IEI−positive samples

Bladder−TCC Histology abbreviation ● Bladder−TCC Stomach−AdenoCA ● ColoRect−AdenoCA Histology abbreviation 1.0 ● ● 1.0 Lymph−BNHL ● ●● Eso−AdenoCA Lung−AdenoCA ●● ● ● ● ● ● ● Head−SCC ● ● ● Head−SCC ●● ●●● ● ● ● ● ● 0.5 ●●●● ● ● Lung−AdenoCA 0.5 ● ●●●● ● ●● ● ● ● ● ● ●●●● ●● ● ●●●● ● Lung−SCC ● Skin−Melanoma ●●●●●● ●● ●●● ●●●●●● ●● ● ●●●●●●●● ●●● ●●●● ● ●● ● ● ● ●●●●●●● ● ●●●● ● ●●● ● Lymph−BNHL ● ●● ●●●●●●●●●●●● ●●●●●● ● ●● ● ●●●● ● ● ● Lung−AdenoCA 0.0 ●●●●●●●● ● ●●●●●●●●●●● ●●● ● 0.0 ● ●● ● ● ● ●●●●●●●●● ●● ●● ●●● ● ●●●●●●●●● ● ● Lymph−NOS ● ● ●●●●●●●●●●● ● ●● ●●●●●●●●●●●● ● ● ● ● ●●●●●●●●●● ● ●● ●● ●●●●●●●● ● ● ● ● Lung−SCC ●●● ●●●●● ● ● ●●●● ●●●●●● ● ● ● Skin−Melanoma ● ●●●●●● ●●● ●●● ● ● ● ●●●● ● ●● ● ● ● ● ● Stomach−AdenoCA ● −0.5 ● ●●● ●●● ●●● ●●● ●● ● ● ● −0.5 ● Eso−AdenoCA ● ●● ●● ● ● ● ● ●● ● Uterus−AdenoCA ● ● ● ● ● ● ColoRect−AdenoCA ● ●●● ● ● ●●● ● ● Immuno-editing index (IEI) Immuno-editing index (IEI) −1.0 ● ● −1.0 Uterus−AdenoCA ●

MSI 2 3 4 5 2 3 4 5 0 10 20 30 40 50 Ploidy Ploidy Figure 3 a

Interferon gamma response Epithelial mesenchymal transition Score Score IEI 4 IEI 4 HLA class 1 2 HLA class 1 2 Cytotxic genes Cytotxic genes 0 0 Immune checkpoints Immune checkpoints Cell component −2 Cell component −2 −4 −4 Lung_SCC Lung_SCC Head_SCC Head_SCC ymph_BNHL ymph_BNHL Bladder_TCC Bladder_TCC L L Lung_AdenoCA Lung_AdenoCA Skin_Melanoma Skin_Melanoma Uterus_AdenoCA Uterus_AdenoCA Stomach_AdenoCA Stomach_AdenoCA ColoRect_AdenoCA ColoRect_AdenoCA

bioRxiv preprint doi:TGF https://doi.org/10.1101/285338 beta signaling ; this version posted March 20, 2018. The copyright holder for this preprint (whichWNT betawas catenin signaling not certified by peer review) is the author/funder. All rights reserved. No reuse allowedScore without permission. Score IEI 4 IEI 4 HLA class 1 2 HLA class 1 2 Cytotxic genes 0 Cytotxic genes 0 Immune checkpoints Immune checkpoints −2 −2 Cell component Cell component −4 −4 Lung_SCC Lung_SCC Head_SCC Head_SCC ymph_BNHL ymph_BNHL Bladder_TCC Bladder_TCC L L Lung_AdenoCA Lung_AdenoCA Skin_Melanoma Skin_Melanoma Uterus_AdenoCA Uterus_AdenoCA Stomach_AdenoCA Stomach_AdenoCA ColoRect_AdenoCA ColoRect_AdenoCA b TGFB2: CN − ploidy Skin−Melanoma Lung−AdenoCA Lung−SCC ColoRect−AdenoCA 4 2.5 Diff_high 3 2.0 4 3 Diff_low 1.5 2 2 2 1.0 MSI− 1 1 0.5 MSI+ log2 Macrophages:M2 log2 Macrophages:M2 log2 Macrophages:M2 log2 Macrophages:M2 0 0 0 0.0 0 2 4 0 1 2 3 0 1 2 3 4 0.0 0.5 1.0 1.5 2.0 2.5 log2 T-cells:CD8 log2 T-cells:CD8 log2 T-cells:CD8 log2 T-cells:CD8

Immuno-editing Index (IEI) Skin−Melanoma Lung−AdenoCA Lung−SCC ColoRect−AdenoCA 4 2.5

IEI_NA 3 2.0 4 3 IEI_Neg 1.5 IEI_Pos 2 2 1.0 2 1 1 0.5 MSI− log2 Macrophages:M2 log2 Macrophages:M2 log2 Macrophages:M2 log2 Macrophages:M2 MSI+ 0 0 0 0.0 0 2 4 0 1 2 3 0 1 2 3 4 0.0 0.5 1.0 1.5 2.0 2.5 log2 T-cells:CD8 log2 T-cells:CD8 log2 T-cells:CD8 log2 T-cells:CD8 c Immuno Editing Index IL10: CN − ploidy TGFB2: CN − ploidy TGFB2: CN − ploidy Lung−AdenoCA p = 0.011 Liver−HCC p = 0.0551 Lung−AdenoCA p = 0.1 Cervix−SCC p = 0.0202 1.00 1.00 1.00 1.00 IEI_pos Diff_high IEI_neg Diff_high Diff_high Diff_low Diff_low Diff_low 0.75 0.75 0.75 0.75

0.50 0.50 0.50 0.50 al Probability al Probability al Probability al Probability viv viv viv viv Su r Su r Su r 0.25 0.25 0.25 Su r 0.25

0.00 0.00 0.00 0.00 0 400 800 1200 1600 0 400 800 1200 1600 0 400 800 1200 1600 0 400 800 1200 1600 Time Time Time Time bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Supplementary Figure 1 SEC61G PDCD1LG2/PD-L2 MARCH9 CD274/PD-L1 20000 200 1000 200

10000 100 500 FPKM FPKM 100 FPKM FPKM

0 0 0 Skin CNS 0 Liver Lung Ovary Biliary Cervix Breast Uterus Kidney Skin Skin CNS CNS Liver Liver Thyroid Lung Lung Myeloid Bladder Prostate Ovary Ovary Biliary Biliary Stomach Cervix Cervix Breast Breast Uterus Uterus Kidney Kidney Skin Pancreas Lymphoid CNS Liver Thyroid Thyroid Lung Myeloid Myeloid Bladder Bladder Prostate Prostate Ovary Head/Neck Biliary Esophagus Stomach Stomach Cervix Breast Uterus Kidney Pancreas Pancreas Lymphoid Lymphoid Thyroid Myeloid Bladder Prostate Head/Neck Head/Neck Esophagus Esophagus Stomach Colon/Rectum Pancreas Lymphoid Head/Neck Esophagus Bone/SoftTissue Colon/Rectum Colon/Rectum SV(+) SV( ) Bone/SoftTissue Bone/SoftTissue Colon/Rectum SV(+) SV( ) SV(+) SV( ) Bone/SoftTissue SV(+) SV( )

SOCS7 INHBC IL22 CCT2 80 200 6 4000

40 100 3 2000 FPKM FPKM FPKM FPKM

0 0 0 0 Skin Skin Skin Skin CNS CNS CNS CNS Liver Liver Liver Liver Lung Lung Lung Lung Ovary Ovary Ovary Ovary Biliary Biliary Biliary Biliary Cervix Cervix Cervix Cervix Breast Breast Breast Breast Uterus Uterus Uterus Uterus Kidney Kidney Kidney Kidney Thyroid Thyroid Thyroid Thyroid Myeloid Myeloid Myeloid Myeloid Bladder Bladder Bladder Bladder Prostate Prostate Prostate Prostate Stomach Stomach Stomach Stomach Pancreas Pancreas Pancreas Pancreas Lymphoid Lymphoid Lymphoid Lymphoid Head/Neck Head/Neck Head/Neck Head/Neck Esophagus Esophagus Esophagus Esophagus Colon/Rectum Colon/Rectum Colon/Rectum Colon/Rectum Bone/SoftTissue Bone/SoftTissue Bone/SoftTissue Bone/SoftTissue SV(+) SV( ) SV(+) SV( ) SV(+) SV( ) SV(+) SV( )

CCND1 AKT3 2000 200

1000 100 FPKM FPKM

0 0 Skin Skin CNS CNS Liver Liver Lung Lung Ovary Ovary Biliary Biliary Cervix Cervix Breast Breast Uterus Uterus Kidney Kidney Thyroid Thyroid Myeloid Myeloid Bladder Bladder Prostate Prostate Stomach Stomach Pancreas Pancreas Lymphoid Lymphoid Head/Neck Head/Neck Esophagus Esophagus Colon/Rectum Colon/Rectum Bone/SoftTissue Bone/SoftTissue SV(+) SV( ) SV(+) SV( ) TNFRSF6B BIRC7 CD40 BCL2L1 EBAG9 CCL28

100 SIRPA KRAS IL6 CYCS PVRIG NOS3 CASP2 KLRB1 KLRD1 KLRG1 KLRF1 −10 KLRC1 KLRC2 KLRK1 CD27 TAPBPL TNFRSF1A LAG3 GAPDH BTLA CD200 CD200R1 CD96 NECTIN3 TIGIT CD47 HHLA2 CD86 CD80 SEC61A1 CCL5 CCL2 CCL3 LGALS9 NOS2 PSME3 BIRC5 MPO CSF3 TNFRSF12A LILRB1 KIR3DL3 KIR2DL3 LILRB2 BIRC8 KIR2DL1 KIR2DL4 KIR3DL2 KIR3DL1 BAX CEACAM1 CXCL17 TGFB1 NECTIN2 PVR PSMB9 TAP1 PSMB8 HLA−DOB TAP2 HLA−DMB HLA−DMA HLA−DOA HLA−DQA2 HLA−DPB1 HLA−DPA1 TAPBP HLA−F HLA−G HLA−A HLA−E BTN3A1 BTN2A2 BTN1A1 MICA HLA−B MICB HLA−C TNF LTA HLA−DQB1 HLA−DQA1 HLA−DRB1 BTNL2 HLA−DRA SERPINB9 RIPK1 VEGFA TNFRSF21 BIRC6 CTSS RFX5 CD160 MCL1 TNFSF18 TNFSF4 FASLG IL10 TGFB2 PTGS2 CD48 CD244 MYC FADD TNFSF10 BCL6 BIRC2 BIRC3 SCAF11 CASP12 CASP4 CASP1 CASP5 CIITA SOCS1 TUBA1A NOS1 APAF1 ALB CXCL8 CXCL5 IFI30 JAK3 RFXANK CALR ICOSLG IFNAR1 IFNAR2 IDO1 IDO2 JAK2 CD274 PDCD1LG2 IL33 PSME1 PSME2 RIPK3 GZMH GZMB LGALS3 LGMN ARG2 TGFB3 TNFSF14 CD70 TNFSF9 TMIGD2 THOP1 GZMM TRAF2 PTGS1 ENDOG TNFSF15 SEC61B CTSL FURIN MFGE8 CD276 LGALS1 ADORA2A BID CADM1 CRTAM B3GAT1 CTNNB1 BAD HRAS PDIA3 B2M CANX HAVCR2 CD74 CSF2 IL13 IL4 GZMA GZMK TSLP NLN ERAP1 ERAP2 LNPEP BCL2 CD226 CASP3 LAP3 CASP6 TDO2 MARCH1 NLRC5 CCL22 MLKL TRADD The copyright holder for this preprint (which was The copyright holder for PSMB10 TNFSF12 ALOX12B ALOX15B TNFRSF10B TNFRSF10C TNFRSF10A TNFRSF10D SEC61A2 CXCL12 MARCH8 VSIR PRF1 CASP7 ENTPD1 FAS PTEN IFNB1 IFNA7 IFNA17 IFNA1 IFNA2 IFNA8 CTLA4 ICOS CD28 CFLAR CASP10 CASP8 STAT1 this version posted March 20, 2018. this version posted March MARCH4 ; PDCD1 TPP2 RFXAP NT5E ARG1 IFNGR1 ULBP3 RAET1L RAET1E ULBP2 RAET1G ULBP1 VTCN1 NRAS CSF1 NRDC JAK1 BCL10 TNFRSF18 TNFRSF4 TNFRSF14 DFFB DFFA TNFRSF9 TNFRSF25 https://doi.org/10.1101/285338 CASP9 TNFRSF1B MARCH9 IFNG doi: SEC61G not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. is the author/funder. All rights reserved. not certified by peer review) t ine MSI

ularCA y−RCC er−HCC

vix−SCC y−ChRCC eloid−AML Bone−Car ymph−CLL Liv eloid−MPN eloid−MDS Lung−SCC

CNS−GBM CNS−Oligo bioRxiv preprint

Head−SCC ymph−NOS Bone−Epith

y−AdenoCA y−AdenoCA

L Supplementary Figure 2 Figure Supplementary Cer

ymph−BNHL L Kidne Breast−DCIS us−AdenoCA

My Bladder−TCC vix−AdenoCA

CNS−Medullo

L My My

anc−AdenoCA Thy−AdenoCA

Eso−AdenoCA myo Bone−Leio CNS−PiloAstro r anc−Endoc

Kidne Lung−AdenoCA P

Skin−Melanoma Prost−AdenoCA

P Bone−Osteosarc

Ovar Biliar

Cer Breast−AdenoCA

Uter Breast−Lob

Stomach−AdenoCA ColoRect−AdenoCA bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Supplementary Figure 3 MARCH9 SEC61G

● ● 800

600 10000

● ● sv sv ● FALSE ● FALSE fpkm 400 fpkm ● ● ● TRUE TRUE

● ● ●

● 5000

● ● ●

200 ●

● ● ●●

● ● ● ● ● ● ● ● ● ●●● ●● ●●●●●● ●● 0 ●●●● 0 ●●

0 10 20 30 40 0 200 400 600 copy_number - ploidy copy_number - ploidy Supplementary Figure 4 a

bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

b Supplementary Figure 5



a 













                                                                                                                                                                                                                                           



b 













                                                                                                                                                                                        



c 









bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was

 not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.



                                                                                                                                                                                        

 d  















                                    

e  









                                                                                                            

f  











                                                                                                                             



g 











                                                                                                         

h  













                                                                                                                                                                                                                                                                                         bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Supplementary Figure 6 -3G= Color Key

Color Key

Color Key Color Key -3G=M87!0I$7 Color Key h: C04_Cytotoxic_genes_20170630 !2 0 2 h: C04_Cytotoxic_genes_20170630h: C04_Cytotoxic_genes_20170630 Row Z!Score 0F56<7N

GZMA IL2_STAT5_SIGNALING IL2_STAT5_SIGNALING IL2_STAT5_SIGNALING GZMB!"#% ALLOGRAFT_REJECTION ALLOGRAFT_REJECTION ALLOGRAFT_REJECTION GZMB INFLAMMATORY_RESPONSE INFLAMMATORY_RESPONSE INFLAMMATORY_RESPONSE INTERFERON_GAMMA_RESPONSE INTERFERON_GAMMA_RESPONSE INTERFERON_GAMMA_RESPONSE GZMH Color Key !"#& IL6_JAK_STAT3_SIGNALING IL6_JAK_STAT3_SIGNALING IL6_JAK_STAT3_SIGNALING GZMH COMPLEMENT COMPLEMENT COMPLEMENT GZMK!"#' TNFA_SIGNALING_VIA_NFKB TNFA_SIGNALING_VIA_NFKB TNFA_SIGNALING_VIA_NFKB INTERFERON_ALPHA_RESPONSE INTERFERON_ALPHA_RESPONSE h: C04_Cytotoxic_genes_20170630GZMK INTERFERON_ALPHA_RESPONSE KRAS_SIGNALING_UP KRAS_SIGNALING_UP KRAS_SIGNALING_UP GZMM!"## !4 !2 0 2 4 APOPTOSIS APOPTOSIS APOPTOSIS GZMM P53_PATHWAY P53_PATHWAY P53_PATHWAY Value PRF1()*+ UV_RESPONSE_UP UV_RESPONSE_UP UV_RESPONSE_UP REACTIVE_OXIGEN_SPECIES_PATHWAY PRF1 REACTIVE_OXIGEN_SPECIES_PATHWAY REACTIVE_OXIGEN_SPECIES_PATHWAY PI3K_AKT_MTOR_SIGNALING PI3K_AKT_MTOR_SIGNALING PI3K_AKT_MTOR_SIGNALING GNLY !,-. MTORC1_SIGNALING MTORC1_SIGNALING MTORC1_SIGNALING GNLY UNFOLDED_PROTEIN_RESPONSEIL2_STAT5_SIGNALING UNFOLDED_PROTEIN_RESPONSE UNFOLDED_PROTEIN_RESPONSE MYC_TARGETS_V2 NKG7,'!/ MYC_TARGETS_V2 MYC_TARGETS_V2 DNA_REPAIRALLOGRAFT_REJECTION DNA_REPAIR DNA_REPAIR Color Key NKG7 E2F_TARGETSINFLAMMATORY_RESPONSE E2F_TARGETS E2F_TARGETS FASLG *$0-! G2M_CHECKPOINT G2M_CHECKPOINT G2M_CHECKPOINT

FASLG MYC_TARGETS_V1INTERFERON_GAMMA_RESPONSE MYC_TARGETS_V1 MYC_TARGETS_V1 OXIDATIVE_PHOSPHORYLATION h: C04_Cytotoxic_genes_20170630IFNG1*,! OXIDATIVE_PHOSPHORYLATIONIL6_JAK_STAT3_SIGNALING OXIDATIVE_PHOSPHORYLATION COAGULATION COAGULATION COAGULATION IFNG ANGIOGENESISCOMPLEMENT ANGIOGENESIS ANGIOGENESIS ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD !4 !2 0LUSC ! US LUSC ! US LUSC ! US LUSC ! US 2LUSC ! US LUSC ! US LUSC ! US LUSC ! US 4LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US HYPOXIATNFA_SIGNALING_VIA_NFKB HYPOXIA HYPOXIA ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ! US LUAD ValueLUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US LUSC ! US APICAL_SURFACE APICAL_SURFACE APICAL_SURFACE 234567894:;<87=653:<>7?@ APICAL_JUNCTIONINTERFERON_ALPHA_RESPONSE APICAL_JUNCTION APICAL_JUNCTION AB@C5C5DEF78E=G9C369CEN< TGF_BETA_SIGNALINGUV_RESPONSE_UP TGF_BETA_SIGNALING TGF_BETA_SIGNALING UV_RESPONSE_DNOXIDATIVE_PHOSPHORYLATION UV_RESPONSE_DN UV_RESPONSE_DN :K58:K56@;9CE5G ESTROGEN_RESPONSE_LATEREACTIVE_OXIGEN_SPECIES_PATHWAY ESTROGEN_RESPONSE_LATE ESTROGEN_RESPONSE_LATE COAGULATION ESTROGEN_RESPONSE_EARLY ESTROGEN_RESPONSE_EARLYPI3K_AKT_MTOR_SIGNALING ESTROGEN_RESPONSE_EARLY ANGIOGENESIS MTORC1_SIGNALING cns cns cns skin liver skin liver lung skin liver lung lung HYPOXIA FG8 ovary ovary ovary cervix cervix 8REG breast cervix uterus ;EN<6 breast uterus kidney breast uterus kidney ;3G= thyroid kidney thyroid UNFOLDED_PROTEIN_RESPONSE thyroid bladder G

prostate APICAL_SURFACE stomach stomach stomach !0I$70F56< 5N96@ pancreas pancreas F<6NED IG6EFK<>7EG IG6EFK<>7EGpancreas ?6<98C 3C<638

MYC_TARGETS_V2 RE>G<@ CK@65E> head_neck head_neck head_neck APICAL_JUNCTION ?;9>><6 colon.rectum colon.rectum colon.rectum ! KE=K7=653: ;5J7=653: :658C9C<

S7 8E=GTI0U;5=+VT:QN9;3 Q ANDROGEN_RESPONSEE2F_TARGETS

PROTEIN_SECRETIONG2M_CHECKPOINT F5;5GO6

thyroid COAGULATION

bladder KRAS_SIGNALING_DN prostate

stomach ANGIOGENESIS pancreas PANCREAS_BETA_CELLS HYPOXIA head_neck HEDGEHOG_SIGNALING

colon.rectum APICAL_SURFACE TGF_BETA_SIGNALING APICAL_JUNCTION bone.softtissue UV_RESPONSE_DN EPITHELIAL_MESENCHYMAL_TRANSITION ESTROGEN_RESPONSE_LATE ANDROGEN_RESPONSE ESTROGEN_RESPONSE_EARLY PROTEIN_SECRETION MITOTIC_SPINDLE SPERMATOGENESIS HEME_METABOLISM cns skin liver lung NOTCH_SIGNALING ovary cervix breast uterus kidney

thyroid WNT_BETA_CATENIN_SIGNALING bladder prostate

stomach GLYCOLYSIS pancreas FATTY_ACID_METABOLISM head_neck ADIPOGENESIS colon.rectum CHOLESTEROL_HOMEOSTASIS bone.softtissue XENOBIOTIC_METABOLISM PEROXISOME BILE_ACID_METABOLISM MYOGENESIS KRAS_SIGNALING_DN PANCREAS_BETA_CELLS HEDGEHOG_SIGNALING TGF_BETA_SIGNALING UV_RESPONSE_DN ESTROGEN_RESPONSE_LATE ESTROGEN_RESPONSE_EARLY cns skin liver lung ovary cervix breast uterus kidney thyroid bladder prostate stomach pancreas head_neck colon.rectum bone.softtissue Supplementary Figure7 ColoRect_AdenoCA ColoRect_AdenoCA

Breast_AdenoCA Breast_AdenoCA bioRxiv preprint

Panc_AdenoCA Panc_AdenoCA Immuno-editing Index(IEI)

Bladder_TCC Immune checkpoints Bladder_TCC

Liver_HCC Liver_HCC

Head_SCC Head_SCC

Skin_Melanoma Skin_Melanoma not certifiedbypeerreview)istheauthor/funder.Allrightsreserved.Noreuseallowedwithoutpermission.

Uterus_AdenoCA Uterus_AdenoCA doi:

Lung_AdenoCA Lung_AdenoCA https://doi.org/10.1101/285338 Stomach_AdenoCA Stomach_AdenoCA

Ovary_AdenoCA Ovary_AdenoCA

Lung_SCC Lung_SCC

Kidney_RCC Kidney_RCC

CNS_GBM CNS_GBM

Lymph_BNHL Lymph_BNHL MITOTIC_SPINDLE SPERMATOGENESIS G2M_CHECKPOINT E2F_TARGETS MYC_TARGETS_V2 MYC_TARGETS_V1 OXIDATIVE_PHOSPHORYLATION ESTROGEN_RESPONSE_EARLY ESTROGEN_RESPONSE_LATE UV_RESPONSE_DN TGF_BETA_SIGNALING HEDGEHOG_SIGNALING NOTCH_SIGNALING PANCREAS_BETA_CELLS WNT_BETA_CATENIN_SIGNALING KRAS_SIGNALING_DN ANGIOGENESIS APICAL_SURFACE MYOGENESIS APICAL_JUNCTION BILE_ACID_METABOLISM PEROXISOME CHOLESTEROL_HOMEOSTASIS XENOBIOTIC_METABOLISM ADIPOGENESIS GLYCOLYSIS FATTY_ACID_METABOLISM ANDROGEN_RESPONSE PROTEIN_SECRETION HEME_METABOLISM HYPOXIA COAGULATION UNFOLDED_PROTEIN_RESPONSE DNA_REPAIR MTORC1_SIGNALING PI3K_AKT_MTOR_SIGNALING P53_PATHWAY UV_RESPONSE_UP REACTIVE_OXIGEN_SPECIES_PATHWAY EPITHELIAL_MESENCHYMAL_TRANSITION TNFA_SIGNALING_VIA_NFKB APOPTOSIS KRAS_SIGNALING_UP COMPLEMENT IL6_JAK_STAT3_SIGNALING INTERFERON_GAMMA_RESPONSE INTERFERON_ALPHA_RESPONSE INFLAMMATORY_RESPONSE ALLOGRAFT_REJECTION IL2_STAT5_SIGNALING MITOTIC_SPINDLE SPERMATOGENESIS G2M_CHECKPOINT E2F_TARGETS MYC_TARGETS_V2 MYC_TARGETS_V1 OXIDATIVE_PHOSPHORYLATION ESTROGEN_RESPONSE_EARLY ESTROGEN_RESPONSE_LATE UV_RESPONSE_DN TGF_BETA_SIGNALING HEDGEHOG_SIGNALING NOTCH_SIGNALING PANCREAS_BETA_CELLS WNT_BETA_CATENIN_SIGNALING KRAS_SIGNALING_DN ANGIOGENESIS APICAL_SURFACE MYOGENESIS APICAL_JUNCTION BILE_ACID_METABOLISM PEROXISOME CHOLESTEROL_HOMEOSTASIS XENOBIOTIC_METABOLISM ADIPOGENESIS GLYCOLYSIS FATTY_ACID_METABOLISM ANDROGEN_RESPONSE PROTEIN_SECRETION HEME_METABOLISM HYPOXIA COAGULATION UNFOLDED_PROTEIN_RESPONSE DNA_REPAIR MTORC1_SIGNALING PI3K_AKT_MTOR_SIGNALING P53_PATHWAY UV_RESPONSE_UP REACTIVE_OXIGEN_SPECIES_PATHWAY EPITHELIAL_MESENCHYMAL_TRANSITION TNFA_SIGNALING_VIA_NFKB APOPTOSIS KRAS_SIGNALING_UP COMPLEMENT IL6_JAK_STAT3_SIGNALING INTERFERON_GAMMA_RESPONSE INTERFERON_ALPHA_RESPONSE INFLAMMATORY_RESPONSE ALLOGRAFT_REJECTION IL2_STAT5_SIGNALING ; this versionpostedMarch20,2018.

ColoRect_AdenoCA ColoRect_AdenoCA

Breast_AdenoCA Breast_AdenoCA

Panc_AdenoCA Panc_AdenoCA

Bladder_TCC Bladder_TCC

Liver_HCC Cell component Liver_HCC HLA class I Head_SCC Head_SCC

Skin_Melanoma Skin_Melanoma

Uterus_AdenoCA Uterus_AdenoCA

Lung_AdenoCA Lung_AdenoCA The copyrightholderforthispreprint(whichwas Stomach_AdenoCA Stomach_AdenoCA

Ovary_AdenoCA Ovary_AdenoCA

Lung_SCC Lung_SCC

Kidney_RCC Kidney_RCC

CNS_GBM CNS_GBM

Lymph_BNHL Lymph_BNHL MITOTIC_SPINDLE SPERMATOGENESIS G2M_CHECKPOINT E2F_TARGETS MYC_TARGETS_V2 MYC_TARGETS_V1 OXIDATIVE_PHOSPHORYLATION ESTROGEN_RESPONSE_EARLY ESTROGEN_RESPONSE_LATE UV_RESPONSE_DN TGF_BETA_SIGNALING HEDGEHOG_SIGNALING NOTCH_SIGNALING PANCREAS_BETA_CELLS WNT_BETA_CATENIN_SIGNALING KRAS_SIGNALING_DN ANGIOGENESIS APICAL_SURFACE MYOGENESIS APICAL_JUNCTION BILE_ACID_METABOLISM PEROXISOME CHOLESTEROL_HOMEOSTASIS XENOBIOTIC_METABOLISM ADIPOGENESIS GLYCOLYSIS FATTY_ACID_METABOLISM ANDROGEN_RESPONSE PROTEIN_SECRETION HEME_METABOLISM HYPOXIA COAGULATION UNFOLDED_PROTEIN_RESPONSE DNA_REPAIR MTORC1_SIGNALING PI3K_AKT_MTOR_SIGNALING P53_PATHWAY UV_RESPONSE_UP REACTIVE_OXIGEN_SPECIES_PATHWAY EPITHELIAL_MESENCHYMAL_TRANSITION TNFA_SIGNALING_VIA_NFKB APOPTOSIS KRAS_SIGNALING_UP COMPLEMENT IL6_JAK_STAT3_SIGNALING INTERFERON_GAMMA_RESPONSE INTERFERON_ALPHA_RESPONSE INFLAMMATORY_RESPONSE ALLOGRAFT_REJECTION IL2_STAT5_SIGNALING MITOTIC_SPINDLE SPERMATOGENESIS G2M_CHECKPOINT E2F_TARGETS MYC_TARGETS_V2 MYC_TARGETS_V1 OXIDATIVE_PHOSPHORYLATION ESTROGEN_RESPONSE_EARLY ESTROGEN_RESPONSE_LATE UV_RESPONSE_DN TGF_BETA_SIGNALING HEDGEHOG_SIGNALING NOTCH_SIGNALING PANCREAS_BETA_CELLS WNT_BETA_CATENIN_SIGNALING KRAS_SIGNALING_DN ANGIOGENESIS APICAL_SURFACE MYOGENESIS APICAL_JUNCTION BILE_ACID_METABOLISM PEROXISOME CHOLESTEROL_HOMEOSTASIS XENOBIOTIC_METABOLISM ADIPOGENESIS GLYCOLYSIS FATTY_ACID_METABOLISM ANDROGEN_RESPONSE PROTEIN_SECRETION HEME_METABOLISM HYPOXIA COAGULATION UNFOLDED_PROTEIN_RESPONSE DNA_REPAIR MTORC1_SIGNALING PI3K_AKT_MTOR_SIGNALING P53_PATHWAY UV_RESPONSE_UP REACTIVE_OXIGEN_SPECIES_PATHWAY EPITHELIAL_MESENCHYMAL_TRANSITION TNFA_SIGNALING_VIA_NFKB APOPTOSIS KRAS_SIGNALING_UP COMPLEMENT IL6_JAK_STAT3_SIGNALING INTERFERON_GAMMA_RESPONSE INTERFERON_ALPHA_RESPONSE INFLAMMATORY_RESPONSE ALLOGRAFT_REJECTION IL2_STAT5_SIGNALING 4−2 −4 ColoRect_AdenoCA

Color Key Breast_AdenoCA Value

2 0 Panc_AdenoCA

Bladder_TCC

4 Liver_HCC

Head_SCC Cytotxic

Skin_Melanoma

Uterus_AdenoCA

Lung_AdenoCA

Stomach_AdenoCA

Ovary_AdenoCA

Lung_SCC

Kidney_RCC

CNS_GBM

Lymph_BNHL MITOTIC_SPINDLE SPERMATOGENESIS G2M_CHECKPOINT E2F_TARGETS MYC_TARGETS_V2 MYC_TARGETS_V1 OXIDATIVE_PHOSPHORYLATION ESTROGEN_RESPONSE_EARLY ESTROGEN_RESPONSE_LATE UV_RESPONSE_DN TGF_BETA_SIGNALING HEDGEHOG_SIGNALING NOTCH_SIGNALING PANCREAS_BETA_CELLS WNT_BETA_CATENIN_SIGNALING KRAS_SIGNALING_DN ANGIOGENESIS APICAL_SURFACE MYOGENESIS APICAL_JUNCTION BILE_ACID_METABOLISM PEROXISOME CHOLESTEROL_HOMEOSTASIS XENOBIOTIC_METABOLISM ADIPOGENESIS GLYCOLYSIS FATTY_ACID_METABOLISM ANDROGEN_RESPONSE PROTEIN_SECRETION HEME_METABOLISM HYPOXIA COAGULATION UNFOLDED_PROTEIN_RESPONSE DNA_REPAIR MTORC1_SIGNALING PI3K_AKT_MTOR_SIGNALING P53_PATHWAY UV_RESPONSE_UP REACTIVE_OXIGEN_SPECIES_PATHWAY EPITHELIAL_MESENCHYMAL_TRANSITION TNFA_SIGNALING_VIA_NFKB APOPTOSIS KRAS_SIGNALING_UP COMPLEMENT IL6_JAK_STAT3_SIGNALING INTERFERON_GAMMA_RESPONSE INTERFERON_ALPHA_RESPONSE INFLAMMATORY_RESPONSE ALLOGRAFT_REJECTION IL2_STAT5_SIGNALING bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. bladder bone.softtissue breast cervix cns 1.00

0.75

0.50 0.25 Supplementary Figure 8 0.00 <=76 colon.rectum head!neck kidney liver !GBHIAJI'lung 1.00 bladder bone.softtissue breast cervix cns 20 0.75 6 30 75 15 0.50 40

4 20 50 10 0.25 20 2 10 25 5 !3043?@:@37 0.00 A104+* 0 lymphoid0 ovary 0 pancreas 0 prostate 0 skin 9 !"#$>? *942*??@37 1.00 colon.rectum head!neck kidney liver lung 60 80 0.75 120 40

60 20 40 30 0.50 80 40 0.25 20 10 20 40 20 10 0.00

0 0 0 0 0 stomach thyroid A104+* lymphoid1.00 bladder ovary bone.softtissue pancreas breast prostatecervix skincns"*)3043?@:@37 600 60 20 0.75 60 60 4 15 1.2 400 0.50 40 0.75 40 40 4 10 !"#$ 0.25 3 2 20 200 20 0.8 20 3 0.50 5 0.00 2 !"%&'()*++ 0 0 0 0 0 2 1 0.4 !"#&'()*++ stomach0.25 thyroid CD8 T cells gamma delta T cells Macrophages M2 Plasma cells 1 1 ,-()*++ 80 50 CD4 T cells Natural killer cells Dendritic cells Mast cells Regulatory CD4 T cells Mo/Macrophages B cells Neutrophils .373&8*9)*4:&./; 60 0.00 40 0 0 0.0 0 ./&01)234516* 30 40 0.0 0.5 1.0 1.5 0 1 2 3 0 1 2 3 4 5 0.0 0.5 1.0 1.5 2.0 0.0 0.5 1.0 B()*++ 20 A104+* 20 C+3D&)E:30*:2E(+@F*&4+3: colon.rectum10 head!neck kidney liver lung 0 2.0 0 3 4 4 3 CD8 T cells gamma delta T cells Macrophages M2 Plasma cells 1.5 3 CD4 T2 cells Natural 3killer cells Dendritic cells Mast cells Regulatory CD4 T cells Mo/Macrophages B cells Neutrophils 2 1.0 2 2 01)234516* 1

1./& 0.5 1 1 0.0 0 0 0 0 0.0 0.5 1.0 1.5 2.0 2.5 0 1 2 3 4 0 2 4 6 0 1 2 3 4 0 1!"%&' 2()*++ 3 lymphoid ovary pancreas prostate skin 5 1.5 4 3 4 3 qt.Macrophages_M2 1.0 3 3 2 2 2 2 0.5 1 1 1 1 0 0 0 0.0 0 0 2 4 6 0 1 2 3 4 0 1 2 0.0 0.5 1.0 1.5 2.0 0 1 2 3 4 5 stomach thyroid 3 2.5 2.0 2 1.5 1.0 1 0.5 0 0.0 0 1 2 3 4 0 1 2 3 qt.T_cells_CD8 bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Supplementary Figure 9 Supplementary Table 1

bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. ADORA2A CASP8 CXCL8 HRAS KRAS PTEN TNFRSF14 AIFM1 CASP9 CYCS ICOS LAG3 PTGS1 TNFRSF18 ALOX12B CCL2 DFFA ICOSLG LAP3 PTGS2 TNFRSF1A ALOX15B CCL22 DFFB IDO1 LGALS1 PVR TNFRSF1B APAF1 CCL28 EBAG9 IDO2 LGALS3 PVRIG TNFRSF21 ARG1 CCL3 ENDOG IFI30 LGALS9 RAET1E TNFRSF25 ARG2 CCL5 ENTPD1 IFNA1 LGMN RAET1G TNFRSF4 B2M CD113 ERAP1 IFNA17 LILRB1 RAET1L TNFRSF6B B3GAT1 CD160 ERAP2 IFNA2 LILRB2 RFX5 TNFRSF9 BAD CD200 FADD IFNA7 LNPEP RFXANK TNFSF10 BAX CD200R1 FADD IFNA8 LTA RFXAP TNFSF10 BCL10 CD226 FAS IFNAR1 MARCH1 RIPK1 TNFSF12 BCL2 CD244 FASLG IFNAR2 MARCH4 RIPK3 TNFSF14 BCL2L1 CD27 FASLG IFNB1 MARCH8 SCAF11 TNFSF15 BCL6 CD274 FURIN IFNG MARCH9 SEC61A1 TNFSF18 BID CD276 GZMA IFNGR1 MCL1 SEC61A2 TNFSF4 BIRC2 CD28 GZMB IL10 MFGE8 SEC61B TNFSF9 BIRC3 CD40 GZMH IL13 MICA SEC61G TPP2 BIRC5 CD40LG GZMK IL33 MICB SERPINB9 TRADD BIRC6 CD47 GZMM IL4 MLKL SIRPA TRAF2 BIRC7 CD48 HAVCR2 IL6 MYC SOCS1 TSLP BIRC8 CD70 HHLA2 JAK1 NECTIN2 STAT1 ULBP1 BTLA CD74 HLA-A JAK2 NLN TAP1 ULBP2 BTN1A1 CD80 HLA-B JAK3 NLRC5 TAP2 ULBP3 BTN2A2 CD86 HLA-C KIR2DL1 NOS1 TAPBP VEGFA BTN3A1 CD96 HLA-DMA KIR2DL3 NOS2 TAPBPL VTCN1 BTNL2 CEACAM1 HLA-DMB KIR2DL4 NOS3 TDO2 XIAP C10orf54 CFLAR HLA-DOA KIR2DS1 NRAS TGFB1 CADM1 CIITA HLA-DOB KIR3DL1 NRDC TGFB2 CALR CRTAM HLA-DPA1 KIR3DL2 NT5E TGFB3 CANX CSF1 HLA-DPB1 KIR3DL3 PDCD1 THOP1 CASP1 CSF2 HLA-DQA1 KIR3DL4 PDCD1LG2 TIGIT CASP10 CSF3 HLA-DQA2 KIR3DS1 PDIA3 TMIGD2 CASP12 CTLA4 HLA-DQB1 KLRB1 PRF1 TNF CASP2 CTNNB1 HLA-DRA KLRC1 PSMB10 TNF CASP3 CTSL HLA-DRB1 KLRC2 PSMB8 TNFRSF10A CASP4 CTSS HLA-E KLRD1 PSMB9 TNFRSF10B CASP5 CXCL12 HLA-F KLRF1 PSME1 TNFRSF10C CASP6 CXCL17 HLA-G KLRG1 PSME2 TNFRSF10D CASP7 CXCL5 HLA-G KLRK1 PSME3 TNFRSF12A bioRxiv preprint doi: https://doi.org/10.1101/285338; this version posted March 20, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Supplementary Table 2 HLA class I Cytotxic Immune checkpoints Cell component

HLA-A GZMA PDCD1 PTPRC HLA-B GZMB CTLA4 CD2 HLA-C GZMH HAVCR2 CD3G B2M GZMK LAG3 CD3E PSMB8 GZMM BTLA CD4 PSMB9 PRF1 TIGIT CD5 PSMB10 GNLY CD96 CD7 TAP1 NKG7 CD200R1 CD8A TAP2 FASLG LILRB1 KLRK1 TAPBP IFNG LILRB2 KLRB1 NLRC5 CD160 KLRD1 CD19 MS4A1 ITGAX ITGAM CD14 CD33