Supporting Information

Kazmin et al. 10.1073/pnas.1621489114

SI Methods

GeneArray Probe Level Processing. Total mRNA was isolated from frozen PBMCs provided by Walter Reed Army Institute for Research by using the Qiagen RNeasy kit (Qiagen) and stored at −80 °C. RNA was then quantified and checked for integrity by using the Agilent BioAnalyzer (Agilent Technologies). For label preparation, 50 ng of total mRNA was amplified and labeled by using the NuGEN Ovation WB target labeling kit (NuGEN) and hybridized to the HU-133 plus 2.0 GeneChip (Affymetrix). Samples were processed in batches of 96; special care was taken to ensure that all time points from a particular subject were processed in the same batch. Subjects were assigned to batches in a way that balanced the distribution according to arm and protection status. Two pooled references were included in each batch to guard against batch effects.

Posthybridization QC analysis was performed by using the Bioconductor packages arrayQualityMetrics (www.bioconductor.org/packages/release/bioc/html/arrayQualityMetrics.html) and AffyQCReport. Spike controls were evaluated following RMA normalization. Based on probe level QC analysis and RNA quality control, 56 of 639 samples were removed from analysis. Background correction on the remaining samples was performed by GC-RMA, with probe summarization by median polish, followed by log2 transform. Procedures were implemented to detect the contribution of batch to the overall variance (batch effect). No batch effect was observed. Microarray data have been submitted to National Center for Biotechnology Information GEO (accession no. GSE89292).

Significance Testing. Significance testing comparing the expression at each time point to the baseline at D0 was performed by paired two-sided t test, followed by Benjamini–Hochberg FDR correction as described (24). Genes with q values <0.05 and absolute fold change greater than 1.5 were considered to be significantly regulated.

GSEA Analysis. For GSEA analysis (37), either fold change with respect to baseline (D0) or correlation coefficients were used as a ranking metric. For genes represented by multiple probe sets, the probe set with the most extreme value of the ranking metric was used. GSEA was run with 10,000 permutations by using the Java interface (Broad Institute, software.broadinstitute.org/gsea/index.jsp) or standalone R code (37). When BTM modules were used, modules with fewer than 10 gene identifiers were ignored.

SPICE Analysis. Analysis and presentation of distributions was performed by using SPICE version 5.3, downloaded from https://exon.niaid.nih.gov (37). Comparison of distributions was performed by using a Student's t test and a partial permutation test as described (37). Briefly, a χ²-like test is used to test the null hypothesis that the measurements for every possible combination of cytokines (15 combinations for four cytokines, not including the all-negative subset) are drawn from the same distribution in the two groups (for example, ARR vs. RRR at D14), that is, that the two comparison groups are not significantly different. Once the χ² metric is obtained, we need to test whether this value represents a significant difference or could be obtained stochastically. To test this possibility, we use a partial permutation test. In this approach, all samples from the two groups being compared are pooled into one set, and the groups are reformed by randomly assigning each sample to one or the other group. The χ² statistic is calculated for each of these random permutations. After a large number (1,000–10,000) of permutations, it becomes possible to estimate how frequently the value of the χ² statistic obtained on the original (nonpermuted) combination of samples could be obtained by randomly permuting the samples, that is, stochastically. This frequency forms the P value for the test (for instance, P = 0.05 indicates that there is no more than a 5% probability that the χ² value obtained in this comparison could be achieved by a stochastic (random) assignment of samples to the comparison groups).

Enrichment Analysis. Enrichment analysis was performed by using Gather (changlab.uth.tmc.edu/gather/gather.py) or Reactome (www.reactome.org).

Predictive Modeling. Predictive models were built using DAMIP. Application of DAMIP to vaccine clinical studies has been described earlier (24, 25). Briefly, subjects in each arm were separated into training and blind test sets. The criteria for subject selection were (i) no fewer than 10 subjects must have nonmissing data at each time point (because some samples were removed after QC) and (ii) the ratio of protected to nonprotected subjects in the training set closely matches the ratio in the overall set of subjects. Feature selection and model training were performed through 10-fold cross-validation (10× CV) loops in the training set. The predictive models/rules/signatures were selected by using a predefined accuracy cutoff for predicting the protection status (typically in the ≥80% accuracy range). Models passing filtering criteria in the training set were then evaluated in the blind test set. Models passing filtering criteria in the blind test set were reported as candidate signatures of protection.

Analysis of ASC Responses. Analysis of ASC responses was performed as described (30).

Analysis of CSP-Specific T-Cell Responses. FACS analysis of CSP-specific CD4+ and CD8+ T-cell responses was performed as described (18).
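The partial permutation test described under SPICE Analysis can be sketched as follows. This is a minimal illustration, not the SPICE implementation: the form of the χ²-like statistic (squared difference of per-category group means divided by their sum), the add-one correction in the P value, and all function names are assumptions made for the example. Each sample is a list of frequencies, one per cytokine combination.

```python
import random


def chi2_like(group_a, group_b):
    """Illustrative chi-square-like distance between two groups of samples.

    For each category (cytokine combination), compare the group means:
    (mean_a - mean_b)^2 / (mean_a + mean_b). The exact statistic used by
    SPICE is not given in the text; this form is an assumption.
    """
    n_categories = len(group_a[0])
    total = 0.0
    for k in range(n_categories):
        mean_a = sum(sample[k] for sample in group_a) / len(group_a)
        mean_b = sum(sample[k] for sample in group_b) / len(group_b)
        denom = mean_a + mean_b
        if denom > 0:  # skip categories that are empty in both groups
            total += (mean_a - mean_b) ** 2 / denom
    return total


def permutation_p_value(group_a, group_b, n_perm=1000, seed=0):
    """Pool the samples, reshuffle group labels, and count how often the
    permuted statistic is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = chi2_like(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if chi2_like(pooled[:n_a], pooled[n_a:]) >= observed:
            hits += 1
    # Add-one correction (an assumption here) keeps the estimate above zero.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    arr = [[10, 0, 0]] * 5   # hypothetical samples, three categories each
    rrr = [[0, 10, 0]] * 5
    stat, p = permutation_p_value(arr, rrr)
    print(f"chi2-like = {stat:.2f}, permutation P = {p:.4f}")
```

With 1,000 to 10,000 permutations, as in the text, the smallest attainable P value under this sketch is roughly 1/1,001 to 1/10,001.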

Kazmin et al. www.pnas.org/cgi/content/short/1621489114

[Figure S1 image. Panel a: anti-CS repeat region concentration (EU/mL) vs. days post-primary immunization, RRR and ARR. Panel b: anti-HBsAg concentration (mIU/mL, log10) vs. days post-primary immunization, RRR and ARR. Panel c: CSP-specific ASCs per million PBMCs (log2) for IgA, IgG, and IgM vs. days post-primary immunization, RRR and ARR. Panel d: SPICE plots for ARR and RRR at days 14, 42, 77, 105, 140, and 236 post-primary immunization; P values shown: p=0.00674, p<0.00001, p=0.00348, p=0.0399, p=0.01876, p=0.1177; markers: CD40L, IL-2, TNF-α, IFN-γ.]

Fig. S1. Immune responses to vaccination. (A and B) Baseline-normalized anti-CS repeat region antibody concentration (A) and anti-HBsAg antibody concentration (B) in RRR and ARR groups after vaccination. Lines indicate median across all subjects; shaded areas indicate 25–75% interquartile range. Asterisks indicate significance at the α = 0.05 level by Wilcoxon signed rank test. (C) Frequency of CSP-specific antibody-secreting cells by ELISPOT. Each dot represents a subject. (D) SPICE plots illustrating the functionality of CSP-specific CD4+ T cells by vaccination at each time point. Sectors indicate the number of markers expressed: blue, one marker; green, two markers; orange, three markers; red, four markers. Arcs indicate which markers are expressed. A significance test was performed to assess whether the distributions of T-cell populations for the two vaccines came from the same distribution or from different ones. The sum of χ² values was used as the test metric. P values were generated by partial permutation test. For more details, see ref. 38.

[Figure S2 image. Panel a: Inflammatory/TLR/Chemokines, normalized enrichment score (sum) vs. day, RRR and ARR. Panel b: Cell Cycle, normalized enrichment score (sum) vs. day, RRR and ARR. Days shown: 0, 1, 2, 6, 14, 28, 29, 34, 56, 57, 62, 77, 78, 82; immunizations are marked at days 0, 28, and 56 and challenge at day 77. Panel c: volcano plot of -log10 (p-value) vs. log2 (fold change); labeled genes: LINC00487, SIGLEC1, LY6E, IFI27, EPHB2, LGALS3BP, IFI6, IFI44L, SERPING1, HESX1.]

Fig. S2. Transcriptional responses to vaccination. (A and B) Functional responses to vaccination in the two cohorts. GSEA using BTMs as gene sets was performed on lists of genes ranked by fold change relative to prevaccination baseline. Plotted are the sums of normalized enrichment scores across high-level annotation groups of BTMs displaying statistically significant (FDR q < 0.05) enrichment. Additional details are provided in Dataset S1. (C) Differentially expressed probe sets between Ad35-prime and RTS,S/AS01-prime vaccines at D1 after prime immunization. Criteria used are raw P value < 0.01 and |fold change| > 1.5.
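The panel C selection criteria can be written as a simple filter. The sketch below assumes fold changes are stored on a log2 scale, so the 1.5-fold cutoff becomes |log2 fold change| > log2(1.5), about 0.585; the function and parameter names are illustrative and not taken from the study's code.

```python
import math


def is_differential(p_value, log2_fold_change, p_cutoff=0.01, fold_cutoff=1.5):
    """Return True if a probe set passes both criteria:
    raw P value below p_cutoff and absolute fold change above fold_cutoff.
    The fold-change cutoff is converted to the log2 scale before comparing."""
    return p_value < p_cutoff and abs(log2_fold_change) > math.log2(fold_cutoff)


if __name__ == "__main__":
    # Hypothetical probe sets: (raw P value, log2 fold change)
    examples = [(0.005, 1.0), (0.005, 0.5), (0.02, 2.0), (0.001, -1.0)]
    for p, lfc in examples:
        print(p, lfc, is_differential(p, lfc))
```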

[Figure S3 image. Two heat maps, "Schoggins gene set" (left) and "Querec gene set" (right). Rows are labeled with gene symbols; columns are grouped by vaccine (RRR, ARR) and day of study (1, 2, 6, 14, 28, 29, 34, 56, 57, 62, 77, 78, 82). The two color scales run from -3.97 to 7.56 and from -1.77 to 7.56.]

Fig. S3. Magnitude and kinetics of induction of antiviral and type I IFN-related genes by the ARR and RRR vaccines. The list of antiviral and type I IFN-related genes, indicated as "Schoggins gene set" in the heat map on the left, was obtained from Schoggins et al. (27); the list of genes indicated as "Querec gene set" in the heat map on the right represents genes induced by YF-17D vaccination (24). For the Schoggins gene set, only probe sets that display more than fourfold induction at D1 after primary vaccination with either Ad35.CS or RTS,S/AS01 are included.

[Figure S4 image. Two bubble plots, RRR and ARR. Rows: Inflammatory/TLR/chemokines, Interferon/antiviral sensing, Antigen presentation, DC activation, Plasma cells, Neutrophils, Monocytes, Cell cycle, NK cells, B cells, T cells. Columns (days): D1, D2, D6, D14, D28, D29, D34, D56, D57, D62, D77, D78, D82, with prime, boost, and challenge marked. Direction of enrichment: POSITIVE or NEGATIVE.]

Fig. S4. Functional enrichment among genes regulated in response to vaccination. BTMs are grouped into high-level annotation groups based on common pathways or cell lineages they represent. Size of circles indicates the sum of normalized enrichment scores for BTMs included in high-level annotation groups. Only BTMs with statistically significant enrichment (FDR q value < 0.05) are included in the sums. Full details are available in Dataset S1.

[Figure S5 image. Spider plots with radial scale from -2 to 2 and axes: B cells, Plasma cells, Cell cycle, NK cells, Neutrophils, Monocytes, Interferon/antiviral, Inflammatory/TLR/chemokines, DC activation, Antigen presentation, Mitochondrial, ECM and migration, Signal transduction, T cells. Panel a (RRR): post-primary (D1, D2, D6, D14), post-first boost (D28, D29, D34), post-second boost (D56, D57, D62). Panel b (ARR): D1, D2, D6.]

Fig. S5. Molecular associations of immunogenicity. Illustration of GSEA with genes ranked by correlation to the immunogenicity endpoints. Spider plots indicate the average normalized enrichment scores across all modules included in each of the represented high-level annotation groups. The distance from the center of the plot corresponds to the average NES across all BTMs included in the corresponding high-level annotation group. Only modules with FDR q values <0.05 are included in the count; for the remaining modules, the NES is set to zero. Translucent blue zones indicate negative enrichment. (A) Correlates of anti-CSP repeat region IgG concentrations at the day of challenge, shown for each of the prechallenge time points for the RRR arm. (B) Correlates of the frequencies of CS-specific CD4+ T cells expressing three or more markers at D14 postprime immunization in the ARR arm. Time points before D14 are shown. Full details are provided in Dataset S2.
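The radial value plotted for each annotation group can be sketched as below. This is one reading of the caption, stated as an assumption: modules that fail the FDR cutoff contribute an NES of zero but still count in the denominator, so the average is taken over all BTMs in the group. The function name and the input layout (group, NES, q value) are hypothetical.

```python
from collections import defaultdict


def group_average_nes(modules, q_cutoff=0.05):
    """Average NES per high-level annotation group.

    modules: iterable of (group_name, nes, q_value) tuples, one per BTM.
    Modules with q_value >= q_cutoff are treated as NES = 0 (assumed reading
    of the caption) and remain in the denominator.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for group, nes, q_value in modules:
        counts[group] += 1
        if q_value < q_cutoff:
            sums[group] += nes
    return {group: sums[group] / counts[group] for group in counts}


if __name__ == "__main__":
    # Hypothetical GSEA results for three modules in two groups
    example = [
        ("Monocytes", 2.0, 0.01),
        ("Monocytes", 1.0, 0.20),   # not significant: counted as zero
        ("NK cells", -1.5, 0.03),
    ]
    print(group_average_nes(example))
```

Under the alternative reading, where nonsignificant modules are dropped from the denominator as well, the averages would be larger in magnitude; the caption does not settle which was used.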

[Figure S6 image. Heat maps of normalized enrichment scores (color scale -4 to 4). Panel a (RRR): days 1, 2, 6, 14, 28, 29, 34, 56, 57, 62. Panel b (ARR): days 1, 2, 6. Rows are labeled with BTM names and identifiers. Sidebar annotation groups: B cells, Plasma cells, Cell cycle, Cytoskeleton and adhesion, Mitochondrial, Antigen presentation and DC activation, Inflammatory/TLR/chemokines, Interferon/antiviral, Monocytes, NK cells, Signal transduction, Neutrophils.]

Fig. S6. Transcriptional correlates of immunogenicity. Each square represents a BTM. Normalized enrichment scores for representative modules are illustrated by color. Assignment of a BTM to a high-level annotation group is illustrated by a colored sidebar. (A) RRR. (B) ARR. Full details are available in Dataset S2.

[Figure S7 image. Panels a (RRR) and b (ARR): spider plots for post-primary (D1, D2, D6, D14), post-first boost (D28, D29, D34), and post-second boost (D56, D57, D62) time points, with axes: B cells, Plasma cells, Cell cycle, NK cells, Neutrophils, Monocytes, Interferon/antiviral, Inflammatory/TLR/chemokines, DC activation, Antigen presentation, Mitochondrial, ECM and migration, Signal transduction, T cells. Panel c (ARR): heat map of representative BTMs at days 1, 2, 6, 14, 28, 29, 34, 56, 57, 62, with sidebar annotation groups: B cells, Plasma cells, Cell cycle, Mitochondrial, Antigen presentation and DC activation, Inflammatory/TLR/chemokines, Interferon/antiviral, Monocytes, NK cells, Signal transduction. A marker indicates modules that constitute common correlates of immunogenicity and protection at one or more time points.]

Fig. S7. Molecular associations of protection. (A and B) Illustration of GSEA with mRNA expression being ranked by correlation to the day of positive smear. For protected individuals, day of smear is set to 28 (the end of the observation period). Spider plots indicate the average normalized enrichment scores across all modules included in each of the represented high-level annotation groups. The distance from the center of the plot corresponds to the average NES across all BTMs included in the corresponding high-level annotation group. Only modules with FDR q values less than 0.05 are included in the count. Translucent blue zones indicate negative enrichment. (A) Correlates of protection for the RRR vaccine. (B) Correlates of protection for the ARR vaccine. Common associations of immunogenicity and protection are indicated by color on heatmaps. Full details are provided in Datasets S3 and S4. (C) Enrichment of representative BTMs that correlate with protection. Each square represents a BTM. Normalized enrichment scores for representative modules are illustrated by color. Assignment of a BTM to a high-level annotation group is illustrated by a colored sidebar.

[Figure S8 image. Panel a: heat map with color scale from -2 to 2. Panel b: metagene expression. Both panels span time points D1, D2, D6, D14, D28, D29, D34, D56, D57, D62, D77, D78, D82, with prime, boost, boost, and challenge marked.]

Fig. S8. Integration of the current dataset with the previously reported transcriptional response in a RTS,S/AS01 CHMI study. (A and B) Expression of inflammation-related genes from ref. 32. (A) Expression of 63 inflammation-related genes identified as being up-regulated 24 h after the third immunization in ref. 32 in the RRR cohort of the current study. (B) A metagene was formed, representing average expression of these 63 genes, and the expression of this metagene was monitored in individual subjects across time points. Boxplots indicate mean, 25–75% interquartile range, and 95% range. Each dot represents a subject.
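The metagene in panel B is described as the average expression of the 63 genes. A minimal sketch of that averaging follows, assuming expression values are held as one list per gene with one entry per sample; the data layout, the handling of genes missing from the data, and the function name are assumptions for illustration.

```python
from statistics import mean


def metagene(expression, gene_set):
    """Average expression across a gene set, computed per sample.

    expression: dict mapping gene symbol -> list of values, one per sample.
    gene_set: iterable of gene symbols; genes absent from `expression`
    are skipped (an assumption, not stated in the text).
    """
    present = [gene for gene in gene_set if gene in expression]
    if not present:
        raise ValueError("none of the genes in the set are in the data")
    n_samples = len(expression[present[0]])
    return [mean(expression[gene][i] for gene in present)
            for i in range(n_samples)]


if __name__ == "__main__":
    # Hypothetical log2 expression for three genes in two samples
    data = {"A": [1.0, 3.0], "B": [3.0, 5.0], "C": [100.0, 100.0]}
    print(metagene(data, ["A", "B"]))
```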

[Figure S9 image. Schematic linking transcriptional correlates of immunogenicity and protection with immune correlates, with + and - signs marking the direction of each association. RRR: time labels are day 1 post primary and secondary, day of final vaccination, and day of challenge; nodes include signatures of monocytes, TLR signaling, and antiviral response; signatures of B and plasma cells; signatures of NK cells; and magnitude of the CSP-specific response. ARR (Ad35 prime): time labels are days 1 and 2 post primary, days 1 and 2 post primary and secondary, and two weeks post primary; nodes include signatures of monocytes, TLR signaling, and antiviral response; signatures of NK cells; and magnitude of the CSP-specific poly CD4+ T cell response.]

Fig. S9. A proposed model diagram of mechanisms of protection in the RRR and ARR vaccines.

Table S1. Top 10 representative genes with high frequency of occurrence in the predictive signatures of protection

Probe set ID | Gene name | Gene symbol | Frequency, of 99 signatures
208198_x_at | Killer cell Ig-like receptor, two domains, short cytoplasmic tail, 1 | KIR2DS1 | 57
207647_at | Chromodomain protein, Y-linked, 1 | CDY1 | 16
214277_at | COX11 cytochrome c oxidase assembly homolog (yeast) | COX11 | 15
214575_s_at | Azurocidin 1 | AZU1 | 12
214940_s_at | Smg-6 homolog, nonsense mediated mRNA decay factor (C. elegans) | SMG6 | 12
220357_s_at | Serum/glucocorticoid regulated kinase 2 | SGK2 | 11
208949_s_at | Lectin, galactoside-binding, soluble, 3 | LGALS3 | 9
211397_x_at | Killer cell Ig-like receptor, two domains, long cytoplasmic tail, 2 | KIR2DL2 | 9
220545_s_at | Testis-specific serine kinase substrate | TSKS | 9
220888_s_at | Cas scaffolding protein family member 4 | CASS4 | 9

Other Supporting Information Files

Dataset S1 (XLSX)
Dataset S2 (XLSX)
Dataset S3 (XLSX)
Dataset S4 (XLSX)
Dataset S5 (XLSX)
Dataset S6 (XLSX)
Dataset S7 (XLSX)
Dataset S8 (XLSX)
Dataset S9 (XLSX)
Dataset S10 (XLSX)
Dataset S11 (XLSX)
