Coregulation Mapping Based on Individual Phenotypic Variation in Response to Virus Infection

Supporting Information

Correlation computations

For pairs of quantities $(x_i, y_i)$, $i = 1, \ldots, n$, where $n$ is the number of samples ($n = 145$ for the data in the main text), Pearson’s correlation ($r$) is given by

$$ r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}, $$

where $\bar{x}$ and $\bar{y}$ are the sample averages of the $x_i$ and the $y_i$, respectively.
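As an illustration, Pearson’s correlation can be computed directly from its definition. The following is a minimal sketch in plain Python; the function name `pearson_r` is chosen here for illustration and is not taken from the analysis pipeline used in the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation of two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of products of deviations from the sample averages.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the sums of squared deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

For example, `pearson_r([1, 2, 3, 4], [2, 4, 6, 8])` evaluates a perfectly linear relationship.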

Let $R_i$ be the rank of $x_i$ among all the $x$’s and $S_i$ the rank of $y_i$ among all the $y$’s. If some of the $x_i$’s ($y_i$’s) are identical, each of the tied values is assigned the mean of the ranks they would have had if their values had been slightly different. Spearman’s correlation ($\rho$) captures monotonic correlation and is defined in the same way as Pearson’s correlation, except that $(x_i, y_i)$ is replaced by the ranks $(R_i, S_i)$. Let $\bar{R}$ and $\bar{S}$ be the sample averages of the $R_i$ and the $S_i$, respectively. Spearman’s correlation is given by

$$ \rho = \frac{\sum_i (R_i - \bar{R})(S_i - \bar{S})}{\sqrt{\sum_i (R_i - \bar{R})^2}\,\sqrt{\sum_i (S_i - \bar{S})^2}}. $$
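The tie handling and the rank-based definition can be sketched as follows. This is a minimal plain-Python illustration; the helper names `midranks` and `spearman_rho` are hypothetical and do not refer to the software used for the study.

```python
import math


def midranks(values):
    """1-based ranks; tied values all receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson's formula applied to the mid-ranks of xs and ys."""
    r, s = midranks(xs), midranks(ys)
    n = len(r)
    if n != len(s) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    r_bar, s_bar = sum(r) / n, sum(s) / n
    srs = sum((a - r_bar) * (b - s_bar) for a, b in zip(r, s))
    srr = sum((a - r_bar) ** 2 for a in r)
    sss = sum((b - s_bar) ** 2 for b in s)
    if srr == 0 or sss == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return srs / math.sqrt(srr * sss)
```

Because only ranks enter the formula, any strictly increasing transformation of the data leaves the result unchanged.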

Kendall’s correlation ($\tau$) is more non-parametric than Spearman’s. Instead of using the numerical difference of ranks, it uses only the relative ordering of ranks (values). In this case the data do not have to be ranked at all, because the relative ordering (higher, lower, or the same) of the ranks is equivalent to that of the values. Now consider all $n(n-1)/2$ pairs of data points. (i) A pair is called concordant if the relative ordering of the ranks of the two $x$’s (or of the two $x$’s themselves) is the same as the relative ordering of the ranks of the two $y$’s (or of the two $y$’s themselves). (ii) A pair is called discordant if the relative ordering of the two $x$’s is opposite to that of the two $y$’s. (iii) If there is a tie in the ranks of the two $x$’s but not in the ranks of the two $y$’s, the pair is called an “extra y pair”. (iv) An “extra x pair” is defined similarly, with a tie in the two $y$’s but not in the two $x$’s. (v) If the tie is in both the $x$’s and the $y$’s, the pair is ignored in the computation. Kendall’s correlation ($\tau$) is defined by

$$ \tau = \frac{\text{concordant} - \text{discordant}}{\sqrt{\text{concordant} + \text{discordant} + \text{extra\_x}}\;\sqrt{\text{concordant} + \text{discordant} + \text{extra\_y}}}. $$
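The pair-counting rules (i) to (v) can be sketched as a direct loop over all pairs. This is a minimal plain-Python illustration that assumes the denominator is the product of the two square roots, as in the standard tie-corrected form of Kendall’s correlation; the function name `kendall_tau` is hypothetical.

```python
import math
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's correlation with the tie handling described in rules (i)-(v)."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have equal length")
    concordant = discordant = extra_x = extra_y = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(xs, ys)), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue              # (v) tie in both x and y: pair is ignored
        if dx == 0:
            extra_y += 1          # (iii) tie in x only: "extra y pair"
        elif dy == 0:
            extra_x += 1          # (iv) tie in y only: "extra x pair"
        elif dx * dy > 0:
            concordant += 1       # (i) same relative ordering
        else:
            discordant += 1       # (ii) opposite relative ordering
    denom = math.sqrt(concordant + discordant + extra_x) * math.sqrt(
        concordant + discordant + extra_y
    )
    if denom == 0:
        raise ValueError("correlation is undefined when all pairs are tied")
    return (concordant - discordant) / denom
```

The loop visits all $n(n-1)/2$ pairs, so its cost grows quadratically with the number of samples, which is negligible for 145 samples.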

Supporting tables

Table S1. The copy numbers of mRNA for all 34 genes. The measurements were obtained by qPCR and are plate-corrected. The top block gives the expression levels for all 134 samples; the bottom block gives the 6 repeats used to check experimental variation.

Table S2. Summary of the gene expression data for all 34 genes. The median and MAD are given for the copy numbers, and the mean and standard deviation are given for the log2 copy numbers. The first four columns are for the population data of 145 samples. The next two columns are for the 6 repeats used to check experimental variation. The last column gives the percentage of the donor population variation that is contributed by the experimental variation. The four blue genes with a crossed line have low gene expression, and the entry in the last column is colored red when the experimental variation is significantly smaller than the population variation (p-value of the F-test no greater than 0.05). The median and MAD (median absolute deviation) indicate the center and spread of the non-Gaussian distribution of the mRNA copy numbers across the 145 samples. The median is the point such that 50% of the data fall below it and 50% above it. MAD is defined as $1.48 \cdot \mathrm{median}(|x_i - m|)$, where $m = \mathrm{median}(x_i)$ and the $x_i$ are the mRNA copy numbers from the 145 samples.
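The MAD definition can be illustrated with a short sketch using Python’s standard `statistics` module. The function name `scaled_mad` and the example values are illustrative only and are not data from the study.

```python
from statistics import median


def scaled_mad(values, scale=1.48):
    """Median absolute deviation about the median, multiplied by `scale`."""
    values = list(values)
    m = median(values)
    # Median of the absolute deviations from the sample median.
    return scale * median(abs(v - m) for v in values)


# Illustrative values: a single outlier barely affects the median or the MAD.
example = [1, 2, 3, 4, 100]
example_median = median(example)
example_mad = scaled_mad(example)
```

With the factor 1.48, the MAD is on roughly the same scale as the standard deviation for Gaussian data, while remaining insensitive to outliers.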

Table S3. Three types of pairwise correlation among the 13 filtered genes. Pearson’s (A), Spearman’s (B) and Kendall’s (C) correlation coefficients for all pairs of genes are shown. Since the matrix for each correlation is symmetric, only the lower triangle is displayed. All p-values for the correlations are less than $5 \times 10^{-4}$.

Table S4. TF enrichment analysis for pairs of genes. The TF prediction was phylogenetically constrained by human-chimp conservation. (A) All data; (B) GO-term-filtered data. Only entries with non-zero TSS are listed, ordered by TSS. The two genes in each pair are joined by a dot in the table.

Table S5. TF enrichment analysis for GO-filtered four-member clusters.

Table S6. PCR primers for all 34 genes and beta-actin.

Table S1 – See attached Excel File

Table S2

          population (copy #) population (log2)   experimental (log2)
Gene      median    MAD       mean      sd        mean      sd        exp var %
B2M       876.65    301.67    9.76      0.66      11.13     0.39      33.58
CASP8     0.34      0.16      -1.49     0.94      -1.38     0.57      36.86
CCL4      454.02    244.55    8.8       1.02      8.46      0.54      28.07
CCL5      352.92    262.85    8.27      1.43      7.33      0.37      6.83
CCR7      2.29      1.66      1.21      1.38      1.99      1.19      74.38
CD86      12.61     4.65      3.67      0.72      3.55      0.45      38.58
CXCL10    263.72    177.06    8.04      1.11      10.4      0.91      67.31
DDX58     40.02     21.98     5.36      0.84      9.88      0.66      63.08
DHX9      2.15      0.86      1.13      0.64      7.38      0.42      42.16
DICER1    4.69      3.34      2.32      1.19      1.87      0.79      44.48
EIF2AK2   3.82      1.69      1.95      0.87      7.62      1.4       255.88
IFIT1     100.81    45.89     6.66      0.77      6.28      0.3       15.54
IFIT2     409.56    233.46    8.69      0.92      10.48     0.09      0.87
IFNA1     121.24    96.44     6.99      1.71      10.79     0.8       22.2
IFNA2     110.47    102.25    6.76      1.71      9.16      0.66      14.9
IFNAR1    2.1       1         1.18      0.96      4.94      0.32      11.04
IFNB1     7.24      5.64      2.84      1.38      10.78     0.52      14.3
IFNG      0.48      0.55      -1.02     2.39      1.5       0.95      15.74
IKBKE     2.43      1.6       1.31      1.04      1.59      0.31      8.89
IL12A     6.13      6         2.44      1.7       2.33      1.24      53.25
IL28A     18.04     16.65     4.02      1.64      10.1      0.83      25.42
IL28B     17.3      16.44     4.1       1.64      10.24     0.8       23.91
IL29      263.65    218.55    7.91      1.37      10.43     0.46      11.23
IL6       5.02      4.44      2.36      1.47      4.37      0.49      11.01
IL8       40.94     30.95     5.23      1.2       7.27      1.08      80.2
IRF7      57.37     29.4      5.82      0.94      7.43      0.19      3.98
IRF9      25.27     13.11     4.64      0.84      4.41      0.9       114.58
MX1       37.92     21.27     5.27      0.83      10.83     0.25      9.04
STAT1     9.29      5.15      3.19      1         7.81      1.89      354.04
TBK1      5.05      2.59      2.31      0.91      6.81      0.39      18.9
TLR3      0.31      0.18      -1.54     1.07      1.48      0.74      47.8
TNF       105.77    72.43     6.72      1.18      9.89      0.79      44.61
TRAM1     11.54     5.48      3.55      0.74      5.91      0.38      25.56
TYK2      1.64      1.01      0.84      1.2       3.97      0.24      3.96

Table S3

(A): Pearson's correlations

        CCL5    IFIT1   IFIT2   IFNA1   IFNA2   IFNAR1  IFNB1   IKBKE   IL29    IL6     IRF7    MX1     TBK1
CCL5    1
IFIT1   0.538   1
IFIT2   0.553   0.548   1
IFNA1   0.686   0.514   0.69    1
IFNA2   0.687   0.542   0.7     0.964   1
IFNAR1  0.288   0.364   0.417   0.289   0.352   1
IFNB1   0.71    0.489   0.675   0.819   0.802   0.289   1
IKBKE   0.657   0.816   0.53    0.62    0.639   0.391   0.572   1
IL29    0.756   0.536   0.727   0.896   0.91    0.346   0.849   0.62    1
IL6     0.452   0.608   0.465   0.502   0.517   0.5     0.505   0.642   0.485   1
IRF7    0.57    0.655   0.604   0.483   0.501   0.299   0.542   0.716   0.562   0.506   1
MX1     0.449   0.558   0.464   0.39    0.406   0.405   0.512   0.582   0.461   0.544   0.668   1
TBK1    0.464   0.644   0.543   0.378   0.404   0.426   0.434   0.611   0.51    0.544   0.72    0.709   1

(B): Spearman's correlations

        CCL5    IFIT1   IFIT2   IFNA1   IFNA2   IFNAR1  IFNB1   IKBKE   IL29    IL6     IRF7    MX1     TBK1
CCL5    1
IFIT1   0.598   1
IFIT2   0.61    0.554   1
IFNA1   0.722   0.584   0.703   1
IFNA2   0.72    0.617   0.704   0.959   1
IFNAR1  0.317   0.407   0.456   0.38    0.433   1
IFNB1   0.757   0.552   0.688   0.792   0.766   0.318   1
IKBKE   0.673   0.803   0.484   0.654   0.686   0.398   0.568   1
IL29    0.809   0.596   0.741   0.88    0.887   0.387   0.832   0.645   1
IL6     0.487   0.655   0.472   0.546   0.556   0.414   0.543   0.651   0.507   1
IRF7    0.601   0.661   0.581   0.482   0.512   0.309   0.538   0.682   0.564   0.546   1
MX1     0.48    0.57    0.486   0.429   0.467   0.471   0.53    0.574   0.491   0.549   0.647   1
TBK1    0.531   0.627   0.552   0.406   0.45    0.465   0.508   0.585   0.56    0.558   0.687   0.748   1

(C): Kendall's correlations

        CCL5    IFIT1   IFIT2   IFNA1   IFNA2   IFNAR1  IFNB1   IKBKE   IL29    IL6     IRF7    MX1     TBK1
CCL5    1
IFIT1   0.432   1
IFIT2   0.445   0.394   1
IFNA1   0.55    0.409   0.524   1
IFNA2   0.549   0.429   0.529   0.837   1
IFNAR1  0.223   0.282   0.307   0.262   0.3     1
IFNB1   0.576   0.378   0.506   0.604   0.581   0.231   1
IKBKE   0.494   0.609   0.348   0.466   0.492   0.275   0.397   1
IL29    0.647   0.42    0.563   0.717   0.724   0.271   0.641   0.457   1
IL6     0.33    0.46    0.324   0.377   0.389   0.292   0.371   0.466   0.348   1
IRF7    0.44    0.487   0.4     0.334   0.359   0.209   0.377   0.503   0.397   0.38    1
MX1     0.331   0.412   0.332   0.297   0.328   0.324   0.366   0.4     0.34    0.381   0.477   1
TBK1    0.374   0.445   0.384   0.272   0.301   0.3175  0.347   0.419   0.385   0.386   0.499   0.542   1

Table S4

(A) All data

Combination       TSS
IFIT1.IFNB1       5
IFNA1.MX1         3
IFNAR1.TBK1       2.9
IRF7.MX1          2.8
IFNA1.IFNAR1      2
IFNB1.MX1         2
IL6.IRF7          2
IFNA1.IFNB1       1.9
IFIT2.IFNB1       1.8
IFIT2.TBK1        1.8
IKBKE.IRF7        1.8
IKBKE.MX1         1.8
CCL5.IKBKE        1
IFIT1.IFNAR1      1
CCL5.MX1          0.9
CCL5.TBK1         0.9
IFIT2.IFNA1       0.9
IFIT2.IRF7        0.9
IFNA1.IL6         0.9
IFNA1.TBK1        0.9
IFNA2.IFNAR1      0.9
IFNA2.TBK1        0.9
IFNB1.TBK1        0.9
IL6.TBK1          0.9
IRF7.TBK1         0.9
MX1.TBK1          0.9

(B) GO term filtered data

Combination       TSS
IFNA1.IFNAR1      1
IFNA2.IFNAR1      1

Table S5

Combination                 TSS
CCL5.IFNA1.IFNA2.IFNB1      4.8
IKBKE.IL6.IRF7.MX1          3.55
IFNA1.IFNA2.IFNB1.IKBKE     3.2
IFNA1.IFNA2.IFNB1.MX1       3.2
CCL5.IFNB1.IKBKE.IL6        2.75
CCL5.IFNA1.IFNB1.MX1        2.4
IFNA1.IFNA2.IFNB1.IL6       2.4
IFNA1.IFNB1.IL6.MX1         2.4
CCL5.IFNAR1.IL6.IRF7        2.35
CCL5.IFNAR1.IRF7.MX1        2.35
CCL5.IL6.IRF7.MX1           2.35
IFNA1.IKBKE.IRF7.MX1        2.35
IFNAR1.IL6.IRF7.MX1         2.35
CCL5.IFNA1.IFNA2.IKBKE      1.6
CCL5.IFNA1.IFNA2.MX1        1.6
CCL5.IFNA1.IFNB1.IKBKE      1.6
CCL5.IFNA1.IFNB1.IL6        1.6
CCL5.IFNA2.IFNB1.IKBKE      1.6
CCL5.IFNA2.IFNB1.MX1        1.6
CCL5.IFNAR1.IL6.MX1         1.6
IFNA1.IFNA2.IFNB1.IRF7      1.6
IFNA1.IFNA2.IKBKE.IRF7      1.6
IFNA1.IFNA2.IL6.MX1         1.6
IFNA1.IFNB1.IKBKE.IRF7      1.6
IFNA2.IFNB1.IKBKE.IRF7      1.6
IFNA2.IFNB1.IL6.MX1         1.6
IFNAR1.IKBKE.IRF7.MX1       1.55
CCL5.IKBKE.IRF7.MX1         1.5
CCL5.IFNA1.IFNAR1.MX1       1
CCL5.IFNA1.IFNA2.IFNAR1     0.8
CCL5.IFNA1.IFNA2.IL6        0.8
CCL5.IFNA1.IFNAR1.IFNB1     0.8
CCL5.IFNA1.IL6.MX1          0.8
CCL5.IFNA2.IFNAR1.IFNB1     0.8
CCL5.IFNA2.IFNB1.IL6        0.8
CCL5.IFNB1.IL6.MX1          0.8
IFNA1.IFNA2.IFNAR1.IFNB1    0.8
IFNA1.IFNAR1.IKBKE.IRF7     0.8
IFNA1.IFNAR1.IKBKE.MX1      0.8
IFNA1.IFNAR1.IRF7.MX1       0.8
IFNA1.IKBKE.IL6.IRF7        0.8
IFNA1.IKBKE.IL6.MX1         0.8
IFNA1.IL6.IRF7.MX1          0.8
CCL5.IFNA1.IRF7.MX1         0.75
CCL5.IFNA2.IRF7.MX1         0.75
CCL5.IFNB1.IRF7.MX1         0.75
IFNA1.IFNAR1.IL6.IRF7       0.75
IFNA1.IFNB1.IKBKE.IL6       0.75
IFNA2.IFNAR1.IL6.IRF7       0.75
IFNA2.IFNB1.IKBKE.IL6       0.75
IFNA2.IKBKE.IRF7.MX1        0.75
IFNAR1.IFNB1.IKBKE.IL6      0.75
IFNAR1.IFNB1.IL6.IRF7       0.75
IFNAR1.IKBKE.IL6.IRF7       0.75
IFNB1.IKBKE.IL6.IRF7        0.75
IFNB1.IKBKE.IL6.MX1         0.75
IFNB1.IKBKE.IRF7.MX1        0.75

Table S6

ACCESSION# SEQUENCE (5'-3')
1 B2MicroGlo NM_001101 GTGGACTTGGGAGAGGACTG
             NM_001101 ACTGGAACGGTGAAGGTGAC
2 CASP8 NM_001228 TCCAAGCAGAGATGAAAGAG
        NM_001228 ATAAGCTCTCCCCAAACTTG
3 CCL4 NM_002984 ATCCCCATAGGACACTTATC
       NM_002984 CACATCTCCTCCATACTCAG
4 CCL5 NM_002985 AAGCTCCTGTGAGGGGTTGA
       NM_002985 TTGCCAGGGCTCTGTGACCA
5 CCR7 NM_001838 AGGTTTTCAGTCCCTGTGAC
       NM_001838 TGACATGCACTCAGCTCTTG
6 CD86 NM_175862 TGTTAGAAACTAGCCAGGTG
       NM_175862 GTCTCTGCCCAAACATAAAG
7 DICER1 NM_177438 AAGGTGCTGTGTTTTGCTTC
20 IFNG NM_177438 NM_000619 TACTAAAGTCCTCCTGCCAGGCTATGTTTTCATCAGGGTC
8 IFIT1 NM_001548 NM_000619 GACCTTGTCTCACAGAGTTCAGGCAAGGCTATGTGATTAC
21 CXCL10 NM_001548 NM_001565 TCGGAGAAAGGCATTAGATCTGAAGCAGGGTCAGAACATC
9 IKBKE NM_014002 NM_001565 GAGTGAGGGAGAGCCAAAGGTCCCATCACTTCCCTACATG
22 IRF7 NM_014002 NM_001572 GCTCCAGCTCCATAAGGAAGCCAGGGCAGTAGGTCAAAC
10 IL12A NM_000882 NM_001572 CTCCCTAGTTCTTAATCCACGGTGTGTCTTCCCTGGATAG
23 IRF9 NM_000882 NM_006084 GCCACAAAAATCCTCCCTTGATTAGCCTTGAGTTCTCCAC
11 IL28A NM_172138 NM_006084 TGGGCTGAGGCTGGATACAGATTCTGTCCCTGGTGTAGAG
24 IFIT2 NM_001547.3 NM_172138 TCTGGAGGCCACCGCTGACATCGTTCCAAGCATACCGTGA
12 IL28B NM_001547.3 NM_172139 CGTGGGCTGAGGCTGGATACCGTGGGAACCTGGTGACTAA
25 MX1 NM_172139 NM_002462 TGGCCCTGACGCTGAAGGTTTGCAAGGTGGAGCGATTCTG
13 IL29 NM_172140 NM_002462 GGAGTAGGGCTCAGCGCATACGTGGTGATTTAGCAGGAAG
26 EIF2AK2 NM_172140 NM_002759 GCCTCCTCACGCGAGACCTCCATGTCAGGAAGGTCAAATC
14 IL6 NM_000600 NM_002759 CTGAGGTGCCCATGCTACATACTACGTGTGAGTCCCAAAG
27 DHX9 NM_000600 NM_001357 AATGCCAGCCTGCTGACGAATCTAAAGCCACCTCGGGAAA
15 IL8 NM_000584 NM_001357 CAACATCACTGTGAGGTAAGATGGCGGTGGATATAGCAGT
28 DDX58 NM_000584 AF038963 GTTAAATCTGGCAACCCTAGGGCTTGGGATGTGGTCTACT
16 IFNA1 NM_000605 AF038963 ATTTCTGCTCTGACAACCTCAAAGCCTTGGCATGTTACAC
29 STAT1 NM_024013 NM_007315 CTGAATGACTTGGAAGCCTGCCTTTCAATTTTACCTTCAG
17 IFNA2 NM_000605 NM_007315 ATTTCTGCTCTGACAACCTCCTTCTCTGGCGACAGTTTTC
30 TBK1 NM_000605 NM_013254 TGACAGAGACTCCCCTGATGTGGGATCTGGGCACCTTGTA
18 IFNAR1 NM_000629 NM_013254 CTTGCCCGTATTTTTAGGACTGGTAGAACGGTGGCTACTG
31 TLR3 NM_000629 NM_003265 GTGAAGAACTACAGCAGGACACATTCCTCTTCGCAAACAG
19 IFNB1 NM_002176 NM_003265 ACAGCATCTGCTGGTTGAAGTGAGGCGGGTGTTTTTGAAC
32 TNF NM_002176 NM_000594 GTCAGAGTGGAAATCCTAAGGAGGAAGGCCTAAGGTCCAC
       NM_000594 AGTGAAGTGCTGGCAACCAC
33 TRAM1 BC_000687 GGAACAAGATGTGAACACTG
         BC_000687 GTATCGCTACAGAAAGGCTC
34 TYK2 NM_003331 GGATTTAAGGGCTGGATTAG
        NM_003331 AACCAAGAGGGGGATGTCAG
35 Beta-Actin GTGGACTTGGGAGAGGACTG
              ACTGGAACGGTGAAGGTGAC

Supporting figures

Figure S1: Multidimensional scaling (MDS) plot for the 145 donors and the six experimental repeats. For each gene, the log2 copy numbers of the 145 donors are normalized so that their median is zero, and the median normalization is also applied to the six experimental repeats. The 151 combined samples are analyzed by MDS using Kendall’s correlation. The six experimental repeats are colored red and labeled 1 to 6; the pairs (1,2), (3,4), and (5,6) were each performed by one of the three experimenters. Note the close correspondence between measurements 1 and 2 and between measurements 5 and 6. The distance between measurements 3 and 4 presumably results from experimental error affecting one of the assays.

Figure S1
