Supplemental Note 1: Biologically and medically relevant information for selected loci. Central findings are summarized in a one‐line statement (red); supporting references and further information are provided where applicable. Locus numbers and sentinel SNPs are given; identifiers for non‐replicated associations are in parentheses (note that many of the non‐replications can be attributed to a lack of statistical replication power and may well represent true positive associations that may be replicated in future, more highly powered studies). Association details, including position, effect allele, allele frequency, summary statistics using three different scaling methods, protein and variant annotations, replication status, power analysis, and tagging SNPs in QMDiab, are available online at http://proteomics.gwas.eu. The information gathered here results from an attempt to keep track of all interesting information that we encountered while investigating these loci. Please bear in mind that this table is neither complete nor free of errors, and that all information provided here should be confirmed by additional literature research before being used as a basis for firm conclusions or further experiments.

Locus (SNP) Protein Summary of findings

# (rs5498) ICAM1, ICAM5 Extends sICAM‐1 GWAS to anti‐correlation with ICAM5. SNP rs5498 replicates a published GWAS cis‐association with sICAM‐1 levels 1; in this study this SNP also associates in cis with sICAM‐5 levels, but with an opposite effect.

#L2 (rs4129267) IL6R Two genetic signals at this locus may be used for drug target validation using genotype‐dependent drug‐response curves. cis‐association with IL6R (rs4129267). The main genetic determinant of soluble interleukin 6 receptor (sIL‐6R) levels is the missense variant rs2228145, which maps to the cleavage site of IL‐6R 2 (rs4129267 is a proxy SNP for rs2228145). Van Dongen et al. 3 reported the association of rs2228145 with IL6R, replicating earlier reports by Galicia et al. 2. Cells lacking IL6R can respond to IL6 using the trans‐membrane signal transducer protein gp130 (IL6ST). Two molecules of gp130 are believed to bind the IL6/IL6R complex. We found no genetic variance in IL6, but SNP rs7730934 associated in our study in cis with gp130 (#L272). This SNP tags the coding SNP rs2228044 (G/R Ggt/Cgt). One might therefore investigate possible epistasis between SNPs rs2228145 (soluble IL6R) and rs2228044 (gp130) with an IL6 signalling related outcome. López‐Mejías et al. 4 could not find such a signal, but CardioGram found an association of rs4129267 with CVD (1.7×10‐8). Prototype of a biomarker with a large effect size; unlikely to be an epitope effect since the functional SNP is known and affects the solubilisation of the protein. This variant associates with CVD at p=1.66×10‐8 in CardiogramPlus.

#L4 (rs9858542) MST1 Possible link to diabetes. 
This locus harbours a cis‐association with MSP/MST1/HGFL (Hepatocyte growth factor‐like protein); rs9858542 tags a borderline significant association with fasting glucose (p=4.9×10‐8) in Magic and with CVD (p=6.3×10‐6) in CardioGram; HGFL is a paralogue of HGF; HGF levels are associated with the presence of type 2 diabetes in postmenopausal women. #L7 (rs12459419) CD33 Changes in CD33 protein levels co‐associate with AD. This locus harbours a replicated cis‐pQTL with Myeloid cell surface antigen CD33 (CD33, aka Siglec‐3). This SNP is an A14V amino acid exchange and in near‐perfect LD with rs3865444. rs3865444 is an Alzheimer's disease risk variant 5. Increased expression of CD33 mRNA was

Connecting genetic risk to disease endpoints through the human blood plasma proteome Supplemental Note 1, Page 1

associated with increasing AD pathology in temporal cortex brain samples 5. Here we show that protein levels of CD33 also associate with rs3865444. For the role of inflammation in Alzheimer's disease, see Heppner et al. 6; for alternative splicing in CD33, see Hernández‐Caselles et al. 7; for the genetics of Alzheimer's, see Chouraki and Seshadri 8.

#L8 (rs1926447) MAPKAPK3, (MIF) Strong trans‐pQTL with MAPKAPK3 may be used to identify link to cis‐encoded . This locus harbours a replicated trans‐pQTL to MAP kinase‐activated protein kinase 3 (MAPKAPK3). A second trans‐pQTL, with Macrophage migration inhibitory factor (MIF), did not have sufficient replication power in QMDiab, but was still nominally significant (p=1.2×10‐3).

#L10 (rs17413015), #L16 (rs7551957) FCGR2B FCGR2B protein levels may influence the risk to develop auto‐immune disorders. Two replicated cis‐pQTLs with Low affinity immunoglobulin gamma Fc region receptor II‐b (FCGR2B). rs7551957 is a strong GWAS hit for Inflammatory bowel disease (p<2×10‐38) and related auto‐immune diseases (Ulcerative colitis, Kawasaki disease, systemic lupus erythematosus). This association can shed new light on the functional role of FCGR2B in Crohn's disease. See also #L141 for an association with the ratio of FCGR2B and FCGR3B.

#L11 (rs26496), #L14 (rs17482078), #L120 (rs149313) ERAP1 Two independent pQTLs, confirmed by strong eQTLs, with additive effect; co‐association with ankylosing spondylitis suggests additive risk. Two strong cis‐associations with endoplasmic reticulum aminopeptidase 1 (ERAP1, aka ARTS1). SNP rs26496 (#L11) is associated with ankylosing spondylitis 9. SNP rs17482078 (#L14, r2=0.059 with rs26496) is associated with Behcet's disease. Hong et al. 
10 report an mQTL of SNP rs27529 (r2=0.46 with rs26496) with a metabolite of unknown identity (MS peaks 616.7 m/z, 666.3 m/z, 489.2 m/z). Imputed data reveal at least two independent hits with p<10‐120. Interaction between ERAP1 and HLA‐B27 was reported by Evans et al. 11. That paper also provides experimental evidence that rs30187 (r2=0.46 with rs26496) and rs17482078 have ~40% slower rates of substrate trimming than wild‐type ERAP1, but the results are based on only a small set of experiments (N=3). See #L251 for a non‐replicated association with IL23R in relation to ankylosing spondylitis. Evans et al. 11 discuss the interaction of ERAP1 and IL23R.

#L17 (rs4525) CAMK1 Genetic variance in SELL may modify CAMK1 protein levels via differences in calmodulin binding to SELL. This locus harbours a replicated trans‐pQTL with Calcium/calmodulin‐dependent protein kinase type 1 (CAMK1). SNP rs4525 tags variants that may regulate L‐selectin (SELL) and P‐selectin (SELP). SELL binds to calmodulin in the context of leukocyte transmigration 12. See #L79 for a second replicated trans‐pQTL with CAMK1 on a different chromosome. See #L413 for a replicated trans‐pQTL with CAMK1 at the same locus that also involves F5.

#L18 (rs2302465), #L62 (rs12651314), #L151 (rs7657257), #L195 (rs1789) BST1 Near total protein loss of BST1 in homozygotes of certain minor allele variants. Multiple replicated cis‐pQTLs to ADP‐ribosyl cyclase/cyclic ADP‐ribose hydrolase 2 (BST1). BST1 is an extracellularly exposed enzyme that possesses cyclic ADP ribose (cADPR) hydrolase activity. cADPR is a second messenger that releases calcium from intracellular

calcium stores. BST1 is mainly expressed in the bone marrow. It appears that the minor alleles of both rs2302465 (#L18) and rs7657257 (#L151) tag haplotypes that carry the same functional variant, which leads to full loss of BST1. Using imputed data, a number of highly significant SNPs were identified, e.g. SNP rs73224659 (p=4.0×10‐179 compared to p=1.5×10‐108 for rs2302465). Minor allele carriers of rs73224659 show nearly no protein levels of BST1.

#L19 (rs2058622) IL18R1 New insight into the Interleukin‐1 receptor cluster. This locus harbours a replicated cis‐association with Interleukin‐18 receptor 1 (IL18R1). Andiappan et al. 13 reported this variant as a strong eQTL that affects leprosy and Crohn's disease in opposite directions. The authors argue that polymorphic regulation of human neutrophils can impact beneficial as well as pathological inflammatory responses. This locus also associates with IL1RL1 (the gene coding for sST2) 14 (see #L44) and with atopic dermatitis, Crohn's disease (time to surgery), celiac disease, and inflammatory bowel disease.

#L26 (rs2304456) KNG1, LCMT1 Co‐association of KNG1 and LCMT1 suggests a possible link between the kininogen/bradykinin and the PP2A pathways. This locus harbours a replicated trans‐association with Leucine carboxyl methyltransferase 1 (LCMT1) and a cis‐association with Kininogen‐1 (KNG1). LCMT1 methylates human protein phosphatase 2A (PP2A), a key regulator of many cellular processes and a potential cancer‐therapeutic target.

#L28 (rs651007), #L36 (rs505922), #L69 (rs8176749), #L269 (rs630510), #L354 (rs8176720), #L442 (rs7857390) SELE, INSR, FLT4, MET, KDR, SELP, CD200, VWF, CD209, CDH5, TIE1, TEK, BCAM, NOTCH1 The pleiotropic ABO locus is discussed in the main paper. Here we report observations not mentioned in the paper: In Shin et al. 
15 SNP rs651007 associates with dipeptide levels (aspartylphenylalanine, p=2×10‐7; X‐14189, annotated as potential Leu‐Ala, but Ser‐Pro is also possible by mass alone, p=9.6×10‐10; X14304, I/L‐Ala or Ala‐I/L?, p=1.8×10‐8; X‐14208, could be Cys‐Met or Tyr‐Ala or His‐Pro or Ser‐Phe by mass alone, p=1.5×10‐7; X‐14205; X14086: annotated as Thr‐Glu, but mass is 25 too high; Val‐Arg possible by mass alone). This SNP also associates in Raffler et al. 16 (Urine NMR) with a signal at 2.0308 ppm (p=5.1×10‐20). It is also a marginal hit in CardioGram (7.1×10‐8). He et al. 17 report a genome‐wide association study of genetic loci that influence the tumour biomarkers cancer antigen 19‐9, carcinoembryonic antigen and α‐fetoprotein and their associations with cancer risk. The authors find a strong association of SNP rs8176749 with carcinoembryonic antigen (CEA) levels; we find an association with Cadherin‐5. CD209 and MBL2 are both lectins. All ABO‐associated proteins are glycoproteins, part of a glycoprotein complex (LY96), or glycoprotein‐binding (SELE), except for IL27RA (http://jcggdb.jp/rcmg/gpdb/index). According to Uniprot, all proteins are glycoproteins (including associations non‐replicated in QMDiab), while only half of all SomaLogic proteins are glycoproteins. The Panther classification system (http://pantherdb.org/tools/compareToRefList.jsp) reported a significant enrichment of the GO process "endothelial cell differentiation" (GO:0045446): 5 out of 17 annotated proteins

were found (expected were 0.32, p=1.45×10‐5).
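The enrichment figure quoted above (5 of 17 annotated proteins observed against 0.32 expected) can be sanity‐checked with a one‐sided binomial tail. The sketch below is an assumption about the type of test, not a reproduction of the Panther calculation; the function name and the way the per‐protein probability is derived from the expected count are illustrative choices.

```python
from math import comb


def binomial_upper_tail(k_observed, n_total, p_success):
    """Return P(X >= k_observed) for X ~ Binomial(n_total, p_success)."""
    return sum(
        comb(n_total, k) * p_success**k * (1 - p_success) ** (n_total - k)
        for k in range(k_observed, n_total + 1)
    )


# Values quoted in the text: 17 annotated proteins, 5 in the GO class, 0.32 expected.
n_proteins = 17
n_hits = 5
expected_hits = 0.32

# Assumption: per-protein probability = expected count / list size.
p_class = expected_hits / n_proteins

p_value = binomial_upper_tail(n_hits, n_proteins, p_class)
print(f"one-sided binomial p = {p_value:.2e}")
```

The result should land in the same order of magnitude as the reported value; an exact match is not expected, since the reference list and test settings used by the tool are not given here.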

Subnetwork of the ABO locus. Six replicated pQTLs at the ABO gene locus (red), including SNP identifiers that are linked to the six loci (light green KORA SNPs, yellow QMDiab SNPs, bright green both) and pQTLs (blue); proteins with significant partial correlation are connected, and associations with GWAS endpoints are shown (grey) [NOTE: the anorexia association is a false positive; this was checked with the consortium].

#L31 (rs2228243), #L43 (rs1042445) HRG, MAP2K4, (CD27, ESR1) HRG may have a regulatory role on MAP2K4, possibly also on CD27 and ESR1. #L31 harbours a replicated trans‐association with Dual specificity mitogen‐activated protein kinase kinase 4 (MAP2K4) and a replicated cis‐association with Histidine‐rich glycoprotein (HRG). Elevated levels of HRG are associated with thrombophilia (OMIM #613116). #L43 harbours a second replicated trans‐association with MAP2K4 in the same genetic region. The associated SNP rs1042445 is a coding variant in HRG. In addition, there are two non‐replicated, but in QMDiab still nominally significant, trans‐pQTLs with CD27 antigen (CD27, p=4.7×10‐3 in QMDiab) and Estrogen receptor (ESR1, p=8.3×10‐4 in QMDiab).

#L37 (rs342706) GPC5 Extends role of GPC5 in lung cancer association to protein levels. This locus harbours a replicated cis‐pQTL with Glypican‐5 (GPC5). Li et al. 18 report rs2352028 at chromosome 13q31.3 in association with lung cancer in never smokers (combined odds ratio 1.46 [1.26‐1.70], p=5.94×10‐6). The authors also found a cis‐eQTL of GPC5 in normal lung tissues, the high‐risk allele being linked with lower expression. Moreover, the transcription level of GPC5 in normal lung tissue was twice that detected in matched lung adenocarcinoma tissue. rs2352028 is tagged by rs342706 (r2=0.95). Here SNP rs342706 is associated with GPC5 protein levels, with lower levels for the minor allele.

#L44 (rs12712135) IL1RL1 New insight into the Interleukin‐1 receptor cluster. 
This locus harbours a replicated cis‐pQTL to Interleukin‐1 receptor‐like 1 (IL1RL1). This locus also associates with sST2 in a GWAS (IL1RL1 codes for sST2) 14 and with susceptibility to leprosy 19 (see #L19 for association with IL18R1).

#L46 (rs217181), #L67 (rs9302635), #L86 (rs2000999) HP, HBA1/HBB, FTH1/FTL, SERPIND1 New cholesterol association. Using http://genenetwork.nl/biosqtlbrowser/ we identified different effects on transcripts and methylome:
#L46: rs217181 => cis‐eQTL Exon‐Level 2.3×10‐40 chr_16_72108183_72108650 with HPR (effect on the HP related gene)
#L67: rs9302635 => cis‐eQTL Exon‐ratio 1.1×10‐32 with ENSG00000257017_16_72094011_72094954 positive score with HP, 8.9×10‐41 with ENSG00000257017_16_72092153_72092408 negative score with HP (effect on splice variants)
#L86: rs2000999 => trans‐meQTL 2.8×10‐7 with cg23809679 on chr4:71,554,149 (putative effect on distant gene, possibly UTP3)

The following associations are with HPR!

SNPs rs217181 and rs2000999 were associated in cis with haptoglobin (HP) and in trans with hemoglobin (HBA1/HBB) and heparin cofactor 2 (SERPIND1, also HC II). The effect sizes of both variants were comparable (FC=1.58 per copy of the minor allele for rs217181 and FC=‐1.48 for rs2000999 on haptoglobin levels). The effects of both "risk" alleles are additive and lead to a similar effect size for the association with the minor allele‐copy difference between both SNPs (FC=1.41). The association with this difference strengthens the association at this locus with haptoglobin by 27 orders of magnitude. Given that the effects of both variants are of similar strength and opposite direction, and that the minor allele frequencies appear to be similar (36 and 38 minor allele homozygotes in this study, resp.), one can expect the association signal for rs217181 with LDL and total cholesterol to be similar to that for rs2000999, but with opposite direction. This was confirmed by a lookup in the association data from the Global Lipids Genetics Consortium results 20: both SNPs associate with opposite signs with LDL (rs217181‐T decreases LDL with beta=‐0.046 and p=1.1×10‐25; rs2000999‐A increases LDL with beta=0.065 and p=4.2×10‐41), total cholesterol (similar association strength), and triglycerides (weak association signal). There is thus one low frequency variant that increases LDL and one low frequency variant that decreases LDL. This is an example for a drug response case (see Plenge et al. 21). The rs217181‐LDL association has not been reported previously. Froguel et al. 22 identified rs2000999 as a strong genetic determinant of circulating haptoglobin levels. This SNP was also reported as a lipid risk variant, associated with LDL cholesterol and with total cholesterol 23. 
rs217181 was reported in association with glycoprotein acetyls (GlycA, mainly a1‐acid glycoprotein, p=1.46×10‐36) 24, and replicated by Frazier‐Wood et al. (Circulation. 2015;131:AP361). GlycA is a new NMR‐derived plasma marker of inflammation. Also encoded at the locus is haptoglobin‐related protein (HPR). HPR derived from HP by gene duplication. SNP rs2000999 is an intronic SNP in HPR and rs217181 is located 2,857 bases downstream of HPR. RefSeq states: "This gene encodes a haptoglobin‐related protein that binds hemoglobin as efficiently as haptoglobin. Unlike haptoglobin, plasma concentration of this protein is unaffected in patients with sickle cell anemia and extensive intravascular hemolysis, suggesting a difference in binding between haptoglobin‐hemoglobin and haptoglobin‐related protein‐hemoglobin complexes to CD163, the hemoglobin scavenger receptor. This protein may also be a clinically important predictor of recurrence of breast cancer." Uniprot states "Primate‐specific plasma protein associated with apolipoprotein L‐I (apoL‐I)‐ containing high‐density lipoprotein (HDL). This HDL particle, termed trypanosome lytic factor‐1 (TLF‐1), mediates human innate immune protection against many species of African trypanosomes." This paper refers to this review 25, which may also explain the association of the rs2000999 variant with LDL and total cholesterol: "TLF1 is the densest fraction of high density lipoprotein particles (known as fraction 3 or HDL3), which contain a lipid core with an outer hydrophilic layer of phospholipids, cholesterol and several apolipoproteins, including the major component apolipoprotein A1 (APOA1)" 26.
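The minor allele‐copy difference between rs217181 and rs2000999 discussed above can be written down in a few lines. This is a minimal sketch with invented genotype codes (0, 1 or 2 copies of the minor allele); the function names are hypothetical, and reading the negative fold change as a 1.48‐fold decrease per copy is an assumption about the sign convention, not something stated in this note.

```python
import math


def allele_copy_difference(dosage_a, dosage_b):
    """Per-sample difference in minor allele copies between two SNPs (range -2..2)."""
    return [a - b for a, b in zip(dosage_a, dosage_b)]


# Invented dosages for five samples (not study data).
rs217181 = [0, 1, 2, 0, 1]
rs2000999 = [2, 1, 0, 0, 2]
print(allele_copy_difference(rs217181, rs2000999))

# Per-copy fold changes quoted in the text, moved to the log2 scale.
# Assumption: FC=-1.48 denotes a 1.48-fold decrease per minor allele copy.
LOG2_FC_RS217181 = math.log2(1.58)
LOG2_FC_RS2000999 = -math.log2(1.48)


def expected_log2_shift(copies_rs217181, copies_rs2000999):
    """Additive two-SNP model: expected log2 change in haptoglobin level."""
    return copies_rs217181 * LOG2_FC_RS217181 + copies_rs2000999 * LOG2_FC_RS2000999
```

Under this additive model, a sample carrying one minor allele at each SNP is expected to sit close to the reference level, which is why the difference in allele copies is a natural single predictor.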

Heparin cofactor 2 is a plasma serine proteinase inhibitor (serpin) that inhibits the coagulant proteinase alpha‐thrombin. Peptides at the N‐terminal of SERPIND1 have chemotactic activity for both monocytes and neutrophils.


Association network at the haptoglobin locus. Boxplots of Haptoglobin (HP), Hemoglobin (HBA1/HBB), Ferritin (FTH1/FTL), and Heparin cofactor 2 (SERPIND1) protein levels (outliers >4 s.d. were removed) as a function of the minor allele copy number of rs217181 (A‐D). Sub‐network depicting the haptoglobin (HP) locus, with three independent genetic associations that link four proteins and GWAS associations with lipid traits. The association with HP is a cis‐pQTL, all others are trans‐pQTLs.

#L48 (rs1032994) LILRB2 Prostate cancer related. This is a replicated cis‐pQTL to Leukocyte immunoglobulin‐like receptor subfamily B member 2 (LILRB2). It is a risk locus for prostate cancer 27. SNP rs103294 is in strong linkage disequilibrium with a 6.7‐kb germline deletion that removes the first six of seven exons in LILRA3 (reported in Chinese). Imputing yields much stronger associations, i.e. with coding variant rs386056.

#L49 (rs2691273) SIGLEC9 An example of a genetically determined epitope effect. This locus harbours a replicated cis‐pQTL with Sialic acid‐binding Ig‐like lectin 9 (SIGLEC9). SIGLEC9 levels display a strong bimodal distribution that replicates in QMDiab. Imputed data suggest two linked causative SNPs: rs2075803 (K100E) and rs2258983 (A315E) (r2=0.39 with rs2691273). These two variants were experimentally shown not to impact the gene function 28. rs2075803 (K100E) is located in the carbohydrate recognition domain of SIGLEC9 and hence accessible to aptamer binding. The SIGLEC9 distribution is highly bi‐modal, with SIGLEC9 levels of rs2075803‐G homozygotes three orders of magnitude lower than those of carriers of the rs2075803‐A allele. 
Interestingly, when excluding samples with low SIGLEC9 levels, a functionally relevant trans‐association with rs10935480 (#L52) is found (p=7.5×10‐6). rs10935480 is a replicated trans‐association with Vascular endothelial growth factor receptor


3 (FLT4, aka VEGF sR3). The SNP is located near the ST3 Beta‐Galactoside Alpha‐2,3‐ Sialyltransferase 6 (ST3GAL6) gene, linking genetic variance at ST3GAL6 with FLT4 and SIGLEC9.

#L50 (rs2425143) CPNE1 Near total protein ablation in minor allele homozygotes, confirmed by allele specific RNA expression analysis. Cis‐association with Copine‐1 (CPNE1) – see Supplemental Figure 5 below.


#L51 (rs778986) FUT3 Glycosylation. This locus harbours a replicated cis‐pQTL to Galactoside 3(4)‐L‐fucosyltransferase (FUT3). It is discussed in the main text.

#L52 (rs10935480) FLT4 Genetic variant linking VEGF sR3 to protein sialylation. This locus harbours a replicated trans‐association with Vascular endothelial growth factor receptor 3 (FLT4, aka VEGF sR3); SNP rs10935480 is in ST3 Beta‐Galactoside Alpha‐2,3‐Sialyltransferase 6 (ST3GAL6); a trans‐association (p=7.5×10‐6) with Sialic acid‐binding Ig‐like lectin 9 (SIGLEC9) is observed when eliminating samples with very low SIGLEC9 levels

(epitope effect, see #L49), and a correlation between SIGLEC9 and FLT4 is then also found.

#L54 (rs5167) CSF3 APOC4 and CSF3 may interact. This locus is located in the APOE/C1/C4/C2 cluster and harbours a replicated trans‐association with Granulocyte colony‐stimulating factor (CSF3). rs5167 is linked to rs35336243 (r2>0.8), which associates with Alzheimer's disease (p=2.55×10‐18) in the International Genomics of Alzheimer's Project (IGAP) stage 1 data. From the data at hand it is, however, not possible to say whether this is an independent association signal with AD or a spill‐over from the main AD variant in APOE. rs5167 is an amino acid changing variant in APOC4 (L96R). Since rs5167 also shows the strongest association with CSF3 when using imputed SNPs, it is likely that this variant in APOC4 causes differential protein expression of CSF3.

#L63 (rs489286) SLAMF7 Genetic variant in a cancer target. This locus is discussed in the main paper.

#L65 (rs692804) IL25 OAF may regulate IL25. This locus harbours a replicated trans‐association with Interleukin‐25 (IL25, aka IL‐17E). Imputed data suggest coding SNP rs2508490 (R217H) in Drosophila Out at first Homolog (OAF) as a likely causative variant. In Drosophila, OAF is vital for proper neuronal development and hatching, but little is known about the function of OAF in humans. Our data suggest that genetic variation in OAF modifies IL25 protein expression.

#L71 (rs646776) GRN, (RGMA, CAT) A potential role for RGMA and CAT in coronary artery disease. Trans‐association of rs646776 replicates a strong immuno‐assay based GWAS hit 29 at the sortilin (SORT1) locus. Nguyen et al. 30 state that "our understanding of the relationship between anti‐inflammatory progranulin and its processing to the proinflammatory granulins remains limited" .. "in cultured macrophages, HDL blocked cleavage of progranulin into granulins and inhibited granulin‐mediated inflammation" .. "Sortilin‐deficient (Sort1 −/−) mice exhibited fivefold more circulating progranulin". In addition to the replicated GRN association, we find here two weaker trans‐associations of rs646776, with Catalase (CAT) and with Repulsive guidance molecule A (RGMA). The CAT association could not be replicated in QMDiab due to a lack of power; the RGMA association showed nominal association (p=0.022) with a similar trend (levels of all three proteins decrease with the minor allele of rs646776). Note that the SORT1 locus is also a major GWAS locus for total and LDL cholesterol and coronary artery disease. Although not replicated in QMDiab, our association data in KORA suggest a role for RGMA and CAT in the pathways that link cholesterol to coronary artery disease.

#L72 (rs10494745), #L76 (rs10801582) HPX CFHR4 may regulate HPX protein levels. This locus is discussed in the main paper.

#L79 (rs7091871) CAMK1, CAST Variation in CPN1 may modify calpain‐induced proteolysis via CAMK1 and CAST. This locus harbours two replicated trans‐pQTLs, one with Calcium/calmodulin‐dependent protein kinase type 1 (CAMK1) and one with Calpastatin (CAST). CAST is an endogenous calcium‐dependent cysteine protease (calpain) inhibitor. CAST and CAMK1 are hence both calcium‐regulated. rs7091871 tags variants close to Carboxypeptidase N (CPN1), which protects the body from potent vasoactive and inflammatory peptides containing C‐terminal Arg or Lys (such as kinins or anaphylatoxins) that are released into the circulation. This trans‐pQTL hence links variation in CPN1, CAMK1, and CAST in the regulation of calcium‐dependent calpain‐induced proteolysis. See #L17 and #L413 for a replicated trans‐pQTL with CAMK1 on a different chromosome.
#L82 (rs2239908) SEMA3A, CAMK1D, WISP1, (IL5) This locus links multiple proteins (SEMA3A, CAMK1D, WISP1, and IL5) through trans‐associations, including associations with opposite directions. This locus harbours three replicated trans‐pQTLs, with Semaphorin‐3A (SEMA3A), with Calcium/calmodulin‐dependent protein kinase type 1D (CAMK1D), and with WNT1‐inducible‐signaling pathway protein 1 (WISP1), and three further non‐replicated, but nominally significant associations with identical trends, with Interleukin‐5 (IL5), UMP‐CMP kinase (CMPK1), and Serine protease 27 (PRSS27). Interestingly, ratios between SEMA3A and Interleukin‐5 (IL5) display a large increase in the strength of association (p‐gain=1.1×10^17, replicated p‐gain=1.3×10^5 in QMDiab). Ratios of CAMK1D also display a large p‐gain (p‐gain=8.0×10^16 in KORA, p‐gain=8.0×10^2 in QMDiab), indicating that the genetic variant hits a pathway in which SEMA3A and CAMK1D are either regulated in opposite directions to IL5, or in which IL5 protein levels play a normalizing role. There are multiple candidate genes for the causative cis‐gene at this locus, including TMEM97, which has a strong eQTL that is tagged by rs2239908 (see SNiPA annotation for details).

#L87 (rs9904601) CCL18, CCL3 Genetic variation can inform relationship between different chemokines encoded at the chr7‐cluster. This locus includes two replicated cis‐associations, with C‐C motif chemokine 18 (CCL18, aka PARC) and C‐C motif chemokine 3 (CCL3, aka MIP‐1a). Note that there are multiple C‐C motif chemokines encoded at this locus, including CCL3/4/5/7/14/15/16/18/23. See also #L88.

#L88 (rs41341749) CCL14, CCL23 Increased CCL14 may induce decrease in CCL23 protein levels. This locus includes two replicated cis‐associations, with C‐C motif chemokine 14 (CCL14, aka HCC1) and C‐C motif chemokine 23 (CCL23, aka MPIF‐1). 
Two further associations with CCL18 and CCL3 did not have sufficient replication power and were not replicated; note that #L87 harbours replicated cis‐pQTLs for these two chemokines. The associations of rs41341749 with CCL14 and CCL23 have opposite directions, and the strength of association for the ratio of both increases by 35 orders of magnitude (p‐gain=2.3×10^35 in KORA, replicated 7.5×10^12 in QMDiab). Note that there are multiple C‐C motif chemokines encoded at this locus, including CCL3/4/5/7/14/15/16/18/23. SNiPA annotation identifies several variants that are tagged by rs41341749 and that regulate CCL14.
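The p‐gain values quoted for protein ratios can be computed as sketched below. The definition used here, the smaller of the two single‐protein p‐values divided by the p‐value of the ratio, is the one commonly used for ratio associations and is assumed rather than stated in this note; the function name and the input p‐values are invented for illustration.

```python
import math


def log10_p_gain(p_protein_a, p_protein_b, p_ratio):
    """Return log10 of the p-gain, min(p_a, p_b) / p_ratio.

    Working on the log10 scale avoids floating point overflow when the
    p-values differ by many orders of magnitude.
    """
    return math.log10(min(p_protein_a, p_protein_b)) - math.log10(p_ratio)


# Invented p-values for two proteins and for their ratio (not study data).
gain = log10_p_gain(p_protein_a=1e-40, p_protein_b=1e-25, p_ratio=1e-75)
print(f"p-gain = 10^{gain:.0f}")
```

A p‐gain near 1 (log10 near 0) means the ratio adds nothing beyond the stronger single protein; large values indicate that the ratio carries information that neither protein carries alone.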


Boxplots of C‐C motif chemokine 14 (CCL14, aka HCC‐1) (A), C‐C motif chemokine 23 (CCL23, aka MPIF‐1) (B) and the ratio between both (C) as a function of rs41341749 (#L88); scatterplot of CCL14 against CCL23, colored by genotype; black = major allele homozygotes, red = heterozygotes, green = minor allele homozygotes; means by genotype = large filled circles (D).

#L91 (rs4688759), #L95 (rs11715835) TXNDC12 Poorly characterized protein, may be further investigated using information on cis‐encoded genes and eQTLs. Two SNPs that are 1.2 Mb apart each harbour a replicated trans‐pQTL with Thioredoxin domain‐containing protein 12 (TXNDC12). This protein is poorly characterized. Multiple cis‐encoded genes with strong eQTLs may be used to elucidate its biological role. rs11715835 tags a 20 orders of magnitude stronger association with imputed variant rs143867864. rs143867864 is uncorrelated with rs4688759.

#L93 (rs7588285) COLEC11, IL19, (IL1B) Variation in COLEC11 may induce variation in IL19. This locus harbours a replicated trans‐pQTL with Interleukin‐19 (IL19) and a replicated cis‐pQTL with Collectin‐11 (COLEC11). There is also a trans‐pQTL with Interleukin‐1 beta (IL1B) at this locus, which displays nominal association in QMDiab (p=2.4×10‐4).

#L96 (rs5030062) F11, KLKB1 Part of multiple cis/trans associations that involve proteins from vasodilatory and blood‐coagulation pathways. This locus harbours two replicated trans‐pQTLs with Coagulation Factor XI (F11) and with Plasma kallikrein (KLKB1). rs5030062 tags the I581T coding variant rs710446 in KNG1 (kininogen 1). rs5030062 also associates with bradykinin in Shin et al. 15 (p=5.94×10‐13, see http://www.gwas.eu). Kininogen is a multi‐domain protein and a precursor of the vasodilator bradykinin via the kallikrein‐kinin system. KNG1 also plays a role in blood coagulation as an activator of F11. 
This locus has been reported in GWAS with activated partial thromboplastin time (aPTT) 31 and plasma renin activity 32. See #L138 for a cis‐association with Plasma kallikrein (KLKB1) that also shows an association with bradykinin in Shin et al. 15.

#L98 (rs6695321) C1S CFH or CFHR3 may regulate C1S. This locus harbours a replicated trans‐association to Complement C1s subcomponent (C1S). Cis‐encoded proteins are CFH and CFHR3.

#L106 (rs867186) PROC Replicates GWAS hit with Protein C levels. This locus harbours a replicated trans‐pQTL with Vitamin K‐dependent protein C (PROC) and an association with CVD in CardioGram with p=1.36×10‐7; other GWAS hits at this locus are with a number of coagulation parameters. The receptor of PROC (PROCR) is encoded at this locus. The sentinel SNP rs867186 is a G/S amino acid exchange. Tang et al. 33 already reported a strong association of variance in PROCR with PROC levels.

#L108 (rs1190552), #L237 (rs1005776) (IMDH1, IMDH2) Possible link to cell cycle control, may play a role in genetic association with body height. Two non‐replicated trans‐pQTLs (but both still nominally significant in QMDiab) with Inosine‐5'‐monophosphate dehydrogenase 1 (IMPDH1) and Inosine‐5'‐monophosphate dehydrogenase 2 (IMPDH2). Both variants likely tag the same signal: rs1190552 and rs1005776 are in LD (r2=0.65); the association signal of IMPDH2 with rs1005776 and rs1190552 is comparable, and that of IMPDH1 with SNP rs1190552 is stronger. IMPDH1 and IMPDH2 regulate cell growth and catalyse the synthesis of xanthine monophosphate (XMP) from inosine‐5'‐monophosphate (IMP). This is the rate‐limiting step in the de novo synthesis of guanine nucleotides. There is an eQTL with CDK2‐interacting protein (CINP) at this locus. Lovejoy et al. 34 identified CINP as a cell‐cycle checkpoint protein, suggesting a possible link to cell cycle control. IMDH1 and IMDH2 physically interact. This association could also explain a GWAS hit with body height (p=4.1×10‐14).
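Pairwise LD values such as the r2 between rs1190552 and rs1005776 can be approximated directly from genotype data. The sketch below computes the squared Pearson correlation of minor allele dosages, which is a genotype‐based approximation and not necessarily the haplotype‐based estimate used for the values in this note; the function name and the dosage vectors are invented for illustration.

```python
def genotype_r2(dosage_a, dosage_b):
    """Squared Pearson correlation between two minor allele dosage vectors."""
    if len(dosage_a) != len(dosage_b) or len(dosage_a) < 2:
        raise ValueError("need two equally long vectors with at least two samples")
    n = len(dosage_a)
    mean_a = sum(dosage_a) / n
    mean_b = sum(dosage_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(dosage_a, dosage_b))
    var_a = sum((a - mean_a) ** 2 for a in dosage_a)
    var_b = sum((b - mean_b) ** 2 for b in dosage_b)
    if var_a == 0 or var_b == 0:
        # A SNP without variation in the sample has no defined correlation.
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return cov * cov / (var_a * var_b)


# Invented dosages (0, 1 or 2 minor allele copies) for eight samples.
snp_a = [0, 1, 2, 1, 0, 2, 1, 0]
snp_b = [0, 1, 2, 0, 0, 2, 1, 1]
print(round(genotype_r2(snp_a, snp_b), 2))
```

Values close to 1 indicate that two sentinel SNPs most likely tag the same signal, while values close to 0 (as for rs17482078 and rs26496 at the ERAP1 locus) are compatible with independent signals.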
#L113 (rs2545801) SOD2
Possible link of SOD2 to kallikrein‐kinin system.
Trans‐association with Superoxide dismutase [Mn], mitochondrial (SOD2). Replication in QMDiab not attempted due to lack of a tagging SNP. Genetic variant in cis impacts factor XII levels (not in SOMAscan panel). Verweij et al. 35 report a "Genome‐wide association study on plasma levels of midregional‐proadrenomedullin and C‐terminal‐pro‐endothelin‐1" and state "The minor variants in KLKB1 (rs4253238) and F12 (rs2731672), both part of the kallikrein‐kinin system, were associated with higher MR‐pro‐ADM (P=4.46E‐52 and P=5.90E‐24, respectively) and higher CT‐pro‐ET‐1 levels (P=1.23E‐122 and P=1.26E‐67, respectively). Epistasis analyses showed a significant interaction between the sentinel SNP of F12 and KLKB1 for both traits." Link to SOD2 not clear. Confirms strong GWAS hits. Endothelin‐1 (ET‐1) and adrenomedullin (ADM) are circulating vasoactive peptides and predictors of cardiac death and heart failure.

Connecting genetic risk to disease endpoints through the human blood plasma proteome Supplemental Note 1, Page 13

#L115 (rs12146727) C1Q, C1R
C1R may regulate C1S, indicates role of genetic variant in AD and CVD.
This locus harbours a replicated trans‐pQTL to Complement C1q subcomponent (C1QA C1QB C1QC) and a replicated cis‐pQTL to Complement C1r subcomponent (C1R). There is also a weak association in KORA (p=1.8×10‐7) with Complement C4b (C4A/C4B). Functional match between C1S and the C1q complex; nominal association with AD (st12_comb p=5.7×10‐6, note that association strengthens when including more samples: st1 p=3.1×10‐5) and with CVD (CardioGram p=5.9×10‐5).

#L122 (rs1339847) DYNLRB1
Possible link between cis‐variants and protein levels of DYNLRB1.
This locus harbours a replicated trans‐pQTL to Dynein light chain roadblock‐type 1 (DYNLRB1).

#L123 (rs9283893) MMP8
Possible link between cis‐variants and protein levels of MMP8.
This locus harbours a replicated trans‐pQTL to Neutrophil collagenase (MMP8).

#L138 (rs1511802) KLKB1
Part of multiple cis/trans associations that involve proteins from vasodilatory and blood‐coagulation pathways.
This locus harbours a replicated cis‐pQTL with Plasma kallikrein (KLKB1). rs1511802 also associates with bradykinin in Shin et al. 15 (p=4.27×10‐28, see http://www.gwas.eu). See #L96 for more details.

#L141 (rs10919544) Low affinity immunoglobulin gamma Fc region receptors II‐B and III‐B
Ratio between protein concentrations; anti‐correlation.

#L150 (rs12099358) POMC
Possible link between cis‐variants and protein levels of POMC.
This locus harbours a replicated trans‐pQTL to Beta‐endorphin (POMC).

#L155 (rs515863) MAP2K2
Possible link between cis‐variants and protein levels of MAP2K2.
This locus harbours a replicated trans‐pQTL to Dual specificity mitogen‐activated protein kinase kinase 2 (MAP2K2).

#L157 (rs1126478) ALPL
Possible link between cis‐variants in CCR3, CCR5 or LTF and protein levels of ALPL.
This locus harbours a replicated trans‐pQTL to Alkaline phosphatase, tissue‐nonspecific isozyme (ALPL).

#L178 (rs2949833) IGFBP3, (IGF1)
Functional match between insulin‐like growth factor‐binding protein in cis and insulin‐like growth factor in trans.
This locus harbours a replicated cis‐association with insulin‐like growth factor‐binding protein 3 (IGFBP3) levels. A trans‐association with insulin‐like growth factor I (IGF1) levels was only nominal in QMDiab (p=5.1×10‐3). Both associations display the same directionality. This locus replicates the association reported by Kaplan et al. 36 of SNP rs11977526 (chr7:46008110) (IGFBP3) with IGF‐I levels. Confirms strong GWAS hits.

#L182 (rs17220241) LRPAP1
Possible link between cis‐variants and protein levels of LRPAP1.
This locus harbours a replicated trans‐pQTL to alpha‐2‐macroglobulin receptor‐associated protein (LRPAP1).

#L190 (rs28362459) FUT3
Co‐association with total protein N‐glycosylation, IgG N‐glycosylation, and cancer markers.
This locus is discussed in the main paper.

#L192 (rs4494114) SPINT1
Possible link between cis‐variants and protein levels of SPINT1.
This locus harbours a replicated trans‐pQTL to Kunitz‐type protease inhibitor 1 (SPINT1).

#L203 (rs4683702) ECE1
Possible link between cis‐variants and protein levels of ECE1 and PCOLCE2 expression.
This locus harbours a replicated trans‐pQTL to Endothelin‐converting enzyme 1 (ECE1). ECE1 is involved in proteolytic processing of endothelin precursors to biologically active peptides. Mutations in this gene are associated with Hirschsprung disease, cardiac defects and autonomic dysfunction. This variant may impact PCOLCE2 (cis‐eQTL). PCOLCE2 binds to the C‐terminal propeptide of types I and II procollagens and may enhance the cleavage of that propeptide by BMP1. Possible common theme is proteolysis.

#L210 (rs626457) NXPH1
Possible link between cis‐variants and protein levels of NXPH1.
This locus harbours a replicated trans‐pQTL to Neurexophilin‐1 (NXPH1).

#L233 (rs16935432) SAA1
This locus links serum amyloid protein to amyotrophic lateral sclerosis (ALS).
This locus harbours a replicated cis‐pQTL to Serum amyloid A‐1 protein (SAA1). It is also a minor risk locus for sporadic amyotrophic lateral sclerosis 37. Pre‐Serum amyloid protein displayed different serum concentrations in ALS 38.

#L251 (rs11209026) (IL23R)
Lower levels of IL23R may be protective for multiple autoimmune disorders, including ankylosing spondylitis.
This locus hosts a non‐replicated cis‐pQTL with Interleukin‐23 receptor (IL23R). Evans et al. 11 show that the minor allele of rs11209026 is strongly protective for ankylosing spondylitis. Here we show that the minor allele of rs11209026 associates with lower levels of blood‐circulating IL23R protein. The minor allele of rs11209026 is also strongly protective against many other autoimmune disorders, including Crohn's disease, ulcerative colitis and psoriasis. See #L11 and #L14 for two other associations of ERAP1 in relation to ankylosing spondylitis. Evans et al. 11 discuss the interaction of ERAP1 and IL23R.

#L264 (rs11292716) (GOT1, LCMT1)
Confirms strong GWAS hits, and adds protein traits.
Two non‐replicated trans‐pQTLs with Aspartate aminotransferase, cytoplasmic (GOT1) and Leucine carboxyl methyltransferase 1 (LCMT1). Tang et al. 39 report "Clinical and Genetic Association of Serum Paraoxonase and Arylesterase Activities with Cardiovascular Risk" (PON1). This locus associates in our study with GOT1 (Aspartate aminotransferase, cytoplasmic), but not with PON1 levels.

#L298 (rs9651367) (NID1, NID2)
Possible link between cis‐variants and protein levels of NID1 and NID2.
This locus harbours a non‐replicated trans‐pQTL to Nidogen‐1 (NID1), and a nominal trans‐pQTL in KORA with NID2 (p=1.2×10‐8); both proteins are encoded at different loci.

#L304 (rs722414) CD300C
Possible link between cis‐variants and protein levels of CD300C.
This locus harbours a replicated trans‐pQTL to CMRF35‐like molecule 6 (CD300C).

#L306 (rs4420638) APOE/RUXF
Trans‐association with RUXF supports role of splicing in Alzheimer's disease.
This locus is discussed in the main paper.

Boxplot of the ratio between Small Nuclear Ribonucleoprotein F (SNRPF aka RUXF) and Apolipoprotein E (isoform E2) with rs4420638 (#L306); scatterplot of the ratio between SNRPF and APOE‐2, coloured by genotype (black = major allele homozygotes, red = heterozygotes, green = minor allele homozygotes; means by genotype = large filled circles); subnetwork showing edges to three identified pQTLs at the APOE locus.

#L348 (rs3134906) (PRSS2)
Functional match between the T cell receptor beta locus and T‐cell receptor beta variable and joining genes.
Non‐replicated trans‐association with Trypsin‐2 (PRSS2). Cis‐SNP is in T‐cell receptor beta variable and joining repertoire. PRSS2 is localized to the T cell receptor beta locus on chromosome 7.

#L352 (rs17412738) CCL21
Possible link between cis‐variants and protein levels of CCL21.
This locus harbours a replicated trans‐pQTL to C‐C motif chemokine 21 (CCL21).

#L353 (rs1354034) (IMPDH1)
Functional match between Guanine Nucleotide Exchange Factor in cis and Inosine‐5'‐monophosphate dehydrogenase in trans.
This locus harbours a non‐replicated trans‐pQTL to Inosine‐5'‐monophosphate dehydrogenase 1 (IMPDH1). Gieger et al. 40 report this locus as a platelet count GWAS hit with ARHGEF3 (Rho Guanine Nucleotide Exchange Factor (GEF) 3) as functional gene. Gieger et al. showed that ablation of ARHGEF3 in D. rerio has a profound effect on thrombopoiesis and erythropoiesis. Diseases associated with ARHGEF3 include osteoporosis. ARHGEF3 and IMPDH1 are functionally linked, since IMPDH1 catalyses the first step of the guanine ribonucleotide biosynthesis. This locus also has many trans‐eQTLs (N=203), trans‐associations with Dynein light chain roadblock‐type 1 (DYNLRB1) and Methionine aminopeptidase 2 (METAP2), and several less significant associations. This locus also associates weakly in KORA with IMPDH2 (p=1.3×10‐7).

#L389 (rs2559856) (NAAA)
Functional match between cis‐SNP in Choline Phosphotransferase and trans‐association in N‐Acylethanolamine Acid Amidase.
This locus harbours a non‐replicated trans‐pQTL with N‐acylethanolamine‐hydrolyzing acid amidase (NAAA). The likely causative cis‐variant is located in Choline Phosphotransferase 1 (CHPT1). CHPT1 catalyses the phosphatidylcholine biosynthesis from CDP‐choline. NAAA degrades bioactive fatty acid amides to their corresponding acids, with the following preference: N‐palmitoylethanolamine > N‐myristoylethanolamine > N‐lauroylethanolamine = N‐stearoylethanolamine > N‐arachidonoylethanolamine > N‐oleoylethanolamine, which links CHPT1 and NAAA by their biochemical activities.

#L413 (rs9332653) CAMK1, F5
The L1285I variant of F5 may modulate CAMK1 protein levels.
This locus harbours a replicated trans‐pQTL with Calcium/calmodulin‐dependent protein kinase type 1 (CAMK1) and a cis‐pQTL with Coagulation Factor V (F5). rs9332653 tags the L1285I substitution rs1046712 in F5.
See #L17 for a replicated trans‐pQTL with CAMK1 at the same locus, but without F5. See #L79 for a replicated trans‐pQTL to CAMK on a different chromosome.

#L417 (rs16873418) EDAR, (ITGA1/ITGB1)
Possible link between cis‐variants and protein levels of EDAR, possible role in morphogenesis.
This locus harbours a replicated trans‐pQTL to Tumor necrosis factor receptor superfamily member EDAR (EDAR). It also harbours a non‐replicated trans‐pQTL to the Integrin alpha‐I: beta‐1 complex (ITGA1/ITGB1). The protein encoded at this locus is Zinc Finger Protein, FOG Family Member 2 (ZFPM2). It is a transcription regulator that plays a central role in heart morphogenesis and development of coronary vessels from epicardium, by regulating genes that are essential during cardiogenesis. EDAR, a member of the tumour necrosis factor receptor family, can activate the nuclear factor‐kappaB, JNK, and caspase‐independent cell death pathways. There is thus a potential (but weak) link to morphogenesis; however, EDAR has so far only been linked to ectodermal development. This locus is also a GWAS hit for Vascular endothelial growth factor levels and Platelet counts.

#L418 (rs11072996) (IL16)
Imputed data reveals a coding variant in IL16 that is a risk variant for cancer.
This locus harbours a cis‐pQTL to Interleukin‐16 (IL16). Replication was not attempted. Imputed data reveals a much stronger association, including coding variant rs11556218 (p=6.36×10‐27), which is a risk variant for cancer in a meta‐analysis 41.

References

1. Pare, G. et al. Genome‐wide association analysis of soluble ICAM‐1 concentration reveals novel associations at the NFKBIK, PNPLA3, RELA, and SH2B3 loci. PLoS Genet 7, e1001374 (2011).
2. Galicia, J.C. et al. Polymorphisms in the IL‐6 receptor (IL‐6R) gene: strong evidence that serum levels of soluble IL‐6R are genetically influenced. Genes Immun 5, 513‐6 (2004).
3. van Dongen, J. et al. The contribution of the functional IL6R polymorphism rs2228145, eQTLs and other genome‐wide SNPs to the heritability of plasma sIL‐6R levels. Behav Genet 44, 368‐82 (2014).
4. Lopez‐Mejias, R. et al. Lack of association of IL6R rs2228145 and IL6ST/gp130 rs2228044 gene polymorphisms with cardiovascular disease in patients with rheumatoid arthritis. Tissue Antigens 78, 438‐41 (2011).
5. Walker, D.G. et al. Association of CD33 polymorphism rs3865444 with Alzheimer's disease pathology and CD33 expression in human cerebral cortex. Neurobiol Aging 36, 571‐82 (2015).
6. Heppner, F.L., Ransohoff, R.M. & Becher, B. Immune attack: the role of inflammation in Alzheimer disease. Nat Rev Neurosci 16, 358‐72 (2015).
7. Hernandez‐Caselles, T. et al. A study of CD33 (SIGLEC‐3) antigen expression and function on activated human T and NK cells: two isoforms of CD33 are generated by alternative splicing. J Leukoc Biol 79, 46‐58 (2006).
8. Chouraki, V. & Seshadri, S. Genetics of Alzheimer's disease. Adv Genet 87, 245‐94 (2014).
9. Australo‐Anglo‐American Spondyloarthritis, C. et al. Genome‐wide association study of ankylosing spondylitis identifies non‐MHC susceptibility loci. Nat Genet 42, 123‐7 (2010).
10. Lee, K. & Irudayaraj, J. Correct spectral conversion between surface‐enhanced raman and plasmon resonance scattering from nanoparticle dimers for single‐molecule detection. Small 9, 1106‐15 (2013).
11. Evans, D.M. et al. Interaction between ERAP1 and HLA‐B27 in ankylosing spondylitis implicates peptide handling in the mechanism for HLA‐B27 in disease susceptibility. Nat Genet 43, 761‐7 (2011).
12. Matala, E., Alexander, S.R., Kishimoto, T.K. & Walcheck, B. The cytoplasmic domain of L‐selectin participates in regulating L‐selectin endoproteolysis. J Immunol 167, 1617‐23 (2001).
13. Andiappan, A.K. et al. Genome‐wide analysis of the genetic regulation of gene expression in human neutrophils. Nat Commun 6, 7971 (2015).
14. Ho, J.E. et al. Common genetic variation at the IL1RL1 locus regulates IL‐33/ST2 signaling. J Clin Invest 123, 4208‐18 (2013).
15. Shin, S.‐Y. et al. An atlas of genetic influences on human blood metabolites. Nat Genet 46, 543‐550 (2014).
16. Raffler, J. et al. Genome‐Wide Association Study with Targeted and Non‐targeted NMR Metabolomics Identifies 15 Novel Loci of Urinary Human Metabolic Individuality. PLoS Genet 11, e1005487 (2015).

17. Schramm, K. et al. Mapping the genetic architecture of gene regulation in whole blood. PLoS One 9, e93844 (2014).
18. Li, Y. et al. Genetic variants and risk of lung cancer in never smokers: a genome‐wide association study. Lancet Oncol 11, 321‐30 (2010).
19. Liu, H. et al. Discovery of six new susceptibility loci and analysis of pleiotropic effects in leprosy. Nat Genet 47, 267‐71 (2015).
20. Global Lipids Genetics, C. et al. Discovery and refinement of loci associated with lipid levels. Nat Genet 45, 1274‐83 (2013).
21. Plenge, R.M., Scolnick, E.M. & Altshuler, D. Validating therapeutic targets through human genetics. Nat Rev Drug Discov 12, 581‐94 (2013).
22. Froguel, P. et al. A genome‐wide association study identifies rs2000999 as a strong genetic determinant of circulating haptoglobin levels. PLoS One 7, e32327 (2012).
23. Teslovich, T.M. et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707‐13 (2010).
24. Inouye, M. et al. Novel Loci for metabolic networks and multi‐tissue expression studies reveal genes for atherosclerosis. PLoS Genet 8, e1002907 (2012).
25. Pays, E., Vanhollebeke, B., Uzureau, P., Lecordier, L. & Perez‐Morga, D. The molecular arms race between African trypanosomes and humans. Nat Rev Microbiol 12, 575‐84 (2014).
26. Hajduk, S.L. et al. Lysis of Trypanosoma brucei by a toxic subspecies of human high density lipoprotein. J Biol Chem 264, 5210‐7 (1989).
27. Xu, J. et al. Genome‐wide association study in Chinese men identifies two new prostate cancer risk loci at 9q31.2 and 19q13.4. Nat Genet 44, 1231‐5 (2012).
28. Cheong, K.A. et al. A novel function of Siglec‐9 A391C polymorphism on T cell receptor signaling. Int Arch Allergy Immunol 154, 111‐8 (2011).
29. Carrasquillo, M.M. et al. Genome‐wide screen identifies rs646776 near sortilin as a regulator of progranulin levels in human plasma. Am J Hum Genet 87, 890‐7 (2010).
30. Nguyen, A.D., Nguyen, T.A., Martens, L.H., Mitic, L.L. & Farese, R.V., Jr. Progranulin: at the interface of neurodegenerative and metabolic diseases. Trends Endocrinol Metab 24, 597‐606 (2013).
31. Tang, W. et al. Genetic associations for activated partial thromboplastin time and prothrombin time, their gene expression profiles, and risk of coronary artery disease. Am J Hum Genet 91, 152‐62 (2012).
32. Lieb, W. et al. Genome‐wide meta‐analyses of plasma renin activity and concentration reveal association with the kininogen 1 and prekallikrein genes. Circ Cardiovasc Genet 8, 131‐40 (2015).
33. Tang, W. et al. Genome‐wide association study identifies novel loci for plasma levels of protein C: the ARIC study. Blood 116, 5032‐6 (2010).
34. Lovejoy, C.A. et al. Functional genomic screens identify CINP as a genome maintenance protein. Proc Natl Acad Sci U S A 106, 19304‐9 (2009).
35. Verweij, N. et al. Genome‐wide association study on plasma levels of midregional‐proadrenomedullin and C‐terminal‐pro‐endothelin‐1. Hypertension 61, 602‐8 (2013).
36. Camara, A., Contigiani, M.S. & Medeot, S.I. [Concomitant activity of 2 bunyaviruses in horses in Argentina]. Rev Argent Microbiol 22, 98‐101 (1990).
37. Xie, T. et al. Genome‐wide association study combining pathway analysis for typical sporadic amyotrophic lateral sclerosis in Chinese Han populations. Neurobiol Aging 35, 1778 e9‐1778 e23 (2014).

38. Goldknopf, I.L. et al. Complement C3c and related protein biomarkers in amyotrophic lateral sclerosis and Parkinson's disease. Biochem Biophys Res Commun 342, 1034‐9 (2006).
39. Tang, W.H. et al. Clinical and genetic association of serum paraoxonase and arylesterase activities with cardiovascular risk. Arterioscler Thromb Vasc Biol 32, 2803‐12 (2012).
40. Gieger, C. et al. New gene functions in megakaryopoiesis and platelet formation. Nature 480, 201‐8 (2011).
41. Mo, C.J. et al. Positive association between IL‐16 rs11556218 T/G polymorphism and cancer risk: a meta‐analysis. Asian Pac J Cancer Prev 15, 4697‐703 (2014).
