Supplementary material

Proteome-wide survey of the autoimmune target repertoire in autoimmune polyendocrine syndrome type 1

*Nils Landegren1,2, Donald Sharon3,4, Eva Freyhult2,5,6,, Åsa Hallgren1,2, Daniel

Eriksson1,2, Per-Henrik Edqvist7, Sophie Bensing8, Jeanette Wahlberg9, Lawrence M.

Nelson10, Jan Gustafsson11, Eystein S Husebye12, Mark S Anderson13, Michael

Snyder3, Olle Kämpe1,2

Nils Landegren and Donald Sharon contributed equally to the work

Affiliations

1Department of Medicine (Solna), Karolinska University Hospital, Karolinska

Institutet, Sweden

2Science for Life Laboratory, Department of Medical Sciences, Uppsala University,

Sweden

3Department of Genetics, Stanford University, California, USA

4Department of Molecular, Cellular, and Developmental Biology, Yale University,

Connecticut, USA

1

5Department of Medical Sciences, Cancer Pharmacology and Computational

Medicine, Uppsala University

6Bioinformatics Infrastructure for Life Sciences

7Department of Immunology, Genetics and Pathology, Uppsala University, Sweden and Science for Life Laboratory

8 Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm,

Sweden

9Department of Endocrinology and Department of Medical and Health Sciences and

Department of Clinical and Experimental Medicine, Linköping University, Linköping,

Sweden

10Integrative Reproductive Medicine Group, Intramural Research Program on

Reproductive and Adult Endocrinology, National Institute of Child Health and Human

Development, National Institutes of Health, Bethesda, MD 20892, USA.

11Department of Women's and Children's Health, Uppsala University, Sweden

12Department of Clinical Science, University of Bergen, and Department of Medicine,

Haukeland University Hospital, Bergen, Norway

2

13Diabetes Center, University of California San Francisco, USA

*Correspondence should be addressed to

Nils Landegren [email protected]

Department of Medicine Solna, Karolinska Institute

Experimental endocrinology

Center for Molecular Medicine, L8:01, Karolinska University Hospital

SE-171 76 Stockholm [email protected] phone: +46851779158

3 40000

All Patients

count Healthy Controls Negative

20000

0

1e+01 1e+03 1e+05

Fig. S1. Distribution of log-intensities for arrays probed with patient sera, healthy control sera or without serum sample (Negative).

DDC

100

50

Autoantibody index Autoantibody 3 SD 0

APS1 Healty

Fig. S2. The proteome array results for DDC autoantibody detection were validated using a radio-ligand binding assay (RLBA) in the 51 patients from the APS1 discovery cohort and in 39 healthy controls. The upper limit of the normal range was defined as three standard deviations above the average of the healthy controls.

4

45000 40000 35000 30000 25000

cpm 20000 15000 10000 5000 0

Fig. S3. Representative raw data from the DDC autoantibody radio-ligand binding assay. All patient and control samples were analyzed in duplicate. Negative control

(bar 1-4), positive control (bar 5-8), patients with APS1 (bar 9-68) and healthy control subjects (bar 69-96). Counts per minute (CPM).

DDC Autoantibodies

60000

40000

20000

0 Proteome array signal array Proteome

Positive in RLBA Negative in RLBA

Fig. S4. The proteome array results for DDC were compared against the results from the radio-ligand binding assay (RLBA) for the 51 patients in the APS1 discovery cohort. Patients who were positive for DDC autoantibodies in the RLBA are grouped to the left (cut-off = average of the healthy + 3SD). Proteome array signals are

5 represented as the average fluorescence signal intensities for duplicates on the array after subtraction of the background signal.

160 140 120 100 80 60 40 20 0 10 100 1000 10000 100000 -20

Fig. S5. The proteome array results for DDC were compared against the results from the radio-ligand binding assay (RLBA) for the 51 patients in the APS1 discovery cohort. Proteome array signal results are shown on the x-axis and RLBA results on the y-axis. Proteome array signals are represented as the average fluorescence signal intensities for protein duplicates on the array after subtraction of the background signal. RLBA results are shown as autoantibody index values.

Cluster Dendrogram 1.5 1.0 0.5 0.0 Height R257X homozygotes Healthy controls Healthy controls Healthy Healthy controls Healthy Healthy controls Healthy Healthy controls Healthy controls Healthy Healthy controls Healthy controls Healthy Healthy controls Healthy Healthy controls Healthy controls Healthy R257X homozygotes Healthy controls Healthy Healthy controls Healthy Healthy controls Healthy Healthy controls Healthy controls Healthy Healthy controls Healthy Healthy controls Healthy controls Healthy Healthy controls Healthy controls Healthy R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes R257X homozygotes c967 − 979del13 homozygotes c967 − 979del13 homozygotes c967 − 979del13 homozygotes c967 − 979del13 homozygotes c967 − 979del13 homozygotes c967 − 979del13 homozygotes c967 − 979del13 homozygotes c967 − 979del13 homozygotes Heterozygotes and rare AIRE mutations Heterozygotes and rare Heterozygotes and rare AIRE mutations Heterozygotes and rare Heterozygotes and rare AIRE mutations Heterozygotes and rare Heterozygotes and rare AIRE mutations Heterozygotes and rare Heterozygotes and rare AIRE mutations Heterozygotes and rare Heterozygotes and rare AIRE mutations Heterozygotes and rare Heterozygotes and rare AIRE mutations Heterozygotes and rare Heterozygotes and rare AIRE mutations Heterozygotes and rare Heterozygotes and rare AIRE mutations Heterozygotes and rare

as.dist(1 − c) hclust (*, "complete")

6 Fig. S6. Patients with APS1 and healthy controls were clustered based on Spearman correlations for the known autoantigens; glutamate decarboxylase 2 (GAD2), glutamate decarboxylase 1 (GAD1), tryptophan hydroxylase (TPH1), aromatic L- amino acid decarboxylase (DDC), gastric intrinsic factor (GIF), cytochrome P450, family 1, subfamily A, polypeptide 2 (CYP1A2), potassium channel regulator (KCNRG), testis specific 10 (TSGA10), 22 (IL22), interleukin 17A (IL17A), omega (IFNW1) and interferon alpha 1 (IFNA1). Distance = 1 – correlation.

7 value 60000

40000

20000

0 IL1B IL32 TSLP IL3 IL24 IL36B IL20 IL26 IL12A IL1RN IL1F10 IL34 LIF IL21 IL9 IL23A IL16 IL25 IL4 IFNK IL19 OSM IL36G IL13 CLCF1 IL10 IL7 IL2 IL15 IL5 IL27 IL17A IL22 IFNA16 IFNW1 IL36A IFNA17 IFNA1 IFNA13 IFNA4 IFNA21 IFNA2 IFNA5 IFNA6 IFNA8 IFNA14 IFNG IL6 IL8

P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12 P13 P14 P15 P16 P17 P18 P19 P20 P21 P22 P23 P24 P25 P26 P27 P28 P29 P30 P31 P32 P33 P34 P35 P36 P37 P38 P39 P40 P41 P42 P43 P44 P45 P46 P47 P48 P49 P50 P51 HC1 HC2 HC3 HC4 HC5 HC6 HC7 HC8 HC9 HC10 HC11 HC12 HC13 HC14 HC15 HC16 HC17 HC18 HC19 HC20 HC21

Fig. S7. Focused investigation of 49 , and other closely related . High autoantibody signals were seen specifically for the previously

8 established immune targets, including multiple alpha interferon specifies, IFNW1,

IL22 and IL17A (IL17F was not present in the array panel)

60000

40000 DDC CORO2A C1orf137 Intensity EPDR1

20000

0

● ● ● ●

● ● ●

● ●● ● ● ● 10000 ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ●

● ● ● ● ● ● ● ●● ● ● ● ● ● ● CORO2A ● ● ● ● ● C1orf137 ● ● ● ● ● Intensity ● ● EPDR1 ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ●● ●●● ● ● ● ● ● ● ● ● ●● ● ● 100 ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ●● ●● ● ● ●●●● ●● ● ● ●●● ●●●● ● ●● ● ● ● ● ● ●● ●● ●●● ● ● ● ● ● ●● ● ● ● ● ● ●● ● ● ●● ●● ● ● ● ● ● ●

● ●

100 1000 10000 Intensity DDC

Fig. S8. Printing contamination artifacts exemplified by DDC and adjacent targets.

Correlated protein array signals between DCC, an established APS1 autoantigen, and

9 nearby located ; CORO2AA, C1orf137 and EPDR1. Healthy subjects (n=21) are grouped to the left and patients with APS1 (n=51) to the right in the upper image. Average fluorescence signal intensities for protein duplicates on the array after subtraction of the background signal are shown.

0.25

0.5 Absolute correlation 0.75

1 A G I I G G G D C C E T I I I I I I I I I I I F I I F M M W P I M C T L F F F F F F F F F F F F Q L G G P D A A D T P A A A I O 1 A A C 2 N N N N N N N N N N N N O 1 D F C S H o M I C 2 P M M D D D D G G R W A A A A A A A A A A A D L R B R 1 r F G 2 1 1 4 5 T 4 2 1 4 1 2 5 8 1 6 1 E E R 7 7 N f C O 1 K 4 2 G 1 1 7 4 3 1 0 B B 1 5 1 4 1 3 2 F A 3 4 2 0 L B A 7 1

Fig. S9. Identification of printing contamination artifacts in the proteome array data set. Correlations between the top targets in the protein array screen before annotation are shown. Unexpected strong correlations between targets closely located on the array were identified as artifacts (marked in Italic) and excluded from the analyses.

10 PDILT

30000

20000

10000

0 Proteome array signal array Proteome

Positive in RLBA Negative in RLBA

Fig. S10. Validation of PDILT autoantibodies in the APS1 discovery cohort (n=51) using a radio-ligand binding assay (RLBA). Patients that were positive for PDILT autoantibodies in the RLBA are grouped to the left. The cut-off in the RLBA was defined as an index-value of 15. Proteome array signals are shown as the average fluorescence signal intensities for protein duplicates on the array after subtraction of the background signal.

MAGEB2

60000

40000

20000

0 Proteome array signal array Proteome

Positive in RLBA Negative in RLBA

Fig. S11. Validation of MAGEB2 autoantibodies in the APS1 discovery cohort (n=51) using a radio-ligand binding assay (RLBA). Patients that were positive for MAGEB2

11 autoantibodies in the RLBA are grouped to the left. The cut-off in the RLBA was defined as an index-value of 15. Proteome array signals are shown as the average fluorescence signal intensities for protein duplicates on the array after subtraction of the background signal.

12

Fig. S12. mRNA levels of MAGEB2 and PDILT were assessed in multiple human tissues by digital droplet PCR. The greatest number of high amplitude events is seen in the testis cDNA sample, and very few or no high amplitude events are seen for the other tissue samples and the no template control (NTC). Grey dots denote events with intensities below cut-off. The results for PDILT and MAGEB2 were compared against that of β2M.

13

Autoantigen Prevalence

IFNW1 47/51 92%

IFNA1 46/51 90%

DDC 37/51 73%

IL22 35/51 67%

TPH1 28/51 55%

GAD2 22/51 43%

GAD1 15/51 29%

GIF 14/51 27%

IL17A 12/51 24%

KCNRG 3/51 6%

CYP1A2 2/51 4%

TSGA10 2/51 4%

Table S1. Prevalence of established autoantibodies in the protein array screen. 51 patients with APS1 were investigated. The upper limit of the normal range was defined as three standard deviations above the average of the healthy control subjects.

14

Gene Tissue expression, mRNA seq (HPA)

TPH1 Group enriched (colon, duodenum, rectum, small intestine, stomach).

DDC Tissue enhanced (duodenum, kidney, small intestine).

GAD2 Tissue enriched (cerebral cortex).

GAD1 Tissue enriched (cerebral cortex).

IFNA13 Not detected.

IL22 Not detected.

GIF Tissue enriched (stomach).

MAGEB2 Tissue enriched (testis).

MAGEB4 Tissue enriched (testis).

PDILT Group enriched (stomach, testis).

TGM4 Tissue enriched (prostate).

Table S2. Tissue expression profiles of the major immune targets in the proteome array screen. mRNA sequencing data on expression profiles obtained from the

Human Protein Atlas, http://www.proteinatlas.org. For comparison, among 20,344 investigated by the Human Protein Atlas, 2,355 (12%) were classified as being

Tissue enriched, 1,109 as Group enriched (5%), 3,478 (17%) as Tissue enhanced,

8,874 (44%) as Expressed in all, 2,696 (13%) as Mixed and 1,832 (9%) as Not detected

39. Several type 1 interferon species were major immune targets in the patients with

APS1 but are here represented by IFNA13 only.

15

Best GOs GO description Genes Group Total P-value count count (Benjamini corrected) GO:0042136 neurotransmitter biosynthetic TPH1 3 16 1.05e-05 process GAD2 GAD1 GO:0006540 glutamate decarboxylation to GAD2 2 2 1.23e-05 succinate GAD1 GO:0006105 succinate metabolic process GAD2 2 2 1.23e-05 GAD1 GO:0004351 glutamate decarboxylase activity GAD2 2 2 1.23e-05 GAD1 GO:0042133 neurotransmitter metabolic TPH1 3 32 1.86e-05 process GAD2 GAD1 GO:0006538 glutamate catabolic process GAD2 2 3 2.47e-05 GAD1 GO:0016831 carboxy-lyase activity GAD2 3 56 7.37e-05 DDC GAD1 GO:0005615 extracellular space IFNW1 5 528 7.96e-05 IL22 IL17A GIF IFNA13 GO:0006519 amino acid and derivate IL22 5 548 8.48e-05 metabolic process TPH1 GAD2 DDC GAD1 GO:0019842 vitamin binding GAD2 4 245 9.17e-05 GIF DDC GAD1 GO:0001505 regulation of neurotransmitter TPH1 3 78 0.000117 levels GAD2

16 GAD1 GO:0009308 amine metabolic process IL22 5 636 0.000117 TPH1 GAD2 DDC GAD1 GO:0007267 cell-cell signaling IL22 5 640 0.000117 IL17A TPH1 GAD2 GAD1 GO:0016830 carbon-carbon lyase activity GAD2 3 82 0.000117 DDC GAD1 GO:0005132 interferon-alpha/beta receptor IFNW1 2 9 0.000118 binding IFNA13 GO:0006807 nitrogen compound metabolic IL22 5 678 0.000134 process TPH1 GAD2 DDC GAD1 GO:0005125 activity IFNW1 4 310 0.000137 IL22 IL17A IFNA13 GO:0019752 carboxylic acid metabolic process IL22 5 795 0.000238 TPH1 GAD2 DDC GAD1 GO:0006082 organic acid metabolic process IL22 5 798 0.000238 TPH1 GAD2 DDC GAD1 GO:0030170 pyridoxal phosphate binding GAD2 3 117 0.000238 DDC GAD1

17 GO:0043648 dicarboxylic acid metabolic GAD2 2 16 0.000268 process GAD1 GO:0006536 glutamate metabolic process GAD2 2 16 0.000268 GAD1 GO:0044421 extracellular region part IFNW1 5 910 0.000386 IL22 IL17A GIF IFNA13 GO:0009065 glutamine familty amino acid GAD2 2 20 0.000389 catabolic process GAD1 GO:0006520 amino acid metabolic process IL22 4 474 0.00049 TPH1 GAD2 GAD1 GO:0006100 tricarboxylic acid cycle GAD2 2 32 0.000934 intermediate metabolic process GAD1 GO:0042401 biogenic amine biosynthetic TPH1 2 34 0.000982 process DDC GO:0006952 defense response IFNW1 4 584 0.000982 IL22 IL17A IFNA13 GO:0042398 amino acid derivate biosynthetic TPH1 2 42 0.00145 process DDC GO:0016829 lyase activity GAD2 3 278 0.00207 DDC GAD1 Table S3. terms showing significant overepresentation in established

APS1 autoantigens present on the protein array (TPH1, DDC, GAD2, GAD1, IFNA13,

IL22, GIF, MAGEB2, MAGEB4, PDILT, TGM4, IFNW1, IL17A, KCNRG, CYP1A2, TSGA10).

Associated GO terms were retrieved from the GOa_human database. Minimal length of considered GO paths was set to 3 (default), and top 30 most significant were

18 reported. 'False discovery rate' according to Benjamini was used for correcting p- values for multiple testing. Results were not clustered.

19