b c a f % % Platelets

Relative Abundance % Platelets 0.01 0.02 Supplementary figure 1 0.0% 0.5% 1.0% 0.05 0 1 0 1 0.1 0 1 0 1 0 2 1 2 0 0 Whole SERPIND1 FGG ITIH4 Spun Spun LD + + washes 3 Spun LD + + washes 2 Spun LD + + wash 1 Spun P + 1 washes 1 P + Leucocytes Platelets Leucocytes Platelets P + 2 washes 2 P + C4BPA Leucocytes Platelets HRG FGB P + 3 washes 3 P +

0 0.2 0.1

0.01 0.005 0.2 0.1 0 0 0 0.001 0.002 % Leukocytes %

APCS FGA LBP

% Leukocytes % % Leukocytes % C2 C1 g d e

Spun - A Whole Spun - B Signal peptide

Glycoprotein Spun + LD + 1 wash - B Spun + LD + 1 wash - A Secreted Pellet Spun + LD + 2 washes -A TMT-126 RBC Spun + LD + 3 washes - A TMT-127N

Spun + LD + 2 washes -B Leukodeplete

0 Spun + LD + 3 washes - B - log 10 (Corrected p 0 Wash Relative abundanceRelative TMT-127C Spun - A 1 20 Spun - B Spun + LD + 1 wash - B TMT-128N TMT-128C Wash

Spun + LD + 1 wash - A 2

- Spun + LD + 2 washes -A value) Spun + LD + 3 washes - A TMT-129N TMT-129C Wash

Spun + LD + 2 washes -B 3 40 1 Spun + LD + 3 washes - B TMT-130N Supplementary Figure 1. An optimised RBC surface protein enrichment protocol. (a) Contamination of enriched RBC by platelets and leukocytes after microcentrifugation. 1 ml of was centrifuged at 1,000 g for 5 min in a 1.5 ml Eppendorf tube. A 200 ml pipette tip was used to withdraw the indicated volume of packed erythrocytes from the very bottom of the tube, and contamination was assessed by flow cytometry using CD61 as a marker and CD45 as a leukocyte marker. Data are the average of two independent experiments. (b) RBC enriched from whole blood using the method indicated were similarly assessed for platelet and leukocyte contamination. ‘Spun’ indicates centrifugation as described in (a) followed by removal of 150 ml of packed erythrocytes. Leukodepletion (LD) used a Plasmodipur filter on spun populations of RBC as described in the Methods section, with 1-3 washes. The bottom panel shows an enlargement of the top graph, indicating that after leukodepletion and 3 washes, there were <0.001% leukocytes and <0.01% platelets. Data are the average of two independent experiments. (c) Comparison of leukocyte and platelet depletion methods using MACS microbeads with the optimised microcentrifugation / leukodepletion / wash protocol. (d) Schematic of the experimental workflow. (e) Hierarchical cluster analysis of proteins quantified in biological duplicate from the indicated RBC preparations described in (b), using signal:noise data normalised to a maximum of 1 for each protein from TMT-based proteomics. An enlargement of the subcluster C1 is shown in the right panel. (f) Example plots of the relative abundance of individual proteins in cluster C1. (g) Application of DAVID software identified enrichment of secreted amongst proteins in clusters C1 and C2, compared to all quantified proteins, indicative of serum contaminants. Supplementary figure 2

UK donors Senegal donors

Supplementary Figure 2. Known blood group proteins constitute the majority of the RBC surface proteome. iBAQ abundances of all 25 quantified blood group proteins as a proportion of the total RBC surface proteome, in both populations. See also Supplementary Data 4. Supplementary figure 3

log10(iBAQ) UK vs Senegalese donors 12 GYPA 11 GYPC BSG SLC4A1 10 CD44 ICAM4 9 SEMA7A CD55 8 TFRC ABCB6

7 CR1

6 (iBAQ) Senegalese donors Senegalese (iBAQ)

10 5

log 4 4 5 6 7 8 9 10 11 12

log10(iBAQ) UK donors

Supplementary Figure 3. Comparison of RBC surface protein abundance between populations. iBAQ abundance values for each protein from UK and Senegalese donors, as shown in Fig. 2a but here with known malaria receptors highlighted. Supplementary figure 4

GYPA GYPC SLC4A1 1.0

0.5 UK donors

abundance 0 1.0

Senegalese Relative Relative 0.5 donors

0

Normalisation Normalised by total signal strategy Normalised by GYPA/GYPC/SLC4A1

Supplementary Figure 4. Expression of GYPA, GYPC and SLC4A1 molecules are similar across multiple donors, irrespective of normalisation method. Relative abundances of the RBC surface proteins GYPA, GYPC and SLC4A1, using normalisation by two different strategies. Data are plotted as a box and whisker plots showing mean, median and interquartile ranges. n = 9 independent biological samples for each group. Supplementary figure 5 UK donors 120

100 R2 = 0.04

80 LPCAT3

SLC4A2 LAMP1 60

%CV LMAN2 ITGA2B METTL7A ICAM4 40

CD35 20 SLC2A1 GYPC GYPA 0 SLC4A1 4 5 6 7 8 9 10 11 12

log10(iBAQ)

Senegalese donors 120

XG 100 R2 = 0.03

80 SHISA4 ITGA2B 60 SLC4A2 TFRC

%CV LPCAT3 TM9SF2 40 CD35

SLC4A1 20 SLC2A1 GYPC GYPA 0 4 5 6 7 8 9 10 11 12

log10(iBAQ) Supplementary Figure 5. Measured variability in protein expression (%CV) does not correlate with protein abundance. Abundance (iBAQ) and %CV values from Fig. 2 and 3 were plotted against one another. Little correlation was seen, apart from low %CV for the four most abundant proteins. For each population these four proteins and the seven most variable proteins are labelled. CD35/CR1 is also labelled for illustration. Supplementary figure 6

a b Donor 1 Donor 2 10000 PE-GYPA

1000 maximum)

APC- 100

ICAM4 Median fluorescence intensity fluorescence Median 10 0 2 4

PE-CD35 Time point (weeks) Count Count (normalised to

Donor 1 GYPA Donor 2 GYPA Signal intensity Donor 1 ICAM4 Donor 2 ICAM4 4 weeks Donor 1 CD35 2 weeks Donor 2 CD35 0 weeks Unstained RBC control

c

A

-

SC S

FSC-A

Supplementary Figure 6. Stability of RBC surface protein expression over time assessed by flow cytometry. (a) RBC surface protein expression from two UK donors. Data is displayed as a histogram normalised to the maximum, with fluorescence signal intensity on the x-axis and normalised count on the y-axis. (b) The median fluorescence intensity of staining for the three marker proteins illustrated in (a) plotted over time. (c) Example of RBC gating strategy based on light scattering properties, shown for deglycerolised RBC.