medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

1 Title: Immuno-proteomic profiling reveals abundant airway CD8 T cells and ongoing 2 epithelial injury in prolonged post-COVID19 respiratory disease 3 4 Authors: Bavithra Vijayakumar1,2,3,+, Karim Boustani1,4,+, Patricia P. Ogger1,+, Artermis 5 Papadaki5,+, James Tonkin1,3, Christopher M. Orton1,3, Poonam Ghai1, Kornelija 6 Suveizdyte1, Richard J. Hewitt1,3, Robert J. Snelgrove1,4, Philip L. Molyneaux1,3, 7 Justin L. Garner2,3, James E. Peters5,*, Pallav L. Shah1,2,3,*, Clare M. Lloyd1,4* and 8 James A. Harker1,4,* 9 10 1National Heart and Lung Institute, Imperial College London, London, UK 11 2Chelsea and Westminster Hospital, London, UK 12 3Royal Brompton and Harefield Hospitals, Guy’s and St Thomas’ NHS Foundation 13 Trust, London, UK 14 4Asthma UK Centre for Allergic Mechanisms of 15 5Centre for Inflammatory Disease, Dept of Immunology and Inflammation, Imperial 16 College London, Hammersmith Hospital, London, UK 17 +denotes equal contribution 18 *co-senior authors 19 20 Lead contact: Dr James A Harker, [email protected] 21 Sir Alexander Fleming Building, National Heart and Lung Institute, South Kensington 22 Campus, Imperial College London, London, UK. SW7 2AZ.

NOTE: This preprint reports new research that has not been certified1 by peer review and should not be used to guide clinical practice. medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

23 Highlights 24 25 • The post-COVID19 airway is characterized by increased cytotoxic . 26 27 • Distinct airway proteomes are associated with the airway immune cell landscape. 28 29 • The peripheral blood does not predict immune-proteome alterations in the airway 30 post-COVID19. 31 32 • Persistent abnormalities in the airway immune-proteome post-COVID19 airways 33 correlate with ongoing epithelial damage. 34

2 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

35 Summary 36 Some patients hospitalized with acute COVID19 suffer respiratory symptoms that 37 persist for many months. To characterize the local and systemic immune responses 38 associated with this form of ‘Long COVID’, we delineated the immune and proteomic 39 landscape in the airway and peripheral blood of normal volunteers and patients from 40 3 to 6 months after hospital discharge. The bronchoalveolar lavage (but not 41 peripheral blood) proteome was abnormal in patients with post-COVID19 lung 42 disease with significantly elevated concentration of associated with 43 , tissue repair and epithelial injury. This correlated with an increase in 44 cytotoxic lymphocytes (especially tissue resident CD8+ T cells), lactate 45 dehydrogenase and albumin (biomarkers of cell death and barrier integrity). Follow- 46 up of a subset of these patients greater than 1-year post-COVID19 indicated these 47 abnormalities resolved over time. Collectively, these data indicate that COVID-19 48 results in a prolonged change to the airway immune landscape in those with 49 persistent lung disease, with evidence of cell death and tissue repair linked to 50 ongoing activation of cytotoxic T cells. 51 52 53 Key words: respiratory viral infection, tissue resident memory, COVID19, SARS- 54 CoV-2, airways

3 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

55 Introduction 56 57 Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) related coronavirus 58 disease (COVID19) manifests as a spectrum of acute illnesses ranging from mild 59 respiratory symptoms to severe, sometimes fatal, respiratory failure (Docherty et al., 60 2020). While the acute impact of COVID19 on morbidity and mortality is well- 61 documented, we are still in the infancy of understanding the longer-term 62 consequences. Morbidity from a range of persistent symptoms, including 63 breathlessness, fatigue and memory impairment have been noted in patients 64 recovering after the acute illness and described under the umbrella term of “long 65 COVID” (Nalbandian et al., 2021; Sigfrid et al., 2021). Complex respiratory 66 complications have been found in up to 18.4% of inpatients (Drake et al., 2021), and 67 persistent breathlessness reported in more than 50% of patients recovering from 68 COVID19 (Mandal et al., 2021). The underlying aetiology for persistent respiratory 69 morbidity is likely to be multifactorial but may be due to persistent parenchymal 70 abnormalities and resultant ineffective gaseous exchange. Persistent radiological 71 abnormalities post-COVID19 are common and may be present even up to 6 months 72 post hospital discharge (Fabbri et al., 2021; Guler et al., 2021; Han et al., 2021; Myall 73 et al., 2021). There is, therefore, a pressing need to understand the molecular and 74 cellular basis of post-COVID19 pulmonary syndromes. 75 76 The acute immunological and inflammatory events that occur during human 77 respiratory virus infections, including SARS-CoV-2, are relatively well described 78 (Harker and Lloyd, 2021). In contrast, the immunological landscape of the human 79 respiratory tract after recovery from acute viral infection is poorly understood. SARS- 80 CoV-2 infection results in formation of long-lasting systemic immunological memory, 81 with virus-specific antibodies and T cell responses still detectable in the majority of 82 those infected at least 8 months post infection and higher titers seen in previously 83 hospitalized individuals (Dan et al., 2021). Circulating counts and the 84 function and frequency of are also reduced during acute disease, but they 85 appear to return to normal shortly after resolution of acute disease (Mann et al., 86 2020; Scott et al., 2020). Likewise, plasma concentrations of inflammatory mediators 87 such as IL-6 and CXCL10, that are highly elevated in acute disease, reduce as 88 individuals recover (Rodriguez et al., 2020). Together, this suggests that systemic 89 inflammatory and immune responses associated with acute disease severity resolve 90 in line with recovery from the acute symptoms. It therefore remains unclear if the 91 severity of inflammation during acute disease is associated with the persistent

4 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

92 respiratory pathology seen in some SARS-CoV-2 infected individuals months after 93 infection, or if there is ongoing inflammation in these individuals. 94 95 This study sought to examine the relationship between the immune system and 96 respiratory pathology post-COVID19. The immune cell and proteomic composition of 97 the airways and peripheral blood were analyzed in a group of previously hospitalized 98 COVID19 patients with persistent radiological abnormalities in their lungs over 3 99 months post discharge. In comparison to healthy individuals, the post-COVID19 100 airway showed substantial increases in activated tissue resident memory CD8+ and 101 CD4+ TRM, and an altered pool. The airway proteome was also distinct 102 from that observed in healthy individuals, with elevation in proteins associated with 103 ongoing cell death, loss of barrier integrity and immune cell recruitment. Importantly, 104 none of these alterations were predicted by changes in the proteome or immune cells 105 of the matched peripheral blood. Moreover, the scale of these alterations was not 106 linked to the initial severity of disease while in hospital but instead was heterogenous. 107 Some individuals displaying heightened T cell responses associated with significant 108 increases in CXCR3 chemokines in the airways and linked to prolonged epithelial 109 damage and extracellular matrix (ECM) dysregulation, while other individuals 110 exhibited a return to relative airway homeostasis. Subsequent long-term follow-up 111 also suggested that these ongoing changes to the airway landscape progressively 112 return to normal.

5 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

113 Results 114 115 Increased airway lymphocyte numbers characterize patients recovering from 116 hospitalization with SARS-CoV-2 117 118 We recruited 38 patients undergoing bronchoscopy for investigation of persistent 119 respiratory abnormalities 3-6 months following infection after acute SARS-CoV-2 120 infection (post-COVID19) (Figure 1A). All patients had both ongoing respiratory 121 symptoms and radiological pulmonary abnormalities on computed tomography (CT) 122 scanning. We obtained plasma samples from peripheral blood and bronchoalveolar 123 lavage (BAL) fluid from their airways. We stratified the post-COVID19 cohort based 124 on the level of respiratory support used during their initial hospitalization with acute 125 COVID19, into moderate (no or minimal oxygen administered), severe (non-invasive 126 ventilation) and very severe (invasive ventilation). To provide a control group, we 127 used BAL and plasma samples obtained from 10 healthy volunteers recruited prior to 128 the COVID19 pandemic (demographic information in Table S1). 129 130 We first compared the cellular composition of BAL fluid in post-COVID19 patients to 131 healthy controls (HC) by flow cytometry. Post-COVID19 patients had a significantly 132 higher numbers of cells in their airways compared to the healthy volunteers (Figure 133 1B). This increased cellularity was due to significantly elevated numbers of airway T 134 and B cells, and a trend towards increased airway macrophages (AM), CD56+CD3- 135 (natural killer, NK) cell and CD56+CD3+ (NKT) cell numbers, while CD14+ monocyte, 136 and numbers were similar to those found in healthy controls 137 (Figure 1C). As a proportion of airway leukocytes, CD14+ monocytes and 138 were decreased in patients recovering from COVID19 compared to healthy controls 139 (Figure S1A). There was no association between the severity of acute COVID19 in 140 hospital and the immune cell composition of the post-COVID19 airways (Figure 1C). 141 In contrast to the peripheral lymphopenia that is associated with acute COVID19 142 (Chen and Wherry, 2020), we found that in this post-COVID19 patient cohort the 143 frequency of T cells, B cells and CD14+ monocytes in the peripheral blood was 144 broadly similar to healthy controls (Figure S1B), although the proportion of NK and 145 NKT cells was significantly decreased (Figure S1B). Collectively, these data indicate 146 that after recovery from severe SARS-CoV-2 infection, immune cell frequencies in 147 the peripheral blood are comparable to those in a group of age-matched controls. In 148 contrast, the immune landscape of the airways remains altered, being marked by 149 residual lymphocytes.

6 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

150 151 The post-COVID19 airway immune landscape displays a limited relationship to 152 inflammatory biomarkers found during acute disease 153 154 Severe COVID19 has a distinctive clinical biomarker blood profile, characterized by 155 elevated inflammatory markers (including C-reactive (CRP) and ferritin), 156 elevated coagulation markers (D-dimer and fibrinogen), and lymphopenia (Guan et 157 al., 2020). At the time of hospitalization, our patients displayed a similar pattern of 158 clinical laboratory test abnormalities (Table S1). We therefore examined whether 159 there was any association between the peak levels (or nadir in the case of 160 lymphocyte count) of these clinical biomarkers measured during acute hospitalization 161 and the immune cell composition of the airways at 3 months following discharge. Of 162 the 5 biomarkers measured, ferritin correlated (Spearman rho > 0.4) with total airway 163 monocytes, specifically classical and intermediate monocytes, and , and 164 inversely correlated with AM (Table S2). Total airway B cells correlated with peak D- 165 dimer, while airway eosinophils correlated with lowest lymphocytes. None of the 166 biomarkers correlated with follow up airway T cell, NK cell or neutrophil numbers, and 167 fibrinogen and WCC showed no substantial correlation with any airway immune cell 168 analyzed. Taken together these data suggest that the post-COVID19 airway immune 169 cell composition shows a limited relationship with the severity of initial acute disease. 170 This observation is supported by the lack of differences in immune cell composition 171 when segregating between moderate, severe and very severe patients (Figure 1). 172 173 The post-COVID19 airway displays a proteome profile distinct from that found in the 174 circulation. 175 176 Since clinical biomarkers did not reveal a relationship between acute disease and 177 prolonged pathology and immune responses in the lungs post discharge, we next 178 evaluated the airway and blood (plasma) proteomes at follow-up. We used the Olink 179 proteomics platform to measure 435 unique proteins (Supplementary File 1A) in 19 180 post-COVID19 patients and 9 healthy volunteers. The proteins measured were highly 181 enriched for those involved in immuno-inflammatory processes (Supplementary File 182 1B-C). 183 184 Principal components analysis (PCA) of BAL fluid proteomes revealed differences 185 between post-COVID19 patient and healthy control samples (Figure 2A), with 186 separation of case and controls most evident along PC1. In plasma, PCA also

7 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

187 revealed differences in post-COVID19 patients and healthy controls, most evident 188 along PC2, although the differences appeared less marked than for BAL. However, in 189 both BAL and plasma there was considerable overlap in the spatial location of post- 190 COVID19 and healthy control samples in the PCA plots. Unsupervised hierarchical 191 clustering revealed two major clusters of samples in BAL, one consisting 192 predominantly of post-COVID19 samples and the other predominantly healthy 193 controls (Figure S2A), although, as with the PCA, some patient and control samples 194 were grouped together. In contrast, in plasma, there was no visible structure to the 195 clustering and lack of clear separation of cases and control samples (Figure S2B). 196 These analyses indicate that post-COVID19 syndromes are reflected to a much 197 greater extent in the airway proteome than the peripheral blood. They also suggest 198 that there is considerable heterogeneity in the BAL and blood proteomes of post- 199 COVID19 patients, with some patients displaying similar profiles to that of healthy 200 controls, despite persistent symptoms and radiological findings.

201 We next performed differential protein abundance analysis comparing post-COVID19 202 cases with healthy controls. In BAL fluid, we identified 22 proteins with significantly 203 altered concentration (5% false discovery rate, FDR) (Figure 2B-C, Supplemental 204 File 1D). These were all upregulated in post-COVID19 patients compared to healthy 205 controls (Figure 2C). In order to provide a succinct and standardised nomenclature, 206 we report proteins by the symbols of the encoding them (see Supplementary 207 File 1A for a mapping of symbols to full protein names). The proteins that were most 208 significantly differentially abundant between post-COVID19 patients and healthy 209 controls were: SERPINA7 (thyroxine binding globulin), DPP4 (dipeptidyl peptidase 4), 210 SERPINA5 (plasma serine protease inhibitor), KLK6 (kallikrein related peptidase-6), 211 LYVE1 (Lymphatic vessel endothelial hyaluronic acid receptor 1), AREG 212 (amphiregulin), F3 (factor 3), FLT3LG (Fms-related tyrosine kinase 3 ligand), QPCT 213 (glutaminyl-peptide cyclotransferase), and SRC (Proto-oncogene tyrosine-protein 214 kinase Src) (Figure 2C-D). Pathway annotation of the 22 upregulated proteins using 215 String-DB highlighted “leucocyte activation”, “regulation of cell death”, “response to 216 injury” and “response to wounding” (Supplemental File 1E). Next, we analyzed the 217 relationship between the 22 differentially abundant proteins and the airway immune 218 cell proportions. Neutrophils most strongly correlated with AREG and LDLR (low 219 density lipoprotein receptor), while monocyte proportions correlated with F3, 220 FLT3LG, MB (myoglobin) and IL1RN (IL-1 receptor antagonist protein) (Figure 2E). 221 T cells, despite being significantly elevated in the airways of post-COVID19 patients, 222 displayed only weak correlations with the differentially abundant proteins.

8 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

223 224 In contrast to BAL, no significant differences between protein levels were detected in 225 plasma in post-COVID19 patients versus healthy controls (Supplemental File 1F).

226 Comparison of the estimated log2 fold changes for the 22 proteins upregulated in

227 post-COVID19 BAL fluid with the estimated log2 fold changes for these same 228 proteins in plasma revealed no correlation (Figure S2C-D), indicating an absence of 229 any relationship between airway and plasma proteome changes. 230 231 The modest sample size and multiple testing burden of 435 proteins likely limited the 232 statistical power to detect differentially abundant proteins. To examine whether there 233 was evidence of signal in the proteomic data that was hidden by the hard- 234 thresholding in the differential abundance analysis, we examined quantile-quantile 235 (QQ) plots of the distribution of expected p-values under the null hypothesis of no 236 proteomic differences between cases and controls versus the observed p-values. For 237 both BAL and plasma, the QQ plots revealed substantial deviation from the diagonal 238 (albeit more so in BAL), indicating the presence of systematic differences between 239 post-COVID19 and healthy controls for plasma proteins as well as BAL proteins 240 (Figure S3A). Corroborating this, the distribution of p-values for the proteins was not 241 uniformly distributed, with skewing towards zero (Figure S3B). This is consistent with 242 the observation that there was separation of post-COVID19 and control samples on 243 the PCA plots for both BAL and plasma. In summary, these data suggests that there 244 are differences in both the BAL and plasma proteomes of post-COVID19 cases 245 compared to healthy controls, but that the effects are much stronger in BAL and this 246 study was underpowered to detect them in plasma.

247 To increase power, and investigate potential protein-protein relationships, we utilized 248 a network analysis method, Weighted Coexpression Network Analysis (WGCNA) 249 (Langfelder and Horvath, 2008; Zhang and Horvath, 2005), that leverages the 250 correlation between proteins to enable dimensionality reduction and thus reduce 251 multiple testing burden. We used WGCNA to identify modules of correlated proteins, 252 and then tested for association between these protein modules (represented 253 quantitatively by an eigenprotein value) and case/control status. In BAL fluid, this 254 revealed two modules (‘red’ and ‘blue’) associated with case/control status using a 255 5% FDR significance cut-off (Figure 3A, Supplementary File 1G-I). Applying a more 256 conservative Bonferroni corrected p-value threshold, only the red module remained

257 significant (PBonferonni 0.03).

9 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

258 The red module consisted of 37 proteins (Figure 3B, Supplementary File 1H). 259 Overall, the red module was characterized by proteins associated with chemotaxis, 260 inflammation, cell death and repair. Examination of the levels of proteins in the red 261 module across samples highlighted both upregulation of individual proteins in post- 262 COVID19 patients versus healthy controls, and the co-upregulation of groups of 263 related proteins such as the CXCR3 chemokines (CXCL9, CXCL10 and CXCL11), 264 and IL1A (interleukin-1A) and its antagonist IL1RN (Figure 3B). We used the 265 STRING database to visualize known or predicted relationships between proteins in 266 the module (Figure 3C, Methods). To highlight putative key proteins in the red and 267 blue modules in a data-driven way, we identified hub proteins, defined as those that 268 are highly interconnected in the proteomic network defined by WGCNA 269 (Supplementary File 1J). This analysis identified CASP3 (-3), EPCAM 270 (epithelial cell adhesion molecule), F3 and MB as hub proteins in the red module. 271 Notably, F3 and MB, an oxygen binding protein release of which is linked to muscle 272 damage, were also identified as upregulated in the univariate differential abundance 273 analysis (Figure 2B-C). CASP3 is a protein involved in cell death, EPCAM and 274 KRT19 (Keratin-19) are indicative of epithelial cell debris within the BAL, and TGFA 275 (transforming growth factor A) is an EGFR ligand involved in EC repair. The 276 presence of CASP3, EPCAM, KRT19 and TGFA in the red eigenprotein module 277 therefore suggests that one of the key features of the post-COVID19 airway is the 278 presence of ongoing epithelial injury and repair.

279 As with the red module, blue module proteins were predominantly upregulated in 280 post-COVID19 versus healthy control BAL (Figure S4A). The blue module was 281 larger than the red module, containing 108 proteins involved in a wide range of 282 biological activities. Several members were involved in cell adhesion and immune 283 cell signaling. The hub proteins in the blue module were CD93 (Complement 284 component C1q receptor), COMP (Cartilage oligomeric matrix protein), IGFBP3 285 (Insulin-like growth factor-binding protein 3), IL1R2 (Interleukin-1 receptor type 2), 286 LYVE1, MMP2 (72 kDa type IV collagenase), NCAM1 (Neural cell adhesion molecule 287 1), SELL (L-), TIE1 (Tyrosine-protein kinase receptor Tie-1), TNXB 288 (Tenascin-X) and VASN (Vasorin) (Figure S4B). Of these, LYVE1 and VASN were 289 also identified in the differential abundance analysis.

290 In contrast to the BAL network analysis, no protein modules in plasma were 291 associated with case-control status. These results suggest that persistent post- 292 COVID19 respiratory abnormalities have a demonstrable proteomic signature in BAL

10 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

293 that is distinct compared to that of healthy control BAL. In contrast, we were unable 294 to detect changes in the plasma proteome of post-COVID19 patients, even with the 295 enhanced statistical power provided by the WGCNA method. 296 297 CXCR3 ligands and signs of ongoing epithelial damage correlate with airway T cell 298 and monocyte responses 299 300 Given that the airways of post-COVID19 patients displayed a distinct proteome 301 alongside elevated numbers of T cells, B cells and NK cells, we next sought to 302 determine which specific proteins might be linked to distinct immune cell populations. 303 Testing for associations between BAL fluid proteins and immune cell composition in 304 post-COVID19 patients revealed several significant findings (Figure 4A and 305 Supplemental file 1K). The proportion of monocytes in the airways was significantly 306 associated with a range of airway proteins, including the CCR7 ligand CCL19, the 307 CXCR3 ligands CXCL9 and 11, TRAIL (TNFSF10), and BAFF (TNFSF13B) (Figure 308 4A). CXCL9 and 11 also positively correlated with lymphocyte and T cell frequencies 309 and negatively correlated with airway macrophage frequencies in the BAL (Figure 310 4A). T cell frequencies were also positively correlated with SH2D1A (otherwise 311 known as SLAM associated protein or SAP). 312 313 Analysis of combined data from healthy controls and post-COVID19 patients 314 revealed a composite score reflecting CXCL9, CXCL10 and CXCL11 levels, CXCL9, 315 CXCL10 and CXCL11, although no significant correlation was seen with CD3+CD56+ 316 NKT cells (Figure 4B). Within the post-COVID19 data set (as CD16 was not present 317 in historic flow data used for healthy controls), total monocyte frequencies also 318 correlated with average CXCR3 ligand expression (Figure 4C). Analysis of 319 subpopulations revealed the proportion that were intermediate (CD14+CD16+) 320 monocytes positively correlated with CXCR3 ligands, while CD14+ monocytes 321 displayed a negative correlation, and CD16+ monocytes displayed no correlation 322 (Figure 4C). T cell proportions in the airways correlated tightly with the concentration 323 of CD8a protein, but not CD4, in the BAL (Figure 4D), suggesting the increased 324 airway T cells are most likely the result of increased CD8+ T cell frequencies. 325 326 Given the robust correlation of the CXCR3 ligands, particularly with T cells, we next 327 sought to determine the relationship between T cell frequencies and other members 328 of the red module, specifically those indicative of ongoing epithelial damage in the 329 post-COVID19 airways. CD8a correlated strongly with the concentrations of CASP3

11 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

330 and EPCAM, concomitant with two of the differentially expressed proteins: MB and 331 DPP4 (Figure 4E). Collectively, these data suggest that proteins linked to the 332 recruitment of T cells, especially cytotoxic T cells, are strongly associated with 333 proteins that are both indicative of ongoing epithelial damage and upregulated in the 334 airways post-COVID19. 335 336 To further evaluate this and confirm the presence of increased damage in the post- 337 COVID19 airway, we measured DPP4, alongside 2 markers of damage not analyzed 338 through Olink, albumin and lactate dehydrogenase (LDH) in an expanded cohort of 339 healthy controls and post-COVID19 patients (Figure 4F). All 3 molecules were 340 significantly upregulated in the airways of patients post-COVID19 compared to 341 healthy controls, validating the observations made by Olink and confirming the 342 presence of ongoing damage within the respiratory tract in patients previously 343 hospitalized for COVID19. 344 345 Distinct myeloid populations are present in the BAL compared to the peripheral blood 346 but do not correlate with ongoing airway damage post-COVID19 347 348 Dysfunction of circulating classical monocytes is a hallmark of the initial, acute phase 349 of COVID19 infection (Mann et al., 2020). Given the correlations observed between 350 different airway monocyte populations and specific proteins in the BAL identified by 351 Olink, we further investigated the nature of airway and circulating myeloid cells in a 352 subgroup (n = 21) of post-COVID19 patients using high-parameter spectral 353 deconvolution cytometry. Unbiased FlowSOM analysis of the airway and circulating 354 myeloid cells revealed 10 distinct cell populations (Figure 5A), with marker analysis 355 suggesting these populations could be assigned to known subsets of monocytes or 356 dendritic cells (DCs). The circulation was dominated by CD14+ classical monocytes 357 (metacluster 1), which constituted most myeloid cells identified, while the airways 358 were more heterogenous, with the presence of intermediate monocytes, non- 359 classical monocytes and DCs (Figure 5B). Conventional gating of these myeloid 360 subsets (as shown in Methods Figure 2) confirmed these observations, with 361 classical monocytes being most abundant in the circulation, while CD16+ monocytes 362 were more frequent in the airways (Figure 5C). In comparison, DCs were identifiable 363 in both sites as pDCs, cDC1s and cDC2s, although were infrequent (Figure 5D). 364 Furthermore, airway monocyte activation status (as determined by CD86 and HLA- 365 DR expression) was not dependent on acute disease severity (Figure 5E). Likewise, 366 correlation of airway LDH and albumin concentrations revealed no strong

12 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

367 associations between the relative abundance of different myeloid cells and ongoing 368 damage in the respiratory tract after COVID19 (Figure 5F); nor was there any 369 association between these cell types and DPP4 concentrations. Taken together this 370 indicates that while myeloid cell frequencies in the airways are associated with a 371 specific airway proteome landscape, this likely has limited association with ongoing 372 tissue damage after SARS-CoV-2 infection. 373 374 The post-COVID19 airway is enriched for activated, tissue resident T cells 375 376 The post-COVID19 airway showed significantly increased T cell numbers (Figure 377 1C), associated with a proteome linked to ongoing epithelial damage and repair 378 (Figure 2). FlowSOM analysis of the airway and circulating lymphocytes indicated 379 that 12 metaclusters with distinct protein expression profiles were present (Figure 380 6A). Several metaclusters were unique to either the blood or the airways, with 381 specifically clusters 1 and 10 only present in the airways, and cluster 5 only present 382 in the blood (Figure 6B-C). Marker expression analysis identified these metaclusters 383 (as described in Figure 6B), with the differential clusters being identified as tissue- 384 resident memory CD4 and CD8 T cells (metaclusters 1 and 10), and naïve CD4 T 385 cells (metacluster 5). Manual gating of T cell populations showed that while the blood 386 was enriched for naïve CD4 and CD8 T cells, the airways contained populations of 387 antigen-experienced CD4 and CD8 T cells (Figure 6D). Additionally, the airways 388 contained increased proportions of activated and tissue-resident memory (TRM) CD4 389 and CD8 T cells compared to blood (Figure 6D). Proportions of γδ T cells and NK 390 cells in the airways and blood were comparable however there was a significant 391 increase in NKT cell proportions compared to blood (Figure 6E). Comparison of the 392 T cell dynamics in the post-COVID19 airways showed that airway CD4 T cells retain 393 a highly tissue resident phenotype (CD69+CD45RA-), with few naïve (CD45RA+) CD4 394 T cells observed, and the frequency of these 2 cell populations remains relatively 395 static, irrespective of the total number of CD4 T cells found in the airways (Figure 396 6F). In comparison, as CD8 T cell numbers increased in the airways, the proportion 397 displaying a CD45RA-CD69+ phenotype significantly increased, while those with a 398 CD45RA+ phenotype declined (Figure 6F). 399 400 In line with activated CD8 T cells, rather than activated CD4 T cells, being linked to 401 ongoing epithelial damage in the post-COVID19 airway there was a correlation (r > 402 0.4) between CD8, but not CD4, T cell proportions in the airways and the 403 concentrations of airway albumin and LDH (Figure 6G); this correlation was stronger

13 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

404 when only CD69+ CD8 T cells were analyzed (r = 0.5, p < 0.05 for both Albumin and 405 LDH). LDH most significantly correlated with the proportion of CD69+ CD103- CD8 T 406 cells in the airways (r = 0.7, p < 0.001), while albumin most significantly correlated 407 with the proportion of CD69+ CD103+ CD8 T cells in the airways (r = 0.6, p < 0.001), 408 highlighting distinct markers of damage were linked to different phenotypes of CD8 T 409 cells in the airway. Of note however, the CD8 T cell frequency did not correlate with 410 the concentration of DPP4 in the airways (Figure 6G). Deeper analysis revealed that 411 albumin concentrations were associated with reduced expression of molecules 412 associated with T cell activation on both CD103+ and CD103- CD69+ CD8 T cells, 413 most notable CD28 and CD38 (Figure 6H). LDH concentrations meanwhile were 414 linked to downregulation of CD127 and, especially on CD103+ CD69+ CD8 T cells, 415 but upregulation of CD69 and CD27 (CD69: r = 0.5, p < 0.05; CD27: r = 0.6, p < 0.05, 416 Figure 6H). In contrast individuals with higher concentrations of DPP4 in the airways 417 appeared to possess CD8 TRM with higher expression of CRTH2 and CD38 (Figure 418 6H). No relationship was seen between the expression of activation markers on the 419 surface of CD8 TRM and the severity of disease in hospital (Figure S5). Taken 420 together, these results highlight the connection between T cells in the airways, 421 specifically CD8 TRM, and ongoing epithelial damage after recovery from COVID19, 422 and indicate differential CD8 T cell activation status may be associated with types of 423 damage within the airways. 424 425 The enhanced airway immune cell frequencies seen post-COVID19 decline over time 426 427 Acute COVID19 can result in persistent respiratory symptoms, and the data here 428 highlight the presence of substantial immunological and proteome alterations in the 429 airways of patients hospitalized with COVID19 up to 6 months after discharge. To 430 determine if these changes to the airways reduced over time, we examined 3 431 patients with continued lung CT abnormalities greater than 1 year following discharge 432 (Table S3). The total number of BAL cells recovered was greatly reduced in all 3 433 patients between the initial bronchoscopy and the 1 year follow up bronchoscopy, 434 down to comparable levels in healthy control airways (Figure 7A). Similarly, numbers 435 of T cells, B cells, NK cells and NKT cells were reduced to nearly or within the normal 436 range seen in the airways of healthy individuals (Figure 7B). In the 2 individuals with 437 elevated lymphocytes the ratio of CD4 to CD8 T cells increased (Figure 7C). In 438 accordance with this the proportion of CD8, but not CD4, T cells trended to decrease, 439 although the proportion of each that were of a TRM or activated (CD69+) phenotype, 440 remained similar between the 2 time points (Figure 7D). Fitting with progressive

14 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

441 recovery trajectory airway DPP4 concentrations declined in the 2 patients with 442 elevated concentrations at the first bronchoscopy (Figure 7E), while CT abnormality 443 was also reduced in all 3 patients between the first CT and the follow up (Figure 7F). 444 Interestingly LDH was however increased between the first and second visit, while 445 albumin concentrations were unchanged, but also within the range of healthy controls 446 at both time points for all 3 patients analyzed (Figure 7G). 447 448 Collectively, our findings show ongoing changes to the immune and proteome 449 landscape of the airways, characterized by increased T cells and markers of 450 epithelial damage, several months after recovery from the acute phase of COVID19. 451 These changes, and lung pathology, do however appear to resolve over the longer (> 452 1 year) term. 453

15 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

454 Discussion 455 456 Recovery from acute SARS-CoV-2 infection is associated with a wide range of 457 persistent symptoms including ongoing respiratory pathology. Here we examine the 458 airway and circulating landscape of patients previously hospitalized with COVID19, 459 finding they possess a distinct proteome and immunological profile in their airways, 460 but not peripheral blood, 3 to 6 months post recovery from acute infection. While 461 there is substantial heterogeneity between patients, common upregulated signals 462 include proteins associated with ongoing cell death, epithelial damage and tissue 463 repair; features that correlate with the presence of increased numbers of activated 464 tissue resident CD8 T cells within the airways. Importantly, preliminary evidence 465 suggests this altered airway landscape does improve over the long term, with 466 reductions in airway immune cell numbers 1 year post discharge from hospital. 467 468 The acute response to SARS-CoV-2 infection is characterized by upregulation of a 469 wide diversity of plasma proteins including members of the IFN pathway and their 470 interferon stimulated genes (ISGs), chemokines, cytotoxic proteins, and markers of 471 epithelial damage (Arunachalam et al., 2020; Filbin et al., 2021; Gisby et al., 2021). 472 More severe disease is associated with increased concentrations of inflammatory 473 proteins such as IL-6, TNF, GM-CSF, IL-1RN and IL-18 (Arunachalam et al., 2020; 474 Filbin et al., 2021; Thwaites et al., 2021). A similar pattern of upregulated proteins, 475 especially chemokines like CXCL10 and cytokines such as IL-6, is seen in the 476 airways during acute COVID19 (Liao et al., 2020; Saris et al., 2021; Szabo et al., 477 2021). Here, we show that 3 to 6 months after SARS-CoV-2 infection, despite the 478 presence of ongoing respiratory morbidity, the majority of the plasma proteins 479 differentially expressed during acute disease appear to have returned to similar 480 concentrations to those seen in healthy controls, and even data dimension reduction 481 approaches such as WGCNA fail to highlight any significant associations between 482 COVID19 infection and the plasma proteome months later. 483 484 In contrast, the post-COVID19 airways continue to display an abnormal proteome. 485 This displayed both shared and distinct features to that seen in acute disease. 486 Proteins linked to inflammation featured less prominently than in acute COVID19; 487 instead, we observed airway upregulation of proteins involved in epithelial damage 488 and repair (e.g. the EGFR ligand AREG and the epithelial marker KRT19). Matrix 489 metalloproteinase-3 (MMP-3), which regulates the extracellular matrix (ECM), was 490 also differentially upregulated in the post-COVID19 airway. MMP3 and AREG are

16 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

491 both upregulated after influenza A virus (IAV) infection in vivo in mice, and in vitro in 492 human fibroblasts and epithelial cells (Boyd et al., 2020); and these proteins are both 493 linked to epithelial repair and fibrosis in the lungs e.g. (Morimoto et al., 2018; 494 Yamashita et al., 2011). 495 496 The elevated concentrations of both lactate dehydrogenase and albumin in the 497 airways provide further evidence of ongoing cell death and damage to the barrier 498 integrity within the respiratory tract post-COVID19. This observation is reinforced by 499 the upregulation of a module of proteins in the BAL of post-COVID19 patients whose 500 individual members link epithelial damage (EPCAM, KRT19), cell death (CASP3) and 501 epithelial repair (TGFA), but also suggest a connection between these processes and 502 factors involved immune cell recruitment and survival (CXCL9, CXC10, CXCL11, IL- 503 7). These markers of ongoing damage and repair tightly correlate with the frequency 504 of T cells, especially CD8 TRM in the airways. In mouse models of severe acute 505 respiratory virus infection CD8 T cells have long been known to act as a double- 506 edged sword. Although the cytotoxic molecules and cytokines they release are 507 essential for clearing virus, they can also cause tissue damage and immune 508 pathology (reviewed in (Duan and Thomas, 2016; Schmidt and Varga, 2018)). While 509 the presence of pre-existing virus specific CD8 TRM in the airways is thought to be 510 highly protective against a re-encounter with the same virus in both mice and man 511 (Jozwik et al., 2015; Wu et al., 2014) almost nothing is known regarding their role in 512 long-term respiratory viral pathology, especially in humans. This is primarily due to 513 the lack of relevant samples collected during the recovery period. However, the data 514 shown here for SARS-CoV-2 may provide a potential insight, supporting the concept 515 that sustained activation of CD8 TRM in the airways, long after recovery from acute 516 disease, may contribute to ongoing immune pathology through sustained damage to 517 the respiratory epithelium. 518 519 The mechanism behind the increased number of CD8 TRM in the airways is unclear, 520 although another study has also reported detecting virus specific CD8 T cells in lung 521 tissue up to a year post-infection (Grau-Exposito et al., 2021). Both virus specific 522 CD4 and CD8 T cells rapidly expand in the circulation and airways of patients 523 following SARS-CoV-2 infection, leading to the formation of TRM (Szabo et al., 524 2021). The majority of these activated effector T cells rapidly contract after resolution 525 of acute disease. In mice Slutter et al have shown that lung CD8 TRM are more 526 apoptotic than CD4 TRM after IAV infection, resulting in a more rapid decline in their 527 number. They also observed that the lungs of a mouse that had previously

17 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

528 experienced IAV infection more robustly maintained CD8 TRM compared to an 529 uninfected lung, showing that severe infection promotes a pro-TRM niche (Slutter et 530 al., 2017). This fits with our observation that while CD8 TRM numbers dynamically 531 change dependent on the concentration of damage and proteins in the airways, and 532 indeed longitudinally in the same individuals, CD4 TRM remain relatively static. A 533 number of factors likely contribute to the heterogeneity of the CD8 TRM niche in the 534 post-COVID19 airway. Firstly, while all our post-COVID19 samples were taken from 535 patients who tested negative for SARS-CoV-2 by qPCR immediately prior to 536 bronchoscopy, persistent antigen has been observed months after other respiratory 537 infections such as IAV (Kim et al., 2010), and antigen depots in SARS-CoV-2 could 538 drive ongoing cytotoxic activity and maintenance of CD8 TRM. Secondly, the 539 persistence of lung resident TRM is also reliant on the availability of local T cell 540 survival signals such as IL-7 (Szabo et al., 2019) and the CXCR3 ligands, which are 541 known to be involved in airway recruitment and retention of CD8 memory T cells in 542 murine IAV infection (Slutter et al., 2013). We found that both IL-7 and the CXCR3 543 ligands are part of the protein network that is maintained in the post-COVID19 544 airway. These proteins display a heterogeneous pattern of upregulation across post- 545 COVID19 patient samples (Figure 3B) and correlate with airway T cells and CD8a 546 concentrations. Lastly there is also some evidence of ongoing aberrant inflammation 547 after acute infection and the development of autoantibodies in some patients recently 548 recovered from COVID19 (Lucas et al., 2020; Wang et al., 2020). It is likely that 549 these different mechanisms act in concert to shape the number and function of CD8 550 TRM, and other immune cells, in the post-COVID19 airway, and the scale and 551 duration of ongoing epithelial damage and respiratory pathology observed. 552 553 One factor that does not appear to influence ongoing T cell recruitment and damage 554 in the post-COVID19 airway is the activation status of the myeloid compartment, in 555 particular monocytes. Functional impairment of monocytes and DCs in the peripheral 556 blood of acutely infected patients (Arunachalam et al., 2020; Laing et al., 2020; Mann 557 et al., 2020), and hyperactivation of airway monocyte populations, are canonical 558 features of severe acute SARS-CoV-2 (Liao et al., 2020; Szabo et al., 2021). 559 Moreover, early T cell phenotypes have been strongly linked to the status of these 560 monocytes (Filbin et al., 2021; Laing et al., 2020; Szabo et al., 2021). In our post- 561 COVID19 patients, however, the monocyte and myeloid pools in both the airways 562 and peripheral blood appear to have returned to relatively normal numbers and their 563 frequency does not correlate with indicators of ongoing epithelial damage. Likewise, 564 their activation status, as determined by HLA-DR and CD86 expression, is not

18 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

565 substantially different between individuals in line with other reports on circulating 566 monocytes from earlier stages of convalescence (Scott et al., 2020). 567 568 The airway monocyte frequency is, however, one of the few features to correlate with 569 a clinical biomarker of severity in acute disease, serum ferritin. High serum ferritin is 570 a prognostic marker used at presentation with acute COVID19 and linked to more 571 severe acute radiological findings (Carubbi et al., 2021; Ruan et al., 2020). Ferritin, 572 and transferrin, the protein responsible for transporting ferritin, are core components 573 of iron metabolism, and their dysregulation is associated with a range of lung 574 pathologies including decreased lung function and lung fibrosis (Ali et al., 2020; Lee 575 et al., 2020). In the respiratory tract the expression of CD71, the receptor that 576 scavenges transferrin from the environment, on alveolar macrophages is altered in 577 chronic respiratory disease (Striz et al., 1993), and the presence of CD71- alveolar 578 macrophages, possessing both an immature and pro-fibrotic transcriptional 579 signature, has been linked to pulmonary fibrosis (Allden et al., 2019). In humans, 580 monocytes are recruited to the airways to differentiate into new AMs (Byrne et al., 581 2020), and acute ferritin could be acting as a biomarker of severity of disruption to 582 AMs. This link is however challenging to elucidate, and it is possible that ferritin is 583 simply a proxy for the strength of the acute inflammatory response. 584 585 To our knowledge, this is the first study exploring the airway immune and proteomic 586 profiles in COVID19 patients between 3 months and 1 year post discharge from 587 hospital, and their links to ongoing respiratory pathology. Moreover, it is one of the 588 first studies to explore the human airway immune-proteome landscape after a 589 substantial period of time following any severe respiratory virus infection. Since all 590 samples were obtained more than 3 months post hospital discharge, when one might 591 expect normalization of the immune profile, the persistent immune abnormalities offer 592 possible explanations into the longevity of persistent respiratory morbidity post 593 SARS-CoV2 infection. In addition, these findings may be relevant to those suffering 594 long-term sequalae during convalescence from other viral pneumonias where data 595 are currently very limited. 596 597 We highlight a number of potential limitations of our study. Firstly, our data showing 598 immunological and proteomic changes in the BAL of post-COVID19 are generated on 599 patients undergoing clinically indicated bronchoscopy because of persistent 600 respiratory abnormalities. Whether our findings extend to individuals with no 601 radiological abnormalities or respiratory symptoms post-COVID19 remains unknown.

19 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

602 This selection bias also affects longitudinal sampling greater than 12 months post- 603 COVID19, since the majority of patients initially sampled between 3-6 months post- 604 COVID19 had shown sufficient improvement in respiratory pathology such that a 605 follow-up bronchoscopy was not indicated. The progressive resolution of radiological 606 abnormalities in the majority of post-COVID19 patients has been observed by others 607 (Han et al., 2021), and importantly within our study even the 3 patients with ongoing 608 pathology have a significantly improved CT and reduced immune cell infiltration 609 within their airways. This fits with the hypothesis that SARS-CoV2 infection can result 610 in organizing pneumonia, with subsequent changes reflecting ongoing epithelial 611 damage and healing parenchyma rather than established fibrosis (Kory and Kanne, 612 2020). This is both compatible with autopsy findings during the acute disease 613 (Wichmann et al., 2020) and the steroid-responsive nature of the acute pathology 614 (Horby et al., 2021). 615 616 Although we did not detect a plasma proteomic signature post-COVID19, this is likely 617 due to our limited sample size not being powered to detect small differences in 618 circulating proteins between post-COVID19 patients and healthy controls. 619 Examination of the distributions of p-values suggests that such differences may exist 620 (Figure S3) but will likely require future much larger cohort studies to reveal them. 621 Regardless, the absence of any correlation between the differentially expressed 622 proteins in the airways and their corresponding changes in the plasma points to the 623 limited utility of peripheral blood as an indicator of the pathological processes 624 ongoing in the lung. 625 626 Finally, as with most studies, we were limited to sampling the airways post-infection 627 and so did not have paired pre-infection samples for intra-individual comparisons. 628 Therefore, it is possible that some differences observed between healthy controls 629 and post-COVID19 patients could reflect a pre-infection phenotype that predisposed 630 them to developing prolonged sequelae. Indeed, one of the most differentially 631 expressed proteins in the airways, DPP4, is the binding receptor for another 632 coronavirus MERS (Raj et al., 2013), and postulated to be capable of mediating 633 some SARS-CoV-2 binding (Li et al., 2020). Thus, it is conceivable that pre-existing 634 upregulation of DPP4 increased susceptibility to post-COVID19 syndrome via 635 increased viral entry (i.e. reverse causation), rather than DPP4 upregulation 636 occurring in response to COVID19. However, the longitudinal reduction of DPP4 637 argues against this hypothesis. More generally, the majority of proteins and markers 638 upregulated are associated with ongoing lung pathology in other contexts (e.g. LDH),

20 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

639 and are absent or only present at very low concentrations in the healthy airway, 640 suggesting that their upregulation is more likely to be a consequence of COVID19 641 than a pre-disposing risk factor. 642 643 In summary, our study offers unique and novel insights into ongoing 644 immunopathology post-COVID19. In contrast to the inflammatory pathways observed 645 during acute disease, we found proteins associated with ongoing epithelial damage, 646 cell death, and repair were upregulated in conjunction with ongoing CD8 T cell 647 activation. These data indicate that significant immune pathways operate within the 648 tissue, presumably to facilitate repair and resolution, even in the absence of 649 peripheral inflammation. In the future it will be important to determine how such a 650 substantial shift in the immune landscape of the airways might affect the response to 651 a subsequent respiratory infection such as seasonal influenza. The progressive 652 improvement in respiratory pathology, even in this cohort with substantive 653 radiological abnormalities should be seen as a positive indicator for the long-term 654 prognosis for those with persistent morbidity. Moreover, the involvement of the 655 immune response suggests this recovery could be accelerated using immuno- 656 modulatory treatments, especially those designed against TH1 immune responses. 657

21 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

658 Funding 659 This work was supported by a Wellcome Trust Senior Fellowship 107059/Z/15/Z to 660 C.M.L., a Rosetrees Seed Fund to P.L.M. and J.A.H. (A2172) and an Imperial 661 College Healthcare NHS Trust BRC award to J.A.H. (RDF04). B.V. is funded by the 662 Joint Research Committee on behalf of a joint collaboration between CW+ and 663 Westminster Medical School Research Trust. K.B. is funded by an Asthma UK PhD 664 Studentship as part of the Asthma UK centre in allergic mechanisms of asthma 665 (AUK-BC-2015-01). R.J.S. is supported by a Wellcome Trust Senior Fellowship 666 (209458/Z/17/Z). J.E.P. is supported by a UKRI COVID19 Rapid Response Rolling 667 Call (MR/V027638/1), the Imperial College London Community Jameel and the 668 Imperial President's Excellence Fund, and a UKRI Innovation Fellowship at Health 669 Data Research UK (MR/S004068/2). 670 671 Acknowledgements 672 We thank Jack Gisby for statistical and analytical advice and Simone Walker for 673 technical assistance. We also thank the staff of Sir Alexander Fleming Building Flow 674 Cytometry Facility at Imperial College London for assistance with flow acquisition and 675 analysis. The schematic in Figure 1A was made using Biorender. We also thank Dr 676 Ryan Thwaites and Professor Peter Openshaw, Imperial College London, for 677 critically reading this manuscript before submission. 678 679 Author contributions: 680 BV and PLS designed the PHENOTYPE study and were involved in recruitment of 681 the post-COVID19 cohort. PLS, CML, JEP and JAH designed and conceived the 682 current study. JAH and CML co-ordinated the study. BV, JT, CO, JG and PLS 683 collected clinical samples and patient information for the post-COVID19 cohort. RJH 684 and PLM designed and were responsible for recruitment and sampling of the heathy 685 control cohort. BV, KB and PPO processed clinical samples for the post-COVID19 686 cohort. PG processed clinical samples for the heathy control cohort. BV, KB, PPO 687 and PG conducted flow cytometry and mediator measurements. KS measured LDH 688 and albumin. KB and PPO analysed the flow cytometry analysis. AP analyzed the 689 Olink proteomics data set. PS supervised the clinical aspects of the project, JEP 690 supervised the bioinformatic analysis. RJS, PLM, CML and JAH supervised the 691 laboratory preparation of samples and flow cytometry and mediator measurement 692 analysis. BV, KB, PPO, AP, JEP, PLS, CML and JAH were involved in data 693 interpretation. JAH wrote the initial draft of the manuscript and BV, KB, PPO, AP, 694 JEP, PLS and CML carried out major writing of the paper. All the authors reviewed

22 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

695 and approved the submitted manuscript. For information on proteomics contact 696 James E Peters, [email protected]; For clinical information contact Pallav L 697 Shah, [email protected]; For epithelial cell biology and repair contact Clare 698 M Lloyd, [email protected]; For immune cell phenotyping contact James A 699 Harker, [email protected]. 700

23 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

701 Figure legends 702 703 Figure 1: Increased airway lymphocytes in the 3 to 6 months post COVID-19 704 (A) Schematic of study design. BAL and blood were sampled from healthy donors 705 and at > 80 days after hospital discharge from post-COVID19 patients. Traditional 706 and spectral flow cytometry and Olink proteomics was performed on BAL and plasma 707 and correlated with clinical parameters. (B) Left: Total number of cells in BAL from 708 healthy controls and post-COVID19 patients. Right: total number of cells in BAL from 709 post-COVID19 patients, stratified according to severity of the acute illness. (C) Total 6 710 cell numbers of immune populations (x10 /ml) in BAL from healthy controls and post- 711 COVID19 patients, based on gating shown in Methods Figure 1. 712 Data are presented as mean ± SEM. Healthy controls n = 16, post-COVID19 patients 713 n = 21, moderate n = 6, severe n = 8, very severe n = 7. Statistical significance was 714 tested by Mann Whitney U test or Kruskal Wallis test + Dunn’s multiple comparison 715 test. *P < 0.05, **P < 0.01, ***P < 0.005, ****P < 0.001. 716 717 Figure 2: A distinct proteome is present in the post COVID-19 airway 718 436 proteins in BAL and plasma 435 proteins were measured using Olink 719 immunoassays in post-COVID19 patients (n = 19) and healthy controls (n = 9). (A) 720 Principal component analysis (PCA) of BAL and plasma proteomes: each point 721 represents a sample. (B) Left: heatmap displaying Z-score normalised protein 722 abundance for the 22 proteins that were significantly differentially abundant (5% 723 FDR) between post-COVID19 and healthy controls in BAL. Samples have been 724 ordered by case control status and then by peak severity during acute COVID-19 725 infection. Proteins are ordered by hierarchical clustering. Right: heatmap for these 726 same 22 proteins in plasma, presented in the same order as for BAL. (C) Volcano 727 plot showing differentially protein abundance analysis between post-COVID19 728 patients and healthy controls in BAL. Nominal –log P values are shown. 10 729 Significantly differentially abundant proteins (5% FDR) are coloured in red and 730 labelled. (D) BAL and plasma normalised protein abundance (NPX) expression for 731 the 5 most significantly differentially abundance proteins between post-COVID19 732 patients and healthy controls. P = Benjamini-Hochberg adjusted p-values. (E) BH 733 Correlation between the 22 differentially abundant proteins and immune cell 734 frequency in BAL. 735

24 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

736 Figure 3: A network of proteins linked to immune cell chemotaxis and cell 737 death is upregulated in the BAL post-COVID19 738 WGCNA of BAL proteome of post-COVID19 patients and healthy controls. (A) 739 Associations of protein modules identified by WGCNA with case-control status (post- 740 COVID19 or healthy). (B) Heatmap displaying Z-score normalised protein abundance 741 for the 37 proteins that form the ‘red’ protein module. Samples are ordered according 742 to clinical status. Severity refers to peak severity of the acute COVID19 episode. 743 Proteins are ordered by hierarchical clustering. (C) Right: Network representation of 744 proteins in the red module and their interconnections defined using String-db. An 745 edge in the network represents a relationship between proteins, coloured according 746 to the type of evidence for the connection (see Methods). Left: list of hub proteins 747 within the network (with hub proteins that were also significantly differentially 748 upregulated in post-COVID19 patient BAL highlighted in red). 749 750 Figure 4. CXCR3 ligands and markers of epithelial damage correlate with CD8 T 751 cells in the airways 752 BAL immune cells and protein concentrations were analysed post-COVID19 753 infection. (A) Linear regression analysis was conducted between n = 435 proteins 754 measured in the BAL using the Olink platform and BAL immune cell frequencies 755 identified by flow cytometry as shown in Figure 1. A heatmap showing proteins with 756 the highest correlation versus 5 major immune cell frequencies is shown. (B-C) For 757 each sample, protein levels for CXCL9, -10 and -11, were normalised to the median 758 level in healthy controls. For each sample, the mean of the normalised values for the 759 3 proteins was calculated to provide a summary metric for CXCR3 chemokines. This 760 was then plotted against versus (B) T, NK and NKT proportions in post-COVID19 761 patients and healthy controls and (C) monocyte frequencies and subsets in post- 762 COVID19 patients only. (D) BAL T cell frequency versus CD4 and CD8a 763 concentrations as measured by Olink. (E) CD8a concentration versus CASP3, 764 EPCAM, MB and DPP4 in the airways. (F) DPP4, albumin and LDH concentrations in 765 the BAL determined by ELISA. Data are presented as median ± IQR. (A) Pearsons 766 correlation of n = 19 post-COVID19 patients, the r value is shown. (B-E) Each point 767 represents an individual patient, linear regression line +/- 95% confidence intervals 768 are depicted, and r and p values from Pearsons correlation are stated. (F) represents 769 n = 38 post-COVID19 and n = 20 healthy control individuals. Statistics were 770 conducted using Mann-Whitney U test. *P < 0.05, **P < 0.01, ***P < 0.005, ****P < 771 0.001. pCOVID = post-COVID19. 772

25 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

773 Figure 5. Myeloid cell frequencies are not linked to indications of ongoing 774 damage in the post-COVID19 airway 775 BAL and blood immune cells from post-COVID19 patients were analyzed by spectral 776 deconvolution cytometry. (A) Heatmap of normalised median expression of myeloid 777 cell modulatory and subset markers by clusters of myeloid cells in the airways 778 identified by FlowSOM analysis. (B) Violin plot showing frequencies of each cluster in 779 BAL and blood. (C) Violin plots showing classical, non-classical and intermediate 780 monocyte subsets as proportions of live leukocytes in BAL and blood identified by 781 manual gating. (D) Violin plots showing pDC, cDC1 and cDC2 cell proportions in BAL 782 and blood. (E) Geometric mean fluorescence intensity of activation markers + 783 expressed by CD11c monocytes. (F) Pearson’s correlation between BAL myeloid 784 subsets (as a % of leukocytes) and airway LDH, albumin and DPP4. Data are 785 presented as median ± IQR. Each point represents an individual patient. Statistical 786 significance for (B), (C) and (D) was tested by Mann-Whitney U test. *P < 0.05, **P < 787 0.01, ***P < 0.005, ****P < 0.001. 788 789 Figure 6. The airways of post-COVID19 patients are enriched for activated CD8 790 TRM cells 791 BAL and blood immune cells from post-COVID19 patients were analyzed by spectral 792 deconvolution cytometry.(A) Heatmap of normalised median expression of T cell 793 modulatory and subset markers by clusters of T cells in the airways identified by 794 FlowSOM analysis. (B) Violin plot showing frequencies of each cluster in BAL and 795 blood. (C) tSNE projection of clusters identified by FlowSOM analysis in BAL and 796 blood. (D) Violin plots showing CD4+ and CD8+ T cell subsets as proportions of all T 797 cells in BAL and blood identified by manual gating. (E) Violin plots showing gd, NK, 798 NKT and MAIT cell proportions in BAL and blood. (F) Pearson correlations between + + 799 total airway CD4 and CD8 T cell numbers and subsets. (G) Heatmap depicting 800 Pearson correlation between different T cell populations (as a % of BAL leukocytes) 801 and the concentration of DPP4, LDH and Albumin in the airways. (F) Pearson 802 correlations between the gMFI of LDH and albumin. Data are presented as median ± 803 IQR. Each point represents an individual patient. (A, B, E and F) represent n = 20 804 individuals. (C) represents an equal number of cells from n = 10 patients combined. 805 (G & H) represents n = 18 individuals. Statistical significance for (B), (D), (E) and (G) 806 was tested by Mann-Whitney U test. *P < 0.05, **P < 0.01, ***P < 0.005, ****P < 807 0.001. 808

26 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

809 Figure 7. Reduced cellularity in the airways one year after initial bronchoscopy 810 post-COVID19 811 (A) Total cell counts (left) and proportions of lymphocytes (right) in the BAL following 812 first bronchoscopy and one year follow-up bronchoscopy. (B) Cell counts of 813 lymphocyte populations in the BAL following first bronchoscopy and one year follow- 814 up bronchoscopy. (C) Cell counts of myeloid populations in the BAL following first 815 bronchoscopy and one year follow-up bronchoscopy. (D) Proportions of T cell 816 subsets in the BAL following first bronchoscopy and one year follow-up 817 bronchoscopy. (E) LDH (left) and albumin (right) measurements in BAL following first 818 year bronchoscopy and one year follow-up bronchoscopy. (F) Percentage of 819 abnormal CT following first year CT and one year follow-up CT. (G) BAL LDH and 820 albumin quantification following first year bronchoscopy and one year follow-up 821 bronchoscopy. Each point represents a single patient. Green shading indicates 822 median+/-IQR for proportions of populations and mediator levels observed in healthy 823 airways. 824 825

27 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

826 Methods 827 Post-COVID-19 patient recruitment 828 Post-COVID19 bronchoalveolar lavage fluid (BAL) was obtained from patients 829 recruited to the PHENOTYPE study (NCT 04459351), an observational, longitudinal 830 study recruiting patients at Chelsea and Westminster Hospital, London. 38 samples 831 were collected from patients requiring sampling for clinical purposes. Ethical approval 832 for the study was given by Yorkshire & The Humber - Sheffield Research Ethics 833 Committee (IRAS 284497). 834 835 Patients who met the inclusion and exclusion criteria were recruited to the 836 PHENOTYPE study: 837 Inclusion criteria for the study were: 838 • Aged 18 years or older 839 • Previous confirmed COVID-19 infection (positive PCR or antibody) 840 • Attending a respiratory follow-up outpatient appointment for follow-up of 841 persistent respiratory symptoms following visit post hospital attendance 842 with COVID-19. infection or referred by the community for covid-related 843 symptoms. 844 845 Patients were seen at approximately 4-6 weeks (Visit 1) and 3 months (Visit 2) 846 following discharge from hospital or referral (if referred from the community). Patients 847 underwent clinical assessment at both visits, including collection of demographic 848 data, clinical history and clinical examination and assessment of vital 849 parameters (heart rate, peripheral oxygen saturations, blood pressure reading 850 and temperature). They also underwent clinical blood tests (including full blood count, 851 renal function, liver function, C-reactive protein (CRP), ferritin, fibrinogen, D-dimer 852 and pro-calcitonin). Patients had a computed tomography (CT) scan of the lungs 853 approximately 3 months post discharge from hospital. In patients with abnormal CT 854 findings, or persistent respiratory symptoms, a bronchoscopy and lavage was 855 performed as part of clinical work-up. Patients underwent formal lung function tests 856 (including spirometry, lung volumes and gas transfer) near the time of the 857 bronchoscopy (usually during the days immediately preceding the procedure). 858 Further follow-up was determined on the basis of clinical need, with a maximum 859 follow up period of up to 2 years post hospital discharge or referral. 860 861 Post-COVID19 bronchoalveolar lavage (BAL) sampling

28 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

862 Bronchoscopy was performed under conscious or deep sedation. 150 ml of normal 863 saline were instilled into the most affected segment (as determined by CT imaging), 864 in 50 ml aliquots. 10 ml of fluid return was used for the scientific analysis described in 865 this paper. 866 867 Healthy volunteer recruitment and sampling 868 Control, uninfected bronchoalveolar lavage was obtained from healthy donors 869 (collected between April 2016 and December 2019). Ethical approval for the study 870 was granted by the Research Ethics Committee (15/SC/0101) and all patients 871 provided informed written consent as described previously (Allden et al., 2019; Byrne 872 et al., 2020; Invernizzi et al., 2021). Briefly, 240 ml aliquots of warmed sterile saline 873 were instilled in the right middle lung and aspirated by syringe. Lavage aliquots were 874 pooled for each subject. All subjects provided written, informed consent to participate 875 in the study. Healthy volunteers had no self-reported history of lung disease, an 876 absence of infection within the last 6 months and normal spirometry. 877 878 BAL processing 879 BAL samples were processed and stained on the day of sample collection. BAL was 880 strained through a 70μm filter and subsequently centrifuged (1800 rpm, 2 min, 4°C). 881 Supernatant was snap-frozen and stored at -80°C. Pellets were incubated in red

882 blood cell lysis buffer (155mM NH4Cl, 10mM KHCO3, 0.1mM 883 ethylenediaminetetraacetic acid, pH 7.4) for 10 minutes before washing and 884 resuspension in complete media (RPMI 1640 with 10% fetal calf serum, 2mM L- 885 glutamine, 100U/ml penicillin-streptomycin). 886 887 Blood processing 888 Peripheral blood was collected in EDTA coated vacutainers on the same day as 889 bronchoscopy. 1ml blood was centrifuged at 100g for 10 minutes (4°C), followed by 890 centrifugation at 20,000g for 20 minutes (4°C) to separate plasma, which was 891 subsequently stored at -80°C. 2ml blood from post COVID-19 patients was incubated

892 with red blood cell lysis buffer (155mM NH4Cl, 10mM KHCO3, 0.1mM 893 ethylenediaminetetraacetic acid, pH 7.4) for 10 minutes before washing and 894 resuspension in complete media (RPMI 1640 with 10% fetal calf serum, 2mM L- 895 glutamine, 100U/ml penicillin-streptomycin). 2.5 ml blood from healthy controls was 896 used to isolate peripheral blood mononuclear cells (PBMC) by Percoll density 897 centrifugation, as per manufacturer’s instructions.

29 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

898 899 Flow cytometry staining 900 For traditional flow cytometry, 2 - 5 x105 cells were plated, while for high parameter 901 analysis using the Cytek Aurora 1 x 106 cells from each site were used. Cells were 902 washed with PBS and incubated with either near-infrared (traditional flow cytometry) 903 or blue (Cytek Aurora) fixable live/dead stain (Life Technologies), as per the 904 manufacturer’s instructions. Before incubation with human fc block (BD Pharmingen) 905 cells were washed with FACS buffer (1% FCS, 2.5% HEPES, 1mM EDTA) and 906 surface staining was performed at 4°C for 30 minutes using antibody panels as 907 described in the Key Resources Table. Surface staining was followed by washing 908 with FACS buffer and fixation with 1% paraformaldehyde for 10 minutes. Labelled 909 cells were acquired on a 4-laser BD Fortessa (traditional flow cytometry; BD 910 Bioscience) or 5-laser Cytek Aurora flow cytometer (Cytek Bio). 911 912 Flow cytometry analysis 913 Conventional flow cytometry data was analysed using FlowJo v 10.6 (Tree Star). 914 Data was pre-gated to exclude doublets and dead cells. In BAL samples CD45+ cells 915 were selected, and immune cell populations were identified using the gating strategy 916 shown in Methods Figure 1. Percentages of the CD45+ gate were calculated. In 917 blood samples, leukocytes were selected based on FSC and SSC and immune cell 918 populations were identified using the gating strategy shown in Methods Figure 1. 919 Percentages of total leukocytes were calculated. High-parameter spectral 920 deconvolution flow cytometry data from the Cytek Aurora was analysed using 921 Cytobank (Beckman). tSNE analysis was performed on 300,000 events from 11 files. 922 Iteration number was set to 1500 with a perplexity of 30 and theta of 0.5. FlowSOM 923 analysis was performed subsequently using hierarchical consensus clustering with 924 12 metaclusters, 100 clusters and 10 iterations. Manual gating of high parameter 925 cytometry data was carried out as shown in Methods Figure 2. Heatmaps were 926 generated from median fluorescence values in Prism 9.0 (GraphPad). 927 928 Quantification and statistical analysis for flow cytometry 929 Differences in means between two sample groups were compared using two-tailed 930 Mann-Whitney U tests. Multiple group comparisons were done using Kruskal Wallis 931 ANOVA followed by Dunn’s post-test. Spearman-Rank correlations were used to 932 correlate clinical blood parameters with immune cell populations. Analysis was 933 performed in GraphPad Prism. For all figures, * denotes p value < 0.05, ** denotes p 934 value < 0.01 and *** denotes p value < 0.001.

30 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

935 936 Proteomic assays 937 Plasma and BAL proteomic measurement was performed using the Olink proximity 938 extension immunoassay platform. Five 92-protein multiplex Olink panels were used 939 (‘Inflammation’, ‘Immune Response’, ‘Cardiometabolic’, Cardiovascular 2’, 940 ‘Cardiovascular 3’), providing measurements of 460 protein targets per sample. 941 Cryopreserved BAL and plasma samples were thawed on ice and mixed well by 942 pipetting before plating 88 samples per plate ensuring case/control balance and 943 random well ordering to prevent confounding of technical and biological effects. For 944 BAL samples, a pilot study was performed using three control samples and three 945 post-COVID19 samples (severe group) to determine optimal dilution parameters. 946 Ultimately BAL was used neat. Since a small number of proteins were assayed on 947 more than one panel, we measured a total of 435 unique proteins. We removed 948 duplicate assays at random prior to subsequent analyses. 949 950 Proteomics analyses 951 Proteomic data was normalised using standard Olink workflows to produce relative 952 protein abundance on a log2 scale (‘NPX’). BAL and plasma proteomic data were 953 normalised separately. Quality assessment was performed by (1) examination of 954 Olink internal controls and (2) inspection of boxplots, relative log expression plots, 955 and PCA. 956 PCA was performed using singular value decomposition. Following these steps, 2 957 clear outlying samples were removed from the BAL dataset. 958 959 To identify proteins that were differentially abundant between case and controls, for 960 each protein we performed linear regression (lm function in R) with case/control 961 status as the independent variable and protein level (NPX) as the dependent 962 variable. P-values were adjusted for multiple testing using the Benjamini-Hochberg 963 procedure (p.adjust function in R). A 5% false discovery rate was used to define 964 statistical significance. 965 966 We used the WGCNA R package (Langfelder and Horvath, 2008; Zhang and 967 Horvath, 2005) to create a weighted protein correlation network. Prior to WGCNA 968 analysis, protein data were scaled and centred, and missing data were imputed using 969 the R caret package. We used the WCGNA adjacency function to produce a weighed 970 network adjacency matrix, using parameters “type=signed” and “power=13”. This 971 soft-thresholding power was selected as the lowest power to achieve approximate

31 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

972 scale-free topology. We next defined a topological overlap matrix of dissimilarity 973 using the TOMdist function. Clusters (‘modules’) of interconnected proteins were 974 identified using hierarchical clustering and the cutreeDynamic function with 975 parameters: method=“hybrid”, deepSplit=2, minClusterSize=15. We then tested 976 association of these modules with case/control status. Multiple testing correction was 977 performed to account for the number of modules. We report both Benjamini- 978 Hochberg and Bonferroni adjusted p-values to provide two levels of stringency. 979 980 To assess the distribution of p-values from the differential protein abundance 981 analyses, we plotted histograms and constructed QQ plots. QQ plots were made by 982 comparing the expected distribution of -log10 P values under the null hypothesis of 983 no proteomic differences between post-COVID19 patients and controls to the 984 observed p-values for the 435 proteins. 985 986 We performed pathway enrichment analysis for the 435 proteins measured. This was 987 performed using terms from KEGG database (Supplementary File 1B) and the 988 Reactome database (Supplementary File 1C). 989 990 Protein modules were visualised using STRING (https://string-db.org/), with known or 991 suspected interconnections between module members displayed as edges in a 992 network diagram. An edge represents a protein-to-protein relationship defined as 993 shared contributions to a particular function, and not necessarily implying physical 994 binding. In Figure 3C, edge colour indicates the type of evidence for the relationship: 995 turquoise represents known interactions from curated databases; magenta 996 represents experimentally determined interactions; green represents predicted 997 Interactions from neighbourhood analysis; red represent predicted interactions 998 from gene fusions, blue represent predicted Interactions from gene co-occurrence; 999 green represents interaction from text-mining; black represents interaction from 1000 co-expression data, and violet represents information from protein . 1001 1002 CXCR3 chemokine composite score 1003 To create a composite score that reflected the CXCR3 chemokines (CXCL9, 1004 CXCL10 and CXCL11), we used the following approach. For each sample, protein 1005 levels for CXCL9, -10 and -11, were normalised to the median level in healthy 1006 controls (to avoid unduly weighting the score towards chemokines with higher NPX 1007 values). For each sample, the mean of the normalised values for the 3 proteins was 1008 then calculated to provide a summary metric for CXCR3 chemokines.

32 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

1009 1010 1011 Quantification of mediators in BAL 1012 DPP4 (R&D systems, DY1180) and albumin (Bethyl Laboratories, E80-129) 1013 concentrations in the BAL were quantified by ELISA according to manufacturer’s 1014 instructions. LDH concentrations were quantified using an in vitro toxicology assay 1015 (Sigma, TOX7). Briefly, 25μl of BAL sample were incubated with 50μl of LDH assay 1016 reaction mixture. After 30 minutes, the reaction was stopped with 7.5μl 1N HCL and 1017 absorbance was measured at 490nm with background correction at 690nm. All 1018 absorbances were measured using a SpectraMax i3x (Molecular Devices). 1019 1020 Key Resources Table Reagent or Resource Source Identifier Dilution Antibodies Anti-Human CD69, BD Biosciences 564364 1:20 BUV395 Anti-human CD8, BD Biosciences 612942 1:100 BUV496 Anti-Human CD45RA, BD Biosciences 612927 1:100 BUV563 Anti-Human CD11c, BD Biosciences 612968 1:100 BUV661 Anti-Human CD56, BD Biosciences 612767 1:100 BUV737 Anti-Human CD3, BD Biosciences 612896 1:100 BUV805 Anti-Human IgD, Biolegend 348226 1:100 BV421 Anti-Human CD16, ThermoFisher 62-0166-42 1:100 SuperBright436 Anti-Human CD25, ThermoFisher 48-0257-42 1:100 eFluor450 Anti-Human CD20, BD Biosciences 566132 1:100 BV480 Anti-Human CD127, Biolegend 351332 1:50 BV510

33 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

Anti-Human HLA-DR, Biolegend 307638 1:100 BV570 Anti-Human CD28, Biolegend 302968 1:100 BV605 Anti-Human CD38, Biolegend 356620 1:100 BV650 Anti-Human CD15, Biolgend 323050 1:100 BV711 Anti-Human CD279, Biolegend 329966 1:100 BV750 Anti-Human CD206, Biolegend 321142 1:20 BV785 Anti-Human CD45, ThermoFisher Q10156 1:800 QDOT800 Anti-Human CXCR5, BD Biosciences 564624 1:50 BB515 Anti-Human CD169, R&D systems FAB5197G 1:100 AF488 Anti-Human CD4, Biolegend 344656 1:100 Spark Blue 550 Anti-Human CD161, Biolegend 3399334 1:20 PerCP Anti-Human CD27, BD Biosciences 566449 1:100 BB700 Anti-Human Siglec8, Biolegend 347108 1:100 PerCP Cy5.5 Anti-Human CD86, ThermoFisher 46-0869-42 1:100 PerC eFLuor710 Anti-Human CD141, Biolegend 344104 1:20 PE Anti-Human TCRg/d, Biolegend 331226 1:100 PEdz594 Anti-Human TCRa/b Biolegend 306710 1:100 PE Cy5 Anti-Human CD11b, Biolegend 301322 1:100 PE Cy7

34 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

Anti-Human CD123, Biolegend 306012 1:100 APC Anti-Human CRTH2, Biolegend 350104 1:100 AF647 Anti-Human CD14, Biolegend 367150 1:100 Spark NIR Anti-Human CD1c, BD Biosciences 566614 1:100 APC R700 Anti-Human CD103, Biolegend 350228 1:100 APC Cy7 Anti-Human CD45, ThermoFisher 45-0459-42 1:25 PerCP Cy5.5 Anti-Human Siglec8, R&D systems FAB7975G 1:100 AF488 Anti-Human CD19, Biolegend 302234 1:100 BV421 Anti-Human CD4, Biolegend 317444 1:100 BV510 Anti-Human CD117, Biolegend 313218 1:100 BV605 Anti-Human CD14, Biolegend 301838 1:100 BV711 Anti-Human CD16, Biolegend 302046 1:100 BV785 Anti-Human CD177, Biolegend 315804 1:100 FITC Anti-Human Siglec8, R&D systems FAB7975P 1:100 PE Anti-Human CD56, Biolegend 318348 1:100 PEdz594 Anti-Human CD3, PE- Biolegend 300420 1:100 Cy7 Anti-Human CD206, Biolegend 321110 1:100 APC Anti-Human FcE, Biolegend 334630 1:100 AF700

35 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

LIVE/DEAD Fixable ThermoFisher L34976 1:1000 NIR Cell Stain LIVE/DEAD Fixable ThermoFisher L34961 1:1000 Blue? Cell Stain TruStain FcX Biolegend 422302 1:100 RPMI 1640 Gibco 21875091

1021 1022

36 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

1023 1024 1025

37 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

1026 References 1027 1028 Ali, M.K., Kim, R.Y., Brown, A.C., Donovan, C., Vanka, K.S., Mayall, J.R., Liu, G., 1029 Pillar, A.L., Jones-Freeman, B., Xenaki, D., et al. (2020). Critical role for iron 1030 accumulation in the pathogenesis of fibrotic lung disease. J Pathol 251, 49-62. 1031 Allden, S.J., Ogger, P.P., Ghai, P., McErlean, P., Hewitt, R., Toshner, R., Walker, 1032 S.A., Saunders, P., Kingston, S., Molyneaux, P.L., et al. (2019). The Transferrin 1033 Receptor CD71 Delineates Functionally Distinct Airway Macrophage Subsets during 1034 Idiopathic Pulmonary Fibrosis. Am J Respir Crit Care Med 200, 209-219. 1035 Arunachalam, P.S., Wimmers, F., Mok, C.K.P., Perera, R., Scott, M., Hagan, T., 1036 Sigal, N., Feng, Y., Bristow, L., Tak-Yin Tsang, O., et al. (2020). Systems biological 1037 assessment of immunity to mild versus severe COVID-19 infection in humans. 1038 Science 369, 1210-1220. 1039 Boyd, D.F., Allen, E.K., Randolph, A.G., Guo, X.J., Weng, Y., Sanders, C.J., 1040 Bajracharya, R., Lee, N.K., Guy, C.S., Vogel, P., et al. (2020). Exuberant fibroblast 1041 activity compromises lung function via ADAMTS4. Nature 587, 466-471. 1042 Byrne, A.J., Powell, J.E., O'Sullivan, B.J., Ogger, P.P., Hoffland, A., Cook, J., 1043 Bonner, K.L., Hewitt, R.J., Wolf, S., Ghai, P., et al. (2020). Dynamics of human 1044 monocytes and airway macrophages during healthy aging and after transplant. J Exp 1045 Med 217. 1046 Carubbi, F., Salvati, L., Alunno, A., Maggi, F., Borghi, E., Mariani, R., Mai, F., 1047 Paoloni, M., Ferri, C., Desideri, G., et al. (2021). Ferritin is associated with the 1048 severity of lung involvement but not with worse prognosis in patients with COVID-19: 1049 data from two Italian COVID-19 units. Sci Rep 11, 4863. 1050 Chen, Z., and Wherry, E.J. (2020). T cell responses in patients with COVID-19. Nat 1051 Rev Immunol 20, 529-536. 1052 Dan, J.M., Mateus, J., Kato, Y., Hastie, K.M., Yu, E.D., Faliti, C.E., Grifoni, A., 1053 Ramirez, S.I., Haupt, S., Frazier, A., et al. (2021). Immunological memory to SARS- 1054 CoV-2 assessed for up to 8 months after infection. Science 371. 1055 Docherty, A.B., Harrison, E.M., Green, C.A., Hardwick, H.E., Pius, R., Norman, L., 1056 Holden, K.A., Read, J.M., Dondelinger, F., Carson, G., et al. (2020). Features of 20 1057 133 UK patients in hospital with covid-19 using the ISARIC WHO Clinical 1058 Characterisation Protocol: prospective observational cohort study. BMJ 369, m1985. 1059 Drake, T.M., Riad, A.M., Fairfield, C.J., Egan, C., Knight, S.R., Pius, R., Hardwick, 1060 H.E., Norman, L., Shaw, C.A., McLean, K.A., et al. (2021). Characterisation of in- 1061 hospital complications associated with COVID-19 using the ISARIC WHO Clinical 1062 Characterisation Protocol UK: a prospective, multicentre cohort study. Lancet 398, 1063 223-237. 1064 Duan, S., and Thomas, P.G. (2016). Balancing Immune Protection and Immune 1065 Pathology by CD8(+) T-Cell Responses to Influenza Infection. Front Immunol 7, 25. 1066 Fabbri, L., Moss, S., Khan, F., Chi, W., Xia, J., Robinson, K., Smyth, A., Jenkins, G., 1067 and Stewart, I. (2021). Post-viral parenchymal lung disease of COVID-19 and viral 1068 pneumonitis: A systematic review and meta-analysis. medRxiv, 1069 2021.2003.2015.21253593. 1070 Filbin, M.R., Mehta, A., Schneider, A.M., Kays, K.R., Guess, J.R., Gentili, M., 1071 Fenyves, B.G., Charland, N.C., Gonye, A.L.K., Gushterova, I., et al. (2021). 1072 Longitudinal proteomic analysis of severe COVID-19 reveals survival-associated 1073 signatures, tissue-specific cell death, and cell-cell interactions. Cell Rep Med 2, 1074 100287. 1075 Gisby, J., Clarke, C.L., Medjeral-Thomas, N., Malik, T.H., Papadaki, A., Mortimer, 1076 P.M., Buang, N.B., Lewis, S., Pereira, M., Toulza, F., et al. (2021). Longitudinal 1077 proteomic profiling of dialysis patients with COVID-19 reveals markers of severity and 1078 predictors of death. Elife 10.

38 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

1079 Grau-Exposito, J., Sanchez-Gaona, N., Massana, N., Suppi, M., Astorga-Gamaza, 1080 A., Perea, D., Rosado, J., Falco, A., Kirkegaard, C., Torrella, A., et al. (2021). 1081 Peripheral and lung resident memory T cell responses against SARS-CoV-2. Nat 1082 Commun 12, 3010. 1083 Guan, W.J., Ni, Z.Y., Hu, Y., Liang, W.H., Ou, C.Q., He, J.X., Liu, L., Shan, H., Lei, 1084 C.L., Hui, D.S.C., et al. (2020). Clinical Characteristics of Coronavirus Disease 2019 1085 in China. N Engl J Med 382, 1708-1720. 1086 Guler, S.A., Ebner, L., Aubry-Beigelman, C., Bridevaux, P.O., Brutsche, M., 1087 Clarenbach, C., Garzoni, C., Geiser, T.K., Lenoir, A., Mancinetti, M., et al. (2021). 1088 Pulmonary function and radiological features 4 months after COVID-19: first results 1089 from the national prospective observational Swiss COVID-19 lung study. Eur Respir J 1090 57. 1091 Han, X., Fan, Y., Alwalid, O., Li, N., Jia, X., Yuan, M., Li, Y., Cao, Y., Gu, J., Wu, H., 1092 and Shi, H. (2021). Six-month Follow-up Chest CT Findings after Severe COVID-19 1093 Pneumonia. Radiology 299, E177-E186. 1094 Harker, J.A., and Lloyd, C.M. (2021). Overlapping and distinct features of viral and 1095 allergen immunity in the human lung. Immunity 54, 617-631. 1096 Horby, P., Lim, W.S., Emberson, J.R., Mafham, M., Bell, J.L., Linsell, L., Staplin, N., 1097 Brightling, C., Ustianowski, A., Elmahi, E., et al. (2021). Dexamethasone in 1098 Hospitalized Patients with Covid-19. N Engl J Med 384, 693-704. 1099 Invernizzi, R., Wu, B.G., Barnett, J., Ghai, P., Kingston, S., Hewitt, R.J., Feary, J., Li, 1100 Y., Chua, F., Wu, Z., et al. (2021). The Respiratory Microbiome in Chronic 1101 Hypersensitivity Pneumonitis Is Distinct from That of Idiopathic Pulmonary Fibrosis. 1102 Am J Respir Crit Care Med 203, 339-347. 1103 Jozwik, A., Habibi, M.S., Paras, A., Zhu, J., Guvenel, A., Dhariwal, J., Almond, M., 1104 Wong, E.H.C., Sykes, A., Maybeno, M., et al. (2015). RSV-specific airway resident 1105 memory CD8+ T cells and differential disease severity after experimental human 1106 infection. Nat Commun 6, 10224. 1107 Kim, T.S., Hufford, M.M., Sun, J., Fu, Y.X., and Braciale, T.J. (2010). Antigen 1108 persistence and the control of local T cell memory by migrant respiratory dendritic 1109 cells after acute virus infection. J Exp Med 207, 1161-1172. 1110 Kory, P., and Kanne, J.P. (2020). SARS-CoV-2 organising pneumonia: 'Has there 1111 been a widespread failure to identify and treat this prevalent condition in COVID-19?'. 1112 BMJ Open Respir Res 7. 1113 Laing, A.G., Lorenc, A., Del Molino Del Barrio, I., Das, A., Fish, M., Monin, L., Munoz- 1114 Ruiz, M., McKenzie, D.R., Hayday, T.S., Francos-Quijorna, I., et al. (2020). A 1115 dynamic COVID-19 immune signature includes associations with poor prognosis. Nat 1116 Med 26, 1623-1635. 1117 Langfelder, P., and Horvath, S. (2008). WGCNA: an R package for weighted 1118 correlation network analysis. BMC Bioinformatics 9, 559. 1119 Lee, J., Park, H.K., Kwon, M.J., Ham, S.Y., Kim, J.M., Lim, S.Y., and Song, J.U. 1120 (2020). Decreased lung function is associated with elevated ferritin but not iron or 1121 transferrin saturation in 42,927 healthy Korean men: A cross-sectional study. PLoS 1122 One 15, e0231057. 1123 Li, Y., Zhang, Z., Yang, L., Lian, X., Xie, Y., Li, S., Xin, S., Cao, P., and Lu, J. (2020). 1124 The MERS-CoV Receptor DPP4 as a Candidate Binding Target of the SARS-CoV-2 1125 Spike. iScience 23, 101400. 1126 Liao, M., Liu, Y., Yuan, J., Wen, Y., Xu, G., Zhao, J., Cheng, L., Li, J., Wang, X., 1127 Wang, F., et al. (2020). Single-cell landscape of bronchoalveolar immune cells in 1128 patients with COVID-19. Nat Med 26, 842-844. 1129 Lucas, C., Wong, P., Klein, J., Castro, T.B.R., Silva, J., Sundaram, M., Ellingson, 1130 M.K., Mao, T., Oh, J.E., Israelow, B., et al. (2020). Longitudinal analyses reveal 1131 immunological misfiring in severe COVID-19. Nature 584, 463-469. 1132 Mandal, S., Barnett, J., Brill, S.E., Brown, J.S., Denneny, E.K., Hare, S.S., 1133 Heightman, M., Hillman, T.E., Jacob, J., Jarvis, H.C., et al. (2021). 'Long-COVID': a

39 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

1134 cross-sectional study of persisting symptoms, biomarker and imaging abnormalities 1135 following hospitalisation for COVID-19. Thorax 76, 396-398. 1136 Mann, E.R., Menon, M., Knight, S.B., Konkel, J.E., Jagger, C., Shaw, T.N., Krishnan, 1137 S., Rattray, M., Ustianowski, A., Bakerly, N.D., et al. (2020). Longitudinal immune 1138 profiling reveals key myeloid signatures associated with COVID-19. Sci Immunol 5. 1139 Morimoto, Y., Hirahara, K., Kiuchi, M., Wada, T., Ichikawa, T., Kanno, T., Okano, M., 1140 Kokubo, K., Onodera, A., Sakurai, D., et al. (2018). Amphiregulin-Producing 1141 Pathogenic Memory T Helper 2 Cells Instruct Eosinophils to Secrete Osteopontin and 1142 Facilitate Airway Fibrosis. Immunity 49, 134-150 e136. 1143 Myall, K.J., Mukherjee, B., Castanheira, A.M., Lam, J.L., Benedetti, G., Mak, S.M., 1144 Preston, R., Thillai, M., Dewar, A., Molyneaux, P.L., and West, A.G. (2021). 1145 Persistent Post-COVID-19 Interstitial Lung Disease. An Observational Study of 1146 Corticosteroid Treatment. Ann Am Thorac Soc 18, 799-806. 1147 Nalbandian, A., Sehgal, K., Gupta, A., Madhavan, M.V., McGroder, C., Stevens, J.S., 1148 Cook, J.R., Nordvig, A.S., Shalev, D., Sehrawat, T.S., et al. (2021). Post-acute 1149 COVID-19 syndrome. Nat Med 27, 601-615. 1150 Raj, V.S., Mou, H., Smits, S.L., Dekkers, D.H., Muller, M.A., Dijkman, R., Muth, D., 1151 Demmers, J.A., Zaki, A., Fouchier, R.A., et al. (2013). Dipeptidyl peptidase 4 is a 1152 functional receptor for the emerging human coronavirus-EMC. Nature 495, 251-254. 1153 Rodriguez, L., Pekkarinen, P.T., Lakshmikanth, T., Tan, Z., Consiglio, C.R., Pou, C., 1154 Chen, Y., Mugabo, C.H., Nguyen, N.A., Nowlan, K., et al. (2020). Systems-Level 1155 Immunomonitoring from Acute to Recovery Phase of Severe COVID-19. Cell Rep 1156 Med 1, 100078. 1157 Ruan, Q., Yang, K., Wang, W., Jiang, L., and Song, J. (2020). Clinical predictors of 1158 mortality due to COVID-19 based on an analysis of data of 150 patients from Wuhan, 1159 China. Intensive Care Med 46, 846-848. 1160 Saris, A., Reijnders, T.D.Y., Nossent, E.J., Schuurman, A.R., Verhoeff, J., Asten, 1161 S.V., Bontkes, H., Blok, S., Duitman, J., Bogaard, H.J., et al. (2021). Distinct cellular 1162 immune profiles in the airways and blood of critically ill patients with COVID-19. 1163 Thorax. 1164 Schmidt, M.E., and Varga, S.M. (2018). The CD8 T Cell Response to Respiratory 1165 Virus Infections. Front Immunol 9, 678. 1166 Scott, N.A., Knight, S.B., Pearmain, L., Brand, O., Morgan, D.J., Jagger, C., Khan, 1167 S., Hackney, P., Smith, L., Menon, M., et al. (2020). Recovery of monocyte 1168 exhaustion is associated with resolution of lung injury in COVID-19 convalescence. 1169 medRxiv, 2020.2010.2010.20207449. 1170 Sigfrid, L., Drake, T.M., Pauley, E., Jesudason, E.C., Olliaro, P., Lim, W.S., Gillesen, 1171 A., Berry, C., Lowe, D.J., McPeake, J., et al. (2021). Long Covid in adults discharged 1172 from UK hospitals after Covid-19: A prospective, multicentre cohort study using the 1173 ISARIC WHO Clinical Characterisation Protocol. medRxiv, 1174 2021.2003.2018.21253888. 1175 Slutter, B., Pewe, L.L., Kaech, S.M., and Harty, J.T. (2013). Lung airway-surveilling 1176 CXCR3(hi) memory CD8(+) T cells are critical for protection against influenza A 1177 virus. Immunity 39, 939-948. 1178 Slutter, B., Van Braeckel-Budimir, N., Abboud, G., Varga, S.M., Salek-Ardakani, S., 1179 and Harty, J.T. (2017). Dynamics of influenza-induced lung-resident memory T cells 1180 underlie waning heterosubtypic immunity. Sci Immunol 2. 1181 Striz, I., Wang, Y.M., Svarcova, I., Trnka, L., Sorg, C., and Costabel, U. (1993). The 1182 phenotype of alveolar macrophages and its correlation with immune cells in 1183 bronchoalveolar lavage. Eur Respir J 6, 1287-1294. 1184 Szabo, P.A., Dogra, P., Gray, J.I., Wells, S.B., Connors, T.J., Weisberg, S.P., 1185 Krupska, I., Matsumoto, R., Poon, M.M.L., Idzikowski, E., et al. (2021). Longitudinal 1186 profiling of respiratory and systemic immune responses reveals myeloid cell-driven 1187 lung inflammation in severe COVID-19. Immunity 54, 797-814 e796.

40 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Immune-proteome landscape post-COVID19

1188 Szabo, P.A., Miron, M., and Farber, D.L. (2019). Location, location, location: Tissue 1189 resident memory T cells in mice and humans. Sci Immunol 4. 1190 Thwaites, R.S., Sanchez Sevilla Uruchurtu, A., Siggins, M.K., Liew, F., Russell, C.D., 1191 Moore, S.C., Fairfield, C., Carter, E., Abrams, S., Short, C.E., et al. (2021). 1192 Inflammatory profiles across the spectrum of disease reveal a distinct role for GM- 1193 CSF in severe COVID-19. Sci Immunol 6. 1194 Wang, E.Y., Mao, T., Klein, J., Dai, Y., Huck, J.D., Liu, F., Zheng, N.S., Zhou, T., 1195 Israelow, B., Wong, P., et al. (2020). Diverse Functional Autoantibodies in Patients 1196 with COVID-19. medRxiv, 2020.2012.2010.20247205. 1197 Wichmann, D., Sperhake, J.P., Lutgehetmann, M., Steurer, S., Edler, C., Heinemann, 1198 A., Heinrich, F., Mushumba, H., Kniep, I., Schroder, A.S., et al. (2020). Autopsy 1199 Findings and Venous Thromboembolism in Patients With COVID-19: A Prospective 1200 Cohort Study. Ann Intern Med 173, 268-277. 1201 Wu, T., Hu, Y., Lee, Y.T., Bouchard, K.R., Benechet, A., Khanna, K., and Cauley, 1202 L.S. (2014). Lung-resident memory CD8 T cells (TRM) are indispensable for optimal 1203 cross-protection against pulmonary virus infection. J Leukoc Biol 95, 215-224. 1204 Yamashita, C.M., Dolgonos, L., Zemans, R.L., Young, S.K., Robertson, J., Briones, 1205 N., Suzuki, T., Campbell, M.N., Gauldie, J., Radisky, D.C., et al. (2011). Matrix 1206 metalloproteinase 3 is a mediator of pulmonary fibrosis. Am J Pathol 179, 1733-1745. 1207 Zhang, B., and Horvath, S. (2005). A general framework for weighted gene co- 1208 expression network analysis. Stat Appl Genet Mol Biol 4, Article17. 1209

41 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license .

A B Total BAL cells

1.5 ✱✱ ) L l A m

/ 1.0 B

s t l l n e u c

o 6 c

0 l l 1 0.5 e x (

C

0.0

HC post-COVID19 (all) Moderate COVID-19 Severe COVID-19 Very Severe COVID-19 C AM Monocytes T cells B cells

0.8 0.025 0.6 ✱ 0.015 ✱✱ 0.6 0.020 s ) ) ) e ) s t s l s

s 0.4 l l l 0.010 l y l l l s s e c e

0.015 l l e e l l c o c c c M

e

e 0.4 n 6 6 A c 6 6 c

o

0 0 0 0 T 1 1 B 1 1

M 0.010

x x x x . ( ( ( ( l 0.2 0.005

0.2 C 0.005

0.0 0.000 0.0 0.000

NK cells NKT cells Neutrophils Eosinophils

0.015 0.025 0.04 0.020

0.020

0.03 ) 0.015 s l s ) ) i l

) 0.010 s s l s l l s h l s i l l s l l e l l l p l h 0.015 e e c e e

c o e p c c

6

c c

n 0.02 o

i T 6 6 r

0.010 0 6 - t s 0 K 0 1 0 u K o 0.010 1 1 x N 1 x ( e

x 0.005 N E

( x ( ( 0.01 N 0.005 0.005

0.00 0.000 0.000 0.000

Figure 1: Altered immune cell profile in post-COVID19 BAL over 80 days post discharge (A) Schematic of study design. BAL and blood were sampled from healthy donors and at > 80 days after hospital discharge from post-COVID19 patients. Traditional and spectral flow cytometry and Olink proteomics was performed on BAL and plasma and correlated with clinical parameters. (B) Left: Total number of cells in BAL from healthy controls and post-COVID19 patients. Right: total number of cells in BAL from post-COVID19 patients, stratified according to severity of the acute illness. (C) Total cell numbers of immune populations (x106/ml) in BAL from healthy controls and post-COVID19 patients, based on gating shown in Methods Figure 1. Data are presented as mean ± SEM. Healthy controls n = 16, post-COVID19 patients n = 21, moderate n = 6, severe n = 8, very severe n = 7. Statistical significance was tested by Mann Whitney U test or Kruskal Wallis test + Dunn’s multiple comparison test. *P < 0.05, **P < 0.01, ***P < 0.005, ****P < 0.001. medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license .

A BAL Post-COVID19 B BAL Plasma COVID19_Status COVID19_Status 2 COVID19_Status COVID19_Status HC Severity COVID19 −ve Severity COVID19 −ve 2 1 COVID19 +ve COVID19 +ve IL1RN IL1RN Severity 1 0 Severity KLK6 KLK6 COVID19 −ve COVID19 −ve 0 −1 Moderate ANPEP ANPEP Moderate Severe −1 Severe −2 CD59 Very Severe CD59 Very Severe −2 SPRY2 −3 SPRY2 −3 F3 F3

CHL1 CHL1

MB MB

QPCT QPCT

C1QTNF1 C1QTNF1

MMP3 MMP3

Plasma AREG AREG

HSPG2 HSPG2

LDLR LDLR

FLT3LG FLT3LG

SAA4 SAA4

SERPINA5 SERPINA5

LYVE1 LYVE1

SERPINA7 SERPINA7

VASN VASN

DPP4 DPP4

SRC SRC BAL Plasma C D E IL1RN PBH= 0.028 PBH = 0.974 BAL KLK6 CD59

MB NPX F3 SERPINA7 QPCT

MMP3 PBH = 0.028 PBH = 0.917 C1QTNF1

CHL1 NPX

DPP4 Corr 1.0 ANPEP

SPRY2 0.5

SRC 0.0 value

- P = 0.028 P = 0.307 BH BH DPP4 −0.5 p

10 HSPG2 −1.0

NPX LDLR log -

SERPINA5 AREG

VASN

PBH = 0.035 PBH = 0.974 LYVE1

FLT3LG

SERPINA5 NPX KLK6 SERPINA7

SAA4

PBH = 0.035 PBH = 0.973

log2 fold change (case – control) NPX

LYVE1 LYVE1 t.cell_percent_of_live_CD45mac_percent_of_live_CD45

neutrophil_percent_of_live_CD45 lymphocytes_percent_of_live_CD45

cd14_cd16_mono_percent_of_live_CD45

Figure 2: A distinct proteome is present in the post COVID-19 airway 436 proteins in BAL and plasma 435 proteins were measured using Olink immunoassays in post- COVID19 patients (n = 19) and healthy controls (n = 9). (A) Principal component analysis (PCA) of BAL and plasma proteomes: each point represents a sample. (B) Left: heatmap displaying Z-score normalised protein abundance for the 22 proteins that were significantly differentially abundant (5% FDR) between post-COVID19 and healthy controls in BAL. Samples have been ordered by case control status and then by peak severity during acute COVID-19 infection. Proteins are ordered by hierarchical clustering. Right: heatmap for these same 22 proteins in plasma, presented in the same order as for BAL. (C) Volcano plot showing differentially protein abundance analysis between post-

COVID19 patients and healthy controls in BAL. Nominal –log10 P values are shown. Significantly differentially abundant proteins (5% FDR) are coloured in red and labelled. (D) BAL and plasma normalised protein abundance (NPX) expression for the 5 most significantly differentially abundance

proteins between post-COVID19 patients and healthy controls. PBH = Benjamini-Hochberg adjusted p- values. (E) Correlation between the 22 differentially abundant proteins and immune cell frequency in BAL. medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license .

A B

COVID19_Status COVID19_Status Severity 3 COVID19 −ve LCN2 COVID19 +ve 2 TGM2 Severity CST5 1 COVID19 −ve IL1A Moderate IL1RN 0 Severe KLK6 −1 Very Severe PRSS27 CCL11 −2 PRCP SOD2 IL7 CXCL9 CXCL10 CXCL11 PLAT STC1 C KRT19 LGALS4 PRDX5 Red module CASP3 Hub GLO1 Proteins: C1QTNF1 CHL1 MB CASP3 FST TGFA EPCAM PDGFA TNFRSF11A F3 DNER ANGPT1 MB CCL28 ALCAM EPCAM F3 MMP10 ANPEP TIMP1

COVID19_Status COVID19_Status Severity 3 COVID19 −ve LCN2 COVID19 +ve 2 TGM2 Severity CST5 1 COVID19 −ve IL1A Moderate IL1RN 0 Severe KLK6 −1 Very Severe PRSS27 CCL11 −2 PRCP SOD2 IL7 CXCL9 CXCL10 Figure 3: A network of proteins linked to immune cell chemotaxisCXCL11 and cell death is upregulated PLAT in the BAL post-COVID19 STC1 WGCNA of BAL proteome of post-COVID19 patients and healthyKRT19 controls. (A) Associations of protein LGALS4 modules identified by WGCNA with case-control status (postPRDX5 -COVID19 or healthy). (B) Heatmap CASP3 displaying Z-score normalised protein abundance for theGLO137 proteins that form the ‘red’ protein C1QTNF1 module. Samples are ordered according to clinical status. SeverityCHL1 refers to peak severity of the acute MB COVID19 episode. Proteins are ordered by hierarchical clusteringFST . (C) Right: Network representation TGFA of proteins in the red module and their interconnections definedPDGFA using String-db. An edge in the TNFRSF11A network represents a relationship between proteins, colouredDNERaccording to the type of evidence for the ANGPT1 connection (see Methods). Left: list of hub proteins within theCCL28 network (with hub proteins that were also significantly differentially upregulated in post-COVID19 patientALCAM BAL highlighted in red). EPCAM F3 MMP10 ANPEP TIMP1 medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license .

A B 2.5 HC 2.0 pCOVID Correlation Immune cells (% CD45) LPL CXCL9 CXCL11 CCL19 PSIP1 STC1 SH2D1A TNFSF10 TNFRSF13B (r) 1.5 Airway Macrophages 0.3 -0.8 -0.8 -0.7 -0.6 -0.5 -0.8 -0.4 -0.4 +1 -0.4 0.8 0.8 0.7 0.6 0.5 0.8 0.3 0.4 1.0 Lymphocytes 0

Monocytes 0.0 0.7 0.6 0.9 0.8 0.8 0.6 0.7 0.7 composite score 0.5 r = 0.68 r = 0.62 r = 0.29 -1 CXCR3 chemokine -0.4 0.8 0.8 0.7 0.6 0.5 0.8 0.3 0.3 p = 0.0001 p = 0.001 p = 0.16 T cells 0.0 Neutrophils 0.8 -0.4 -0.3 -0.1 -0.1 0.0 -0.4 0.2 0.2 0 20 40 60 80 100 0 1 2 3 4 0 1 2 3 4 airway T cells (% of CD45) airway NK cells (% of CD45) airway NKT cells (% of CD45)

C 2.0 100 CD16+ D 100 HC r = 0.02 80 1.5 p = 0.92 80 pCOVID 60 CD16+ CD14+ 60 1.0 r = 0.82 40 -4 p < 10 40 0.5 + composite score r = 0.57 20 CD14 CXCR3 chemokine

T cells (% CD45) 20 r = 0.72 r = 0.26 p = 0.016 r = -0.37

subset (% on monocytes) p < 0.0001 p = 0.2 0.0 0 p = 0.13 0.0 2.5 5.0 7.5 0.0 0.5 1.0 1.5 2.0 0 0 2 4 6 0 2 4 6 8 airway monocytes (% of CD45) CXCR3 chemokines composite score CD8a (NPX/ml) CD4 (NPX/ml)

✱✱✱✱ ✱✱ E 15 HC 15 15 15 F 30 ✱✱✱✱ 0.3 250 pCOVID 200

10 20 0.2 g/ml) µ 150 10 10 10 100 5 10 0.1 MB (NPX/ml) DPP4 (ng/ml) Albumin ( r = 0.67 DPP4 (NPX/ml) CASP3 (NPX/ml) r = 0.69 r = 0.68 EPCAM (NPX/ml) r = 0.56 50 LDH (O.D. 450nm) p < 0.0001 p < 0.0001 p = 0.002 p < 0.0001 5 5 5 0 0 0.0 0 0 2 4 6 0 2 4 6 0 2 4 6 0 2 4 6 CD8a (NPX/ml) CD8a (NPX/ml) CD8a (NPX/ml) CD8a (NPX/ml)

Figure 4. CXCR3 ligands and markers of epithelial damage correlate with CD8 T cells in the airways BAL immune cells and protein concentrations were analysed post-COVID19 infection. (A) Linear regression analysis was conducted between n = 435 proteins measured in the BAL using the Olink platform and BAL immune cell frequencies identified by flow cytometry as shown in Figure 1. A heatmap showing proteins with the highest correlation versus 5 major immune cell frequencies is shown. (B-C) For each sample, protein levels for CXCL9, -10 and -11, were normalised to the median level in healthy controls. For each sample, the mean of the normalised values for the 3 proteins was calculated to provide a summary metric for CXCR3 chemokines. This was then plotted against versus (B) T, NK and NKT proportions in post-COVID19 patients and healthy controls and (C) monocyte frequencies and subsets in post-COVID19 patients only. (D) BAL T cell frequency versus CD4 and CD8a concentrations as measured by Olink. (E) CD8a concentration versus CASP3, EPCAM, MB and DPP4 in the airways. (F) DPP4, albumin and LDH concentrations in the BAL determined by ELISA. Data are presented as median ± IQR. (A) Pearsons correlation of n = 19 post-COVID19 patients, the r value is shown. (B-E) Each point represents an individual patient, linear regression line +/- 95% confidence intervals are depicted, and r and p values from Pearsons correlation are stated. (F) represents n = 38 post-COVID19 and n = 20 healthy control individuals. Statistics were conducted using Mann-Whitney U test. *P < 0.05, **P < 0.01, ***P < 0.005, ****P < 0.001. pCOVID = post-COVID19. medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license .

A C ) + s c + ✱✱✱✱ ) e ) 1 s c CD123CD1c CD45RAHLA-DRCD141Siglec8CD86 CD14 CD16 CD25 CD11cCD11b t

5 s + 1 e y Normalised median expression median Normalised ) 1 e c t 5 t c s D 1 5 y 1

4 y o c e 1

C ✱✱ D c + t

Classical monocytes 1 k 4 o 15 4 o D + y 6 C k u 3 k Blood 6 c C

4 u 5 e 3 u

+ l 3 1

CD45RA+ myeloid o

2 e

+ l e D 4 2 l

k D e

6 2 10 BAL 1

e 2 u C

v ✱✱ e 1 C i

Intermediate monocytes v 3 l i v D 1 e l i

l + D 1 + l 1

C 4 3 C %

2 e

0 % Intermediate monocytes 4 0 1 5 ( %

( 0 D v ( i D l

C Classical monocytes 5 D C pDC % 0 cDC1 ( cDC2 )

0 ) ) Non-classical monocytes 6 s s s + + + e c e e t c

0.8 ) t t 2.0 c 0.08 y 1 1 s CD123+ CD141+ DC 7 y y 1 c 1 1 c c e 1 ✱✱ o + t

0.6 o o D 15 D 1.5 0.06 k D y 6 cDC1 8 -2 k k C C u Blood c

C u u 5

e o +

+ 0.4 l e e

1.0 + 0.04 l l D cDC1 k 1

9 c

c 10

e BAL u C 4 e e 1 1 v

i

0.2 v v e 1 l i 0.5 i 0.02 D l +

pDC l l

10 D

-4 D 3 C e C

% 5

0.0 C % % ( 0.0 D v 0.00 ( ( i l

C

% 0 ( B E F ✱✱✱✱ ✱ ✱ 1.0 Cell type (% of BAL

leukocytes) (O.D.) LDH Albumin(ug/ml) (ng/ml) DPP4 y Blood c 0.8 CD14 monocytes -0.3 0.0 -0.1 n BAL CD16+ monocytes 0.1 -0.2 0.1 e 0.6 Correlation u CD14+ CD16+ monocytes -0.2 -0.3 -0.1 (r) q 0.4

e +1 r

F 0.2 cDC2 -0.2 0.2 -0.3 0 cDC1 -0.2 0.1 -0.2 0.0 -1 1 2 3 4 5 6 7 8 9 10 pDC -0.2 0.2 -0.2 Metacluster

Figure 5. Myeloid cell frequencies are not linked to indications of ongoing damage in the post- COVID19 airway BAL and blood immune cells from post-COVID19 patients were analyzed by spectral deconvolution cytometry. (A) Heatmap of normalised median expression of myeloid cell modulatory and subset markers by clusters of myeloid cells in the airways identified by FlowSOM analysis. (B) Violin plot showing frequencies of each cluster in BAL and blood. (C) Violin plots showing classical, non-classical and intermediate monocyte subsets as proportions of live leukocytes in BAL and blood identified by manual gating. (D) Violin plots showing pDC, cDC1 and cDC2 cell proportions in BAL and blood. (E) Geometric mean fluorescence intensity of activation markers expressed by CD11c+ monocytes. (F) Pearson’s correlation between BAL myeloid subsets (as a % of leukocytes) and airway LDH, albumin and DPP4. Data are presented as median ± IQR. Each point represents an individual patient. Statistical significance for (B), (C) and (D) was tested by Mann-Whitney U test. *P < 0.05, **P < 0.01, ***P < 0.005, ****P < 0.001. medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license .

αβ γδ

TCR TCR CD4 CD8 CD103CRTH2CXCR5CD27CD69CD45RACD56CD127HLA-DRCD28CD38PD1 CD161CD86CD25 + ) + s 4 4 + D e

A + ✱✱ + t D 15

1 4 D

4 4 y 100 6 100 +

C Blood ) c ) 100 100 5

D ✱✱ C )

D 100 ) 4 ) s o

s

2 Normalised median l + s l - D s C l k s l l C

80 80 l D

10 l l

l BAL

80 3

80 3 l u C

e 80 e - ✱✱✱✱ +

e e C 0 0

3 e e c c ✱✱✱✱ l +

A

60 c 60

c A

1 1 60 c

+ 60

3 60 ✱✱✱✱ ✱✱✱✱ e T T R 5 expression R T 4 D

9 D

T D v T

i

40 40 5

2 5 l 6 40 40 C C

C 40 % 4 %

4

5 % % ( ( D % + + (

20 % 20 ( D

0 (

D 20 20 ( C 9 20 9

6 C C 6 0 0 0 0 6 0 D 7 D C 0 C

8 + + 8 8 9 + + D D 8 +

100 8

100 C ) 100 100 C 100 ) D

) ) 8 10 )

D s - s + s l s s l C l l l D

80 l l

l 80

3 l C

80 3 80 l 80 e 11

-2 e + e C 0 - e 0 ✱✱ e c

c c

60 c

1 c A 60 1

A

60 + ✱✱✱✱ 60 60 12 T ✱✱✱✱ T D 9 R D T R T

T

✱✱✱

40 6 5 40 40 5

40 C

C 40

%

4 % % 4 % D % ( + ( + ( ( 20 20 (

D 20

D 20 20 9 ✱✱✱✱ ✱✱✱✱ ✱✱✱✱ C 9 6 C 6 0 C 0 0 0 0 D B 1.0 D C C

y Blood c 0.8 ) s n ) ) ) e ) ✱✱ + s BAL t e

s 15 s 0.6 E s y 6 e e - e u + t e Blood

✱✱ c t

15 5 + t 15 15 t 3

y 15 o 1 y q y y 6 D c k c 6 0.4 + D c 10 c BAL 5 e u C o o 1 δ

o o r C e k k D l + γ

10 k 10 k D 10 10 u u + 3 F u C 0.2 u e R 5

e C e 6 l D v l e e

i l + l

C l 5

+

C e 3 e

T 5 5 e e 3 0.0 D 5 5 v v % i i D v v 0 l ( l i i D C

l l 1 2 3 4 5 6 7 8 9 10 11 12

C C % 0 % 0 % ( % ( 0 0 ( C Metacluster ( F 100 1 80 CD69+ CD45RA- 80 CD69+ CD45RA- 2 r = 0.30 3 40 r = 0.57 4 p = 0.18 60 p = 0.007 5 8 CD45RA+ 6 40 CD45RA+ 6 r = -0.31 r = -0.51 7 4 20 8 p = 0.11 p = 0.01 9 2 10 0 0 subset (% of CD8 T cells) T subset (% of CD8 11 cells) T subset (% of CD4 0.0 0.1 0.2 0.3 0.4 0.00 0.05 0.10 12 BAL CD4 T cells (# x 106 per ml) BAL CD8 T cells (# x 106 per ml)

G H

Marker (gMFI) - Marker (gMFI) - Cell type (% of BAL CD103- CD69+ CD103+ CD69+ LDH (O.D.) LDH Albumin(ug/ml) (ng/ml) DPP4 LDH (O.D.) LDH Albumin(ug/ml) (ng/ml) DPP4 leukocytes): (O.D.) LDH Albumin(ug/ml) (ng/ml) DPP4 CD8 T cells CD8 T cells T cells 0.3 0.3 0.0 CRTH2 0.0 -0.1 0.5 CRTH2 0.0 -0.1 0.5 CD4 T cells 0.1 0.2 0.1 CXCR5 0.0 0.0 -0.2 CXCR5 0.1 0.1 -0.2 CD8 T cells 0.4 0.5 0.1 CD27 0.1 -0.1 -0.5 CD27 0.6 -0.1 -0.6 Correlation CD69 0.3 -0.1 0.2 CD69 0.5 -0.1 0.2 (r) +1 CD45RA+ CD8 T cells 0.3 0.2 -0.2 CD127 -0.5 -0.2 -0.3 CD127 -0.5 -0.2 -0.2 CD69+ CD8 T cells 0.5 0.6 0.1 HLA-DR 0.0 -0.3 0.2 HLA-DR 0.0 -0.3 0.2 0 CD28 -0.2 -0.4 -0.1 CD28 -0.1 -0.5 0.0 -1 CD69+ CD103- CD8 T cells 0.7 0.3 0.0 CD38 0.3 -0.4 0.4 CD38 0.3 -0.5 0.5 CD69+ CD103+ CD8 T cells 0.3 0.6 0.2 CD25 0.0 0.1 -0.2 CD25 0.0 0.1 -0.2 CD69- CD103+ CD8 T cells 0.0 0.5 -0.1 CD86 0.0 -0.1 -0.3 CD86 0.0 -0.1 -0.3

Figure 6. The airways of post-COVID19 patients are enriched for activated CD8 TRM cells BAL and blood immune cells from post-COVID19 patients were analyzed by spectral deconvolution cytometry.(A) Heatmap of normalised median expression of T cell modulatory and subset markers by clusters of T cells in the airways identified by FlowSOM analysis. (B) Violin plot showing frequencies of each cluster in BAL and blood. (C) tSNE projection of clusters identified by FlowSOM analysis in BAL and blood. (D) Violin plots showing CD4+ and CD8+ T cell subsets as proportions of all T cells in BAL and blood identified by manual gating. (E) Violin plots showing gd, NK, NKT and MAIT cell proportions in BAL and blood. (F) Pearson correlations between total airway CD4+ and CD8+ T cell numbers and subsets. (G) Heatmap depicting Pearson correlation between different T cell populations (as a % of BAL leukocytes) and the concentration of DPP4, LDH and Albumin in the airways. (F) Pearson correlations between the gMFI of LDH and albumin. Data are presented as median ± IQR. Each point represents an individual patient. (A, B, E and F) represent n = 20 individuals. (C) represents an equal number of cells from n = 10 patients combined. (G & H) represents n = 18 individuals. Statistical significance for (B), (D), (E) and (G) was tested by Mann-Whitney U test. *P < 0.05, **P < 0.01, ***P < 0.005, ****P < 0.001. medRxiv preprint doi: https://doi.org/10.1101/2021.08.10.21261834; this version posted August 11, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license .

A B ) ) L L A A ) ) ) B L

B 0.6

L

) l L l

0.15 0.15 A L A 0.003 A m B

m 0.005 /

A B B l / 0.004

6

l B l 6

0 m l 0 0.4 /

m 0.004 1 0.10 0.10 m / 6 / 1 m x 0.002 6

6 0.003 / 0 ( x

6 0 0 ( 0.003 1 s

0 x 1 1 t ( e x 1 x

t n ( 0.2 ( 0.002 x 0.05 +

0.05 y ( u 0.001 0.002 + +

6 c o + 0 3 5 o c 6 2 h

D D 0.001 0.001 l 5 l p D C 0.00 C D 0.0 0.00 e

C 0.000 m +

C 0.000 y C

3 0.000 ) L s D e

t 0.4 C y First appointment c o 0.3 k Follow-up u e l 0.2 ) )

C D ) % ) s s ( s l l

l s l l + l l

0.1 l e 4 30 e ) ) 80 30 50 80 e 40 e 1 c c s s c l l

2.5

c

l l D

0.0 o T T i C e e T

t T 40 c c 60 60

a 2.0

30 r % %

% 20

( T T (

% 20 (

8 ( 30

+

1.5 + D % % M

40 4 40 8 20 M ( ( C

/ R D 1.0 20 D R + +

4 10 10 T C C 8 4 T

D

20 20 +

+ 10 + D D 0.5 + C 10 8 9 9 4 C C 6 D 6 ) 0.0 D 0 s D

0 0 C 0 0 D 0 C e

t 0.4 C C y First appointment c o 0.3 k Follow-up u e l 0.2 %

( E F

G

+ 0.1 4 1

D 0.0 C

Figure 7. Reduced cellularity in the airways one year after initial bronchoscopy post-COVID19 (A) Total cell counts (left) and proportions of lymphocytes (right) in the BAL following first bronchoscopy and one year follow-up bronchoscopy. (B) Cell counts of lymphocyte populations in the BAL following first bronchoscopy and one year follow-up bronchoscopy. (C) Cell counts of myeloid populations in the BAL following first bronchoscopy and one year follow-up bronchoscopy. (D) Proportions of T cell subsets in the BAL following first bronchoscopy and one year follow-up bronchoscopy. (E) LDH (left) and albumin (right) measurements in BAL following first year bronchoscopy and one year follow-up bronchoscopy. (F) Percentage of abnormal CT following first year CT and one year follow-up CT. (G) BAL LDH and albumin quantification following first year bronchoscopy and one year follow-up bronchoscopy. Each point represents a single patient. Green shading indicates median+/-IQR for proportions of populations and mediator levels observed in healthy airways.