BMJ

Confidential: For Review Only Diagnostic accuracy of serological tests for COVID-19: a systematic review and meta-analysis

Journal: BMJ

Manuscript ID BMJ-2020-059052

Article Type: Research

BMJ Journal: BMJ

Date Submitted by the 30-May-2020 Author:

Complete List of Authors: Bastos, Mayara; Research Institute of the McGill University Health Centre, Respiratory Epidemiology and Clinical Research Unit, Centre for Outcomes Research & Evaluation; Rio de Janeiro State University, Social Medicine Institute Tavaziva, Gamuchirai ; Research Institute of the McGill University Health Centre, Respiratory Epidemiology and Clinical Research Unit, Centre for Outcomes Research & Evaluation Abidi, Syed ; Research Institute of the McGill University Health Centre, Respiratory Epidemiology and Clinical Research Unit, Centre for Outcomes Research & Evaluation Campbell, Jonathon; McGill University Health Centre, Respiratory Epidemiology and Clinical Research Unit & Centre for Outcomes Research & Evaluation; McGill University, Departments of Epidemiology, Biostatistics & Occupational Health, and Medicine Haraoui, Louis-Patrick; University of Sherbrooke, 3- Department of Microbiology and Infectious Diseases, Faculty of Medicine and Health Sciences Johnston, James; Centre for Disease Control, Vancouver, British Columbia Lan, Zhiyi; Research Institute of the McGill University Health Centre, Respiratory Epidemiology and Clinical Research Unit, Centre for Outcomes Research & Evaluation Law, Stephanie; Harvard University, Department of Global Health & Social Medicine, Harvard Medical School Maclean, Emily; McGill University, Departments of Epidemiology, Biostatistics & Occupational Health, and Medicine Trajman, Anete; Research Institute of the McGill University Health Centre, Respiratory Epidemiology and Clinical Research Unit, Centre for Outcomes Research & Evaluation; State University of Rio de Janeiro Biomedical Centre, Social Medicine Institute Menzies, Dick; Research Institute of the McGill University Health Centre, Respiratory Epidemiology and Clinical Research Unit, Centre for Outcomes Research & Evaluation; McGill University, Departments of Epidemiology, Biostatistics & Occupational Health, and Medicine Benedetti, Andrea; McGill University, Departments of Epidemiology, Biostatistics & Occupational Health, and Medicine; Research Institute of the McGill University Health Centre, Respiratory Epidemiology and

https://mc.manuscriptcentral.com/bmj Page 1 of 51 BMJ

1 2 3 Clinical Research Unit, Centre for Outcomes Research & Evaluation 4 Ahmad Khan, Faiz; Research Institute of the McGill University Health 5 Centre, Respiratory Epidemiology and Clinical Research Unit, Centre for 6 Outcomes Research & Evaluation; McGill University, Departments of 7 Medicine & Epidemiology, Biostatistics & Occupational Health 8 COVID-19, SARS-CoV2, diagnostic accuracy, serological tests, meta- 9 Keywords: analyses, systematic review, point of care 10 Confidential: For Review Only 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 https://mc.manuscriptcentral.com/bmj BMJ Page 2 of 51

1 2 3 Diagnostic accuracy of serological tests for COVID-19: a systematic review and meta-analysis 4 5 1,2 1 1 6 Mayara Lisboa Bastos, MD, Gamuchirai Tavaziva, MScPH, , Syed Kunal Abidi, BSc, Jonathon R. 7 Campbell, PhD,1,6 Louis-Patrick Haraoui,3 MDCM, James C. Johnston,4 MD, Zhiyi Lan, MSc,1 8 Stephanie Law, PhD,5 Emily MacLean, MSc,6 Anete Trajman, MD,1,2 Dick Menzies, MD,1,6 Andrea 9 Benedetti, PhD,1,6 Faiz Ahmad Khan, MDCM1, 6 10 11 Confidential: For Review Only 12 Affiliations 13 1- Respiratory Epidemiology and Clinical Research Unit, Centre for Outcomes Research & 14 Evaluation, Research Institute of the McGill University Health Centre, Montreal, Canada 15 16 2- Social Medicine Institute, State University of Rio de Janeiro, Rio de Janeiro, Brazil 17 3- Department of Microbiology and Infectious Diseases, Faculty of Medicine and Health 18 Sciences, Université de Sherbrooke, Sherbrooke, Québec, Canada 19 4- University of British Columbia, Vancouver, Canada 20 5- Department of Global Health & Social Medicine, Harvard Medical School, Boston, USA 21 22 6- Departments of Epidemiology, Biostatistics & Occupational Health, and Medicine, McGill 23 University, Montreal, Canada 24 25 Emails 26 Mayara Lisboa Bastos: [email protected] 27 28 Gamuchirai Tavaziva: [email protected] 29 Syed Kunal Abidi: [email protected] 30 Jonathon R. Campbell: [email protected] 31 Louis-Patrick Haraoui: [email protected] 32 James C. Johnston: [email protected] 33 Zhiyi Lan: [email protected] 34 Stephanie Law: [email protected] 35 Emily MacLean: [email protected] 36 37 Anete Trajman: [email protected] 38 Dick Menzies: [email protected] 39 Andrea Bendetti: [email protected] 40 41 Corresponding Author: 42 Faiz Ahmad Khan 43 44 5252 Blvd. de Maisonneuve, Office D3.60 45 Montreal, Quebec, Canada, H4A 3S5 46 [email protected] 47 514-934-1934, extension 32129. Mobile: 514-816-7651 48 49 50 Word count: 51 Manuscript (excluding references & footnotes): 3,311 52 Abstract: 400 words 53 54 55 Key words: COVID-19, Serological tests, diagnostic accuracy, point of care, systematic Review, 56 meta-analysis 57 58 1 59 60 https://mc.manuscriptcentral.com/bmj Page 3 of 51 BMJ

1 2 3 ABSTRACT (400 words) 4 5 6 Objective: To determine the diagnostic accuracy of serological tests (‘sero-diagnostics’) for 7 coronavirus disease-2019 (COVID-19). 8 9 Design: Systematic review and meta-analysis. 10 11 Confidential: For Review Only 12 Data sources: We searched MEDLINE, bioRxiv, and medRxiv, from 1 January to 30 April 2020, 13 using subject headings/subheadings combined with text words for the concepts of COVID-19 14 and sero-diagnostics. 15 16 17 Eligibility criteria and data analysis: Eligible studies measured sensitivity and/or specificity of a 18 COVID-19 sero-diagnostic compared to a reference standard of viral culture or reverse- 19 transcriptase polymerase chain reaction. We excluded studies with fewer than five participants 20 or samples. We assessed risk of bias using QUADAS-2. We used random-effects bivariate meta- 21 22 analyses to estimate pooled sensitivity and specificity. 23 24 Main outcome measure: The primary outcome was overall sensitivity and specificity, stratified 25 by sero-diagnostic method (enzyme linked immunosorbent assays, ELISA; lateral flow 26 27 immunoassays, LFIA; or chemiluminescence immunoassays, CLIA) and immunoglobulin class 28 (IgG, IgM, or both). Secondary outcomes were stratum-specific sensitivity and specificity within 29 sub-groups defined by study or participant characteristics, including time since symptom onset. 30 31 Results: We identified 5016 references and included 39 studies. We performed 48 risk of bias 32 33 assessments (one for each population and method evaluated). We found high risk of patient 34 selection bias in 47/48 (98%), and high or unclear risk of bias from performance or 35 interpretation of the sero-diagnostic in 35/48 (73%). Only 3/39 (8%) studies included 36 outpatients. No tests were evaluated at the point of care. For each sero-diagnostic method, 37 38 pooled sensitivity and specificity were not associated with the immunoglobulin class measured. 39 The pooled sensitivity of ELISAs measuring both immunoglobulins was 84% (95% confidence 40 interval 76% to 91%), of LFIAs, 68% (52% to 81%), and of CLIAs, 98% (46% to 100%). In all 41 analyses, pooled sensitivity was lower for LFIAs, the potential point-of-care method. Pooled 42 specificities were high for all three methods, ranging from 97% to 99%. However, 10165/12360 43 44 (82%) samples used for estimating specificity were from populations tested before the 45 epidemic, or not suspected to have COVID-19. Amongst LFIAs, pooled sensitivity of commercial 46 kits (67%, 51% to 80%) was lower than for non-commercial tests (88%, 83% to 91%). 47 Heterogeneity was seen in all analyses. In the first week of symptoms, pooled sensitivity ranged 48 49 from 13% to 56%, and after the third week from 70% to 98%. 50 51 Conclusion: There is an urgent need for higher quality clinical studies assessing the accuracy of 52 COVID-19 sero-diagnostics. Currently, available evidence does not support the continued use of 53 existing point-of-care sero-diagnostics. 54 55 56 Study registration: PROSPERO CRD42020179452. 57 58 2 59 60 https://mc.manuscriptcentral.com/bmj BMJ Page 4 of 51

1 2 3 4 5 6 Summary box 7 What is already known on this topic 8  Tests to detect antibodies (‘sero-diagnostics’) to severe acute respiratory syndrome 9 10 coronavirus-2 (SARS-CoV-2) could improve diagnosis of coronavirus disease 2019 (COVID- 11 19) andConfidential: be useful tools for epidemiologic For surveillance. Review Only 12  The number of sero-diagnostics has rapidly increased, and many are being marketed for 13 point-of-care use, but the evidence base supporting their diagnostic accuracy has not been 14 formally evaluated. 15 16 17 What this study adds 18 19 20  The available evidence on the accuracy of COVID-19 sero-diagnostics is characterized by 21 risks of bias, and heterogeneity, such that estimates of sensitivity and specificity are 22 unreliable and have limited generalizability. The evidence is particularly weak for point-of- 23 care diagnostics, none of which were studied at the point of care. 24  25 Caution is warranted if using COVID-19 sero-diagnostics for clinical decision-making or 26 epidemiologic surveillance. Evidence is insufficient to support the continued use of existing 27 point-of-care sero-diagnostics. 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 3 59 60 https://mc.manuscriptcentral.com/bmj Page 5 of 51 BMJ

1 2 3 INTRODUCTION 4 5 Accurate and rapid diagnostic tests will be critical for achieving control of coronavirus disease 6 2019 (COVID-19), a pandemic illness caused by severe acute respiratory syndrome-coronavirus 7 2 (SARS-CoV-2). Diagnostics for COVID-19 fall into two major categories: molecular tests that 8 9 detect viral RNA, and sero-diagnostics that detect anti-SARS-CoV-2 immunoglobulins. Reverse 10 transcriptaseConfidential: PCR (RT-PCR), a molecular test, For is widely Review used as the reference Only standard for 11 diagnosis of COVID-19, however, limitations include potential false-negative results1 2 12 3 13 changes in diagnostic accuracy over the disease course, and precarious availability of test 14 materials4. Sero-diagnostics have generated substantial interest as an alternative or 15 complement to RT-PCR in the diagnosis of acute infection, as some may be cheaper and easier 16 17 to implement at the point of care. A clear advantage of sero-diagnostics over RT-PCR is that 18 these methods can identify individuals previously infected by SARS-CoV-2, even if they never 19 underwent testing while acutely ill. As such, sero-diagnostics could be deployed as surveillance 20 tools to better understand SARS-CoV-2 epidemiology and potentially inform individual risk of 21 22 future disease. 23 24 In a short period of time, many COVID-19 sero-diagnostics have become available, including 25 some marketed for use as rapid, point of care tests. However, the pace of development has 26 27 exceeded that of rigorous evaluation, and there remains important uncertainty about test 5 28 accuracy. We undertook a systematic review and meta-analysis to assess the diagnostic 29 accuracy of sero-diagnostics for SARS-CoV-2 infection. Our objectives were to evaluate the 30 quality of the available evidence, to compare pooled sensitivities and specificities of different 31 sero-diagnostic methods, and to identify study, test, and patient characteristics associated with 32 33 test accuracy. 34 35 METHODS 36 37 38 Search strategy and selection criteria 6 39 Our systematic review and meta-analysis is reported according to PRISMA guidelines (PRISMA 40 checklist available in online supplemental material). We searched Ovid-MEDLINE for studies 41 published in 2020, with no restrictions on language. We used subject headings/subheadings 42 43 (where applicable) combined with text words for the concepts of COVID-19 (or SARS-CoV-2) 44 and sero-diagnostics. The complete search strategy, run on April 6th and repeated April 28th, 45 2020, is provided in the Supplement (page S2). To identify pre-peer reviewed studies, we 46 47 searched the entire list of COVID-19 preprints from medRxiv and bioRxiv 48 (https://connect.medrxiv.org/relate/content/181) initially on April 4th, and again on April 30th 49 2020. We also considered articles referred by colleagues or identified in references of included 50 51 studies. 52 53 Eligible studies were randomized trials, cohort or case-control studies, and case series, 54 55 reporting the sensitivity and/or specificity of a sero-diagnostic for COVID-19. We excluded 56 review articles, editorials, case reports, modelling or economic studies, articles with sample size 57 58 4 59 60 https://mc.manuscriptcentral.com/bmj BMJ Page 6 of 51

1 2 3 less than 5, and studies that only reported analytical accuracy (as opposed to diagnostic 4 7 5 accuracy). Three investigators (MB, GT, FAK) independently screened titles and abstracts, and 6 two (MB, GT) independently screened full-text papers. We used a sensitive screening strategy 7 at the title/abstract level wherein selection by a single reviewer was sufficient for a study to 8 9 undergo full-text review. Disagreements between reviewers at the full-text stage were resolved 10 by a thirdConfidential: reviewer (FAK). In the systematic For review Reviewand meta-analyses, Onlywe included studies 11 where sensitivity and/or specificity of at least one COVID-19 sero-diagnostic was measured 12 13 against a reference standard of viral culture or RT-PCR. 14 15 Data analysis 16 17 One investigator (MB) extracted aggregate study-level data using a piloted standardized 18 electronic data entry form. For each study, a second reviewer (ZL or EM) verified all entered 19 data. No duplicate data were identified. We collected information on study characteristics 20 (location, design), study populations (age, sex, clinical severity, sources of populations used for 21 22 estimating specificity), the timing of specimen collection with respect to onset of symptoms, 23 and methodological details about index and reference tests. We categorized sero-diagnostics by 24 method: enzyme linked immunosorbent assays (ELISA), lateral flow immunoassays (LFIA), or 25 26 chemiluminescence immunoassays (CLIA). In several studies, investigators assessed the 27 accuracy of more than one test method (e.g. ELISA and also LFIA), and/or more than one 28 particular index test (e.g. one study evaluated 9 different LFIA). We extracted the numbers 29 30 needed to construct 2x2 contingency tables for each particular index test performed in a study. 31 Each evaluation of a particular index test was considered its own study arm. For example, an 32 article assessing 9 LFIA and 2 ELISA on the same set of patients would contribute 11 study arms. 33 34 35 The main summary measures were pooled sensitivity and pooled specificity, with 95% 36 confidence intervals (95%CI). For meta-analyses, we used bivariate generalized linear mixed 37 38 models (GLMMs), specifying random effects at the level of the particular test and of the study. 39 When models with two random-effects did not converge, we used only the test-level random 40 effect. To describe heterogeneity, we estimated tau-squared (the variance of the random- 41 2 42 effects parameters) and constructed forest plots. We did not use the I -statistic as our models 43 were bivariate. Studies that did not report both sensitivity and specificity were excluded from 44 bivariate meta-analyses. 45 46 47 Two reviewers independently assessed risks of bias and applicability concerns using the 48 QUADAS-2 tool, for the domains of patient selection, performance of the index test, 49 performance of the reference test, and flow and timing (for risk of bias only).8 Conflicts were 50 51 resolved through consensus. We performed a quality assessment for each test method and 52 population. For example, an article assessing 9 LFIA and 2 ELISA on the same set of patients 53 would have two QUADAS-2 assessments (one for LFIA, one for ELISA). 54 55 56 57 58 5 59 60 https://mc.manuscriptcentral.com/bmj Page 7 of 51 BMJ

1 2 3 We first estimated pooled sensitivity and specificity by test method (ELISA, LFIA, CLIA) and 4 5 immunoglobulin class detected (IgM, IgG, or both IgG & IgM) as we expected accuracy would be 6 associated with immunoglobulin class as seen for other coronoviruses.9-11 In a post-hoc 7 analysis, we estimated pooled sensitivities and specificities for each method pooling different 8 9 classes of immunoglobulin and without further stratification. To do so, we used the combined 10 IgG and Confidential:IgM result when available, and otherwise For we Review used the separate Only IgG and IgM results. For 11 tests that had a 2x2 table for IgM and another 2x2 table for IgG, both contributed arms, sharing 12 13 the same test-level and study-level random effects. 14 15 To assess the following pre-specified variables as potential determinants of diagnostic accuracy, 16 17 we estimated stratum-specific pooled sensitivity and specificity: peer-review status; reporting 18 of data at the level of patients or samples; the type of SARS-CoV-2 antigen used; whether 19 testing was via commercial kit or an in-house assay; whether the population used to estimate 20 specificity consisted of samples collected prior to the emergence of SARS CoV-2, individuals 21 22 without suspected COVID-19 tested during the epidemic, individuals with suspected COVID-19, 23 or individuals with other viral infections; and the timing of sample collection with respect to the 24 onset of symptoms (during the first week, during the second week, or after the second week). 25 26 In these analyses, we pooled regardless of immunoglobulin class in order to maximize sample 27 size, using the same approach described above. Because data were not available to study the 28 association between the timing of sampling and specificity, this analysis was done with 29 30 univariate models and included studies that only reported sensitivity. 31 32 We used the statistical software R12 package Lme413 for meta-analyses, and package mada to 33 14 34 create forest plots. 35 36 Patient and public involvement 37 38 Patients were not involved in the development of the research question or its outcome 39 measures, conduct of the research, or preparation of the manuscript. 40 41 RESULTS 42 43 We identified 4969 unique titles. We excluded 4696 based on title and abstract screening and 44 238 after full-text review. 39 studies15-53 met inclusion criteria (Figure 1). Supplement Tables S1 45 and S2 report characteristics of each individual study (pages S3-S6), and Supplement Table S3 46 47 enumerates methodological aspects of the sero-diagnostics and the reference tests (page S7). 48 Table 1 summarizes the studies by test method; the sum of the number of studies exceeds 39 49 because some studies evaluated more than one method. 28/39 (72%) studies took place in 50 16-35 38-41 45-48 15 36 43 42 51 China, 3/39 (7%) in Italy, and the remainder in the United States, 52 Denmark,51 Spain,37 Sweden,53, Japan,44 and the United Kingdom.52 Both sensitivity and 53 specificity were reported in 31/39 (79%) studies,15 17 18 20 23 24 26-30 32 35-37 39-53 sensitivity alone in 54 7/39 (18%),16 19 21 22 25 31 34 and specificity alone in 1/3933 (3%). Amongst included studies, 22/39 55 24-34 37 42-47 49-52 56 (56%) were not peer reviewed, and 31/39 (79%) used a case-control design for 57 58 6 59 60 https://mc.manuscriptcentral.com/bmj BMJ Page 8 of 51

1 2 3 selecting the study population15-22 24 25 27-30 34-44 46-53. 3/39 (8%) included outpatient 4 29 49 52 15-22 24 25 27 28 30 31 34 36 39-41 43 44 46- 5 populations, and 25/39 (58%) were restricted to inpatients. 6 48 51 Clinical severity of the study population was not reported in 23/39 (59%).2 15 18 19 21 23-28 31-34 7 37 39 42-46 53 Accuracy stratified by weeks since symptom onset was reported in 18/39 (46%).16 18 8 20 25 27 29 31 37 39 40 44 47-53 9 10 Confidential: For Review Only 11 RT-PCR was the reference standard to rule-in SARS-CoV-2 infection in all 38 studies reporting 12 15-32 34-53 13 sensitivity. RT-PCR as a reference standard was performed on nasopharyngeal 14 specimens in 16/39 (41%) studies,15 16 18 26 28 29 34 36 37 40 41 43 44 49 50 52 in sputum, saliva, or oral, 15 throat/pharyn28geal swabs in 7/39 (18%),17 19 22 23 25 31 39 and the specimen type was not 16 20 21 24 27 30 32 35 38 42 45-48 51 53 17 reported in 15/39 (38%). Blood or serum samples for measuring 18 specificity were obtained from different sources: individuals with symptoms of COVID-19 who 19 tested negative by RT-PCR in 10/39 (26%) studies;15 23 26 29 35 39 40 43 45 47 stored specimens 20 collected prior to the COVID-19 epidemic in 11/39 (28%) studies,17 20 33 37 44 48 49 52 53 stored 21 22 specimens collected during the epidemic from individuals without suspected COVID-19 in 23 10/39 (26%)15 18 30 32 38 41 46 47 51 studies; and from individuals with lab-confirmed infection by 24 other viruses (respiratory or non-respiratory) collected either before or during the COVID-19 25 17 24 51 52 26 epidemic in 4/39 (10%) studies. 27 28 Sero-diagnostics that detected IgM were tested in 24/39 (61%) studies,15 17 18 20 23 24 26 27 29 30 36-41 29 44-49 52 53 15 17 18 20 27 29 30 35-41 44-49 51-53 15 18 30 IgG in 25/39 (63%), and both IgM and IgG in 16/39(41%). 31 23 26-29 32 37 42-44 47 49 51 52 Sero-diagnostics that detected antibodies against the nucleocapsid 32 protein alone were used in 8/38 (21%) studies17 19 20 27 28 37 41 52 the surface protein alone in 33 15 20 23 24 33 42 43 48 49 51 52 34 11/39 (28%), and both nucleocapsid and surface proteins in 14/39 35 (36%).16 18 22 30 34-36 38 39 46 47 50 52 53 The antigen was not reported in 10/39 (24%) studies.21 25 26 29 36 31 44 45 49 51 52 Amongst 16 studies evaluating potential point of care tests (LFIA), none performed 37 38 testing at the point of care. Whole blood obtained by venipuncture, finger-stick, or capillary 39 sampling—i.e. specimen types that would be used at the point of care—were included in 5/16 40 studies of LFIA15 23 31 43 53 and data on the performance of such specimens were available for a 41 42 total of 44 patients across all study arms (2% of patients tested with LFIA). Commercial kits 43 were evaluated in 22/39 (61%) studies,15 16 20 21 25 26 28-31 34 36 37 39-41 43 44 49 51-53 in-house assays in 44 16/39 (41%),17 22-24 27 32 33 35 38 42 46-50 52 and in one study45 this information was not clear. 45 46 Supplement Table S4 lists the commercial assays whose names were reported (page S10). 47 48 Figure 2 summarizes the QUADAS-2 assessment, and Supplement Figure S1 displays the 49 assessment of each study (page S12). In total, we performed 48 assessments (see Methods for 50 51 why number of assessments exceeds 39). For the patient selection domain, a high risk of bias 52 was seen in 47/48 (98%) QUADAS-2 assessments, mostly due to their case-control design. For 53 the index test domain, 35/48 (73%) assessments concluded a high or unclear risk of bias, arising 54 55 because it was not clear that the sero-diagnostic was interpreted blind to the reference 56 standard. For LFIA (17 of the QUADAS-2 assessments), where test results are subjectively 57 58 7 59 60 https://mc.manuscriptcentral.com/bmj Page 9 of 51 BMJ

1 2 3 interpreted by a human reader (e.g. appearance of a line), a description of the number of 4 5 readers and assessment of reliability was provided in 2/17 (12%) assessments. For the 6 reference standard domain, we judged the risk of bias as low in 36/48 (75%) assessments, and 7 high or unclear in 12/48 (25%), mostly due to inadequate details about specimens used for RT- 8 9 PCR or use of specimens other than nasopharyngeal swabs. Risk of bias from flow and timing 10 was highConfidential: or unclear in 31/48 (65%) due to For missing informationReview or results Only not stratified by the 11 timing of sample collection with respect to symptom onset. Major applicability concerns for the 12 13 index test were seen in 13/48 (27%), mostly due to LFIA being performed in labs and not using 14 point-of-care type specimens. 15 16 17 Table 2 enumerates within-study and pooled sensitivity stratified by test method and 18 immunoglobulin class. Within each test method (CLIA, ELISA, LFIA), point estimates were similar 19 between the different types of immunoglobulins and confidence intervals overlapped. Within 20 each class of immunoglobulin, sensitivity was lowest for the LFIA method. Table 3 reports on 21 22 specificity. Pooled specificity was high regardless of test method and immunoglobulin class. We 23 report measures of heterogeneity in Supplement Table S5 (page S13), and provide forest plots 24 in Supplement Figure S2 (pages S15-S24). Study-level random effects could only be estimated 25 26 in models for sensitivity and specificity of LFIA with combined IgG and IgM, and these results 27 indicated between-study heterogeneity. Arm-level random effects were estimated for all three 28 test methods and all three immunoglobulin assays, and all indicated between-arm 29 30 heterogeneity. For all test methods and immunoglobulin classes, visual inspection of forest 31 plots demonstrated between-study and between-arm variability in sensitivity and specificity, 32 albeit less pronounced in the latter. As these observations suggested that immunoglobulin class 33 34 is not associated with either sensitivity or specificity, we did a post-hoc analysis without 35 stratification by immunoglobulin, the results of which are reported in Supplement Table S6 36 (page S13). Pooled sensitivities were: ELISA, 84% (77% to 89%); LFIA, 73% (60% to 84%); CLIA, 37 38 89% (79% to 96%). Pooled specificities were: ELISA, 99% (97% to 100%); LFIA, 97% (95% to 39 98%); CLIA, 97% (84% to 99%). Tau-squared estimates indicated between-study and between- 40 arm heterogeneity in these post-hoc analyses. 41 42 43 Sensitivity and specificity reported in three studies that used sero-diagnostic methods other 44 than ELISA, LFIA, or CLIA, are reported in Supplementary Table S7. Sensitivity and/or specificity 45 46 were low for all, with the exception of an IgM enzyme immunoassay in one arm of 16 patients. 47 48 Table 4 reports stratified meta-analyses for evaluating potential sources of heterogeneity in 49 sensitivity and specificity. Peer-review status and the level of reporting (samples versus 50 51 patients) were not associated with accuracy. Point estimates for pooled sensitivity and 52 specificity were higher when both surface and nucleocapsid proteins were used, although 53 confidence intervals overlapped. Point estimates of pooled sensitivity were lower for 54 55 commercial kits versus in-house assays, for all three methods, with the strongest difference 56 seen for LFIA, where sensitivity of commercial kits was 67% (51% to 80%) and of non- 57 58 8 59 60 https://mc.manuscriptcentral.com/bmj BMJ Page 10 of 51

1 2 3 commercial tests, 88% (83% to 91%). For all three test methods, pooled specificity was high 4 5 when measured in populations where COVID-19 was not suspected, regardless of whether the 6 sampling had been done prior to or during the epidemic. For both LFIA and CLIA, pooled 7 specificity was lower amongst individuals with suspected COVID-19 as compared to other 8 9 groups; similar data were not available for ELISA. For LFIA, specificity was lower when 10 estimated in individuals with other viral infections, but this was not the case for ELISA or CLIA. 11 Confidential: For Review Only 12 13 Pooled sensitivity stratified by the timing of sample collection relative to symptom onset is 14 reported in Table 5. Regardless of immunoglobulin class or test method, pooled sensitivity was 15 lowest in the first week of symptom onset, and highest in the third week or later. Data on 16 17 specificity stratified by timing were not available. 18 19 DISCUSSION 20 21 22 Principal findings 23 We found existing evidence about the diagnostic accuracy of COVID-19 sero-diagnostics is 24 characterized by high risks of bias, heterogeneity, and limited generalizability to point-of-care 25 26 testing and to outpatient populations. We found sensitivities were consistently lower with the 27 LFIA method compared to ELISA and CLIA. For each test method, the type of immunoglobulin 28 being measured—IgM, IgG, or both—was not associated with diagnostic accuracy. Pooled 29 30 sensitivities were lower with commercial kits, and in the first and second week after symptom 31 onset as compared to after the second week. Pooled specificities of each test method were 32 high. However, stratified results suggested specificity was lower in individuals with suspected 33 34 COVID-19, and that other viral infections could lead to false-positive results for the LFIA 35 method. These observations should serve as an urgent wakeup call regarding the state of 36 evidence on COVID-19 sero-diagnostics, particularly those being marketed as point-of-care 37 38 tests. 39 40 Meaning of the study 41 54 42 The utility of a low-cost, rapid, and accurate point-of-care test has spurred the development 43 and marketing of a number of COVID-19 LFIA sero-diagnostics.55 We found no published or 44 posted studies had actually performed LFIA at the point of care, and the vast majority did not 45 46 test specimen types that would be used at the point of care. The low sensitivity of LFIA is 47 particularly concerning given most studies used sample preparation steps that are likely to 48 increase sensitivity compared to the use of whole blood as would be done at the point of care. 49 These observations argue against the use of LFIA COVID-19 sero-diagnostics beyond research 50 51 and evaluation purposes, and support interim recommendations issued by the World Health 52 Organisation. 56 53 54 55 Cautious interpretation of specificity estimates is warranted for a number of reasons. 56 Importantly, few data were available from people who were tested due to suspected SARS-CoV- 57 58 9 59 60 https://mc.manuscriptcentral.com/bmj Page 11 of 51 BMJ

1 2 3 2 infection; hence our overall pooled estimates may not be generalizable to people who need 4 5 testing because of COVID-19 symptoms. For CLIA, the lower specificity amongst people with 6 suspected COVID-19 could be a spurious finding due to false-negative RT-PCR given CLIA 7 specificity was high amongst people with confirmed other viral infections. By contrast, for LFIA, 8 9 other viral infections could have contributed to the lower specificity in suspected COVID-19. 10 Confidential: For Review Only 11 Considering pooled sensitivity ranged from 13% to 56% within one week of symptom onset, 12 13 current COVID-19 sero-diagnostics have limited utility in the diagnosis of acute COVID-19. And 14 while sensitivity estimates were higher in the third week or later, we observed an important 15 false negative rate for ELISA and LFIA even at this time point, and high heterogeneity for all 16 17 methods. Moreover, the majority of studies did not provide time-stratified results. 18 19 Strengths and limitations of this review 20 Our study has a number of strengths. First, we used sensitive search strategies and included 21 22 pre-peer reviewed literature. While our use of pre-peer reviewed studies may be criticized, we 23 found that biases were also present in the peer-reviewed literature. Moreover, pre-peer review 24 literature has taken an unprecedentedly large role57in discussions and policy making around 25 26 COVID-19—hence the importance of subjecting this literature to critical review. Second, is our 27 systematic assessment of potential sources of bias conducted by two independent reviewers. 28 Third, all data extraction was verified by a second investigator. 29 30 31 Our study also has some limitations. Most importantly, we compared pooled estimates 32 between different study populations. As such, there is the possibility of confounding (e.g. from 33 34 differences in timing sampling, between studies) explaining differences in sensitivity or 35 specificity.58 This approach was taken because few studies performed head-to-head 36 comparisons. We did not perform meta-regression as many studies would have been excluded 37 38 due to limited reporting of co-variates. 39 40 Conclusion and future research 41 42 In summary, we have found major weaknesses in the evidence base for COVID-19 sero- 43 diagnostics. While the scientific community should be lauded for the pace at which novel sero- 44 diagnostics have been developed, this review underscores the need for high quality clinical 45 46 studies to evaluate these tools. With international collaboration, such studies could be rapidly 47 conducted and provide less biased, more precise, and more generalizable information upon 48 which to base clinical and public health policy to alleviate the unprecedented global health 49 emergency that is COVID-19. 50 51 52 FOOTNOTES 53 Authors’ contribution 54 55 Original idea and study design: MLB, FAK, AB, DM, JC; Search strategy & execution: MLB; Screen 56 ing of studies: MLB, GT, FAK; Data extraction and quality assessment: MLB, GT, SL, EM, AT, ZL; Fi 57 58 10 59 60 https://mc.manuscriptcentral.com/bmj BMJ Page 12 of 51

1 2 3 gures: MLB, SA; Data analysis: MLB, AB, FAK, ZL; Data interpretation, writing & critical review of 4 5 manuscript: all authors. 6 Ethical approval: Not required. 7 Conflict of Interests: All authors have completed the ICMJE uniform disclosure form at 8 www.icmje.org/coi_disclosure.pdf. Dr. Law reports personal fees from Carebook Technologies 9 10 Inc., outside the submitted work. All other authors have nothing to disclose. 11 Acknowledgments:Confidential: The authors thanks Ms. For Geneviève Review Gore, McGill University Only Academic 12 Librarian, for assistance in developing the search strategy for OVID-Medline and the pre-peer 13 14 review literature. 15 License: The Corresponding Author has the right to grant on behalf of all authors and does 16 grant on behalf of all authors, an exclusive licence on a worldwide basis to the BMJ Publishing 17 Group Ltd (“BMJ”), and its Licences to permit this article (if accepted) to be published in The 18 19 BMJ’s editions and any other BMJ products and to exploit all subsidiary rights, as set out in our 20 licence. 21 Funding: There was no funding source for this study. 22 Data Sharing: Data can be requested from the corresponding author. 23 24 Transparency statement: The study guarantor (FAK) affirms that this manuscript is an honest, 25 accurate, and transparent account of the study being reported; that no important aspects of 26 the study have been omitted; and that any discrepancies from the study as planned (and, if 27 relevant, registered) have been explained. 28 Dissemination to participants and related patient and public communities: The results of the 29 30 meta-analysis will be disseminated to patients, providers, policymakers through social media, 31 academic and institutional networks. 32 Protocol: Available on 33 https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=179452 34 35 36 37 38 FIGURE LEGENDS 39 Figure 1. Study selection. 40 41 Figure 2. Summary of QUADAS-2 assessment. 42 43 REFERENCES 44 45 46 1. Winichakoon P, Chaiwarith R, Liwsrisakun C, et al. Negative Nasopharyngeal and 47 Oropharyngeal Swabs Do Not Rule Out COVID-19. J Clin Microbiol 2020;58(5) doi: 48 10.1128/jcm.00297-20 [published Online First: 2020/02/28] 49 50 2. Chen Z, Li Y, Wu B, et al. A Patient with COVID-19 Presenting a False-Negative Reverse 51 Transcriptase Polymerase Chain Reaction Result. Korean J Radiol 2020;21(5):623-24. doi: 52 10.3348/kjr.2020.0195 [published Online First: 2020/03/25] 53 3. Sethuraman N, Jeremiah SS, Ryo A. Interpreting Diagnostic Tests for SARS-CoV-2. JAMA 2020 54 55 doi: 10.1001/jama.2020.8259 56 57 58 11 59 60 https://mc.manuscriptcentral.com/bmj Page 13 of 51 BMJ

1 2 3 4. Microbiology ASf. ASM Expresses Concern about Coronavirus Test Reagent Shortages. 4 5 Available on: https://asm.org/Articles/Policy/2020/March/ASM-Expresses-Concern- 6 about-Test-Reagent-Shortages. Accessed 22 May 2020. 2020 7 5. Maxmen A. The researchers taking a gamble with antibody tests for coronavirus. Nature 2020 8 doi: 10.1038/d41586-020-01163-5 [published Online First: 2020/04/24] 9 6. Liberati A, Altman DG, Tetzlaff J, et al. The PRISMA statement for reporting systematic 10 11 reviewsConfidential: and meta-analyses of studies For that evaluate Review healthcare interventions:Only 12 explanation and elaboration. BMJ 2009;339:b2700. doi: 10.1136/bmj.b2700 13 7. Saah AJ, Hoover DR. "Sensitivity" and "specificity" reconsidered: the meaning of these terms 14 in analytical and diagnostic settings. Ann Intern Med 1997;126(1):91-4. doi: 15 16 10.7326/0003-4819-126-1-199701010-00026 [published Online First: 1997/01/01] 17 8. Whiting PF, Rutjes AW, Westwood ME, et al. QUADAS-2: a revised tool for the quality 18 assessment of diagnostic accuracy studies. Ann Intern Med 2011;155(8):529-36. doi: 19 10.7326/0003-4819-155-8-201110180-00009 [published Online First: 2011/10/19] 20 9. Meyer B, Drosten C, Müller MA. Serological assays for emerging coronaviruses: challenges 21 22 and pitfalls. Virus Res 2014;194:175-83. doi: 10.1016/j.virusres.2014.03.018 [published 23 Online First: 2014/03/29] 24 10. Woo PCY, Lau SKP, Wong BHL, et al. Detection of Specific Antibodies to Severe Acute 25 Respiratory Syndrome (SARS) Coronavirus Nucleocapsid Protein for Serodiagnosis of 26 27 SARS Coronavirus Pneumonia. Journal of Clinical Microbiology 2004;42(5):2306-09. doi: 28 10.1128/jcm.42.5.2306-2309.2004 29 11. Woo PC, Lau SK, Wong BH, et al. Differential sensitivities of severe acute respiratory 30 syndrome (SARS) coronavirus spike polypeptide enzyme-linked immunosorbent assay 31 (ELISA) and SARS coronavirus nucleocapsid protein ELISA for serodiagnosis of SARS 32 33 coronavirus pneumonia. J Clin Microbiol 2005;43(7):3054-8. doi: 10.1128/jcm.43.7.3054- 34 3058.2005 [published Online First: 2005/07/08] 35 12. R: A language and environment for statistical computing [program]: R Foundation for 36 Statistical Computing, , 2019. 37 38 13. Bates D, Mächler M, Bolker B, et al. Fitting Linear Mixed-Effects Models Using lme4. 2015 39 2015;67(1):48. doi: 10.18637/jss.v067.i01 [published Online First: 2015-10-07] 40 14. Mada: Meta-Analysis of Diagnostic Accuracy. [program], 2019. 41 15. Cassaniti I, Novazzi F, Giardina F, et al. Performance of VivaDiagTM COVID-19 IgM/IgG Rapid 42 Test is inadequate for diagnosis of COVID-19 in acute patients referring to emergency 43 44 room department. Journal of Medical Virology 2020;30:30. 45 16. Gao HX, Li YN, Xu ZG, et al. Detection of serum immunoglobulin M and immunoglobulin G 46 antibodies in 2019-novel coronavirus infected cases from different stages. Chinese 47 Medical Journal 2020;26:26. 48 49 17. Guo L, Ren L, Yang S, et al. Profiling Early Humoral Response to Diagnose Novel Coronavirus 50 Disease (COVID-19). Clinical Infectious Diseases 2020;21:21. 51 18. Liu W, Liu L, Kou G, et al. Evaluation of Nucleocapsid and Spike Protein-based ELISAs for 52 detecting antibodies against SARS-CoV-2. Journal of Clinical Microbiology 2020;30:30. 53 19. Zhang W, Du RH, Li B, et al. Molecular and serological investigation of 2019-nCoV infected 54 55 patients: implication of multiple shedding routes. Emerging Microbes & Infections 56 2020;9(1):386-89. 57 58 12 59 60 https://mc.manuscriptcentral.com/bmj BMJ Page 14 of 51

1 2 3 20. Zhao J, Yuan Q, Wang H, et al. Antibody responses to SARS-CoV-2 in patients of novel 4 5 coronavirus disease 2019. Clinical Infectious Diseases 2020;28:28. 6 21. Xiao DAT, Gao DC, Zhang DS. Profile of Specific Antibodies to SARS-CoV-2: The First Report. 7 Journal of Infection 2020;21:21. 8 22. To KK, Tsang OT, Leung WS, et al. Temporal profiles of viral load in posterior oropharyngeal 9 saliva samples and serum antibody responses during infection by SARS-CoV-2: an 10 11 observationalConfidential: cohort study. The Lancet For Infectious Review Diseases 2020;23:23. Only 12 23. Li Z, Yi Y, Luo X, et al. Development and Clinical Application of A Rapid IgM-IgG Combined 13 Antibody Test for SARS-CoV-2 Infection Diagnosis. Journal of Medical Virology 14 2020;27:27. 15 16 24. Cai X, Chen J, Hu J, et al. A Peptide-based Magnetic Chemiluminescence Enzyme 17 Immunoassay for Serological Diagnosis of Corona Virus Disease 2019 (COVID-19). 2020 18 25. Gao Y, Yuan Y, Li TT, et al. Evaluation the auxiliary diagnosis value of antibodies assays for 19 detection of novel coronavirus (SARS-Cov-2) causing an outbreak of pneumonia (COVID- 20 19). 2020 21 22 26. Jia X, Zhang P, Tian Y, et al. Clinical significance of IgM and IgG test for diagnosis of highly 23 suspected COVID-19 infection. 2020 24 27. Lin D, Liu L, Zhang M, et al. Evaluations of serological test in the diagnosis of 2019 novel 25 coronavirus (SARS-CoV-2) infections during the COVID-19 outbreak. 2020 26 27 28. Liu L, Liu W, Wang S, et al. A preliminary study on serological assay for severe acute 28 respiratory syndrome coronavirus 2 (SARS-CoV-2) in 238 admitted hospital patients. 29 2020 30 29. Liu Y, Liu Y, Diao B, et al. Diagnostic Indexes of a Rapid IgG/IgM Combined Antibody Test for 31 SARS-CoV-2. 2020 32 33 30. Lou B, Li T, Zheng S, et al. Serology characteristics of SARS-CoV-2 infection since the 34 exposure and post symptoms onset. 2020 35 31. Pan Y, Li X, Yang G, et al. Serological immunochromatographic approach in diagnosis with 36 SARS-CoV-2 infected COVID-19 patients. 2020 37 38 32. Zhang P, Gao Q, Wang T, et al. Evaluation of recombinant nucleocapsid and spike proteins 39 for serological diagnosis of novel coronavirus disease 2019 (COVID-19). 2020 40 33. Zhao R, Li M, Song H, et al. Serological diagnostic kit of SARS-CoV-2 antibodies using CHO- 41 expressed full-length SARS-CoV-2 S1 proteins. 2020 42 34. Long Qx, Deng Hj, Chen J, et al. Antibody responses to SARS-CoV-2 in COVID-19 patients: the 43 44 perspective application of serological tests in clinical practice. 2020 45 35. Chen Z, Zhang Z, Zhai X, et al. Rapid and sensitive detection of anti-SARS-CoV-2 IgG using 46 lanthanide-doped nanoparticles-based lateral flow immunoassay. Analytical Chemistry 47 2020;23:23. 48 49 36. Infantino M, Grossi V, Lari B, et al. Diagnostic accuracy of an automated chemiluminescent 50 immunoassay for anti-SARS-CoV-2 IgM and IgG antibodies: an Italian experience. Journal 51 of Medical Virology 2020;24:24. 52 37. Garcia FP, Perez Tanoira R, Romanyk Cabrera JP, et al. Rapid diagnosis of SARS-CoV-2 53 infection by detecting IgG and IgM antibodies with an immunochromatographic device: 54 55 a prospective single-center study. 2020 56 57 58 13 59 60 https://mc.manuscriptcentral.com/bmj Page 15 of 51 BMJ

1 2 3 38. Zhong L, Chuan J, Gong B, et al. Detection of serum IgM and IgG for COVID-19 diagnosis. 4 5 Science China Life sciences 2020;63(5):777-80. 6 39. Jin Y, Wang M, Zuo Z, et al. Diagnostic value and dynamic variance of serum antibody in 7 coronavirus disease 2019. International Journal of Infectious Diseases 2020;94:49-52. 8 40. Xie J, Ding C, Li J, et al. Characteristics of Patients with Coronavirus Disease (COVID-19) 9 Confirmed using an IgM-IgG Antibody Test. Journal of Medical Virology 2020;24:24. 10 11 41. XiangConfidential: F, Wang X, He X, et al. Antibody ForDetection Review and Dynamic Characteristics Only in Patients 12 with COVID-19. Clinical Infectious Diseases 2020;19:19. 13 42. Freeman B, Lester S, Mills L, et al. Validation of a SARS-CoV-2 spike ELISA for use in contact 14 investigations and serosurveillance. 2020 15 16 43. Paradiso AV, De Summa S, Loconsole D, et al. Clinical meanings of rapid serological assay in 17 patients tested for SARS-Co2 RT-PCR. 2020 18 44. Imai K, Tabata S, Ikeda M, et al. Clinical evaluation of an immunochromatographic IgM/IgG 19 antibody assay and chest computed tomography for the diagnosis of COVID-19. 2020 20 45. Yangchun F. Optimize Clinical Laboratory Diagnosis of COVID-19 from Suspect Cases by 21 22 Likelihood Ratio of SARS-CoV-2 IgM and IgG antibody. 2020 23 46. Qian C, Zhou M, Cheng F, et al. Development and Multicenter Performance Evaluation of 24 The First Fully Automated SARS-CoV-2 IgM and IgG Immunoassays. 2020 25 47. Ma H, Zeng W, He H, et al. COVID-19 diagnosis and study of serum SARS-CoV-2 specific IgA, 26 27 IgM and IgG by a quantitative and sensitive immunoassay. 2020 28 48. Perera RA, Mok CK, Tsang OT, et al. Serological assays for severe acute respiratory 29 syndrome coronavirus 2 (SARS-CoV-2), March 2020. Euro Surveillance: Bulletin Europeen 30 sur les Maladies Transmissibles = European Communicable Disease Bulletin 2020;25(16) 31 49. Adams ER, An, R, et al. Evaluation of antibody testing for SARS-Cov-2 using ELISA and lateral 32 33 flow immunoassays. 2020 34 50. Burbelo PD, Riedo FX, Morishima C, et al. Detection of Nucleocapsid Antibody to SARS-CoV- 35 2 is More Sensitive than Antibody to Spike Protein in COVID-19 Patients. 2020 36 51. Lassauniere R, Frische A, Harboe ZB, et al. Evaluation of nine commercial SARS-CoV-2 37 38 immunoassays. 2020 39 52. Whitman JD, Hiatt J, Mowery CT, et al. Test performance evaluation of SARS-CoV-2 40 serological assays. medRxiv 2020:2020.04.25.20074856. doi: 41 10.1101/2020.04.25.20074856 42 53. Hoffman T, Nissen K, Krambrich J, et al. Evaluation of a COVID-19 IgM and IgG rapid test; an 43 44 efficient tool for assessment of past exposure to SARS-CoV-2. Infect Ecol Epidemiol 45 2020;10(1):1754538. doi: 10.1080/20008686.2020.1754538 [published Online First: 46 2020/05/05] 47 54. Beeching NJ, Fletcher TE, Beadsworth MBJ. Covid-19: testing times. BMJ 2020;369:m1403. 48 49 doi: 10.1136/bmj.m1403 50 55. Petherick A. Developing antibody tests for SARS-CoV-2. Lancet 2020;395(10230):1101-02. 51 doi: 10.1016/s0140-6736(20)30788-1 [published Online First: 2020/04/06] 52 56. World Health O. Advice on the use of point-of-care immunodiagnostic tests for COVID-19: 53 scientific brief, 8 April 2020. Geneva: World Health Organization, 2020. 54 55 56 57 58 14 59 60 https://mc.manuscriptcentral.com/bmj BMJ Page 16 of 51

1 2 3 57. Majumder MS, Mandl KD. Early in the epidemic: impact of preprints on global discourse 4 5 about COVID-19 transmissibility. Lancet Glob Health 2020;8(5):e627-e30. doi: 6 10.1016/s2214-109x(20)30113-3 [published Online First: 2020/03/30] 7 58. Deeks JJ. Systematic reviews in health care: Systematic reviews of evaluations of diagnostic 8 and screening tests. Bmj 2001;323(7305):157-62. doi: 10.1136/bmj.323.7305.157 9 [published Online First: 2001/07/21] 10 11 Confidential: For Review Only 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 15 59 60 https://mc.manuscriptcentral.com/bmj Page 17 of 51 BMJ

1 2 MAIN TABLES 3 4 Table 1| Summary of characteristics of included studies, stratified by sero-diagnostic method 5 6 ELISA LFIA CLIA Other * 7 N studies N participants N studies N participants N studies N participants N studies N participants 8 Overall participants 15 2548 16 † 1857 13 3750 3 140 9 Peer reviewed 10 Yes Confidential:8 1339 For 6Review810 Only7 3090 1 16 11 No 7 1209 11 1047 6 660 2 123 12 Geographic location 13 China 11 1572 8 911 12 3635 1 57 14 Denmark 1 112 1 62 - 1 16 15 Italy - - 3 301 1 125 - - 16 Japan - - 1 160 - - - - 17 Spain - - 1 100 - - - - 18 Sweden - - 1 153 - - - - United Kingdom 1 90 1 90 - - 1 67 19 United States 2 774 1 80 - - - - 20 Clinical setting 21 Inpatient only 11 1307 9 1508 11 3119 1 16 22 Inpatient & Outpatient 2 170 3 349 - - 23 Not reported 2 1041 5 778 2 631 2 124 24 Study design 25 Case-control 15 2548 11 845 11 3410 2 83 26 Cohort 0 - 6 1012 1‡ 56 1 57 § 27 Time from symptom onset to index test 28 First week 6 172 7 190 5 41 - - Second week 7 239 9 195 5 105 - - 29 Third week or later 5 159 9 215 5 328 - - 30 Population for estimating specificity|| 31 None 2 - 3 - 3 - 1 - 32 Stratified 13 - 8 - 33 Samples collected prior to COVID-19 7 1015 5 384 1 330 1 32 34 epidemic 35 Samples collected during COVID-19 5 830 3 280 4 2296 - - 36 epidemic in individuals not suspected to 37 have COVID-19 ¶ 38 Individuals suspected to have COVID-19 - - 5 361 4 167 1 33 39 but RT-PCR negative Individuals with confirmed other viral 3 34 1 52 1 167 - - 40 infection** 41 Mix of above 1 519 1 32 2 144 - - 42 Accuracy at level of patient or sample 43 Patient 6 1495 9 1358 8 3080 2 73 44 Samples†† 9 2115 8 1252 5 1599 1 132 45 Notes: 46 * Includes: Enzyme immunoassay, fluorescence immunochromatographic assay, liquid phase immunoassay 47 †One study, (Cassaniti et al ref: 15) includes two distinct populations, triage and hospitalized patients, with different study design (cohort and case-control). 48 In this table we report this study as two different cohorts, hence sum of number of studies across LFIA rows is 17. 49 ‡One study was poorly reported, and it was difficult to classify study design 50 §First week range:0-7 days (one cohort reported 0-10 days, counted in this group); second week: 7-14 days; third week: 15 days or more. ||Numbers include samples and patients. Some studies reported more than one type of population to access specificity. 51 ¶ Includes studies where timing was unclear. 52 **Includes some samples originating from prior to COVID-19 epidemic, and some during the epidemic, as further stratification not possible. 53 ††Patients could have contributed more than 1 sample and analyses did not account for correlation. 54 Abbreviations: 55 CLIA: chemiluminescent immunoassay, ELISA: enzyme-linked immunosorbent assay, LFIA: lateral flow immunoassay, N: number, RT-PCR: reverse 56 transcription polymerase chain reaction 57 58 16 59 60 https://mc.manuscriptcentral.com/bmj BMJ Page 18 of 51

1 2 3 Table 2| Individual and pooled sensitivity by sero-diagnostic method and immunoglobulin 4 * 5 class detected Numbers in parentheses are 95% confidence intervals 6 7 IgM IgG IgG or IgM 8 TP FN Sensitivity TP FN Sensitivity TP FN Sensitivity 9 ELISA (N=13 arms) 10Liu, Liu 18 Confidential:174 40 81% (75%, 86%) For172 Review42 80% (74%, 85%) Only186 28 87% (82%, 91%) 11Adams, An 49 28 12 70% (54%, 81%) 34 6 84% (70%, 92%) 34 6 84% (70%,92%) 12Whitman, Hiatt 52 13 Commercial kit 74 56 57 (48%, 65%) 96 34 74% (66%, 80%) 98 32 75% (67%, 82%) 14 In house ------94 36 72% (64%,79%) 15Liu, Liu 28 ------127 26 83% (76%, 88%) 16Freeman, Lester 42 ------95 4 96% (89%, 98%) 17Zhao, Yuan 20 143 30 82% (76%, 87%) 112 61 65% (57%, 71%) 18Lou, Li 30 74 6 92% (84%,96%) 71 9 88% (79%,94%) - - - 19Zhong, Chuan 38 46 1 97% (88%, 99%) 46 1 97% (88%, 99%) - - - 20Xiang, Wang 41 51 15 77% (65%, 85%) 55 11 83% (72%, 90%) - - - 21Perera, Mok 48 37 10 78% (65%, 87%) 34 13 72% (58%, 83%) - - - 22Guo, Ren 17 62 20 75% (65%, 83%) ------23Lassauniere, Frische 51 - - - 20 10 66% (49%, 80%) - - - 24POOLED 689 190 81% (72%, 89%) 640 187 81% (72%, 88%) 634 132 84% (76%, 91%) 25LFIA (N=36 arms) 26Li, Yi 23 328 69 83% (79%, 86%) 280 117 70% (66%, 75%) 352 45 89% (85%, 91%) 27Garcia, Perez Tanoira 37 12 43 22% (92%, 100%) 23 32 42% (30%, 55%) 26 29 47% (35%, 60%) 44 28Imai, Tabata 60 79 43% (35%, 52%) 20 119 15% (10%, 21%) 60 79 43% (35%, 51%) 52 29Whitman, Hiatt Commercial kit 1 79 49 62% (53%, 70%) 71 57 55% (47%, 64%) 83 45 65% (56%, 81%) 30 Commercial kit 2 91 37 71% (63%, 78%) 80 48 62% (54%, 70%) 95 33 74% (66%, 81%) 31 Commercial kit 3 85 41 68% (59%, 75%) 84 42 67% (58%, 74%) 85 41 67% (59%, 75%) 32 Commercial kit 4 94 36 72% (64%, 79%) 62 54 53% (44%, 62%) 95 35 73% (65%, 80%) 33 Commercial kit 5 33 82 29% (21%, 38%) 72 44 62% (53%, 70%) 66 50 57% (48%, 65%) 34 Commercial kit 6 89 40 69% (61%, 76%) 69 60 53% (45%, 62%) 91 38 70% (62%, 78%) 35 Commercial kit 7 62 67 48% (40%, 57%) 73 56 57% (48%, 65%) 74 55 57% (49%, 65%) 36 Commercial kit 8 79 51 61% (52%, 69%) 73 57 56% (48%, 54%) 80 50 61% (53%, 69%) 37 Commercial kit 9 79 42 65% (57%, 73%) 77 44 64% (55%, 72%) 79 42 65% (56%, 73%) 38 Commercial kit 10 ------87 39 69% (60%, 76%) 29 39Liu, Liu 34 56 38% (29%, 48%) 75 15 83% (74%, 89%) 77 13 85% (76%, 91%) 15 40Cassaniti, Novazzi Hospitalized group 25 5 83% (66%, 93%) 24 6 79% (62%,90%) 25 5 82% (66%, 92%) 41 Triage group ------7 31 19% (10%, 34%) 42Lou, Li 30 71 9 89% (80%, 94%) 69 11 86% (77%, 92%) 43Hoffman, Nissen 53 20 9 69% (51%, 83%) 27 2 92% (76%, 97%) 44Chen, Zhang 35 - 7 0 94% (60%, 99%) 45Zhang, Gao 32 ------106 16 87% (79%, 92%) 46Paradiso, De Summa 43 ------21 49 30% (21%, 42%) 47Adams, An 49 48 Commercial kit 1 ------18 15 54% (38%, 70%) 49 Commercial kit 2 ------23 15 60% (45%, 74%) 50 Commercial kit 3 ------21 12 63% (46%, 77%) Commercial kit 4 ------25 12 67% (51%, 80%) 51 Commercial kit 5 ------19 12 61% (44%, 76%) 52 Commercial kit 6 ------20 11 64% (47%, 78%) 53 Commercial kit 7 ------23 10 69% (52%, 82%) 54 Commercial kit 8 ------18 14 56% (39%, 71%) 55 Commercial kit 9 ------22 18 55% (40%, 69%) 56Lassauniere, Frische 51 57 58 17 59 60 https://mc.manuscriptcentral.com/bmj Page 19 of 51 BMJ

1 2 3 Commercial kit 1 ------27 3 89% (73%, 96%) 4 Commercial kit 2 ------27 3 89% (73%, 96%) 5 Commercial kit 3 ------28 2 92% (77%, 97%) 6 Commercial kit 4 ------25 5 82% (66%, 92%) 7 Commercial kit 5 ------4 1 75% (36%, 94%) 8 Commercial kit 6 ------1 0 75% (20%, 97%) 9 POOLED 1241 715 62% (51%, 72%) 1186 764 65% (54%, 75%) 1810 828 68% (52%, 81%) 10CLIA (N=10 arms) 27 11Lin, Liu Confidential:65 14 82% (72%, 89%) For65 Review14 82% (72%, 89%) Only72 7 91% (82%, 95%) 47 12Ma, Zeng 209 7 97% (93%, 98%) 209 7 97% (93%, 98%) 215 1 99% (99%, 100%) Cai, Chen 24 158 118 57% (51%, 63%) 197 79 71% (66%, 76%) - - - 13 Infantino, Grossi 36 44 17 72% (60%, 81%) 46 15 75% (63%, 84%) - - - 14Zhong, Chuan 38 46 1 97% (88%, 99%) 45 2 95% (85%, 98%) - - - 15Jin, Wang 39 13 14 48% (31%, 66%) 24 3 88% (71%, 95%) - - - 16Xie, Ding 40 15 1 91% (69%, 98%) 16 0 97% (77%, 100%) - - - 17Yangchun 45 144 61 70% (64%, 76%) 197 8 96% (92%, 98%) - - - 18Qian, Zhou 46 432 71 86% (82%, 89%) 486 17 97% (95%, 98%) - - - 19Lou, Li 30 69 11 86% (77%, 92%) ------20POOLED 1195 315 84% (71%, 93%) 1285 145 94% (85%, 98%) 287 8 98% (46%, 100%) 21Notes: *Pooled estimates from bivariate random effects meta-analysis. 22Abbreviations: 23CLIA: chemiluminescent immunoassay, ELISA: enzyme-linked immunosorbent assay, LFIA: lateral flow immunoassay. TP: true positive, FN: false negative, TN true negative, FP: false positive, N: number, Ig: immunoglobulin 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 18 59 60 https://mc.manuscriptcentral.com/bmj BMJ Page 20 of 51

1 2 3 Table 3| Individual and pooled sensitivity by sero-diagnostic method and immunoglobulin 4 * 5 class detected Number in parentheses are 95% confidence intervals 6 7 IgM IgG IgG or IgM 8 TN FP Specificity TN FP Specificity TN FP Specificity 9 ELISA (N=13 arms) 10 Liu, Liu 18 Confidential:100 0 100% (96%, 100%) For100 Review0 100% (96%, 100%) Only100 0 100% (96%, 100%) 11 Adams, An 49 50 0 100% (93%, 100%) 50 0 100% (93%, 100%) 50 0 100% (93%, 100%) 12 Whitman, Hiatt 52 13 Commercial kit 155 5 97% (93%, 99%) 142 18 89% (83%, 93%) 140 20 88% (82%, 92%) 14 In house ------152 8 95% (90%, 97%) 28 15 Liu, Liu ------116 4 96% (91%, 98%) 42 16 Freeman, Lester ------515 4 99% (98%, 100%) Zhao, Yuan 20 210 3 99% (96%, 100%) 195 2 99% (96%, 99%) - - - 17 Lou, Li 30 300 0 100% (99%, 100%) 100 0 100% (96%, 100%) - - - 18 Zhong, Chuan 38 299 1 99% (98%, 100%) 299 1 99% (98%, 100%) - - - 19 Xiang, Wang 41 60 0 100% (94%, 100%) 57 3 95% (86%, 98%) - - - 20 Perera, Mok 48 207 0 100% (98%, 100%) 204 3 99% (96%, 100%) - - - 21 Guo, Ren 17 285 0 100% (99%, 100%) ------22 Lassauniere, Frische 51 - - - 79 3 96% (90%, 99%) - - - 23 POOLED 1666 9 99.7% (99.0%, 100%) 1226 30 98.9% (96.7%, 99.8%) 1073 36 98% (93%, 99%) 24 LFIA (N=36 arms) 25 Li, Yi 23 117 11 91% (85%, 95%) 177 11 94% (90%, 97%) 116 12 92% (84%, 95%) 26 Garcia, Perez Tanoira 37 45 0 100% (92%, 100%) 45 0 100% (92%, 100%) 45 0 100% (92%, 100%) 27 Imai, Tabata 44 47 1 98% (89%, 100%) 48 0 100% (93%, 100%) 47 1 98% (89%,100%) 52 28 Whitman, Hiatt 29 Commercial kit 1 138 21 87% (81%, 91%) 151 8 95% (90%, 97%) 134 25 84% (78%, 89%) Commercial kit 2 141 8 95% (90%, 97%) 141 8 95% (90%, 97%) 136 13 91% (85%, 95%) 30 Commercial kit 3 144 15 91% (85%, 94%) 148 11 93% (88%, 96%) 142 17 89% (84%, 93%) 31 Commercial kit 4 129 31 81% (74%, 85%) 152 6 96% (92%, 98%) 129 31 81% (74%,86%) 32 Commercial kit 5 130 6 96% (91%, 98%) 134 2 99% (95%, 100%) 129 7 95% (90%, 98%) 33 Commercial kit 6 157 3 98% (95%, 99%) 158 2 99% (96%, 100%) 155 5 97% (93%, 99%) 34 Commercial kit 7 160 0 100% (98%, 100%) 160 0 100% (98%, 100%) 160 0 100% (98%, 100%) 35 Commercial kit 8 154 5 97% (93%, 99%) 155 4 98% (94%, 99%) 154 5 97% (93%, 99%) 36 Commercial kit 9 139 9 94% (89%, 97%) 143 5 97% (92%, 99%) 139 9 94% (89%, 97%) 37 Commercial kit 10 ------146 1 99% (96%, 100%) 38 Liu, Liu 29 83 6 93% (86%, 97%) 85 15 92% (85%, 96%) 81 8 91% (83%, 95%) 15 39 Cassaniti, Novazzi 40 Hospitalized group 30 0 100% (89%, 100%) 30 0 100% (89%,100%) 30 0 100% (89%, 100%) Triage group ------11 1 92% (65%, 99%) 41 Lou, Li 30 205 4 98% (95%, 99%) 208 1 99% (97%, 100%) - - - 42 Hoffman, Nissen 53 124 0 100% (97%, 100%) 123 1 99% (96%, 100%) 43 Chen, Zhang 35 - - 11 1 92% (65%, 99%) 44 Zhang, Gao 32 ------41 0 100% (92%, 100%) 45 Paradiso, De Summa 43 ------107 13 89% (82%, 94%) 46 Adams, An 49 47 Commercial kit 1 60 0 100% (94%, 100%) 48 Commercial kit 2 ------90 1 99% (94%, 100%) 49 Commercial kit 3 ------58 2 97% (89%, 99%) 50 Commercial kit 4 ------59 2 97% (89%,99%) 51 Commercial kit 5 ------58 2 97% (89%, 99%) Commercial kit 6 ------59 1 99% (91%, 100%) 52 Commercial kit 7 ------57 3 96% (87%, 98%) 53 Commercial kit 8 ------60 0 100% (94%, 100%) 54 Commercial kit 9 ------138 4 97% (93%, 99%) 55 Lassauniere, Frische 51 56 Commercial kit 1 ------32 0 100% (89%, 100%) 57 58 19 59 60 https://mc.manuscriptcentral.com/bmj Page 21 of 51 BMJ

1 2 3 Commercial kit 2 ------32 0 100% (89%, 100%) 4 Commercial kit 3 ------32 0 100% (82%, 100%) 5 Commercial kit 4 ------17 0 100% (82%, 100%) 6 Commercial kit 5 ------12 3 80% (55%, 93%) 7 Commercial kit 6 ------13 2 87% (62%, 96%) 8 POOLED 1943 120 97% (94%, 99%) 2066 67 98% (96%, 99%) 2679 169 97% (95%, 98%) 9 CLIA (N=10 arms) 10 Lin, Liu 27 65 15 81% (71%, 88%) 78 2 98% (91%, 99%) 64 16 80% (70%, 87%) 47 11 Ma, Zeng Confidential:446 37 92% (90%, 94%) For482 Review1 99% (98%, 100%) Only483 0 100% (99%, 100%) 24 12 Cai, Chen 167 0 100% (97%,100%) 167 0 100% (98%, 100%) - - - Infantino, Grossi 36 60 4 94% (85%, 98%) 64 0 100% (94%, 100%) - - - 13 Zhong, Chuan 38 286 14 95% (92%, 97%) 290 10 97% (94%, 98%) - - - 14 Jin, Wang 39 33 0 100% (89%, 100%) 30 3 90% (76%, 97%) - - - 15 Xie, Ding 40 6 34 15% (7%, 29%) 0 40 0% (0%, 9%) - - - 16 Yangchun 45 76 3 96% (89%, 98%) 73 6 92% (84%, 96%) - - - 17 Qian, Zhou 46 1529 29 98% (97%, 99%) 1528 30 98% (97%, 99%) - - - 18 Lou, Li 30 298 2 99% (97%,100%) ------19 POOLED 2966 138 97% (85%, 100%) 2712 92 98% (63%, 100%) 547 16 99 % (80%, 100%) 20 Notes: 21 *Pooled estimates from bivariate random effects meta-analysis. 22 Abbreviations: 23 CLIA: chemiluminescent immunoassay, ELISA: enzyme-linked immunosorbent assay, LFIA: lateral flow immunoassay. TP: true positive, FN: false negative, 24 TN true negative, FP: false positive, N: number, Ig: immunoglobulin 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 20 59 60 https://mc.manuscriptcentral.com/bmj BMJ Page 22 of 51

1 2 3 4 5 Table 4| Accuracy of COVID-19 sero-diagnostics stratified by potential sources of *, † 6 heterogeneity Numbers in parentheses are 95% confidence intervals 7 8 N arms TP FN Pooled sensitivity TN FP Pooled Specificity 9 Peer reviewed 10ELISA: Yes Confidential:10 772 For190 Review84% (74%, 92%) 1916 Only13 99% (98%, 100%) 11ELISA: No 8 613 129 85% (76%,92%) 1452 39 98% (94%, 100%) 12LFA: Yes 6 438 92 79% (41%, 98%) 415 15 97% (84%, 100%) 13LFA: No 32 1566 767 70% (52%, 84%) 2935 160 97% (89%, 99%) 14CLIA: Yes 8 249 53 89% (61%, 99%) 769 105 87% (80%, 90%) 15CLIA: No 9 1970 373 90% (73%, 97%) 4385 86 99% (91%, 100%) 16Data-level 17ELISA: Patient 8 589 133 88% (67%, 96%) 1723 18 98.4% (97%. 100%) 18ELISA: Sample 11 796 186 82% (77%, 87%) 1655 34 99.9% (98%, 100%) 19LFA: Patient 14 667 197 77% (60%, 88%) 775 40 98% (94%, 100%) 20 LFA: Sample 24 1337 662 75% (56%, 95%) 2575 135 98% (94%, 100%) 21 CLIA: Patient 11 1490 185 91% (80%, 97%) 3915 185 88% (49%. 94%) 22 CLIA: Sample 6 729 241 89% (46%, 99%) 1239 6 99% (95%, 100%) 23 Antigen target 24 ELISA 25 S, alone 8 573 166 81% (68%, 90%) 1600 35 98% (94%. 100%) 26 N, alone 5 389 108 78% (71%, 83%) 670 15 98% (92%, 100%) 27 S & N 5 423 45 92% (80%, 98%) 1098 2 99.9 (99.8%, 100%) 28 LFIA 29 S, alone 6 594 191 67% (35%, 89%) 439 51 93% (83%, 100%) 30 31 N, alone 1 26 29 - 45 0 - 32 S & N 7 363 119 79% (64%, 91%) 967 20 99% (92%, 100%) 33CLIA 34S, alone 1 355 197 - 334 0 - 35N, alone 1 72 7 - 64 16 - 36S & N 10 1420 152 92% (73%, 98%) 4601 92 98% (94%, 100%) 37Commercial kit 38 ELISA: Yes 10 937 228 93% (80%, 98%) 1357 35 99.9% (99.5%, 100%) 39 ELISA: No 18 448 91 87% (72%, 95%) 2011 17 99% (98%, 100%) 40 LFIA: Yes 35 1539 798 67% (51%, 80%) 3182 162 97% (95%, 99%) 41 LFIA: No 3 465 61 88% (83%, 91%) 168 13 94% (80%, 100%) 42 CLIA: Yes 7 227 61 84% (61%, 96%) 491 83 90% (20%, 100%) 43 CLIA: No 8 1651 296 93% (76%, 99%) 4515 99 99% (87%, 100%) 44Population for estimating specificity 45 ELISA 46 Samples collected prior to COVID-19 epidemic 7 1060 30 98% (91%, 100%) 47 Samples collected during COVID-19 epidemic in 10 - - 1441 9 99.5% (97.9%, 100%) 48 individuals not suspected to have COVID-19 49 Individuals suspected to have COVID-19 but RT------PCR negative 50 Individuals with confirmed other viral infection 5 - - 555 21 97% (59%, 100%) 51 LFIA 52 Samples collected prior to COVID-19 epidemic 25 2071 79 98% (96%, 100%) 53 Samples collected during COVID-19 epidemic in 7 - - - 518 5 99% (98%, 100%) 54 individuals not suspected to have COVID-19 55 Individuals suspected to have COVID-19 but RT- 12 - - - 410 40 91% (88%, 93%) 56 PCR negative 57 58 21 59 60 https://mc.manuscriptcentral.com/bmj Page 23 of 51 BMJ

1 2 3 Individuals with confirmed other viral infection 12 445 56 91% (76%, 96%) 4 CLIA 5 Samples collected prior to COVID-19 epidemic 1 - - - 642 18 - 6 Samples collected during COVID-19 epidemic in 7 - 4167 125 98% (86%,90%) 7 individuals not suspected to have COVID-19 8 Individuals suspected to have COVID-19 but 8 - - 242 92 77% (15%, 98%) 9 RT-PCR negative 10 Individuals with confirmed other viral infection 1 - - - 334 0 - 11Notes: *Pooled estimatesConfidential: from bivariate random effects meta-analysis For. Review Only 12Abbreviations: CLIA: Chemiluminescent immunoassay ELISA: enzyme-linked immunosorbent assay LFIA: Lateral flow immunoassay. TP: True positive, 13FN: false negative, TN true negative, FP: False positive. N: number. Ig: immunoglobulin. S: Surface protein, N: Nucleocapside protein RT-PCR: Reverse 14transcription polymerase chain reaction 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 22 59 60 https://mc.manuscriptcentral.com/bmj BMJ Page 24 of 51

1 2 3 Table 5| Sensitivity by sero-diagnostic method and timing relative to onset of symptoms*, † 4 5 Time post onset IgM IgG 6 N arms TP FN Pooled Sensitivity N arms TP FN Pooled Sensitivity 7 ELISA 8 First week 4 36 99 27% (16%, 35%) 5 133 39 24% (13%, 38%) 9 Second week 5 169 80 57% (16%, 88%) 6 165 91 65% (46%, 79%) 10 Third week or more 5 146 32 78% (54%, 92%) 6 165 36 82% (76%, 89%) 11 LFIA Confidential: For Review Only 12 First week 15 105 301 25% (16%, 31%) 15 74 329 13% (5%,29%) 13 Second week 15 471 265 51% (30%, 69%) 15 442 312 50% (24%, 77%) Third week or more 15 304 152 70% (58%, 80%) 15 361 95 80% (71%, 87%) 14 CLIA 15 First week 5 28 19 50% (11%, 80%) 5 25 20 56% (36%, 71%) 16 Second week 5 70 35 62% (4%, 99%) 5 82 18 85% (45%, 98%) 17 Third week or more 5 280 44 90% (52%, 99%) 5 321 7 98% (87%, 100%) 18 Notes: *First week range:0-7 days (one cohort reported 0 10 days, counted in this group), second week: 7-14 third 19 week: 15 days or more. †Pooled using univariate random effect meta-analysis. Abbreviations: CLIA: Chemiluminescent 20 immunoassay ELISA: enzyme-linked immunosorbent assay LFIA: Lateral flow immunoassay TP: True positive, FN: false 21 negative, TN true negative, FP: False positive. 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 23 59 60 https://mc.manuscriptcentral.com/bmj Page 25 of 51 BMJ

1 2 3 Figure 1. Study selection 4 5 6 7 5,014 titles identified 8 (Medline, MedRiv, BioRiv) 9 10 Confidential: For Review Only45 duplicates removed 11 2,179 titles not relevant excluded 12

13 14 2,790 abstracts retained 15 16 2,517 abstracts removed 17 2 text identified from 18 hand search 19 20

21 275 full text reviewed 236 full text excluded, reasons: 22 125 Studies not reporting accuracy 23 12 Reference tests either not used or not 24 25 viral culture or RT-PCR 26 6 Other diseases not COVID-19 27 2 Inadequate results report 28 78 index tests not relevant 29 13 <5 participants 30 39 studies (72 arms) 31 32 33

34 35

36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 https://mc.manuscriptcentral.com/bmj BMJ Page 26 of 51

1 2 3 4 5 6 7 8 9 10 11 Confidential: For Review Only 12 13 14 15 16 17 18 19 Figure 2 Summary of QUADAS-2 Assessments 20 21 207x76mm (144 x 144 DPI) 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 https://mc.manuscriptcentral.com/bmj Page 27 of 51 BMJ

Supplementary Appendix 1 2 Supplement to: Bastos, ML et al. “Diagnostic accuracy of serological tests for COVID-19: a systematic review and meta-analysis.” 3 4 5 6 Search strategy (Medline- Ovid)...... 2 7 Table S1- Characteristics of included studies (N=39) ...... 3 8 Confidential: For Review Only 9 Table S2- Patients’ characteristics in included studies...... 5 10 11 Table S3 -Information on laboratory methods among studies that evaluated serologic tests ...... 7 12 Table S4- Studies that reported the commercial kits used in index and/or reference test (RT-PCR) ...... 10 13 14 Figure S1- QUADAS assessment per individual study...... 12 15 16 Table S5-Reporting of Tau-squared from main tables 2 and 3, by method and antibody test ...... 13 17 Table S6: Sensitivity and specificity either IgM or IgG results by test method (sensitivity analyses) *...... 13 18 19 Table S7- Sensitivity and specificity results from studies that used other methods (EIA, FIA and LPI)...... 14 20 Figure S2- Forest plots according with sensitivity, specificity, method and immunoglobulin evaluated...... 15 21 22 References...... 26 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 S1 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 BMJ Page 28 of 51

1 Search strategy (Medline- Ovid) 2 1. ((exp coronavirus/ or exp coronavirus infections/ or (betacoronavirus* or beta coronavirus* or coronavirus* or corona virus*).mp.) and (exp china/ or (china 3 or chinese or or ).af.)) or (coronavirus* or corona virus* or betacoronavirus* or beta coronavirus*).ti,kf. 4 2. (severe acute respiratory syndrome coronavirus or "SARS CoV-2" or cov2 or "sars 2" or COVID or "coronavirus 2" or covid19 or nCov or ((new or Novel) adj3 5 6 coronavirus*) or ncp).ti,ab,kf. or ((exp pneumonia/ or pneumonia.ti,ab,kf.) and wuhan.af.) 7 3. 1 or 2 8 4. molecular diagnostic techniques/Confidential: For Review Only 9 5. (assay* or detect* or diagnos* or immunoassay* or kit? or screen* or technique* or test*).ti,ab,kf. 10 6. exp Nucleic Acid Amplification Techniques/ 11 12 7. (PCR or (Polymerase adj2 "Chain Reaction*") or nucleic acid*).ti,ab,kf. 13 8. Serologic Tests/ 14 9. exp Antigens/ or exp Antibodies/ 15 10. (antigen* or antibod*).ti,ab,kf. 16 11. exp Immunoglobulin Isotypes/ 17 18 12. (IGG or IGA or IGM or immunoglobulin*).ti,ab,kf. 19 13. ((virus* or viral) adj2 (culture* or rna)).ti,ab,kf. 20 14. (Virolog* adj2 assess*).ti,ab,kf. 21 15. (Specimen* or sample* or swab*).ti,ab,kf. 22 16. ((clinical or respir*) adj3 symptom*).ti,ab,kf. 23 24 17. radiography, thoracic/ or exp Tomography, X-Ray Computed/ or (radiograph* or tomograph* or x ray* or xray* or chest ct or ct imag* or ct scan*).ti,ab,kf. 25 18. (imaging adj (feature* or finding*)).ti,ab,kf. 26 19. (rapid array* or microarray*).ti,ab,kf. 27 20. di.fs. 28 21. exp diagnosis/ 29 30 22. or/4-21 31 23. 3 and 22 32 24. 2020*.dt,ez,da. 33 25. 23 and 24 34 35 -> No languages restrictions. 36 37 38 39 40 41 42 43 44 S2 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Page 29 of 51 BMJ

Table S1- Characteristics of included studies (N=39) 1 Study Peer City, Country Study Population Population to assess specificity* Clinical setting Study design 2 reviewed 3 Cassaniti, Novazzi 15 Yes Pavia, Italy COVID-19 patients and controls Triage population: Individuals suspected to have COVID-19 but RT-PCR Inpatient, emergency Cohort & 4 negative case control 5 Hospitalized: Samples collected during COVID-19 epidemic in individuals 6 not suspected to have COVID-19 7 Gao, Li 16 Yes Shijiazhuang, China COVID-19 patients No controls Inpatient, not specified Case-control 8 Guo, Ren 17 Yes Wuhan,Confidential: China COVID-19 patients and controls Stratified:For Review Only Inpatient, not specified Case-control 9 -Confirmed with other virus infections 10 - Samples collected prior to COVID-19 epidemic 11Liu, Liu 18 Yes Wuhan, China COVID-19 patients and controls Samples collected during COVID-19 epidemic in individuals not suspected Inpatient, not specified Case-control 12 to have COVID-19 13Zhang, Du 19 Yes Wuhan, China COVID-19 patients No controls Inpatient, not specified Case-control 20 14Zhao, Yuan Yes Shenzhen, China COVID-19 patients and controls Samples collected during COVID-19 epidemic in individuals not suspected Inpatient, not specified Case-control 15 to have COVID-19 21 16Xiao, Gao Yes Shanghai, China COVID-19 patients No controls Inpatient, not specified Case-control To, Tsang 22 Yes Hong Kong , China COVID-19 patients No controls Inpatient, not specified Case-control 17 Li, Yi 23 Yes Six provinces, China COVID-19 patients and controls Individuals suspected to have COVID-19 but RT-PCR negative N/R Cohort 18 Cai, Chen 24 No Chongqing, China COVID-19 patients and controls Confirmed with other virus infections Inpatient, not specified Case-control 19 Gao, Yuan 25 No Fuyang, China COVID-19 patients No controls Inpatient, not specified Case-control 20Jia, Zhang 26 No Shenzhen, China COVID-19 patients and controls Individuals suspected to have COVID-19 but RT-PCR negative N/R Cohort 21Lin, Liu 27 No Shenzhen, China COVID-19 patients and controls Mixed controls: Samples collected during COVID-19 epidemic in individuals Inpatient, not specified Case-control 22 not suspected to have COVID-19 and tuberculosis patients 23Liu, Liu 28 No Beijing, China COVID-19 patients and controls Samples collected during COVID-19 epidemic in individuals not suspected Inpatient, not specified Case-control 24 to have COVID-19 25Liu, Liu 29 No Wuhan, China COVID-19 patients and controls Individuals suspected to have COVID-19 but RT-PCR negative Inpatient, (ICU and non- Cohort 26 ICU), Outpatient 27Lou, Li 30 No , China COVID-19 patients and controls Samples collected during COVID-19 epidemic in individuals not suspected Inpatient, not specified Case-control 28 to have COVID-19 29Pan, Li 31 No Wuhan, China COVID-19 patients No controls Inpatient, not specified Cohort 30Zhang, Gao 32 No Shenyang, Shijiazhuang COVID-19 patients and controls Samples collected during COVID-19 epidemic in individuals not suspected N/R Cohort 31 China to have COVID-19 33 32Zhao, Li No Beijing and Wuhan, China Controls Samples collected prior to COVID-19 epidemic Only controls samples Case-control 34 33Long, Deng No Chongqing, China COVID-19 patients No controls Inpatient Case-control (ICU and non-ICU) 34 Chen, Zhang 35 Yes China COVID-19 patients and controls Individuals suspected to have COVID-19 but RT-PCR negative N/R Case-control 35 Infantino, Grossi 36 Yes Florence, Italy COVID-19 patients and controls Mixed of samples during and before epidemic. No stratified results Inpatient, not specified Case-control 36 Garcia, Perez Tanoira No Spain COVID-19 patients, suspected Samples collected during COVID-19 epidemic in individuals not suspected N/R Case-control 3737 COVID-19 patients and controls to have COVID-19 38Zhong, Chuan 38 Yes , China COVID-19 patients and controls Samples collected during COVID-19 epidemic in individuals not suspected N/R Case-control 39 to have COVID-19 40Jin, Wang 39 Yes Hangzhou, China COVID-19 patients and controls Individuals suspected to have COVID-19 but RT-PCR negative Inpatient, not specified Case-control 41Xie, Ding 40 Yes Wuhan, China COVID-19 patients and controls Individuals suspected to have COVID-19 but RT-PCR negative Inpatient, not specified Cohort 42Xiang, Wang 41 Yes Wuhan, China COVID-19 patients and controls Samples collected during COVID-19 epidemic in individuals not suspected Inpatient, not specified Case-control 43 to have COVID-19 44 S3 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 BMJ Page 30 of 51

Freeman, Lester 42 No United States COVID-19 patients and controls Mixed of samples before epidemic and confirmed another virus infection N/R Case-control 1 Paradiso, De Summa No Bari, Italy Suspected COVID-19 patients Individuals suspected to have COVID-19 but RT-PCR negative Inpatient, non-ICU only Cohort 2 43 (hospitalized or 3 emergency room) 4 Imai, Tabata 44 No Saitama, Japan COVID-19 patients and controls Samples collected prior to COVID-19 epidemic Inpatient, not specified Case-control 5 Yangchun 45 No Wuhan, China COVID-19 patients and controls Individuals suspected to have COVID-19 but RT-PCR negative N/R Not clear 6 Qian, Zhou 46 No Hubei and six other COVID-19 patients and controls Donors during the epidemic period Inpatient, not specified Case-control 7 provinces from China 8 Ma, Zeng 47 No Hefei,Confidential: China COVID-19 patients and controls Stratified:For Review Only Inpatient (ICU and non- Case-control 9 -Individuals suspected to have COVID-19 but RT-PCR negative ICU) 10 -Samples collected prior to COVID-19 epidemic Perera, Mok 48 Yes Hong Kong, China COVID-19 patients and controls Samples collected prior to COVID-19 epidemic Inpatient (ICU and non- Case-control 11 ICU) 12 Adams, An 49 No United Kingdom COVID-19 patients and controls Samples collected prior to COVID-19 epidemic Inpatient, not specified, Case-control 13 outpatient 14Burbelo, Riedo 50 No United States COVID-19 patients and controls Samples collected prior to COVID-19 epidemic N/R Case-control 15Lassauniere, Frische 51 No Hillerod, Denmark COVID-19 patients and controls Stratified: Inpatient, ICU Case-control 16 - Samples collected during COVID-19 epidemic in individuals not suspected 17 to have COVID-19 18 -Confirmed other virus infection 19Whitman, Hiatt 52 No San Francisco, USA COVID-19 patients and controls Stratified: Inpatient (ICU and non- Case-control 20 -Samples collected prior to COVID-19 epidemic ICU), outpatient 21 -Confirmed other virus infection 53 22Hoffman, Nissen Yes Uppsala, Sweden COVID-19 patients and controls Samples collected prior to COVID-19 epidemic N/R Case-control 23Abbreviations: N/R: Not reported ICU: Intensive care unit 24Notes: *Samples before the epidemic: samples from patients with other condition diseases than not virus infection, or health donors. Sample during the epidemic: samples from patient not suspect of covid-19 or 25 health donors. Includes studies where timing was unclear. Confirmed with other virus infection: samples from patients before or during epidemic period confirmed with virus infection,, as further stratification 26not possible. Suspect covid-19 patients, patients with clinical features of COVID-19 but RT-PCR negative 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 S4 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Page 31 of 51 BMJ

1 2 3 Table S2- Patients’ characteristics in included studies 4 Study N patients included Age years Male: Female Severity Sample collection period 5 6 (mean/median) 15 * 7 Cassaniti, Novazzi 50 62 (range 33-97) 34:16 N/R N/R 8 Confidential:60† Cases: 74 (range 38-86) For 36:24 ReviewMild, Severe or critical Only7 days (IQR 4-11) - after diagnosis 9 Controls: 39 (range 25-69) 10 Gao, Li 16 22 40 (range 4-72) 8:14 Mild, Severe or critical 1-24 days- after onset of symptoms 11 Guo, Ren 17 317‡ N/R N/R Mild, Severe or critical 1-39 days- after onset of symptoms 12 Liu, Liu 18 314 N/R N/R N/R 15 days (range 0-55)- after onset of symptoms 13 Zhang, Du 19 16 N/R N/R N/R N/R 14 Zhao, Yuan 20 173 48 (IQR 35,61) 84:89 Mild, Severe or critical 7 days (IQR 5-10 days) - after onset of symptoms 15 Xiao, Gao 21 34 N/R 16:18 N/R 0-14 days- after onset of symptoms 16 To, Tsang 22 16 N/R N/R Mild/severe 0-14 days- after onset of symptoms 17 Li, Yi 23 525 N/R N/R N/R N/R 18 Cai, Chen 24 443‡ 48 (IQR 37,56) 151:127 N/R 2-27 days- after onset of symptoms 19 Gao, Yuan 25 38 40.5 (IQR 31,50) 21:17 N/R N/R 20 Jia, Zhang 26 57 N/R N/R N/R 1-34 days-after onset of symptoms 21 Lin, Liu 27 159 N/R N/R N/R 0-14 days- after onset of symptoms 22 Liu, Liu 28 358 55 (IQR 38,65) 138:200 N/R 0-20 days-after onset of symptoms 23 Liu, Liu 29 179 Cases: 76 (SD 21) 98:81 Mild, Severe or critical 0-16 days -after onset of symptoms 24 Controls: 56 (SD 21) 25 30 26 Lou, Li 80 55 (IQR 45,64) 49:31 Mild, Severe or critical 8 (IQR 6,10)- days after onset of symptoms 31 27 Pan, Li 67 N/R N/R N/R 0-34 days-after onset of symptoms 32 28 Zhang, Gao 163 samples not clear N of N/R N/R N/R N/R 29 patients 30 Zhao, Li 33 257 samples not clear N of N/R N/R N/R N/R 31 patients 32 Long, Deng 34 63 N/R N/R N/R 2-23- days after onset of symptoms 33 Chen, Zhang 35 19 samples not clear N of N/R N/R N/R N/R 34 patients 35 Infantino, Grossi 36 125 Cases: 59 (SD 23) 47:78 Mild, Severe or critical 12 days (range 8-17) days after onset of symptoms 36 Controls: 49 (SD 17) 37 Garcia, Perez Tanoira 37 100 Cases: 62 (IQR 52,74) 55:45 N/R 11 days (range 7-23) days after onset of symptoms 38 Controls: 55 (IQR 50,79) 39 Zhong, Chuan 38 347 48 (range 18-82) 118:229 Mild, Severe or critical N/R 40 Jin, Wang 39 76 Cases: 47 (IQR 34,59) 39:37 N/R 16 days (IQR 9,20) days after onset of symptoms 41 Controls: 31 (IQR 26,38) 42 Xie, Ding 40 56 57 (IQR 49.3, 64.8) 24:32 Mild, Severe or critical 7-41 days after onset of symptoms 43 44 S5 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 BMJ Page 32 of 51

Xiang, Wang 41 145 Cases: 51 (range 32, 65) 56:89 Mild, Severe 13-29 days after onset of symptoms 1 Controls: 34 (range 29,51) 2 Freeman, Lester 42 694 N/R N/R N/R N/R 3 Paradiso, De Summa 43 191 59 115:75 N/R Asymptomatic and symptomatic from 0-15 days 4 onset of symptoms 5 Imai, Tabata 44 112 67 (IQR 45,74) 64:48 N/R 5 (IQR 2,7) days after onset of symptomatic 6 Yangchun 45 185 N/R N/R N/R N/R 7 46 8 Qian, Zhou Confidential:2061 N/R ForN/R ReviewN/R OnlyN/R 47 9 Ma, Zeng 699 samples not clear N of N/R N/R Mild, Severe or critical 4-41 days after onset of symptoms 10 patients 48 11 Perera, Mok 263 samples not clear N of N/R N/R Mild, Severe or critical 0-30 days after onset of symptoms 12 patients 13 Adams, An 49 90 N/R N/R Mild, Severe or critical 13 (range 8-19) after onset of symptoms. 14 Convalescent plasma 48 (31-62) after onset of 15 symptoms 16 Burbelo, Riedo 50 67§ 61 (range 19-95) 20:15 Mild, Severe or critical 2-50 days after onset of symptoms 17 Lassauniere, Frische 51 112 N/R N/R Severe 0-21 days after onset of symptoms 18 Whitman, Hiatt 52 80 53 (SD 15) 55:25 Mild, Severe or critical 1-20 days after onset of symptoms 19 Hoffman, Nissen 53 153 N/R N//R N/R N/R 20 Notes: *Patients from emergency department, †Hospitalized patients, ‡Not clear how many controls, authors described number of samples for controls, and number 21 of patients for cases.§Only information of 35 cases 22 Abbreviations: IQR: Interquartile range, N/R: Not reported, N: Number 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 S6 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Page 33 of 51 BMJ

1 2 3 4 Table S3 -Information on laboratory methods among studies that evaluated serologic tests 5 6 Index tests Reference test 7 Study Method Immunoglobuli Antigens Type Lab based Commerci Method Sample type PCR gene target Commercial kit 8 n Confidential:Sample or POCFor test al kitReview Only 9 Cassaniti, LFIA IgG, IgM S protein Serum/whole Unclear Yes RT-PCR NPS, BAL RdRp, E Yes 10Novazzi 15 blood 16 11Gao, Li ELISA IgG, IgM S, N protein Serum Lab based Yes RT-PCR Nasal and ORFlab. N Yes 12 CLIA IgG, IgM S, N protein Serum Lab based Yes pharyngeal swab 13 LFIA IgG, IgM S, N protein Serum Lab based Yes 17 14Guo, Ren ELISA IgG, IgM, IGA N protein Blood Lab based No RT-PCR/ Deep Throat swabs Not Not sequence reported reported 15 Liu, Liu 18 ELISA IgG, IgM N, S protein Serum Lab based Yes RT-PCR OPS, NPS Not Not 16 Reported reported 17Zhang, Du 19 ELISA IgG, IgM SARSr-CoV Serum Lab based No RT-PCR OPS S Yes 18 Rp3-N protein 19Zhao, Yuan 20 ELISA IgG, IgM, Ab S protein (IgM Blood Lab based Yes RT-PCR Respiratory samples Not Not 20 & Ab) reported reported 21 22 N protein 23 (IgG) 24 Xiao, Gao 21 CLIA IgG, IgM Not reported Blood Lab based Yes RT-PCR N/R Not Not 25 reported reported 26 To, Tsang 22 EIA IgG, IgM S, N protein Serum Lab based No RT-PCR Saliva RdRp No 27Li, Yi 23 LFIA IgG, IgM S protein Serum/ plasma/ Unclear for No RT-PCR Pharyngeal, sputum Not Not 28 whole blood/ most of the reported reported 29 fingerpick population 30Cai, Chen 24 CLIA IgG, IgM S Serum Lab based No RT-PCR Not Not Not 31 reported reported reported 32Gao, Yuan 25 LFIA IgG, IgM Not reported Serum Lab based Yes RT-PCR Throat swabs ORF1ab, N Yes 33 34Jia, Zhang 26 FIA IgG, IgM Not reported N/R Lab based Yes RT-PCR OPS, NPS Not Not 35 reported reported 36Lin, Liu 27 CLIA IgG, IgM N protein Serum Lab based No RT-PCR Respiratory samples ORF1ab, N Yes 28 37Liu, Liu ELISA IgG, IgM N protein Serum Lab based Yes RT-PCR OPS, NPS ORF1ab, N Yes 29 38Liu, Liu LFIA IgG, IgM Not reported Serum Lab based Yes RT-PCR Nasal and Not Not 39 pharyngeal swab reported reported Lou, Li 30 ELISA IgG, IgM, Ab S, N Serum Lab based Yes RT-PCR Not Not Not 40 reported Reported reported 41 LFIA IgG, IgM, Ab S, N Plasma Unclear 42 CLIA Ab, IGM S, N Serum Lab based 31 43Pan, Li LFIA IgG, IgM Not reported Serum/ Unclear Yes RT-PCR Throat swab Not Not 44 S7 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 BMJ Page 34 of 51

plasma reported reported 1 Zhang, Gao 32 LFIA IgG, IgM S protein Serum Lab based No RT-PCR Not Not Not 2 reported reported reported 3 Zhao, Li 33 ELISA Not specified S protein Serum Lab based No Frozen Serum N/A N/A 4 samples 5 before 6 epidemic4 7 Long, Deng 34 CLIA IgG, IgM S, N protein Serum Lab based Yes RT-PCR Nasal, pharyngeal Not Not 8 Confidential: For Reviewswab Onlyreported reported 9 Chen, Zhang LFIA IgG S, N protein Serum Lab based No RT-PCR Not Not Not 35 reported reported reported 10 Infantino, CLIA IgG, IgM N, S protein Serum Lab based Yes RT-PCR OPS, NPS Not Not 11 Grossi 36 reported reported 12Garcia, Perez LFIA IgG, IgM N Serum Lab based Yes RT-PCR NPS Not Not 13Tanoira 37 reported reported 14Zhong, Chuan ELISA IgG, IgM N, S protein Serum Lab based No RT-PCR Not Not Not 1538 reported reported reported 16 CLIA IgG, IgM N, S protein Lab based No 17Jin, Wang 39 CLIA IgG, Liu, Liu [1] N, S protein Serum Lab based Yes RT-PCR Oral swabs Not Not 18 Liu, Liu [1] reported reported 19Xie, Ding 40 CLIA IgG, IgM E, N protein Serum Lab based Yes RT-PCR Nasopharyngeal ORF1ab, N Yes 20 swab, Throat 21Xiang, Wang ELISA IgG, IgM N protein Serum Lab based Yes RT-PCR NPS, OPS ORF1ab, N Not 2241 reported 23Freeman, ELISA IgG, IgM, pan-Ig S protein Serum Lab based No RT-PCR Not Not Not 24Lester 42 reported reported reported 25Paradiso, De LFIA IgG, IgM S protein Venous blood Lab based Yes RT-PCR NPS, OPS E, RdRP, N Yes 26Summa 43 27Imai, Tabata LFIA IgG, IgM Not Serum Lab based Yes RT-PCR NPS, OPS Not Not 44 28 reported reported reported 45 29Yangchun CLIA IgG, IgM Not Not Lab based N/R RT-PCR Not Not Not reported reported reported reported reported 30 Qian, Zhou 46 CLIA IgG, IgM N, S protein Serum Lab based No RT-PCR Not Not Not 31 (fusion) reported reported reported 32Ma, Zeng 47 CLIA IgA, IgG, IgM N, RbD S Serum Lab based No RT-PCR Not Not Not 33 protein reported reported reported 34Perera, Mok ELISA IgG, IgM RbD S protein Serum Lab based No RT-PCR Not Not Not 3548 reported reported reported 36Adams, An 49 ELISA IgG, IgM S protein Plasma Lab based No RT-PCR NPS Not Not 37 LIFA 1 IgG, IgM Lab based Yes reported reported 38 LIFA 2 IgG, IgM Lab based Yes 39 LIFA 3 IgG, IgM Lab based Yes 40 LIFA 4 IgG, IgM Lab based Yes 41 LIFA 5 IgG, IgM Lab based Yes 42 LIFA 6 IgG, IgM Lab based Yes 43 LIFA 7 IgG, IgM Lab based Yes 44 S8 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Page 35 of 51 BMJ

LIFA 8 IgG, IgM Lab based Yes 1 LIFA 9 IgG, IgM Lab based Yes 2 Burbelo, LPI Not reported N, S protein Plasma/ Lab based No RT-PCR Nasal, throat swabs Not Not 3 Riedo 50 serum reported reported 4 Lassauniere, ELISA Ab S protein Serum Lab based Yes RT-PCR Respiratory samples Not Not 5 Frische 51 ELISA IgG, IgA S protein Lab based Yes reported reported 6 LFIA IgG, IgM Not reported Lab based Yes 7 LFIA IgG, IgM Not reported Lab based Yes 8 LFIA IgG, IgMConfidential:Not reported Lab basedForYes Review Only 9 LFIA IgG, IgM Not reported Lab based Yes 10 LFIA IgG, IgM Not reported Lab based Yes 11 LFIA IgG, IgM Not reported Lab based Yes 12 LFIA IgG, IgM Not reported Lab based Yes 13 LFIA IgG, IgM Not reported Lab based Yes 14 LFIA IgG, IgM Not reported Lab based Yes Whitman, ELISA IgG, IgM S protein Plasma/ Lab based No RT-PCR NPS, OPS Not Not 15 Hiatt 52 ELISA IgG, IgM N protein serum Lab based Yes reported reported 16 LFIA IgG, IgM Rbd, S Lab based Yes 17 LFIA IgG, IgM N, S Lab based Yes 18 LFIA IgG, IgM Not reported Lab based Yes 19 LFIA IgG, IgM Not reported Lab based Yes 20 LFIA IgG, IgM Unclear Lab based Yes 21 LFIA IgG, IgM N, S Lab based Yes 22 LFIA IgG, IgM Not reported Lab based Yes 23 LFIA IgG, IgM Not reported Lab based Yes 24 LFIA IgG, IgM Not reported Lab based Yes 25 LFIA IgG, IgM Not reported Lab based Yes 26Hoffman, LFIA IgG, IgM N, S protein Blood/ Lab based Yes RT-PCR Not Not Not 27Nissen 53 serum reported reported reported 28Abbreviations: LFIA: lateral flow immunoassay. ELISA: enzyme-linked immunosorbent assay, CLIA: Chemiluminescent immunoassay, EIA: Enzyme Immunoassay, FIA: Fluorescent Immunoassay, RT-PCR: 29Reverse transcription polymerase chain reaction. BAL: Bronchoalveolar lavage, OPS: oropharyngeal swab, NPS: nasopharyngeal swab S protein: Surface protein, N protein: Nucleocapsid protein,POC: Point 30of care. LPI: liquid phase immunoassay, Ab: total antibodies. Ig: immunoglobulin 31 32 33 34 35 36 37 38 39 40 41 42 43 44 S9 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 BMJ Page 36 of 51

1 2 3 4 5 6 Table S4- Studies that reported the commercial kits used in index and/or reference test (RT-PCR) 7 Study ID Index KIT Reference kit 8 Confidential: For Review Only 15 9 Cassaniti, Novazzi VivaDiagTM COVID-19 IgM/IgG Rapid Test (Viva check) QIAsymphony (QIAGEN, Qiagen, Hilden) 10 Gao, Li 16 Beier Bioengineering 2019-nCoV RNA Test Kit, Daan Gene 11 Liu, Liu 18 Lizhu Company & Hotgen N/R 12 Zhao, Yuan 20 Beijing Wantai Biological Pharmacy Enterprise N/R 13 Xiao, Gao 21 Shenzhen YHLO Biotech N/R 14 Gao, Yuan 25 Innovita Biological Technology Shanghai BioGerm Medical Biotechnology Company 15 26 Beijing Diagreat Biotechnologies Daan, Sansure, BGI, Shanghai ZJ Biotech, GeneoDx, 16 Jia, Zhang 17 Biogerm 27 18 Lin, Liu N/A- in house kit GeneoDX Company 19 Liu, Liu 28 Lizhu Daan Company 20 Lou, Li 30 Beijing Wantai Biological Pharmacy Enterprise Company N/R 21 Pan, Li 31 Zhuhai Livzon Diagnositic Biogerm 22 Long, Deng 34 Bioscience Diagnostic N/R 23 36 24 Infantino, Grossi Shenzhen YHLO Biotech N/R 37 25 Garcia, Perez Tanoira AllTest COV-19 IgG / IgM kit (AllTest Biotech Company) VIASURE SARS-CoV-2 Real Time PCR Detection Kit 26 (Certest Biotech) & Allplex 2019-nCoV assay 27 (Seegene) 28 Jin, Wang 39 Shenzhen YHLO Biotech N/R 29 Xie, Ding 40 YHLO Biological Technology N/R 30 Xiang, Wang 41 ELISA kits, Livzon N/R 31 43 32 Paradiso, De Summa VivaDiagTM COVID-19 IgM/IgG Rapid Test (Viva check Company Allplex 2019-nCoV assay (Seegene) 33 Imai, Tabata 44 One Step Novel Coronavirus (COVID-19) IgM/IgG Antibody Test (Artron) N/R 34 Lassauniere, Frische 51 ELISA: Wantai SARS-CoV-2 Ab ELISA (Beijing Wantai Biological Pharmacy N/R 35 Enterprise) & Anti-SARS-CoV-2 IgG and IgA ELISAs (Euroimmun Medizinische 36 Labordiagnostika, Lübeck 37 LFIA: 38 -2019-nCOV IgG/IgM Rapid Test (Dynamiker Biotechnology) 39 40 -OnSiteTM COVID-19 IgG/IgM Rapid Test (CTK Biotech); 41 -Anti-SARS-CoV-2 Rapid Test (AutoBio Diagnostics) 42 -One Step Novel Coronavirus (COVID-19) IgM/IgG Antibody Test (Artron) 43 -2019-nCoV IgG/IgM Rapid Test Cassette (Acro Biotech) 44 S10 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Page 37 of 51 BMJ

- AllTest COV-19 IgG / IgM kit (AllTest Biotech) 1 Whitman, Hiatt 52 N/R 2 ELISA: Epitope Diagnostics, San Diego, CA, USA 3 LFIA 4 -BioMedomics Inc 5 -Bioperfectus Technologies 6 -Decombio Biotechnology 7 -DeepBlue Medical Technology 8 Confidential:-Innovita Biological Technology For Review Only 9 10 -Premier Biotech 11 -Sure Biotech 12 -UCP Biosciences 13 -VivaChek Biotech Co 14 -Wondfo Biotech Co Ltd, 15 Hoffman, Nissen 53 Orient Gene Biotech Company N/R 16 Abbreviations: NR: Not reported N/A: Not applicable 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 S11 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 BMJ Page 38 of 51

1 Figure S1- QUADAS assessment per individual study 2 3 4 5 6 7 8 9 10 11 Confidential: For Review Only 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 S12 60 https://mc.manuscriptcentral.com/bmj Page 39 of 51 BMJ

1 Table S5-Reporting of Tau-squared from main tables 2 and 3, by method and antibody test 2 Sensitivity Specificity 3 IgM IgG IgM/IgG IgM IgG IgM/IgG 4 ELISA 5 Study level Tau-squared -* -* -* -* -* -* 6 Arm level Tau-squared 0.4357 0.3644 0.2846 1.9188 1.7221 1.2394 7 8 LFIA Confidential: For Review Only 9 Study level Tau-squared -* -* 0.94245 -* -* 1.09951 10 Arm level Tau-squared 0.6896 0.7920 0.11277 1.2289 0.6869 0.04748 11 CLIA 12 Study level Tau-squared -* -* -* -* -* -* 13 Arm level Tau-squared 1.290 1.582 1.246 5.592 16.658 3.549 14 15 Notes: *Study level did not converge, or not multiple arms reported. Only arm level used 16 Abbreviations: LFIA: lateral flow immunoassay. ELISA: enzyme-linked immunosorbent assay, CLIA: Chemiluminescent immunoassay 17 18 19 20 21 Table S6: Sensitivity and specificity either IgM or IgG results by test method (sensitivity analyses) * 22 Sensitivity Specificity 23 Arms TP FN Pooled sensitivity Tau-squared TN FP Pooled Specificity Tau-squared 24 ELISA 18 1385 319 84% (77%, 89%) Study level: 0.45391 3368 52 99% (97%,100%) Study level: 1.36747 25 Arm level: 0.01045 Arm level: 0.35393 26 LFIA 38 2004 859 73% (60%, 84%) Study level: 0.96014 3335 175 97% (95%, 98%) Study level: 0.06301 27 28 Arm level: 0.11379 Arm level: 1.20992 29 CLIA 17 2219 426 89% (79%, 96%) Study level: 0.6825 5154 191 97% (84%, 99%) Study level: 6.9635 30 Arm level: 0.8316 Arm level: 1.0133 31 *Pooled using bivariate random effect model. Therefore, only includes studies that reported sensitivity and specificity. Abbreviations: CLIA: Chemiluminescent 32 immunoassay ELISA: enzyme-linked immunosorbent assay LFIA: Lateral flow immunoassay, TP: True positive, FN: false negative, TN true negative, FP: False 33 positive 34 35 36 37 38 39 40 41 42 43 44 S13 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 BMJ Page 40 of 51

Table S7- Sensitivity and specificity results from studies that used other methods (EIA, FIA and LPI) 1 IgM IgG Total antibody 2 Method Sensitivity Specificity Sensitivity Specificity Sensitivity Specificity 3 Study TP FN Estimate TN FP Estimate TP FN Estimate TN FP Estimate TP FN Estimate TN FP Estimate 4 Jia, Zhang 26 FIA 19 5 72% (60%, 91%) 13 20 39% (25%, 56%) 17 7 71% (51%, 85%) 17 16 52% (35%,67%) ------5 To, Tsang 22 EIA anti N 14 2 88% (60%, 98%) - - - 15 1 ------6 EIA anti Rbd 15 1 94% (67%; 100%( - - - 16 0 ------7 Burbelo, LPI anti N ------76 32 70% (60%, 79%) 32 0 100% (87%, 100%) 8 Riedo 50 LPI anti S - Confidential:------For- Review- - - - Only60 40 60% (50%, 70%) 32 0 100% (87%, 100%) 9 Abbreviations: EIA: Enzyme Immunoassay, FIA: Fluorescent Immunoassay, S protein: Surface protein, N protein: Nucleocapsid protein. LPI: liquid phase immunoassay, Ab: total antibodies 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 S14 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Page 41 of 51 BMJ

1 2 3 Figure S2- Forest plots according with sensitivity, specificity, method and immunoglobulin evaluated. 4 S2a- ELISA 5 6 IgM 7 8 Confidential: For Review Only 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 S15 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 BMJ Page 42 of 51

IgG 1 2 3 4 5 6 7 8 Confidential: For Review Only 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 S16 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Page 43 of 51 BMJ

IgM and/or IgG 1 2 3 4 5 6 7 8 Confidential: For Review Only 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 S17 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 BMJ Page 44 of 51

1 2 S2b- LFIA 3 4 IgM 5 6 7 8 Confidential: For Review Only 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 S18 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Page 45 of 51 BMJ

1 2 3 4 IgG 5 6 7 8 Confidential: For Review Only 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 S19 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 BMJ Page 46 of 51

1 2 3 4 IgM and/or IgG 5 6 7 8 Confidential: For Review Only 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 S20 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Page 47 of 51 BMJ

1 2 3 4 5 6 S2c- CLIA 7 8 Confidential: For Review Only 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 S21 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 BMJ Page 48 of 51

IgM 1 2 3 4 5 6 7 8 Confidential: For Review Only 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 S22 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Page 49 of 51 BMJ

IgG 1 2 3 4 5 6 7 8 Confidential: For Review Only 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 S23 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 BMJ Page 50 of 51

1 2 IgM/IgG 3 4 5 6 7 8 Confidential: For Review Only 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 S24 45 https://mc.manuscriptcentral.com/bmj 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 Page 51 of 51 BMJ

1 References 2 3 4 15. Cassaniti I, Novazzi F, Giardina F, et al. Performance of VivaDiagTM COVID-19 IgM/IgG Rapid Test is inadequate 5 for diagnosis of COVID-19 in acute patients referring to emergency room department. Journal of Medical Virology 2020; 6 30: 30. 7 16. Gao HX, Li YN, Xu ZG, et al. Detection of serum immunoglobulin M and immunoglobulin G antibodies in 2019- 8 novel coronavirus infected cases from different stages. Chinese Medical Journal 2020; 26: 26. 9 17. Guo L, Ren L, Yang S, et al. Profiling Early Humoral Response to Diagnose Novel Coronavirus Disease (COVID-19). 10 11 Clinical InfectiousConfidential: Diseases 2020; 21: 21. For Review Only 12 18. Liu W, Liu L, Kou G, et al. Evaluation of Nucleocapsid and Spike Protein-based ELISAs for detecting antibodies 13 against SARS-CoV-2. Journal of Clinical Microbiology 2020; 30: 30. 14 19. Zhang W, Du RH, Li B, et al. Molecular and serological investigation of 2019-nCoV infected patients: implication 15 of multiple shedding routes. Emerging Microbes & Infections 2020; 9(1): 386-9. 16 20. Zhao J, Yuan Q, Wang H, et al. Antibody responses to SARS-CoV-2 in patients of novel coronavirus disease 2019. 17 Clinical Infectious Diseases 2020; 28: 28. 18 21. Xiao DAT, Gao DC, Zhang DS. Profile of Specific Antibodies to SARS-CoV-2: The First Report. Journal of Infection 19 2020; 21: 21. 20 21 22. To KK, Tsang OT, Leung WS, et al. Temporal profiles of viral load in posterior oropharyngeal saliva samples and 22 serum antibody responses during infection by SARS-CoV-2: an observational cohort study. The Lancet Infectious Diseases 23 2020; 23: 23. 24 23. Li Z, Yi Y, Luo X, et al. Development and Clinical Application of A Rapid IgM-IgG Combined Antibody Test for 25 SARS-CoV-2 Infection Diagnosis. Journal of Medical Virology 2020; 27: 27. 26 24. Cai X, Chen J, Hu J, et al. A Peptide-based Magnetic Chemiluminescence Enzyme Immunoassay for Serological 27 Diagnosis of Corona Virus Disease 2019 (COVID-19). 2020. 28 25. Gao Y, Yuan Y, Li TT, et al. Evaluation the auxiliary diagnosis value of antibodies assays for detection of novel 29 30 coronavirus (SARS-Cov-2) causing an outbreak of pneumonia (COVID-19). 2020. 31 26. Jia X, Zhang P, Tian Y, et al. Clinical significance of IgM and IgG test for diagnosis of highly suspected COVID-19 32 infection. 2020. 33 27. Lin D, Liu L, Zhang M, et al. Evaluations of serological test in the diagnosis of 2019 novel coronavirus (SARS-CoV- 34 2) infections during the COVID-19 outbreak. 2020. 35 28. Liu L, Liu W, Wang S, Zheng S. A preliminary study on serological assay for severe acute respiratory syndrome 36 coronavirus 2 (SARS-CoV-2) in 238 admitted hospital patients. 2020. 37 29. Liu Y, Liu Y, Diao B, et al. Diagnostic Indexes of a Rapid IgG/IgM Combined Antibody Test for SARS-CoV-2. 2020. 38 30. Lou B, Li T, Zheng S, et al. Serology characteristics of SARS-CoV-2 infection since the exposure and post 39 40 symptoms onset. 2020. 41 31. Pan Y, Li X, Yang G, et al. Serological immunochromatographic approach in diagnosis with SARS-CoV-2 infected 42 COVID-19 patients. 2020. 43 32. Zhang P, Gao Q, Wang T, et al. Evaluation of recombinant nucleocapsid and spike proteins for serological 44 diagnosis of novel coronavirus disease 2019 (COVID-19). 2020. 45 33. Zhao R, Li M, Song H, et al. Serological diagnostic kit of SARS-CoV-2 antibodies using CHO-expressed full-length 46 SARS-CoV-2 S1 proteins. 2020. 47 34. Long Qx, Deng Hj, Chen J, et al. Antibody responses to SARS-CoV-2 in COVID-19 patients: the perspective 48 49 application of serological tests in clinical practice. 2020. 50 35. Chen Z, Zhang Z, Zhai X, et al. Rapid and sensitive detection of anti-SARS-CoV-2 IgG using lanthanide-doped 51 nanoparticles-based lateral flow immunoassay. Analytical Chemistry 2020; 23: 23. 52 36. Infantino M, Grossi V, Lari B, et al. Diagnostic accuracy of an automated chemiluminescent immunoassay for 53 anti-SARS-CoV-2 IgM and IgG antibodies: an Italian experience. Journal of Medical Virology 2020; 24: 24. 54 37. Garcia FP, Perez Tanoira R, Romanyk Cabrera JP, Arroyo Serrano T, Gomez Herruz P, Cuadros Gonzalez J. Rapid 55 diagnosis of SARS-CoV-2 infection by detecting IgG and IgM antibodies with an immunochromatographic device: a 56 prospective single-center study. 2020. 57 58 59 S25 60 https://mc.manuscriptcentral.com/bmj BMJ Page 52 of 51

38. Zhong L, Chuan J, Gong B, et al. Detection of serum IgM and IgG for COVID-19 diagnosis. Science China Life 1 sciences 2020; 63(5): 777-80. 2 3 39. Jin Y, Wang M, Zuo Z, et al. Diagnostic value and dynamic variance of serum antibody in coronavirus disease 4 2019. International Journal of Infectious Diseases 2020; 94: 49-52. 5 40. Xie J, Ding C, Li J, et al. Characteristics of Patients with Coronavirus Disease (COVID-19) Confirmed using an IgM- 6 IgG Antibody Test. Journal of Medical Virology 2020; 24: 24. 7 41. Xiang F, Wang X, He X, et al. Antibody Detection and Dynamic Characteristics in Patients with COVID-19. Clinical 8 Infectious Diseases 2020; 19: 19. 9 42. Freeman B, Lester S, Mills L, et al. Validation of a SARS-CoV-2 spike ELISA for use in contact investigations and 10 serosurveillance. 2020. 11 Confidential: For Review Only 43. Paradiso AV, De Summa S, Loconsole D, et al. Clinical meanings of rapid serological assay in patients tested for 12 13 SARS-Co2 RT-PCR. 2020. 14 44. Imai K, Tabata S, Ikeda M, et al. Clinical evaluation of an immunochromatographic IgM/IgG antibody assay and 15 chest computed tomography for the diagnosis of COVID-19. 2020. 16 45. Yangchun F. Optimize Clinical Laboratory Diagnosis of COVID-19 from Suspect Cases by Likelihood Ratio of SARS- 17 CoV-2 IgM and IgG antibody. 2020. 18 46. Qian C, Zhou M, Cheng F, et al. Development and Multicenter Performance Evaluation of The First Fully 19 Automated SARS-CoV-2 IgM and IgG Immunoassays. 2020. 20 47. Ma H, Zeng W, He H, et al. COVID-19 diagnosis and study of serum SARS-CoV-2 specific IgA, IgM and IgG by a 21 22 quantitative and sensitive immunoassay. 2020. 23 48. Perera RA, Mok CK, Tsang OT, et al. Serological assays for severe acute respiratory syndrome coronavirus 2 24 (SARS-CoV-2), March 2020. Euro Surveillance: Bulletin Europeen sur les Maladies Transmissibles = European 25 Communicable Disease Bulletin 2020; 25(16). 26 49. Adams ER, An, R, et al. Evaluation of antibody testing for SARS-Cov-2 using ELISA and lateral flow 27 immunoassays. 2020. 28 50. Burbelo PD, Riedo FX, Morishima C, et al. Detection of Nucleocapsid Antibody to SARS-CoV-2 is More Sensitive 29 than Antibody to Spike Protein in COVID-19 Patients. 2020. 30 51. Lassauniere R, Frische A, Harboe ZB, et al. Evaluation of nine commercial SARS-CoV-2 immunoassays. 2020. 31 32 52. Whitman JD, Hiatt J, Mowery CT, et al. Test performance evaluation of SARS-CoV-2 serological assays. medRxiv 33 2020: 2020.04.25.20074856. 34 53. Hoffman T, Nissen K, Krambrich J, et al. Evaluation of a COVID-19 IgM and IgG rapid test; an efficient tool for 35 assessment of past exposure to SARS-CoV-2. Infect Ecol Epidemiol 2020; 10(1): 1754538. 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 S26 60 https://mc.manuscriptcentral.com/bmj