ASPECTS, the Mismeasure of Stroke: a Metrological Investigation

Richard A. Suss, M.D. ([email protected])
Neuroradiology Section, Department of Radiology
The University of Texas Southwestern Medical Center
5323 Harry Hines Boulevard, Dallas, TX 75390-9178

OSF Preprints: https://doi.org/10.31219/osf.io/c4tkp or https://osf.io/c4tkp/
Versions: 2019: 1 (Dec. 1), 2 (Dec. 29), 3 (Dec. 31); 2020: 4 (May 15), 5 (Dec. 31).

Abstract: ASPECTS (Alberta Stroke Program Early CT Score) is an irregular scoring system that reads out as 10 to 0 points. It was originally conceived to reflect acute MCA (middle cerebral artery) territory infarct volume on (plain) CT, but it is equally applicable to and testable by MRI. Background material prepares the reader by explaining pertinent measurement principles and ASPECTS’s genesis, definitions, claims, and quirks. This report investigates ASPECTS as a volume surrogate without independently advocating for or against the therapies it might help plan.

Method: The original authors of four publications provided their unpublished primary numerical data for further analysis. A CT study expands on the ACCESS database, which compares ASPECTS with two other subjective infarct scales: the legacy 1/3 MCA estimate and the multivariate IST-3 scale. An MRI study pools three diffusion-weighted (DWI) series as a patient-level meta-analysis unifying their comparisons of ASPECTS to the volumes that were found by semiautomated measurement.

CT results: ASPECTS is unreliable, showing wide interrater variation with three quantifiable effects. It delivers little more than half as much entropy reduction as the IST-3 scale shows CT can support. Its large standard deviation (SD) takes up much of the scale width. Converting SD to a loss function with respect to a reference standard neuroradiologist gives an alarming weighted measure of error in ratings.
MRI results: There are many-fold ranges of volume per ASPECTS and of ASPECTS per volume, causing sometimes large misclassifications one or both ways by any ASPECTS dichotomization. Looking past the ranges to averages, an estimate of 1/3 of the MCA territory corresponds best to ASPECTS cutting between 6 and 5 (6//5) and a volume of 65 cm³ (<1/4 of the average MCA territory).

Discussion: There is already a simple, quick, and reliable manual volume measure (ABC/2, 2Sh/3). An attempt to salvage ASPECTS by reinterpreting its purpose does not hold up under scrutiny. ASPECTS can be replaced in stroke guidelines: the guideline cutting at ASPECTS 6//5 is consistent on average with the increasingly commonly stated, and better defined, 70 cm³ threshold. Arguments defending ASPECTS are rebutted by the evidence herein and by literature citations.

Conclusion: ASPECTS subtracts value from the more objective direct volume measurements that are universally available by manual calculation and are becoming available by automatic software. ASPECTS inherently risks clinically significant misclassification (harm) for many patients.

Acknowledgments

My local colleague and friend, Marco Pinho, recruited me for this project. Thank you, Marco! Ke Lin, Constance de Margerie-Mellon, Julian Schröder, J. M. Wardlaw and Francesca Chappell shared data from their studies. J. M. Wardlaw sent me an otherwise unavailable article. Ke Lin, J. M. Wardlaw, Tudor Jovin, Keith Muir, Greg Albers, and Francesca Chappell corresponded with me on various topics. Marco Pinho, Ke Lin, and J. M. Wardlaw commented on full drafts, each sharing insights that I incorporated. Several members of my Neuroradiology Section and Radiology Department, whose residents’ misgivings about ASPECTS amplified my original interest, expressed support for this investigation.
In particular, Merissa Harris, Will Moore, Michael Achilleos, John Barr, Amit Agarwal, Joseph Maldjian, and Devan Moody made helpful remarks or offered valuable suggestions; Kelly Tornow based a neuroradiology journal club presentation on two of the references; Lee Pride provided §7’s critical case in point and contributed to its description; and Lauren Kolski and Jacob Fleming assisted in its preparation. My wife, Shelley, graciously shared me with this project and critiqued parts of several drafts. I am indebted to all of these individuals for their generosity. Through it all, I retain sole responsibility for my errors of commission and omission and would appreciate feedback for possible future versions.

Prospective readers (from students, residents and fellows, to radiologists, neurologists and other clinicians, and to research scientists and administrators) are also acknowledged, for their time, interest, and labor, in the arrangement of the material. Depending on the desired topics and depth, each might read just the Synopsis, or selected sections such as the central data section (§6), or the full main text alone, or perhaps also explore the footnotes, which supplement the arguments and list page numbers for the convenience of verifying or exploring the references.

Version Changes

Version 2 makes clarifications and additions on pages 2, 19-21, 24, 27-28, 45, 48-49, 60-63, 70, 74, and 76-78, and adds 20 new references (now 537 and with digital object identifiers if available), to reinforce the point that the use of ASPECTS is a suboptimal and superfluous practice.

Version 3 adds to Page 1 this article’s OSF (Open Science Framework) Preprints digital object identifier (doi).
Version 4 corrects minor technical errors; tweaks wording, typography, and graphics; credits Cavalieri’s principle; augments page 67’s last paragraph; discriminates the suitabilities of “satisficing”; notes a 207-word summary at https://doi.org/10.3174/ajnr.A6485; and adds references: one for “data theory” on page 7, three for “QIB” on page 49, two more for “ABC/2” on page 52, as well as Yoo et al. 2012, Goyal et al. 2016b, Sheth et al. 2019, and Qiu et al. 2020. Text pagination and most footnote numbers are unchanged.

Version 5 cites, on page 49, a head-to-head comparison of auto-ASPECTS and auto-volume applying the same software package to baseline CTs (Brugnara et al. 2020). Footnote numbers and pagination are otherwise unchanged.

Contents

Abstract .... 1
Acknowledgments .... 2
Version Changes .... 2

Introductory Material and Essential Background
§1. Preface .... 4
    Synopsis .... 5
§2. Ground Rules of Measurement .... 7
    §2.1 Scales .... 7
    §2.2 Purpose .... 8
    §2.3 Precision and Accuracy .... 9
    §2.4 Getting It Wrong: The Loss Function .... 10
    §2.5 Correlation and Surrogacy .... 12
§3. The Development of ASPECTS .... 13
    §3.1 ASPECTS Invented .... 13
    §3.2 ASPECTS Taught .... 16
§4. The MCA Territory and Its “1/3” Rule .... 22
§5. Kappa Agreement .... 29

Reanalysis of Published Data
§6. How Not to Measure Volume: ASPECTS .... 34
    §6.1 The Real Anatomy of ASPECTS .... 34
    §6.2 A Further Analysis of Some CT-ASPECTS Data .... 38
    §6.3 A Patient-Level Pooling of DWI-ASPECTS Data .... 45

Discussion Topics
§7. How To Measure Volume: ABC/2, 2Sh/3 .... 51
§8. ASPECTS Repurposed .... 60
§9. Whither ASPECTS? .... 64

Summation and Conclusion
§10. Looking Back At and Forward From ASPECTS .... 69

References .... 79
Special Symbols and Abbreviations .... 99

§1. Preface

ASPECTS is the Alberta Stroke Program Early CT Score, a surrogate for infarct volume (§3, §6).
CT is x-ray computed tomography, but the Score applies also to magnetic resonance imaging (MRI). In this report, §6.2 reanalyzes data from a CT series comparing 3 scales using 32 scans and 259 raters, and §6.3 assembles a patient-level meta-analysis of 3 MRI series (949 patients: ASPECTS vs. volume). They both show ASPECTS to be an erratic measure.

Background sections address foundational issues in measurement (§2) and ASPECTS’s motive and method (§3), and review its primary claims (§4, §5). The innovative §6 assesses ASPECTS as a measurement for its target, infarct volume. §7 describes an efficient alternative. §8 rebuts a revisionist rebuttal. §9 looks at guidelines using ASPECTS. §10 sums up pros and cons and looks ahead. Flanking §6, §§1-5 collect essential introductory context, and §§7-10 elaborate on §6’s methodological implications and potential clinical significance.

Accordingly, while the broad setting is early acute brain infarct therapy by human recombinant tissue plasminogen activator (tPA,¹ alteplase) or embolectomy, this report is a logic- and data-driven examination of measurement technique taking no position on the therapeutic procedures themselves. While not the sole determinant of therapy, volume or whatever ASPECTS represents must be taken seriously as a quantity, subject to measurement skills, common sense, testing, and criticism.

“Metrology is the science of measurement.”² Without becoming professional metrologists (minders of the measures), we still depend on concepts and tools that refer back to their discipline.
    I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your thoughts, advanced to the stage of science....³

This opinion by William Thomson (since 1892, Lord Kelvin), that expression-by-numbers is necessary for science, might carelessly be converted to a feeling that it suffices. The feeling might be reinforced by statistical correlation, statistical
