I Am Smarticus – BOBI Submission
Total Page:16
File Type:pdf, Size:1020Kb
Taking the fight to COVID-19 Understanding how the UK’s response to COVID-19 compares with that of other countries and why 1 Contents 1. Analytical approach and rationale 2. Executive Summary 3. How the UK measures up 4. The drivers of case and death rates internationally 5. The drivers of case and death rates in England 6. Gladiators, are you ready? 2 1. Analytical approach and rationale 3 In our analysis, we have answered the brief and have gone beyond it, in an engaging way An analysis of the global COVID-19 data, compared with and contrasted to the UK data • A good overview of the data, showing the key issues around the global pandemic The Brief • Commentary on the information • Tease out and answer questions that arise • Give the user the ability to interact with the data The key drivers/predicters of success in fighting the pandemic • Which countries did better and why? • How could we segment countries, based on their success in fighting We decided to go the pandemic beyond the brief • Which regions within England did better and why? 4 Before we started our analysis, we checked and filtered the world data for consistency 1. Countries removed 2. Consistent time period 3. Variables removed Non-country locations (e.g. continents) The majority of countries reported data We decided to concentrate on new cases were removed as it was not known how this from week 15 of 2020 (w/c 06/04/2020) and and new deaths as the most meaningful data was calculated. week 8 of 2021 (w/e 21/02/2021). variables to track COVID-19 over time. Countries without sufficient weekly We decided to focus on analysis of the We also decided to look at the data week information were removed, i.e. Cayman World data on this time period, which was a by week, rather than day by day, as public Islands, Comoros, Hong Kong, Lesotho, total of 46 weeks. holidays and country differences meant Marshall Islands, Samoa, Solomon Islands, their would be inconsistencies in daily Tajikistan, Vanuatu and Yemen) comparisons. This decision made the ‘new cases smoothed’ and ‘new deaths Countries with data limited to late 2020 or smoothed’ data unnecessary. early 2021 were also removed, i.e. Anguilla, Bermuda, Faeroe Islands, Falkland Islands, Any data that was not consistently provided Gibraltar, Guernsey, Isle of Man, Jersey, by the majority of countries was also Macao, Micronesia, Northern Cyprus, Saint removed, e.g. ICU/hospital admissions and Helena, Turks and Caicos Islands and patients, new and total tests, vaccination Vanuatu. rates, etc. 5 We filled some data gaps in the world data through regression interpolation, created consistency in the stringency index and removed outliers 4. Regression Interpolation 5. Stringency Index 6. Outliers removed Where data points in new cases and new In the original data this was provided daily Countries were removed from the analysis deaths were missing on a daily basis, the but only updated weekly. This meant that if they difference with cumulative totals were used there were often missing values mid-week. a) were below the 2.5th percentile in to fill in the gaps. We decided to replace this data with the population size, and/or In Kosovo, life expectancy data was added maximum weekly value, which provided b) were below the 2.5th percentile in terms from: more consistency. of death rate, and/or https://data.worldbank.org/indicator/SP.DYN.L Within the analysis, these were collapsed E00.IN?locations=XK c) were below the 2.5th percentile in terms into of case rate. a) average stringency index over whole Countries removed for one or more of these period reasons were: Cambodia, China, Dominica, b) maximum stringency index over whole Laos, Liechtenstein, Monaco, Saint Kitts & period. Nevis, San Marino, Tanzania, Timar, Vatican and Vietnam. In our key driver analysis, only (a) was found to be significant in the models. This and removals at Box 1 left us with a data set of 169 countries 6 In England, we identified supplementary data to enrich our key driver and segmentation analysis 1. Supplementary Data 2. LTLA Summaries Created 3. Consistent Time Period Additional data from external sources (see The data matched was created from data at A period of 48 weeks was chosen where Data Sources - Supplementary) was lower levels of geography (Lower Super there was consistent data on deaths and matched to the core data by Lower Tier Output Areas = LSOAs), which were cases from across 314 LTLAs. The date Local Authority (LTLA). The core English averaged in two alternative ways to create range was from Calendar week 13 2020 Covid dataset contained 315 of the 317 the LTLA (Lower Tier Local Authority) (w/c 23/03/2020) to Calendar week 8 2021 LTLAs on the Annual Population summaries: (w/e 21/02/2021). Survey/Labour Force Survey dataset* a) rank average of the LSOA rank for Due to missing data at the start of the each score (used to create deciles for series, North Devon was dropped from the each measure) and analysis b) average raw scores across an LTLA’s constituent LSOAs *City of London and Hackney were merged on the Covid data, Isles of Scilly were merged with Cornwall 7 In England, we also filled data gaps through regression interpolation before running final data checks 4. Regression Interpolation 5. Data Checks Four of the 314 regions did not have Exploratory analysis of death rate, case matchable data for the proportion aged 16+ rate and all average LTLA predictor scores in employment and the proportion of non- (average raw scores across an LTLA’s white ethnicity due to structural changes in constituent LSOAs) suggested that the definitions of LTLAs during 2020. The distributions were approximately normally in effected LTLAs were Aylesbury Vale, most cases and not overly skewed. A Chiltern, South Bucks and Wycombe. The decision was made to work with these missing values on these two variables were averaged raw scores for developing interpolated for these four LTLAs using the regression models regression interpolation method. This ensured that the English data set of 314 LTLAs was complete and that no values remained missing. 8 As we proceeded with our analysis, we followed two key principles Focus on Case / Death Rate, Convert raw data into deciles not total cases and deaths for consistency and clarity In our analysis of both the World and The distribution of the case rate, death rate England data, we have focused on cases and all potential predictors was skewed in and deaths per 1 million or per 1,000 the World Data, which was not surprising people, respectively. This seeks to go given the diversity of the countries. Rather beyond the obvious conclusion that more than transforming the variables we decided populous countries and regions have more to rank the countries on each measure and cases and deaths, neutralising the group these ranks into deciles. The population effect. regression models for the World data are Clarity and built entirely from these new decile Consistency variables, eliminating extreme outliers and producing better fitting models. The England modelling was based on raw data, but then reported as deciles for consistency and clarity. 9 Data Sources - Core Data Set Description Source Core / Supplementary UK COVid-19 new cases Up to 21 February 2021 https://www.bhbia.org.uk/assets/Downloads/sp Core ecimendate_agedemographic-unstacked- uk.csv Taken from: https://coronavirus.data.gov.uk/downloads/dem ographic/cases/specimenDate_ageDemographi c-unstacked.csv Rest of World COV-19 data Up to 26 February 2021 https://www.bhbia.org.uk/assets/Downloads/owi Core d-covid-data.csv Taken from Coronavirus Source Data - Our World in Data 10 Data Sources - Supplementary Data Set Description Source Core / Supplementary English indices of Released: 26 September 2019 https://www.gov.uk/government/statistics/englis Supplementary deprivation 2019 Data used: File 10: local authority district h-indices-of-deprivation-2019 summaries Variables include: • Index of Multiple Deprivation • Income • Employment • Education, Skills and Training • Health and Disability • Crime • Barriers to Housing and Services • Living Environment • Income Deprivation Affecting Children • Income Deprivation Affecting Older People UK COVid-19 new deaths Date of Data: 8 March 2021 https://coronavirus.data.gov.uk/api/v2/data?are Supplementary within 28 days of test aType=ltla&metric=newDeaths28DaysByDeath Date&format=csv 11 Data Sources - Supplementary Data Set Description Source Core / Supplementary Estimates of the population Released: 24 June 2020 https://www.ons.gov.uk/file?uri=/peoplepopulati Supplementary for the UK, England and Data used: MYE2 – Persons, MYE5 onandcommunity/populationandmigration/popul Wales, Scotland and Date of Data: Mid 2019 – Mid 2020 ationestimates/datasets/populationestimatesfor Northern Ireland Geography: Local authority: district / ukenglandandwalesscotlandandnorthernireland unitary (England only) /mid2019april2020localauthoritydistrictcodes/uk Variables: Age-0-90+, Area midyearestimates20192020ladcodes.xls Annual Population Survey / Released: 26 Jan 2021 https://www.nomisweb.co.uk/ Supplementary Labour Force Survey Geography: local authority: district / unitary (England only) Date of Data:Sept 2020 Variables selected: T18:1 (All Ages - All : All People ) T18:4 (All Ages - White: All People) T01:1 (All aged 16 & over - All : All People) T01:7 (All aged 16 & over - In employment : All People) T01:16 (All aged 16 & over - Unemployed : All People) T01:19 (All aged 16 & over - Inactive : All People) 12 Our submission consists of the following 3 documents