City Research Online
City, University of London Institutional Repository
Citation: Harper, G. (2017). A study of the use of linked routinely collected administrative data at the local level to count and profile populations. (Unpublished Doctoral thesis, City, University of London)
This is the accepted version of the paper.
This version of the publication may differ from the final published version.
Permanent repository link: https://openaccess.city.ac.uk/id/eprint/18244/
Link to published version:
Copyright: City Research Online aims to make research outputs of City, University of London available to a wider audience. Copyright and Moral Rights remain with the author(s) and/or copyright holders. URLs from City Research Online may be freely distributed and linked to.
Reuse: Copies of full items can be used for personal research or study, educational, or not-for-profit purposes without prior permission or charge. Provided that the authors, title and full bibliographic details are credited, a hyperlink and/or URL is given for the original metadata page and the content is not changed in any way.
City Research Online: http://openaccess.city.ac.uk/ [email protected]
A Study of the Use of Linked Routinely Collected Administrative Data at the Local Level to Count and Profile Populations
by Gillian Harper
A Dissertation Submitted to
Cass Business School City, University of London
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy in Statistics
April 2017
1 Contents
1 Introduction ...... 11 1.1 Overview ...... 11 1.2 Papers ...... 14 1.2.1 Paper 1 – Using administrative Data to Count populations ...... 14 1.2.2 Paper 2 - Applications of Population Counts Based on Administrative Data at Local Level ...... 16 1.2.3 Paper 3 - Using Administrative Data to Count and Classify Households with Local Applications ...... 17 1.2.4 Paper 4 - Impact of Asthma on Educational Attainment in a Socioeconomically Deprived Population: A Study Linking Health, Education and Social Care Datasets ...... 20 1.3 Individual contribution to the co-authored papers presented in the thesis ...... 21 1.3.1 Personal contribution to the work presented in chapter 2 ...... 21 1.3.2 Personal contribution to the work presented in chapter 3 ...... 23 1.3.3 Personal contribution to the work presented in chapter 4 ...... 23 1.3.4 Personal contribution to the work in chapter 5 ...... 24 2 Using Administrative Data to Count Local Populations ...... 26 2.1 Introduction ...... 26 2.2 Background ...... 29 2.3 Data Sources ...... 32 2.4 Methodology ...... 34 2.5 Residuals ...... 38 2.6 Evaluation of Results ...... 40 2.7 Matching Algorithms ...... 41 2.7.1 Address Matching ...... 42 2.7.2 Person Matching ...... 43 2.8 A Worked Example ...... 44 2.9 Conclusions ...... 51 3 Applications of Population Counts Based on Administrative Data at Local Level ...... 56 3.1 Introduction ...... 56 3.2 Limitations of Official Population Statistics ...... 58 3.2.1 Resource Allocation ...... 58 3.2.2 Use as Denominators ...... 59 3.2.3 Use in Delivery of Local Services ...... 61 3.2.4 Geography as a Barrier ...... 62 3.2.5 Use of Administrative Data in Practice ...... 64 3.3 Data Structures Using Administrative Data Sources ...... 66 3.4 Application Example ...... 70 3.4.1 Background to Case Study ...... 70 3.4.2 Take-up Rates of Free Eye Tests Under the NHS ...... 74 3.4.3 Geographical Access to Free Eye Tests ...... 75 3.4.4 Impact of Geographical Access on Take-up Rates ...... 77 3.4.5 Evaluation of an Alternative Service Configuration ...... 79 3.4.6 Implications for Resource Allocation ...... 81 3.5 Discussion ...... 82 3.6 Conclusions ...... 83 4 Using Administrative Data to Count and Classify Households with Local Applications . 86 4.1 Introduction ...... 86 4.1.1 Why Analyse Households? ...... 86
2 4.1.2 Present Arrangements and the Case for Change ...... 88 4.1.3 Aims of Paper ...... 91 4.2 Creating Household Statistics from Administrative Data ...... 92 4.2.1 Background to Demographic Counts ...... 92 4.2.2 Alternative Household Classification Systems Using Administrative Data ...... 93 4.2.3 Enumerating Household Types ...... 95 4.2.4 Mapping Household Counts on to Standard Types ...... 98 4.2.5 Examples of Household Enumeration ...... 99 4.3 A Case Study: Child Poverty in Hackney ...... 102 4.3.1 Access to Children’s Centres in Hackney ...... 107 4.4 Administrative Counts Versus Official Household Statistics ...... 109 4.5 Comparison of Household Types Using Official Figures ...... 114 4.6 Reasons for Differences Between Sources ...... 117 4.7 Discussion ...... 118 4.8 Appendix 4.A - Measuring Statistical Quality of Population Estimation and Household Counts from Administrative Data Method ...... 120 5 Impact of Asthma on Educational Attainment in a Socioeconomically Deprived Population: A Study Linking Health, Education and Social Care Datasets ...... 123 5.1 Introduction ...... 123 5.2 Methods ...... 124 5.2.1 Study participants ...... 124 5.2.2 Outcome variables ...... 124 5.2.3 Predictor variables ...... 125 5.2.4 Data linkage ...... 126 5.2.5 Statistical methods ...... 128 5.3 Results ...... 130 5.3.1 Primary analysis ...... 131 5.4 Discussion ...... 133 5.4.1 Summary ...... 133 5.4.2 Strengths and weaknesses ...... 134 5.4.3 Comparison with other studies ...... 135 5.4.4 Clinical and policy relevance ...... 135 5.4.5 Conclusion ...... 136 5.4.6 Acknowledgments ...... 136 5.5 Appendix 5.A - Sensitivity analyses ...... 136 6 Conclusions ...... 139 6.1 Impact of the research ...... 140 6.1.1 Academic penetration ...... 140 6.1.2 Commercial implementation ...... 141 6.1.3 Informing national statistics and trailblazing administrative data strategy ...... 145 6.1.4 Research Excellence Framework (REF) recognition and award ...... 151 6.2 Critical reflection ...... 151 6.3 Overall Summary ...... 152
3 List of Tables
Table 2.1: Features of available local administrative data sets ...... 32 Table 2.2: Example of a simple truth-table based on Figure 2.1. Key: A accept; R reject ...... 36 Table 2.3: Population count audit trail for a case study ...... 45 Table 2.4: Comparison of case study population age breakdown from different sources ...... 48 Table 2.5: Enumeration of rejected records for case study ...... 51 Table 3.1: Typical structure of currently available official population data (OA = Output Area) 68 Table 3.2: Typical structure of databases using administrative data sources (OA = Output Area) ...... 68 Table 3.3: Household structure, population, tenure and benefit status ...... 73 Table 3.4: Table segmenting the population of Tower Hamlets by access to eye test centres according to the given risk factors ...... 77 Table 3.5: Alternative resource allocation scenarios ...... 82 Table 4.1: Specific examples of households defined by size and age group (Key: O indicates a person) ...... 95 Table 4.2: Possible combinations of household demographic types based on size and age (see text for details of the highlighted cells) ...... 97 Table 4.3: Mapping household demographic combinations on to the eight standard types, A to H for the case of three age groups and up to four occupants ...... 99 Table 4.4: Summary table showing a breakdown of households across the six Olympic Boroughs and selected key attributes ...... 100 Table 4.5: Average household age and occupancy ...... 102 Table 4.6: Summary table showing a breakdown of households in Hackney according to the number of children, housing tenure and benefit status according to three different communities as at 2011 ...... 105 Table 4.7: Odds of income deprivation by risk factor including 95 % confidence intervals (CI) ...... 106 Table 4.8: Comparison based on total number of households by local authority using administrative, DCLG, GLA and ONS Census data, and the % difference of each compared to the administrative data counts ...... 111 Table 4.9: Comparison of count and % of vacant dwellings by local authority using administrative, DCLG, GLA and ONS Census data, ...... 113 Table 4.10: The government household typology scheme ...... 115 Table 4.11: Household type counts in London Borough of Hackney using administrative, DCLG, and ONS Census data, and the % difference of each compared to the administrative data counts ...... 116 Table 5.1: Key stage tests for the UK’s National Curriculum ...... 124 Table 5.2: Characteristics of 12,136 pupils that sat Key Stage tests in 2002 to 2005...... 128 Table 5.3: Coefficients, 95% confidence intervals and P-values for the effect of socio demographic and clinical variables on standardised attainment scores in Key Stage tests 1, 2 and 3, from multiple regression model allowing for clustering ...... 131 Table 5.4: Table 5.3: Coefficients, 95% confidence intervals and P-values for the effect of socio demographic and clinical variables on standardised attainment scores in Key Stage tests 1, 2 and 3, from multiple regression model allowing for clustering ...... 133
4 List of Figures
Figure 2.1: Simple Venn diagram partitioning different categories of administrative data with and without addresses ...... 35 Figure 2.2: Summary of population count methodology stages ...... 37 Figure 2.3: Pathway to determine if a person is a current resident at a UPRN or not ...... 38 Figure 2.4: Residuals and possible remedial actions ...... 39 Figure 2.5: Extended UPRN assignment flow chart. Key: SAON = Secondary Address Object, PAON = Primary Address Object ...... 43 Figure 2.6: Distribution of high UPRN occupancy levels resulting from the case study ...... 46 Figure 2.7: Chart showing the differences in estimates by age group between the administrative count and ONS and GLA ...... 49 Figure 3.1: The flow of administrative data at national and local level ...... 66 Figure 3.2: Density of Bangladeshi population in Tower Hamlets by Lower Super Output Area (LSOA) (Contains Ordnance Survey data © Crown copyright and database right 2010, and data sourced from London Borough of Tower Hamlets) Note: LAPs are Local Area Partnerships ...... 71 Figure 3.3: Geographical access to eye testing centres based on 10-minute walk time or 500 m. Round symbols indicate locations of households with one or more persons aged 60+ (Contains Ordnance Survey data © Crown copyright and database right 2010, and data sourced from London Borough of Tower Hamlets) ...... 75 Figure 3.4: Free eye test take-up in the 60+ population based on distance from nearest eye test centre ...... 78 Figure 3.5: Geographical access to GP practices based on 10-minute walk time or 500 m. Round symbols indicate locations of households with one or more persons aged 60+ (Contains Ordnance Survey data © Crown copyright and database right 2010, and data sourced from London Borough of Tower Hamlets) ...... 79 Figure 3.6: Predicted change in eye test take-up in the 60+ population following re- configuration ...... 80 Figure 4.1: Stages in the production of person and household level data and policy domains supported ...... 94 Figure 4.2: The eight standard household types from administrative data ...... 96 Figure 4.3: Scatter-gram of household types showing occupancy versus average age by output area ...... 101 Figure 4.4: Map showing the locations of households on benefits with children aged<5 that are outside pram pushing distance from the nearest children’s centre ...... 108
5 Acknowledgements
The research material in this thesis is from papers published previously in peer-reviewed journals. Chapters 2, 3, 4 and 5 were originally published in Harper and Mayhew (2012a), Harper and Mayhew (2012b), Harper and Mayhew (2016), and Sturdy et al (2012) respectively. The author is grateful to the editors of the journals involved (Journal of Applied Spatial Analysis and Policy, and PLOS ONE) for giving their permission for the papers to appear in this thesis. Thanks also to the anonymous referees for their comments and suggestions that led to the improvement of the published papers.
The author would like to thank ESRC (ESRC RES-163-27-0019: ‘Using Administrative Data to Estimate the Population and Measure Deprivation’) for funding to be able to convert the methodology into the academic papers in chapters 2 and 3.
The author would like to acknowledge all the helpful comments and suggestions made by her colleague Professor Les Mayhew, and Professor Richard Verrall in his capacity as supervisor.
Further, the author would also like to acknowledge the contributions to the research of Professor Les Mayhew and John Eversley, and the Blizard Institute, Queen Mary, University of London, and the support of Cass Business School.
Lastly, the author is grateful to Jon and her friends for their support and always believing in her.
6 Declaration
The author grants powers of discretion to the University Librarian to allow this thesis to be copied in whole or in part without further reference to the author. This permission covers only single copies made for study purposes, subject to normal conditions of acknowledgement.
7 Abstract