
A POPULATION PROJECTION MODEL FOR ETHNIC GROUPS: SPECIFICATION FOR A MULTI-COUNTRY, MULTI-ZONE AND MULTI-GROUP MODEL FOR THE UNITED KINGDOM

Paper presented at the International Conference on Effects of Migration on Population Structures in Europe, Vienna, 1-2 December 2008, organized by the Vienna Institute for Demography and held at the Austrian Academy of Sciences, Theatersaal, Sonnenfelsgasse 19, 1010 Vienna, Austria

Philip Rees, Pia Wohland, Paul Norman and Peter Boden
School of Geography, University of Leeds, Leeds LS2 9JT, United Kingdom

Contact Phil Rees: Email [email protected], Tel +44 (0)113 343 3341

Version 0.6, 27 November 2008

Acknowledgment: This paper is part of a research project funded by ESRC Research Award RES-165-25-0032 (1/10/07 to 30/9/09) What Happens When International Migrants Settle? Ethnic Group Population Trends and Projections for UK Local Areas. (http://www.geog.leeds.ac.uk/projects/migrants/)

ABSTRACT

As part of a project entitled What happens when international migrants settle? Ethnic group population trends and projections for UK local areas (ESRC Award RES-165-25-0032), a team at the University of Leeds is building a projection model that combines three features that have been used separately in previous work: a multi-zone structure that recognizes flows between zones; a hierarchical structure (country, zone within country) that recognizes discontinuities in both migration flows and ethnic group definitions between countries; and interactions between ethnic groups in the fertility/birth process. This paper describes the model specification and reports on progress towards full specification and implementation. The model requires inputs of detailed components of change and assumptions about their future trajectories. To carry out the projections, the team is (1) estimating ethnic group fertility for local areas using alternative data sources; (2) estimating ethnic group mortality for local areas through indirect modelling; (3) building a databank of international migration for local areas from census, survey and administrative data sets and classifying the external migration flows by ethnicity; and (4) building estimates of internal migration for ethnic groups using census and register data, adopting a model structure that captures the most important interactions. The paper presents ethnic group mortality estimates that will form one input to the projection model.

INTRODUCTION

Context

The population of the United Kingdom is continuing to grow at a moderate pace, 0.64% in 2006-7. There are several factors promoting continued growth: the remaining demographic momentum of high fertility in the 1960s and early 1970s, the recent rise (catch-up) in fertility levels, the continuing improvement of survival of people to and within the older ages and the ongoing high level of net immigration (ONS 2008). Births have risen from 663 thousand in 2001-2 to 758 thousand in 2006-7, while deaths have decreased from 601 thousand to 571 thousand. Natural increase has risen since 2001 to contribute 48% to population change in 2006-7 from only 30% in 2001-2. Immigration has grown in the same period from 484 thousand in 2001-2 to 683 thousand in 2006-7. Emigration has also increased from 334 thousand to 406 thousand in the same period. Net migration was 148 thousand in 2001-2 and 198 thousand in 2007-8 but had been 262 thousand in 2004-5 in the period of highest immigration from the new EU member states.

This population growth varies considerably from place to place (Dunnell 2007). Growth was highest in the East Midlands, East, South West and Northern Ireland regions over the years 2001-2006, but each region has a few local authorities that have experienced decline.

Against this backcloth of demographic change, the ethnic composition of the population is changing quite fast. ONS estimates for England for the 2001-6 period show a 2.7% increase in the total population, a 0.4% decrease in the White British group and a 23.0% increase in the not-White British group (ONS 2008, Large and Ghosh 2006a, 2006b). In 2001 the White British made up 87% of the England population and ethnic minorities 13%. By 2006 this had shifted to 84% White British and 16% ethnic minorities. Both immigration and natural increase of the not-White British contribute to substantial population change, which varies considerably across the local authorities of the UK. Profound change in the size and composition of the UK’s local populations is in prospect.

ESRC has funded the University of Leeds to undertake an investigation of What happens when international migrants settle? Ethnic group population trends and projections for UK local areas (ESRC Research Award RES-165-25-0032, 1/10/07 to 30/9/09). The changing population and spatial dynamics of the UK’s ethnic groups are being studied in this project through the construction of population projections for local areas¹. Projections assess the contributions of fertility, mortality, international and internal migration processes to population change and work out their implications under sets of assumptions. The project is making new estimates of ethnic fertility because previous work reports a range of values. To make better estimates of international migration by ethnicity, we are constructing from available census, survey and administrative sources a New Migrant Databank of local area international migration classified by ethnicity. The project will make estimates of ethnic mortality differences between local areas. We will use commissioned census tables of migration by ethnicity and register based migration to make estimates of the internal migration of ethnic groups.

We will build a new population projection model with associated software. The model will incorporate the following processes: migration between local areas and mixed offspring to mothers and fathers of different ethnicity. The model will use the different 2001 Census ethnic classifications of England, Wales, Scotland and Northern Ireland. As migrants cross country borders they will be reclassified. The projection model will be used to compute (1) a principal projection aligned to the latest official national assumptions, (2) a similar projection constrained to the official projection numbers to understand the effect of such constraints, (3) variant projections and (4) scenario projections exploring the effects of policies for international migration and of local/regional plans.

¹ By local areas is meant the local authority (government) areas of the lowest tier, with two exceptions. Two small local authorities were merged with neighbouring authorities: City of London with Westminster in London and the Isles of Scilly with Penwith in Cornwall.

The overall aim of this paper is to set out a full design of a model that can be used to project the population by ethnicity for local authorities across the whole of the United Kingdom. This design will be used in two ways: (1) to provide the specification for a computer program that will implement the model; and (2) to provide the specifications for the input variables to the model (work is being carried out separately on each of the projection inputs).

Review of existing models and software

Why is such a projection model needed? Could not existing models and associated software be used to produce the projections? Unfortunately existing population projection models suffer from a number of deficiencies as instruments for projecting populations by ethnicity. We briefly critique the models which have been used.

Single-region models (POPGROUP, the JRF Model)

Simpson and colleagues (CCSR 2008) have developed a suite of spreadsheet macros that implement a single-region cohort-component model with net migration, which is widely used by Local Governments and has been applied to Birmingham, Oldham, Rochdale and Leicester (Simpson 2007a, 2007b, 2007c; Simpson and Gavalas 2005a, 2005b, 2005c; Danielis 2007). Rees and Parsons (2006) in work for the Joseph Rowntree Foundation (JRF) used a single-region cohort-component model for UK regions which used four migration streams: internal out-migration and emigration as intensities (probabilities) and immigration and internal in-migration as flows.

These models neglect the fundamental nexus in multistate population dynamics: the out-migrants from one region become the in-migrants to other regions (Rogers 1990). If we wish to introduce a model of migration (such as the spatial interaction model used in Champion, Fotheringham, Rees et al. 2002), then this is best accomplished through the framework of a multi-regional projection.

Multi-region models (LIPRO, UKPOP, ONS Sub-national Model, GLA Model)

Since the 1970s various programs have been developed to implement the multi-regional cohort-component model. In the early 1990s a general version was developed at NIDI by van Imhoff and Keilman (1991) for use with household projections but in a form in which other state definitions could easily be introduced. The software is made available (NIDI 2008) though no longer supported as a licensed package. The problem with this software is some uncertainty about its capacity for handling “transition data” (e.g. census migration), having been designed for inputs of “movement data” (e.g. register migrations). It is still intensively used at NIDI and by Eurostat for various projections and by researchers in the UK such as Murphy.

In the UKPOP model (Wilson 2001, Wilson and Rees 2003) the accounts based model developed by Rees (1981) is applied to a full set of UK local authorities. The accounts based model relies on iteration to make consistent the relationship between observed deaths in a region (the variable generally available) and the deaths to the population in the region at the start of the interval (who die in that region and elsewhere). Efforts by Parsons and Rees to re-apply this model met with difficulties in achieving convergence in the iterative procedure. The model could, for example, generate negative probabilities of survival within a region at older ages. The reason for this was that populations, deaths and migration come from different data sources (e.g. census and vital register) which may be inconsistent and in error. Wilson and Bell (2004a) and Wilson et al. (2004) have used simpler versions of the multi-regional model in important work, either with much smaller numbers of spatial units or using a sequence of bi-regional models. This builds on experiments by Rogers (1976). Wilson and Bell (2004b) establish that a set of bi-regional models gives results close to a full multiregional model. Wilson (2008) has also developed a model for the indigenous and non-indigenous population of the Northern Territory, Australia, which has a number of very useful features.

ONS Sub-national Model for England, GLA Model for London Boroughs

Both these models have a long pedigree and are in continual use. The ONS Sub-national model for Local Authorities in England is implemented by the Office for National Statistics in collaboration with outside contractors. A broad outline of the methodology is in the public domain (ONS 2008). The results of the GLA model are frequently published but again only some of the details of the underlying model are in the public domain (London Research Centre 1999, Storkey 2002a, Hollis and Bains 2002, Bains and Klodawski 2006, Bains and Klodawski 2007).

Nested multi-region models (MULTIPOLES)

Kupiszewski and colleagues at CEFMR (Kupiszewska and Kupiszewski 2005, Bijak et al. 2005, Bijak et al. 2007) have developed a model from an idea by Rees et al. (1992) that uses several layers. For example, in a projection study of 27 EU states (Bijak et al. 2005) three layers are recognised: inter-region migration within states, inter-state migration within the EU and extra-EU migration. This approach enables different models to be used in the different layers within a consistent accounting framework.

None of these models builds in extra ingredients which are necessary for an ethnic population projection model and which are spelt out in the model specification of Rees (2002).

These ingredients are as follows. (1) The model should be able to generate mixed origin infants from parents with different ethnicities. (2) People should be able to change their ethnic self-identification at particular points in life (e.g. on attaining adulthood at age 18). (3) The model should be able to handle different ethnic classifications and to convert between them (e.g. from country of origin to ethnic group or from one ethnic classification in England to that in Scotland and vice versa).

Other requirements are: (4) The model should use single years of age and single year projection intervals. No other age-time plan yields annual projections properly. (5) The age range should extend to 100+ as the final age group, because of increasing longevity and a need to project numbers at very old ages, whose health care needs are very high. (6) The model should be able to handle the interactions between a large number of local authorities (e.g. 352 in England), at least in principle, though in practice, a reduced model may be necessary.

These ingredients and requirements are not met fully by any previous model or software, so it is proposed to build a new model.
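Ingredient (1) above can be illustrated with a minimal sketch. It assumes that births to mothers of each ethnic group are allocated to infant ethnic groups by a matrix of mixing proportions; the three group labels, the proportions and the function name `infants_by_ethnicity` are hypothetical and are not estimates or code from the project.

```python
# Hypothetical illustration: allocate births by mother's ethnic group to
# infant ethnic groups using a mixing matrix whose rows sum to 1.
GROUPS = ["White", "Asian", "Mixed"]  # illustrative labels only

# MIXING[m][k] = assumed share of births to mothers in group m whose
# infants are assigned to group k (made-up values, not project estimates).
MIXING = {
    "White": {"White": 0.97, "Asian": 0.00, "Mixed": 0.03},
    "Asian": {"White": 0.00, "Asian": 0.90, "Mixed": 0.10},
    "Mixed": {"White": 0.00, "Asian": 0.00, "Mixed": 1.00},
}


def infants_by_ethnicity(births_by_mother, mixing=MIXING):
    """Return births by infant ethnic group, given births by mother's group."""
    infants = {g: 0.0 for g in GROUPS}
    for mother_group, births in births_by_mother.items():
        for infant_group, share in mixing[mother_group].items():
            infants[infant_group] += births * share
    return infants


births = {"White": 1000.0, "Asian": 200.0, "Mixed": 50.0}
infants = infants_by_ethnicity(births)
```

Because each row of the mixing matrix sums to one, the total number of infants equals the total number of births; only their ethnic classification changes.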

Organization of the paper

The rest of the paper spells out the design for a multi-layer, multi-regional, multi-ethnic group projection model. Section 2 outlines the state space, accounting assumptions and basic equations of the model. Section 3 gives an account of model inputs: base populations, fertility, mortality, immigration, emigration and internal migration, focussing on new ethnic mortality estimates. Section 4 summarises the paper and discusses the next steps. This paper serves as a progress report on model design, which will, we hope, progress rapidly over the next few months.

A MULTILEVEL, MULTISTATE PROJECTION MODEL FOR ETHNIC GROUPS

This section of the paper specifies the proposed model in detail, using a mix of algebraic equations and suggested calculation sequences. This is a first draft and will be revised as the software to implement the model is built.

The aim is to specify the model in as general a way as possible but also to give details of the particular features of the UK local ethnic projection system for which the model will be used. The model uses a transition framework because the vital internal migration information derives from the decennial census. The model can be adapted where similar migration data sets are available. However, there will be a demand for such a model in countries where the migration data derives from registers rather than censuses. It should be relatively easy to re-specify the model in a movement framework later.

Overall design

The modelling system is set out in Figure 1 and Tables 1 to 6. Figure 1 outlines the structure of the projection model. Table 1 lists the ethnic groups used in the different home countries of the UK. Table 2 provides detail for one population dimension, age. Table 3 lists the notation used for the model. Table 4 provides a picture of the population accounting framework used in the model. Table 5 catalogues the detailed input data required for the base period. Table 6 describes the parameters that define the state space of the model.

[Figure 1 about here]

The structure of the model (Figure 1) is as follows. We input to the model sets of parameters and fixed data which describe (1) the system of interest (e.g. countries, zones, ethnicities, ages, genders), (2) the nature of the projection model we wish to run (e.g. constant intensities and flows, scenarios of new intensities and flows for one or more components) and (3) the kind of projection outputs we need (e.g. just populations for all ages, populations in age detail, all the components in age detail). The model then receives from input data files the estimates of population and intensities of the demographic change components for a selected base or benchmark time interval. The model then takes the population through the processes of survival (and mortality), emigration (migration out of the country), internal migration (within a country) at two scales, namely inter-region migration (within the national system) and inter-zone migration within each region, and immigration from the external world (countries other than the UK). These processes produce new populations at the end of each interval, which become the start populations of the next time interval. The fertility process (generating new births) is applied last, when the female population at risk of giving birth can be computed. The same sequence of demographic processes (infant components) is then applied to the newborn as was applied to the population present in regions and zones at the start of the interval.

To implement the demographic processes we supply the modelling system with assumptions about the direction of change of each of the components. This can be done by using a simple technique or by using a full set of input variables for each time interval in the projection. Simple techniques include assuming constancy in projection inputs or inputting summary change parameters (e.g. the rate of decline over time in mortality rates) and using distributions based on the base period variables.
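The two simple techniques can be sketched as follows. This is an illustration only: it assumes a constant proportional annual decline applied uniformly to base-period mortality rates, and the function name `project_rates`, the ages and the rate values are hypothetical.

```python
def project_rates(base_rates, annual_decline, years):
    """Apply a constant proportional annual decline to base-period rates.

    base_rates     : dict mapping age -> base-period mortality rate
    annual_decline : e.g. 0.02 for an assumed 2% fall per year (hypothetical)
    years          : number of projection intervals after the base period
    """
    factor = (1.0 - annual_decline) ** years
    return {age: rate * factor for age, rate in base_rates.items()}


base = {60: 0.010, 70: 0.025, 80: 0.070}    # illustrative values only
constant = project_rates(base, 0.0, 10)      # constancy assumption
declining = project_rates(base, 0.02, 10)    # summary change parameter
```

With a decline of zero the base-period distribution is carried forward unchanged; with a non-zero decline the age pattern of the base period is kept and only its level changes.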

The results of the demographic change processes are written to output files, in lesser or greater detail, depending on requirements. We specify these requirements in a file of projection results parameters, which are input before processing of the population begins.

We place two tasks outside of the modelling system.

The first task is the estimation of model inputs for the UK. Some of the methods used are explained in section 3 of the paper and depend on the types of data available and assumptions that need to be made in the absence of really detailed data.

The second task is the presentation of outputs in as transparent and elegant a form as possible. The output results will be written to comma-separated value files, which can be loaded into spreadsheets, where further variables can be generated. It will be important to write as many codes and labels as possible to these output files to make this task easier. We may explore the creation of a system for processing projection outputs in a general way (other research groups have done this), but for the moment we envisage that the user of the modelling system will select from a menu of fixed, pre-defined tables. During model coding very detailed outputs will be needed for diagnosis of run-time problems. For example, we might wish to look at detailed origin to destination flows by ethnic group and age. However, for further analysis of the results we only need to look at totals for in-migration, out-migration, emigration and immigration, not the zone to zone flows themselves.

There are several issues which are being addressed in building the model.

The first issue is the small number problem. That is, many of the populations (by ethnicity, single year of age, local authority) will be very small (fractions of a person). A method will be needed to report these results, as simple rounding will result in undercounting. One possibility is to compute the integer difference between the sum of real variables and the sum of rounded variables and allocate the difference probabilistically based on remainders. The small number issue also affects the detail in which we can represent the variables in the model. We will need to substitute a model that estimates inter-zone migration by ethnicity, age and gender for directly measured intensities.
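One way of allocating the difference probabilistically is sketched below. It is an assumption about how this could be done, not the project's method: every cell is rounded down, the shortfall against the rounded total is computed, and that many cells are then selected without replacement with probability proportional to their fractional remainders. The function name `controlled_round` and the cell values are hypothetical.

```python
import math
import random


def controlled_round(values, seed=0):
    """Round non-negative reals to integers, preserving the rounded total."""
    rng = random.Random(seed)              # seeded for reproducible runs
    floors = [math.floor(v) for v in values]
    remainders = [v - f for v, f in zip(values, floors)]
    # Integer difference between the rounded total and the sum of floors.
    shortfall = round(sum(values)) - sum(floors)
    candidates = [i for i, r in enumerate(remainders) if r > 0]
    result = list(floors)
    for _ in range(shortfall):
        weights = [remainders[i] for i in candidates]
        chosen = rng.choices(candidates, weights=weights, k=1)[0]
        result[chosen] += 1
        candidates.remove(chosen)          # each cell gains at most one person
    return result


cells = [0.3, 0.4, 0.2, 0.4, 0.3, 0.4]     # six cells summing to two persons
rounded = controlled_round(cells)
naive_total = sum(round(c) for c in cells)  # simple rounding loses both persons
```

In this example simple rounding reports no one at all, whereas the controlled rounding reports two persons, placed in cells chosen in proportion to their remainders.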

The second issue is whether to choose top down or bottom up projections. If we adopt a two tier approach to the projections (countries as the upper tier, and zones/local authorities as the lower tier), then we could generate two sets of projected populations. The first would be a bottom up projection – the country projected populations would be the sum of the zone projections. The second would be a top-down projection – the zone populations would be adjusted so that they add up to the country projections. The arguments about which to choose are finely balanced: the top-down approach recognises that variables can be more reliably estimated for larger populations; the bottom-up approach says that we may miss important heterogeneity if we impose country constraints on local populations. This is a research issue to be investigated, so that we will need to build both options into the modelling system. However, National Statistics (ONS 2008) report that the results of sub-national projected populations for England are adjusted to match the England national projections, stating that the differences are relatively small. This suggests that the heterogeneity of component rates and flows across local authorities is relatively low.
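The difference between the two options can be shown with a minimal sketch. It assumes the top-down adjustment is a simple proportional scaling of zone populations to an independently projected country total; the paper does not specify the adjustment method, and the zone names and numbers are hypothetical.

```python
def constrain_to_total(zone_pops, country_total):
    """Scale zone populations proportionally so they sum to the country total."""
    bottom_up_total = sum(zone_pops.values())
    if bottom_up_total == 0:
        raise ValueError("cannot scale zones whose populations sum to zero")
    factor = country_total / bottom_up_total
    return {zone: pop * factor for zone, pop in zone_pops.items()}


zones = {"zone_a": 120.0, "zone_b": 80.0, "zone_c": 50.0}  # hypothetical
bottom_up = sum(zones.values())               # bottom-up country total
top_down = constrain_to_total(zones, 260.0)   # zones adjusted to country total
```

Proportional scaling preserves each zone's share of the country population, so any heterogeneity in zone growth that is inconsistent with the country projection is spread evenly across all zones.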

The third issue is the closely related question of the conformity of ethnic group and all group projections. Should the ethnic group projections be adjusted so that they sum to the official national and sub-national projections (ONS, WAG, GROS, NISRA)? Because component rates and flows differ so much between ethnic groups, the sum of ethnic group projected populations is likely to be higher than all group (without ethnicity) projections, as the weighting of high growth groups will increase over time. Again this is a research issue to be investigated, so we will need to build both options (constrained and unconstrained) into the modelling system.

We discuss next the state-space in terms of the population classifications we will be using.

The state space

The state space: home countries

This upper tier consists of England, Wales, Scotland and Northern Ireland. The United Kingdom is made up of four countries, which have different and evolving governmental arrangements, though all are represented in the UK Westminster Parliament. The constitutional arrangements are complicated but at time of writing Scotland has its own Parliament and government (formerly the Scottish Executive, now The Scottish Government) in Edinburgh. Wales has its National Assembly for Wales and its Welsh Assembly Government in Cardiff. Northern Ireland has its own Northern Ireland Assembly and government (just re-started), the Northern Ireland Executive. England has no specific assembly or government arrangements. All matters relevant to England are dealt with by the UK Parliament.

We also use two combined entities, for which either input statistics are available or for which output statistics may be produced. These are “England and Wales”, which combines England and Wales and “Great Britain” which combines England, Wales and Scotland.

The state space: local areas

The lower tier consists of local authorities. The structure and types of local authorities differ between countries. See the file provided by ONS (2003) for a map of UK local authorities. Figure 2 (Dunnell 2007) maps population change from mid-2001 to mid-2006 for the local areas we are using. In Wales there are 22 Unitary Authorities; in Scotland there are 32 Council Areas; in Northern Ireland there are 26 Local Government Districts. In England there are 354 local authorities of four types: 46 unitary authorities, 239 county districts (with some functions being carried out by counties), 36 (former) metropolitan county districts (with some functions still at metropolitan county level) and 33 London Boroughs (local government units within the Greater London Authority). We have merged two pairs of English local authorities because one of each pair has a very small population. The City of London is merged with Westminster, a neighbouring London Borough. The Isles of Scilly in Cornwall is merged with Penwith, the nearest county district on the mainland. In the discussion that follows, when we talk about local areas we mean the lower tier local authorities of the UK including these two merged pairs. Other administrative geographies, such as counties, the Greater London Authority or Government Office Regions, can be built from these bottom tier local authorities.

[Figure 2 about here]

The state space: ethnic classifications

Ethnic classifications are based on self-reporting through census or social survey questionnaires. Considerable consultation and debate goes into the formulation of the question. The resulting categories are a compromise between the demands of pressure groups interested in counting and promoting their own group and a need to make the question one that the whole population can understand. An ethnic group is a set of people with a common identity based on national origin and race. Other criteria such as religious affiliation or language spoken have also been used to define ethnicity, although people can change their religions and acquire or lose languages, whereas national origin is usually fixed by birth or birth of parents or grandparents and race depends on certain physical characteristics such as skin colour or facial features. Ethnic classifications change over time, recognising the evolution of groups as a result of migration from the outside world and as a result of marriage/partnership of people from different groups resulting in children of mixed ethnicity.

Here we use the ethnic group classifications adopted in the 2001 Census, which differ from those in the 1991 Census in recognizing several mixed groups. Table 1 sets out the classifications, which are specific to each home country within the UK. In England and Wales a 16 group classification is used; in Scotland, 5 groups are used and in Northern Ireland 12 groups are used. The classifications are based on two concepts: race and country of origin (either directly through migration or through ancestry). Many studies (e.g. Rees and Parsons 2006) used a collapsed version of the classification (e.g. White, Mixed, Asian, Black, Chinese & Other) but these amalgamated classes hide huge differences in terms of timing of migration to the UK, age-sex structures, population dynamics and socio-economic and cultural characteristics. Most studies (e.g. Coleman and Scherbov 2005, Coleman 2006, Rees and Butt 2004) drop the Mixed group: since the 2001 Census revealed this to be the fastest growing group such an omission is regrettable.

[Table 1 about here]

The state space: ages

We use single years of age in the projection model. They are set out in Table 2. It is essential to use single years of age in a projection model so that projections for each year can be produced and so that aggregate age groups can be flexibly constructed. We will extend the age range to 100 and over, recognising the higher rates of survival into the older old ages that are now present in the population and recognising the important demands for care generated by the older old population. We recognise two different classifications: period-age and period-cohort. Many vital statistics are classified using the period-age scheme, but for projection models it is essential to use the period-cohort age-time-plan. In order to project the population aged 100+, we need to estimate survivorship probabilities for an additional period-cohort (100+ to 101+). This involves assuming that the survivorship probabilities in the 99 to 100 period-cohort and the 100+ to 101+ period-cohort are equal to the survivorship probability for the 99+ to 100+ cohort, which can be estimated. This assumption is not unreasonable as in very old populations we observe a slowing down of the increase of mortality with age. The age classification used for fertility rates is also shown in Table 2. Fertility rates are usually reported by period-age. The method for handling period-age fertility rates in a period-cohort projection model is explained later.
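The treatment of the open-ended final age group can be sketched as follows. The sketch covers survival only (migration and births are ignored), and the function name `survive_one_year`, the list layout and all numerical values are hypothetical. Following the assumption stated above, the 99 to 100 and 100+ to 101+ period-cohorts are given the same survivorship probability.

```python
MAX_AGE = 100  # index of the final, open-ended age group (100+)


def survive_one_year(pop, surv):
    """Age a population forward one year along period-cohorts.

    pop[a]  : start population aged a (a = 0..99); pop[100] is the 100+ group
    surv[a] : survivorship probability for the cohort aged a at the start;
              surv[100] applies to the 100+ to 101+ period-cohort
    Returns end-of-year populations; age 0 is left at zero because it is
    filled by the separate infant (birth to age 0) period-cohort.
    """
    end = [0.0] * (MAX_AGE + 1)
    for a in range(MAX_AGE):
        end[a + 1] += pop[a] * surv[a]
    # Survivors of the open-ended group stay in the 100+ group.
    end[MAX_AGE] += pop[MAX_AGE] * surv[MAX_AGE]
    return end


s_99plus = 0.6                                   # made-up 99+ to 100+ value
surv = [0.99] * 99 + [s_99plus, s_99plus]        # ages 0..98, 99, 100+
pop = [100.0] * 99 + [10.0, 5.0]                 # illustrative start stocks
end = survive_one_year(pop, surv)
```

The end-of-year 100+ population is therefore the sum of survivors from two period-cohorts: those aged 99 at the start and those already aged 100+.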

[Table 2 about here]

The state space: sexes/genders

Most variables in the projection model are classified by sex/gender. The sexes only interact in the fertility process, where we will adopt a female dominant fertility model. The third gender, persons, is used to hold the sum of males and females.

A notation²

It is useful to develop a general notation for the variables used in the model. We have several choices of approach. The first choice is to adopt a single letter, e.g. K, to represent all population groups. This is the approach adopted in the transition population models defined by Rees and Wilson (1977). Variables are distinguished by their attached superscripts or subscripts, e.g. Ke(i)s(j), persons who exist in zone i at the start of the time interval and who survive in zone j at the end of the interval. This notation is consistent and logical but not widely used. The second choice is to use letters based on the well known life table model, e.g. 1Lx = the stationary population in the age group from exact age x to x+1. There are two problems with such a notation. First, the use of prescripts leads to some algebraic confusion: it is preferable to list subscripts in a time sequence, e.g. Lx,x+1 instead of 1Lx. Secondly, the use of upper case (e.g. M, L) and lower case (e.g. q, p, l) conflicts with a convention that uses upper case letters to represent stocks or flows of population and lower case letters to represent intensities of transition (probabilities) or events (occurrence-exposure rates). A third, popular choice is to adopt different letters for the different transitions or events that change populations (e.g. M = migrants (internal), I = immigrants (external), m = probability of migration, d = death (mortality) rate). This is what we do.

Table 3 sets out the building bricks of the notation and then builds the variables that are needed. We try to use single letter variables as far as possible, but double or triple letter variables are needed. Table 3 serves as a reference for readers to check the meaning of variables as new equations are introduced.

The accounting framework and population components equations

Every projection model has an implicit or explicit accounting framework, which must be consistent. The accounting framework consists of a matrix of population flows to which are added a column of row totals and a row of column totals to constitute an accounts table. The row totals contain births (in the case of the first, infant period-cohort) or start populations (for other period-cohorts) and totals of (surviving) immigrants. The column totals contain deaths (non-survivors) and final populations in an interval. Table 4 sets out the accounting framework for zones (local areas/authorities) within the home countries. Table 4 represents only a few of the zones in the system: the first and last in each of the four countries. Table 4 refers to one sex group and ethnic group combination and so is repeated 2 × 18 = 36 times in the model computations for each of 102 period-cohorts.

² In this draft of the paper, we use a keystroke method for representing superscript and subscript indexes. Later the notation will be improved using a suitable equation editor.

[Table 4 about here]

What are the key features of this framework?

The first feature is that the table holds transition data rather than events data. Transition data derive from censuses in which a question is asked about a person’s usual residence at a fixed point in the past (one year before the 2001 Census, in the current analysis). Events data derive from registration of the demographic events such as birth or death or migration from one place to another. The variable MSi(c)j(d) represents the number of surviving migrants resident in zone i in country c on 29 April 2000 who live in zone j in country d on 29 April 2001. Note that, in principle, migration data for the years from 2001-2 onwards are also transition data based on comparison of NHS register downloads one year apart. However, in practice, they are adjusted by National Statistics to be consistent with counts of record transfers between health authorities (much bigger zones than local authorities) to yield published estimates of migration events.

The table elements in the principal diagonal, Si(c)i(c), are persons present in country c and zone i at both the start of the year and the end of the year (surviving stayers). These counts include migrants who moved within the zone. Migrants from a zone i in one country to a zone j in another are represented as MSi(c)j(d) (inter-country migrants). Migrants from a zone i in one country to a zone j in the same country are represented as MSi(c)j(c) (intra country, inter-zone migrants).

A key point about the accounting framework is that it should put together in a consistent fashion all the population flows required to connect the start population in a time interval to the finish population. So the final population in a zone is given by:

$$FP^{i(c)}_{eag} = SP^{i(c)}_{eag} - D^{i(c)}_{eag} - ES^{i(c)}_{eag} - \sum_{j} MS^{i(c)j(c)}_{eag} - \sum_{d}\sum_{j} MS^{i(c)j(d)}_{eag} + \sum_{j} MS^{j(c)i(c)}_{eag} + \sum_{d}\sum_{j} MS^{j(d)i(c)}_{eag} + IS^{i(c)}_{eag} \tag{1}$$

The subscripts eag refer to ethnic group, age (period-cohort) and gender combinations. From the start population are subtracted the deaths (non-survivors) among the zone i(c) start population, the surviving emigrants from the zone i(c) population, the sum of out-migrants to zones within country c and the sum of out-migrants to zones in countries other than c. Then we add the sum of in-migrants from zones within country c, the sum of in-migrants from zones in countries other than c and finally surviving immigrants from the rest of the world. The surviving stayer terms, $SS^{i(c)}_{eag}$, do not appear in this accounting equation. However, we do need to estimate these variables because of the method used to estimate the migrant flows (explained later).
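As a minimal sketch of how the accounting identity in equation (1) balances for a single zone and a single ethnic-age-gender group, the following Python fragment adds and subtracts the flows in the order described above. The function name, argument names and all counts are hypothetical and serve only as an illustration; they are not taken from Table 4.

```python
def final_population(sp, deaths, surv_emigrants, out_same, out_other,
                     in_same, in_other, surv_immigrants):
    """Equation (1) for one zone i(c) and one ethnic-age-gender group.

    out_same / in_same: flows to / from other zones in the same country.
    out_other / in_other: flows to / from zones in the other countries.
    """
    return (sp - deaths - surv_emigrants
            - sum(out_same) - sum(out_other)
            + sum(in_same) + sum(in_other)
            + surv_immigrants)


# Hypothetical counts (illustrative only, not real data).
fp = final_population(
    sp=1000.0, deaths=10.0, surv_emigrants=5.0,
    out_same=[20.0, 15.0], out_other=[3.0],
    in_same=[25.0, 5.0], in_other=[2.0],
    surv_immigrants=8.0,
)
print(fp)  # 987.0
```

The surviving stayers do not enter the calculation, which mirrors their absence from equation (1).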

Given the number of zones, ages and ethnic groups represented in our projection model, we should not expect to find reliable data to count directly the flows and transition probabilities needed for the projection model. Instead we will need to estimate these flows using a variety of sub-models which use more aggregate and reliable data together with a set of assumptions, some testable, some merely reasonable in the absence of statistical evidence.

We now convert the accounting equation into a projection model by substituting for each flow (set of transitions) a product of a probability and a population at risk and show, in principle, how the probabilities are estimated.

Survivors and non-survivors using survivorship and mortality probabilities

We have specified the projection model using transition probabilities, because the most detailed migration data from the census come as transition variables. However, previous use of such data in projection models based on transition data has been difficult to implement for two reasons. The first is because migration probabilities and mortality probabilities at older ages may turn out to exceed one (leading to negative probabilities of being a surviving-stayer). The second concerns the discrepancy between observed death rates that measure deaths using occurrence-exposure rates for an average population in a zone in a time interval and the required non-survival probability for start populations in origin zones. To convert the former to the latter requires either use of iteration or matrix inversion which can lead to convergence problems at older ages for systems with large numbers of zones.

To solve these problems, we propose a simple assumption: that non-survivorship probabilities derived from the standard life table receiving occurrence-exposure mortality rates based on zone of death, $d^{*i(c)}_{eag}$ (where a refers to period-cohort), are a reasonable estimate for non-survivorship probabilities for origin zone populations by age at the start of the period, $d^{i(c)*}_{eag}$:

$$d^{i(c)*}_{eag} = d^{*i(c)}_{eag} \tag{2}$$

To estimate non-survivorship probabilities, we use the standard life table model equation for survivorship probabilities. Life tables have not, to date, been developed for ethnic groups although they are regularly produced for countries (full life tables using single year age intervals to 100+) and for local authorities (abridged life tables using five year age intervals with ages 0 and 1 to 4 to 85+). Rees and Wohland (2008) have estimated life tables for ethnic groups for local authorities using relationships between mortality and standardised illness ratios for ethnic groups based on 2001 census information. Using standard life tables to generate the survivorship probabilities has the advantage that it is relatively easy to introduce new projection assumptions based on studies of mortality rate trends or future scenarios of the mortality rates. Survivorship and non-survivorship probabilities will be used to generate total numbers of survivors from the starting populations of origin zones and total number of deaths experienced by members of those populations.

We estimate survivorship probabilities for local areas i(c), ethnic group e, period-cohort a and gender g using a methodology based on life tables and linking standardised mortality to standardised illness (Rees and Wohland 2008):

$$s^{i(c)}_{eag} = L^{i(c)}_{e,x+1,g} \big/ L^{i(c)}_{e,x,g} \tag{3}$$

where x is the life age that corresponds to period-cohort a. Death/non-survivorship probabilities are the complement of survivorship probabilities:

$$d^{i(c)}_{eag} = 1 - s^{i(c)}_{eag} \tag{4}$$

Deaths are projected by multiplying the non-survivorship probabilities by the starting populations by local area, ethnic group, period-cohort and gender:

$$D^{i(c)}_{eag} = d^{i(c)}_{eag} \, SP^{i(c)}_{eag} \tag{5}$$

Note that the deaths can occur anywhere and so include out-migrants who die. We do not attempt to estimate these. We project the total number of survivors of the starting population as follows:

$$TS^{i(c)}_{eag} = s^{i(c)}_{eag} \, SP^{i(c)}_{eag} \tag{6}$$
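The chain from life table to projected deaths and survivors in equations (3) to (6) can be sketched as follows. This is a hedged illustration for one zone, ethnic group and gender: the list `L` of life table person-years and the start population are hypothetical numbers, and the function name is our own.

```python
def survivorship(L, x):
    """Equation (3): s = L[x+1] / L[x], where L holds life table
    person-years lived, indexed by age x."""
    return L[x + 1] / L[x]


# Hypothetical life table person-years for ages 0, 1, 2 (illustrative only).
L = [99500.0, 99400.0, 99350.0]
sp = 2000.0                       # start population of the period-cohort

s = survivorship(L, 0)            # equation (3)
d = 1.0 - s                       # equation (4)
deaths = d * sp                   # equation (5)
total_survivors = s * sp          # equation (6)

print(round(deaths, 3), round(total_survivors, 3))
```

Because d and s are complements, projected deaths and total survivors always sum to the start population.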

Emigration and surviving emigrants using emigration rates and survivorship probabilities

The next terms in the projection equation we need to estimate/project are the emigration probabilities and emigrants. Because the accounting framework is built on transitions, we need to estimate surviving emigration probabilities. The statistics available on emigration derive almost exclusively from the UK International Passenger Survey (IPS) which estimates the number of emigrations occurring over a one year interval. The estimate is based on a question about intention to leave the country for 12 months. However, some of these emigrants may die before the year is out and we have already made an estimate of these non-surviving emigrants in the mortality/non-survivorship probabilities. The emigration counts must be converted to surviving emigrants by applying survivorship probabilities that reflect the reduced risk of exposure to dying (as emigrants exit the UK month by month during a year and can be assumed to spend half the year at risk of dying in the UK). We can use the square root (geometric mean) of the survival probability to estimate the emigration survival probability, $es^{i(c)}_{eag}$:

$$es^{i(c)}_{eag} = \left(s^{i(c)}_{eag}\right)^{1/2} \tag{7}$$

Then we need to subtract from the total survivors an estimate of the projected number of surviving emigrants. Emigration and immigration in the UK are measured as prospective events via a survey which asks about intentions over the next 12 months. So first we estimate the occurrence-exposure rate of emigration.

$$er^{i(c)}_{eag}[1] = \frac{E^{i(c)}_{eag}}{\tfrac{1}{2}\left(SP^{i(c)}_{eag} + FP^{i(c)}_{eag}\right)} \tag{8}$$

In a projection context we cannot apply these occurrence-exposure rates to the start and finish populations to project the flow total, as final populations have yet to be projected. An approximation is to modify the emigration rate by a factor based on the survivorship probability:

$$er^{i(c)}_{eag} = er^{i(c)}_{eag}[1] \times \tfrac{1}{2}\left(1 + s^{i(c)}_{eag}\right) \tag{9}$$

The flow of people declaring an intention to emigrate is subject to mortality and must be survived to the end of the annual interval using a survivorship probability that reflects their average exposure in the interval. Here we use the geometric mean or square root of the survivorship probability to estimate the probability that emigrants will survive to the end of the projection interval and hence the probability of emigration and survival:

$$es^{i(c)}_{eag} = \left(s^{i(c)}_{eag}\right)^{1/2} er^{i(c)}_{eag} \tag{10}$$

The number of such surviving emigrants is projected by applying the surviving emigrant probabilities to the starting population:

$$ES^{i(c)}_{eag} = es^{i(c)}_{eag} \, SP^{i(c)}_{eag} \tag{11}$$
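The steps in equations (8) to (11) can be strung together as in the sketch below, for one zone and one ethnic-age-gender group. It assumes that the occurrence-exposure rate is measured once in a benchmark year (where both start and final populations are known) and then applied to a later start population; the function name, the argument names and all numbers are hypothetical.

```python
import math


def emigration_survival_probability(sp_obs, fp_obs, emigrations, s):
    """Combine equations (8)-(10) into one probability of emigrating
    and surviving to the end of the interval."""
    er_1 = emigrations / (0.5 * (sp_obs + fp_obs))   # eq. (8), benchmark year
    er = er_1 * 0.5 * (1.0 + s)                      # eq. (9)
    return math.sqrt(s) * er                         # eq. (10)


# Hypothetical benchmark-year inputs (illustrative only).
es = emigration_survival_probability(
    sp_obs=1000.0, fp_obs=1000.0, emigrations=20.0, s=0.99)

# Equation (11): apply the probability to a projected start population.
projected_sp = 1050.0
surviving_emigrants = es * projected_sp

print(round(es, 6), round(surviving_emigrants, 3))
```

The surviving emigrant count is slightly below what the raw emigration rate would give, because both the exposure adjustment and the half-year survival factor are at most one.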

Within country survivors as a stepping stone to internal migrant projection

Then we can compute the number of the starting population who survive within the country by subtracting surviving emigrants from total survivors:

$$WS^{i(c)}_{eag} = TS^{i(c)}_{eag} - ES^{i(c)}_{eag} \tag{12}$$

Substituting for $TS^{i(c)}_{eag}$ from equation (6) we obtain

$$WS^{i(c)}_{eag} = SP^{i(c)}_{eag} - D^{i(c)}_{eag} - ES^{i(c)}_{eag} \tag{13}$$

Then we can estimate internal migrants surviving within a country:

$$MS^{i(c)j(c)}_{eag} = ms^{i(c)j(c)}_{eag} \times WS^{i(c)}_{eag} \tag{14}$$

and surviving internal migrants between local areas in different countries within a system:

$$MS^{i(c)j(d)}_{eag} = ms^{i(c)j(d)}_{eag} \times WS^{i(c)}_{eag} \tag{15}$$

where $ms^{i(c)j(c)}_{eag}$ and $ms^{i(c)j(d)}_{eag}$ are the probabilities of migration to another area within the same country and in a different country respectively, given survival in the interval:

$$ms^{i(c)j(c)}_{eag} = MS^{i(c)j(c)}_{eag} \big/ WS^{i(c)}_{eag} \tag{16}$$

$$ms^{i(c)j(d)}_{eag} = MS^{i(c)j(d)}_{eag} \big/ WS^{i(c)}_{eag} \tag{17}$$

How can we measure the benchmark values from the latest census? The surviving migrant variables are recorded directly in the census migration tables, but the within country surviving population is not usually tabulated. We must therefore compute this variable from the census migrant data and the census population (the final populations of the year before the census for which migration is measured). To do this we need to measure the surviving stayer variables, $SS^{i(c)}_{eag}$. These are estimated by subtracting internal in-migrants from local areas within a country, internal in-migrants from zones in other countries in the study set and immigrants from abroad from the end of period final population:

$$SS^{i(c)}_{eag} = FP^{i(c)}_{eag} - \sum_{j} MS^{j(c)i(c)}_{eag} - \sum_{d}\sum_{j} MS^{j(d)i(c)}_{eag} - IS^{i(c)}_{eag} \tag{18}$$

This enables the computation of the total survivors within a set of countries:

$$WS^{i(c)}_{eag} = SS^{i(c)}_{eag} + \sum_{j} MS^{i(c)j(c)}_{eag} + \sum_{d}\sum_{j} MS^{i(c)j(d)}_{eag} \tag{19}$$

and thus the estimation of the probability of migration within the country conditional on survival within the set of countries using equations (16) and (17) above.
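A minimal sketch of the benchmark calculation in equations (18), (19), (16) and (17) follows, for one ethnic-age-gender group in a toy system of three zones. For brevity it treats the within-country and between-country sums together as one sum over all other zones in the system; the matrix, the populations and the immigrant counts are hypothetical.

```python
# Hypothetical census-style inputs for three zones (illustrative only).
# MS[i][j]: surviving migrants from zone i to zone j over the year (i != j).
MS = [[0.0, 30.0, 10.0],
      [20.0, 0.0, 5.0],
      [15.0, 10.0, 0.0]]
FP = [1000.0, 800.0, 600.0]      # end-of-year (census) populations
IS = [12.0, 8.0, 4.0]            # surviving immigrants from outside the system
n = len(FP)

# Equation (18): stayers = final population - all in-migrants - immigrants.
SS = [FP[i] - sum(MS[j][i] for j in range(n) if j != i) - IS[i]
      for i in range(n)]

# Equation (19): within-system survivors = stayers + all out-migrants.
WS = [SS[i] + sum(MS[i][j] for j in range(n) if j != i) for i in range(n)]

# Equations (16)-(17): migration probabilities conditional on survival.
ms = [[MS[i][j] / WS[i] if j != i else 0.0 for j in range(n)]
      for i in range(n)]
stay = [SS[i] / WS[i] for i in range(n)]

print(SS)  # [953.0, 752.0, 581.0]
print(WS)  # [993.0, 777.0, 606.0]
```

For each origin the staying probability and the migration probabilities sum to one, which is the property that makes these conditional probabilities well defined.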

Internal surviving migrants using migration probabilities conditional on survival

What does this re-formulation of the multistate projection model achieve? Essentially, the re-formulation using internal migration probabilities conditional on survival de-couples the processes of mortality and migration and enables us to develop separate models for each component. We will use two sets of properly defined probabilities: the relevant aggregations of survivorship and non-survivorship probabilities will always add to one and the appropriate conditional probabilities of internal migration given survival within the UK will always add to one.

Using the probabilities of migration between countries conditional on survival within the set of countries, we project the surviving internal migrants between zones within a country in the set of interest by multiplying the probabilities of migration given survival by the projected within country set survivors:

$$MS^{i(c)j(c)}_{eag} = ms^{i(c)j(c)}_{eag} \times WS^{i(c)}_{eag} \tag{20}$$

and surviving internal migrants between zones in different countries in the set of interest in the same way:

$$MS^{i(c)j(d)}_{eag} = ms^{i(c)j(d)}_{eag} \times WS^{i(c)}_{eag} \tag{21}$$

These projected variables are used in two ways: as out-migration flows to be subtracted from the starting population and as in-migration flows to be added to the starting populations to yield the final populations.
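The projection step in equations (12), (20) and (21) can be sketched for a single origin zone as below. The conditional probabilities are assumed to have been estimated beforehand from benchmark data; the function name and all numbers are hypothetical.

```python
def project_internal_migrants(total_survivors, surviving_emigrants, ms_row):
    """Project surviving out-migrants from one origin zone.

    ms_row holds the conditional migration probabilities to each other
    zone (same or different country), given survival within the system.
    """
    ws = total_survivors - surviving_emigrants      # equation (12)
    flows = [p * ws for p in ms_row]                # equations (20)/(21)
    stayers = ws - sum(flows)                       # remainder stays in zone
    return ws, flows, stayers


# Hypothetical inputs (illustrative only).
ws, flows, stayers = project_internal_migrants(
    total_survivors=990.0, surviving_emigrants=20.0, ms_row=[0.03, 0.01])

print(ws, [round(f, 3) for f in flows], round(stayers, 3))
```

Each projected flow is subtracted from its origin zone and added to its destination zone when the final populations are assembled.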

Modelling internal migration between local areas in different countries

When the state space is so large, we cannot reliably estimate the conditional probabilities of internal migration given survival within the UK directly. We must adopt a reduced model that makes assumptions about the independence of components of an equation that estimates the probabilities from more aggregate data. We can divide the problem into two parts: the first concerns modelling the migration between local areas in different countries and the second concerns modelling the migration between local areas within the same country.

The UK consists of four separate areas with different governmental arrangements. Previous analysis of migration flows between England, Wales, Scotland and Northern Ireland suggests that flows are mainly quite small apart from between England and Wales. For these two reasons, we have partitioned the projection system into four countries as a top tier. We adopt a pool approach, combining migration into a country into a pool and then assigning from that pool to destinations, irrespective of origins. We model migration probabilities between a local area in one country and that in another as a two stage process: the probability of out-migrating from an origin local area to the other country multiplied by the probability of migrating to a destination local area in the other country, given in-migration to that country:

$$ms^{i(c)j(d)}_{eag} = ms^{i(c)*(d)}_{eag} \times ms^{*(*)j(d)}_{eag} \tag{22}$$

The first term on the RH side of equation (22), $ms^{i(c)*(d)}_{eag}$, is the probability of migration to country d from local area i in country c given survival within the UK. The second term, $ms^{*(*)j(d)}_{eag}$, is the probability of choosing destination local area j in country d, given in-migration from one of the other countries. We can estimate these probabilities from available 2001 Census migration data. We will undoubtedly need to decompose the estimation further because of the requirement of ethnic and single age specificity. We discuss these issues further when examining migration between zones within one country.
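The two-stage pool approach of equation (22) can be illustrated with the sketch below: an origin's probability of moving to another country is spread over that country's local areas in proportion to their shares of the pooled in-migration, irrespective of origin. The zone labels, the pooled inflows and the out-migration probability are all hypothetical.

```python
def pooled_probabilities(out_to_country, dest_inflows):
    """Equation (22): origin-to-country probability multiplied by the
    destination share of the pooled in-migration to that country."""
    total = sum(dest_inflows.values())
    shares = {zone: flow / total for zone, flow in dest_inflows.items()}
    return {zone: out_to_country * share for zone, share in shares.items()}


# Hypothetical inputs (illustrative only).
# Probability of migrating from zone i in country c to country d,
# given survival within the UK:
out_to_country = 0.004
# Pooled in-migration to each local area of country d from other countries:
dest_inflows = {"zone_A": 600.0, "zone_B": 300.0, "zone_C": 100.0}

probs = pooled_probabilities(out_to_country, dest_inflows)
print({zone: round(p, 6) for zone, p in probs.items()})
```

The destination probabilities sum back to the origin's out-migration probability, so the pool redistributes migrants without creating or losing any.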

Modelling internal migration between local areas in the same country

Here we need to input to the projection model inter-local area migration probabilities by ethnic group, single year of age and gender. There are a very large number of probability variables involved which we will not be able to estimate directly because (1) the number of cells in the resulting array will likely exceed the country’s population, (2) such detailed data will not be released by national statistical offices on statistical disclosure grounds and (3) no one detailed set of observations would reflect long term migration intensities.

It is useful to extend the log-linear framework for designing the migration component of sub-national projection models used by Van Imhoff et al. (1998) and show how it can be adapted for current purposes. This involves a different notation for thinking out solutions to the modelling challenge. Later we connect the notation back to the notation used in the accounting framework and projection model. In placing numerical values on any of the dimensions of the arrays considered we will use the case of England, which has by far the largest number of local areas (352).

Let E be a set of ethnicities (e.g. E16 ethnic groups proposed here, E7 ethnic groups used in the 2001 Census Commissioned Table on LA migration in England and Wales – Norman et al. 2007). The different ethnic classifications are represented by the lower case letter-number combination, e.g. e16, e7.

Let O be a set of origins (e.g. O352 for England’s local areas, O22 for Wales, O32 for Scotland, O26 for Northern Ireland). Individual origins are represented by the lower case letter combination, i(c), local authority i within country c.

Let D be a set of destinations (e.g. D352 for England local authorities, D22 for Wales, D32 for Scotland, D26 for Northern Ireland). Individual destinations are represented by the lower case letter combination, j(d), local authority j within country d.

Let C be a set of countries (e.g. C4 for the UK’s constituent parts: England, Wales, Scotland and Northern Ireland). Individual countries are represented by the lower case letters c and d. We use c as a general index, but where two are needed use c for origins and d for destinations.

Let A be a set of ages. For example, A102 represents a full set of single year period-cohorts, A22 represents a full set of five year period-cohorts, A7 represents a set of broad ages used in other analyses such as Champion et al. 2002 and Norman et al. 2007 (e.g. 0-15, 16-19, 20-24, 25-29, 30-44, 45-59 and 60+). Individual ages are referred to using lower case letter-number combinations, e.g. a102, a22, a7.

Let S be a set of sexes (e.g. S2 for males, females). We use the letter g to refer to individual sexes or genders and m for males and f for females.

Let T be the set of time intervals (e.g. T50 for 2001-2051), over which the level or direction of migration changes. Individual time intervals are referred to by lower case t, which signals the start point of the time interval. The interval is assumed to be a one year interval, unless otherwise specified. So t refers to the interval t, t+1. Time is not explicitly referred to if components within a single time interval only are being described.

A full array of variables would be an EODAS array, which contains migrant counts by origins by destinations by ethnic groups by sexes by ages. Specifically, if we used the most detailed classifications of the state space, this array would be specified for England as E18O352D352A102S2 making ~455 million variables, for Wales as E18O22D22A102S2 making ~1.8 million variables, for Scotland as E18O32D32A102S2 making ~3.8 million variables and for Northern Ireland as E18O26D26A102S2 making ~2.5 million variables: in total, some 463 million variables. To this we would need to add the inter-country out-migration and in-migration arrays, totalling ~3.2 million variables. The total number of variables would be 466 million. This “full” model is clearly a non-starter.

At the other extreme we could use the model that assumed independence between each of the dimensions. In this case, we would need a set of E18 + O434 + D434 + A102 + S2 variables, some 990 variables only.
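The sizes quoted for the two extreme designs can be checked with a few lines of arithmetic. The sketch below assumes that each country's full array uses that country's own number of origins and destinations (352, 22, 32 and 26 local areas), with 18 ethnic groups, 102 period-cohorts and 2 sexes, and takes the 434 origins and destinations of the independence model as given in the text; the variable names are our own.

```python
# Number of local areas per country, as used in the text.
ZONES = {"England": 352, "Wales": 22, "Scotland": 32, "Northern Ireland": 26}
E, A, S = 18, 102, 2   # ethnic groups, period-cohorts, sexes

# Full EODAS array per country: E x O x D x A x S.
full = {country: E * n * n * A * S for country, n in ZONES.items()}
full_total = sum(full.values())

# Independence model: one parameter set per dimension, added together.
independence_total = E + 434 + 434 + A + S

for country, size in full.items():
    print(f"{country}: {size:,}")
print(f"All four countries: {full_total:,}")
print(f"Independence model: {independence_total:,}")
```

The full array is roughly 463 million cells before the inter-country arrays are added, against 990 parameters under complete independence.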

Clearly, some model intermediate in complexity is needed that captures the key interactions which descriptive analysis suggests should be captured. Previous work (van Imhoff et al. 1997) suggests that a combination of OD + OAS + DAS modelled ODAS systems for the UK, the Netherlands and Italy reasonably well. It was important to include origin-destination interactions, because migrants are strongly influenced by the distance between origins and destinations. The closer a destination is the more migrants choose it. In an ethnic population projection it will be important to capture the different destination choices of ethnic groups.

Here is one possible model:

(1) Detailed sex and age variability in migration, A102S2 =204 variables

(2) Origin and broad age interactions, O352A7 + O22A7 + O32A7 + O26A7 = 3,024 variables

(3) Destination and broad age interactions, D352A7 + D22A7 + D32A7 + D26A7 =3,024 variables

(4) Origin and destination interactions, O352D352 + O22D22 + O32D32 + O26D26 = 126,098 variables

(5) Origin and broad ethnicity interactions, O352E7 + O22E7 + O32E7 + O26E7 = 6,096 variables

(6) Destination, country and broad ethnicity interactions, D352E7 + D22E7 + D32E7 + D26E7 = 6,096 variables

(7) Ethnicity, E18 = 18 variables

This model has a total of 901,098 variables.

A second example might be:

(1) Detailed sex and age variability in migration, A102 S2 =204 variables

(2) Origin, broad ethnicity and broad age interactions, O352E7A7 + O22E7A7 + O32E7A7 + O26E7A7 = 21,168 variables

(3) Destination and ethnicity interactions, D352E7 + D22E7 + D32E7 + D26E7 = 21,168 variables

(4) Origin-destination interactions, O434D434 + O434D434 + O434D434 + O434D434 =

(5) Origin, country and broad ethnicity interactions, O352C4E7 + O22C4E7 + O32C4E7 + O26C4E7 = 6,096 variables

(6) Destination, country and broad ethnicity interactions, D352C4E7 + D22C4E7 + D32C4E7 + D26C4E7 = 6,096 variables

(7) Ethnicity, E18 = 18 variables

This model can be characterised as:

A102 S2 + O434E7A7 + D434E7 + O434D434 + E18

where the total number of variables = 212,882. This number would be reduced by adopting the 4 country and zone system of the accounting framework.
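The total for the characterised model can be reproduced by reading each term as a product of its dimension sizes. The helper below is a hypothetical convenience of our own: it parses a term written in the letter-number notation of this section and multiplies the numbers together.

```python
import re


def term_size(term):
    """Size of one model term, e.g. 'O434E7A7' -> 434 * 7 * 7."""
    size = 1
    for number in re.findall(r"[A-Z](\d+)", term):
        size *= int(number)
    return size


# Terms of the characterised model, in the notation of the text.
model_terms = ["A102S2", "O434E7A7", "D434E7", "O434D434", "E18"]
sizes = {term: term_size(term) for term in model_terms}
model_total = sum(sizes.values())

for term, size in sizes.items():
    print(f"{term}: {size:,}")
print(f"Total: {model_total:,}")  # Total: 212,882
```

The origin-destination term accounts for most of the total, which is why partitioning the system by country reduces the number of variables so sharply.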

A number of recent papers (Hussain and Stillwell 2008, Stillwell and Hussain 2008, Stillwell et al. 2008, Raymer et al. 2008 and Raymer and Giuletti 2008) examine the spatial structure of migration flows by ethnicity in the UK and these will be reviewed for guidance on choice of a suitable model. The message from these analyses is that ODE terms will be needed in our projection model to reflect the considerable differences between ethnicities in their origins, destinations and flows between them.

We have not in this discussion introduced the time dimension, T. Migration activity fluctuates with time, particularly in response to the business and housing market cycles and in response to external events (particularly with respect to international migration). There are long running trends in the attractiveness of localities (OT and DT interactions). There are also surprises when new developments are planned that, when completed, draw in new migrants. The origin-destination (OD) interactions tend to be rather stable, reflecting distances between places which do not change much over time, though reductions in the cost and time of travel can encourage some migration flows (e.g. between the Accession 8 states and the UK).

Normally, when faced with such a model design problem, alternative models are tested out against knowledge of the full array for one or more historical time intervals. But for internal migration of ethnic groups in the UK we only have limited observations of the EODAST array. For example, our knowledge of ethnic group migration is largely confined to the two years before the last two censuses (1990-91 and 2000-01). In addition, policy on statistical disclosure control limits the detail of the array that can be published. Stillwell, Dennett and Hussein (2008) obtained an array of migration flows from the 2001 Census from the Office for National Statistics for England and Wales which represents an O352D352E7A7 array. This is probably the limit of detail that can be obtained.

How should a particular dimensional design for the internal migration model be implemented? One way would be to implement a chain of probabilities that act as odds ratios (to the mean). Let us use the first model proposed above to start with:

$$m^{i(a)j(a)}_{eas} = m^{i(a)*(*)} \times \left[\frac{M^{*(a)j(a)}}{M^{*(a)*(a)}}\right] \times \left[\frac{m^{*(a)*(a)}_{e**}}{m^{*(a)*(a)}}\right] \times \left[\frac{m^{*(a)*(a)}_{**s}}{m^{*(a)*(a)}}\right] \times \left[\frac{m^{*(a)*(a)}_{*a*}}{m^{*(a)*(a)}}\right] \tag{23}$$

which we can interpret as follows:

conditional probability of migration from zone i in country a to zone j in country a
= probability of migration given survival in country a
× [share of destination j of migration in country a]
× [ratio of ethnic group e migration probability to all group migration probability in country a]
× [ratio of sex group s migration probability to all group migration probability in country a]
× [ratio of age group a migration probability to all group migration probability in country a].
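The chain in equation (23) can be sketched as a product of a base out-migration probability, a destination share and three ratios to the all-group migration probability. The function and argument names are hypothetical, and the numbers are invented solely to show the mechanics.

```python
def migration_probability(out_prob, dest_flow, total_flow,
                          m_ethnic, m_sex, m_age, m_all):
    """Equation (23) for one origin, destination, ethnic group, age and sex.

    out_prob:   probability of migrating from the origin zone, given survival
    dest_flow:  migrants arriving in destination j within the country
    total_flow: all migrants within the country
    m_ethnic, m_sex, m_age: country-level migration probabilities for the
                ethnic group, the sex and the age group concerned
    m_all:      country-level migration probability for all groups
    """
    return (out_prob
            * (dest_flow / total_flow)
            * (m_ethnic / m_all)
            * (m_sex / m_all)
            * (m_age / m_all))


# Hypothetical inputs (illustrative only).
p = migration_probability(
    out_prob=0.05, dest_flow=2000.0, total_flow=100000.0,
    m_ethnic=0.06, m_sex=0.05, m_age=0.12, m_all=0.05)

print(round(p, 5))
```

A ratio of one leaves the base probability unchanged, so each bracketed term only moves the result to the extent that a group departs from the all-group average.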

Migrants changing country and ethnicity

For migrants changing country (MSi(c)j(d)), we must apply proportional look up tables that re-assign them from the ethnic classification of their origin home country to the ethnic classification of their destination country. The look up tables will be based on best matches in the classification and distribution using existing census information in the destination home country.
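A proportional look up table of this kind can be sketched as a mapping from each origin-classification group to shares over destination-classification groups. The group labels and shares below are purely hypothetical placeholders; the actual tables would be derived from census distributions in the destination home country.

```python
def reclassify(migrants, lookup):
    """Re-assign migrant counts from the origin country's ethnic
    classification to the destination country's classification.

    lookup[origin_group] maps destination groups to shares summing to one.
    """
    result = {}
    for origin_group, count in migrants.items():
        for dest_group, share in lookup[origin_group].items():
            result[dest_group] = result.get(dest_group, 0.0) + count * share
    return result


# Hypothetical migrant counts and look up shares (illustrative only).
migrants = {"origin_group_1": 100.0, "origin_group_2": 40.0}
lookup = {
    "origin_group_1": {"dest_group_1": 1.0},
    "origin_group_2": {"dest_group_1": 0.25, "dest_group_2": 0.75},
}

reclassified = reclassify(migrants, lookup)
print(reclassified)  # {'dest_group_1': 110.0, 'dest_group_2': 30.0}
```

Because every row of shares sums to one, the total number of migrants is unchanged by the re-classification.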

Immigration and surviving immigrants using immigration ratios and survivorship probabilities

We will need to introduce immigration into the projection in a similar fashion to surviving emigrants, but in this case we will be content with simply surviving the immigration flow over the time interval:

$$IS^{i(c)}_{eag} = \left(s^{i(c)}_{eag}\right)^{1/2} \times I^{i(c)}_{eag} \tag{24}$$

where $I^{i(c)}_{eag}$ is the immigration flow into zone i in country c by persons in ethnic group e, period-cohort a and gender g.

Births, fertility rates, and mixed births

We will design a standard fertility model (female dominant) and introduce ethnic specific fertility rates. However, this model will allow mothers and fathers to generate mixed ethnicity offspring and so create new members of the mixed group. The fertility rates will be estimated using child-woman ratios (CWRs) for ethnic groups using data from the Standard Tables of the 2001 Census with inter-ethnic distribution factors derived from an analysis of the Samples of Anonymised Records of the 2001 Census.

The equations for the infant period-cohort

The model equations for each of the components need to be applied as well to the projected births in a projection time interval, allowing for the reduced exposure time, surviving and migrating the babies from their places of birth to their destinations, aged 0, at the end of the projection time interval.

Bringing the components together in a full projection model

The equation that projects the final population of an ethnic-age-gender group is:

$$FP^{i(c)}_{eag} = SP^{i(c)}_{eag} - d^{i(c)}_{eag} SP^{i(c)}_{eag} - e^{i(c)}_{eag} SP^{i(c)}_{eag} - \sum_{d,j} m^{i(c)j(d)}_{eag} SP^{i(c)}_{eag} - \sum_{j \neq i} m^{i(c)j(c)}_{eag} SP^{i(c)}_{eag} + \sum_{j \neq i} m^{j(c)i(c)}_{eag} SP^{j(c)}_{eag} + \sum_{d,j} m^{j(d)i(c)}_{eag} SP^{j(d)}_{eag} + IS^{i(c)}_{eag} \tag{25}$$

where the following transition probabilities have been substituted for the flow terms: $d_{eag}$, death (non-survival) probabilities for period-cohort a; $e_{eag}$, (surviving) emigrant probabilities for period-cohort a; and $m_{eag}$, (surviving) migrant probabilities for period-cohort a.
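To show how the terms of equation (25) combine across zones, the sketch below projects one ethnic-age-gender group for one interval in a toy system of two zones. It does not distinguish countries and it takes surviving immigrants as a given input; the function name and all probabilities and populations are hypothetical.

```python
def project_step(SP, d, e, m, IS):
    """One-interval projection in the spirit of equation (25).

    SP: start populations by zone
    d:  death (non-survival) probabilities by zone
    e:  surviving-emigrant probabilities by zone
    m:  m[i][j] surviving-migrant probability from zone i to zone j
    IS: surviving immigrants by zone
    """
    n = len(SP)
    FP = []
    for i in range(n):
        out_flow = sum(m[i][j] * SP[i] for j in range(n) if j != i)
        in_flow = sum(m[j][i] * SP[j] for j in range(n) if j != i)
        FP.append(SP[i] - d[i] * SP[i] - e[i] * SP[i]
                  - out_flow + in_flow + IS[i])
    return FP


# Hypothetical inputs (illustrative only).
SP = [1000.0, 500.0]
d = [0.01, 0.02]
e = [0.005, 0.004]
m = [[0.0, 0.03],
     [0.02, 0.0]]
IS = [6.0, 3.0]

FP = project_step(SP, d, e, m, IS)
print([round(x, 3) for x in FP])
```

Internal migration cancels out over the whole system, so the change in the total population equals immigrants minus deaths and emigrants.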

The model described here is designed to project the population of a country, the United Kingdom, organised into four large regions (aka “home countries”) and a large set of 432 local areas (the lowest tier of local government). The model will be partitioned between these large regions and the flows of migrants between them handled in a simpler way than migrants within them. This population is ethnically diverse and the model handles up to 16 ethnic groups as separate populations disaggregated by single year of age and sex. Ethnic group populations interact in the fertility process, contributing members to a new set of groups of Mixed ethnicity. Because the ethnic classifications differ between the large regions, people are re-classified when they cross the border between one large region and another. The model is based on an accounting framework suitable for transition migration inputs derived from the census because this is the only reliable source of migration at local area scale classified by ethnicity. The model is still being designed along with associated computer code for delivery of a working tool for population forecasting in 2009.

INPUTS TO THE PROJECTION MODEL

Base populations

Concept

The model described earlier is designed to project the UK’s future population for all of the UK’s 432 Local Authorities. For each Local Authority, the population of each ethnic group (as defined in the 2001 Census) will be projected by single year of age up to age 100+, for men and women. This means that we will distinguish between a maximum of 16 ethnic groups for England and Wales, 12 ethnic groups for Northern Ireland and 5 for Scotland. Most input data for the model will follow this structure.

Population data available

At present the input data are based on 2001 data, but they will be extended in future. Table 7 shows which population data are readily available in the UK; the next section, Estimation methods, describes briefly how we can use the available data to derive the population data needed.

[Table 7 about here]

Estimation methods

Table 7 shows that the only population data available in the desired structure are from (1) the 2001 Census, which supplies the population for all UK Local Authorities by single year of age and ethnic group. However, the Census was taken on 29 April 2001 whereas we aspire to base our input data on mid-year population estimates. To obtain those, we use a few simple assumptions. To get mid-year estimates for the total population, we simply used (2) the mid-year estimates for all persons and divided the 90+ population for England, Wales and Scotland and the 85+ population of Northern Ireland across the older ages, assuming the same population age structure as observed in the 2001 Census for these age groups. To extend the (3) English mid-year estimates for the 16 ethnic groups to age 100+, we use the same assumptions. Finally, to derive mid-year population estimates for the remaining home countries and their ethnic groups, we compute the factor for each age and Local Authority by which the all persons population changed from the 2001 Census to the mid-year estimates of 2001 and multiply the 2001 Census ethnic population data by this factor.
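The two adjustments described above can be sketched as follows for one Local Authority: an open-ended age group in the mid-year estimate is split using the census age structure, and census ethnic group counts at one age are scaled by the ratio of the all persons mid-year estimate to the all persons census count. Function names, group labels and all counts are hypothetical.

```python
def split_open_ended(midyear_total, census_by_age):
    """Distribute an open-ended mid-year age group (e.g. 90+) across
    single ages using the census age structure for the same ages."""
    census_total = sum(census_by_age.values())
    return {age: midyear_total * count / census_total
            for age, count in census_by_age.items()}


def scale_to_midyear(census_ethnic, census_all, midyear_all):
    """Scale census ethnic group counts at one age by the all persons
    change factor between the census and the mid-year estimate."""
    factor = midyear_all / census_all
    return {group: count * factor for group, count in census_ethnic.items()}


# Hypothetical inputs (illustrative only).
split = split_open_ended(
    midyear_total=220.0,
    census_by_age={90: 100.0, 91: 60.0, 92: 40.0})

midyear_ethnic = scale_to_midyear(
    census_ethnic={"group_1": 800.0, "group_2": 200.0},
    census_all=1000.0,
    midyear_all=1010.0)

print(split)
print(midyear_ethnic)
```

Both steps preserve the published mid-year totals: the split ages sum to the open-ended total and the scaled ethnic groups sum to the all persons estimate.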

Survival/mortality

Now that we have got our base population for 2001, we concentrate on one of the main ingredients for projection models, mortality data for the populations of interest. A literature review brought to light that to date, none of the UK projections or roll-forward, year by year estimates of ethnic groups use ethnic-specific mortality (e.g. UK regions: Rees and Parsons 2006; GLA, Boroughs: Bains and Klodawski 2006, 2007; England, Local Authorities: Large and Ghosh 2006a, 2006b; UK: Coleman and Scherbov 2005; Leicester: Danielis 2007). Some work uses mortality rates based on country of birth (Harding and Balarajan 2002), but this method only identifies first generation immigrants.

On the other hand, work conducted in other countries highlights the relevance of using ethnic specific mortality rates in projection models. The United States Census Bureau routinely computes projections by race and Hispanic origin (US Bureau of the Census 2004) and life expectancies by race are published (NCHS 2007). For 2003, for example, White men had a life expectancy of 75.3 years, while for Black men it was only 68.9 years. The corresponding figures for women are 80.4 for Whites and 75.9 for Blacks. More examples are discussed in Rees and Wohland (2008).

If there are reasonable grounds to think that ethnicity affects the intensity of mortality, why have UK projection models not already taken it into account? The answer is simple: mortality data in the UK are available for the total population as vital statistics but not for ethnic groups. To our knowledge, ours are the first detailed estimates of mortality by ethnic group and local area for the UK.

Estimation method

So, what can be done to fill this gap in UK demographic statistics? Is there a data source for the UK that can deliver reliable information for all of the ethnic groups at local level and serve as a proxy from which to estimate mortality? Yes, there is: the 2001 Census asked people to report on their own health in questions on “limiting long-term illness” and “general health”. There have been a large number of studies carried out using American, Danish, Dutch, Finnish and Swedish data which indicate that self-reported health is a remarkably good predictor of subsequent mortality (Burström and Friedlund 2001; McGee et al. 1999; Heistaro et al. 2001; Helwig-Larson et al. 2003; Franks et al. 2003; Singh and Siahpush 2001). In summary, these studies suggest that:

• Self-reported health status is a strong predictor of subsequent death.
• The relationship for men is different from that for women.
• Socioeconomic factors are important in explaining mortality variation across groups but self-reported health status still has a significant influence after controlling for them.
• There is variation between racial/ethnic groups in the self-reported health-mortality link but it is not huge.
• There is an important influence of immigrant generation, with the first generation having better self-reported health and mortality than subsequent generations.

Figure 3 outlines how the self-reported illness data from the UK 2001 Census are used to calculate ethnic mortality at local area level. In brief, in a first step Standardized Illness Ratios (SIRs) are computed for all local authorities in the UK, both for the whole population (aggregated over ethnicity) and separately for each ethnic group. At the same time, Standardized Mortality Ratios (SMRs) for the whole population are calculated from data available from National Statistics for the calendar year 2001. In the next step, the all-people local authority SMRs are regressed on the all-people local authority SIRs. This regression relationship is then applied to the ethnic group SIRs to generate ethnic-specific SMRs for each local area. From here, ethnic group mortality rates are estimated by multiplying the all group rates by the ratio of the ethnic-specific SMR to the all group SMR. In addition, these ethnic-specific mortality rates for local areas are adjusted so that they reproduce the observed all group number of deaths in 2001. In a final step, the resulting mortality rates are input to life table routines to generate life tables for all ethnic groups in each local authority, from which survivorship probabilities can be extracted for use in population projection.
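The steps just described can be illustrated with a minimal sketch. It is not the project's own code: the numbers are toy values, the group names are hypothetical, and the use of a single scaling factor to reproduce observed deaths is an assumption about how the final adjustment might be made.

```python
import numpy as np

# Step 1: all-group SIR and SMR by local authority (toy values, not real data).
sir_all = np.array([80.0, 100.0, 120.0])
smr_all = np.array([90.0, 100.0, 110.0])

# Step 2: regress SMR on SIR, i.e. fit SMR = a + b * SIR by least squares.
b, a = np.polyfit(sir_all, smr_all, 1)

# Step 3: apply the fitted line to ethnic-group SIRs in one local authority.
sir_eth = {"group_A": 140.0, "group_B": 90.0}      # hypothetical groups
smr_eth = {g: a + b * s for g, s in sir_eth.items()}

# Step 4: scale the all-group age-specific death rates of that authority
# by the ratio of the ethnic-specific SMR to the all-group SMR.
smr_area = 100.0                                    # all-group SMR of the area
rates_all = np.array([0.001, 0.010, 0.100])         # three toy age groups
rates_eth = {g: rates_all * (smr_eth[g] / smr_area) for g in smr_eth}

# Step 5 (assumed form of the adjustment): rescale all group rates by one
# factor so that implied deaths equal the observed all-group deaths.
pop_eth = {
    "group_A": np.array([200.0, 100.0, 20.0]),
    "group_B": np.array([1000.0, 800.0, 300.0]),
}
observed_deaths = 45.0
implied = sum(float(rates_eth[g] @ pop_eth[g]) for g in rates_eth)
factor = observed_deaths / implied
rates_adj = {g: r * factor for g, r in rates_eth.items()}
```

The adjusted rates in `rates_adj` are what would then be passed to a life table routine.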

[Figure 3 about here]

In the next paragraphs, the relationship between SMR and SIR and the SIRs of the ethnic groups are described and analysed in more detail.

Figure 4 graphs SMR against SIR for two local authority data sets for both sexes. Table 8 provides the coefficients for the regression lines depicted in the graphs. From Figures 4(a) and 4(b) we can see that the regression slopes vary between the sets of local authorities in the UK home countries. The England slope is close to the UK slope; Scotland has considerably steeper slopes than England, while Wales and Northern Ireland have gentler slopes, indicating stronger regression to the mean. In all cases, the male slope is steeper than the female, with greater mortality and illness ranges for males. The goodness of fit (r²) varies from a low of 0.16 for females in Northern Ireland to a high of 0.78 for females in Wales; on average it is around 0.5, but it is higher for males than for females. This means that about half the variation in SMRs across local authorities is associated with variation in self-reported limiting long-term illness. Slope coefficients are all below one, indicating that there is regression towards the mean: areas with higher than average SIRs also experience higher than average SMRs, but these are closer to the mean; areas with lower than average SIRs also exhibit lower than average SMRs, again closer to the mean.
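To make the meaning of the slope and r² concrete, the following sketch fits a line to toy SIR and SMR values for one hypothetical set of local authorities. The data are invented for illustration and do not reproduce Table 8.

```python
import numpy as np

def slope_and_r2(sir, smr):
    """Least-squares fit SMR = a + b * SIR; returns (a, b, r squared)."""
    sir = np.asarray(sir, dtype=float)
    smr = np.asarray(smr, dtype=float)
    b, a = np.polyfit(sir, smr, 1)
    r = np.corrcoef(sir, smr)[0, 1]
    return a, b, r ** 2

# Toy values for five local authorities (not real data).
sir = [80, 90, 100, 110, 120]
smr = [92, 93, 101, 104, 110]
a, b, r2 = slope_and_r2(sir, smr)

# With a slope below one, an area 20 points above the mean SIR is predicted
# to be only b * 20 points above the mean SMR (regression to the mean).
predicted_gap = b * 20
```

Here the slope is 0.47, so a 20-point illness excess translates into a predicted mortality excess of about 9 points.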

[Figure 4 about here]

[Table 8 about here]

Figures 4(c) and 4(d) show what happens for England when we divide LAs into those with above average ethnic minority shares in their population and those with below average shares. Might there be different relationships because of ethnic compositions of the population (equivalent to those between home nations)? The results suggest not: the two sets give almost identical coefficients. In conclusion, we chose to use different relationships between SIR and SMR for each home nation, under the assumption that the whole population relationship could be applied to each ethnic group. The next step was to estimate SIR for ethnic groups in local areas using 2001 Census data.
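The paper does not spell out the SIR formula. The sketch below assumes the conventional indirect standardisation, in which observed cases are divided by the cases expected if UK age-specific illness rates applied to the local population, scaled so that the UK average is 100. All numbers are toy values.

```python
import numpy as np

def standardised_ratio(observed_cases, local_pop_by_age, standard_rates_by_age):
    """Indirectly standardised ratio, with the standard population equal to 100."""
    expected = float(np.dot(local_pop_by_age, standard_rates_by_age))
    return 100.0 * observed_cases / expected

# Toy example: one ethnic group in one local authority, three age bands.
local_pop = np.array([1000.0, 2000.0, 500.0])    # persons by age band
uk_illness_rates = np.array([0.05, 0.10, 0.40])  # assumed UK rates by age band
observed_ill = 495.0                             # persons reporting illness

sir = standardised_ratio(observed_ill, local_pop, uk_illness_rates)
```

The same function yields an SMR when deaths and standard death rates are supplied in place of illness counts and illness rates.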

Figure 5 provides histograms of the distribution of SIRs for males and females for each of the 16 ethnic groups. White British SIRs cluster around the UK mean of 100, with a slightly lower average and comparable distributions for men and women. The White Irish SIRs are similar but slightly higher. The White Other group has a distribution with a majority of LAs below the UK average. The Mixed: White and Black Caribbean and Mixed: White and Black African groups both exhibit worse illness distributions than the White groups, with higher than UK average SIRs. The Mixed: White and Asian and Mixed: Other Mixed groups have SIRs that differ only slightly from the average. The Asian or Asian British SIRs have the feature that female SIRs are higher than male SIRs. This suggests that Asian men are more reluctant to report limiting long-term illness than Asian women. There is evidence from surveys in South East Asia (Lutz et al. 2007; Karcharnubarn 2008) that women are significantly more likely to report poor health. Indian men have about average SIRs, while the average for Indian women is 23 points higher. Pakistani and Bangladeshi men and women all report markedly high SIRs. Other Asians are marginally above average (females). Black or Black British groups have contrasting experiences: Caribbeans report more illness than average, as does the Other Black group, while Africans report less illness. The Chinese have the lowest SIR of any ethnic group, while the SIRs of the Other Ethnic group are also below average.

[Figure 5 about here]

Life expectancies for ethnic groups

Mean life expectancies at birth in England are listed in rank order for men and women in Table 9, with the all group mean included for reference. The White British group has life expectancies slightly above (women) and slightly below (men) the all group mean. The Chinese group has the highest life expectancies for both men and women. Also above the all group mean for both men and women are the Other White and Other Ethnic groups. The Black African group, like the White British group, lies close to the all group mean, slightly above it for men and slightly below it for women. The largest discrepancies between the ranks of men and women are observed in the Indian (men rank 6, women rank 11) and White Irish (men rank 9, women rank 6) groups. We have already noted that Indian women report higher rates of limiting long-term illness, relative to the all group average, than Indian men. The lowest life expectancies are experienced by Bangladeshis, Pakistanis, the Other Black group and the Mixed: White and Black Caribbean group.
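Life expectancy at birth is derived from age-specific mortality rates through a life table. The project's life table routines are not described here, so the following is only a sketch under simple assumptions: single-year ages, deaths spread evenly within each age interval, and an open-ended last age group. The rates are toy values.

```python
import numpy as np

def life_table(m):
    """Return (e0, S) from death rates m; the last age group is open-ended.

    S[x] is the survivorship ratio L[x+1] / L[x] between closed age
    intervals; the open-ended interval needs separate treatment.
    """
    m = np.asarray(m, dtype=float)
    n = len(m)
    q = m / (1.0 + 0.5 * m)          # probability of dying in the interval
    q[-1] = 1.0                      # nobody leaves the open-ended interval alive
    l = np.empty(n)                  # survivors to exact age x, radix 1
    l[0] = 1.0
    for x in range(1, n):
        l[x] = l[x - 1] * (1.0 - q[x - 1])
    d = l * q                        # life table deaths
    L = l - 0.5 * d                  # person-years lived in each interval
    L[-1] = l[-1] / m[-1]            # person-years in the open-ended interval
    e0 = L.sum() / l[0]
    S = L[1:-1] / L[:-2]
    return e0, S

# Toy rates for ages 0, 1 and 2+ (not real data).
e0_low, S_low = life_table([0.0, 0.0, 0.5])
e0_high, S_high = life_table([0.2, 0.2, 0.5])
```

Raising the death rates at ages 0 and 1 lowers life expectancy at birth from 4.0 to about 3.0 years in this toy case.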

The spatial patterns of life expectancies

Figure 6 captures the essence of the spatial variation in life expectancy at birth across England for women of each of the 16 ethnic groups. The maps have a simple tricolour code which relates to the overall distribution of life expectancy across all local authorities in the UK. Red shading denotes that the area belongs to the 25% of highest life expectancies observed in the UK (81.2 years to 85.9 years for women and 77.2 years to 84.6 years for men), blue shading denotes the 25% of lowest life expectancies (73.8 years to 78.9 years for women and 68.7 years to 74.5 years for men), and the 50% in the middle are grey.
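The tricolour coding amounts to a quartile classification. A minimal sketch follows, using the cut points for women quoted above; treating values that fall exactly on a cut point as belonging to the outer class is an assumption.

```python
import numpy as np

def quartile_cuts(reference_e0):
    """Lower and upper quartile of a reference distribution of life expectancies."""
    lower, upper = np.percentile(reference_e0, [25, 75])
    return float(lower), float(upper)

def shade(e0, lower, upper):
    """Map a life expectancy to the map colour used for its quartile band."""
    if e0 >= upper:
        return "red"      # top 25% of the UK distribution
    if e0 <= lower:
        return "blue"     # bottom 25% of the UK distribution
    return "grey"         # middle 50%

# Cut points for women taken from the ranges quoted in the text.
women_lower, women_upper = 78.9, 81.2
example = [shade(v, women_lower, women_upper) for v in (77.0, 80.0, 82.5)]
```

When the full set of local authority values is available, `quartile_cuts` can derive the cut points instead of hard-coding them.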

The following features stand out:

• The gradient from higher life expectancies in South and East England to lower expectancies in Northern England.
• This gradient is modified by the urban/rural status of local authorities. Life expectancies in rural areas are higher than expectancies in urban areas. So, in Northern England there is a band of rural local authorities running from North Yorkshire to Cumbria which have favoured life expectancies (Brown and Rees 2006). In South and East England there are local authorities within urban areas which have lower life expectancies, particularly in and in the eastern LAs of the capital region, the Thames Gateway.
• Four ethnic groups stand out as having most areas in the top quartile of the distribution: the Chinese, Black African, Other Ethnic and White Other groups, although in Northern England and in cities in South and East England there are local authorities in the middle band.
• Four ethnic groups stand out as having a large number of local areas in the bottom quartile: the Mixed – White and Black Caribbean, Pakistani, Bangladeshi and Black Other groups.
• The remaining groups – White British, White Irish, Mixed – White and Black African, Mixed – White and Asian, Mixed – Other Mixed, Indian, Other Asian and Black Caribbean – have a mixture of high, middle and low life expectancies.

The spatial patterns of life expectancy of women presented here are very similar to those we found for men. However, the levels of female life expectancy are, of course, higher than those of male life expectancy: the gaps range from 3.3 years (Indians) to 4.9 years (Irish). The lowest differences are for the Asian groups; the highest differences are for the Irish and Mixed groups. These estimates support the importance of considering mortality for ethnic groups at local area level, as we find significant variations in life expectancy both between ethnic groups and between regions. It is important to note the low life expectancy of the Pakistani and Bangladeshi communities, which together make up a large proportion of the non-White ethnic group in England. If health inequalities in England are to be reduced, the health issues experienced by these two Asian groups need to be addressed.

These estimates of the mortality experience of ethnic groups in England have the status of provisional statistics and are currently being quality assessed by ONS. If they are accepted as experimental statistics, then the next steps will be to build explanations of the variations across ethnic groups and local areas. These explanations will include socioeconomic and environmental factors (Brown and Rees 2006). Preliminary analysis indicates that the level of educational qualification explains much of the difference between ethnic groups.

Other Components: Immigration, Emigration, Internal migration, Fertility

One important part of the project has been to prepare new estimates of immigration to the UK. We have gathered together into one spreadsheet database, called the New Migrant Databank, the statistics on migration available from census, survey and administrative sources (Boden and Rees 2008a, 2008b). The Databank is already revealing interesting differences between the accounts of migration derived from different sources and is prompting a re-evaluation of the sub-national distribution of immigration provided by the UK’s Office for National Statistics (ONS). A screen shot of the Databank is shown in Figure 7. On the left hand side, a graph compares, for one of England’s regions, Yorkshire and the Humber, the time series of immigration estimates based on four sources: the official Total International Migration (TIM) series produced by ONS, the General Practitioner new registrations (GP Regs) of patients recently arrived from outside the UK, the new National Insurance Number (NINo) applications by persons originating in all countries, and the NINo applications by migrants from the Accession 8 countries, which joined the EU in May 2004. The flat line across the graph provides a comparison with immigrants recorded in the 2001 Census. There are clearly large discrepancies between these series, in part to do with definition (long-term vs short-term migrants; worker vs non-worker; registered worker vs self-employed worker). In this case we think it is likely that the official estimate (TIM statistics) is an overestimate of inflows from abroad to the region (elsewhere there are underestimates). Our next task is to produce a synthetic estimate of immigration using these data and local intelligence on which series is likely to be closer to the true picture.

We plan to begin work in 2009 on the estimation of the internal migration variables, drawing on the recent experience and results of colleagues (reviewed briefly above). This estimation will focus on using the data from the 2001 census and then updating that information with data from the New Patient Registration system and the Labour Force Survey.

We are currently working on the estimation of fertility rates for ethnic groups for local areas, having developed and tested a method based on the Child-Woman Ratio combined with local area fertility rates for the whole population for Government Office Regions.
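The method itself is not detailed in this paper. Purely as an illustration of how a Child-Woman Ratio (CWR) might be combined with all-group fertility rates, the sketch below scales all-group age-specific fertility rates by the ratio of an ethnic group's CWR to the all-group CWR. The scaling rule, the age bands and every number are assumptions, not the project's procedure.

```python
import numpy as np

def child_woman_ratio(children_0_4, women_15_49):
    """Children aged 0-4 per woman aged 15-49."""
    return children_0_4 / women_15_49

# Toy census counts for one local area (not real data).
cwr_all = child_woman_ratio(2500.0, 10000.0)   # whole population
cwr_eth = child_woman_ratio(300.0, 1000.0)     # hypothetical ethnic group

# All-group fertility rates for five-year age groups 15-19 to 45-49
# (toy values standing in for rates from the wider region).
asfr_all = np.array([0.02, 0.07, 0.10, 0.09, 0.05, 0.01, 0.001])

# Assumed rule: scale every age-specific rate by the CWR ratio.
scale = cwr_eth / cwr_all
asfr_eth = asfr_all * scale

# Total fertility rate: five years of exposure in each age group.
tfr_all = 5.0 * asfr_all.sum()
tfr_eth = 5.0 * asfr_eth.sum()
```

A uniform scaling keeps the all-group age pattern of fertility, which is a strong simplification if groups differ in the timing of births.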

The final variable needed as input to the projection will be emigration rates. Here we need to rely on the UK’s International Passenger Survey, enhanced by knowledge of the number of Britons living abroad and using some information on total intra-country out-migration from local areas. These estimates will be quite crude.

CONCLUDING REMARKS

In this paper we have outlined the structure of a new model to project ethnic group populations within a multi-country, multi-zone and multi-group framework. This is work under construction, with many missing pieces that will be added as the model is converted into operational software. We have also reported on progress made in estimating the inputs needed for the projection model, focussing on one of them, ethnic-specific mortality, which in the UK needed particular attention because no direct record of ethnicity is entered on a person’s death certificate. These estimates rest on an assumption which we have so far been unable to test: that each ethnic group follows the general relationship that translates illness (which we can measure for ethnic groups) into mortality (which we cannot measure directly for ethnic groups). However, the results do seem plausible and correlate well with the level of socioeconomic achievement of the different groups. We are working on the other component inputs to the projection.

We have taken on a very challenging task and may have to reduce our ambition in order to achieve the goal of being able to forecast the changing ethnic mix of the UK population. This is an important challenge which should be addressed because it tells us important things about the future society that our children, grandchildren and yet to be born great grandchildren will live in.

REFERENCES

Bains, B. and Klodawski, E. (2006) GLA 2005 Round: Interim Ethnic Group Population Projections. DMAG Briefing 2006/22, November 2006. Data Management and Analysis Group, Greater London Authority, London. Available at: http://www.london.gov.uk/gla/publications/factsandfigures/dmag-briefing-2006-22.pdf.
Bains, B. and Klodawski, E. (2007) GLA 2006 Round: Ethnic Group Population Projections. DMAG Briefing 2007/14, July 2007. Data Management and Analysis Group, Greater London Authority, London. Available at: http://www.london.gov.uk/gla/publications/factsandfigures/DMAG-Briefing-2007-14-2006.pdf.
Bijak, J., Kupiszewska, D., Kupiszewski, M. and Saczuk, K. (2005) Impact of international migration on population dynamics and labour force resources in Europe. CEFMR Working Paper 1/2005. Central European Forum for Migration Research, Warsaw. Online at: http://www.cefmr.pan.pl/docs/cefmr_wp_2005-01.pdf.
Bijak, J., Kupiszewska, D., Kupiszewski, M., Saczuk, K. and Kicinger, A. (2007) Population and labour force projections for 27 European countries, 2002–2052: impact of international migration on population ageing. European Journal of Population, 23(1): 1-31.
Boden, P. and Rees, P. (2008a) New Migrant Databank: concept and development. Chapter 5 in Stillwell, J., Duke-Williams, O. and Dennett, A. (eds.) Technologies for Migration and Commuting Analysis. IGI Global, Hershey, PA.
Boden, P. and Rees, P. (2008b) New Migrant Databank: Concept, development and preliminary analysis. Paper presented at the QMSS2 seminar on Estimation and Projection of International Migration, University of Southampton, 17-19 September 2008. Online at: http://www.ccsr.ac.uk/qmss/seminars/2008-09-17/documents/QMSSPaper-PBPHR-Sept2008.pdf
Brown, D. and Rees, P. (2006) Trends in local and small area mortality in Yorkshire and the Humber: monitoring health inequalities. Regional Studies, 40(5): 437-458.
Burström, B. and Fredlund, P. (2001) Self-rated health: Is it as good a predictor of subsequent mortality among adults in lower as well as in higher social classes? Journal of Epidemiology and Community Health, 55: 836-840.
Champion, T., Fotheringham, S., Rees, P., Bramley, G. and others (2002) Development of a Migration Model. The University of Newcastle upon Tyne, The University of Leeds and The Greater London Authority/London Research Centre. Office of the Deputy Prime Minister, London. ISBN 1 85112 583 3. Online at: http://www.communities.gov.uk/documents/housing/pdf/138796.pdf
Danielis, J. (2007) Ethnic Population Forecasts for Leicester using POPGROUP. CCSR Research Report, Cathie Marsh Centre for Census and Survey Research, The University of Manchester, Manchester. Online at: http://www.ccsr.ac.uk/research/documents/EthnicPopulationForecastsforLeicester.pdf.
Dunnell, K. (2007) The changing demographic picture of the UK: National Statistician’s annual article on the population. Population Trends, 130: 9-21. Available online at: http://www.statistics.gov.uk/downloads/theme_population/Population_Trends_130_web.pdf.
CCSR (2008) POPGROUP: Population Forecasts. Online at: http://www.ccsr.ac.uk/popgroup/about/popgroup.html.
Coleman, D. (2006) Immigration and ethnic change in low-fertility countries: A third demographic transition. Population and Development Review, 32(3): 401-446.
Coleman, D. and Scherbov, S. (2005) Immigration and ethnic change in low-fertility countries – towards a new demographic transition? Presented at the Population Association of America Annual Meeting, Philadelphia. Available at: http://www.apsoc.ox.ac.uk/Oxpop/publications%20files/WP29.pdf
Franks, P., Gold, M.R. and Fiscella, K. (2003) Sociodemographics, self-rated health, and mortality in the US. Social Science and Medicine, 56(12): 2505-2514.
Harding, S. and Balarajan, R. (2002) Mortality data on migrant groups living in England and Wales: issues of adequacy and of interpretation of death rates. Pp. 115-127 in Haskey, J. (ed.) Population Projections by Ethnic Group: A Feasibility Study. The Stationery Office, London.
Heistaro, S., Jousilahti, P., Lahelma, E., Vartiainen, E. and Puska, P. (2001) Self-rated health and mortality: a long term prospective study in eastern Finland. Journal of Epidemiology and Community Health, 55(4): 227-232.
Helweg-Larsen, M., Kjøller, M. and Thoning, H. (2003) Do age and social relations moderate the relationship between self-rated health and mortality among adult Danes? Social Science and Medicine, 57(7): 1237-1247.
Hollis, J. and Bains, B. (2002) GLA 2001 Round Ethnic Group Population Projections. DMAG Briefing 2002/4. Data Management and Analysis Group, Greater London Authority, London.
Hussain, S. and Stillwell, J. (2008) Internal migration of ethnic groups in England and Wales by age and district type. Working Paper 08/3, School of Geography, University of Leeds, Leeds, UK. Online at: http://www.geog.leeds.ac.uk/wpapers/08-03.pdf.
Imhoff, E. van and Keilman, N. (1991) LIPRO 2.0: an application of a dynamic demographic projection model to household structure in the Netherlands. NIDI/CBGS Publications nr. 23. Swets & Zeitlinger, Amsterdam/Lisse. 245 p. Online at: http://www.nidi.knaw.nl/en/output/nidicbgs/nidicbgs-publ-23.pdf/nidicbgs-publ-23.pdf
Karcharnubarn, R. (2008) Healthy life expectancies in Thailand. Chapter 6, Draft PhD thesis, School of Geography, University of Leeds.
Kupiszewska, D. and Kupiszewski, M. (2005) A revision of the traditional multiregional model to better capture international migration: The MULTIPOLES model and its applications. CEFMR Working Paper 10/2005.
Large, P. and Ghosh, K. (2006a) A methodology for estimating the population by ethnic group for areas within England. Population Trends, 123: 21-31.
Large, P. and Ghosh, K. (2006b) Estimates of the population by ethnic group for areas within England. Population Trends, 124: 8-17.
Lutz, W., Samir, K.C., Khan, H.T.A., Scherbov, S. and Leeson, G.W. (2007) Future ageing in Southeast Asia: demographic trends, human capital and health status. Interim Report IR-07-026, International Institute for Applied Systems Analysis, Laxenburg, Austria.
McGee, D.L., Liao, Y., Cao, G. and Cooper, R.S. (1999) Self-reported health status and mortality in a multiethnic US cohort. American Journal of Epidemiology, 149(1): 41-46.
NCHS (2007) United States Life Tables, 2003. National Vital Statistics Reports, 54(14): 1-40. National Center for Health Statistics, Centers for Disease Control and Prevention, U.S. Department of Health and Human Services, MD 20782, USA. Retrieved 10 September 2008 from: http://www.cdc.gov/nchs/data/nvsr54/nvsr54_14.pdf
London Research Centre (1999) 1999 Round of ethnic group projections. LRC, London.
NIDI (2008) LIPRO 4 for Windows. Online at: http://www.nidi.knaw.nl/en/projects/270101/.
Norman, P., Stillwell, J. and Hussein, S. (2007) Propensity to migrate by ethnic group: 1991 and 2001. Presentation at the Sample of Anonymised Records, User Meeting,
ONS (2003) A Beginner’s Guide to UK Geography: UK Map Collection. Office for National Statistics, London. Online at: http://www.statistics.gov.uk/geography/maps.asp.
ONS (2007) Ethnic population estimates. National Statistics Statbase data repository.
ONS (2008) Population change: UK population increases by 388,000. Online at: http://www.statistics.gov.uk/cci/nugget.asp?ID=950
ONS (2008) 2006-based Subnational Population Projections for England Methodology Guide. Online at: http://www.statistics.gov.uk/downloads/theme_population/SNPP-2006/2006_Methodology_Guide.pdf
Raymer, J., Smith, P. and Giulietti, C. (2008) Combining census and registration data to analyse ethnic migration patterns in England from 1991 to 2007. Paper presented at the 2008 European Population Conference, Barcelona, Spain.
Raymer, J. and Giulietti, C. (2008) Analysing structures of interregional migration in England. Paper presented at the ESRC Research Methods Festival, Oxford, 2 July 2008, Session 40: Handling Migration and Commuting Flow Data.
Rees, P. (1981) Accounts based models for multiregional population analysis: methods, program and users' manual. Working Paper 295, School of Geography, University of Leeds, Leeds, UK.
Rees, P., Stillwell, J. and Convey, A. (1992) Intra-Community migration and its impact on the development of the demographic structure at regional level. Working Paper 92/1, School of Geography, University of Leeds, Leeds, UK.
Rees, P. (2002) New models for projecting UK ethnic group populations at national and subnational scales. Chapter 3 in Haskey, J. (ed.) Population Projections by Ethnic Group: A Feasibility Study. ONS Studies in Medical and Population Topics, SMPS No. 67. The Stationery Office, London. Pp. 27-51.
Rees, P. and Butt, F. (2004) Ethnic change and diversity in England, 1981-2001. Area, 36(2): 174-186.
Rees, P. and Parsons, J. (2006) Socio-demographic scenarios for children to 2020. Joseph Rowntree Foundation, York. Available at: http://www.jrf.org.uk/bookshop/details.asp?pubID=809.
Rees, P. and Wilson, A. (1977) Spatial Population Analysis. Edward Arnold, London.
Rees, P. and Wohland, P. (2008) Estimates of ethnic mortality in the UK. Working Paper 08/04, School of Geography, University of Leeds, Leeds, UK. Online at: http://www.geog.leeds.ac.uk/wpapers/08-4.pdf.
Rogers, A. (1976) Shrinking large-scale population projection models by aggregation and decomposition. Environment and Planning A, 8: 515-541.
Rogers, A. (1990) Requiem for the net migrant. Geographical Analysis, 22: 283-300.
Singh, G.K. and Siahpush, M. (2001) All-cause and cause-specific mortality of immigrants and native born in the United States. American Journal of Public Health, 91(3): 392-399.
Simpson, L. (2007a) Population forecasts for Birmingham. CCSR Working Paper 2007-12, Cathie Marsh Centre for Census and Survey Research, The University of Manchester, Manchester. Online at: http://www.ccsr.ac.uk/publications/working/2007-12.pdf.
Simpson, L. (2007b) Population forecasts for Birmingham, with an ethnic group dimension. CCSR Research Report, Cathie Marsh Centre for Census and Survey Research, The University of Manchester, Manchester. Online at: http://www.ccsr.ac.uk/research/documents/PopulationForecastsforBirminghamCCSRReport.pdf.
Simpson, L. (2007c) Population forecasts for Birmingham, with an ethnic group dimension. CCSR Technical Report, Cathie Marsh Centre for Census and Survey Research, The University of Manchester, Manchester. Online at: http://www.ccsr.ac.uk/research/documents/PopulationforecastsforBirminghamCCSRTech.pdf.
Simpson, L. and Gavalas, V. (2005a) Population forecasts for Oldham Borough, with an ethnic group dimension. CCSR Research Report, Cathie Marsh Centre for Census and Survey Research, The University of Manchester, Manchester. Retrieved 8 January 2008 from: http://www.ccsr.ac.uk/research/documents/PopulationForecastsforOldhamCCSRReportMay05.pdf
Simpson, L. and Gavalas, V. (2005b) Population forecasts for Rochdale Borough, with an ethnic group dimension. CCSR Research Report, Cathie Marsh Centre for Census and Survey Research, The University of Manchester, Manchester. Online at: http://www.ccsr.ac.uk/research/documents/PopulationForecastsforRochdaleCCSRReportMay05.pdf.
Simpson, L. and Gavalas, V. (2005c) Population forecasts for Oldham and Rochdale Boroughs, with an ethnic group dimension. CCSR Technical Report, Cathie Marsh Centre for Census and Survey Research, The University of Manchester, Manchester. Online at: http://www.ccsr.ac.uk/research/documents/PopulationforecastsforOldhamandRochdaleCCSRTechMay05.pdf.
Stillwell, J. and Hussain, S. (2008) Ethnic group migration within Britain during 2000-01: a district level analysis. Working Paper 08/2, School of Geography, University of Leeds, Leeds, UK. Online at: http://www.geog.leeds.ac.uk/wpapers/08-02.pdf
Stillwell, J., Hussain, S. and Norman, P. (2008) The internal migration propensities and net migration patterns of ethnic groups in Britain. Migration Letters, 5(2): 135-150.
Storkey, M. (2002) Population Projections of Different Ethnic Groups in London, 1991 to 2011. PhD Thesis, University of Southampton. 261 p.
Van Imhoff, E., van der Gaag, N., van Wissen, L. and Rees, P.H. (1997) The selection of internal migration models for European regions. International Journal of Population Geography, 3: 137-159.
Wilson, T. (2001) A new subnational population projection model for the United Kingdom. PhD Thesis, University of Leeds.
Wilson, T. (2008) A multistate model for projecting regional populations by Indigenous status: an application to the Northern Territory, Australia. Forthcoming in Environment and Planning A.
Wilson, T. and Bell, M. (2004a) Australia's Uncertain Demographic Future. Demographic Research, 11(8): 195-234.
Wilson, T. and Bell, M. (2004b) Comparative empirical evaluations of internal migration models in subnational population projections. Journal of Population Research, 21(2): 127-160.
Wilson, T., Bell, M., Heyen, G. and Taylor, A. (2004) New Population Projections for Queensland and Statistical Divisions. People and Place, 12(1): 1-14.
Wilson, T. and Rees, P. (2003) Why Scotland needs more than just a new migration policy. Scottish Geographical Journal, 119(3): 191-208.

Table 1: Ethnic groups: short and full names

SHORT NAME  FULL NAME

ENGLAND AND WALES
ALL         All Ethnic Groups
WBR         White: British
WIR         White: Irish
OWH         White: Other White
WBC         Mixed: White and Black Caribbean
WBA         Mixed: White and Black African
WAS         Mixed: White and Asian
OMI         Mixed: Other Mixed
IND         Asian or Asian British: Indian
PAK         Asian or Asian British: Pakistani
BAN         Asian or Asian British: Bangladeshi
OAS         Asian or Asian British: Other Asian
BCA         Black or Black British: Black Caribbean
BAF         Black or Black British: Black African
OBL         Black or Black British: Other Black
CHI         Chinese or other ethnic group: Chinese
OET         Chinese or other ethnic group: Other Ethnic Group

SCOTLAND
ALL         All Ethnic Groups
WHI         White
IND         Indian
PAS         Pakistani and other South Asians
CHI         Chinese
OTH         Others

NORTHERN IRELAND
ALL         All Ethnic Groups
WHI         White
ITR         Irish Travellers
MIX         Mixed
IND         Indian
PAK         Pakistani
BAN         Bangladeshi
OAS         Other Asians
BCA         Black Caribbean
BAF         Black African
OBL         Other Black
CHI         Chinese
OTH         Others

Table 2: Ages: period-ages and period-cohorts

Period-ages             Period-cohorts
0     25    50    75    -1 to 0     24 to 25    49 to 50    74 to 75
1     26    51    76    0 to 1      25 to 26    50 to 51    75 to 76
2     27    52    77    1 to 2      26 to 27    51 to 52    76 to 77
3     28    53    78    2 to 3      27 to 28    52 to 53    77 to 78
4     29    54    79    3 to 4      28 to 29    53 to 54    78 to 79
5     30    55    80    4 to 5      29 to 30    54 to 55    79 to 80
6     31    56    81    5 to 6      30 to 31    55 to 56    80 to 81
7     32    57    82    6 to 7      31 to 32    56 to 57    81 to 82
8     33    58    83    7 to 8      32 to 33    57 to 58    82 to 83
9     34    59    84    8 to 9      33 to 34    58 to 59    83 to 84
10    35    60    85    9 to 10     34 to 35    59 to 60    84 to 85
11    36    61    86    10 to 11    35 to 36    60 to 61    85 to 86
12    37    62    87    11 to 12    36 to 37    61 to 62    86 to 87
13    38    63    88    12 to 13    37 to 38    62 to 63    87 to 88
14    39    64    89    13 to 14    38 to 39    63 to 64    88 to 89
15    40    65    90    14 to 15    39 to 40    64 to 65    89 to 90
16    41    66    91    15 to 16    40 to 41    65 to 66    90 to 91
17    42    67    92    16 to 17    41 to 42    66 to 67    91 to 92
18    43    68    93    17 to 18    42 to 43    67 to 68    92 to 93
19    44    69    94    18 to 19    43 to 44    68 to 69    93 to 94
20    45    70    95    19 to 20    44 to 45    69 to 70    94 to 95
21    46    71    96    20 to 21    45 to 46    70 to 71    95 to 96
22    47    72    97    21 to 22    46 to 47    71 to 72    96 to 97
23    48    73    98    22 to 23    47 to 48    72 to 73    97 to 98
24    49    74    99    23 to 24    48 to 49    73 to 74    98 to 99
                  100+                                      99 to 100
                                                            100+ to 101+

Notes: Period-ages are used in estimating ethnic mortality rates. The period-ages in bold are the ones used in the fertility sub-model, summed in 5 year ages. Period-cohorts are used for the projection model itself.

Table 3: A notation for an ethnic population projection model

Variable     Description

Stocks       Counts of people
FP           Final Population in a time interval (count)
SP           Start Population in a time interval (count)
L            Stationary population (equivalent to the Life years variable in a Life table model)

Flows        Transitions from one state to another
B            Births
D            Non-Survivors (deaths to persons in a region at the start of an interval)
TS           Total Survivors (transitions, survivors from persons in a region at the start of the interval)
NS           Non-Survivors (deaths to persons in a region at the start of an interval)
WS           Total Survivors within a set of countries
SS           Surviving Stayers (transitions)
MS           Surviving Migrants (inter-country or inter-zone, internal migrants)
TOM          Total Out-Migrants (surviving inter-country or inter-zone, internal migrants)
TIM          Total In-Migrants (surviving inter-country or inter-zone, internal migrants)
ES           Surviving Emigrants (migrants to rest of world, external migrants)
TE           Total Emigrations (count of migrations to rest of world, external migrants)
IS           Surviving Immigrants (migrants from rest of world, external migrants)
TI           Total Immigrations (count of migration from rest of world, external migrants)

Flows        Derived variables
NI           natural increase = births minus deaths
NTM          net total migration = total in-migrants minus total out-migrants
NIM          net internal migration = total internal in-migrants minus total internal out-migrants
NEM          net external migration = immigration minus emigration
PC           population change = final population minus start population

Intensities  Either probabilities or occurrence-exposure rates
F            fertility rates (occurrence-exposure rates)
D            death rates or mortality rates (occurrence-exposure rates)
S            survival probabilities
Ns           Non-survival probabilities = 1 − survival probabilities
Ms           migration probabilities conditional on survival
Es           emigration probabilities conditional on survival
Z            sex proportion at birth

Indexes      Integers
X            period-age index (values = 0 to 100)
Nx           number of period-ages (value = 101) (nx+1 = index for all period-ages)
A            period-cohort index (values = 0 to 101)
Na           number of period-cohorts (102) (na+1 = index for all period-cohorts)
Y            year (values 2001 to 2101)
Ny           number of years (set as projection horizon)
G            gender index (values = 0, 1)
Ng           number of genders (2) (ng+1 = index for persons)
E            ethnic group (index values = 1 to 16, 1 to 18)
Ne           number of ethnic groups (initially 16, then 18) (ne+1 = index for all ethnic groups)
C            country index (values = 1 to 4)
Nc           number of countries (4) (nc+1 = index for all countries)
D            country index (used for destinations)
i(c)         zone index within country
ni(c)        number of zones within country c
j(c)         zone index within country c
T            for stocks: a point in time; for flows: an interval in time from t to t+1
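The derived flow variables of Table 3 can be computed directly from the component counts. The sketch below assumes that all flows are measured as event counts for a single zone over one interval, so that the standard components-of-change identity holds; the class name and the numbers are illustrative only.

```python
from dataclasses import dataclass

@dataclass
class Components:
    """Component counts for one zone in one time interval (Table 3 names)."""
    SP: float   # start population
    B: float    # births
    D: float    # deaths
    TIM: float  # total internal in-migrants
    TOM: float  # total internal out-migrants
    TI: float   # total immigrations
    TE: float   # total emigrations

    @property
    def NI(self) -> float:
        return self.B - self.D        # natural increase

    @property
    def NIM(self) -> float:
        return self.TIM - self.TOM    # net internal migration

    @property
    def NEM(self) -> float:
        return self.TI - self.TE      # net external migration

    @property
    def PC(self) -> float:
        # Assumed identity: change = natural increase + net migration.
        return self.NI + self.NIM + self.NEM

    @property
    def FP(self) -> float:
        return self.SP + self.PC      # final population

# Toy numbers, not real data.
zone = Components(SP=1000, B=15, D=10, TIM=40, TOM=30, TI=8, TE=5)
```

In this toy case natural increase is 5, net internal migration 10 and net external migration 3, giving a final population of 1018.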

Table 4: The nested population accounting framework for countries and local areas (transition data)

ORIGINS (rows) by DESTINATIONS (columns)

Country           Zone name                         Zone  England      Wales        Scotland     Northern Ireland  Rest of  Deaths  Totals (Start
                                                          1(1) … N(1)  1(2) … N(2)  1(3) … N(3)  1(4) … N(4)       world R  D       Populations)
England           City of London and Westminster    1(1)  SS … MS      MS … MS      MS … MS      MS … MS           ES       D       SP
                  :                                 :     :            :            :            :                 :        :       :
                  Isle of Wight                     N(1)  MS … SS      MS … MS      MS … MS      MS … MS           ES       D       SP
Wales             Isle of Anglesey/Ynys Môn         1(2)  MS … MS      SS … MS      MS … MS      MS … MS           ES       D       SP
                  :                                 :     :            :            :            :                 :        :       :
                  Cardiff/Caerdydd                  N(2)  MS … MS      MS … SS      MS … MS      MS … MS           ES       D       SP
Scotland          Aberdeen City                     1(3)  MS … MS      MS … MS      SS … MS      MS … MS           ES       D       SP
                  :                                 :     :            :            :            :                 :        :       :
                  West Lothian                      N(3)  MS … MS      MS … MS      MS … SS      MS … MS           ES       D       SP
Northern Ireland  Derry City                        1(4)  MS … MS      MS … MS      MS … MS      SS … MS           ES       D       SP
                  :                                 :     :            :            :            :                 :        :       :
                  Belfast                           N(4)  MS … MS      MS … MS      MS … MS      MS … SS           ES       D       SP
Rest of world     Immigrants                        R     IS … IS      IS … IS      IS … IS      IS … IS           0        0       IS
Totals            Populations                       *     FP … FP      FP … FP      FP … FP      FP … FP           ES       D       T

Destination zones carry the same names and indices as the origin zones. Each cell is indexed by origin zone(country) followed by destination zone(country), e.g. 1(1)N(2); *(*) denotes a total over zones and countries.

Notes: Green = Starting population (SP), Dark blue = Final population (FP), Light Grey = Deaths (D)/Non-survivors (NS), Purple = Surviving Emigrants (ES), Dark Grey = Surviving Immigrants (IS), Light Blue = Inter-country Surviving Migrants (MS), Pink = Intra-country Surviving Migrants (MS), Red = Surviving Stayers (SS).
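The accounting logic of Table 4 can be illustrated with a toy two-zone system: each origin row sums to a start population (SP) and each destination column sums to a final population (FP). The sketch below uses invented numbers and hypothetical function names; it covers existing cohorts only and is not the project's implementation.

```python
# Rows are origins: zone A, zone B, rest of world (immigrants).
# Columns are destinations: zone A, zone B, rest of world, deaths.
N_ZONES = 2
flows = [
    [900, 50, 20, 30],    # from zone A: SS, MS, ES, D
    [40, 1900, 30, 30],   # from zone B: MS, SS, ES, D
    [25, 35, 0, 0],       # from rest of world: IS, IS, 0, 0
]


def start_populations(table, n_zones):
    """SP for each zone = row sum (stayers + migrants + emigrants + deaths)."""
    return [sum(row) for row in table[:n_zones]]


def final_populations(table, n_zones):
    """FP for each zone = column sum (stayers + in-migrants + immigrants)."""
    return [sum(row[j] for row in table) for j in range(n_zones)]


sp = start_populations(flows, N_ZONES)
fp = final_populations(flows, N_ZONES)
total_es = sum(row[N_ZONES] for row in flows)       # surviving emigrants
total_d = sum(row[N_ZONES + 1] for row in flows)    # deaths
total_is = sum(flows[N_ZONES][:N_ZONES])            # surviving immigrants
grand_total = sum(sum(row) for row in flows)        # T in Table 4

print(sp, fp, total_es, total_d, total_is, grand_total)
```

The grand total T can be reached either by rows (start populations plus immigrants) or by columns (final populations plus emigrants and deaths), which gives a quick consistency check on any estimated table.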

Table 5: The structure of the projection model: base period data

Columns: Component; Notes; Particular features for UK model

Base populations
  Component: Populations for countries by age, sex and ethnicity for a mid-year
  Notes: Capacity for 4 countries, single years of age 0 to 100+ (101), 2 sexes, 18 ethnic groups
  Particular features for UK model: 4 countries = England, Wales, Northern Ireland, Scotland

  Component: Populations for zones by age, sex and ethnicity for a mid-year
  Notes: Capacity for 352 zones in one country (nested), single years of age 0 to 100+ (101), 2 sexes, 18 ethnic groups
  Particular features for UK model: Zones = local authorities. Data for mid-years. 352 LAs in England, 22 in Wales, 32 in Scotland and 26 in Northern Ireland

Survival
  Component: Survivorship probabilities for countries by age, sex and ethnicity for mid-year to mid-year interval
  Notes: Period-cohort probabilities for (-1,0), (0,1), (1,2) to (99,100) and (100+,101+) (102)
  Particular features for UK model: Derived from life tables for countries for mid-year intervals 2001-2, 2002-3, 2003-4, 2004-5, 2005-6, 2006-7 (see Rees and Wohland 2008)

  Component: Survivorship probabilities for zones by age, sex and ethnicity for mid-year to mid-year interval
  Notes: Period-cohort probabilities for (-1,0), (0,1), (1,2) to (99,100) and (100+,101+) (102)
  Particular features for UK model: Derived from life tables for local authorities for mid-year intervals 2001-2, 2002-3, 2003-4, 2004-5, 2005-6, 2006-7 (see Rees and Wohland 2008)

Fertility
  Component: Fertility rates for countries by age, sex and ethnicity for mid-year to mid-year interval
  Notes: Period-age rates for five year age groups 15-19 to 45-49 (with <15 subsumed in 15-19, >49 in 45-49)
  Particular features for UK model: Derived from a combination of vital statistics, census and survey data

  Component: Fertility rates for zones by age, sex and ethnicity for mid-year to mid-year interval
  Notes: Period-age rates for five year age groups 15-19 to 45-49 (with <15 subsumed in 15-19, >49 in 45-49)
  Particular features for UK model: Derived from a combination of vital statistics, census and survey data

  Component: Sex proportions by ethnicity
  Notes: Male proportion of live births, female proportion of live births
  Particular features for UK model: Derived from vital statistics and recent literature indicating that Indian origin mothers have fewer girl babies than the population as a whole

Internal migration
  Component: Internal migration conditional on survival between countries by age, sex and ethnicity
  Notes: Period-cohort probabilities for (-1,0), (0,1), (1,2) to (99,100) and (100+,101+) (102)
  Particular features for UK model: Derived from 2001 Census migration data. They are transition data. There will need to be a process for handling ethnic conversion when migrants cross country boundaries.

  Component: Internal migration probabilities conditional on survival between zones within countries by age, sex and ethnicity
  Notes: Period-cohort probabilities for (-1,0), (0,1), (1,2) to (99,100) and (100+,101+) (102)
  Particular features for UK model: Derived from 2001 Census migration data. They are transition data. They will need to be modelled, choosing a reduced model from the possible ODASE permutations for the trend model and using some exogenous variables for the explanatory model.

Emigration
  Component: Emigration conditional on survival from countries within countries by age, sex and ethnicity
  Notes: Period-cohort probabilities for (-1,0), (0,1), (1,2) to (99,100) and (100+,101+) (102)
  Particular features for UK model: This will be generated from our international migration analysis using a variety of sources including NHS, NINo, TIM and IPS.

  Component: Emigration conditional on survival from zones within countries by age, sex and ethnicity
  Notes: Period-cohort probabilities for (-1,0), (0,1), (1,2) to (99,100) and (100+,101+) (102)
  Particular features for UK model: This will be generated from our international migration analysis using a variety of sources including NHS, NINo, TIM and IPS.

Immigration
  Component: Immigration flows to countries by age, sex and ethnicity
  Notes: Period-cohort flows converted from country of origin or country of birth to ethnicity
  Particular features for UK model: We will use survivorship probabilities by age, sex and ethnicity to survive immigrants to survived status at the end of intervals

  Component: Immigration flows to zones by age, sex and ethnicity
  Notes: Period-cohort flows converted from country of origin or country of birth to ethnicity
  Particular features for UK model: We will use survivorship probabilities by age, sex and ethnicity to survive immigrants to survived status at the end of intervals

Note: For each of the components we are building estimates for 2001-2007. This will give us alternative jump-off years for testing and data for trend analysis.

Table 6: The general parameters and names/labels

VARIABLES      PARAMETER OR NAME                                         EXAMPLE
Countries      Number of home countries (note 1)                         4
               Country names                                             England, Wales, Scotland, Northern Ireland
Zones          Number of zones for country (1)                           352 (note 2)
               Zone names for country (1)                                City of London … Isle of Wight
               Number of zones for country (2)                           22
               Zone names for country (2)                                Isle of Anglesey/Ynys Môn … Cardiff/Caerdydd
               Number of zones for country (3)                           32
               Zone names for country (3)                                Aberdeen … West Lothian
               Number of zones for country (4)                           26
               Zone names for country (4)                                Derry City … Belfast
Ethnic groups  Number of ethnic groups                                   16 in England, 16 in Wales, 6 in Scotland and 12 in Northern Ireland
               Ethnic group names                                        White: British … Chinese or other ethnic group: other ethnic group (see Table 6 for details)
Ages           Number of period ages for mortality rates and life table  101
               Period-age names                                          0, 1, … 99, 100
               Number of period cohorts for projection                   102
               Period-cohort names                                       -1 to 0, 0 to 1, 1 to 2, … 99 to 100, 100+ to 101+
               Number of period ages for fertility                       35
               Period-age names                                          15, 16, …, 48, 49

Notes:
1. The United Kingdom is divided into four “home countries”, which have different government arrangements, reflecting their histories of union with England (still on-going).
2. This number is reduced from the full 354 LAs, because two local authorities (City of London in Greater London and Isles of Scilly in Cornwall) have very small populations. We merge the City of London with Westminster and the Isles of Scilly with Penwith.
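The dimension counts in Table 6 can be checked by generating the labels directly. The following is a small sketch with hypothetical variable names; it only reproduces the counts stated in the table.

```python
# Period ages for mortality rates and life tables: 0, 1, ..., 100 (100 = 100+).
period_ages = list(range(0, 101))

# Period-cohorts for projection: "-1 to 0" (infants born in the interval),
# "0 to 1", ..., "99 to 100", plus the open-ended "100+ to 101+".
period_cohorts = [f"{a} to {a + 1}" for a in range(-1, 100)] + ["100+ to 101+"]

# Single-year period ages for fertility: 15, 16, ..., 49.
fertility_ages = list(range(15, 50))

# Zones (local authorities) per home country, as listed in Table 6.
zones_per_country = {
    "England": 352,
    "Wales": 22,
    "Scotland": 32,
    "Northern Ireland": 26,
}
total_zones = sum(zones_per_country.values())

print(len(period_ages), len(period_cohorts), len(fertility_ages), total_zones)
```

The zone total of 432 agrees with the number of local areas in the two UK partitions of Table 8 (108 + 324).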

Table 7: Available population data for 2001, UK home countries and ethnic groups.

Columns: Population data available; Home country; Data structure

(1) All persons and ethnic groups, Census 2001
    Home country: England, Wales, Scotland, Northern Ireland
    Data structure: LAs, single year of age for each local authority up to age 100+, men and women.

(2) Midyear estimates for all persons, 2001
    Home country: England, Wales, Scotland
    Data structure: LAs, single year of age for each local authority up to age 90+, men and women.
    Home country: Northern Ireland
    Data structure: LAs, single year of age for local authority up to age 85+, men and women.

(3) Midyear estimates for all ethnic groups, 2001
    Home country: England
    Data structure: Single year of age for each local authority up to age 90+, men and women.

Table 8: The parameters for the linear regressions of SMR as a function of SIR

                                     Females               Males
Local area group             n       r2     a      b       r2     a      b

Scatter plots in Figures 4(a) and 4(b)
England                      352     0.51   52.1   0.48    0.63   47.3   0.52
Wales                        22      0.78   60.5   0.37    0.56   54.9   0.39
Scotland                     32      0.69   43.9   0.64    0.75   28.3   0.82
Northern Ireland             26      0.16   71.2   0.26    0.40   59.9   0.36

Scatter plots in Figures 4(c) and 4(d)
UK high ethnic minority      108     0.49   56.9   0.44    0.69   48.4   0.54
UK low ethnic minority       324     0.48   56.9   0.43    0.58   48.9   0.50

Notes:
1. n = number of local areas, r2 = squared correlation, a = intercept, b = slope.
2. The equation SMR = a + b × SIR was fitted to two different partitions of local authorities (LAs): (a) the regression coefficients were calculated for LAs in each home nation (England, Wales, Scotland and Northern Ireland) and by gender, females and males; (b) the regression coefficients were calculated by gender for high ethnic minority and low ethnic minority LAs in the UK, where high ethnic minority means that the non-white population is more than 8.2% of the population. 107 of the 108 high ethnic minority LAs are in England.
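To illustrate how the fitted equation SMR = a + b × SIR could be applied, the sketch below predicts an SMR from an SIR using the home-nation coefficients in Table 8. The function and dictionary names are hypothetical, and the choice of the home-nation partition rather than the high/low ethnic minority partition is an assumption made only for this example.

```python
# (intercept a, slope b) from Table 8, keyed by (home nation, sex).
REGRESSION_PARAMS = {
    ("England", "female"): (52.1, 0.48),
    ("England", "male"): (47.3, 0.52),
    ("Wales", "female"): (60.5, 0.37),
    ("Wales", "male"): (54.9, 0.39),
    ("Scotland", "female"): (43.9, 0.64),
    ("Scotland", "male"): (28.3, 0.82),
    ("Northern Ireland", "female"): (71.2, 0.26),
    ("Northern Ireland", "male"): (59.9, 0.36),
}


def predict_smr(sir, nation, sex):
    """Predict a standardised mortality ratio from a standardised illness ratio."""
    a, b = REGRESSION_PARAMS[(nation, sex)]
    return a + b * sir


# An SIR equal to the UK standard (100) for females in England.
smr_england_f = predict_smr(100.0, "England", "female")
# A higher-than-average SIR (120) for males in Scotland.
smr_scotland_m = predict_smr(120.0, "Scotland", "male")

print(round(smr_england_f, 1), round(smr_scotland_m, 1))
```

Because every slope is below 1, predicted SMRs vary less across areas than the SIRs they are derived from.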

Table 9: The ranking of mean life expectancy for ethnic groups, men and women, England, 2001

Women                                          Men
Rank  Ethnic group             Mean e0         Rank  Ethnic group             Mean e0
1     Chinese                  82.1            1     Chinese                  78.1
2     Other Ethnic             81.5            2     Other White              76.9
3     Other White              81.3            3     Other Ethnic             76.2
4     White British            80.5            4     Black African            76.1
      All groups               80.5                  All groups               76.0
5     Black African            80.4            5     White British            75.9
6     White Irish              80.3            6     Indian                   75.5
7     White-Asian              80.0            7     Other Asian              75.2
8     Other Mixed              79.9            8     White-Asian              75.1
9     Other Asian              79.5            9     White-Irish              74.9
10    White-Black African      79.5            10    Other Mixed              74.6
11    Indian                   79.3            11    Black Caribbean          74.4
12    Black Caribbean          79.1            12    White-Black African      74.2
13    White-Black Caribbean    78.7            13    Other Black              73.4
14    Other Black              78.5            14    White-Black Caribbean    73.4
15    Bangladeshi              77.7            15    Pakistani                73.1
16    Pakistani                77.3            16    Bangladeshi              72.7

START
PARAMETERS: Base year/interval parameters; Projection model parameters; Projection output parameters
INPUT DATA: Base populations; Survivorship/non-survivorship probabilities; Emigration probabilities; Internal migration probabilities; Immigration flows; Fertility rates; Ethnic conversion probabilities
PROCESSES: Start populations; Survival/mortality; Emigration; Internal migration; Immigration; End populations; Fertility; Infant components; Derived components
PROJECTION ASSUMPTIONS: Survival/mortality; Emigration; Internal migration; Immigration; Fertility
PROJECTION RESULTS: Population; Survival/mortality; Emigration; Internal migration; Immigration; Fertility; Derived components
END

Figure 1: The structure of the ETHNICPOP model

Figure 2: Population change mid-2001 to mid-2006 (illustrating the state space of the model) Source: Crown Copyright, reproduced from Dunnell, K. (2007) The changing demographic picture of the UK: National Statistician’s annual article on the population. Population Trends 130, 9-21. Available online at: http://www.statistics.gov.uk/downloads/theme_population/Population_Trends_130_web.pdf Accessed 4 February 2008.

[Flowchart. Boxes, each for countries and local authorities:
LIMITING LONG TERM ILLNESS DATA: 2001 Census Tables S16, S65
RESIDENTS DATA: 2001 Census Tables S16, S65
DEATHS DATA: 2001 Vital statistics
POPULATION DATA: 2001 Mid year Estimates
STANDARDISED ILLNESS RATIOS: 2001, UK Standard
MORTALITY RATES: 2001, UK Standard
STANDARDISED MORTALITY RATIOS: 2001, UK Standard
REGRESSION ANALYSIS, SMR = f(SIR): all LAs in UK; LAs in E, W, S, N; ‘Ethnic’ vs ‘Non-Ethnic’
RESIDENTS DATA BY ETHNICITY: 2001 Census Tables ST 101, 107, 207, 318
LIMITING LONG TERM ILLNESS BY ETHNICITY: 2001 Census Tables ST 101, 107, 207, 318
STANDARDISED ILLNESS RATIOS BY ETHNICITY: 2001, UK Standard
STANDARDISED MORTALITY RATIOS BY ETHNICITY: 2001, UK Standard
LIFE TABLES & SURVIVORSHIP PROBABILITIES BY ETHNICITY: 2001 (Calendar Year)]

Figure 3: The SIR method for estimating ethnic mortality

[Four scatter plots of SMR (vertical axis) against SIR (horizontal axis) for UK local authorities. Panels (a) females and (b) males: points labelled E = England, W = Wales, S = Scotland, N = Northern Ireland, with a fit line for the total; both axes run from 60 to 180. Panels (c) females and (d) males: points grouped by ethnic minority share > 8.2% and <= 8.2%, with a fit line for the total; SIR axis from 50 to 175, SMR axis from 50 to 150; R Sq Linear = 0.484 in (c) and 0.583 in (d).]

Figure 4: The relationships between SIR and SMR in UK local authorities by gender: (a) for all local authorities in the UK and by countries, females, (b) for all local authorities in the UK and by countries, males, (c) for local authorities in the UK with above and below average shares of ethnic minority groups, females, (d) for local authorities in the UK with above and below average shares of ethnic minority groups, males.


[Sixteen histogram panels, one per ethnic group; horizontal axis SIR from 0 to 250, vertical axis number of LAs from 0 to 150. Values annotated in each panel, (m) = males, (f) = females:
White British: 97 (m), 96 (f)
White Irish: 109 (m), 100 (f)
Other White: 79 (m), 83 (f)
White & Black Caribbean: 135 (m), 133 (f)
White & Black African: 121 (m), 117 (f)
White & Asian: 108 (m), 107 (f)
Other Mixed: 115 (m), 110 (f)
Indian: 99 (m), 122 (f)
Pakistani: 133 (m), 159 (f)
Bangladeshi: 138 (m), 152 (f)
Other Asian: 105 (m), 119 (f)
Black Caribbean: 110 (m), 122 (f)
Black African: 83 (m), 98 (f)
Other Black: 129 (m), 135 (f)
Chinese: 60 (m), 67 (f)
Other Ethnic Group: 87 (m), 80 (f)]

Figure 5: The distribution of SIRs for local areas for ethnic groups, England, 2001 Notes: Grey bars = males, solid bars= females; horizontal axis = SIR (100=UK mean), vertical axis = number of local authorities.

[Sixteen map panels, one per ethnic group: White British; White Irish; White Other; Mixed, White and Black Caribbean; Mixed, White and Black African; Mixed, White and Asian; Mixed, Other Mixed; Asian or Asian British: Indian; Asian or Asian British: Pakistani; Asian or Asian British: Bangladeshi; Asian or Asian British: Other Asian; Black or Black British: Caribbean; Black or Black British: African; Black or Black British: Other; Chinese; Other Ethnic Group. Map classes: >= 81.17 to < 85.86; >= 78.91 to < 81.17; >= 73.77 to < 78.91.]

Figure 6: Maps of life expectancy at birth, for 16 ethnic groups, England, females, 2001

Figure 7: Illustration of the New Migrant Databank for estimating immigration to UK local areas
