
International Journal of Geo-Information

Article Geo-Spatial Analysis of Population Density and Annual Income to Identify Large-Scale Socio-Demographic Disparities

Nicolai Moos *, Carsten Juergens and Andreas P. Redecker

Geomatics Group, Institute of Geography, Faculty of Geosciences, University , D-44870 Bochum, ; [email protected] (C.J.); [email protected] (A.P.R.) * Correspondence: [email protected]

Abstract: This paper describes a methodological approach for analysing socio-demographic and -economic data in large-scale spatial detail. Based on two variables, population density and annual income, the spatial relationship of these variables is investigated to identify locations of imbalance or disparity, assisted by bivariate choropleth maps. The aim is to gain a deeper insight into spatial components of socioeconomic nexuses, such as the relationships between the two variables, especially for high-resolution spatial units. The methodology can assist political decision-making, target-group advertising in the field of geo-marketing, and site searches for new shop locations, as well as further socioeconomic research and urban planning. The developed methodology was tested in a national case study in Germany and is easily transferable to other countries with comparable datasets. The analysis was carried out utilising data about population density and average annual income linked to spatially referenced polygons of postal codes. These were disaggregated initially via a readapted three-class dasymetric mapping approach and allocated to large-scale city block polygons. Univariate and bivariate choropleth maps generated from the resulting datasets were then used to identify and compare spatial economic disparities for a study area in North Rhine-Westphalia (NRW), Germany. Subsequently, based on these variables, a multivariate clustering approach was conducted for a demonstration area in . In the result, it was obvious that the spatially disaggregated data allow more detailed insight into spatial patterns of socioeconomic attributes than the coarser data related to postal code polygons.

Keywords: population density; annual income; disaggregation; dasymetric mapping; economic disparities; economy; multivariate clustering; bivariate choropleth map; geo marketing; socioeconomic research

Citation: Moos, N.; Juergens, C.; Redecker, A.P. Geo-Spatial Analysis of Population Density and Annual Income to Identify Large-Scale Socio-Demographic Disparities. ISPRS Int. J. Geo-Inf. 2021, 10, 432. https://doi.org/10.3390/ijgi10070432

Academic Editors: Giuseppe Borruso and Wolfgang Kainz

Received: 30 April 2021
Accepted: 22 June 2021
Published: 24 June 2021

Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Copyright: © 2021 by the authors. Licensee MDPI, , Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

Socio-demographic datasets provide information about the population in a certain area. Among other things, they provide measures for the evaluation of age and family structures, gender distribution, and household size as well as educational level, employment, income, purchasing power, religious beliefs, and cultural heritage on different scales [1]. Especially for political decision-making and urban planning, this information is of great value. Spatial economic information is also of particular interest to companies. With this, advertising can not only be developed and placed in a more targeted way, but also, for example, a new branch of a business can be located, much more precisely adapted to the income of the population living in a respective area.

North Rhine-Westphalia (NRW) is the most populated state of Germany and exhibits a population with distinct economic statuses and opportunities. Hence, it is particularly suitable to establish a reproducible methodological approach that can be applied to other urban areas in other countries. Detailed socio-demographic datasets are very often collected by private enterprises (e.g., microm GmbH, Michael Bauer Micromarketing GmbH) and are commercially published in many different formats, covering a lot of different variables [2].

ISPRS Int. J. Geo-Inf. 2021, 10, 432. https://doi.org/10.3390/ijgi10070432 https://www.mdpi.com/journal/ijgi

Numerous spatial approaches focus on a more global scale for which the resolution and the size of the spatial units do not fall below urban statistical districts (e.g., [3–6]). The scope of available initial spatial datasets varies from very coarse (e.g., whole cities) to moderate (e.g., urban statistical districts), and results are often simply visualised in table form [7] or diagrams that only establish borders between statistically generated classes [8]. Since the early 1990s [9], there have been numerous international studies and other publications that address the combination of spatial and statistical datasets and suggest how to ideally deal with this inter-methodological approach [10–14]. However, socioeconomic properties are usually assigned to area-covering, gapless administrative polygons, neglecting the fact that people do not live equally spread throughout the area covered by such polygons. This leads to misleading area-related figures, such as density, for those polygons in which people gather on only a small part (e.g., a block of buildings) of the respective area. The approach presented here overcomes this methodological limitation.

The application of the proposed workflow using broadly available data to gain disaggregated relocated large-scale socioeconomic datasets has not yet been fully utilised. In 2016, Ref. [15] conducted a study to detect and classify hotspots of socioeconomic disadvantages for urban statistical districts in the city of Dortmund. Until then, this was the highest level of spatial detail one could find in social science studies that deal with socioeconomic values stored in individually shaped vector features rather than in uniform raster cells. This study proposes to fill the gap between the expertise of using spatial data on the one hand and statistical socio-demographic data analysis on the other hand.
It leads to a sophisticated disaggregation and relocation concept that can be broadly applied with a certain set of source data. The aim is to come to new conclusions by enhancing the spatial precision and to find new ways to incorporate social data into spatial analyses, such as the clustering and recognition of regional and local patterns, developed based on a case study for Dortmund, NRW. Finally, it is about the visualisation of such data in adequate and revealing maps to provide an aid for visual validation and interpretation.

One of the common spatial units for geo-spatial representations of socioeconomic datasets in Germany is the postcode polygon. These can be compared to any other kind of gapless spatial unit in terms of global transferability. The postcode polygons are split up into even more detailed postal units, the eight-digit pseudo-postcode polygons (PLZ8) provided by the company microm GmbH. Each of the polygons covers about 500 households. Hence, this dataset delivers a uniform basis for comparing different regions while still neglecting the fact that the distribution of people in a given polygon is never homogeneous and, consequently, partly contains areas where no people live. Yet, those homogeneous numbers of inhabitants per unit reliably allow certain attributes to be compared between any selection of polygons ([16], see Figure 1).

In this paper, the general concept of the three-class dasymetric mapping disaggregation will be introduced, illustrated, and applied to income and population data from the city of Dortmund. The resulting disaggregated datasets will be used for spatial comparison of the relationship between the population density and annual income [17] on a city block level. Subsequently, the results will be compared to the initial postal code units through a correlation analysis followed by an individual clustering of both kinds of units for the final identification of respective hot spots of the highest correlation.
Univariate and bivariate choropleth maps are well-suited to give an extensive overview of how people of different economic statuses are distributed in Germany's most populated state NRW and where certain characteristics and values peak locally. In this case, this comparative visualisation focuses on the two agglomerations of the and the Ruhr area with their respective biggest cities of Cologne and Dortmund. They are known to be the most densely populated areas in NRW. However, they differ in various social aspects, as, e.g., [18–20] have already pointed out. Here, they provide an ideal use case for large-scale disaggregated socioeconomic datasets. A spatial comparison of these two areas reveals patterns that confirm and underline differences, allowing a better understanding of the causes for regional disparities.

The comprehensive disaggregation approach tested for the city of Dortmund refines the PLZ8-wide information to much smaller spatial units representing only residential housing blocks or even single houses and leaving out all unpopulated parts between them. Consequently, disaggregated values make it possible to rate not only the gapless and coarse polygons coming from, e.g., postal codes or administrative units. With them, one can also obtain a visual impression of how the bivariate choropleth map results of the preceding analysis refer to the real living location of the population [21].

Figure 1. Location of North Rhine-Westphalia in Germany and its PLZ8 polygons based on postal codes and an average of 500 households per polygon (Data source: [22,23]).

2. Materials and Methods

2.1. PLZ8 Polygons

Countries are subdivided into smaller administrative units (states, districts, municipalities, etc.) as well as postal tracts on different levels of spatial detail. The postal code 8 level (PLZ8) is an artificial spatial unit developed for German conditions by microm GmbH. It subdivides the regular German five-digit postal code polygons into smaller areas representing an average of 500 households per polygon. Advantageously, the PLZ8 data product matches existing administrative spatial units. Hence, its comparability is directly dependent on the spatial scale of the scientific approach (see Figure 1). In urban areas, one identifies smaller PLZ8 polygons, whereas in rural areas, the extent of PLZ8 polygons is much larger. This means, in effect, that values of two different polygons (e.g., one in an urban and one in a rather rural area) can be reasonably compared with one another while the extent of the respective polygons may differ widely.


2.2. Population Density

To visualise the population density for each PLZ8 polygon in NRW, the absolute numbers of inhabitants (inh.) must be linked to the extent of the respective polygons in order to be able to calculate the number of inhabitants per square kilometre (sqkm) for each area. After that, the values can be categorised into classes for better comparability [24] and then be visualised in a map (see Figure 2).
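The two operations described here, dividing inhabitants by polygon area and then sorting the result into classes, can be sketched in a few lines. This is a minimal illustration only: the function names, the sample polygons, and the break values are assumptions made for the example and are not taken from the study.

```python
from bisect import bisect_right


def population_density(inhabitants: float, area_sqkm: float) -> float:
    """Return inhabitants per square kilometre for one polygon."""
    if area_sqkm <= 0:
        raise ValueError("polygon area must be positive")
    return inhabitants / area_sqkm


def density_class(density: float, breaks: list) -> int:
    """Return the class index of a density value.

    `breaks` holds ascending lower bounds of classes 1..n; values below
    the first break fall into class 0.
    """
    return bisect_right(breaks, density)


# Hypothetical polygons: name -> (inhabitants, area in sqkm)
polygons = {"urban": (520, 0.25), "rural": (480, 12.0)}
# Hypothetical class breaks in inh./sqkm (not the breaks used in Figure 2)
breaks = [500, 2000, 5000]

for name, (inh, area) in polygons.items():
    d = population_density(inh, area)
    print(name, round(d, 1), "class", density_class(d, breaks))
```

Two polygons with a similar number of inhabitants end up in very different classes because their extents differ, which is the effect the density map makes visible.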

Figure 2. Population density 2017 of North Rhine-Westphalia based on PLZ8 polygons, K: Cologne, DO: Dortmund (Data source: [23,25,26]).

The map reveals the more densely populated areas in NRW in darker colours, as there is the Rhine–Ruhr area in its centre with Dortmund (DO) north of the river Ruhr, continuing south across the river Rhine with the city of Cologne (K). All other cities are not named on the map.

2.3. Average Annual Income per Inhabitant

The average annual income per inhabitant can be represented by the monetary sum per inhabitant or, better suited for comparisons, by an index that transforms the mean annual income of all inhabitants within a certain year to the value of 100. Income values higher or lower than the average are represented by values above or below 100, respectively. The microm datasets offer a variable called purchasing power index (PPI). It reflects the mean net inhabitant income [27] during the period of 1 year within a spatial unit and can be used as a proxy to assess income factors such as salary, capital assets, lettings, etc., including tax deductions. Periodic expenses such as rent, electricity, and insurances are not taken into account [28]. For Germany, the average income in 2017 is 21,220 EUR per inhabitant and is represented by a PPI value of 100 for that year [17].
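To make the index concrete, the following sketch converts between an income in EUR and an index value, assuming the index is a simple linear rescaling in which the national mean maps to 100. The linear form and the function names are assumptions for illustration; the exact construction of the microm PPI is not given in the text. Only the reference value of 21,220 EUR comes from the text.

```python
# National mean income for 2017 as stated in the text (PPI = 100).
GERMAN_MEAN_INCOME_2017_EUR = 21_220


def purchasing_power_index(mean_income_eur: float,
                           reference_eur: float = GERMAN_MEAN_INCOME_2017_EUR) -> float:
    """Rescale a mean income so that the reference income equals 100.

    Assumes a plain linear index; this is an illustrative reading,
    not the documented microm formula.
    """
    return 100.0 * mean_income_eur / reference_eur


def income_from_index(index_value: float,
                      reference_eur: float = GERMAN_MEAN_INCOME_2017_EUR) -> float:
    """Invert the linear index back to an income in EUR."""
    return index_value / 100.0 * reference_eur


print(purchasing_power_index(21_220))   # national mean
print(income_from_index(110))           # a unit 10 % above the mean
```

Under this reading, a spatial unit with an index of 110 corresponds to a mean income roughly 10 % above the national figure.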


The choropleth map of the PPI (hereinafter more suitably called average annual income per inhabitant) for NRW in Figure 3 was created in the same way and based upon the same spatial units (PLZ8) as the population density map in Figure 2. It varies the colour scheme for the different classes to point out regions characterised by average, lower, or higher income values (see Figure 3).

Figure 3. Average annual income per inhabitant (PPI 2017) for the area of North Rhine-Westphalia based on PLZ8 polygons, K: Cologne, DO: Dortmund (Data source: [23,25,26]).

As stated before, each PLZ8 unit does not necessarily reflect the real situation of the distribution of inhabitants, as there are areas within each polygon that are not occupied by residents. In order to move these values into those particular sub-polygons, representing the actual location of residential blocks, a disaggregation approach will be applied in the following chapters. The aim is to confirm the hypothesis that the large-scale units, selected as residential areas by their type of usage, provide a significantly better basis for a valuation concerning scientific problems with a spatial socioeconomic background.

2.4. Disaggregation Approach

The disaggregation of spatial data is a problem that widely occurs in science and in planning practice. The primary goal is to distribute values from higher-level, larger spatial units to the smaller spatial units within. This requires a suitable procedure that achieves a realistic reassignment of attribute values describing the actual state of the respective smaller spatial units. The method used to assign the aggregated characteristics from the large spatial units (source polygons) to the smaller spatial units (target polygons, see Figure 4) is crucial for the reliability of a spatial disaggregation on the one hand. On the other hand, the consistency of the input data required for the respective methods has an even higher influence on the quality of the results [29].

Figure 4. PLZ8 source polygons (black lines) and target polygons (red areas) (Data source: [22,30]).

Numerous studies conduct a disaggregation using the areal interpolation approach, which is one of the simplest disaggregation techniques (e.g., [31,32]). In contrast, this study applies the three-class dasymetric mapping method as a methodology for further studies in other countries that might be based on it. This approach was evaluated as the best and most accurate one in various studies [29,33–36]. It is applied by incorporating certain additional datasets that shape and qualify the target areas according to their usage: the outline of city blocks, their individual type of residential use (such as residential, mixed use, etc.), and the coverage of the area by buildings within the respective spatial unit. With this information added to the target polygons, for each of them, a weighting factor can be determined that defines the proportional assignment of inhabitants from the superior PLZ8 polygon. It is also dependent on the extent of each new and smaller spatial unit. The lack of ubiquitous, available, large-scale datasets holding the necessary information for the determination of the weighting factors might be one of the main reasons why so many past studies rather use the much simpler areal interpolation.

Since the availability of datasets required for this approach has strongly improved in a lot of countries over the past years [37,38], with this study, a methodology is developed utilising certain types of broadly available datasets necessary for a dasymetric mapping in three classes. It also aims to advertise this approach for a broader application in socio-geomatic research.

However, the general availability of additional datasets still depends strongly on the location of the study area and the respective national or regional players that collect and provide the corresponding data. Not only the acquisition of the data itself is important, but also its underlying structure. In some three-class approaches, no distinction is made between populated and non-populated target areas. Instead, a corresponding value of socio-demographic and -economic properties is assigned to each area, regardless of whether it is used for, e.g., agricultural, industrial, or residential purposes. While the decision about the weighting factors based on use classes is always subject-driven, it gives degrees of freedom for applying the principle and deriving a suitable allocation factor, thus creating a need for testing and refinement [35].
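The difference between plain areal interpolation and a use-weighted (dasymetric) allocation can be illustrated with a toy source polygon. The sketch below is not the procedure of this study: the target polygons, their areas, and the use-class weights are hypothetical, and the weights are deliberately reduced to 0 or 1 to make the contrast visible.

```python
def allocate(source_value, targets, weight_of):
    """Split `source_value` across targets proportionally to weight_of(target)."""
    weights = {name: weight_of(t) for name, t in targets.items()}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no target polygon carries any weight")
    return {name: source_value * w / total for name, w in weights.items()}


# Hypothetical targets inside one source polygon: (area in sqkm, use class)
targets = {
    "field": (0.60, "agricultural"),
    "plant": (0.25, "industrial"),
    "homes": (0.15, "residential"),
}

# Hypothetical use-class weights; real weights would need testing and refinement.
USE_WEIGHT = {"agricultural": 0.0, "industrial": 0.0, "residential": 1.0}

# Areal interpolation: area is the only weight, so empty fields receive people.
areal = allocate(1000, targets, lambda t: t[0])

# Dasymetric allocation: area is scaled by the weight of the use class.
dasymetric = allocate(1000, targets, lambda t: t[0] * USE_WEIGHT[t[1]])

print(areal)
print(dasymetric)
```

With area as the only weight, most inhabitants are placed on the agricultural field; once the use class is taken into account, all of them are assigned to the residential target.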


As it is known that people usually live in buildings, one could come to more prominent results by focusing the socioeconomic data on built-up populated structures. To achieve this, the disaggregation approach is based on remote-sensing imagery and additional datasets from regional authorities.

2.5. Disaggregation of Population Density and Annual Individual Income

The disaggregation of the socio-demographic values from the source polygons into the target polygons is conducted in four major steps (see Figure 5): first, the additional datasets that constitute the target polygons in an extended dataset are selected, restructured, and adjusted to obtain a structure that matches the one of the source dataset. In this case, the source polygons are derived from the microm PLZ8 units. The target polygons representing city blocks are taken from the Urban Atlas 2012 dataset provided by the Copernicus program [39]. The official digital landscape model that supplies the building geometries for this case study can be retrieved from Geobasis.NRW [40].

Figure 5. Schematic flow chart of the disaggregation process (source: authors).

The temporal gap between the datasets is disregarded in this context, since the Urban Atlas and the digital landscape model, besides the target geometries, only contribute information for the classification of residential use types. A divergent acquisition time of only a few years for these datasets is not expected to cause significant inconsistencies.

In the second step, representing the second class of the three-class dasymetric mapping process, the extended datasets are filtered according to the requirements for the disaggregation, leaving only the target polygons that are appropriate for the current analysis. In this case study, the residential urban areas providing space for living are of major interest. This leads to a preselection of only houses and city blocks whose purpose is habitation, as rural, industrial, and commercial sites usually do not provide a place for living. There are several buildings that can be identified as mixed usage, e.g., shops on ground level and several flats on the floors above.

This leads to step three: the third and last class of the disaggregation is where the polygons of the preselected residential areas are classified according to their respective kind of apartment structure and their number of floors [41]. The structure varies from one- or two-floor single homes for single families over mixed buildings with few shops and more apartments to large apartment blocks such as multi-family houses or dormitories for students, pupils, or seniors. Each usage type represents a different amount of people per area unit defining the density of inhabitants in the respective building blocks. This makes it inevitable to distinguish between all these classes, ensuring the values from the source polygons are transferred to the target polygons most appropriately. For this, an allocation factor is determined based on the developed classification of usage types, again based on the additional datasets. This is a crucial step, since the potential relative living area of all subdivided polygons is the only value where all polygons differ from each other.


After the preselection and classification of the target polygons, all source polygons are intersected with these prepared target polygons. This step is followed by assigning the weighted source values to the target polygons according to the two steps before. As the PLZ8 and the building block polygons do not match exactly, a lot of target polygons are subdivided into smaller pieces that were assigned with values from different source polygons. These values are assigned to the target polygons depending on their respective potential living area compared to the overall living area in the respective source zones. Using the specific ID of each single target polygon, these broken parts can be dissolved into a final, coherent city block while aggregating up all property values of the respective subparts. This results in a new patchy dataset covering only those areas where people are actually living.

2.6. Concept of Bivariate Choropleth Maps

To illustrate the relationship between two quantitative parameters of polygons in a map, a combined matrix legend can be applied. This supports the recognition of patterns created by the combination of the characteristics of the two variables [42,43]. It is built by combining two different colour scales, each representing one variable by graded brightness. The saturation of the colour assigned to each variable increases with higher values and is divided into three classes. For easy identification of the displayed correlation, by assigning the resulting colours in the polygons to the combined classes of both variables, a matrix legend with all nine possible colour combinations is used. Therefore, the classes of both characteristics are grouped by a specific selection. This results in an individual associative colour for each possible combination of the two values (see Figure 6). Further applications for bivariate choropleth maps are provided by, e.g., [21,44].

Figure 6. Schematic generation of a sequential matrix legend (Source: authors).
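One way to picture how such a matrix legend arises is to blend two three-step colour ramps. The following sketch is illustrative only: the two ramps, the multiplicative blending rule, and all function names are assumptions made for this example and are not the colours or the procedure behind Figure 6.

```python
# Illustrative sketch: build a 3 x 3 bivariate legend by blending two
# sequential colour ramps. Ramps and blending rule are assumptions.

# Hypothetical three-step ramps (low, medium, high) as RGB tuples.
DENSITY_RAMP = [(232, 232, 232), (172, 228, 228), (90, 200, 200)]
INCOME_RAMP = [(232, 232, 232), (223, 176, 214), (190, 100, 172)]


def blend(rgb_a, rgb_b):
    """Multiply blend: the result darkens where either input is saturated."""
    return tuple(a * b // 255 for a, b in zip(rgb_a, rgb_b))


def to_hex(rgb):
    """Format an RGB tuple as a hex colour string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def matrix_legend(ramp_x, ramp_y):
    """Return legend[y][x] with one hex colour per class combination."""
    return [[to_hex(blend(cx, cy)) for cx in ramp_x] for cy in ramp_y]


legend = matrix_legend(DENSITY_RAMP, INCOME_RAMP)
for row in reversed(legend):  # print the highest income class on top
    print(" ".join(row))
```

Each of the nine cells receives its own colour, so a polygon's colour identifies its class in both variables at once. Other blending rules or hand-picked palettes would serve equally well.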

2.7. Population Density vs. Annual Individual Income in Bivariate Choropleth Maps

The univariate choropleth maps shown in Figures 2 and 3 can now be transformed into one bivariate choropleth map. For this, reasonable class breaks need to be set for each attribute, and a colour scheme must be selected that emphasises the essence of this analysis.

The resulting Figure 7 gives a decent insight into the spatial pattern of the two variables used. It shows less saturation in the combination of the two different colours where values tend to be lower, and vice versa.
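The classification step can be sketched as follows. This is a minimal example under stated assumptions: tercile (quantile) breaks are used because the text does not specify how the class breaks were chosen, the sample values are invented, and the function names are hypothetical.

```python
# Illustrative sketch: assign each polygon one of nine bivariate classes.
# Quantile (tercile) breaks are an assumption; the class breaks used for
# Figure 7 may have been chosen differently.
from bisect import bisect_right


def tercile_breaks(values):
    """Return the two break values that split the sorted values into thirds."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[n // 3], ordered[(2 * n) // 3]]


def classify(value, breaks):
    """Return 0, 1 or 2 (low, medium, high) for a value."""
    return bisect_right(breaks, value)


def bivariate_class(density, income, density_breaks, income_breaks):
    """Combine both class indices into one code from 0 to 8."""
    return 3 * classify(income, income_breaks) + classify(density, density_breaks)


# Made-up example values, not data from the study.
densities = [120, 450, 900, 2500, 4000, 7500]          # inhabitants per km2
incomes = [18000, 21000, 24000, 26000, 30000, 35000]   # EUR per inhabitant

d_breaks = tercile_breaks(densities)
i_breaks = tercile_breaks(incomes)
codes = [bivariate_class(d, i, d_breaks, i_breaks)
         for d, i in zip(densities, incomes)]
print(codes)
```

The resulting code can serve as an index into a nine-colour legend: code 0 is low in both variables and code 8 is high in both.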


Figure 7. Bivariate choropleth map showing the combination of population density and average annual income per inhabitant (Data source: [22,23,25]).

2.8. Correlation and Multivariate Cluster Analysis

The correlation analysis of the two attributes, population density and average annual income per inhabitant, on the PLZ8 level reveals a low R² value with excessive scattering (see Figure 8).

Figure 8. Regression analysis for average annual income per inhabitant and population density in PLZ8 polygons, dot size in scatterplot proportional to relative polygon area (Source: authors).


When the same correlation is calculated for the disaggregated city block polygons, the R² value becomes significantly higher while the scattering is reduced (see Figure 9).

Figure 9. Regression analysis for average annual income per inhabitant and population density in city block polygons, dot size in scatterplot proportional to relative polygon area (Source: authors).
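For readers who want to reproduce this kind of comparison on their own data, the coefficient of determination of a simple linear regression can be computed as sketched below. The sketch assumes an unweighted ordinary least-squares fit, because the text does not state whether polygon area entered the regression as a weight. The sample numbers are invented.

```python
# Illustrative sketch: coefficient of determination (R^2) of an ordinary
# least-squares line y = intercept + slope * x. Sample values are made up.


def r_squared(x, y):
    """Return R^2 of the unweighted least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


scattered = r_squared([1, 2, 3, 4], [1, 3, 2, 4])
aligned = r_squared([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.9])
print(round(scattered, 2), round(aligned, 2))
```

Applying the same function to the attribute pairs of both spatial units gives two directly comparable R² values.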

The two regressions already show that the disaggregation has led to an improvement in the correlation of the two values. In the following step, a multivariate cluster analysis is conducted with the two attributes for both spatial units in order to generate new classes, compare the results, and provide a better means to see if the results can reveal new patterns and insights into large-scale city areas, using the disaggregated city blocks as the smaller spatial unit. The multivariate cluster analysis used in this study is based on the K means algorithm, which aims to partition features based on seeds that grow into clusters, minimising the differences among the features within each cluster [45,46]. This is the basis for the map shown in Figure 10, which displays spatially the clustered values that are shown statistically in the scatterplots of Figures 8 and 9.

Figure 10. Multivariate clusters for the city of Dortmund with PLZ8 and city blocks, the two lowest clusters in blue (Data sources: [22,23,25]).
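The principle of growing clusters from seeds can be sketched with a plain implementation of Lloyd's K-means iteration. This is not the GIS tool used in the study. Standardising both attributes, the choice of seed points, and the sample values are assumptions made for the example.

```python
# Illustrative sketch of K-means (Lloyd's algorithm) on two attributes.
# This is not the GIS tool used in the study; standardising both
# attributes and the seed choice are assumptions.
from statistics import mean, pstdev


def standardise(column):
    """Rescale a column to zero mean and unit standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma for v in column]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, seeds, max_iter=100):
    """Grow clusters from seed points until the assignment is stable."""
    centres = [tuple(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest cluster centre.
        new_labels = [min(range(len(centres)),
                          key=lambda c: squared_distance(p, centres[c]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Move every centre to the mean of its current members.
        for c in range(len(centres)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centre if a cluster runs empty
                centres[c] = tuple(mean(dim) for dim in zip(*members))
    return labels, centres


# Made-up values: (inhabitants per km2, annual income in EUR).
density = [300, 500, 400, 6000, 6500, 7000]
income = [30000, 32000, 31000, 18000, 17000, 19000]
points = list(zip(standardise(density), standardise(income)))
labels, centres = k_means(points, seeds=[points[0], points[3]])
print(labels)
```

Standardisation keeps the attribute with the larger numeric range from dominating the distance measure. Whether and how the study's tool rescales its inputs is not stated in the text.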


3. Results

3.1. Disaggregation

The result of the disaggregation case study for the city of Dortmund leads to two comparable maps, as shown in Figure 10. On the left side, the city of Dortmund is composed of PLZ8 polygons, while the map on the right side shows only residential patches. Looking at the PLZ8 map, it becomes apparent that the level of detail and the variation of colours for the same expanse is much higher in the city centre than in the outskirts. This can be explained by the requirement of about 500 households per polygon for the PLZ8 data. In less populated areas, these polygons turn out to be larger than those in the city centre, where higher densities of people are present. The polygons in the right map represent only those city blocks where people actually live. All other city blocks (such as open space, industry, business, sports, public services, etc.) were excluded from the disaggregation (white areas in the map). Consequently, the polygons representing residential areas are much smaller than the area-wide tracts.

Another improvement is the accuracy of values in certain areas that were transferred from non-fitting geometries to reasonable city blocks (see Figure 11). The Clarenberg in Dortmund's suburb Hörde is a multi-building residential tower block complex that is home to approximately 3000 people in around 1000 flats. Many of them are unemployed, there is a relatively high rate of child poverty, and many of the inhabitants receive unemployment benefits [47]. Thus, for the Clarenberg, the average annual income per inhabitant is comparatively low, and the population density, due to the structure of the building complex, is comparatively high. This leads to the logical conclusion that it should be represented by one of the lowest clusters calculated by multivariate clustering. Figure 11 shows that this is not the case when looking at the results of the PLZ8 clustering. This is probably caused by the fact that the outline of the main polygon, which has the biggest overlap with the Clarenberg, does not follow the shape of the complex itself. Hence, the PLZ8 polygon overlaps with surrounding areas that are not linked with the Clarenberg and have a higher income and a lower population density, and it is consequently assigned to cluster 4.

Figure 11. Clarenberg building complex (blue outline) in Dortmund: comparison of extent, shape, and values before and after disaggregation (Data source: [22,23,25,40]).

The disaggregation outcome shines in a different light: the boundary of the complex is accurately represented by the outline of the polygon, assigned with values that incorporate the structure of the people and the building complex itself. This results in an assignment to the second-lowest cluster by the clustering algorithm applying the disaggregated values stored with city block polygons. Additionally, the overall level of detail has increased, which leads to a whole new pattern and the possibility of a much finer distinction between the different clusters (see Figure 10). It allows comparing socioeconomic parameters related to single building blocks of homogeneous use classes instead of area-wide units with obscurely varying population densities (see Figure 12).

Figure 12. Disaggregated values from PLZ8 (left) to residential blocks (right) for the city of Dortmund (Data source: [22,23,25,40]).
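The allocation that produces such disaggregated block values can be sketched in a few lines. The example is a simplification under stated assumptions: it handles a single count variable (population), treats the potential living area of each intersection piece as already known, and uses invented IDs and numbers. The class weighting of the dasymetric approach is not modelled here.

```python
# Illustrative sketch: allocate a source-zone population to intersected
# target pieces in proportion to their potential living area, then
# dissolve the pieces by target ID. All IDs and numbers are made up.
from collections import defaultdict

# Population count per source polygon (e.g. a postal code unit).
source_population = {"S1": 900, "S2": 600}

# Intersection pieces: (source ID, target block ID, living area in m2).
pieces = [
    ("S1", "B1", 2000.0),
    ("S1", "B2", 1000.0),
    ("S2", "B2", 500.0),
    ("S2", "B3", 1500.0),
]


def disaggregate(source_values, pieces):
    """Return the allocated value per target block."""
    living_area_per_source = defaultdict(float)
    for source_id, _, area in pieces:
        living_area_per_source[source_id] += area

    allocated = defaultdict(float)
    for source_id, target_id, area in pieces:
        total = living_area_per_source[source_id]
        # Dissolve step: pieces with the same target ID are summed up.
        allocated[target_id] += source_values[source_id] * area / total
    return dict(allocated)


block_population = disaggregate(source_population, pieces)
print(block_population)
```

Block B2 is split between two source zones and receives a share from each, while the total population is preserved. An averaged attribute such as income per inhabitant would need a weighted mean rather than a sum. How the study handles that case is not described in this section.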

In sum, the result of the disaggregation is a new sophisticated dataset that contains only building blocks reasonably attributed with values derived from the PLZ8 data and consequently allows evaluating the area in a much more detailed way than the area-wide PLZ8 polygons would allow [5,48]. It is now possible to distinguish between small housing blocks of fewer or more inhabitants and to directly utilise the focused socioeconomic information about the people living there. The value of disaggregated data is illustrated in an example by focusing on a magnified, more detailed sub-area of Dortmund. At this scale, the original PLZ8 data can be ideally compared to the respective disaggregated values. In addition, it can be observed to what extent this higher level of detail can lead to better perceptions.

3.2. Univariate Choropleth Maps

The univariate choropleth maps in Figures 2 and 3 reveal different regional patterns. In the population density map in Figure 2, the densely populated areas appear in darker colours, representing a high number of inhabitants per square kilometre. Namely, there is the dominant Rhine–Ruhr area along the two rivers in the centre of NRW, including the area called 'Bergisches Land' south of the Ruhr, and several surrounding highly populated areas such as the city of Aachen close to the western border as well as Bielefeld and Münster in the northern part of the state. The space in between these highly populated areas is almost uniformly characterised by rural regions and is not as diverse in terms of population density as the more densely populated urban areas. This results from the fact that population density is highly correlated with specific land use [49], as fields, forests, and huge industrial sites (e.g., open-pit mines) are clearly not as suitable for living as residential areas close to urban agglomerations. The map in Figure 3 also shows a notable pattern, except that this cannot simply be explained by land use and thus by the presence or absence of housing. Indeed, the reddish-coloured polygons indicate a lower average annual income and appear rather accumulated in more urban areas. There are also urban areas as well as rural areas where this assumption does not fit.

3.3. Bivariate Choropleth Maps

The bivariate choropleth map in Figure 7 shows the combined distribution of the population density and the individual annual income for the year 2017 based on PLZ8 polygons. It is clearly recognisable that the most populated urban areas are dominated by the largest diversity while the rural areas are coloured quite homogeneously with less population density and a medium annual income. The Ruhr area is dominated by blue polygons, representing the lowest class of average annual income, no matter how densely populated they are. Consequently, this indicates the presence of people who have a low annual income but still live in the rather highly populated inner parts of the cities. This impression is interrupted by only a few areas that are represented in brownish colours. These yellow-brown colours represent the intermediate and highest class of the individual annual income and are located mostly in the southern parts of the Ruhr area at the northern riverside of the Ruhr. This impression matches the already attested assumption that the motorway A40 running from west to east across the Ruhr area is some sort of social equator, dividing the Ruhr area into two different parts: well-heeled inhabitants in the southern part and inhabitants with less money in the northern part [50,51].

Comparing the Ruhr area with the Rhineland in Figure 7, one can ascertain that almost no black polygons can be found in the Ruhr area. The Ruhr area appears mostly in lighter colours, indicating the lowest class of the average annual income through all three classes of the population density. Cities such as Dusseldorf, Cologne, or shine in a different light, since there are a lot of areas that have the highest population density and are inhabited by people who have a relatively high average annual income.

The rural areas appear in mostly brownish colours resulting from the already mentioned low population density. They are only interrupted by a few smaller areas. These appear either in darker yellow, indicating higher annual income, if they are closer to urban agglomerations, or in light green in regions far from the respective city centres.

The detailed regional perceptions coincide with the statistical results from other studies that analysed the relationship between the Rhineland and the Ruhr area [18–20,52]. On the one hand, this corroborates the hypothesis that the social composition of the two agglomerations varies significantly. On the other hand, it provides a possibility to locate these areas in a new level of detail. Moreover, the bivariate choropleth map in Figure 7 can reveal a whole new level of detail and give extensive insights into the smaller urban areas. Additionally, and perhaps most importantly, the preceding disaggregation and the resulting new bivariate choropleth map for the city of Dortmund in Figure 12 show that the level of spatial information has increased significantly. This offers the possibility to not only evaluate the bigger picture but also to analyse socioeconomic conditions on a city block level.

4. Discussion

The primary methodological goal of this study was to develop a transferable methodology to analyse socio-demographic variables in much more spatial detail, in order to gain more spatially precise information on the living circumstances of residential areas, embedded in the context of political decision making, geo-marketing, further socioeconomic research, and urban planning. The results facilitate a distinctly new perspective on certain urban areas, where large-scale city block polygons have replaced the area-covering PLZ8 polygons, assigned with respective matching and adjusted values. These allow a much more detailed assessment of fine-grained regional questions. This becomes obvious by combining the two different socio-demographic attributes of population density and average annual income per inhabitant. This is useful to not only analyse how these two variables may affect each other in terms of numbers but also to allocate them to appropriate spatial units. Their spatial visualisation can reveal certain segmentations that are well suited for preliminary analyses of the underlying data. By completing this, the results in Figure 7 reveal a distinct pattern that allows for the evaluation and comparison of certain regions with regard to their characteristics in both variables.

The K means algorithm and the resulting multivariate statistical and spatial clusters for both spatial levels of detail illustrate the added value of more precise new spatial units that can lead to a better understanding of where to find socially disadvantaged people on the one hand and how to improve economic approaches that incorporate the social status of people on the other hand. They provide an enhanced reference for further studies that deal with a smaller area of investigation and thus require smaller units to evaluate not broadly but in great detail.

Looking at the two regions of the Rhineland and the Ruhr area, it is noticeable that even though both regions stand out in NRW in terms of population density, the Rhineland is evidently inhabited by more people who have an average income or above, while the inhabitants of the Ruhr area seem to be below the German average of the respective year (2017). A reason for this could be the diverging demographic structures of the two regions. The Ruhr area has always been a melting pot for miners and factory employees due to the history of the region with a high density of coal and steel factories. Once the industrial sector was obliterated by the structural transformation of the Ruhr area due to the end of the mining era, a lot of people, especially foreign guest workers, were forced to look for new professions while the area was still looking for its future purpose [53]. Apart from that, the Ruhr area had and still has a reputation that suggests a dirty image caused by industrial smoke, coal mining, air pollution, and missing recreational sites such as green areas and forests. As a result, the Rhineland seemed to be more attractive not only for citizens but also for industry and commerce [20], while the Ruhr area was still in the process of handling the downfall of the industrial era and hence is still looking for new perspectives to provide jobs for its many inhabitants.

Due to the successful disaggregation, inhabitants can now be evaluated according to their density and their average annual income, not only for large PLZ8 polygons but for detailed polygons that represent residential city blocks. Many disaggregation techniques have already been tested, evaluated, and established in various studies (see Section 2.3). However, the one that [21] adopted and improved for the NRW case study can easily be transferred to almost every spatial unit in other countries with a comparable data basis without further adaptation. This indicates that socio-geomatic approaches are of great value to combine fine spatial structures with socio-demographic and -economic data to enhance their spatial resolution in order to more accurately analyse their spatial patterns.

5. Conclusions

This paper exemplarily investigates the spatial relationship between population density and annual income to identify hotspots of imbalance by visualising them in bivariate choropleth maps. This approach is carried forward by applying a spatial disaggregation technique and then performing a statistical multivariate cluster analysis on both spatial resolutions, grouping the two attributes into a certain hierarchy for the study area of Dortmund (see Figures 8–10).

The comparison of both the clusters and bivariate choropleth maps of the starting and the resulting scales after the disaggregation revealed several improvements. It was demonstrated that the scattering of the cluster combinations has been substantially decreased while the coefficient of determination of the two attributes has been increased, and hence, the usability of socio-demographic values stored in postal code features could be significantly improved.

The empirical acquisition and commercial or scientific distribution of statistical values for the huge variety of different socio-demographic variables is often carried out in coarse and gapless spatial datasets (e.g., regular rasters or postcode polygons). These datasets can help to characterise regions based on their structure and distribution, but their mediocre level of detail and the fact that they do not distinguish between inhabited and uninhabited areas reveal a great potential for improvement concerning the spatial resolution and suitability of the variable-containing features. This especially takes effect when dealing with sociological research questions that are analysed spatially, as the results of a sophisticated disaggregation can raise possible interpretations to the next level.

Author Contributions: Conceptualisation, Carsten Juergens, Nicolai Moos, and Andreas P. Redecker; Methodology, Nicolai Moos and Andreas P. Redecker; Spatial Analysis, Nicolai Moos and Andreas P. Redecker; Writing—Original Draft Preparation, Nicolai Moos, Carsten Juergens, and Andreas P. Redecker; Writing—Review and Editing, Nicolai Moos, Carsten Juergens, and Andreas P. Redecker; Visualisation, Nicolai Moos, Carsten Juergens, and Andreas P. Redecker; Supervision, Carsten Juergens. All authors have read and agreed to the published version of the manuscript.

Funding: The research was supported by project no. 2019-1-CZ01-KA203-061374 Spatial and economic science in higher education—addressing the playful potential of simulation games (Spationomy 2.0) funded by the within the + program.

Conflicts of Interest: The authors declare no conflict of interest.

References 1. Hoffmeyer-Zlotnik, J.H.; Warner, U. Soziodemographische Standards//Nationale soziodemographische Standards und interna- tional harmonisierte soziodemographische Hintergrundvariablen. In Handbuch Methoden der empirischen Sozialforschung; Baur, N., Blasius, J., Hoffmeyer-Zlotnik, J.H., Warner, U., Eds.; Springer VS: , Germany, 2014; pp. 733–743. 2. Küppers, R. Verfahren der Generierung mikrogeographischer Datenangebote zu Bevölkerung, Haushalten, Wohnungen, Gebäu- den, Quartieren und Arbeitsplätzen. In Flächenutzungsmonitoring IV: Genauere Daten–Informierte Akteure–Praktisches Handeln; Meinel, G., Schuhmacher, U., Behnisch, M., Eds.; Rhombos: , Germany, 2012; pp. 175–182. 3. Monteiro, J.; Martins, B.; Pires, J.M. A hybrid approach for the spatial disaggregation of socio-economic indicators. Int. J. Data Sci. Anal. 2017, 5, 189–211. [CrossRef] 4. Yao, J.; Mitran, T.; Kong, X.; Lal, R.; Chu, Q.; Shaukat, M. Landuse and land cover identification and disaggregating socio-economic data with convolutional neural network. Geocarto Int. 2019, 35, 1109–1123. [CrossRef] 5. Stevens, F.R.; Gaughan, A.E.; Linard, C.; Tatem, A.J. Disaggregating census data for population mapping using random forests with remotely-sensed and ancillary data. PLoS ONE 2015, 10, e0107042. [CrossRef] 6. Flacke, J.; Köckler, H. Spatial urban health equity indicators—A framework-based approach supporting spatial decision making. In Proceedings of the Sustainable Development and Planning VII, , , 19–21 May 2015; pp. 365–376. 7. Maschewsky, W. Umweltgerechtigkeit: Gesundheitsrelevanz und Empirische Erfassung; WZB Berlin Social Science Center: Berlin, Germany, 2004; SP I 2004-301. 8. Maier, W.; Mielck, A. “Environmental justice” (Umweltgerechtigkeit). Prävention Gesundh. 2010, 5, 115–128. [CrossRef] 9. Luc Anselin. Spatial Data Analysis with GIS: An Introduction to Application in the Social Sciences; University of California: Santa Barbara, CA, USA, 1992. 10. 
Ballas, D.; Clarke, G.; Franklin, R.S.; Newing, A. GIS and the Social Sciences: Theory and Applications; Routledge: Abingdon, UK; New , NY, USA, 2017; ISBN 9781317638834. 11. Goodchild, M.F.; Anselin, L.; Appelbaum, R.P.; Harthorn, B.H. Toward Spatially Integrated Social Science. Int. Reg. Sci. Rev. 2000, 23, 139–159. [CrossRef] 12. Baur, N.; Hering, L.; Raschke, A.L.; Thierbach, C. Theory and Methods in Spatial Analysis. Towards Integrating Qualitative, Quantitative and Cartographic Approaches in the Social Sciences and Humanities. Hist. Soc. Res. 2014, 39, 7–50. [CrossRef] 13. McLafferty, S. The Socialization of GIS. Cartogr. Int. J. Geogr. Inf. Geovis. 2004, 39, 51–53. [CrossRef] 14. Spielman, S.E.; Thill, J.-C. Social area analysis, data mining, and GIS. Comput. Environ. Urban Syst. 2008, 32, 110–122. [CrossRef] 15. Flacke, J.; Schüle, S.A.; Köckler, H.; Bolte, G. Mapping Environmental Inequalities Relevant for Health for Informing Urban Planning Interventions—A Case Study in the City of Dortmund, Germany. Int. J. Environ. Res. Public Health 2016, 13, 711. [CrossRef] 16. Microm GmbH. Das Datenhandbuch 2019. Available online: https://www.microm.de/fileadmin/microm_Datenhandbuch_2019 .pdf#c915 (accessed on 15 October 2020). ISPRS Int. J. Geo-Inf. 2021, 10, 432 16 of 17

17. Microm GmbH. Kaufkraft. Available online: http://fdz.rwi-essen.de/files/PDF/microm_Kaufkraft%20-%20Kopie.pdf (accessed on 15 October 2020).
18. Tenfelde, K.; Ditt, K. (Eds.) Das Ruhrgebiet in Rheinland und Westfalen; Verlag Ferdinand Schöningh: , Germany, 2007; ISBN 9783657757480.
19. LZG.NRW. Bevölkerung Mit Migrationsgeschichte 2017. Available online: https://www.lzg.nrw.de/00indi/0data/02/grafik/0200601052017/atlas.html?comparisonSelect=5000&date=2017 (accessed on 20 October 2019).
20. Ditt, K. Die Entwicklung des Raumbewusstseins in Rheinland und Westfalen, im Ruhrgebiet und in Nordrhein-Westfalen während des 19. und 20. Jahrhunderts: Charakteristika und Konkurrenzen. In Das Ruhrgebiet in Rheinland und Westfalen; Tenfelde, K., Ditt, K., Eds.; Verlag Ferdinand Schöningh: Paderborn, Germany, 2007; pp. 405–473. ISBN 9783657757480.
21. Moos, N. Soziogeomatik: Möglichkeiten und Grenzen der Verwendung von Erdbeobachtungsdaten und Geodaten Zusammen mit Soziodemographischen und Sozioökonomischen Daten. Ph.D. Thesis, Ruhr-University Bochum, Bochum, Germany, 2020.
22. Microm GmbH. PLZ8: NRW Region. Available online: https://www.microm.de/loesungen/geodaten/plz8/ (accessed on 22 November 2020).
23. BKG. Verwaltungsgebiete 1:250 000 (VG250). Available online: https://gdz.bkg.bund.de/ (accessed on 21 November 2020).
24. Juergens, C. Digital Data Literacy in an Economic World: Geo-Spatial Data Literacy Aspects. ISPRS Int. J. Geo Inf. 2020, 9, 373. [CrossRef]
25. Microm GmbH. Soziodemografische Variablen NRW 2017. Available online: https://microm.de/loesungen/marktdaten/soziodemografie-und-oekonomie/ (accessed on 16 October 2020).
26. OpenStreetMap Contributors. OpenStreetMap Data Extracts. Available online: https://openstreetmap.org/copyright; opendatacommons.org (accessed on 22 November 2020).
27. RWI; Microm. Sozioökonomische Daten auf Rasterebene (Welle 6). Kaufkraft; RWI—Leibniz Institute for Economic Research: , Germany, 2018.
28. Goebel, J.; Krause, P. Gestiegene Einkommensungleichheit in Deutschland. Wirtschaftsdienst 2007, 87, 824–832. [CrossRef]
29. Li, T.; Pullar, D.; Corcoran, J.; Stimson, R. A comparison of spatial disaggregation techniques as applied to population estimation for South East Queensland (SEQ), Australia. Appl. GIS 2007, 3, 1–16. [CrossRef]
30. European Commission. Copernicus. Urban Atlas 2012. Available online: https://land.copernicus.eu/local/urban-atlas/urban-atlas-2012 (accessed on 22 November 2020).
31. Cartone, A.; Panzera, D. Deprivation at local level: Practical problems and policy implications for the province of . Reg. Sci. Policy Pr. 2021, 13, 43–61. [CrossRef]
32. Kounadi, O.; Ristea, A.; Leitner, M.; Langford, C. Population at risk: Using areal interpolation and Twitter messages to create population models for burglaries and robberies. Cartogr. Geogr. Inf. Sci. 2018, 45, 205–220. [CrossRef]
33. Li, T.; Corcoran, J. Testing dasymetric techniques to spatially disaggregate the regional population forecasts for South East Queensland. J. Spat. Sci. 2011, 56, 203–221. [CrossRef]
34. Mennis, J.; Hultgren, T. Intelligent Dasymetric Mapping and Its Application to Areal Interpolation. Cartogr. Geogr. Inf. Sci. 2006, 33, 179–194. [CrossRef]
35. Eicher, C.L.; Brewer, C.A. Dasymetric Mapping and Areal Interpolation: Implementation and Evaluation. Cartogr. Geogr. Inf. Sci. 2001, 28, 125–138. [CrossRef]
36. Sridharan, H.; Qiu, F. A Spatially Disaggregated Areal Interpolation Model Using Light Detection and Ranging-Derived Building Volumes. Geogr. Anal. 2013, 45, 238–258. [CrossRef]
37. Coetzee, S.; Ivánová, I.; Mitasova, H.; Brovelli, M. Open Geospatial Software and Data: A Review of the Current State and A Perspective into the Future. ISPRS Int. J. Geo Inf. 2020, 9, 90. [CrossRef]
38. Malhotra, A.; Bischof, J.; Allan, J.; O’Donnell, J.; Schwengler, T.; Benner, J.; Schweiger, G. A Review on Country Specific Data Availability and Acquisition Techniques for City Quarter Information Modelling for Building Energy Analysis. In BauSIM 2020: 8th Conference of IBPSA Germany and Austria, 23–25 September 2020, Graz University of Technology, Austria: Proceedings; Verlag der Technischen Universität Graz: Graz, Austria, 2020.
39. Batista e Silva, F.; Poelman, H. Mapping Population Density in Functional Urban Areas: A Method to Downscale Population Statistics to Urban Atlas Polygons; EUR, Scientific and Technical Research Series: Luxembourg, 2016.
40. Land NRW. Open Data-Digitale Geobasisdaten NRW. Available online: https://www.bezreg-koeln.nrw.de/brk_internet/geobasis/opendata/index.html (accessed on 30 April 2020).
41. Töpsch, S. Räumliche Disaggregation von Bevölkerungsdaten: GIS-Gestützte Methode zur Erstellung eines Deutschland-Rasters der Kleinräumigen Bevölkerungsdichte; Lodron-Universität: Salzburg, Austria, 2009.
42. Götze, W.; van den Berg, N. Techniken des Business Mapping; Oldenbourg Wissenschaftsverlag: Berlin, Germany; Boston, MA, USA, 2003; (Reprinted in 2017).
43. Olbrich, G.; Quick, M.; Schweikart, J. Desktop Mapping: Grundlagen und Praxis in Kartographie und GIS, 3rd ed.; Springer: Berlin/, Germany, 2002.
44. Juergens, C.; Meyer-Heß, M.F. Application of NDVI in Environmental Justice, Health and Inequality Studies–Potential and Limitations in Urban Environments; Preprints: Basel, Switzerland, 2020.
45. Jain, A.K. Data clustering: 50 years beyond K-means. Pattern Recognit. Lett. 2010, 31, 651–666. [CrossRef]
46. ESRI. Mapping Cluster Toolset Concepts: How Multivariate Clustering Works. Available online: https://pro.arcgis.com/en/pro-app/latest/tool-reference/spatial-statistics/how-multivariate-clustering-works.htm (accessed on 29 January 2021).
47. Stadt Dortmund. Bericht zur sozialen Lage in Dortmund 2018; City of Dortmund, Department for Labor, Health and Social Affair: Dortmund, Germany, 2018.
48. Mennis, J.; Hultgren, T. Dasymetric Mapping for Disaggregating Coarse Resolution Population Data. In Proceedings of the 22nd Annual International Cartographic Conference, A Coruña, , 9–16 July 2005; pp. 9–16.
49. Cooke, S.; Behrens, R. Correlation or cause? The limitations of population density as an indicator for public transport viability in the context of a rapidly growing developing city. Transp. Res. Procedia 2017, 25, 3003–3016. [CrossRef]
50. Kersting, V.; Meyer, C.; Strohmeier, P.; Terpoorten, T. Die A 40–der Sozialäquator des Ruhrgebiets. In Atlas der Metropole Ruhr: Vielfalt und Wandel des Ruhrgebiets im Kartenbild; Prossek, A., Schumacher, J., Eds.; Emons: Köln, Germany, 2009.
51. Mühlan-Meyer, T.; Lützenkirchen, F. Visuelle Mehrsprachigkeit in der Metropole Ruhr–eine Projektpräsentation: Aufbau und Funktionen der Bilddatenbank “Metropolenzeichen”. Z. Angew. Linguist. 2017, 66. [CrossRef]
52. Danielzyk, R. Demographischer Wandel in Nordrhein-Westfalen, 2nd ed.; Inst. für Landes-und Stadtentwicklungsforschung: Dortmund, Germany, 2010; ISBN 9783869340418.
53. Weber, W. Strukturwandel im Ruhrgebiet 1820–2000. Westfälische Z. Z. Vaterländische Gesch. Altert. 2003, 153, 71–83.