How to represent geospatial data in SDMX
SDMX Experts Workshop 2018
Juan Muñoz López, SDMX-TWG, INEGI
Agenda
• Basic Concepts on Geolocation
• Representation of Geographic Coordinates and Areas
• Proposal to Represent Geospatial Data in SDMX
• Some Examples of Information Systems With Intensive Use of Geo-Referenced Data
• Discussion
• Conclusions
Basic Concepts on Geolocation
Some Basic Concepts
• Geospatial data refers to geographical aspects such as geodetic coordinates, regions, countries, cities, addresses, places, etc.
• This kind of data is georeferenced because it represents the location, size and shape of an object on planet Earth.
• Geo-reference: To locate something in physical space. We establish a reference between an object and its position on Earth.
• ISO 19115-1:2014 defines the schema required for describing geographic information and services by means of metadata.
Reference frameworks
• To geo-reference we can use:
  • Coordinate system: establishes imaginary points over the Earth. It is a reference system used to represent the locations of geographic features, imagery, and observations, such as Global Positioning System (GPS) locations, within a common geographic framework.
  • Map projection: a systematic transformation of the latitudes and longitudes of locations from the surface of a sphere or an ellipsoid into locations on a plane.
Map Projections
• Map projections: Conical, Cylindrical, Pseudocylindrical (Combination of conical and cylindrical), Azimuthal (planar).
• Orientation: normal, oblique or transverse
• Cuts: tangent or secant
Projection Examples
• Mercator: A cylindrical map projection presented by the Flemish geographer and cartographer Gerardus Mercator in 1569.
• Van der Grinten: A projection of the Earth into a circle. Polar regions are subject to extreme distortion.
• Google Maps, WMS and OpenStreetMap have used a Mercator projection (EPSG:3857). Since August 2018, Google has used a 3D globe model.
The Universal Transverse Mercator (UTM)
• Uses a two-dimensional Cartesian coordinate system to give locations on the surface of the Earth. Like the traditional method of latitude and longitude, it is a horizontal position representation, i.e. it is used to identify locations on the Earth independently of vertical position.
• The UTM system is not a single map projection. Instead, the system divides the Earth into sixty zones, each being a six-degree band of longitude, and uses a secant transverse Mercator projection in each zone.
• Instead of latitude and longitude coordinates, each 6°-wide UTM zone uses eastings and northings in meters. The central meridian of each zone is assigned an easting of 500,000 meters, an arbitrary value chosen to avoid negative easting coordinates, so all easting values east and west of the central meridian are positive.
• In the northern hemisphere, the equator has a northing value of 0 meters. In the southern hemisphere, the equator is assigned a northing value of 10,000,000 meters.
Geoid, Ellipsoid, Datum
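Looking back at the UTM zone scheme described above, the sketch below shows how the zone number and its central meridian could be derived from a longitude. The function names are hypothetical, and the sketch assumes the regular 6° grid only; it ignores the special zones defined around Norway and Svalbard.

```python
import math

# Constants taken from the UTM description: easting assigned to the central
# meridian, and northing assigned to the equator in the southern hemisphere.
FALSE_EASTING_M = 500_000.0
FALSE_NORTHING_SOUTH_M = 10_000_000.0


def utm_zone(longitude_deg: float) -> int:
    """Return the UTM zone number (1-60) for a longitude in decimal degrees."""
    if not -180.0 <= longitude_deg <= 180.0:
        raise ValueError("longitude must be within [-180, 180] degrees")
    # Zone 1 starts at 180°W and each zone spans 6° of longitude.
    zone = int(math.floor((longitude_deg + 180.0) / 6.0)) + 1
    return min(zone, 60)  # longitude +180° is folded into zone 60


def central_meridian(zone: int) -> float:
    """Return the longitude (degrees) of the central meridian of a zone."""
    if not 1 <= zone <= 60:
        raise ValueError("zone must be within 1..60")
    return zone * 6.0 - 183.0


def equator_northing(southern_hemisphere: bool) -> float:
    """Northing value (meters) assigned to the equator in each hemisphere."""
    return FALSE_NORTHING_SOUTH_M if southern_hemisphere else 0.0


if __name__ == "__main__":
    zone = utm_zone(-99.13)  # a longitude near Mexico City
    print(zone, central_meridian(zone))
```

A full conversion from latitude and longitude to easting and northing needs the transverse Mercator series for the chosen ellipsoid, which is outside the scope of this sketch.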
• Geoid: The shape that the surface of the oceans would take under the influence of Earth's gravity and rotation alone, in the absence of other influences such as winds and tides.
• Ellipsoid: A three-dimensional geometric figure that resembles a sphere, but whose equatorial axis (a) is slightly longer than its polar axis (b). Ellipsoids are commonly used as surrogates for geoids so as to simplify the mathematics involved in relating a coordinate system grid with a model of the Earth's shape. Many ellipsoids are in use around the world.
• Geodetic datum: A coordinate system and a set of reference points, related to a specific ellipsoid, used to locate places on the Earth (or similar objects). A datum in the modern sense is defined by choosing an ellipsoid and then a primary reference point. Therefore, giving the ellipsoid used is not enough.
Cartesian Coordinate System
• A coordinate system that specifies each point uniquely in a plane by a pair of numerical coordinates, which are the signed distances to the point from two fixed perpendicular directed lines, measured in the same unit of length.
• A Cartesian coordinate system for a three-dimensional space consists of an ordered triplet of lines (the axes) that go through a common point (the origin) and are pair-wise perpendicular; an orientation for each axis; and a single unit of length for all three axes.
• As in the two-dimensional case, each axis becomes a number line. For any point P of space, one considers a plane through P perpendicular to each coordinate axis, and interprets the point where that plane cuts the axis as a number. The Cartesian coordinates of P are those three numbers, in the chosen order. The reverse construction determines the point P given its three coordinates.
Geographical Coordinates Systems
Geographical coordinates (φ, λ, h):
• Latitude: phi, φ, Y-axis
  • Measured in degrees
• Longitude: lambda, λ, X-axis
  • Measured in degrees
• References:
  • X: Meridian of Greenwich
  • Y: Equator
  • Z: Rotation axis
• Altitude (height or elevation): h, Z-axis
  • Measured in meters
  • Altitude can be measured as:
    • Ellipsoidal or orthometric: a point above an ellipsoid.
    • Geocentric or Cartesian: a point measured from the center of mass of the Earth.
Latitude, Longitude, Altitude
• Latitude: Coordinate that specifies the north–south position of a point on the Earth's surface. It is an angle which ranges from 0° at the Equator to 90° (North or South) at the poles. Lines of constant latitude, or parallels, run east–west as circles parallel to the equator.
  • Geodetic latitude: The angle between the normal and the equatorial plane. The standard notation in English publications is φ. This is the definition assumed when the word latitude is used without qualification.
• Longitude: Coordinate that specifies the east–west position of a point on the Earth's surface. Longitude is the angle between a plane containing the Prime Meridian and a plane containing the North Pole, South Pole and the location in question. This forms a right-handed coordinate system with the z axis pointing from the Earth's center toward the North Pole and the x axis extending from Earth's center through the equator at the IERS Prime Meridian (derived from the Greenwich Meridian).
  • Geodetic longitude: Angle from the prime meridian plane to the meridian plane passing through the given point, eastwards usually treated as positive. The standard notation in English publications is λ.
• Altitude: Synonym of elevation or height. It is the distance above mean sea level.
  • Geodetic altitude: The distance from the selected point to the reference geoid, measured along the geodetic local vertical; it is positive for points outside the geoid. (Source: www.britannica.com)
Some Reference Ellipsoids
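To make the relation between geodetic coordinates (φ, λ, h) and the right-handed Cartesian frame described above more concrete, here is a minimal sketch of the standard geodetic-to-Earth-centered conversion. It assumes the WGS 84 ellipsoid (a = 6378137 m, 1/F = 298.257223563) and treats h as a height above the ellipsoid; the function name is hypothetical.

```python
import math

# WGS 84 defining parameters (assumed here; any other ellipsoid can be passed in).
WGS84_A_M = 6378137.0
WGS84_INV_F = 298.257223563


def geodetic_to_ecef(lat_deg, lon_deg, h_m=0.0, a=WGS84_A_M, inv_f=WGS84_INV_F):
    """Convert geodetic latitude, longitude (degrees) and ellipsoidal height
    (meters) into Earth-centered Cartesian X, Y, Z coordinates (meters)."""
    f = 1.0 / inv_f
    e2 = f * (2.0 - f)  # first eccentricity squared
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Radius of curvature in the prime vertical at this latitude.
    n = a / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
    x = (n + h_m) * math.cos(lat) * math.cos(lon)
    y = (n + h_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - e2) + h_m) * math.sin(lat)
    return x, y, z


if __name__ == "__main__":
    # A point on the equator at the prime meridian lies on the X axis.
    print(geodetic_to_ecef(0.0, 0.0))
```

At the equator and prime meridian the result lies on the X axis at a distance equal to the semi-major axis, and at a pole the Z value equals the semi-minor axis.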
Parameter                      Value
Semi-major axis                a
Reciprocal of flattening       1/F
Semi-minor axis                b = a(1 - F)
First eccentricity squared     $e^2 = 1 - \frac{b^2}{a^2} = F(2 - F)$
Second eccentricity squared    $e'^2 = \frac{a^2}{b^2} - 1 = \frac{F(2 - F)}{(1 - F)^2}$

Name                   Semi-Major Axis (a), km   Semi-Minor Axis (b), km   1/Flattening
Airy 1830              6377.563                  6356.257                  299.32
Modified Airy          6377.340                  6356.034                  299.32
Australian National    6378.160                  6356.775                  298.25
Bessel 1841            6377.397                  6356.079                  299.15
Clarke 1866            6378.206                  6356.584                  294.98
Clarke 1880            6378.249                  6356.516                  293.46
Everest 1830           6377.276                  6356.075                  300.80
Fischer 1960           6378.155                  6356.773                  298.30
Helmert 1906           6378.200                  6356.818                  298.30
Indonesian 1974        6378.160                  6356.774                  298.25
International 1924     6378.388                  6356.912                  297.00
Krassovsky 1940        6378.245                  6356.863                  298.30
South American 1969    6378.160                  6356.774                  298.25
WGS 72                 6378.135                  6356.751                  298.26
GRS 80                 6378.137                  6356.752                  298.257
WGS 84                 6378.137                  6356.752                  298.257
Some Reference Datums
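As a worked check of the ellipsoid relations listed with the reference ellipsoids (b = a(1 - F), and the first and second eccentricities squared), the following sketch derives those quantities from a semi-major axis and a reciprocal of flattening. The function name and the returned dictionary keys are hypothetical; the input values are the rounded WGS 84 figures from the table.

```python
def ellipsoid_parameters(a_km: float, inverse_flattening: float) -> dict:
    """Derive the semi-minor axis and eccentricities of a reference ellipsoid
    from its semi-major axis (km) and the reciprocal of its flattening."""
    f = 1.0 / inverse_flattening
    b_km = a_km * (1.0 - f)             # semi-minor axis
    e2 = f * (2.0 - f)                  # first eccentricity squared
    e2_prime = e2 / (1.0 - f) ** 2      # second eccentricity squared
    return {"a_km": a_km, "b_km": b_km, "f": f, "e2": e2, "e2_prime": e2_prime}


if __name__ == "__main__":
    wgs84 = ellipsoid_parameters(6378.137, 298.257)
    for name, value in wgs84.items():
        print(f"{name}: {value:.9f}")
```

Running the same function over other rows of the table is a quick way to confirm that the listed semi-minor axis agrees with the listed semi-major axis and flattening.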
Datum                                                      Ellipsoid
Adindan                                                    Clarke 1880
European 1950 (ED 50)                                      International 1924
European 1979 (ED 79)                                      International 1924
Pulkovo 1942                                               Krassovsky
World Geodetic System (WGS 84)                             WGS84
Korean Geodetic System                                     WGS84
North American 1927 (NAD 27)                               Clarke 1866
North American 1983 (NAD 83)                               GRS80
South American 1969                                        South American 1969
South American Geocentric                                  GRS80
Tokyo                                                      Bessel 1841
International Terrestrial Reference System 1992 (ITRF92)   GRS80
International Terrestrial Reference System 2008 (ITRF08)   GRS80
International Terrestrial Reference System 2014 (ITRF14)   GRS80

Most GPS receivers and map services like Apple Maps, Google Maps, OpenStreetMap, among others, use a coordinate reference system based on WGS 84 (EPSG:4326); the GRS80 ellipsoid is close enough to align without correction.
Representation of Geographic Coordinates and Areas
Representing Coordinates
There are several standards to represent geospatial coordinates:
• ISO 6709:2008, Standard representation of geographic point location by coordinates.
• KML 2.3, Open Geospatial Consortium, ISO 19111
• GML, Open Geospatial Consortium, ISO 19136
Representing Coordinates
• ISO 6709:2008, Standard representation of geographic point location by coordinates.
  • The horizontal position shall be described through a pair of coordinates.
  • Latitude: a number preceded by a sign character. A plus sign (+) denotes northern hemisphere or the equator, and a minus sign (-) denotes southern hemisphere.
  • Longitude: a number preceded by a sign character. A plus sign (+) denotes east longitude or the prime meridian, and a minus sign (-) denotes west longitude or the 180° meridian (opposite of the prime meridian).
  • For digital data interchange, decimal degrees shall be the preferred representation. However, for backward compatibility with the first edition of this International Standard, sexagesimal degrees may be used.
  • Height (or depth): Its representation is optional. Height or depth from the reference surface in the positive direction shall be designated using a plus sign (+). Height or depth from the reference surface in the negative direction shall be designated using a minus sign (−). Height or depth on the reference surface
ISO 6709:2008 Single Text Representation Examples (Annex H)
• Latitude and Longitude:
  • Degrees: +40-075CRSWGS_84/
  • Degrees and decimal degrees: +40.20361-075.00417CRSWGS_84/
  • Degrees and minutes: +4012-07500CRSWGS_84/
  • Degrees, minutes and decimal minutes: +4012.22-07500.25CRSWGS_84/
  • Degrees, minutes and seconds: +401213-0750015CRSWGS_84/
• Latitude, Longitude and Height or Depth:
  • Degrees and decimal degrees: +40.20361-075.00417+350.517CRSWGS_84/
  • Degrees, minutes, seconds and decimal seconds: +401213.1-0750015.1+2.79CRSWGS_84/
• Rules:
  • The number of digits for latitude, longitude and height indicates the precision.
  • There shall be no separator between the elements for latitude, longitude, height (depth) and CRS.
  • The use of the designators "+", "−" and "CRS" preceding the value part of each element permits the recognition of the start of each element and the termination of the previous one.
  • The point location string shall be terminated. The terminator character shall be a solidus (/).
• Representation in XML (From Annex G):
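Separately from the XML form, the single-text rules listed above (sign designators, no separators, optional height, optional CRS, terminating solidus) can be illustrated with a small reader and writer. This is only a sketch under stated assumptions: it handles the decimal-degrees form alone, it assumes a two-digit integer part for latitude and a three-digit integer part for longitude, and the function names are hypothetical rather than part of any standard library.

```python
import re

# Decimal-degrees form only: ±DD.D…±DDD.D…[±height][CRS<id>]/
_ISO6709_DECIMAL = re.compile(
    r"^(?P<lat>[+-]\d{2}(?:\.\d+)?)"
    r"(?P<lon>[+-]\d{3}(?:\.\d+)?)"
    r"(?P<height>[+-]\d+(?:\.\d+)?)?"
    r"(?:CRS(?P<crs>\w+))?/$"
)


def parse_iso6709(text: str):
    """Parse a decimal-degrees point string into (lat, lon, height, crs)."""
    match = _ISO6709_DECIMAL.match(text)
    if match is None:
        raise ValueError(f"not a decimal-degrees ISO 6709 string: {text!r}")
    lat = float(match["lat"])
    lon = float(match["lon"])
    if abs(lat) > 90.0 or abs(lon) > 180.0:
        raise ValueError("latitude or longitude out of range")
    height = float(match["height"]) if match["height"] else None
    return lat, lon, height, match["crs"]


def format_iso6709(lat, lon, height=None, crs=None, decimals=5):
    """Build a decimal-degrees point string; `decimals` sets the precision."""
    # Widths count the sign, the fixed integer digits, the point and decimals.
    text = f"{lat:+0{decimals + 4}.{decimals}f}{lon:+0{decimals + 5}.{decimals}f}"
    if height is not None:
        text += f"{height:+.3f}"
    if crs:
        text += f"CRS{crs}"
    return text + "/"


if __name__ == "__main__":
    encoded = format_iso6709(40.20361, -75.00417, height=350.517, crs="WGS_84")
    print(encoded)
    print(parse_iso6709(encoded))
```

Strings in the sexagesimal forms (degrees and minutes, or degrees, minutes and seconds) are rejected by this reader instead of being misinterpreted as decimal degrees.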
• Basis for the geographical reference of censuses and statistical surveys
• Divides the Mexican territory into parts called "geo-statistical areas"
• Contains three disaggregation levels:
  • State (AGEE)
  • Municipality (AGEM)
  • Basic (AGEB)
    • Urban or Rural
• Contains localities and blocks
Elements of the MGN
• For each disaggregation level, the MGN contains:
  • Code lists
  • Polygons
Proposal to Represent Geospatial Data in SDMX
Including Geographical Coordinates and Areas at any Level (like Observation)
• To represent any geographical place we can add an attribute (Coordinates: COORD) or a set of attributes (Latitude: LAT, Longitude: LON, [Height: H], [Coordinates Reference System: CRS]).
  • We need to decide which representation we are going to use:
    • ISO 6709
    • GML
    • KML
    • An SDMX-defined representation
• To represent any geographical area we can add an attribute (RefPolygon) composed of a set of coordinates.
  • We need to decide how we want to represent the polygon:
    • GML
    • KML
    • An SDMX-defined representation
Including Geographical Coordinates and Areas at any Level
• Original:
• In GML/KML style: GEO_POLYGON="-112.265654928602,36.09447672602546,2357 -112.2660384528238,36.09342608838671,2357 -112.2668139013453,36.09251058776881,2357 -112.2677826834445,36.09189827357996,2357 -112.2688557510952,36.0913137941187,2357"
• In an SDMX-defined style: GEO_POLYGON="45.256,-110.45;46.46,-109.48;43.84,-109.86;45.8,-109.2"
Including Geographical References to CL_AREA
• Geographical areas (CL_AREA)
  • This code list provides code values for geographical areas, defined as areas included within the borders of a country, region, group of countries, etc.
• A polygon of coordinates may be associated with each code as an Annotation
• A reference to the capital city (or to a centroid of the polygon) may be associated with a Coordinates attribute or a set of LAT, LON, [H], [CRS] attributes
Defining a New Type of Dimension
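Before a polygon can be attached to a code or an observation, its string form has to be read back into coordinates. The sketch below reads the two GEO_POLYGON string styles shown earlier. It assumes that the GML/KML style lists whitespace-separated "longitude,latitude[,altitude]" tuples (the KML ordering) and that the SDMX-defined style lists semicolon-separated "latitude,longitude" pairs; the second ordering is an assumption drawn from the example values, and both function names are hypothetical.

```python
def parse_kml_style(text: str):
    """Read 'lon,lat[,alt]' tuples separated by whitespace.
    Returns a list of (lat, lon, alt) with alt set to None when absent."""
    points = []
    for item in text.split():
        parts = [float(value) for value in item.split(",")]
        if len(parts) not in (2, 3):
            raise ValueError(f"expected lon,lat[,alt], got {item!r}")
        lon, lat = parts[0], parts[1]
        alt = parts[2] if len(parts) == 3 else None
        points.append((lat, lon, alt))
    return points


def parse_sdmx_style(text: str):
    """Read 'lat,lon' pairs separated by semicolons (assumed ordering).
    Returns a list of (lat, lon, None) for symmetry with parse_kml_style."""
    points = []
    for pair in text.split(";"):
        lat_text, lon_text = pair.split(",")
        points.append((float(lat_text), float(lon_text), None))
    return points


if __name__ == "__main__":
    print(parse_kml_style("-112.2656,36.0944,2357 -112.2660,36.0934,2357"))
    print(parse_sdmx_style("45.256,-110.45;46.46,-109.48"))
```

Returning the same (lat, lon, alt) shape from both readers keeps later steps, such as drawing the polygon or computing a centroid, independent of the chosen string style.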
• A Geographical Dimension might be incorporated into the SDMX Information Model
• This dimension may be useful for the precise geographical localization of statistical data and their representation in maps
• This dimension should be derived from DimensionComponent
• It would include attributes like:
  • Latitude
  • Longitude
  • Height
  • Datum
  • GeoPolygon
  • LayerID
SDMX Concepts Related to Geographical Issues
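To picture what a value of the proposed Geographical Dimension might carry, here is a small illustrative data structure holding the attributes listed for it. It is only a sketch: the class name, field types, range checks and the default datum are assumptions made for illustration, and nothing here is part of the SDMX Information Model.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class GeoDimensionValue:
    """Hypothetical container for the attributes of a geographical dimension."""
    latitude: float                      # decimal degrees, north positive
    longitude: float                     # decimal degrees, east positive
    height: Optional[float] = None       # meters, optional
    datum: str = "WGS 84"                # assumed default, not mandated anywhere
    geo_polygon: Tuple[Tuple[float, float], ...] = ()  # (lat, lon) vertices
    layer_id: Optional[str] = None       # map layer the value belongs to

    def __post_init__(self):
        # Reject coordinates that cannot exist on the ellipsoid.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude must be within [-90, 90] degrees")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude must be within [-180, 180] degrees")


if __name__ == "__main__":
    point = GeoDimensionValue(latitude=40.20361, longitude=-75.00417, height=350.517)
    print(point)
```

Making the structure immutable reflects the idea that a dimension value identifies an observation and should not change after it has been assigned.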
• Counterpart reference area (VIS_AREA)
  • The secondary area, as opposed to the reference area, to which the measured data is in relation.
• Reference area (REF_AREA)
  • Country or geographic area to which the measured statistical phenomenon relates.
• Geographical areas (CL_AREA)
  • This code list provides code values for geographical areas, defined as areas included within the borders of a country, region, group of countries, etc.
• Comparability – geographical (COMPAR_GEO)
  • Extent to which statistics are comparable between geographical areas.
• Coverage (COVERAGE)
  • The definition of the population that statistics aim to cover. The term "coverage" encompasses the descriptions of key dimensions delimiting the statistics produced, e.g. geographical, institutional, product, economic sector, industry, occupation, transaction, etc., as well as relevant exceptions and exclusions.
  • The term "coverage" describes the scope of the data compiled, rather than the characteristics of the survey.
Some Examples of Information Systems With Intensive Use of Geo-Referenced Data
Mexico in Numbers
• Information from national to municipal level
• Several domains:
  • Economy
  • Environment
  • Demography
  • Society
  • Government
• Visualization in charts or maps
• Thematic, cartographic or satellite maps
http://www.beta.inegi.org.mx/app/areasgeograficas/?ag=01#tabMCcollapse-Indicadores
National Statistical Directory of Business Units (DENUE)
• Identification and localization of 5,078,714 active business units in Mexico
• Data from 2012, permanently updated
• Selection of establishments by size, economic activity, and geographical area
• Geographical space chosen by the user
http://www.beta.inegi.org.mx/app/mapa/denue/
National Inventory of Dwellings (INV)
• Data on dwellings, population and urban environment
• Information from the 2010 National Census of Population and Dwelling and the growth between 2010 and 2015
• Shares the DENUE platform
• Indicators contained:
  • 12 on dwellings
  • 7 on population
  • 14 on urban environment (road infrastructure, equipment and services, access and commerce on public roads)
http://www.beta.inegi.org.mx/app/mapa/INV/Default.aspx?ll=22.993165420735874,-108.9915234375&z=5
Discussion
Questions to Solve About Representing Geospatial Data in SDMX
• Is it relevant to represent geographical information in SDMX?
• Which representation style may be used to represent geographical coordinates in SDMX?
• Which representation style may be used to represent geographical areas in SDMX?
• Must CL_AREA be modified to associate geographical polygons, or is it better to create a new code list (CL_GEOAREA)?
  • Who would be in charge of maintaining this code list?
• Must geographical information be set as any regular dimension, or do we need to include a new type of dimension (Geographical Dimension) in SDMX?
• What else will we need to support these characteristics (modifications to the web services, APIs, software tools, etc.)?
Conclusions
Space to fill with the conclusions…
Additional Questions or Comments?
Contact: TWG@sdmx.org
References
• Encyclopedia Britannica (www.britannica.com)
• International Earth Rotation and Reference Systems Service (www.iers.org)
• International Hydrographic Organization (www.iho.int)
• International Union of Geodesy and Geophysics (www.iugg.org)
• ISO (www.iso.org)
• John A. Dutton E-Education Institute, PennState (www.e-education.psu.edu)
• National Geographic (www.nationalgeographic.org)
• Open Geospatial Consortium (www.ogc.org)
• Open Source Geospatial Foundation (www.osgeo.org)
• Wikipedia (www.wikipedia.org)
Some Known Names for Common CRS
NGVD 29             Sea Level Datum 1929
OSGB36              Ordnance Survey Great Britain 1936
SK-42               Systema Koordinat 1942 goda
ED50                European Datum 1950
SAD69               South American Datum 1969
GRS 80              Geodetic Reference System 1980
ISO 6709            Geographic point coord. 1983
NAD 83              North American Datum 1983
WGS 84              World Geodetic System 1984
NAVD 88             N. American Vertical Datum 1988
ETRS89              European Terrestrial Ref. Sys. 1989
GCJ-02              Chinese obfuscated datum 2002
Geo URI             Internet link to a point 2010
ITRF92-ITRF2014     International Terrestrial Reference System
SRID                Spatial Reference System Identifier
UTM                 Universal Transverse Mercator