Geographic Information Systems (GIS)

Computerisation has opened up vast new potential in the way we communicate, analyze our surroundings, and make decisions. Data representing the real world can be stored and processed so that they can be presented later in simplified forms to suit specific needs. Many of our decisions depend on the details of our immediate surroundings and require information about specific places on the Earth’s surface. Such information is called geographical because it helps us to distinguish one place from another and to make decisions for one place that are appropriate for that location. Geographical information allows us to apply general principles to the specific conditions of each location, allows us to track what is happening at any place, and helps us to understand how one place differs from another. Geographical information, then, is essential for effective planning and decision making in modern society.

GIS: What is it?

• Geographic/Geospatial Information
– information about places on the earth’s surface
– knowledge about “what is where when”
– Geographic/geospatial: synonymous
• GIS: what’s in the S?
– Systems: the technology
– Science: the concepts and theory
– Studies: the societal context

Geographic Information Technologies

• Global Positioning Systems (GPS)
– a system of earth-orbiting satellites which can provide precise (100 meter to sub-cm) location on the earth’s surface (in lat/long coordinates or equivalent)
• Remote Sensing (RS)
– use of satellites or aircraft to capture information about the earth’s surface
– digital ortho images are a key product (map-accurate digital photos)
• Geographic Information Systems (GISy)
– software systems with capability for input, storage, manipulation/analysis and output/display of geographic (spatial) information

GPS, RS and maps are sources of input data for a GISy. GIS is a way to model our world!

Figure: the real world modelled as a stack of GIS layers (events, zoning, streets, utilities, ownership, natural resources).

Defining Geographic Information Systems

• The common ground between information processing and the many fields using spatial analysis techniques. (Tomlinson, 1972)
• A powerful set of tools for collecting, storing, retrieving, transforming, and displaying spatial data from the real world. (Burrough, 1986)
• A computerised database management system for the capture, storage, retrieval, analysis and display of spatial (locationally defined) data. (NCGIA, 1987)
• A decision support system involving the integration of spatially referenced data in a problem solving environment. (Cowen, 1988)

Basic Data Model

• A prerequisite for describing the real world by use of GIS is that the different types of geographical information can be stored in a computer.
• All the operations in a computer are based on the storage and handling of numbers.
• This is why the data stored in a computer are known as digital data.
• In GIS there is a need to store graphical figures, images, numerical values, and plain text.
• All these forms of data must therefore be convertible to a digital representation.

Representing Data with Vector Models

Vector Model

The fundamental concept of vector GIS is that all geographic features in the real world can be represented either as:

• Points or Dots (nodes): Points are the fundamental and simplest form of geographical objects and are zero-dimensional because they have no extension. Each point is represented by a coordinate pair. Example: trees, poles, tube wells, earthquakes.

• Lines (arcs): A line is a series of points linked together by straight line segments. A line is bounded by two points, a start point and an end point. Lines are one-dimensional, as they extend in only one direction. (Mathematically, a vector is a straight line having both magnitude and direction.) Example: streams, streets, sewers, pipelines, electrical lines.

• Areas (polygons): An area is represented by a single line that encloses a space, thus forming a closed polygon. The surrounding line, called a ring, has to start and end at the same point in order for the area to be closed and defined. Areas are two-dimensional because they extend in two dimensions. Example: land parcels, cities, counties, forests, rock types, water bodies.
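
The three vector primitives can be sketched as plain coordinate lists. This is a minimal illustration, not the storage format of any particular GIS: the coordinates and the names `tree`, `stream`, `parcel`, `line_length`, `ring_is_closed` and `ring_area` are all hypothetical.

```python
import math

# Hypothetical coordinates in arbitrary map units, chosen only for illustration.
tree = (2.0, 3.0)                                  # point: one coordinate pair
stream = [(0.0, 0.0), (3.0, 4.0), (6.0, 4.0)]      # line: ordered vertices
parcel = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0),      # area: a ring that ends
          (0.0, 3.0), (0.0, 0.0)]                  # where it starts


def line_length(vertices):
    """Sum the lengths of the straight segments between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def ring_is_closed(ring):
    """A ring encloses an area only if it starts and ends at the same point."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def ring_area(ring):
    """Area of a closed ring by the shoelace formula (orientation ignored)."""
    if not ring_is_closed(ring):
        raise ValueError("ring must start and end at the same point")
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2.0
```

With these sample values, `line_length(stream)` gives 8.0 and `ring_area(parcel)` gives 12.0. The point has neither length nor area, which is what "zero-dimensional" means in practice.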

Spaghetti Model

• Spaghetti data are a collection of points and lines with no real connection.
• What appears as a long, continuous line on the map or in the terrain may consist of several line segments that are to be found in odd places in the data file.
• There are no specific points that designate where lines might cross, nor are there any details of logical relationships between objects.
• Polygons are represented by their circumscribing boundaries, so that common boundaries between adjacent polygons are registered twice.
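
The double registration of shared boundaries can be made visible with a small sketch. It assumes two adjacent unit squares stored spaghetti-style, each with its own complete ring; the data and the helper names `segments` and `duplicated_segments` are hypothetical.

```python
from collections import Counter

# Two adjacent unit squares, each stored with its own full boundary ring.
# Coordinates are hypothetical.
polygons = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1), (1, 0)],
}


def segments(ring):
    """Yield each boundary segment with its endpoints in a canonical order."""
    for a, b in zip(ring, ring[1:]):
        yield tuple(sorted((a, b)))


def duplicated_segments(polys):
    """Return the segments that are stored in more than one polygon ring."""
    counts = Counter(seg for ring in polys.values() for seg in segments(ring))
    return [seg for seg, n in counts.items() if n > 1]
```

Here `duplicated_segments(polygons)` returns the single edge from (1, 0) to (1, 1), which both squares store independently. Nothing in the data says the two copies are the same boundary; that relationship has to be rediscovered by comparing coordinates.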

Topology Model

• The topology model is one in which the connections and relationships between objects are described independently of their coordinates.
• Their topology remains fixed as the geometry is stretched and bent.
• The topology model overcomes the major weakness of the spaghetti model, which lacks the relationships requisite to many GIS manipulations and presentations.
• The topology model employs nodes and links.
• A node can be a point where two lines intersect, an endpoint on a line, or a given point on a line. For example, the intersection of two roads.
• A link is a segment of a line between two nodes. Links have a start node and an end node and therefore have a direction in a topology model. Several links can share a node, and a collection of such links and nodes is known as a network.

Representing Data with Raster Models

Raster Model

• The area is covered by a grid of (usually) equal-sized, square cells.
• The raster model represents reality through selected surfaces arranged in a regular pattern.
• Reality is thus generalized in terms of uniform, regular cells, which are usually rectangular or square but may be triangular or hexagonal.
• The raster model is in many ways a mathematical model, as represented by the regular cell pattern.
• Raster models are created by assigning real-world values to pixels. The assigned values comprise the attributes of the objects, and values are assigned to all the pixels in a raster.
• Attributes are recorded by assigning each cell a single value based on the majority feature (attribute) in the cell, such as land use type.
• Image data is a special case of raster data in which the “attribute” is a reflectance value from the electromagnetic spectrum; cells in image data are often called pixels (picture elements).
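
The majority rule for cell attributes can be sketched as follows. The land-use observations and the names `majority_value` and `rasterize` are hypothetical, and a tie is resolved in favour of the class seen first, which is only one of several possible conventions.

```python
from collections import Counter


def majority_value(samples):
    """Return the most common class among the samples falling in one cell.

    On a tie, the class that appears first in the samples wins.
    """
    return Counter(samples).most_common(1)[0][0]


def rasterize(sample_blocks):
    """Build a raster (a list of rows) from per-cell lists of observed classes."""
    return [[majority_value(cell) for cell in row] for row in sample_blocks]


# Hypothetical land-use observations: each inner list holds the classes
# observed inside one cell of a 2 x 2 raster.
observations = [
    [["forest", "forest", "water"], ["water", "water", "forest"]],
    [["urban", "forest", "urban"], ["urban", "urban", "urban"]],
]
grid = rasterize(observations)
```

Each cell ends up with exactly one value, so the minority classes inside a cell are lost. This is the generalization the raster model accepts in exchange for its regular structure.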

Concept of Vector and Raster

Figure: a real-world scene shown in raster representation (a grid with rows and columns numbered 0 to 9 and cells coded R, T and H) and in vector representation (point, line and polygon).

Vector and Raster

• Point. Vector: position, no area. Raster: 1 cell.
• Line. Vector: length, no width. Raster: multiple cells joined at edges or corners, usually with only 1 or 2 neighbors.
• Polygon. Vector: area and perimeter. Raster: group of contiguous cells joined at edges or corners.

Vector Structure – Advantages

• Good representation of the landscape being mapped.
• Topology (geometrical relationships between spatial objects, viz. points, lines, areas) can be completely described, including network linkages.
• Great-looking graphics.
• Generalization of graphics is possible while still maintaining the great look.

Raster Structure – Advantages

• Overlaying maps is easy and perfect.
• Integration of remotely sensed images is straightforward.
• A huge variety of complex spatial analyses are supported.
• Software is generally cheaper and easier to learn compared to vector GIS.

Terrain Surface Representation

• The digital representation of a terrain surface is called either a Digital Terrain Model (DTM) or a Digital Elevation Model (DEM).
• The terrain surface can be described as comprising two basically different kinds of elements: random and systematic.
• The random elements are the continuous surfaces with continuously varying relief.
• It would take an endless number of points to describe the random terrain shapes exactly, so in practice they are described with a network of points.

GRID model

• A systematic grid, or raster, of spot heights at fixed mutual spacing is often used to describe terrain.
• Elevation is assumed constant within each cell of the grid, i.e. the area represented by each cell is shown as a flat area in the model.
• Thus small cells detail terrain more accurately than large cells.
• The grid model is most suitable for describing random variations in terrain, whereas systematic linear structures can easily disappear or be deformed.
• Elevation values are stored in a matrix, and the contiguity between points is thus expressed through the column and row numbers.
• Different interpolation techniques are used to generate an elevation grid from source data such as points, contour lines, and break lines.
• When elevation data are organized in a grid structure, the matrix gives direct access to neighboring cells.
• Thus, interpolating the z value at a new x, y point is quite simple, based on linear interpolation.

GRID method: Digital terrain models describe the terrain numerically in the form of x, y and z coordinates. Graphic presentation can be either in the form of a grid or as a profile.
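
The linear interpolation mentioned for the grid model can be sketched as bilinear interpolation between the four surrounding spot heights. The sketch assumes the matrix values are heights at the grid nodes, with the origin at the first row and column and the query point inside the grid; `interpolate_z` and the sample `dem` are hypothetical.

```python
def interpolate_z(dem, cell_size, x, y):
    """Bilinear interpolation of elevation at (x, y).

    dem[row][col] is the spot height at x = col * cell_size, y = row * cell_size.
    The point is assumed to lie inside the grid.
    """
    col, row = x / cell_size, y / cell_size
    # Index of the lower-left node, clamped so the far edge uses a valid cell.
    c0 = min(int(col), len(dem[0]) - 2)
    r0 = min(int(row), len(dem) - 2)
    fx, fy = col - c0, row - r0
    z00, z01 = dem[r0][c0], dem[r0][c0 + 1]
    z10, z11 = dem[r0 + 1][c0], dem[r0 + 1][c0 + 1]
    # Interpolate along x on both rows, then along y between the two results.
    low = z00 + fx * (z01 - z00)
    high = z10 + fx * (z11 - z10)
    return low + fy * (high - low)


# Hypothetical 3 x 3 matrix of spot heights in meters, 10 m spacing.
dem = [
    [100.0, 110.0, 120.0],
    [105.0, 115.0, 125.0],
    [110.0, 120.0, 130.0],
]
```

For example, `interpolate_z(dem, 10.0, 5.0, 5.0)` gives 107.5, the mean of the four surrounding heights. The neighbors are found by index arithmetic alone, which is the "direct access" the matrix provides.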

TIN model

• An area model is an array of triangular areas with their corners stationed at selected points of most importance, for which the elevations are known.
• The inclination of the terrain is assumed to be constant within each triangle.
• The areas of the triangles may vary, with the smallest representing those areas in which the terrain varies most.
• The resulting model is called the Triangulated Irregular Network (TIN).
• To construct a TIN, all measured points are built into the model, and the model thus represents lines of fracture, single points, and random variations in the terrain.
• In the TIN model, the x-y-z coordinates of all points, as well as the triangle attributes of inclination and direction, are stored.
• The triangles are stored in a topological vector data storage structure comprising polygons and nodes, thereby preserving the triangles’ contiguity, which eases the calculation of z values for new points.

TIN method: The surface of the terrain can be described as inclined triangular areas, starting from points selected for their importance.
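
Because the inclination is constant within each triangle, the z value of a new point is simply its height on the plane through the three corners. A minimal sketch using barycentric weights follows; the function name and the sample triangle are hypothetical, and locating which triangle contains the point is left out.

```python
def interpolate_in_triangle(p1, p2, p3, x, y, tol=1e-12):
    """Elevation at (x, y) on the plane through three (x, y, z) corner points."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = p1, p2, p3
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("corner points are collinear")
    # Barycentric weights: each is 1 at its own corner, 0 on the opposite edge.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < -tol:
        raise ValueError("point lies outside the triangle")
    return w1 * z1 + w2 * z2 + w3 * z3


# Hypothetical triangle with known corner elevations in meters.
corner_a = (0.0, 0.0, 100.0)
corner_b = (10.0, 0.0, 110.0)
corner_c = (0.0, 10.0, 120.0)
```

For this triangle the plane is z = 100 + x + 2y, so the point (2, 2) interpolates to approximately 106 m.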

Geo-referencing Data

• Geo-referencing is defined as positioning objects in either two- or three-dimensional space.
• Capturing data
– Scanning: all of the map is converted into raster data
– Digitising: individual features are selected from the map as points, lines or polygons
• Geo-referencing
– Initial scanning or digitising gives co-ordinates in inches from the bottom left corner of the digitiser/scanner
– Real-world co-ordinates are found for four registration points on the captured data
– These are used to convert the entire map onto a real-world co-ordinate system

Continuous geo-referencing systems

• Continuous geo-referencing implies continuous measurement of the position of phenomena in relation to a reference point, with no abrupt changes or breaks.
• It involves resolution (the smallest increment that a digitizer can represent) and precision (a measure of dispersion, e.g. standard deviation). Accuracy is the extent to which an estimated value approaches the true value.
• Many geographical phenomena, including land boundaries, manhole locations, building details, and many map details, are measured on a continuous basis.
• Continuous systems include direct geo-referencing, which involves a datum, coordinate systems, and map projections.

Discrete geo-referencing systems

• In discrete geo-referencing systems the positions of phenomena are measured relative to fixed, limited units of the surface of the Earth.
• This method is also known as spatial geo-referencing by geographical identifiers.
• Typical area reference units include: address and street codes; postal codes and area names; property units; administrative zones and statistical units; regular grids and map sheets.

Example of Geo-referencing
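
As one possible illustration, the conversion from digitiser inches to real-world co-ordinates using four registration points can be sketched as a least-squares affine fit. The notes do not say which transformation is used, so the affine model is an assumption, and the function names and sample co-ordinates are hypothetical.

```python
def solve3(m, v):
    """Solve the 3x3 linear system m * u = v by Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    if d == 0:
        raise ValueError("registration points are collinear")
    out = []
    for k in range(3):
        # Replace column k of m with v and take the determinant ratio.
        mk = [[v[i] if j == k else m[i][j] for j in range(3)] for i in range(3)]
        out.append(det(mk) / d)
    return out


def fit_affine(digitizer_pts, world_pts):
    """Least-squares fit of X = a*x + b*y + c and Y = d*x + e*y + f."""
    rows = [(x, y, 1.0) for x, y in digitizer_pts]
    # Normal equations (A^T A) u = A^T t, shared by both target co-ordinates.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for axis in (0, 1):
        atb = [sum(r[i] * w[axis] for r, w in zip(rows, world_pts))
               for i in range(3)]
        coeffs.append(solve3(ata, atb))
    return coeffs


def to_world(coeffs, x, y):
    """Apply the fitted transformation to one digitiser co-ordinate."""
    (a, b, c), (d, e, f) = coeffs
    return a * x + b * y + c, d * x + e * y + f


# Hypothetical registration points: map corners in inches and in meters.
inches = [(0.0, 0.0), (10.0, 0.0), (10.0, 8.0), (0.0, 8.0)]
meters = [(500000.0, 4000000.0), (501000.0, 4000000.0),
          (501000.0, 4000800.0), (500000.0, 4000800.0)]
coeffs = fit_affine(inches, meters)
```

Once `coeffs` is known, every digitised vertex can be passed through `to_world`. With four points and six unknowns the fit is over-determined, so a poorly measured registration point shows up as a residual rather than silently distorting the map.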

Geographic Coordinate Systems

• A geographic coordinate system is a reference system that uses a three-dimensional spherical surface to determine locations on the earth.
• Any location on earth can be referenced by a point with longitude and latitude coordinates.

For example, the figure shows a geographic coordinate system where a location is represented by the coordinates longitude 80 degrees East and latitude 55 degrees North. The lines that run east and west each have a constant latitude value and are called parallels. They are equidistant and parallel to one another, and form concentric circles around the earth. The equator is the largest circle and divides the earth in half. It is equal in distance from each of the poles, and the value of this latitude line is zero. Locations north of the equator have positive latitudes that range from 0 to +90 degrees, while locations south of the equator have negative latitudes that range from 0 to -90 degrees.

The lines that run north and south each have a constant longitude value and are called meridians. They form circles of the same size around the earth, and intersect at the poles. The prime meridian is the line of longitude that defines the origin (zero degrees) for longitude coordinates. One of the most commonly used prime meridian locations is the line that passes through Greenwich, England. Locations east of the prime meridian up to its antipodal meridian (the continuation of the prime meridian on the other side of the globe) have positive longitudes ranging from 0 to +180 degrees. Locations west of the prime meridian have negative longitudes ranging from 0 to -180 degrees.

• The latitude and longitude lines can cover the globe to form a grid, called a graticule.
• The point of origin of the graticule is (0,0), where the equator and the prime meridian intersect.
• The equator is the only place on the graticule where the linear distance corresponding to one degree of latitude is approximately equal to the distance corresponding to one degree of longitude.
• Because the longitude lines converge at the poles, the distance between two meridians is different at every parallel.
• Therefore, as you move closer to the poles, the distance corresponding to one degree of latitude will be much greater than that corresponding to one degree of longitude.
• It is also difficult to determine the lengths of the latitude lines using the graticule. The latitude lines are concentric circles that become smaller near the poles, and they shrink to a single point at the poles, where the meridians begin.
• At the equator, one degree of longitude is approximately 111.321 kilometers, while at 60 degrees of latitude, one degree of longitude is only 55.802 km (this approximation is based on the Clarke 1866 spheroid).
• Therefore, because there is no uniform length of degrees of latitude and longitude, the distance between points cannot be measured accurately by using angular units of measure.

UTM: Universal Transverse Mercator Coordinate System
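
The varying length of a degree is the practical reason for working in a metric grid such as UTM. As a rough check on the degree lengths quoted above for the graticule, the length of one degree of longitude can be computed from the spheroid’s semi-axes. The sketch assumes the commonly published Clarke 1866 semi-axis values and the standard formula for the radius of a parallel on an ellipsoid; the constant and function names are illustrative.

```python
import math

# Clarke 1866 spheroid semi-axes in meters (commonly published values).
CLARKE_1866_A = 6378206.4   # semi-major (equatorial) axis
CLARKE_1866_B = 6356583.8   # semi-minor (polar) axis


def longitude_degree_km(lat_deg, a=CLARKE_1866_A, b=CLARKE_1866_B):
    """Length in km of one degree of longitude along the parallel at lat_deg."""
    e2 = 1.0 - (b * b) / (a * a)      # first eccentricity squared
    phi = math.radians(lat_deg)
    # Radius of the circle of latitude on the spheroid.
    radius = a * math.cos(phi) / math.sqrt(1.0 - e2 * math.sin(phi) ** 2)
    return math.radians(1.0) * radius / 1000.0
```

Evaluated at 0 and 60 degrees, this reproduces the quoted 111.321 km and 55.802 km to within a few meters. On a perfect sphere the value at 60 degrees would be exactly half the equatorial value; the small excess comes from the flattening of the spheroid.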

• UTM is a grid-based method of specifying locations on the surface of the Earth.
• It is used to identify locations on the earth, but differs from the traditional method of latitude and longitude in several respects.
• The UTM coordinate system is commonly used in GIS for larger-scale maps of areas that fall within a single UTM zone.
• The UTM projection is formed by using a transverse cylindrical projection, i.e., the standard line runs along a meridian of longitude.
• The effect is to minimize distortion in a narrow strip running pole to pole.
• UTM divides the earth into pole-to-pole zones 6 degrees of longitude wide.
• The first zone starts at the 180 degree meridian (the International Date Line) and runs east to 174 degrees West; the last zone, 60, starts at 174 degrees East.
• Northings are determined separately for the areas north and south of the equator.
• Because distortion becomes extreme at high latitudes, UTM is not normally used north of 84 degrees North or south of 80 degrees South.
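
The zone layout follows directly from the 6-degree width. A minimal sketch, with illustrative function names; it ignores the special zone boundaries used around Norway and Svalbard.

```python
import math


def utm_zone(lon_deg):
    """UTM zone number (1 to 60) for a longitude in degrees, east positive."""
    if not -180.0 <= lon_deg <= 180.0:
        raise ValueError("longitude must lie between -180 and 180 degrees")
    # Zones are 6 degrees wide and zone 1 begins at 180 degrees West.
    # A longitude of exactly +180 is folded into zone 60.
    return min(int(math.floor((lon_deg + 180.0) / 6.0)) + 1, 60)


def central_meridian(zone):
    """Longitude in degrees of the central meridian of a UTM zone."""
    return -183.0 + 6.0 * zone
```

For example, longitude 0 falls in zone 31, and zone 12 has its central meridian at 111 degrees West.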

Easting and Northing in UTM Grid

• Easting is a measure of how far east or west the location is from a reference longitude.
• Northing is a measure of how far north or south the location is from the equator.
• Any point can then be described by its distance east of the origin (its ‘easting’ value).
• By definition the central meridian is assigned a false easting of 500,000 meters.
• Any easting value greater than 500,000 meters indicates a point east of the central meridian.
• Any easting value less than 500,000 meters indicates a point west of the central meridian.
• Distances (and locations) in the UTM system are measured in meters, and each UTM zone has its own origin for east-west measurements.
• To eliminate the necessity for using negative numbers to describe a location, the east-west origin is placed 500,000 meters west of the central meridian.
• This is referred to as the zone’s ‘false origin’. The zone doesn’t extend all the way to the false origin.

• The origin for north-south values depends on whether you are in the northern or southern hemisphere.
• In the northern hemisphere, the origin is the equator and all distances north (or ‘northings’) are measured from the equator.
• In the southern hemisphere the equator is assigned a false northing of 10,000,000 meters, so the origin lies 10,000,000 meters south of the equator (close to the south pole) and northings increase northward from there.
• Once again, having separate origins for the northern and southern hemispheres eliminates the need for any negative values.
• The average circumference of the earth is 40,030,173 meters, meaning that there are about 10,007,543 meters of northing in each hemisphere.
• UTM coordinates are typically given with the zone first, then the easting, then the northing.
• For example, in UTM coordinates, Red Hill is located in zone twelve at 328204 E (easting), 4746040 N (northing).
• Based on this, you know that you are west of the central meridian in zone twelve and just under halfway between the equator and the north pole.
• The UTM system may seem a bit confusing at first, mostly because many people have never heard of it, let alone used it.
• Once you’ve used it for a little while, however, it becomes an extremely fast and efficient means of finding exact locations and approximating locations on a map.
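
The reading of the Red Hill coordinate can be reproduced with a few lines of arithmetic. The sketch uses the false easting and the meters-of-northing figure quoted above and applies to northern-hemisphere coordinates only; `describe_utm` and its dictionary keys are hypothetical names.

```python
FALSE_EASTING_M = 500_000               # value assigned to the central meridian
QUARTER_CIRCUMFERENCE_M = 10_007_543    # meters of northing per hemisphere, as quoted


def describe_utm(zone, easting, northing):
    """Summarize where a northern-hemisphere UTM coordinate lies in its zone."""
    offset = easting - FALSE_EASTING_M
    side = "east" if offset > 0 else "west" if offset < 0 else "on"
    return {
        "zone": zone,
        "offset_from_central_meridian_m": abs(offset),
        "side_of_central_meridian": side,
        "fraction_equator_to_pole": northing / QUARTER_CIRCUMFERENCE_M,
    }


red_hill = describe_utm(12, 328204, 4746040)
```

The result places Red Hill 171,796 meters west of the central meridian of zone twelve and about 0.474 of the way from the equator to the north pole, which matches "just under halfway".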

Layers

• Data on different themes are stored in separate “layers”.
• As each layer is geo-referenced, layers from different sources can easily be integrated using location.
• This can be used to build up complex models of the real world from widely disparate sources.

MAP - Key Terms

• Map: a flat representation of a globe
• Cartography: the art and science of mapmaking

A globe cannot be represented on a flat map without distortion. The cartographer must choose the characteristic that is to be shown accurately at the expense of others, or a compromise of several characteristics. There is literally an infinite number of ways in which this can be done, and several hundred projections have been published, most of which are rarely used.

Maps : Key Properties

Shape, Area (size), Distance, Direction

• Conformal Map: shows correct shape
• Equal Area Map: shows correct area
• Rhumb Line: a line of constant compass direction
• Great Circle: the largest circle that can be drawn on a globe. It divides the globe into two halves, and the shortest path between any two points on the globe follows the great circle through them.
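
The great-circle distance between two points can be sketched with the haversine formula. This assumes a spherical earth with a mean radius of 6371 km, which is an approximation; the function and constant names are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean radius of a spherical earth (assumed)


def great_circle_km(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Shortest distance over a sphere between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2)
    # Guard against rounding pushing the square root just above 1.
    return 2 * radius * math.asin(min(1.0, math.sqrt(h)))
```

A quarter of the way around the equator, `great_circle_km(0, 0, 0, 90)`, comes to about 10,007.5 km, as does the distance from the equator to a pole. A rhumb line between the same two points is generally longer unless both lie on the equator or on the same meridian.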

Map Projections

Basic Principle

• Although for many mapping applications the earth can be assumed to be a perfect sphere, the distance around the earth through the poles differs from the distance around the equator.
• The circumference of the earth is about 1/300th smaller around the poles.
• This type of figure is termed an oblate ellipsoid or spheroid, and is the three-dimensional shape obtained by rotating an ellipse about its shorter axis.
• An estimate of the earth’s surface based on an ellipsoid provides a determination of the elevation of every point on the earth’s surface, including sea level, and is often called a datum.

Map projections are used to transfer or “project” geographical coordinates onto a flat surface. The easiest way to try to transfer the information onto a flat surface is to convert the geographic coordinates into an X and Y coordinate system, where x is longitude and y is latitude. This is an example of “projecting” onto a plane.
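
The simple x-from-longitude, y-from-latitude mapping described here is usually called the plate carrée (equirectangular) projection. A minimal sketch, assuming a spherical earth of radius 6,371,000 m; the function names are illustrative.

```python
import math

EARTH_RADIUS_M = 6_371_000.0   # spherical earth radius (assumed)


def plate_carree(lon_deg, lat_deg, radius=EARTH_RADIUS_M):
    """Project longitude/latitude to planar x, y in meters.

    x is proportional to longitude and y to latitude.
    """
    return radius * math.radians(lon_deg), radius * math.radians(lat_deg)


def scale_east_west(lat_deg):
    """Factor by which east-west distances are stretched at a given latitude."""
    return 1.0 / math.cos(math.radians(lat_deg))
```

The stretch factor shows the cost of this simplicity: at 60 degrees latitude, `scale_east_west(60)` is about 2, so east-west distances on the map are drawn roughly twice as long as they are on the ground.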

• Projection: the system used to transfer locations from Earth’s surface to a flat map.
– A projection of an image onto another surface: either a cylinder, a flat plane or a cone
– 3 basic types of projections: cylindrical, conical, azimuthal

Cylindrical Projection

A cylindrical projection usually places the earth inside a cylinder with the equator tangent or secant to the inside of the cylinder.

• The cylindrical projection is used by navigators to show direction, and for meteorological charts.
• Meridians run north and south; parallels run east and west.
• Cylindrical maps such as the Mercator projection are conformal maps: they conform to the correct shape.
• Examples of conformal projections are the Mercator projection and the Lambert Conformal Conic projection. The U.S. Geological Survey uses a conformal projection for many of its topographic maps.

Conical Projection

• In a conic projection, a cone is placed over the earth, normally tangent along one line of latitude or secant along two.
• Conical projections are used for mid-latitude maps. Some conic projections, such as the Albers projection, are equal-area maps.

Azimuthal Projection

• An azimuthal or planar projection is usually tangent to a specific point on earth’s surface, but may also be secant. This point, or focus, may be a pole, a point on the equator, or any other (oblique) point.
• The azimuthal projection is often used for polar charts, because distortion increases with distance from the point of tangency.

• Advantages of GIS
– Exploring both geographical and thematic components of data in a holistic way
– Stresses geographical aspects of a research question
– Allows handling and exploration of large volumes of data
– Allows integration of data from widely disparate sources
– Allows analysis of data to explicitly incorporate location
– Allows a wide variety of forms of visualisation
• Limitations of GIS
– Data are expensive
– Learning curve on GIS software can be long
– Shows spatial relationships but does not provide absolute solutions
– Origins in the Earth sciences and computer science; solutions may not be appropriate for humanities research