The Design and Implementation of

a Web-based GIS for Political Redistricting

Thesis

Presented in Partial Fulfillment of the Requirements for the Degree Master of Arts in the Graduate School of The Ohio State University

By

Wei Chen

Graduate Program in Geography

The Ohio State University

2009

Thesis Committee:

Prof. Ningchuan Xiao, Co-Advisor

Prof. Mei-Po Kwan, Co-Advisor

Prof. Daniel Sui

Copyright by

Wei Chen

2009

Abstract

The World Wide Web (WWW) has dramatically changed the way we produce, use and consume information, especially geospatial information, in recent years. Web-based GIS (Geographic Information Systems) are designed to provide Web users with analytical tools that assist their spatial decision-making processes. With advantages such as platform independence, customizability and cost effectiveness, Open Source Geospatial (OSGEO) software has been increasingly adopted to develop Web-based GIS applications. In addition, the increased availability of spatial functionality in OSGEO software has opened many possibilities for implementing a more powerful, interactive and collaborative Web-based GIS platform, often referred to as the GeoWeb.

However, compared with proprietary systems, current open source online GIS systems have several limitations. For example, most of them provide neither a customizable web mapping service nor a spatial data processing service, although both types of services are essential for effectively filtering spatial information and exploring areas of interest.

This research introduces a framework for implementing a Web-based GIS using open source software, including PostgreSQL/PostGIS, MapServer, and OpenLayers. On the server side, PostgreSQL/PostGIS is used to store and process spatial data. MapServer is adopted to provide a Web Map Service (WMS). The server-side scripting language PHP is employed to dynamically generate map files from PostGIS for MapServer to render. On the client side, OpenLayers provides the programming interface to incorporate layers from different data sources into the same DOM container. A Web-based GIS for political redistricting has been developed as an example to demonstrate both the merits and the demerits of adopting this framework.

Initial results of the demonstration show that the integration of PostGIS, MapServer and PHP can facilitate query-based map generation and make the mapping of massive spatial data efficient. A query-based web mapping service is capable of dynamically generating map and legend images. The spatial data handling functions in PostGIS are suitable for developing interactive functions for querying, measuring and processing spatial data.

Users can use the implemented Web-based political redistricting GIS to explore census data, devise and evaluate new plans, and compare different plans. This framework, based on an open environment, can be adapted to applications with similar requirements. The application implemented in this research can be accessed at gis.osu.edu/redistricting.

Key Words: Web-based GIS, Political Redistricting, Open Source Geospatial (OSGEO), Public Participation

Dedication

I dedicate this thesis to my parents. Without their unconditional love and consistent support, the completion of this thesis would not have been possible.

Acknowledgments

I would like to acknowledge and express my sincerest gratitude to the following persons who have given me help during the completion of this Master’s thesis:

Dr. Mei-Po Kwan, my co-advisor, for her care for my life and professional planning at the beginning of my study, her mentorship in my professional development, her generous help in revising a competition paper and the previous version of the thesis, and her sincere suggestions on developing the research questions and the final version of the thesis.

Dr. Ningchuan Xiao, my co-advisor, for his help in shaping the thesis topic and the structure of the writing, his constructive suggestions and guidance on the design and implementation of the system, his generous help with the revision and formatting of the document, and his kindness in reminding me of graduation issues.

Mr. Louis So, my supervisor at Asset Strategies Group, for his guidance on user needs analysis and his suggestions about how to make technology useful to the industry.

My friends, Quji Ma, Yuan Gao and Dingmou Li, for their suggestions on many technical issues.

My American friend Dennis Shimer and department colleagues Shanshan Cai, Shiguo Jiang and Lili Wang, for their support and wishes.

All other Department of Geography faculty members, especially Daniel Sui, for their comments and suggestions.

Department of Geography staff, especially Diane, Stephanie and Maggie, for their help in setting up the dependence room.

Most especially, my Mom, my Dad and my wife, for their constant support and encouragement.

Vita

Jul. 2007...... B.S. Geography, Nanjing Normal University

Sept. 2007 to Dec. 2008...... Graduate Teaching Associate, Department of Geography, The Ohio State University

Mar. 2009 to Aug. 2009...... Graduate Research Associate, Department of Geography, The Ohio State University

Fields of Study

Major Field: Geography

Table of Contents

Abstract ...... ii

Dedication ...... iv

Acknowledgments...... v

Vita ...... vii

Table of Contents ...... viii

List of Tables ...... x

List of Figures ...... xi

CHAPTER 1: Introduction ...... 1

CHAPTER 2: Literature Review of Web-based GIS ...... 6

CHAPTER 3: The Framework of Implementing Web-based GIS for Political Redistricting ...... 15

CHAPTER 4: Database Design ...... 25

CHAPTER 5: User Interface ...... 42

CHAPTER 6: Implementation ...... 47

CHAPTER 7: Results ...... 66

CHAPTER 8: Conclusions ...... 76

REFERENCE ...... 81

Appendix A: SQL Scripts for Creating Postgresql Database Tables ...... 87

Appendix B: SQL Scripts for Creating Database Views ...... 96

Appendix C: Sample Map File ...... 98

Appendix D: User’s Manual ...... 101

Appendix E: 1990s Districting Principles Used by Each State ...... 103

List of Tables

Table 1. TIGER/Line® Shapefiles attribute table structure ...... 26

Table 2. Number of vertex of different levels of census geography objects ...... 33

Table 3. Summary level and part flag for census units ...... 37

Table 4. Traditional districting principles ...... 62

Table 5. Redistricting Principles by States ...... 103

List of Figures

Figure 1. Comparison between classic web application model and Ajax based application model...... 21

Figure 2. Framework of implementing Web-based GIS using OSGEO software ...... 24

Figure 3. Referential relationships diagram between database tables ...... 39

Figure 4. Main interface of Web-based redistricting GIS ...... 42

Figure 5. Exploring census interface ...... 44

Figure 6. Redistricting interface ...... 45

Figure 7. Evaluation interface ...... 46

Figure 8. Point based choropleth map of population by county ...... 67

Figure 9. Polygon based choropleth map of population by county ...... 68

Figure 10. User drawn feature of defining a query ...... 70

Figure 11. Blocks returned by query using user drawn features...... 70

Figure 12. Draw feature to select census units ...... 72

Figure 13. Selected units shown as markers ...... 72

Figure 14. Assign selected units to a district ...... 73

Figure 15. Merge the boundaries of all units in the same district ...... 73

Figure 16. Evaluation of districts ...... 75

Figure 17. Evaluation of 110th congressional districts ...... 75

CHAPTER 1: Introduction

Political redistricting can be defined as the process of devising political district boundaries. The result of this boundary redrawing must satisfy both constitutional and geographic constraints, which makes political redistricting a complicated issue. Williams (1995) described this process as “one of dividing an area into relatively equally populated districts which are compact and contiguous, while preserving existing political and community ties.” In this definition, population equality is the mandatory constitutional constraint. Compactness and contiguity are two geographic constraints that states usually adopt to check for gerrymandering. Preservation of communities and of political ties are two political principles for redistricting. Before GIS was introduced to redistricting work, this multi-criteria decision-making process was very labor intensive. The use of GIS changed the redistricting process by providing effective tools to manipulate census data and draw district boundaries in a computer-aided system (Eagles 2000).

The introduction of GIS does not reduce the complexity of the redistricting process. On the contrary, to a certain degree it makes the issue more complex. On the one hand, it becomes more difficult to choose the “best” plan among multiple competing plans, since plans composed of distinctly different districts can have very similar statistics. On the other hand, redistricting authorities, who hold a technical advantage over ordinary people, could devise visually attractive district maps to disguise, for example, partisan protection that can hardly be detected by viewing a map.

“Communities of interest” have long been regarded as an important consideration in redistricting. The Supreme Court considers them appropriate political units to use in redistricting, but it has seldom defined these entities or described their characteristics explicitly (Forest 2004). A community can be described as “a social network of interacting individuals, usually concentrated in a defined territory” (Johnston, Gregory and Smith 1994). Stein (1960), on the other hand, defined community as “an organized system standing in a determinate relation to its environment which has a local basis but not necessarily a rigid boundary”. Definitions like these reveal the fuzzy geographic implications of community, which make it challenging to use communities as building blocks for defining political districts. However, this does not mean that geographic concepts cannot be employed to define communities of interest. In fact, citizens vote, in part, according to their identification with interest groups defined by, for example, religious values, occupation, class, or rural or urban orientation (Morrill 1981). These community concepts have counterparts in regional concepts such as religious regions (Zelinsky 1961; Kong 1990), economic regions and trade areas (Krugman 1991), and urban and suburban areas (Carter 1995; Pacione 2005). Regional experts and geographers with related interests have devoted significant attention to defining and analyzing these regions. One way to use this work is to incorporate the findings of regional experts into the redistricting process. It can be expected that the introduction of expert opinion can counteract the dehumanizing trend of treating political redistricting as a purely computerized process.

To effectively integrate expert opinions as qualitative or quantitative inputs to the political redistricting process, a participatory platform needs to be set up. Each legitimate voter, expert or non-expert, needs to be provided with effective tools to generate his or her own plan in an independent online environment. At a minimum, they should be able to evaluate a plan. A plan from a redistricting agency, whether an individual or an organization, should be open to evaluation and comparison with other plans. Only through this participatory work can we challenge each plan and select the “fairest” one. The results of the 2009 Ohio redistricting competition provide some concrete evidence that states like Ohio can rely on an open process based on objective criteria to produce fair legislative districts (Ohio Redistricting Competition 2009). An interesting finding from this competition is that even the worst-scoring plan submitted was quantitatively fairer than the actual 2000 redistricting plan, which is characterized by partisan lean, low community preservation and low compactness. This competition demonstrated that the public can contribute their talents to devising new districts that are much fairer for both voters and candidates. As Illinois State Representative Mike Fortner, one of the competition winners, suggested, “it is possible to have an open process to bring in a number of outside groups to participate in redistricting. A public process like this can help improve confidence in the political system.” (Fortner 2009)

Web-based GIS can promote citizens’ political engagement in several dimensions. The first effect takes place at the representation level: political information such as congressional districts and voting districts can be represented and made highly accessible. The second effect is at the evaluation level. Because detailed information about political policies can be visualized and situated in a spatial context, citizens can evaluate the influence of a redistricting plan on a spatial basis. The knowledge gained through this process can be used as first-hand evidence of whether a plan is good or not. The last effect is at the participation level. Citizens could either support or challenge existing plans with the evidence they find, and they could share their findings and synthesize their opinions before giving feedback to their representatives to improve the plan.

This thesis discusses the design and implementation of a Web-based GIS for political redistricting using Open Source Geospatial Foundation (OSGEO) technologies. This research takes advantage of open source geospatial software, including PostgreSQL/PostGIS, MapServer and OpenLayers, to implement an online redistricting platform. A framework for building similar applications is also included. Initial results show that the integration of PostGIS, MapServer and PHP can help users retrieve useful information and facilitate query-based mapping of massive spatial data. The spatial data processing capabilities of PostGIS make it possible to implement functions such as spatial query, geometry measurement and geometry union. Users can use this Web-based political redistricting GIS to explore census data, devise new redistricting plans, evaluate a new plan and compare different plans. The framework can be extended to include other web service components and can be applied to applications with similar requirements.

CHAPTER 2: Literature Review of Web-based GIS

2.1 Web-based GIS application

Depending on the objectives for which they are implemented, most Web-based GIS applications can be classified into two categories. One category is Web-based Public Participatory GIS (PPGIS), which employs a “bottom-up” approach (Talen 1999). The bottom-up approach provides residents with GIS tools to communicate their perception and evaluation of their neighborhood. Their intellectual input contributes to the final top-level decision making. It aims to empower the grassroots by providing more accessible public information and opening channels for response. The other category is Web-based GIS systems designed specifically for government agencies to assist routine work associated with spatial decision-making. These applications use a “top-down” approach in which data is manipulated and analysis is conducted by technical experts. Citizens do not participate in the process until the final decision is made.

PPGIS is also referred to in some literature as Community GIS (Harris and Weiner 1998, 2002; Weiner and Harris 2002, 2003), bottom-up GIS (Talen 1999, 2000) or GIS/2 (Sieber 2004; Miller 2006), with the similar objective of extending GIS capabilities to benefit much broader social groups, especially those who have been historically under-represented in public policy making (Obermeyer 1998). The benefits of using GIS as an online tool were first revealed by PPGIS research. Barndt (2002), for example, argues that communities have limited access to comprehensive information, and that support in the use of online GIS tools is therefore important. Carver and Evans et al. (2001) discussed many benefits of online participatory systems, such as no location restriction for planning meetings, high accessibility of information, and anonymous participation. Kingston (2000) revealed another advantage of a Web-based system, namely the dynamic updating of the database: data can be updated while the public uses the system and as soon as new information becomes available. However, these systems often lack decision-making tools, and there are not many fully implemented examples in the literature.

Web-based GIS applications using the top-down approach, on the other hand, are more comprehensive in terms of functionality. Many of the implementations in this category have advantages over PPGIS in that (1) consistent financial support is available from the benefiting agencies during the whole life cycle of the system, (2) users of the system are relatively well trained and have better GIS knowledge, and (3) more sophisticated GIS functionality can be added to facilitate the decision-making process. Many of these applications have emerged in recent years as new web technologies have become widely adopted. Boulos and colleagues published a series of research papers, “Web-based GIS in practice”, in the area of public health (Boulos, Russell, and Smith 2005; Boulos 2005; Boulos and Honda 2006). Those online projects aim to improve people’s access to information about the distribution of recorded transmitted diseases and ratings of the performance of Primary Care in London. They also aim to increase public trust in the government’s public health system. Choi (2005) revealed the benefits of using the open source software MapServer to implement a Spatial Decision Support System (SDSS) that can be helpful for watershed management decision-makers. Lu (2005) explored the potential of implementing a browser-independent health emergency Web-based GIS to visualize health surveillance data. The Chicago Police Department uses ArcIMS to implement a Web-based crime mapping system that gives citizens a sense of crime distribution in the city (CDP 2005). Web-based applications like these indicate that more sophisticated functions, if implemented effectively and efficiently, could greatly improve the depth and extent of user participation and therefore benefit the online democratic process at large.

2.2 Limitations of current Web-based GIS applications

Despite the praise for the role of Web-based GIS in sharing geographic information and technically enabling the disadvantaged, the limitations of current implementations are also significant.

The first limitation is the relative lack of effective supporting tools. Kwan (2002) implied that it is necessary to conserve strategies of using GIS in order to assist activist groups, especially women’s activist groups, in scaling their participation up to a higher level of politics. The overall effectiveness of this democratic enfranchisement through online GIS largely depends on what type of support the system can provide, and whether it can be provided effectively (Muller et al. 1997). The second limitation lies in the representation capability of available tools. The system should help users understand the knowledge held in spatial sources and provide an effective user interface to facilitate this (Haklay and Tobón 2003). For example, query and classification tools are useful to users who want to filter specific information. Effective filtering functions are essential to enhancing the usability of available data. The third issue is how to make GIS more user-friendly. Xiao, Ahlqvist and Kwan (2007) argued that the lack of the “public” in PPGIS lies in the esoteric features of available GIS packages, the requirement for intensive technical training, and the nontrivial financial resources needed to develop a system. The last issue is how to make the best use of increasingly available data. More data is being published by government agencies, and Volunteered Geographic Information (VGI) is also accumulating on the Web. The popularity of VGI indicates that collective intelligence in cyberspace is valued. This collective intelligence can greatly enrich the resources of a Web application (Goodchild 2007). Revisiting the implications of GIS in the Web 2.0 context, Sui (2008) argues that the traditional four components of GIS (hardware, software, data and people) are all influenced by the growing amount of volunteered work. This volunteered work, in his words, opens an area for the “Wikification of GIS.” Moreover, the scope of volunteering can go beyond entering an article in Wikipedia or uploading a data file to OpenStreetMap: Volunteered Geographic Information can turn into Volunteered Geographic Knowledge (VGK) through users’ understanding, analysis and processing of information. However, this transformation will not be easy without appropriate design and implementation of Web-based GIS.

2.3 Technical advances for Web-based GIS implementations

Recent advances in geospatial web technologies open up opportunities to overcome the above limitations of Web-based GIS and therefore improve the overall usability of such systems. Discussion of the benefits of these techniques intensified around the time Tim O’Reilly gave his milestone Web 2.0 speech (O’Reilly 2005). Web 2.0 emphasized network effects, the Web’s role as a platform, data as “the Next Intel Inside” and lightweight Web development. Web-based GIS development is also influenced by this trend, in that Web-based GIS applications can be made more interactive and efficient and can be implemented in a lightweight way. Recent Web-based GIS research benefits from Web technologies in two main respects: Ajax and open source techniques.

Ajax stands for Asynchronous JavaScript and XML. It is an approach that combines several related Web technologies. The asynchronous feature of Ajax standardizes and simplifies the communication between client and server. Its use makes a Web application more responsive and efficient, and easier to implement as well. The overall performance of a Web application can also be improved by using Ajax, since only the affected DOM object, rather than the whole page, is updated. Cha (2007) said that Ajax could take Web-based GIS visualization applications to new levels of power and usability with significantly enhanced performance.

Open source software is characterized by low cost, platform independence and a wide user base. Open source techniques have been increasingly discussed in recent literature. Yi, Hoskins and Hillringhouse (2008) use open source packages such as PostgreSQL, Google Maps and to implement a Web-based GIS interface. The interface empowers public health officials by offering them spatial and temporal visualization techniques to disseminate public health data. Sui (2008) contended that the elaborate protocols and standards established by the Open Geospatial Consortium (OGC) and the Open Source Geospatial Foundation (OSGEO) would facilitate the wikification of GISystems, which will benefit more user groups. Caldeweyher, Zhang, and Pham (2006) discussed the merits of using MapServer/MapScript to integrate classification methods into Web-based GIS systems.

Most open source software provides a programmable interface, or API, through which its functions can be accessed. For example, the Google Maps API provides developer interfaces to Google’s street maps and satellite images as well as to geospatial services such as geocoding and routing. Compared with the Google Maps API, OpenLayers is more powerful for implementing client-side GIS functionality. Its well-developed API library allows developers to add feature-editing tools that create geometries on the map. Its ability to integrate different data sources greatly increases the possibility of creating maps with enriched content.

2.4 Web-based GIS for political redistricting

Computers were first used for redistricting in the 1960s; however, it was not until the 1990s that all 50 states adopted computers to perform redistricting (Altman, MacDonald, and McDonald 2005). According to their survey of public engagement in the redistricting process, in 2002, 18 of the 50 states provided the public with terminals to create and submit their own plans. Of these 18 states, however, only four provided users with free desktop packages, and only Wisconsin provided an interface through a web browser, although its functionality for creating district plans no longer exists.

The Web has an incomparable advantage in disseminating information. Many states established redistricting web sites (McDonald 2000) to provide information such as the redistricting process, legal information, data, and existing district maps. However, public use of these websites remains at a relatively low level, because they are mainly designed to present information rather than to empower users with tools to evaluate a policy or even create their own alternatives. Therefore, a more capable website, such as a Web-based redistricting GIS, is needed to raise the participation of the general public in policy making to a much higher level.

In order to support plan evaluation and creation, a Web-based redistricting GIS is supposed to include the essential components provided by desktop redistricting GIS packages. The capabilities of current redistricting GIS packages can be summarized in three main categories (Handley 2000; Altman, MacDonald, and McDonald 2005):

1. Tabulation: a statistical table for checking various redistricting criteria. For constitutional criteria, the table should include statistics on district population and population deviation. For geographic criteria, the table should include contiguity and compactness checks. For political criteria, the table includes measures such as partisan registration or minority population.

2. Thematic mapping: a classified choropleth map is needed to identify areas with concentrations of total population, minority populations, partisanship, communities, and so on.

3. Automated redistricting: a number of packages offer the ability to automatically draw district lines based on selected criteria. However, this automation can be very limited in practical use. On the one hand, most packages can only optimize a single criterion rather than balance competing criteria. On the other hand, even for a single criterion the computing performance of the system is still far from satisfactory for practical use. In our research, we only implement functionality in the first two categories.
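
The tabulation checks in the first category can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the implementation used in this research: the function names are hypothetical, and the Polsby-Popper ratio is assumed as the compactness measure because the list above does not name one.

```python
import math
from collections import deque


def population_deviation(district_pops):
    """Percent deviation of each district from the ideal (mean) population."""
    ideal = sum(district_pops.values()) / len(district_pops)
    return {d: (p - ideal) / ideal * 100.0 for d, p in district_pops.items()}


def polsby_popper(area, perimeter):
    """Assumed compactness measure: 4*pi*A / P^2.

    The ratio equals 1.0 for a circle and approaches 0 for contorted shapes.
    """
    return 4.0 * math.pi * area / perimeter ** 2


def is_contiguous(units, adjacency):
    """Return True if all units of a district form one connected piece.

    `adjacency` maps each unit id to the ids of its neighboring units.
    A breadth-first search is restricted to units inside the district.
    """
    units = set(units)
    if not units:
        return True
    start = next(iter(units))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency.get(current, ()):
            if neighbor in units and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == units
```

For three districts with populations of 100, 110 and 90, the ideal population is 100, so the deviations are 0, +10 and -10 percent. In a database-backed system the areas, perimeters and adjacency lists would come from the spatial database rather than from hand-built dictionaries.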

Also worth noting is the potential to adopt Web-based GIS in a much wider range of politics-related applications, such as mapping and assessing voting outcomes (Ward and O’Loughlin 2002) and developing strategies for political campaigns (O’Looney 2000; Terra 2008). Web mapping capabilities are vital to the implementation of these applications.

This research aims to propose a framework for implementing a Web-based GIS for political redistricting using Open Source Geospatial (OSGEO) technologies. Open source software is cost effective, platform independent and customizable in functionality. Recent advances in geospatial web technologies, such as PostGIS, make spatial data handling functionality more available than ever before. This research takes advantage of the open source geospatial software PostgreSQL/PostGIS, MapServer and OpenLayers to implement an online redistricting platform. The online system aims to provide users with a query-based web mapping service to explore census data. It also provides an interface to devise new plans, evaluate plans and compare different plans. The framework is intended to be extensible to other web service components and applicable to applications with similar requirements.

CHAPTER 3: The Framework of Implementing Web-based GIS for Political Redistricting

3.1 Client/Server architecture

The client/server model defines the communication between service consumers (clients) and service providers (servers) (Umar 1997). A common client is a web browser such as Microsoft Internet Explorer or Mozilla Firefox. Servers, on the other hand, are more diverse. A basic server is a Web server, which is also called an HTTP server. The main task of a Web server is to handle HTTP (Hypertext Transfer Protocol) requests. When the Web server receives an HTTP request from a client, it responds with an HTTP response, such as an HTML page. However, this response can be produced in several different ways. First, it can be sent directly from the HTTP server that receives the request. Second, the request can be redirected, so that the response is sent to the client from another HTTP server. Third, the HTTP server can delegate the generation of a dynamic response to another program, such as a CGI (Common Gateway Interface) program, a PHP (PHP: Hypertext Preprocessor) program, or some other server-side technology. The last method is frequently used in today’s web applications. Besides HTTP servers, other types of servers, such as application servers and database servers, are deployed to take on this delegated role.
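
The three ways of answering a request described above can be illustrated with a small dispatcher. This is only a sketch under stated assumptions: the paths, the redirect target and the `render_district_report` handler are hypothetical, and a real Web server such as Apache performs this routing through its own configuration rather than through Python dictionaries.

```python
from typing import Callable, Dict, Tuple

# A response is modeled as (status code, headers, body).
Response = Tuple[int, Dict[str, str], str]

# Hypothetical static content served directly by this server.
STATIC_PAGES = {"/index.html": "<html><body>Redistricting GIS</body></html>"}

# Hypothetical resources that live on another HTTP server.
REDIRECTS = {"/old-maps": "http://maps.example.org/maps"}


def render_district_report(query: str) -> str:
    """Hypothetical server-side program that builds a page on demand."""
    return "<html><body>Report for " + query + "</body></html>"


# Paths whose responses are generated dynamically by another program.
DYNAMIC_HANDLERS: Dict[str, Callable[[str], str]] = {
    "/report": render_district_report,
}


def handle_request(path: str, query: str = "") -> Response:
    if path in STATIC_PAGES:
        # Case 1: answer directly from this server.
        return 200, {"Content-Type": "text/html"}, STATIC_PAGES[path]
    if path in REDIRECTS:
        # Case 2: redirect the client to another HTTP server.
        return 302, {"Location": REDIRECTS[path]}, ""
    if path in DYNAMIC_HANDLERS:
        # Case 3: delegate generation of the response to a program.
        body = DYNAMIC_HANDLERS[path](query)
        return 200, {"Content-Type": "text/html"}, body
    return 404, {"Content-Type": "text/plain"}, "Not Found"
```

The third branch corresponds to the delegation pattern used throughout this framework, where a server-side script produces content that does not exist as a static file.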

To implement the processing logic for a specific type of application data, we need an application server. For Web-based GIS applications, examples of application servers include MapServer and GeoServer, which provide a Web Map Service (WMS) to client browsers. As indispensable components, a database server, an HTTP server and a server-side scripting language are usually incorporated with an application server to support the entire Web service process.
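
As an illustration of how a client asks such an application server for a map image, the sketch below assembles a WMS 1.1.1 GetMap request URL. The parameter names follow the WMS standard, and `map` is the parameter MapServer uses in CGI mode to locate a map file. The host, map file path, layer name and bounding box are hypothetical values chosen for this example.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, mapfile, layers, bbox, width, height,
                     srs="EPSG:4326", image_format="image/png"):
    """Assemble a WMS 1.1.1 GetMap URL for a MapServer CGI endpoint.

    bbox is (minx, miny, maxx, maxy) in the units of the given SRS.
    """
    params = {
        "map": mapfile,                      # MapServer-specific: path to the map file
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",                        # empty value requests default styles
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical request for a county layer covering roughly the extent of Ohio.
url = build_getmap_url(
    "http://localhost/cgi-bin/mapserv",
    "/var/www/redistricting/ohio.map",
    ["counties"],
    (-84.9, 38.3, -80.5, 42.0),
    800,
    600,
)
```

The server answers such a request with a single georeferenced image, so the size of the response depends on the requested width and height rather than on the number of features drawn.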

On the client side, JavaScript is a commonly used scripting language that is responsible for the communication between the server and the client. Based on the strategy used to balance workload between the client and the server, clients can be classified as thick (fat) clients and thin (lean) clients. A thick client typically provides rich functionality independently of the server. By contrast, a thin client aims to do as little processing as possible and therefore depends heavily on the server’s applications. An example of an application based on a thin client is Google Maps, while applications such as Google Earth are based on thick clients.

3.2 Enabling open source GIS technologies

3.2.1 Server side techniques

On the server side, the Apache server is used to serve the Web pages. All other components are integrated with the Apache server to enrich the content of the Web pages. PostgreSQL functions as a database server to support the storage and retrieval of spatial data. MapServer is an application server that provides a Web Map Service (WMS). PHP is the server-side scripting language responsible for the communication between the different servers.

Postgresql/PostGIS Database

In order to effectively store and manipulate spatial data, such as the shape and location of census units in the redistricting case, we need a spatial database. Traditionally, spatial data is stored in an interchange format such as ESRI shapefiles. In that format, spatial objects and their attributes are stored in separate files: one for the geometry, one for the attributes and one for the projection. This file-based organization makes the dynamic update of spatial information difficult and significantly reduces interoperability. An alternative solution is to store spatial data in an object-relational database, in which spatial objects are stored in a geometry column of a table and all non-spatial attributes are stored as additional columns in the same table. PostgreSQL and MySQL implement their spatial extensions in this way. The spatial extension for PostgreSQL is called PostGIS. It provides one of the most complete sets of spatial functions among currently available databases and has established a leading role in the open source world. Its 706 spatial functions (according to version 1.4) have opened up many opportunities for implementing sophisticated geospatial applications. PostGIS follows the OpenGIS Simple Features Interface Standard (SFS) to implement spatial functions. This provides a standardized way for applications to store and access feature data in relational or object-relational databases, so that the stored data can be used to support other applications through a common feature model, data store and information access interface (OGC 2006).

MapServer

Spatial representation functionality is an integral part of GIS. However, it has long been a challenge to represent spatial objects effectively in the Web environment. One reason is the relatively large size of spatial data and the large number of spatial objects. Take congressional redistricting in the state of Ohio as an example. Each block consists of about 100 vertexes on average, and there are 277807 blocks in total according to Ohio's 2000 census. A Web Feature Service (WFS), which delivers each spatial object to the client to be drawn on the map, may not be suitable in this case. Take the rendering of geometries on Google Maps as an example: as the number of vertexes per feature and the number of features drawn on Google Maps increase, the performance of feature rendering decreases. For this reason, Google also limits the maximum size of a KML file to 10MB in order to ensure its server performance.

This explains why we resort to a Web Map Service (WMS), which generates georeferenced map images instead of delivering each geographic object in the map. The most popular map servers used today are MapServer and GeoServer. Because a WMS provides data as a single image, it avoids many of the intensive processes in a WFS, such as symbolizing each object and creating an event listener for each object. By virtue of this, a WMS is usually faster than a WFS. Compared with GeoServer, MapServer provides more capable symbolization strategies, and it configures its WMS through a single map file. This makes a dynamic web mapping service possible, since the text-based map file in MapServer can be generated dynamically from a user query through a server side scripting language.
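To make the idea of requesting a map image concrete, the following is a minimal sketch of how a WMS 1.1.1 GetMap request URL can be assembled. It is written in Python for brevity rather than in the languages used in this research. The endpoint, the map file path, the layer name and the bounding box are hypothetical placeholders; only the parameter names come from the WMS 1.1.1 specification.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_getmap_url(base_url, layers, bbox, width, height,
                     srs="EPSG:900913", image_format="image/png"):
    """Return a WMS 1.1.1 GetMap URL for the given layers and extent."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value: use the server's default style
        "SRS": srs,
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": image_format,
    }
    # MapServer's CGI URL often already carries a "map=" parameter
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(params)


# Hypothetical MapServer endpoint and map file (assumptions, not from the thesis)
BASE = "http://localhost/cgi-bin/mapserv?map=/data/ohio.map"
# Illustrative extent in Spherical Mercator meters, roughly covering Ohio
url = build_getmap_url(BASE, ["county"],
                       (-9442000, 4636000, -8963000, 5158000), 512, 512)
print(url)
```

The server answers such a request with one image regardless of how many features fall inside the bounding box, which is the property the comparison with WFS above relies on.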

PHP

The final component on the server side used in this research is a server side scripting language, which is responsible for the communication among all of the above components and with the client side as well. In the open source world, PHP (PHP: Hypertext Preprocessor) is a highly popular choice. Its object oriented programming style, platform independence and relative ease of implementation have won over many web developers.

3.2.2 Client side techniques

Ajax

The term Ajax was first introduced by Garrett (2005), who envisaged a new era of Web applications in which "richness," "responsiveness," and "simplicity" were the key words (Ullman 2007). The acronym "Ajax" stands for Asynchronous JavaScript and XML. In fact, the techniques enabling Ajax are more than what is mentioned in the full name. According to Garrett's definition, a typical Ajax implementation usually incorporates the use of:

1. a front end presentation using XHTML and CSS;

2. dynamic display and interaction using the Document Object Model (DOM);

3. data interchange and manipulation using XML and XSLT;

4. data retrieval using asynchronous XMLHttpRequest and

5. the use of JavaScript to bind everything together.

XHTML stands for Extensible Hypertext Markup Language, which is based on the XML syntax. CSS stands for Cascading Style Sheets, a style sheet language used to describe the look of a Web document. The DOM specifies the way to refer to XML or HTML elements as objects, which makes it easy to update the content of Web pages. XMLHttpRequest is an API that can be called by a client side script to send an HTTP request to a web server, to get the response and to handle the returned contents within the scripting language.

From the Ajax characteristics listed above, we can summarize the advantages of Ajax as follows:

1. Web interaction becomes more efficient through the use of asynchronous XMLHttpRequest and the DOM. Web applications can retrieve data in the form of XML from the server asynchronously in the background without interfering with the display and behavior of the existing page, since only the associated DOM objects are updated.

2. End users have a richer web experience through the functions implemented by JavaScript libraries based on the Ajax framework.

3. XML makes communication between the client side and the server side easy. GML, KML and SVG are all based on XML standards.

Figure 1 is a comparison of the classic web application model with the Ajax model. In the Ajax model, user activity and server processing can move forward asynchronously. Therefore, within a given time period there can be more interactions than in the synchronous case. Results from the server side can be handled flexibly, either stored on the server or displayed on the client side.

Figure 1. Comparison between the classic web application model and the Ajax based application model. Adapted from Garrett (2005); the main difference is that we draw server operations overlapping in time to illustrate the asynchronous character of Ajax.
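The overlap of requests in the Ajax model can be illustrated with a small simulation. This is only a sketch in Python's asyncio, not browser code: run_query is a hypothetical stand-in for one XMLHttpRequest round trip, and the durations are arbitrary assumptions.

```python
import asyncio


async def run_query(name, seconds, finished):
    """Stand-in for one asynchronous request that takes `seconds` to answer."""
    await asyncio.sleep(seconds)  # simulated server-side processing time
    finished.append(name)         # callback step: update the page with the result


async def ajax_session():
    finished = []
    # The user submits a slow query, then a fast one without waiting for the first.
    first = asyncio.create_task(run_query("slow query", 0.2, finished))
    second = asyncio.create_task(run_query("fast query", 0.05, finished))
    await asyncio.gather(first, second)
    return finished


order = asyncio.run(ajax_session())
print(order)
```

Because neither request blocks the other, the result of the second (faster) query is handled before the first one returns, which is the behavior the overlapping server operations in Figure 1 depict.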

OpenLayers

As JavaScript plays an indispensable role in the communication between the client side and the server side, more and more JavaScript libraries have been implemented and made available in the form of web APIs (Application Programming Interfaces). Some JavaScript libraries, such as jQuery, ExtJS and YUI, ease the implementation of interactive user interfaces. jQuery is a lightweight JavaScript library that emphasizes interaction between JavaScript and HTML. ExtJS is also a JavaScript library for building interactive web applications; it was originally built as an extension of YUI (the Yahoo! User Interface Library) and also interoperates with jQuery. Other JavaScript libraries, such as the Google Maps API, offer web mapping capabilities that cater to the needs of geographic applications. Although the Google Maps API does provide interfaces to customize the functions of Google Maps for different applications, it is difficult to overlay other map sources such as WMS layers on Google Maps.

Moreover, essential cartographic elements such as a legend and a scale bar are not yet available in Google Maps, nor are classification, symbolization and layer control. Considering the exponential increase of geocoded information on the Web, it is more necessary than ever to incorporate essential map elements and tools in Web mapping services, especially in open source based applications. OpenLayers, as its name indicates, has the potential to fill the gap that Google Maps leaves. The OpenLayers JavaScript API organizes data sources into layers, such as a Google Maps layer, a MapServer WMS layer, a feature layer and many others. Moreover, analytical tools can be implemented through the class of controls, and each control can be registered for its corresponding layer. For example, we can register a feature editing control to a feature layer so that users can create and edit features on top of the feature layer. These APIs can be used to integrate various data sources into the same implementation framework. For example, it becomes easier to overlay a population layer from MapServer on top of Google's base maps so that we can examine the population distribution in a user-defined neighborhood. Since November 2007, OpenLayers has been a project of the Open Source Geospatial Foundation. The Open Source Geospatial Foundation (OSGEO) is a not-for-profit organization whose mission is to support and promote the collaborative development of open geospatial technologies and data (OSGEO 2009). All of the foundation's projects are freely available.

3.3 The framework

We propose a framework to design and implement a Web-based GIS using open source geospatial software. In figure 2, the two dashed arrows indicate the starting and ending points of an information processing procedure. The interaction between the web and a user starts with user inputs, which specify the parameters of a query, such as how the data is classified and which field is used as the data label. These parameters are sent via XMLHttpRequest to the server. The server side PHP program receives these parameters and translates them into a SQL query to the spatial database (PostgreSQL/PostGIS). The spatial database then returns its result to the PHP program, which processes the result and outputs it as a map file that can be rendered by MapServer. The map file includes the information about spatial objects, the classification method, the symbology and the labeling method. Besides the map file, the PHP program also generates an XML file, which includes metadata or supplemental information about the map file. After the map file is generated, MapServer generates a map image and a legend based on the configuration in this map file through its CGI program. The client side OpenLayers JavaScript program gets the parameters of the map image from the metadata included in the XML file and renders the map image and the legend image. At this point, the whole interaction is finished. Because of the asynchronous nature of Ajax, the user can submit a second query before the result of the first query is returned. In this sense, the Web application becomes more responsive, which allows the user to interact effectively with the server.

Figure 2. Framework of implementing Web-based GIS using OSGEO software
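The step in which a server side script turns query parameters into a map file can be sketched as follows. This is an illustration in Python, not the PHP program used in this research. The function names, the connection string, the class breaks and the colors are assumptions, and the output is only a fragment: a working map file also needs entries such as an extent, an image size and WMS metadata.

```python
def class_block(field, low, high, rgb):
    """Return one mapfile CLASS block for values in [low, high)."""
    r, g, b = rgb
    return "\n".join([
        "    CLASS",
        f'      NAME "{low} - {high}"',
        f"      EXPRESSION ([{field}] >= {low} AND [{field}] < {high})",
        "      STYLE",
        f"        COLOR {r} {g} {b}",
        "        OUTLINECOLOR 110 110 110",
        "      END",
        "    END",
    ])


def build_mapfile(table, field, breaks, colors, connection):
    """Return mapfile text with one CLASS per consecutive pair of breaks."""
    if len(colors) != len(breaks) - 1:
        raise ValueError("need exactly one color per class")
    classes = [class_block(field, low, high, rgb)
               for low, high, rgb in zip(breaks, breaks[1:], colors)]
    lines = [
        "MAP",
        '  NAME "redistricting"',
        "  PROJECTION",
        '    "init=epsg:900913"',
        "  END",
        "  LAYER",
        f'    NAME "{table}"',
        "    TYPE POLYGON",
        "    STATUS ON",
        "    CONNECTIONTYPE POSTGIS",
        f'    CONNECTION "{connection}"',
        f'    DATA "the_geom FROM {table} USING UNIQUE gid USING SRID=900913"',
        *classes,
        "  END",
        "END",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical user query: two population classes on the county table
mapfile_text = build_mapfile(
    "county", "total", [0, 100000, 2000000],
    [(255, 255, 204), (189, 0, 38)],
    "dbname=ohio host=localhost user=postgres",
)
print(mapfile_text)
```

Generating the file as plain text keeps the mapping service dynamic: a different classification or label field only changes the arguments, not the program.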


CHAPTER 4: Database Design

Data is central to an online participatory GIS application. The availability and representation of data largely determine how effectively users can use the data for their participatory activities. The way the data is organized, that is, how it is processed, input, stored and output, also influences the overall performance and effectiveness of the system.

4.1 Data sets

The data sets used for the redistricting research include both spatial and non-spatial data. Both can be obtained from the www.census.gov website.

Spatial data can be obtained at http://www2.census.gov/cgi-bin/shapefiles/state-files?state=39 . Non-spatial data can be obtained at http://ftp2.census.gov/census_2000/datasets/redistricting_file--pl_94-171/Ohio/ .

4.1.1 Spatial data

The spatial data mainly includes different levels of census units (such as county, tract, block group and block), which are the basic units that compose a congressional district. Therefore, they are also referred to as the "building blocks". In addition, the spatial data for political redistricting includes previous congressional district (110th, 109th, 108th) boundaries, which can be used to compare different plans. All these units are stored in shapefiles as polygons. The hierarchy of census geographies is one of the bases for redistricting work. States are composed of counties, counties are composed of tracts, and so on. The lowest level of census geography is the census block, from which the congressional district is directly built. The association between different levels of census units is coded through the FIPS (Federal Information Processing Standard) number. FIPS numbers are important to the implementation of the redistricting program since they make redistricting on different census levels possible. For example, a user could assign a whole county to one district and a tract in that county to another district. In a shapefile, the FIPS number and the shape field are the two important fields we need in this research.

The coordinate system of all these shapefiles is GCS_NAD_1983. The basic information of each shapefile is listed in table 1. The fields listed are the ones used for the congressional redistricting application.

Shapefile Name    Number of Records    Fields
County.shp        88                   FID, Shape, CountyFIPS, Name
Tract.shp         2941                 FID, Shape, TractFIPS
Bg.shp            9354                 FID, Shape, BgFIPS
Block.shp         277807               FID, Shape, BlockFIPS
CD110th           18                   FID, Shape, DistrictNum
CD109th           18                   FID, Shape, DistrictNum
CD108th           18                   FID, Shape, DistrictNum

Table 1. TIGER/Line® Shapefiles attribute table structure

Besides the shapefiles, a supplemental geographic file is also used in this research. This file provides additional information about census units, such as how these units are associated with each other. The FIPS number of a census unit in the shapefile is used to link to its corresponding record in the supplemental file. The supplemental file also contains a logical record number, which can be used to join the population data from the population files that are discussed later.

4.1.2 Non-spatial data

The content of the non-spatial data determines what social and economic themes can be visualized on a spatial basis. For the redistricting problem, these themes mainly include population information. The population data from the census website includes detailed population counts by race and age, although for our research we only deal with total population, white population and minority population as an example to address several key issues in political redistricting. The relationship between population (non-spatial data) and census units (spatial data) is built through a logical record number (logrecno) that is specifically implemented as an internal record index. Using this logical record number, we can access the population information of any census geographic unit. Together with the use of FIPS numbers for geographic units, the logrecno makes a seamless integration of spatial and non-spatial information possible, which is a central issue of database design.

4.2 Database Design

The database management system (DBMS) used for this redistricting application is PostgreSQL 8.4 with the PostGIS 1.4 extension. PostgreSQL is an object-relational database management system (ORDBMS) that is released under a BSD-style license as an open source software package for use free of charge. PostGIS was originally developed by Refractions Research and serves as the spatial extension of PostgreSQL for manipulating spatial data. PostGIS's compliance with OGC's Simple Features Specification for SQL makes it easy to serve as a backend data supplier for other software products such as MapServer and ArcGIS.

In PostGIS version 1.4, there are 706 spatial functions covering operations such as spatial object storage, geoprocessing, measuring, projection and many others. Spatial processing functions such as intersects, union and buffer are supported through the GEOS (Geometry Engine - Open Source) library, which is released under the GNU Lesser General Public License (LGPL). GEOS is a C++ port of the Java Topology Suite (JTS), an API of 2D spatial predicates and operators written in pure Java. GEOS makes it possible to build complex Web-based GIS applications with PostGIS, since it can be used to handle and process geographic information input by users, such as user drawn features.

The PROJ4 library in PostGIS handles all issues related to data projection. It is a comprehensive library that includes almost all projection types and coordinate systems standardized by EPSG (European Petroleum Survey Group). However, it does not yet include the Spherical Mercator projection, the projected coordinate system adopted by commercial providers such as Google (OpenLayers 2009). The code commonly used for this projection is EPSG:900913. The method of importing this projection into the PROJ4 library is discussed later.

4.2.1 Importing spatial data to PostgreSQL

Spatial data and non-spatial data should be stored in the same database so that they can interact with each other. Spatial data is stored in the PostgreSQL database as objects in a geometry column, which is a database column for storing spatial objects. Before creating such a geometry column, we need to create a new database in PostgreSQL using a PostGIS template to make the database spatially enabled. We first create a database called Ohio in pgAdmin, a graphical management, development and administration tool for PostgreSQL.

After the Ohio database is created, 706 spatial functions and two metadata tables are automatically created in the database's public schema. The two OpenGIS metadata tables are the SPATIAL_REF_SYS table and the GEOMETRY_COLUMNS table. The SPATIAL_REF_SYS table holds the EPSG IDs and descriptions of the coordinate systems used in the spatial database. As mentioned earlier, the Google Maps Spherical Mercator projection (EPSG:900913) is not included in this table. In order to integrate Google Maps with our own mapping service, we need to add this projection to our SPATIAL_REF_SYS table. The EPSG:900913 projection can be expressed in PROJ4 as follows. This definition gives the parameters that define the projection in the PostgreSQL database as well as in MapServer.

+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs

In PostGIS, the following SQL script adds the EPSG:900913 projection to the SPATIAL_REF_SYS table when it is run in the Ohio database.

INSERT INTO spatial_ref_sys (srid, auth_name, auth_srid, srtext, proj4text)
VALUES (
  900913, 'EPSG', 900913,
  'PROJCS["WGS84 / Google Mercator", GEOGCS["WGS 84", DATUM["World Geodetic System 1984", SPHEROID["WGS 84", 6378137.0, 298.257223563, AUTHORITY["EPSG","7030"]], AUTHORITY["EPSG","6326"]], PRIMEM["Greenwich", 0.0, AUTHORITY["EPSG","8901"]], UNIT["degree", 0.017453292519943295], AXIS["Longitude", EAST], AXIS["Latitude", NORTH], AUTHORITY["EPSG","4326"]], PROJECTION["Mercator_1SP"], PARAMETER["semi_minor", 6378137.0], PARAMETER["latitude_of_origin", 0.0], PARAMETER["central_meridian", 0.0], PARAMETER["scale_factor", 1.0], PARAMETER["false_easting", 0.0], PARAMETER["false_northing", 0.0], UNIT["m", 1.0], AXIS["x", EAST], AXIS["y", NORTH], AUTHORITY["EPSG","900913"]]',
  '+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs'
);

In addition, this projection can be added to MapServer's proj file, which is in MapServer's bin folder, by appending the following line to the end of the file:

<900913> +proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +no_defs <>

By adding the EPSG:900913 projection to both PostGIS and MapServer, it is possible to render Google Maps and MapServer WMS layers using the same coordinate system.

After the spatial database is created, we are ready to import spatial data into the database. Since our spatial data is in shapefile format, it is convenient to use the PostGIS command line tools to add a table populated with spatial data directly from the shapefile. There are two steps to do this. First, create a SQL script from the shapefile using the following command:

c:\pgutils\shp2pgsql -s 900913 block.shp block > block.sql

The above command uses the shp2pgsql program to convert the shapefile block.shp into a SQL script called block.sql, which can later be used to create the table in PostgreSQL. Once block.sql is executed in the Ohio database, it creates a table called block with all its spatial objects loaded into an automatically added geometry column. The SRID of the geometry column is set to 900913, as specified by the -s parameter.

Second, use the psql program to execute the block.sql script and create the block table in the Ohio database. This table includes all fields from the shapefile, with the fid field renamed to gid, which is the unique identifier of each record, and the shape field renamed to the_geom.

c:\pgutils\psql -d ohio -h localhost -U postgres -f block.sql

The last thing we need to do on the table is to transform the coordinates in the_geom to EPSG:900913. Although we specify the SRID as 900913 when we create block.sql, this only labels the geometry column the_geom with that SRID. The coordinate values in the_geom are still in geographic degrees, because the original shapefile is in the GCS_NAD_1983 coordinate system. Therefore, we need to reproject each value in the_geom column to EPSG:900913. The following SQL command does this for us.

UPDATE block SET the_geom = ST_Transform(ST_SetSRID(the_geom, 4326), 900913);

The function ST_SetSRID(geometry geom, integer srid) tells the database which coordinate system the original coordinates are in. The ST_Transform(geometry g1, integer srid) function returns a new geometry with its coordinates transformed to the spatial reference system identified by the srid parameter. The destination SRID must exist in the SPATIAL_REF_SYS table, which is why we added EPSG:900913 to the SPATIAL_REF_SYS table in the first place. The following is an example of the Well Known Text (WKT) representation of a geometry in the block table before and after the transform. The ring of this polygon lists four coordinate pairs, the last of which repeats the first to close the ring.

Before: POLYGON((-84.205226 40.838316,-84.205803 40.83846,-84.205228 40.838512,-84.205226 40.838316))

After: POLYGON((-9373682.88045252 4988522.48212802,-9373747.11179871 4988543.67022705,-9373683.1030915 4988551.32149635,-9373682.88045252 4988522.48212802))
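As a cross-check on what the transform does to each coordinate pair, the forward Spherical Mercator formulas can be written out directly. The sketch below is in Python and is not part of the PostGIS workflow; it assumes a sphere of radius 6378137 m, which is the +a and +b value in the PROJ4 definition given earlier, and ignores any datum shift, as the +nadgrids=@null setting does.

```python
import math

EARTH_RADIUS = 6378137.0  # sphere radius in meters (+a = +b in the PROJ4 string)


def to_spherical_mercator(lon_deg, lat_deg):
    """Project geographic degrees to Spherical Mercator meters."""
    x = EARTH_RADIUS * math.radians(lon_deg)
    y = EARTH_RADIUS * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# First vertex of the "Before" polygon
x, y = to_spherical_mercator(-84.205226, 40.838316)
print(round(x, 2), round(y, 2))
```

The printed pair can be compared with the first vertex of the "After" polygon.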

We follow the same procedure as above to create the other three census tables, named bg, tract and county respectively. The SQL scripts for creating these tables are in Appendix A.

The storage of spatial data usually requires a large volume of disk space. The block table, with 3 columns and 277807 tuples, already takes up 1GB of space. The large storage size is also due to the relatively large number of points composing each geometry. The following SQL script examines the minimum, average, maximum and total number of vertexes of the geometries in the block table.

SELECT avg(ST_NPoints(the_geom)) AS Average,
       max(ST_NPoints(the_geom)) AS Maximum,
       min(ST_NPoints(the_geom)) AS Minimum,
       sum(ST_NPoints(the_geom)) AS Total
FROM block;

Table 2 summarizes the average number of vertexes of the polygons in each spatial table.

Table Name    Minimum    Average (a)    Maximum    Total
Block         4          47             3191       12,930,794
Bg            9          204            4662       1,907,721
Tract         9          311            4662       913,687
County        426        1496           4013       131,649
Cd            173        597            1380       32,229

Table 2. Number of vertexes of different levels of census geography objects

(a) The average is rounded to the nearest integer. For example, 46.5 rounds to 47.

The Open Geospatial Consortium (OGC) has defined two basic web services for accessing spatial data. One is the Web Feature Service (WFS) and the other is the Web Map Service (WMS). The WFS method directly retrieves the coordinates of each geometry and plots the geometry in the web interface. The complexity of the geometries and the number of geometries largely determine the overall performance of the WFS, and the performance of delivering spatial data to a user largely determines the efficiency of the online system.

A workaround for this limitation is to use WMS instead of WFS whenever possible. However, the performance of a WMS server is also affected by complex geometries. Another option is to calculate a surface point of each geometry as an alternative representation when the extent of the spatial object is not critical to the representation.

A surface point is a point guaranteed to lie on the surface of the geometry. The reason we use a surface point instead of the centroid is that, for an irregular geometry, the centroid may not lie within the geometry.
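The difference between a centroid and a surface point can be demonstrated on a small concave polygon. The following Python sketch uses a made-up U-shaped ring; its point_on_surface function is a simplified scanline stand-in written for illustration and is not the algorithm PostGIS uses for ST_PointOnSurface.

```python
def edges(ring):
    """Pair each vertex with the next one, closing the ring."""
    return zip(ring, ring[1:] + ring[:1])


def centroid(ring):
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in edges(ring):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3 * area2), cy / (3 * area2)


def crossings(ring, y):
    """Sorted x coordinates where the horizontal line at height y cuts the ring."""
    xs = []
    for (x0, y0), (x1, y1) in edges(ring):
        if (y0 <= y < y1) or (y1 <= y < y0):
            xs.append(x0 + (y - y0) * (x1 - x0) / (y1 - y0))
    return sorted(xs)


def contains(ring, point):
    """Ray casting: inside if an odd number of crossings lie to the right."""
    px, py = point
    return sum(1 for x in crossings(ring, py) if x > px) % 2 == 1


def point_on_surface(ring):
    """Midpoint of the widest interior span on the scanline through the centroid."""
    _, cy = centroid(ring)
    xs = crossings(ring, cy)
    spans = list(zip(xs[0::2], xs[1::2]))  # interior intervals along the scanline
    left, right = max(spans, key=lambda span: span[1] - span[0])
    return (left + right) / 2, cy


# Hypothetical U-shaped polygon (vertices listed without repeating the first)
u_shape = [(0, 0), (3, 0), (3, 3), (2, 3), (2, 1), (1, 1), (1, 3), (0, 3)]
center = centroid(u_shape)
surface = point_on_surface(u_shape)
print(center, contains(u_shape, center))
print(surface, contains(u_shape, surface))
```

For this shape the centroid falls in the notch of the U, outside the polygon, while the scanline point lies inside one of the arms.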

To add the surface point, we use the following SQL command to create a point type geometry column in the block table and populate the column by the surface point of the geometry. The AddGeometryColumn() function adds a two dimensional point type geometry column called surfacepoint in the block table. The projection for the column is EPSG: 900913.

SELECT AddGeometryColumn('my_schema', 'block', 'surfacepoint', 900913, 'POINT', 2);
UPDATE block SET surfacepoint = ST_PointOnSurface(the_geom);

We create a point representation for the other three census tables as well, namely the bg, tract and county tables (see Appendix A). The FIPS numbers of different levels of census units indicate the spatial relationships of these units. For example, the first five digits of a block's FIPS number "390930239001004" indicate that this block is within the county with the FIPS number "39093". In this way, we can take the first five characters of a block's FIPS number and compare them with the FIPS numbers of all counties to find the county this block belongs to, and therefore access the information of this county. However, string operations are costly for large datasets. Therefore, for performance reasons, we add three more columns to the block table to save the FIPS number of each of its higher level census units. These three columns are named bgfips, tractfips and countyfips respectively.
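The prefix relationship between FIPS numbers can be sketched as below. The county prefix length (five digits) comes from the example above; the tract (11 digits) and block group (12 digits) prefix lengths follow the standard Census 2000 block code layout and are stated here as an assumption about the data, and the function name is hypothetical.

```python
# Number of leading digits of a 15-digit block FIPS code used by each level
FIPS_LENGTH = {"county": 5, "tract": 11, "bg": 12, "block": 15}


def parent_fips(block_fips):
    """Return the higher-level FIPS codes embedded in a block FIPS code."""
    if len(block_fips) != FIPS_LENGTH["block"] or not block_fips.isdigit():
        raise ValueError("expected a 15-digit block FIPS code")
    return {
        "countyfips": block_fips[:FIPS_LENGTH["county"]],
        "tractfips": block_fips[:FIPS_LENGTH["tract"]],
        "bgfips": block_fips[:FIPS_LENGTH["bg"]],
    }


parents = parent_fips("390930239001004")
print(parents)
```

Computing these values once and storing them in the bgfips, tractfips and countyfips columns avoids repeating the substring operation in every query.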

Similarly, we add higher-level FIPS numbers to the bg table and the tract table as well. The relational schema of each table is shown in Appendix A. Finally, we create a B-tree index on the gid and fips columns and a GiST (Generalized Search Tree) index on the the_geom and surfacepoint columns of each table to increase the speed of searches on these fields. Indexes are essential for large data sets. Without indexing, any search for a feature would require a "sequential scan" of every record in the table. Indexing speeds up searching by organizing the data into a search tree that can be quickly traversed to find a particular record. B-trees are used for data that can be sorted along one axis, for example numbers, letters and dates. GiST indexes break up data into "things to one side," "things which overlap," and "things which are inside," and can be used on a wide range of data types, including GIS data.
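The effect of a B-tree index on a lookup can be observed with any relational database. The sketch below uses SQLite from the Python standard library as a stand-in for PostgreSQL, with made-up block records, so the query plan wording differs from what PostgreSQL's EXPLAIN would print, and GiST indexes are not covered. The index name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE block (gid INTEGER PRIMARY KEY, fips TEXT, countyfips TEXT)"
)
# Made-up records: 50 blocks for each odd county number from 001 to 175
conn.executemany(
    "INSERT INTO block (fips, countyfips) VALUES (?, ?)",
    [(f"39{county:03d}{n:010d}", f"39{county:03d}")
     for county in range(1, 176, 2) for n in range(50)],
)


def query_plan(sql, args):
    """Return the readable part of SQLite's plan for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, args).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


lookup = "SELECT gid FROM block WHERE countyfips = ?"
before = query_plan(lookup, ("39049",))   # no index yet: full table scan
conn.execute("CREATE INDEX block_countyfips_idx ON block (countyfips)")
after = query_plan(lookup, ("39049",))    # index available: tree search
matches = conn.execute(lookup, ("39049",)).fetchall()

print(before)
print(after)
print(len(matches))
```

The plan changes from a scan of the whole table to a search through the index, while the query result itself stays the same.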


Then, we use the copy command in Postgres to import the geographic supplemental file called ohgeo.upl into the Ohio database. Before doing that we create a table called geo in the Ohio database using the script in Appendix A and then use the following SQL command to populate this table. Key words “with CSV” specify the file is a CSV equivalent file.

COPY geo FROM 'path_to_file/geo.upl' with CSV;

There are four important fields in this table: logrecno, fips, sumlev and partflag. Logrecno, which is the primary key of the geo table, is used to link the geo table to the population tables, which contain the same field. The FIPS number is used to link the geo table to the spatial tables created earlier. The FIPS number by itself does not uniquely determine a census unit record in the geo table. However, fips, sumlev and partflag together uniquely determine each census unit in the geo table.

Sumlev is a three-character string indicating the summary level of the geographic unit. For example, the FIPS number 39049 can correspond to multiple records. At sumlev 050, it uniquely refers to Franklin County, while at sumlev 155 it corresponds to 30 places within Franklin County. In order to summarize the population data correctly, we need to use the combination of sumlev and fips as the group by fields. In particular, for block group records, some of which are composed of parts, we also need to add the partflag information to the group by fields. The sumlev and partflag values for each census level are listed in table 3. This lays the foundation for calculating the population of each district during the redistricting process.

Census Units    Sumlev    Partflag
Block           750       p/w
Block group     740       w
Tract           140       w
County          050       w

Table 3. Summary level and part flag for census units
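The reason for grouping by both sumlev and fips can be seen in a toy example. The records and population figures below are made up for illustration; only the field names and the summary level codes 050 and 155 come from the text above.

```python
from collections import defaultdict

# Made-up geo/population records: (fips, sumlev, population)
records = [
    ("39049", "050", 1000),  # the county itself
    ("39049", "155", 600),   # one place (part) inside the county
    ("39049", "155", 400),   # another place (part) inside the county
]


def totals(rows, key):
    """Sum population per group, where `key` builds the group identifier."""
    sums = defaultdict(int)
    for fips, sumlev, population in rows:
        sums[key(fips, sumlev)] += population
    return dict(sums)


by_fips = totals(records, lambda fips, sumlev: fips)
by_sumlev_and_fips = totals(records, lambda fips, sumlev: (sumlev, fips))

print(by_fips)
print(by_sumlev_and_fips)
```

Grouping by fips alone mixes records from different summary levels and counts the same residents twice, whereas grouping by the pair keeps the county total separate from the totals of its places.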

4.2.2 Importing non-spatial data to Postgresql

Non-spatial data are obtained from the oh00001.upl and oh00002.upl files. The creation and population of the two corresponding tables is similar to that of the geo table.

4.2.3 Creating congressional district table

To create the congressional district table from the shapefiles, we can use the same method as we did for the census shapefiles. The difference here is that, after we create the three SQL files, we edit and combine their contents so that the three congressional districting plans can be stored in the same table. We name this table cd. Please see Appendix A for the structure of this table.

The cd table inherits three fields from the source shapefiles: the plan name, the district number and the shape of each district. In order to evaluate a plan, we also need information about total population, white population and minority population. Here, we treat the entire minority population as a single unit. Please see the creation of the cd table in Appendix A. When we create a new plan, the plan name and district number are populated with the data given by a user. After the user finishes the plan, the geometry and population fields are updated with information retrieved and aggregated from other tables. We need plan information to update the cd table. Therefore, we also need to create a table to save plan information.

4.2.4 Creating redistricting plan tables

In order to save redistricting results in the database, we create two tables, named plan1 and plan2. Please see Appendix A for the structure of these two tables. These two plan tables only save the non-spatial information of each plan: FIPS number, plan name and district number. When a new plan is created, a new column is added to the plan1 table. We call the plan1 table the preliminary plan table. Each plan corresponds to a sequence of strings indexed by the FIPS number in this table. In the plan1 table, the fips column indexes all 290190 census units (88 counties, 2941 tracts, 9354 block groups and 277807 blocks), while in the plan2 table the fips column only indexes the 277807 census blocks. Through this design, we can devise a plan based on different levels of census units and generate each district's boundary based on the hierarchy and relationships of these units. By comparison, we call the plan2 table the final plan table, since each district is resolved to the block level. We update the final plan table from the preliminary plan when users finalize their plan. Again, the geometry information of each plan is stored in the cd (congressional district) table.
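One way to picture the step from the preliminary plan to the final plan is a lookup that walks up the FIPS hierarchy for every block. The Python sketch below is an illustration only: the rule that the most specific assignment wins, the function name and the sample FIPS codes are assumptions, and the actual update in this research is carried out in the database.

```python
# Prefix lengths for block, block group, tract and county, most specific first
# (15/12/11-digit lengths assume the standard Census 2000 block code layout)
LEVELS = (15, 12, 11, 5)


def resolve_to_blocks(block_fips_list, preliminary_plan):
    """Map each block to a district using its most specific assigned ancestor."""
    final_plan = {}
    for block in block_fips_list:
        for length in LEVELS:
            district = preliminary_plan.get(block[:length])
            if district is not None:
                final_plan[block] = district
                break  # stop at the most specific level that has an assignment
    return final_plan


# Hypothetical preliminary plan: a whole county to district 1, one tract to district 2
preliminary = {"39093": 1, "39093023900": 2}
blocks = ["390930239001004", "390930101001000", "390490001001000"]
final = resolve_to_blocks(blocks, preliminary)
print(final)
```

The first block inherits the tract's district, the second falls back to the county's district, and the third stays unassigned because none of its ancestors appears in the preliminary plan.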

4.2.5 Relationships between tables

After all of the above steps, 10 tables have been created in the Ohio database. All 10 tables are normalized to 3NF, which means every non-prime attribute is non-transitively dependent on every key of the table. Most databases in 3NF are free of anomalies during insertion, update and deletion. Figure 3 shows the referential relationships of the fields between different tables. The key of each table is underlined. The cd table, as a special case, is excluded from this graph.

Figure 3. Referential relationships diagram between database tables

The referential relationships between tables are indicated by arrows in the diagram. For example, the column bgfips in the block table references the fips field in the bg table. Other relationships can be interpreted in the same way. These referential relationships are the basis for creating database views.

4.2.6 Creating database views

Based on this referential relationship, we can integrate spatial information, population information and plan information together through the database view, which is a virtual table in the database. In this way, we logically link the geographic object and its attributes under the same frame. The use of view makes it easier to retrieve information based SQL query.

Before we devise a new plan, we might want to explore the population at a certain census level. It is therefore necessary to join the fields we are interested in from the population tables to the county table, so that we can visualize the population information of a county, which is stored in the population tables, based on the location of the county, which is stored in the county table. Meanwhile, we might also want to see which district a specific census unit belongs to under a new plan. To this end, we also need to join certain fields from the plan table to the county table. We therefore create a view that logically includes all necessary fields from the related tables. The following SQL commands give an example of how to do that in Postgresql. First, we create a unified population view p, which includes the fips, sumlev and logrecno fields from the geo table and six population fields derived from the two population tables p1 and p2.

CREATE OR REPLACE VIEW p AS
SELECT geo.fips, geo.sumlev, geo.logrecno,
       p1.p0010001 AS total,
       p1.p0010003 AS white,
       p1.p0010001 - p1.p0010003 AS minor,
       p2.p0030001 AS total18,
       p2.p0030003 AS white18,
       p2.p0030001 - p2.p0030003 AS minor18
FROM geo
  JOIN p1 USING (logrecno)
  JOIN p2 USING (logrecno);

Then, we join the county table with the plan1 table (the preliminary plan table) on fips, and then join the view p on fips as well. We use p.sumlev = '050' as the condition in the WHERE clause to keep only the records for counties. In this way, we create a view county_plan1_p that integrates the spatial information, population information and district information of each county.

CREATE OR REPLACE VIEW county_plan1_p AS
SELECT cp.*, p.total, p.white, p.minor, p.total18, p.white18, p.minor18
FROM (county b JOIN plan1 USING (fips)) cp
  JOIN p USING (fips)
WHERE p.sumlev = '050'::bpchar;

Appendix B includes SQL scripts for creating other views. In order to update the district boundaries based on different levels of census units, we also create a view to include the geometries of all census units and their fips numbers. This is shown as view fips_the_geom in Appendix B.


CHAPTER 5: User Interface

5.1 Main interface

The main Web interface for the congressional redistricting application is shown in Figure 4, which primarily shows how the client area of a browser is organized and divided. The main interface is divided into two functional sections: the tools section (left) and the map section (right). The application can be accessed online at gis.osu.edu/redistricting.

Figure 4. Main interface of Web-based redistricting GIS

5.2 Exploring Census interface

Under the first tab of the tools section, users are offered tools to explore the census area (Figure 5). Query and symbolization tools allow users to retrieve a WMS layer for the area they are interested in. For demonstration purposes, we implement only spatial query tools as an example of the use of filters. For symbolization, we choose different methods depending on whether the data are qualitative or quantitative. For qualitative data such as district number, we assign each class a unique color. For quantitative data such as population, we apply a graduated color scheme.

The procedure of a query can be described as follows. First, users choose the census layer on which the map will be based. The map can be regarded as a knowledge representation of the neighborhood that combines the information of different layers, such as the area, the boundary or the surface point. These three layers are not layers in the usual GIS sense but layers as defined in MapServer. MapServer layers are defined based on the geometry type of the feature layer. For example, a county feature layer can be represented as a polygon layer, as a line layer, or as both. We take advantage of this MapServer feature to create a combined map based on the user’s needs. The label and classification information is also defined according to MapServer syntax in the map file. We will discuss MapServer issues in the next chapter. After the map is generated, it can be added to the map area on top of the Google base map. The transparency and order of the layer can also be adjusted after the layer is generated.


Figure 5. Exploring census interface

5.3 Redistricting interface

The second tool set is for redistricting (Figure 6). Based on the exploration results, users can select units and assign them to a district. First, they create a new plan, which is added to the list of new plans. They can then choose one of these plans to work on. The second step is to choose census units before assigning them to new districts. To do that, the user can use the toolbox in the upper right corner of the map to draw geometries. All census units whose boundaries intersect the geometry drawn by the user are selected. Each selected unit is shown as a marker on the map. When the user clicks the apply button, the preliminary plan table is updated based on the current selection and the indicated district number. When the user clicks the finish plan button, the final plan table in the database is updated with the districts the user indicated, and the geometry of each district is generated through the spatial union of all census units in the same district.

Figure 6. Redistricting interface

5.4 Evaluation interface

The last tool set is for evaluation (Figure 7). There are both mandatory constitutional criteria and proposed criteria adopted by many states in practice. The constitutional criteria require a check for population equality, measured as the deviation from the ideal population. Proposed criteria usually include checks for contiguity and compactness, but also include many others that are not yet required by law. Figure 7 shows the options available for evaluating plans.


Figure 7. Evaluation interface

With the user interface described above in place, we are ready to implement the client-side and server-side functions that make the interface work. Appendix D includes a user’s manual for the tools in the interface.


CHAPTER 6: Implementation

6.1 Dynamic Web Mapping Service

People make their judgments based on what information is available to them and how effectively that information is presented. Visualization of the population distribution is necessary to help the user decide whether to assign a unit to a certain district, since the resulting district should satisfy both the equal population and the contiguity requirements. An effective visualization of partisan data or voter inclination data can help prevent partisan gerrymandering in redistricting. In all of the above scenarios, Web mapping services with effective classification and labeling capabilities are needed. However, there have not been many examples in the literature of how to generate classification maps dynamically with Open Source Geospatial technology. For instance, it is useful to generate a map file dynamically for MapServer so that the classification method, symbology scheme and labeling method can be customized for every user query. The term dynamic web mapping service, as we use it here, refers to a query-based, customizable web mapping service.

Not only can the map content be dynamic, the source information composing the map can also be dynamic. While a new plan is being drawn, the district number of each unit keeps being updated. Users need to know which parts of the state have already been redistricted before they move on to other parts. However, most current Web applications with WMS capabilities are based on predefined map files and file-based data sources such as shapefiles. The limitation of this file-based data management method becomes significant when the attribute information of the spatial objects in the file is being updated. A more flexible way of managing the data is needed so that we can more easily generate maps based on updated attribute information. To overcome this limitation, we propose to use PostGIS as a dynamic data source. We also use PHP to dynamically generate MapServer’s configuration file (map file) based on the user’s request. The map image and legend image can therefore change with each user request.

We now describe the mechanism for providing dynamic WMS through the integration of PostGIS, PHP, MapServer and OpenLayers. First, the user defines a query through the “explore census” user interface. Each query consists of several mapping parameters, such as map extent, query layer and classification method, together with their values. The query is generated as a string in JavaScript that conforms to the PHP parameter-passing syntax. After the client JavaScript program concatenates the parameter names and values into a query string, this query string is appended to the URL of a PHP program as that program’s input variables. An example query string in JavaScript for retrieving a census block map looks like this:

var getVars =
  "features[]=POLYGON((-9239662.5933404 4866096.4726637,-9239765.9027614 4865995.5518998,-9239595.1137764 4865980.0256284,-9239662.5933404 4866096.4726637))" +
  "&features[]=LINESTRING(-9239720.5182759 4865841.4835147,-9239529.4257052 4865859.9956074)" +
  "&features[]=POINT(-9239348.4849273 4865926.2808429)" +
  "&extent=-9239765.902761 4865841.483515 -9239348.484927 4866096.472664" +
  "&view=block_plan2_p1" +
  "&layerFlgs[0]=true&classItems[0]=&labelItems[0]=fips" +
  "&layerFlgs[1]=true&classItems[1]=&labelItems[1]=" +
  "&layerFlgs[2]=true&classItems[2]=cd110th&labelItems[2]=";

In the above example, the features array holds all the spatial query geometries drawn by the user to query the database. If the window extent is used as the query geometry, the geometry is a polygon representing the window extent. This kind of query is convenient when people want to view the map content for the current view of the map. The extent parameter is used to define the EXTENT keyword in the map file. The view parameter names the corresponding view created in the database. The layerFlgs array has one entry for each representation of the geographic layer (point, line or polygon), and each value indicates whether that representation will be shown. Each representation also carries the label item and class item specified by the user. If either of them is undefined (empty), a default setting applies. In the example, the user requests a map of blocks consisting of a point representation labeled with fips, a polygon representation classified by cd110th, and a line representation with no classification, which is used as the boundary.
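The parameter layout described above can be sketched as a small client-side helper. This is only an illustration: the function name buildCensusQuery and the shape of its argument object are hypothetical, and only the parameter names (features[], extent, view, layerFlgs, classItems, labelItems) come from the example. The index order 0, 1, 2 for point, line and polygon is inferred from the example. Unlike the raw string shown earlier, the sketch percent-encodes each value with encodeURIComponent so that the spaces inside the WKT survive transport; PHP decodes such values automatically.

```javascript
// Hypothetical helper: assembles the GET query string described above.
// query = { features: [wkt, ...], extent: [minx, miny, maxx, maxy],
//           view: "viewName", layers: [{show, classItem, labelItem}, ...] }
function buildCensusQuery(query) {
  var parts = [];
  // One features[] entry per geometry drawn by the user (WKT strings).
  for (var i = 0; i < query.features.length; i++) {
    parts.push("features[]=" + encodeURIComponent(query.features[i]));
  }
  // The extent is passed as four space-separated numbers.
  parts.push("extent=" + encodeURIComponent(query.extent.join(" ")));
  parts.push("view=" + encodeURIComponent(query.view));
  // Index 0, 1, 2 = point, line, polygon representation (inferred order).
  for (var k = 0; k < query.layers.length; k++) {
    var layer = query.layers[k];
    parts.push("layerFlgs[" + k + "]=" + (layer.show ? "true" : "false"));
    parts.push("classItems[" + k + "]=" + encodeURIComponent(layer.classItem || ""));
    parts.push("labelItems[" + k + "]=" + encodeURIComponent(layer.labelItem || ""));
  }
  return parts.join("&");
}

var getVars = buildCensusQuery({
  features: ["POINT(-9239348.4849273 4865926.2808429)"],
  extent: [-9239765.902761, 4865841.483515, -9239348.484927, 4866096.472664],
  view: "block_plan2_p1",
  layers: [
    { show: true, classItem: "", labelItem: "fips" },
    { show: true, classItem: "", labelItem: "" },
    { show: true, classItem: "cd110th", labelItem: "" }
  ]
});
```

Keeping the assembly in one function makes it easier to add or rename a parameter without touching every place that issues a query.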

Then, the above getVars string is sent from JavaScript to a server-side script through XMLHttpRequest. How XMLHttpRequest is implemented has to do with industry standards, which are not the main concern of this thesis. What we are interested in is how to take advantage of this Web interaction technique to handle the transaction between client and server. XMLHttpRequest is asynchronous, which means that client activity is not blocked while the server processes the request and sends back the result. Many web service providers offer their own wrapper around XMLHttpRequest to handle cross-browser compatibility for the user. Google has its own XMLHttpRequest wrapper in its Google Maps API, called GDownloadUrl(), which makes XMLHttpRequest easy to use. The following is an example of using it.

GDownloadUrl("php/queryCensus.php?" + getVars, function (data, responseCode) {
  if (responseCode == 200) {
    var xml = GXml.parse(data);
    // Do something with xml here
  } else if (responseCode == -1) {
    alert("Data request timed out. Please try later.");
  } else {
    alert(responseCode);
    alert("Request resulted in error. Check XML file is retrievable.");
  }
});

The above example retrieves the resource from the given URL and calls the callback function with the text of the returned document (data) as the first argument and the HTTP response status code (responseCode) as the second. Once the result is returned, the client JavaScript can parse the data into an XML DOM object.

On the server side, a PHP program called queryCensus.php reads the query parameters, builds the query, queries the database and writes the result into a MapServer map file. MapServer then generates a WMS image and a legend image based on the configuration in the map file. The PHP program also generates an XML document that contains the meta information of the map file. Appendix C has a sample map file generated by the query discussed above.

After the xml file is generated, it is obtained by the client side JavaScript via the data parameter of the GDownloadUrl() callback. This xml includes the meta information of the map file. The sample xml file will look like the following:

Finally, the OpenLayers API calls the MapServer CGI to generate both a new map and a new legend. The key code segment for generating the map image is shown below. We create a WMS layer object ollayer in OpenLayers and pass parameters to its constructor to define properties of the layer, such as the source layers in MapServer’s map file, the projection of the layer and whether the layer is transparent. Finally, we add ollayer to the map.

var xml = GXml.parse(data);
var mapFile = xml.documentElement.getElementsByTagName("map")[0];
var url = "http://localhost/cgi-bin/mapserv.exe?map=\\ms4w\\Apache\\htdocs\\redistricting\\map\\"
        + mapFile.getAttribute('name') + ".map&";
ollayer = new OpenLayers.Layer.WMS("somename", url,
  {layers: mapFile.getAttribute('layers'), transparent: "true",
   format: "image/png", projection: "EPSG:900913"},
  {tileSize: new OpenLayers.Size(400, 400), buffer: 1, visibility: true,
   reproject: true, isBaseLayer: false});
map.addLayer(ollayer);

The code to insert the map legend in JavaScript is as follows. The most important part is to define the image’s src attribute in the HTML, which points to where the legend image is generated on the server. Each OpenLayers layer will have a legend and a new legend will replace the old one if the layer is updated.

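function createLegend(layerName, mapFile) {
  var layerTypes = mapFile.getAttribute('layers').split(',');
  var div = gebid('legend');
  var imgs = div.getElementsByTagName('img');
  for (var i = 0; i < layerTypes.length; i++) {
    if (imgs.length > 0) {
      // The loop conditions and the body of the inner loop are garbled in the
      // source. The removal of the old legend image below is an assumed
      // reconstruction based on the description above (a new legend replaces
      // the old one) and on the id assigned in createNewLegend().
      for (var j = 0; j < imgs.length; j++) {
        if (imgs[j].id == layerName + ' ' + layerTypes[i]) {
          imgs[j].parentNode.removeChild(imgs[j]);
          break;
        }
      }
      createNewLegend(div, mapFile, layerTypes[i], layerName);
    } else {
      createNewLegend(div, mapFile, layerTypes[i], layerName);
    }
  }
}

function createNewLegend(div, mapFile, layerType, layerName) {
  var img = ce('img');
  img.onload = function () { checkResizeMap(); };
  img.id = layerName + ' ' + layerType;
  // mode=legend asks MapServer for the legend image of a single layer
  img.src = "http://localhost/cgi-bin/mapserv.exe?map=\\ms4w\\Apache\\htdocs\\redistricting\\map\\"
          + mapFile.getAttribute('name') + ".map&mode=legend&layer=" + layerType;
  div.appendChild(img);
  // The appended markup is stripped in the source; a line break tag is assumed.
  div.innerHTML += '<br>';
}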
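The two URL patterns used in the map and legend code above differ only in their trailing parameters, so they can be derived from one base string. The sketch below is an illustration under assumptions: the helper name mapServerUrls, the map file name "example" and the layer names are hypothetical; only the CGI path, the map directory and the mode=legend and layer parameters are taken from the code above.

```javascript
// Hypothetical helper: derives the MapServer CGI URLs used above from the
// map file name and its comma-separated layer list.
function mapServerUrls(cgiUrl, mapDir, mapName, layerList) {
  // Base URL shared by the WMS layer and all legend requests.
  var base = cgiUrl + "?map=" + mapDir + mapName + ".map&";
  var layers = layerList.split(",");
  var legends = [];
  for (var i = 0; i < layers.length; i++) {
    // mode=legend requests the legend image of one MapServer layer.
    legends.push(base + "mode=legend&layer=" + layers[i]);
  }
  return { wms: base, legends: legends };
}

var urls = mapServerUrls(
  "http://localhost/cgi-bin/mapserv.exe",
  "\\ms4w\\Apache\\htdocs\\redistricting\\map\\",
  "example",            // hypothetical map file name
  "polygon,line,point"  // hypothetical layer names
);
```

Building both kinds of URL in one place avoids the situation where the host or the map directory is changed for the map image but not for the legend.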

6.2 Redistricting

Based on the census map generated with the explore census tools, the user can begin the redistricting process. It consists of three main steps, which follow the interface discussed in Section 5.3.

The first step is to select census units. The user selects census units by drawing a geometry on the map, such as a point, a line or a polygon. The census units in the database that intersect the drawn geometry are selected. Selected units are highlighted by plotting a marker on the map. The marker is positioned at the coordinates of the surfacepoint field in the corresponding census table. As discussed in chapter 3, the surfacepoint field is a point representation of the census unit. Once a unit is selected, the population statistics of the current selection are calculated. These statistics include the total population, the population deviation and the objective population. The user can then decide whether to include more census units in order to meet the population requirement for a district. If the user wants to remove a unit from the selection, they can simply click on its marker. The statistics are updated accordingly.

The fips numbers of the selected units are added to an HTML select object. Redistricting can take place at different census levels. For example, the user can assign a whole county to one district and a block in that county to another district. A server side PHP program processes all assignments and handles duplicate or invalid assignments, which will be discussed later.

The second step is to assign the selected units to a district by updating the plan1 table in the database. The district number is specified by the user in the input box of the third option of the redistricting interface. Once the user clicks the apply button, the assignment is sent to the server and processed. As discussed earlier, the plan1 table is a preliminary plan table in which census units of all levels can be assigned a district number. This number is stored in the column whose name is the same as that of the plan. The following code snippet from updatePlan.php shows how to do this via a PHP/Postgresql connection.

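// Note: the request values are interpolated directly into the SQL statement;
// they should be validated before this script is exposed to untrusted users.
$planName = $_GET['planName'];
$districtNum = $_GET['districtNum'];
$connection = pg_connect($connpara);
$query = "UPDATE plan1 SET $planName=$districtNum WHERE";
$fips = $_GET['fips'];
if ($fips) {
    // Join the codes with "','" so that the surrounding quotes yield 'a','b','c'
    $fips_comma_seperated = implode("','", $fips);
    $query .= " fips in ('$fips_comma_seperated')";
    set_time_limit(120);
    $result = pg_query($query);
} else {
    die('No fips passed in.');
}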

After all assignments have been written to the plan1 table in the database, the final step is to generate the boundaries of the new districts. The boundaries are generated by merging the boundaries of all census units in the same district. The PostGIS function ST_UNION(geometry []) can be used to handle this merging operation. The following code snippet from updatePlan.php shows how to get the union of census geometries through a PHP/Postgresql connection.

$fips_comma_seperated = implode("','", $fips);
$query = "select $planName, st_astext(st_multi(st_union(the_geom))) as wkt "
       . "from ( "
       . "(select * from plan1 where $planName is not null) a "
       . "join (select * from fips_the_geom where fips in ('$fips_comma_seperated')) b "
       . "using (fips) ) "
       . "group by $planName order by $planName";
$result = pg_query($query);

In the above code, we first use implode() to convert the $fips array, which contains the fips numbers of all census units to be updated, into a string $fips_comma_seperated in which the array elements are separated by quoted commas (','). Then, we build a SQL query string $query that selects the district number and the union of the geometries from a join of two subqueries. One subquery selects all tuples with a non-null district number from the plan1 table. The other subquery selects all tuples whose fips is listed in $fips_comma_seperated from the fips_the_geom view. In the result of this query, the wkt column holds the geometry of each district. These wkts will be used to update the the_geom column of the cd table.

Meanwhile, the population columns of the cd table are also updated. The challenge of aggregating the population of each district is that the populations of census units at different levels have to be aggregated in different ways. For example, a district may contain m counties, n tracts and k block groups. We need to aggregate the population of all census units in the same district and use the sum to update the corresponding column in the cd table. The whole procedure of updating the cd table can be done through the following PHP script.

foreach ($districts as $key => $district) {
    // Collect the fips codes assigned to this district, grouped by census level
    $result = pg_query("select fips from plan1 where $planName=$district");
    $fips = array('county' => array(), 'tract' => array(),
                  'bg' => array(), 'block' => array());
    while ($row = pg_fetch_assoc($result)) {
        // The length of a fips code identifies its census level
        switch (strlen(trim($row['fips']))) {
            case 15: array_push($fips['block'], trim($row['fips'])); break;
            case 12: array_push($fips['bg'], trim($row['fips'])); break;
            case 11: array_push($fips['tract'], trim($row['fips'])); break;
            case 5:  array_push($fips['county'], trim($row['fips'])); break;
        }
    }

    // Build one condition per census level; an empty level contributes 'false'
    if ($fips['block']) {
        $blockfips_comma_seperated = implode("','", $fips['block']);
        $blockCondition = "(fips in ('$blockfips_comma_seperated') AND sumlev='750')";
    } else {
        $blockCondition = 'false';
    }
    if ($fips['bg']) {
        $bgfips_comma_seperated = implode("','", $fips['bg']);
        $bgCondition = "(fips in ('$bgfips_comma_seperated') AND sumlev='740')";
    } else {
        $bgCondition = 'false';
    }
    if ($fips['tract']) {
        $tractfips_comma_seperated = implode("','", $fips['tract']);
        $tractCondition = "(fips in ('$tractfips_comma_seperated') AND sumlev='140')";
    } else {
        $tractCondition = 'false';
    }
    if ($fips['county']) {
        $countyfips_comma_seperated = implode("','", $fips['county']);
        $countyCondition = "(fips in ('$countyfips_comma_seperated') AND sumlev='050')";
    } else {
        $countyCondition = 'false';
    }

    // Sum the population of all units in the district
    $query = "select sum(total) as total, sum(white) as white, sum(minor) as minor, "
           . "sum(total18) as total18, sum(white18) as white18, sum(minor18) as minor18 "
           . "from p where $blockCondition OR $bgCondition OR $tractCondition OR $countyCondition";
    $result = pg_query($query);
    $pops = array();
    while ($row = pg_fetch_assoc($result)) {
        array_push($pops, $row['total']);
        array_push($pops, $row['white']);
        array_push($pops, $row['minor']);
        array_push($pops, $row['total18']);
        array_push($pops, $row['white18']);
        array_push($pops, $row['minor18']);
    }
    $pops_comma_seperated = implode(",", $pops);

    // Update population
    $query = "update cd set (total,white,minor,total18,white18,minor18) "
           . "=($pops_comma_seperated) "
           . "where planname='$planName' and districtnum=$district;";
    $result = pg_query($query);

    // Update the geometry column
    $geom = "st_MPolyFromText('$wkts[$key]',900913)";
    $query = "update cd set the_geom=$geom "
           . "where planname='$planName' and districtnum=$district;";
    $result = pg_query($query);
}

// Calculate the surface point of each geometry
$query = "update cd set surfacepoint=st_pointonsurface(the_geom) "
       . "where planname='$planName'";
$result = pg_query($query);
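The core idea of the script above is that the length of a fips code determines the census level, and each level is matched against its own sumlev value in the view p. A compact sketch of that logic, written in JavaScript purely for illustration, is given below. The names LEVEL_BY_LENGTH, groupFipsByLevel and populationCondition are hypothetical, the sample codes are made up, and the sketch assumes that fips codes contain digits only, so no quoting problems arise.

```javascript
// Code length -> census level and sumlev value, mirroring the PHP script above.
var LEVEL_BY_LENGTH = {
  5:  { level: "county", sumlev: "050" },
  11: { level: "tract",  sumlev: "140" },
  12: { level: "bg",     sumlev: "740" },
  15: { level: "block",  sumlev: "750" }
};

// Groups fips codes by census level; codes of unknown length are ignored.
function groupFipsByLevel(fipsList) {
  var groups = { county: [], tract: [], bg: [], block: [] };
  for (var i = 0; i < fipsList.length; i++) {
    var fips = String(fipsList[i]).trim();
    var info = LEVEL_BY_LENGTH[fips.length];
    if (info) {
      groups[info.level].push(fips);
    }
  }
  return groups;
}

// Builds the WHERE condition used to sum population over the view p.
function populationCondition(groups) {
  var clauses = [];
  for (var len in LEVEL_BY_LENGTH) {
    var info = LEVEL_BY_LENGTH[len];
    var codes = groups[info.level];
    if (codes.length > 0) {
      clauses.push("(fips in ('" + codes.join("','") + "') AND sumlev='" + info.sumlev + "')");
    }
  }
  // With no units at all the condition is simply false, as in the PHP script.
  return clauses.length > 0 ? clauses.join(" OR ") : "false";
}

// Made-up codes: one county and one block.
var groups = groupFipsByLevel(["39049", "390490011101000"]);
var condition = populationCondition(groups);
```

Keeping the level table in one place means that a change of summary level codes only has to be made once, rather than in four near-identical branches.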

So far, we have discussed the entire server-side procedure of selecting units, assigning districts and creating the geometry of new districts. In the following section, we discuss how to evaluate the plan we created.

6.3 Evaluation

To evaluate a plan, there are both a mandatory requirement imposed by the constitution and proposed requirements that are usually referred to as districting principles. The constitutional requirement includes only one criterion, which is equal population in each district. Districting principles include several geographic and political requirements that are broadly adopted among states.

6.3.1 Constitutional criterion

A plan has to satisfy the constitutional criterion, equal population, before it is sent to the court. Courts usually use terms with a definite statistical meaning to quantitatively evaluate how good a plan is according to the population equality criterion. These terms (Redistricting law Chapter 3 1999) are discussed as follows.

6.3.1.1 Ideal population

A logical starting point is the “ideal” district population. In a single-member district plan, the “ideal” district population is equal to the total state population divided by the total number of districts.

$$ p = \frac{P}{n} $$

where $P$ is the total population of a state, $n$ is the total number of districts in the state and $p$ is the average population of all districts.

It is worth noting that this calculation of the ideal population assumes single-member districts. For multimember districts, the “ideal” population is more properly expressed as the “ideal” population per representative and is obtained by dividing the total state population by the total number of representatives.

Besides the ideal population, we also need to evaluate how the population of each district differs from the “ideal” and how all districts collectively vary in population from the “ideal”. Two types of measures serve these two purposes: individual deviation and collective deviation, respectively.

6.3.1.2 Individual deviation measures

There are mainly two measures to quantify deviation: absolute deviation and relative deviation.

The “absolute deviation” is measured by the difference between the population of a district and the “ideal” population. It can be either positive or negative. As shown in the equation below, a positive number means that the district's population exceeds the “ideal” population while a negative number means that it falls below that number of people.

$$ d_{aj} = p_j - p $$

where $d_{aj}$ is the absolute deviation in population of the jth district, $p_j$ is the population of the jth district and $p$ is the “ideal” population.

“Relative deviation” is calculated by dividing the district's absolute deviation by the “ideal” population and expressing the result as a percentage. It is more commonly used than absolute deviation.

$$ d_{rj} = \frac{d_{aj}}{p} \times 100\% $$

where $d_{rj}$ is the relative deviation in population of the jth district, $d_{aj}$ is the absolute deviation in population of the jth district and $p$ is the average population for all districts.

Therefore, we also use individual relative deviation to derive collective deviation in the following.

6.3.1.3 Collective deviation measures

The deviation measures above quantify the population deviation of a single district from the “ideal” number. However, they do not measure how all districts in a plan vary collectively in population from the “ideal” number.

Mean deviation is proposed to solve the above problem. Following the calculation of deviation for a single district, mean deviation also comes in two forms: “absolute mean deviation” and “relative mean deviation”. Since relative deviation is more widely adopted, we give the calculation of “relative mean deviation” here. It is obtained by dividing the sum of the absolute values of the individual relative deviations by the number of districts (Xiao 2008). The equation is shown below.

$$ d_r = \frac{\sum_{j=1}^{n} |d_{rj}|}{n} $$

where $d_r$ is the relative mean deviation, $d_{rj}$ is the relative deviation of the jth district and $n$ is the total number of districts.

Overall range is more commonly used than mean deviation to measure the overall population equality of a plan. Based on individual relative deviation, the range is expressed as the difference between the maximum and the minimum individual relative deviation.

$$ R_r = d_{r\max} - d_{r\min} $$

where $R_r$ is the relative deviation range in population of the plan, $d_{r\max}$ is the maximum relative deviation and $d_{r\min}$ is the minimum relative deviation.

It is worth noting that relative mean deviation and relative deviation range should be used together to evaluate the overall deviation of a plan. The deviation range serves as a check for extremely deviated districts, while the mean deviation gives an overall measurement. In this research, we use relative individual deviation, relative overall range and relative mean deviation to evaluate the population equality of a plan.
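The measures defined in this section can be computed together from a list of district populations. The following sketch is an illustration only: the function name populationEquality and the sample populations are made up, and the thesis application computes these values on the server rather than with this code.

```javascript
// populations: array of district populations p_j (single-member districts).
function populationEquality(populations) {
  var n = populations.length;
  var total = 0;
  for (var i = 0; i < n; i++) {
    total += populations[i];
  }
  var ideal = total / n;                  // ideal population: p = P / n
  var relative = [];
  var sumAbs = 0;
  for (var j = 0; j < n; j++) {
    var dAbs = populations[j] - ideal;    // absolute deviation d_aj
    var dRel = dAbs * 100 / ideal;        // relative deviation d_rj in percent
    relative.push(dRel);
    sumAbs += Math.abs(dRel);
  }
  return {
    ideal: ideal,
    relative: relative,
    meanDeviation: sumAbs / n,            // relative mean deviation d_r
    range: Math.max.apply(null, relative) - Math.min.apply(null, relative)
  };
}

// Made-up example with three districts.
var result = populationEquality([90, 100, 110]);
```

For the made-up populations 90, 100 and 110, the ideal population is 100, the relative deviations are -10%, 0% and 10%, the relative mean deviation is 20/3 (about 6.67%) and the relative range is 20%.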

6.3.2 Traditional Districting Principles

Besides the constitutional requirement, there are also redistricting principles that are widely adopted by states in juridical practice. Appendix E includes a copy of Table 5 in Redistricting Law 2000. These principles can be divided into two broad categories: geographical principles and political principles (Redistricting law chapter 4). The districting principles in each category are listed in Table 4.

Geographical                                      Political
1. Compactness (a)                                1. Preservation of communities of interest (d)
2. Contiguity (b)                                 2. Preservation of cores of prior districts (e)
3. Preservation of political subdivisions (c)     3. Protection of incumbents (f)
                                                  4. Compliance with Section 2 of the Voting Rights Act (g)

Table 4. Traditional districting principles

Since this research mainly discusses political redistricting from a geographic perspective, we discuss only the geographic principles here. Due to the lack of data on political subdivisions, we implement only the first two geographic principles as examples.

a Shaw v. Reno (Shaw I), 509 U.S. 630, 647 (1993); Bush v. Vera, 517 U.S. 952, 959 (1996); DeWitt v. Wilson, 856 F. Supp 1409, 1414 (E.D. Cal 1994), summarily aff’d, 515 U.S. 1170 (1995).
b Shaw v. Reno (Shaw I), 509 U.S. 630, 647 (1993).
c Shaw v. Reno (Shaw I), 509 U.S. 630, 647 (1993); Abrams v. Johnson, 521 U.S. 74 (1997).
d Miller v. Johnson, 515 U.S. 900, 919-20 (1995); Abrams v. Johnson, 521 U.S. 74 (1997).
e Abrams v. Johnson, 521 U.S. 74 (1997).
f Abrams v. Johnson, 521 U.S. 74 (1997).
g Shaw v. Hunt (Shaw II), 517 U.S. 899, 915 (1996).

6.3.2.1 Compactness

Compactness requires that the shape of the district be ‘regular’. Compactness has long been considered a defense against gerrymandering. Gerrymandering is a term that describes the deliberate drawing of congressional district boundaries to influence the outcome of elections for either racial or partisan interests.

From a geometric perspective, a shape’s compactness is a measure of how spread out the shape is. It is also worth noting that compactness is not a sufficient guarantee against gerrymandering, since a district can be gerrymandered compactly. However, this does not mean that a compactness index is useless, since a district of highly irregular shape does deserve suspicion of gerrymandering. The following index is employed to check the compactness of districts. It is based on the ratio of the perimeter (p) of a district to the circumference of a circle of equal area (a) (Schwartzberg 1966); the form used here is the square of that ratio. The index is expressed as

$$ C = \frac{p^2}{4 \pi a} $$

C has a minimum value of 1 when the shape is a circle, which is perfectly compact. The value increases as the district becomes more irregular. A value over 2.8 is usually taken as an indicator of an “irregular” shape (Schwartzberg 1966). Compactness can be calculated in PostGIS using the following SQL script:

SELECT name, num,
       st_perimeter(the_geom)^2 / (4 * pi() * st_area(the_geom)) AS compactness
FROM cd
ORDER BY compactness
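To make the index concrete, the same quantity can be computed for a single polygon ring outside the database. The sketch below is an illustration, not part of the thesis application: the function name compactness is hypothetical, the ring is assumed to be a simple polygon without holes in a projected (planar) coordinate system, and the area is obtained with the shoelace formula.

```javascript
// ring: array of [x, y] vertices of a simple polygon without holes.
// The first vertex may or may not be repeated at the end.
function compactness(ring) {
  var twiceArea = 0;   // twice the signed area (shoelace formula)
  var perimeter = 0;
  for (var i = 0; i < ring.length; i++) {
    var a = ring[i];
    var b = ring[(i + 1) % ring.length];
    twiceArea += a[0] * b[1] - b[0] * a[1];
    perimeter += Math.sqrt(Math.pow(b[0] - a[0], 2) + Math.pow(b[1] - a[1], 2));
  }
  var area = Math.abs(twiceArea) / 2;
  // C = p^2 / (4 * pi * a); equals 1 for a circle and grows with irregularity.
  return (perimeter * perimeter) / (4 * Math.PI * area);
}

var squareIndex = compactness([[0, 0], [1, 0], [1, 1], [0, 1]]);
var stripIndex = compactness([[0, 0], [10, 0], [10, 1], [0, 1]]);
```

A unit square has perimeter 4 and area 1, so its index is 4/π, or about 1.27. The 10 by 1 strip has perimeter 22 and area 10, giving 484/(40π), or about 3.85, which is above the 2.8 threshold mentioned above.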


6.3.2.2 Contiguity

Contiguity requires all “building blocks” in the same district to be contiguous over space. It can be checked using the ST_Dump(geometry) function in PostGIS. This function decomposes the input geometry into its components. If the input geometry is a multi-geometry, it returns one record for each component of the geometry (PostGIS 1.4.0 Manual). Based on this, we can use the number of records returned by the function to determine whether a district is contiguous. The following SQL script lists all multipolygon districts of a specific plan and the number of their parts:

SELECT planname, districtnum, count(districtnum) AS count
FROM (SELECT planname, districtnum, (ST_Dump(the_geom)).geom
      FROM cd
      WHERE planname = 'cd110th') AS parts
GROUP BY planname, districtnum
HAVING count(districtnum) > 1

For plan cd110th, the return will be:

  name   | num | count
---------+-----+-------
 cd110th |   9 |     7
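The same question, how many separate parts a district has, can also be answered without geometry when an adjacency list of census units is available. The sketch below is a different technique from the ST_Dump approach used in this thesis: it counts connected components with a breadth-first search. The function name countParts and the neighbors table are hypothetical (the thesis database does not store such a table), and adjacency is assumed to mean that two units share a boundary segment.

```javascript
// units: array of unit ids assigned to one district.
// neighbors: object mapping a unit id to an array of adjacent unit ids.
function countParts(units, neighbors) {
  var inDistrict = {};
  var visited = {};
  for (var i = 0; i < units.length; i++) {
    inDistrict[units[i]] = true;
  }
  var parts = 0;
  for (var u = 0; u < units.length; u++) {
    if (visited[units[u]]) {
      continue;
    }
    // An unvisited unit starts a new connected part of the district.
    parts++;
    var queue = [units[u]];
    visited[units[u]] = true;
    while (queue.length > 0) {
      var current = queue.shift();
      var adjacent = neighbors[current] || [];
      for (var k = 0; k < adjacent.length; k++) {
        var next = adjacent[k];
        // Only follow neighbors that belong to the same district.
        if (inDistrict[next] && !visited[next]) {
          visited[next] = true;
          queue.push(next);
        }
      }
    }
  }
  return parts;
}

// Made-up chain of units A-B-C plus an isolated unit D.
var neighbors = { A: ["B"], B: ["A", "C"], C: ["B"], D: [] };
var partsAB_D = countParts(["A", "B", "D"], neighbors);
```

A district is contiguous when the count is 1. In the made-up example, the district made of A, B and D has two parts because D touches neither A nor B.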

It is worth mentioning that, compared with the constitutional constraint, geographic compactness should not be overstressed. Geographers should be aware that the physical shape of a meaningful territory is of far less importance than its behavioral shape or sense of integrity. Besides the above two geographic criteria, there are also political criteria for the redistricting process. However, a discussion of them is beyond the scope of this research.


CHAPTER 7: Results

7.1 Exploring the census

7.1.1 Use map extent to query

Figure 8 and Figure 9 show two choropleth maps of county data. Both use the map extent as the spatial filter. The map extent shown in Figure 8 contains all the counties in Ohio. The legend for this map is shown on the left side of the map container. Blue in the legend indicates low population and red indicates high population. The classification method we use for quantitative data such as population simply assigns each value to its own class, so that the differences among populations are maximized. Other classification methods, such as equal interval, quantile, natural breaks and standard deviation, can also be implemented according to the needs of different applications.
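Two of the alternative classification methods mentioned above, equal interval and quantile, are simple enough to sketch. The code below is an illustration only and is not part of the thesis implementation: the function names are hypothetical, the sample values are made up, and each function returns the upper limit of every class.

```javascript
// Upper class limits for k classes of equal width.
function equalIntervalBreaks(values, k) {
  var min = Math.min.apply(null, values);
  var max = Math.max.apply(null, values);
  var breaks = [];
  for (var i = 1; i <= k; i++) {
    breaks.push(min + (max - min) * i / k);
  }
  return breaks;
}

// Upper class limits for k classes holding roughly equal numbers of values.
function quantileBreaks(values, k) {
  var sorted = values.slice().sort(function (a, b) { return a - b; });
  var breaks = [];
  for (var i = 1; i <= k; i++) {
    var index = Math.ceil(sorted.length * i / k) - 1;
    breaks.push(sorted[index]);
  }
  return breaks;
}

// Zero-based index of the first class whose upper limit is >= value.
function classify(value, breaks) {
  for (var i = 0; i < breaks.length; i++) {
    if (value <= breaks[i]) {
      return i;
    }
  }
  return breaks.length - 1;
}

var populations = [10, 20, 30, 40, 100];   // made-up values
var equalBreaks = equalIntervalBreaks(populations, 3);
var quantBreaks = quantileBreaks(populations, 3);
```

With the made-up values, the equal interval limits are 40, 70 and 100, while the quantile limits are 20, 40 and 100. The value 30 therefore falls into the first class under equal interval but into the second class under quantile, which shows how strongly the choice of method can change the look of a choropleth map when the data are skewed.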

The performance of queries at the county level is satisfactory: showing a county map takes less than 10 seconds in total. This time includes both querying the database and rendering the 88 counties on the map.

We can see from the map that the population of Ohio is unevenly distributed. Large cities such as Cleveland, Cincinnati and Columbus and their neighboring areas have more population than other areas. This choropleth map lays the analytical basis for the subsequent redistricting process. Based on the color of each census unit and the labeled population, the user can decide whether to include a census unit (such as a county) in a new district.

Figure 8. Point based choropleth map of population by county

Figure 9. Polygon based choropleth map of population by county

7.1.2 Use user drawn feature to query

A spatial query has an advantage over a non-spatial query when the user wants to show certain census units based on an area drawn on the map. Spatial query operators include intersect, contain, touch, cross, etc. For example, the intersect operator determines whether a query feature (the drawn feature) intersects the features being queried; the features that it intersects are selected. It is worth noting that the query feature need not be drawn by the user: it can also be a predefined feature, such as a community of interest defined by a regionalization expert.
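The idea of selecting units with a drawn feature can be sketched with a simplified stand-in for the intersect operator: a ray-casting point-in-polygon test applied to one representative point per unit. This is only an illustration under that simplifying assumption; in the implemented system the spatial predicates are evaluated by PostGIS on the full geometries, and the unit names and coordinates below are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if the point lies inside the polygon (a list of (x, y) vertices)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def select_units(units, drawn_polygon):
    """Return the names of units whose representative point falls inside the drawn polygon."""
    return [name for name, pt in units.items() if point_in_polygon(pt, drawn_polygon)]


# Hypothetical drawn feature and unit points, for illustration only.
drawn = [(0, 0), (4, 0), (4, 4), (0, 4)]
units = {"A": (1, 1), "B": (5, 5), "C": (3, 2)}
selected = select_units(units, drawn)
```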

Figures 10 and 11 give an example of how to query census blocks based on user drawn features. The area in the map extent is around the Ohio State University’s Main Campus in Columbus, OH. In Figure 10, we create two polygons, one on each side of North High Street. Figure 11 shows the query result based on these two polygons. The classification field and the label field both use the district numbers in Ohio’s 110th congressional redistricting plan.

District number is a type of qualitative data: it only indicates the category to which a value belongs, and the magnitude of the value is not meaningful. We therefore give each qualitative value its own class. As shown in Figure 11, blue indicates census blocks assigned to district 15 in the 110th congressional redistricting plan, while red indicates census blocks assigned to district 12.
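A unique-value classification of this kind can be sketched as follows. The palette and the sample district numbers are hypothetical and chosen only to mirror the colors described for Figure 11.

```python
def unique_value_classes(values, palette):
    """Assign each distinct qualitative value its own class color, cycling through the palette."""
    classes = {}
    for v in sorted(set(values)):
        classes[v] = palette[len(classes) % len(palette)]
    return classes


# Hypothetical district numbers of five census blocks.
block_districts = [15, 12, 15, 15, 12]
palette = ["red", "blue", "green"]
classes = unique_value_classes(block_districts, palette)
```

Because the values are categories, the order in which colors are assigned carries no meaning; sorting is used here only to make the legend stable between requests.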

Figure 10. User drawn feature of defining a query

Figure 11. Blocks returned by query using user drawn features

7.2 Redistricting

7.2.1 Selecting census units

Here, we give an example of how to create districts based on counties. Figures 12-15 show each step. First, as shown in Figure 12, we start from a county population map and use the selection tool to draw a polygon that selects all counties intersecting it. After we finish drawing the feature, a JavaScript callback function called selectUnits() sends a query string through XMLHttpRequest to a PHP file. The PHP file processes the query and returns the coordinates of the point representation of each selected unit. At the same time, the statistical data are updated based on the current selection. A marker is shown on the map at the coordinates of each unit, as shown in Figure 13.
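The server-side response described above can be sketched as follows. The sketch is written in Python rather than PHP so that it is self-contained, and the JSON layout, the field names and the sample units are assumptions made for illustration; the actual response format of the implemented system may differ.

```python
import json


def selection_response(selected_units):
    """Build a payload with one marker per selected unit plus summary statistics."""
    markers = [{"fips": u["fips"], "x": u["x"], "y": u["y"]} for u in selected_units]
    total = sum(u["total"] for u in selected_units)
    return json.dumps(
        {"count": len(markers), "total_population": total, "markers": markers}
    )


# Hypothetical selected units with point coordinates and total population.
units = [
    {"fips": "39001", "x": -9290000.0, "y": 4700000.0, "total": 27000},
    {"fips": "39003", "x": -9360000.0, "y": 4980000.0, "total": 108000},
]
payload = json.loads(selection_response(units))
```

On the client side, the callback would read the markers to place them on the map and use the summary values to refresh the statistics panel.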

7.2.2 Assigning units to district

Then, we enter district number 1 in the input box of the “3. District” option and click the “Apply” button. All selected counties are assigned to district 1. When the assignment has been processed, an alert dialog pops up indicating how many units were updated and the time the update took.
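The assignment step amounts to an update of the preliminary plan table. The following minimal sketch uses an in-memory SQLite database in place of PostgreSQL so that it is self-contained; the table and column names follow plan1 in Appendix A, while the function name and the sample FIPS codes are hypothetical.

```python
import sqlite3
import time


def assign_units(conn, fips_list, district):
    """Assign the given units to a district; return (units updated, elapsed seconds)."""
    start = time.perf_counter()
    placeholders = ",".join("?" for _ in fips_list)
    cur = conn.execute(
        f"UPDATE plan1 SET newplan1 = ? WHERE fips IN ({placeholders})",
        [district, *fips_list],
    )
    conn.commit()
    return cur.rowcount, time.perf_counter() - start


# In-memory stand-in for the plan1 table, with three hypothetical units.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plan1 (fips TEXT PRIMARY KEY, newplan1 INTEGER)")
conn.executemany(
    "INSERT INTO plan1 (fips) VALUES (?)", [("39001",), ("39003",), ("39005",)]
)
updated, seconds = assign_units(conn, ["39001", "39003"], 1)
```

The two returned values correspond to the information reported in the alert dialog: the number of units updated and the time taken.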

7.2.3 Finishing redistricting

After finishing the assignment to a district, the user can choose to finish the current redistricting. When the user clicks the finish button, all census units in the same district are merged to create a new district. Since we have finished only one district, only this district is shown on the map, as indicated by the blue area in Figure 15.
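The effect of merging units can be illustrated with a simplified dissolve: when adjacent units share identical edges, an edge that belongs to exactly one unit lies on the outline of the merged district, while an edge shared by two units is interior and disappears. This is only a sketch under that assumption of topologically clean, vertex-matched units; the implemented system relies on the spatial union operation in PostGIS, and the two unit squares below are hypothetical.

```python
from collections import Counter


def dissolve_boundary(polygons):
    """Return the edges that belong to exactly one polygon, i.e. the merged outline."""
    counts = Counter()
    for ring in polygons:
        n = len(ring)
        for i in range(n):
            # An undirected edge, so that both neighbors produce the same key.
            edge = frozenset((ring[i], ring[(i + 1) % n]))
            counts[edge] += 1
    return {edge for edge, c in counts.items() if c == 1}


# Two hypothetical square units sharing the edge from (1, 0) to (1, 1).
left = [(0, 0), (1, 0), (1, 1), (0, 1)]
right = [(1, 0), (2, 0), (2, 1), (1, 1)]
outline = dissolve_boundary([left, right])
```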

Figure 12. Draw feature to select census units

Figure 13. Selected units shown as markers

Figure 14. Assign selected units to a district

Figure 15. Merge the boundaries of all units in the same district

7.3 Evaluation

The evaluation process checks both the constitutional criteria and the geographic criteria of the current plan. The table in Figure 16 shows the evaluation results for the assigned districts. We have currently finished four districts. From the evaluation results we can see that only districts 2 and 3 pass the check for both constitutional and geographic constraints. District 1 violates the population constraint since its population is 15.12% higher than the average of all districts. District 4 violates the geographic constraints since it has multiple parts and its compactness index is higher than the threshold, which in this research we set to 2.8.
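The checks described above can be sketched as follows. The sketch assumes a Schwartzberg-style compactness index (the district perimeter divided by the circumference of a circle with the same area); the index defined earlier in this thesis may differ in detail. The 5% population tolerance and the sample district are hypothetical, and only the 2.8 threshold is taken from the text.

```python
import math

COMPACTNESS_THRESHOLD = 2.8  # threshold stated in the text


def population_deviation_pct(population, mean_population):
    """Percentage by which a district's population departs from the mean of all districts."""
    return (population - mean_population) / mean_population * 100.0


def schwartzberg_index(area, perimeter):
    """Perimeter divided by the circumference of an equal-area circle (1.0 for a circle)."""
    return perimeter / (2.0 * math.sqrt(math.pi * area))


def evaluate(district, mean_population, max_deviation_pct):
    """Check the population, contiguity and compactness constraints for one district."""
    dev = population_deviation_pct(district["population"], mean_population)
    index = schwartzberg_index(district["area"], district["perimeter"])
    return {
        "deviation_pct": dev,
        "population_ok": abs(dev) <= max_deviation_pct,  # tolerance is an assumption
        "contiguous": district["parts"] == 1,  # a multipart district is not contiguous
        "compactness": index,
        "compact_ok": index <= COMPACTNESS_THRESHOLD,
    }


# Hypothetical single-part square district of 10 by 10 map units.
district = {"population": 115120, "area": 100.0, "perimeter": 40.0, "parts": 1}
result = evaluate(district, mean_population=100000, max_deviation_pct=5.0)
```

In this example the district is compact and contiguous but fails the population check, which is the same kind of outcome reported for district 1.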

Figure 17 is an example of evaluating an existing plan, cd110th, which is the official 110th congressional redistricting plan of the state of Ohio. From Figure 17 we can see that although the population deviation of this plan is small, only 6 of its districts have approximately “zero deviation”. Most of the districts are highly suspect of gerrymandering, and only district 14 can be considered regular according to the compactness index threshold of 2.8.

Figure 16. Evaluation of districts

Figure 17. Evaluation of 110th congressional districts

CHAPTER 8: Conclusions

8.1 Conclusions

Political redistricting is an important and controversial issue. It is important because the spatial rearrangement of census units into new districts ends in a reallocation of political power when new representatives are elected. Also, intentionally redrawing district boundaries according to specific partisan or racial interests diminishes a citizen’s voting power. As a result, although the citizen's right to vote has not been denied, the citizen is less able to influence the outcome of the election equally if gerrymandering exists.

An effective way for citizens to protect their voting rights is to engage in the redistricting process. One way to reach this goal is through an online participatory system such as a Web-based GIS. The initial evidence provided in this research indicates that open source geospatial software such as Postgresql/PostGIS, MapServer and OpenLayers is potentially suitable for implementing effective spatial analytical tools for a Web-based GIS application. The system implemented in this research not only provides Web mapping functions based on user-defined queries for retrieving census data but also provides spatial processing functions, such as the spatial union operation.

A wide range of user groups can benefit from the Web-based political redistricting GIS implemented in this research. First, government agencies can use the platform to publish their political plans as well as the data and methodology they employ to devise the policy. By doing this, the plan-making process becomes more transparent and therefore increases people's trust in their government. Second, ordinary users can assess the environmental impact of a plan using the implemented mapping tools, for example, how the plan will reshape the spatial structure of their own neighborhood. A last potential user group is political activists and related experts who have both expertise in regional research and the enthusiasm to participate in politics. Their research results can be used to evaluate the cultural, ecological and environmental impact of the policy and therefore offer either support or challenge. In particular, results such as the definitions of communities of interest can be adopted to evaluate whether a devised district splits or preserves an existing community. The preservation of communities has raised many concerns in recent years as a counter to the increased dehumanization of the redistricting process. In sum, the Web-based political redistricting GIS implemented in this research can benefit user groups ranging from the government and activists to ordinary people. It provides a user interface for devising redistricting plans from inception to final evaluation and a platform for comparing different plans. As a result, the government policy-making procedure can become more transparent and democracy can be improved.

Finally, the framework proposed in this research can potentially be adapted to other development environments with similar budget and functionality requirements. In essence, defining a state’s congressional voting districts is no different from defining legislative districts. The main difference lies in the size of the population associated with each representative, but this does not change the political and geographic nature of the redistricting issue.

8.2 Future research

8.2.1 Preservation of county and city boundaries

Concerns have been raised in recent years about which geographic unit is the most suitable building block for redistricting. One proposal is to use the Super District method adopted by the North Carolina Supreme Court as a means to limit the opportunity for political mischief (Peterson 2008). It aims to devise so-called super districts based entirely on county boundaries. One or more districts can then be devised from each super district by dividing its population among the districts. Another proposal is to use cities as the building blocks in populated areas where the county population exceeds that of a district. In this case, although counties in general are more compact than cities, splitting them is inevitable, so the integrity of each county cannot be preserved, and cities are more readily suitable as the building blocks. The objective of adopting county and city boundaries whenever possible is to minimize the chance that local jurisdiction boundaries are split by district boundaries (Cain, Hui and MacDonald 2008).

8.2.2 Community of interest

Community of interest is the most vaguely defined criterion, yet it is usually considered by courts when several competing maps are compared. For example, counting the number of city or county splits is one way of evaluating the preservation of communities of interest. Controversies focus on whether a community of interest has a geographic essence.

One opinion is that the community of interest is a geographic concept, but one whose geographic unit does not have a crisp boundary. In an appreciation of a paper in time geography (Kwan 1999), Forest (2005) argues for the importance of considering dynamic processes in defining the spatial extent of communities. He points out that processes such as day-to-day interactions and the organizations of civil society are driving forces in forming communities. Representations of this kind, such as those using time geography methods, can reconcile the doctrine of practice that takes boundaries as the sole representation of community.

The other opinion is that communities of interest are better thought of as an overall ideological congruence among citizens rather than a geographic consideration (Brunell 2006). Brunell argues that a community of interest should be composed entirely of either Democrats or Republicans (liberals or conservatives) and that this is the most important measure for deciding where to split a city or a county. This opinion places more emphasis on political criteria such as partisan competitiveness and inevitably sacrifices some geographic criteria such as compactness and contiguity.

REFERENCES

Altman, M, MacDonald, K and McDonald, M. 2005. From crayons to computers: the evolution of computer use in redistricting. Social Science Computer Review. 23(3): 334-346.

Barndt, M. 2002. A Model for Evaluating Public Participation GIS. In Craig, W., T. Harris, and D. Weiner (Eds.), Community Participation and Geographic Information Systems. Taylor and Francis. London. 346-356.

Boulos, M.N.K., Russell, C., and Smith, M. 2005. Web GIS in practice II: interactive SVG maps of diagnoses of sexually transmitted diseases by Primary Care Trust in London, 1997-2003. International Journal of Health Geographics. 4(1): 4.

Boulos, M.N.K. 2005. Web GIS in practice III: creating a simple interactive map of England's Strategic Health Authorities using Google Maps API, KML, and MSN Virtual Earth Map Control. International Journal of Health Geographics. 4(1): 22.

Boulos, M.N.K. and Honda, K. 2006. Web GIS in practice IV: publishing your health maps and connecting to remote WMS sources using the Open Source UMN MapServer and DM Solutions MapLab. International Journal of Health Geographics. 5(1): 6.

Brunell, T.L. 2006. Rethinking redistricting: How drawing uncompetitive districts eliminates gerrymanders, enhances representation, and improves attitudes toward Congress. Political Science. 39(1): 77.

Cain, B.E., Hui, I., and MacDonald, K. 2008. Sorting or Self-Sorting: Competition and Redistricting in California. The New Political Geography of California. Berkeley Public Policy Press, Berkeley, California.

Caldeweyher, D, Zhang, J and Pham, B. 2006. OpenCIS—Open Source GIS-based web community information system. International Journal of Geographical Information Science. 20(8): 885-898.

Carver, S., Evans, A., Kingston, R., and Turton, I. 2001. Public participation, GIS, and Cyberdemocracy: evaluating on-line spatial decision support systems. Environment and Planning B. 28(6): 907-922.

Carter, H. 1995. The study of urban geography. Arnold, London.

Cha, S.J., Hwang, Y.Y., Chang, Y.S., Kim, K.O. 2007. Integrating Ajax into GIS Web Services for Performance Enhancement. Lecture Notes on Computer Science. 4488/2007: 562-568.

Chicago Police Department. 2005. CLEARMAP http://events.esri.com/uc/2005/sag/list/index.cfm?fa=pr&SID=247. Honors for Exceptional Work Using GIS Technology ESRI International User Conference.

Choi, J.Y., Engel, B.A., and Farnsworth, R.L. 2005. Web-based GIS and spatial decision support system for watershed management. Journal of Hydroinformatics. 7(3): 165-174.

Gourley, D. HTTP: The Definitive Guide, 1st Edition. O'Reilly Media, Inc.

District for ArcGIS. http://www.esri.com/industries/elections/business/redistrict.html.

Ext JS. http://extjs.com/.

Forest, B. 2004. Information sovereignty and GIS: the evolution of “communities of interest” in political redistricting. Political Geography. 23(4):425-451.

Forest, B. 2004. The legal (de) construction of geography: race and political community in Supreme Court redistricting decisions. Social & Cultural Geography. 5(1):55-73.

Fortner, M. 2009. Ohio Redistricting Competition. http://www.sos.state.oh.us/SOS/Text.aspx?page=12303&AspxAutoDetectCookieSupport=1.

Garrett, J.J. 2005. Ajax: A New Approach to Web Applications. Adaptive Path. http://www.adaptivepath.com/publications/essays/archives/000385.php.

Ghose, R. and Huxhold, W.E. 2001. Role of Local Contextual Factors in Building Public Participation GIS: The Milwaukee Experience. Cartography and Geographic Information Science. 28(3): 195-208.

Goodchild, M.F. 2007. Citizens as voluntary sensors: spatial data infrastructure in the world of Web 2.0. International Journal of Spatial Data Infrastructures Research. 2: 24-32.

Handley, L. 2000. A guide to 2000 “redistricting tools and technology”. In N. Persily (Ed.), The real Y2K problem: Census 2000 data and redistricting technology. Brennan Center for Justice, New York. 28-42.

Haklay, M. and Tobón, C. 2003. Usability evaluation and PPGIS: towards a user-centred design approach. International Journal of Geographical Information Science. 17(6): 577-592.

Harris, T. and Weiner, D. 2002. Implementing a community-integrated GIS: perspectives from South African fieldwork. Community Participation and Geographic Information Systems. 246-258.

Harris, T. and Weiner, D. 1998. Empowerment, Marginalization, and "Community-integrated" GIS. Cartography and Geographic Information Science. 25(2): 67-76.

Johnston, R.J., Gregory, D. and Smith, D.M. 1994. The Dictionary of Human Geography 3rd edition. Blackwell, Oxford. 110.

jQuery. A new kind of JavaScript library. http://jquery.com/.

Kingston, R. 2000. Web-based public participation geographical information systems: an aid to local environmental decision-making. Computers, Environment and Urban Systems. 24(2): 109-125.

Kong, L. 1990. Geography and religion: trends and prospects. Progress in Human Geography. 14(3): 355-371.

Krugman, P. 1991. Geography and Trade. MIT Press, Cambridge, MA.

Kwan, M.P. 1999. Gender and individual access to urban opportunities: a study using space–time measures. The Professional Geographer. 51(2): 211-227.

Kwan, M.P. 2002. Feminist Visualization: Re-envisioning GIS as a Method in Feminist Geographic Research. Annals of the Association of American Geographers. 92(4): 645-661.

Lu, X. 2005. An Investigation on Service-Oriented Architecture for Constructing Distributed Web GIS Application. IEEE Computer Society Washington, DC, USA. 1: 191-197.

Morrill, R.L. 1981. Political redistricting and geographic theory. Washington DC: Association of American Geographers. 23.

McDonald, M.P. 2000. Redistricting Websites: 2000 Redistricting. http://elections.gmu.edu/Redistricting_websites_2000.html.

Miller, C.C. 2006. A Beast in the Field: The Google Maps Mashup as GIS/2. Cartographica: The International Journal for Geographic Information and Geovisualization. 41(3): 187-199.

Muller, M.J., Haslwanter, J.H. and Dayton, T. 1997. Participatory practices in the software lifecycle. Handbook of Human-Computer Interaction, 2nd edition. Elsevier. 255-297.

Nash, E., Korduan, P., Abele, S. and Hobona, G. 2008. Design Requirements for an AJAX and Web-Service Based Generic Internet GIS Client. Proceedings of the 11th AGILE International. 1-6.

Obermeyer, N.J. 1998. PPGIS: the evolution of public participation GIS. Unpublished UCGIS white paper.

O'Looney, J. 2000. Beyond Maps—GIS and Decision Making in Local Government. Redlands, Calif. ESRI Press. 206.

OpenLayers 2.5. http://trac.openlayers.org/wiki/SphericalMercator.

O’Reilly, T. 2007. What is Web 2.0: Design Patterns and Business Models for the Next Generation of Software. Communications and Strategies. 1: 17.

OSGEO. 2009. About the Open Source Geospatial Foundation http://www.OSGEO.org/content/foundation/about.html.

Pacione, M. 2005. Urban geography: a global perspective, Edition 3. London: Routledge.

Paulson, L.D. 2005. Building rich web applications with Ajax. Computer. 38(10): 14-17.

Peterson, D. 2008. Putting chance to work: Reducing the politics in political redistricting. CHANCE. 21(1): 22.

PostGIS 1.4.0 Manual. http://postgis.refractions.net/documentation/manual-1.4/.

Redistricting Law 2000. 1999. National Conference of State Legislatures, Denver, Colorado.

Schwartzberg, J. 1966. Reapportionment, Gerrymandering and the Notion of Compactness. Minnesota Law Review. 50: 443-457.

Sieber, R.E. 2000. GIS Implementation in the Grassroots. URISA Journal. 12(1): 15-29.

Sieber, R.E. 2004. Rewiring for a GIS/2. Cartographica: The International Journal for Geographic Information and Geovisualization, 39(1): 25-39.

Sieber, R.E. 2006. Public Participation Geographic Information Systems: A Literature Review and Framework. Annals of the Association of American Geographers. 96(3): 491-507.

Stacey, M. 1969. The Myth of Community Studies. The British Journal of Sociology. 20(2): 134-147.

Stein, M. 1960. The eclipse of community. Princeton, NJ: Princeton University Press.

Sui, D. 2008. The wikification of GIS and its consequences: Or Angelina Jolie’s new tattoo and the future of GIS. Computers Environment and Urban Systems. 32(1):1-5.

Talen, E. 1999. Constructing neighborhoods from the bottom up: the case for resident-generated GIS. Environment and Planning B. 26: 533-554.

Talen, E. 2000. Bottom-Up GIS. Journal of the American Planning Association. 66(3): 279-294.

Terra GIS. 2008. Obama Campaign-Mapping voters with Mapserver, PostGIS and Openlayers. http://www.terragis.net/2008/11/24/obama-campaign-mapping-voters-with- --and-openlayers/.

Umar, A. 1997. Object-oriented client/server Internet environments. Prentice Hall Press Upper Saddle River, NJ, USA.

Vatsavai, RR, Shekhar, S, Burk T.E., Lime, S. 2006. UMN-MapServer: A high-performance, interoperable, and open source web mapping and geo-spatial analysis system. Lecture Notes in Computer Science. 4197/2006: 400-417.

Ward, M. and O'Looney, J. 2002. Spatial Methods in Political Science. Political Analysis.

Weiner, D., Harris, T., and Craig, W. 2002. Community participation and geographic information systems. Community Participation and Geographic Information Systems. 3-6.

Weiner, D., and Harris, T. 2003. Community-integrated GIS for Land Reform in South Africa. URISA Journal. 15(2): 61-73.

Williams, J.C. 1995. Political Redistricting: A Review. Papers in Regional Science. 74(1): 13-40.

Xiao, N., Ahlqvist, O., and Kwan, M.P. 2007. Public Participation GIS for the General Public? the 9th International Conference on Geocomputation, Maynooth, Ireland.

Xiao, N. 2008. A Unified Conceptual Framework for Geographical Optimization Using Evolutionary Algorithms. Annals of the Association of American Geographers. 98(4): 795-817.

Yi, Q, Hoskins, RE, Hillringhouse, EA. 2008. Integrating open-source technologies to build low-cost information systems for improved access to public health data. International Journal of Health Geographics. 7: 29.

YUI. The Yahoo! User Interface Library (YUI) http://developer.yahoo.com/yui/.

Zelinsky, W. 1961. An approach to the religious geography of the United States: Patterns of church membership in 1952. Annals of the Association of American Geographers. 51 (2): 139-193.

Appendix A: SQL Scripts for Creating Postgresql Database Tables

A1 Create census tables

A1.1 Create block table

CREATE TABLE block (
  gid serial NOT NULL,
  the_geom geometry,
  fips character(15),
  surfacepoint geometry,
  bgfips character(12),
  tractfips character(11),
  countyfips character(5),
  CONSTRAINT block_pkey PRIMARY KEY (gid),
  CONSTRAINT enforce_dims_surfacepoint CHECK (ndims(surfacepoint) = 2),
  CONSTRAINT enforce_dims_the_geom CHECK (ndims(the_geom) = 2),
  CONSTRAINT enforce_geotype_geom CHECK (geometrytype(the_geom) = 'POLYGON'::text OR the_geom IS NULL),
  CONSTRAINT enforce_geotype_surfacepoint CHECK (geometrytype(surfacepoint) = 'POINT'::text OR surfacepoint IS NULL),
  CONSTRAINT enforce_srid_surfacepoint CHECK (srid(surfacepoint) = 900913),
  CONSTRAINT enforce_srid_the_geom CHECK (srid(the_geom) = 900913)
) WITH (OIDS=FALSE);
ALTER TABLE block OWNER TO postgres;

CREATE INDEX idx_block_blkidfp ON block USING btree (fips);

CREATE INDEX idx_block_geom ON block USING gist (the_geom);

CREATE INDEX idx_block_gid ON block USING btree (gid);

CREATE INDEX idx_block_surfacepoint ON block USING gist (surfacepoint);

A1.2 Create bg table

CREATE TABLE bg (
  gid serial NOT NULL,
  fips character(12),
  the_geom geometry,
  surfacepoint geometry,
  tractfips character(11),
  countyfips character(5),
  CONSTRAINT bg_pkey PRIMARY KEY (gid),
  CONSTRAINT enforce_dims_surfacepoint CHECK (ndims(surfacepoint) = 2),
  CONSTRAINT enforce_dims_the_geom CHECK (ndims(the_geom) = 2),
  CONSTRAINT enforce_geotype_surfacepoint CHECK (geometrytype(surfacepoint) = 'POINT'::text OR surfacepoint IS NULL),
  CONSTRAINT enforce_geotype_the_geom CHECK (geometrytype(the_geom) = 'MULTIPOLYGON'::text OR the_geom IS NULL),
  CONSTRAINT enforce_srid_surfacepoint CHECK (srid(surfacepoint) = 900913),
  CONSTRAINT enforce_srid_the_geom CHECK (srid(the_geom) = 900913)
) WITH (OIDS=FALSE);
ALTER TABLE bg OWNER TO postgres;
COMMENT ON TABLE bg IS 'Table of block group';

CREATE INDEX idx_bg_bkgpidfp ON bg USING btree (fips);

CREATE INDEX idx_bg_gid ON bg USING btree (gid);

CREATE INDEX idx_bg_surfacepoint ON bg USING gist (surfacepoint);

CREATE INDEX idx_bg_the_geom ON bg USING gist (the_geom);

A1.3 Create tract table

CREATE TABLE tract (
  gid serial NOT NULL,
  fips character(11),
  the_geom geometry,
  surfacepoint geometry,
  countyfips character(5),
  CONSTRAINT tract_pkey PRIMARY KEY (gid),
  CONSTRAINT enforce_dims_surfacepoint CHECK (ndims(surfacepoint) = 2),
  CONSTRAINT enforce_dims_the_geom CHECK (ndims(the_geom) = 2),
  CONSTRAINT enforce_geotype_surfacepoint CHECK (geometrytype(surfacepoint) = 'POINT'::text OR surfacepoint IS NULL),
  CONSTRAINT enforce_geotype_the_geom CHECK (geometrytype(the_geom) = 'MULTIPOLYGON'::text OR the_geom IS NULL),
  CONSTRAINT enforce_srid_surfacepoint CHECK (srid(surfacepoint) = 900913),
  CONSTRAINT enforce_srid_the_geom CHECK (srid(the_geom) = 900913)
) WITH (OIDS=FALSE);
ALTER TABLE tract OWNER TO postgres;

CREATE INDEX idx_ctidfp_tract ON tract USING btree (fips);

CREATE INDEX idx_gid_tract ON tract USING btree (gid);

CREATE INDEX idx_tract_countyfips ON tract USING btree (countyfips);

CREATE INDEX idx_tract_surfacepoint ON tract USING gist (surfacepoint);

CREATE INDEX idx_tract_the_geom ON tract USING gist (the_geom);

A1.4 Create county table

CREATE TABLE county (
  gid serial NOT NULL,
  fips character(5),
  "name" character varying(100),
  the_geom geometry,
  surfacepoint geometry,
  CONSTRAINT county_pkey PRIMARY KEY (gid),
  CONSTRAINT enforce_dims_surfacepoint CHECK (ndims(surfacepoint) = 2),
  CONSTRAINT enforce_dims_the_geom CHECK (ndims(the_geom) = 2),
  CONSTRAINT enforce_geotype_surfacepoint CHECK (geometrytype(surfacepoint) = 'POINT'::text OR surfacepoint IS NULL),
  CONSTRAINT enforce_geotype_the_geom CHECK (geometrytype(the_geom) = 'MULTIPOLYGON'::text OR the_geom IS NULL),
  CONSTRAINT enforce_srid_surfacepoint CHECK (srid(surfacepoint) = 900913),
  CONSTRAINT enforce_srid_the_geom CHECK (srid(the_geom) = 900913)
) WITH (OIDS=FALSE);
ALTER TABLE county OWNER TO postgres;

CREATE INDEX idx_county_surfacepoint ON county USING gist (surfacepoint);

CREATE INDEX idx_county_the_geom ON county USING gist (the_geom);

CREATE INDEX idx_countyfp_county ON county USING btree (fips);

CREATE INDEX idx_gid_county ON county USING btree (gid);

A2 Create cd table (congressional district table)

CREATE TABLE cd (
  gid integer NOT NULL DEFAULT nextval('cd110_gid_seq'::regclass),
  districtnum smallint,
  planname character varying(50),
  the_geom geometry,
  surfacepoint geometry,
  total integer,
  white integer,
  minor integer,
  total18 integer,
  white18 integer,
  minor18 integer,
  CONSTRAINT cd110_pkey PRIMARY KEY (gid),
  CONSTRAINT enforce_dims_geom CHECK (ndims(the_geom) = 2),
  CONSTRAINT enforce_dims_surfacepoint CHECK (ndims(surfacepoint) = 2),
  CONSTRAINT enforce_geotype_surfacepoint CHECK (geometrytype(surfacepoint) = 'POINT'::text OR surfacepoint IS NULL),
  CONSTRAINT enforce_geotype_the_geom CHECK (geometrytype(the_geom) = 'MULTIPOLYGON'::text OR the_geom IS NULL),
  CONSTRAINT enforce_srid_geom CHECK (srid(the_geom) = 900913),
  CONSTRAINT enforce_srid_surfacepoint CHECK (srid(surfacepoint) = 900913)
) WITH (OIDS=FALSE);
ALTER TABLE cd OWNER TO postgres;

CREATE INDEX idx_cdgeo_gid ON cd USING btree (gid);

CREATE INDEX idx_cdgeo_surfacepoint ON cd USING gist (surfacepoint);

CREATE INDEX idx_cdgeo_the_geom ON cd USING gist (the_geom);

A3 Create geo table (geographic correspondence table)

CREATE TABLE geo (
  fileid character(6), stusab character(2), sumlev character(3),
  geocomp character(2), chariter character(3), cifsn character(2),
  logrecno character(7) NOT NULL, region character(1), division character(1),
  statece character(2), state character(2), county character(3),
  countysc character(2), cousub character(5), cousubcc character(2),
  cousubsc character(2), place character(5), placecc character(2),
  placedc character(1), placesc character(2), tract character(6),
  blkgrp character(1), block character(4), iuc character(2),
  concit character(5), concitcc character(2), concitsc character(2),
  aianhh character(4), aianhhfp character(5), aianhhcc character(2),
  aihhtli character(1), aitsce character(3), aits character(5),
  aitscc character(2), anrc character(5), anrccc character(2),
  msacmsa character(4), masc character(2), cmsa character(2),
  macci character(1), pmsa character(4), necma character(4),
  necmacci character(1), necmasc character(2), exi character(1),
  ua character(5), uasc character(2), uatype character(1),
  ur character(1), sldu character(3), sldl character(3),
  vtd character(6), vtdi character(1), zcta3 character(3),
  zcta5 character(5), submcd character(5), submcdcc character(2),
  arealand character(14), areawatr character(14), "name" character(90),
  funcstat character(1), gcuni character(1), pop100 character(9),
  res character(9), intptlat character(9), intptlon character(10),
  lsadc character(2), partflag character(1), sdelm character(5),
  sdsec character(5), sduni character(5), taz character(6),
  uga character(5), puma5 character(5), puma1 character(5),
  reserved character(32), fips character(15),
  PRIMARY KEY (logrecno)
);

A4 Create population tables

A4.1 Create p1 table (population by race table)

CREATE TABLE p1 (
  fileid character(3), stusab character(2), chariter character(3), cifsn character(2),
  logrecno character(7) NOT NULL,
  p0010001 integer, p0010002 integer, p0010003 integer, p0010004 integer,
  p0010005 integer, p0010006 integer, p0010007 integer, p0010008 integer,
  p0010009 integer, p0010010 integer, p0010011 integer, p0010012 integer,
  p0010013 integer, p0010014 integer, p0010015 integer, p0010016 integer,
  p0010017 integer, p0010018 integer, p0010019 integer, p0010020 integer,
  p0010021 integer, p0010022 integer, p0010023 integer, p0010024 integer,
  p0010025 integer, p0010026 integer, p0010027 integer, p0010028 integer,
  p0010029 integer, p0010030 integer, p0010031 integer, p0010032 integer,
  p0010033 integer, p0010034 integer, p0010035 integer, p0010036 integer,
  p0010037 integer, p0010038 integer, p0010039 integer, p0010040 integer,
  p0010041 integer, p0010042 integer, p0010043 integer, p0010044 integer,
  p0010045 integer, p0010046 integer, p0010047 integer, p0010048 integer,
  p0010049 integer, p0010050 integer, p0010051 integer, p0010052 integer,
  p0010053 integer, p0010054 integer, p0010055 integer, p0010056 integer,
  p0010057 integer, p0010058 integer, p0010059 integer, p0010060 integer,
  p0010061 integer, p0010062 integer, p0010063 integer, p0010064 integer,
  p0010065 integer, p0010066 integer, p0010067 integer, p0010068 integer,
  p0010069 integer, p0010070 integer, p0010071 integer,
  p0020001 integer, p0020002 integer, p0020003 integer, p0020004 integer,
  p0020005 integer, p0020006 integer, p0020007 integer, p0020008 integer,
  p0020009 integer, p0020010 integer, p0020011 integer, p0020012 integer,
  p0020013 integer, p0020014 integer, p0020015 integer, p0020016 integer,
  p0020017 integer, p0020018 integer, p0020019 integer, p0020020 integer,
  p0020021 integer, p0020022 integer, p0020023 integer, p0020024 integer,
  p0020025 integer, p0020026 integer, p0020027 integer, p0020028 integer,
  p0020029 integer, p0020030 integer, p0020031 integer, p0020032 integer,
  p0020033 integer, p0020034 integer, p0020035 integer, p0020036 integer,
  p0020037 integer, p0020038 integer, p0020039 integer, p0020040 integer,
  p0020041 integer, p0020042 integer, p0020043 integer, p0020044 integer,
  p0020045 integer, p0020046 integer, p0020047 integer, p0020048 integer,
  p0020049 integer, p0020050 integer, p0020051 integer, p0020052 integer,
  p0020053 integer, p0020054 integer, p0020055 integer, p0020056 integer,
  p0020057 integer, p0020058 integer, p0020059 integer, p0020060 integer,
  p0020061 integer, p0020062 integer, p0020063 integer, p0020064 integer,
  p0020065 integer, p0020066 integer, p0020067 integer, p0020068 integer,
  p0020069 integer, p0020070 integer, p0020071 integer, p0020072 integer,
  p0020073 integer,
  CONSTRAINT race_pkey PRIMARY KEY (logrecno)
) WITH (OIDS=FALSE);
ALTER TABLE p1 OWNER TO postgres;

CREATE INDEX idx_p1_logrecno ON p1 USING btree (logrecno);

A4.2 p2 table (voting population by race table)

CREATE TABLE p2 (
  fileid character(3), stusab character(2), chariter character(3), cifsn character(2),
  logrecno character(7) NOT NULL,
  p0030001 integer, p0030002 integer, p0030003 integer, p0030004 integer,
  p0030005 integer, p0030006 integer, p0030007 integer, p0030008 integer,
  p0030009 integer, p0030010 integer, p0030011 integer, p0030012 integer,
  p0030013 integer, p0030014 integer, p0030015 integer, p0030016 integer,
  p0030017 integer, p0030018 integer, p0030019 integer, p0030020 integer,
  p0030021 integer, p0030022 integer, p0030023 integer, p0030024 integer,
  p0030025 integer, p0030026 integer, p0030027 integer, p0030028 integer,
  p0030029 integer, p0030030 integer, p0030031 integer, p0030032 integer,
  p0030033 integer, p0030034 integer, p0030035 integer, p0030036 integer,
  p0030037 integer, p0030038 integer, p0030039 integer, p0030040 integer,
  p0030041 integer, p0030042 integer, p0030043 integer, p0030044 integer,
  p0030045 integer, p0030046 integer, p0030047 integer, p0030048 integer,
  p0030049 integer, p0030050 integer, p0030051 integer, p0030052 integer,
  p0030053 integer, p0030054 integer, p0030055 integer, p0030056 integer,
  p0030057 integer, p0030058 integer, p0030059 integer, p0030060 integer,
  p0030061 integer, p0030062 integer, p0030063 integer, p0030064 integer,
  p0030065 integer, p0030066 integer, p0030067 integer, p0030068 integer,
  p0030069 integer, p0030070 integer, p0030071 integer,
  p0040001 integer, p0040002 integer, p0040003 integer, p0040004 integer,
  p0040005 integer, p0040006 integer, p0040007 integer, p0040008 integer,
  p0040009 integer, p0040010 integer, p0040011 integer, p0040012 integer,
  p0040013 integer, p0040014 integer, p0040015 integer, p0040016 integer,
  p0040017 integer, p0040018 integer, p0040019 integer, p0040020 integer,
  p0040021 integer, p0040022 integer, p0040023 integer, p0040024 integer,
  p0040025 integer, p0040026 integer, p0040027 integer, p0040028 integer,
  p0040029 integer, p0040030 integer, p0040031 integer, p0040032 integer,
  p0040033 integer, p0040034 integer, p0040035 integer, p0040036 integer,
  p0040037 integer, p0040038 integer, p0040039 integer, p0040040 integer,
  p0040041 integer, p0040042 integer, p0040043 integer, p0040044 integer,
  p0040045 integer, p0040046 integer, p0040047 integer, p0040048 integer,
  p0040049 integer, p0040050 integer, p0040051 integer, p0040052 integer,
  p0040053 integer, p0040054 integer, p0040055 integer, p0040056 integer,
  p0040057 integer, p0040058 integer, p0040059 integer, p0040060 integer,
  p0040061 integer, p0040062 integer, p0040063 integer, p0040064 integer,
  p0040065 integer, p0040066 integer, p0040067 integer, p0040068 integer,
  p0040069 integer, p0040070 integer, p0040071 integer, p0040072 integer,
  p0040073 integer,
  CONSTRAINT race18_pkey PRIMARY KEY (logrecno)
) WITH (OIDS=FALSE);
ALTER TABLE p2 OWNER TO postgres;

CREATE INDEX idx_race18 ON p2 USING btree (logrecno);

A5 Create plan1 table (preliminary plan table)

CREATE TABLE plan1 (
  fips character(15) NOT NULL,
  newplan1 smallint,
  CONSTRAINT plan_pkey PRIMARY KEY (fips)
) WITH (OIDS=FALSE);
ALTER TABLE plan1 OWNER TO postgres;

CREATE INDEX idx_plan1_fips ON plan1 USING btree (fips);

A6 Create plan2 (final plan table)

CREATE TABLE plan2 (
  fips character(15) NOT NULL,
  cd110th smallint,
  cd109th smallint,
  cd108th smallint,
  CONSTRAINT plan2_pkey PRIMARY KEY (fips)
) WITH (OIDS=FALSE);
ALTER TABLE plan2 OWNER TO postgres;

CREATE INDEX idx_plan2_blockfips ON plan2 USING btree (fips);

Appendix B: SQL Scripts for Creating Database Views

B1 Create view p

CREATE OR REPLACE VIEW p AS
SELECT geo.fips, geo.sumlev, geo.logrecno,
       p1.p0010001 AS total,
       p1.p0010003 AS white,
       p1.p0010001 - p1.p0010003 AS minor,
       p2.p0030001 AS total18,
       p2.p0030003 AS white18,
       p2.p0030001 - p2.p0030003 AS minor18
FROM geo
JOIN p1 USING (logrecno)
JOIN p2 USING (logrecno);

B2 Create view block_plan_p

CREATE OR REPLACE VIEW block_plan_p AS
SELECT bp.*, p.total, p.white, p.minor, p.total18, p.white18, p.minor18
FROM (block b JOIN plan1 USING (fips) JOIN plan2 USING (fips)) bp
JOIN p USING (fips)
WHERE p.sumlev = '750'::bpchar;
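As an illustration of how this view might be queried, the following is a minimal sketch (not taken from the application code) that summarizes the voting-age population of each district in the final plan. It assumes that bp.* exposes the cd110th column of plan2 and that every block has been assigned a district; the output column names (district, voting_age_pop, and so on) are chosen here for illustration only.

```sql
-- Hypothetical summary query: voting-age population per 110th Congress district.
-- Assumes block_plan_p exposes cd110th, total18 and minor18 as defined above.
SELECT cd110th                       AS district,
       SUM(total18)                  AS voting_age_pop,
       SUM(minor18)                  AS voting_age_minority_pop,
       -- NULLIF avoids a division-by-zero error for districts with no population
       ROUND(100.0 * SUM(minor18) / NULLIF(SUM(total18), 0), 2)
                                     AS voting_age_minority_pct
FROM block_plan_p
GROUP BY cd110th
ORDER BY cd110th;
```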

B3 Create view bg_plan1_p

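-- Note: p.partflag is not among the columns selected by view p in B1;
-- view p must also expose partflag for this statement to run.
CREATE OR REPLACE VIEW bg_plan1_p AS
SELECT a.*, p.partflag, p.total, p.white, p.minor, p.total18, p.white18, p.minor18
FROM (bg b JOIN plan1 USING (fips)) a
JOIN p USING (fips)
WHERE p.sumlev = '740'::bpchar;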

B4 Create view tract_plan1_p

CREATE OR REPLACE VIEW tract_plan1_p AS
SELECT tp.*, p.total, p.white, p.minor, p.total18, p.white18, p.minor18
FROM (tract t JOIN plan1 USING (fips)) tp
JOIN p USING (fips)
WHERE p.sumlev = '140'::bpchar;

B5 Create view county_plan1_p

CREATE OR REPLACE VIEW county_plan1_p AS
SELECT cp.*, p.total, p.white, p.minor, p.total18, p.white18, p.minor18
FROM (county b JOIN plan1 USING (fips)) cp
JOIN p USING (fips)
WHERE p.sumlev = '050'::bpchar;

B6 Create view fips_the_geom

CREATE OR REPLACE VIEW fips_the_geom AS
SELECT block.fips, block.the_geom FROM block
UNION
SELECT bg.fips, bg.the_geom FROM bg
UNION
SELECT tract.fips, tract.the_geom FROM tract
UNION
SELECT county.fips, county.the_geom FROM county;

Appendix C: Sample Map File

MAP
  EXTENT -9239765.902761 4865841.483515 -9239348.484927 4866096.472664
  STATUS ON
  SIZE 800 600
  IMAGETYPE png
  NAME "block_plan2_p1"
  PROJECTION
    "init=epsg:900913"
  END
  SYMBOL
    NAME 'Circle'
    TYPE ELLIPSE
    FILLED TRUE
    POINTS
      1 1
    END
  END
  OUTPUTFORMAT
    NAME png
    DRIVER "GD/PNG"
    MIMETYPE "image/png"
    IMAGEMODE RGBA
    EXTENSION "png"
  END
  LEGEND
    STATUS ON
    KEYSIZE 12 12
    LABEL
      TYPE BITMAP
      SIZE SMALL
      COLOR 0 0 89
    END
  END

  LAYER
    NAME "pointLayer"
    CONNECTIONTYPE postgis
    CONNECTION "user=postgres dbname=ohio host=localhost"
    DATA "surfacepoint FROM block_plan2_p1 USING UNIQUE gid using srid=900913"
    FILTER "gid in (78831,83163,81740,80948,76701,87587,80947)"
    STATUS ON
    TYPE point
    LABELCACHE on
    LABELITEM "fips"
    CLASS
      NAME "Single Point Class"
      STYLE
        SYMBOL "Circle"
        SIZE 4
        COLOR 0 0 0
        OUTLINECOLOR 255 255 255
      END
      LABEL
        TYPE bitmap
        SIZE medium
        POSITION auto
        COLOR 0 0 0
        OUTLINECOLOR 255 255 255
        PARTIALS false
      END
    END
    PROCESSING "CLOSE_CONNECTION=DEFER"
  END #point layer

  LAYER
    NAME "lineLayer"
    CONNECTIONTYPE postgis
    CONNECTION "user=postgres dbname=ohio host=localhost"
    DATA "the_geom FROM block_plan2_p1 USING UNIQUE gid using srid=900913"
    FILTER "gid in (78831,83163,81740,80948,76701,87587,80947)"
    STATUS ON
    TYPE line
    CLASS
      COLOR 0 0 0
    END
    PROCESSING "CLOSE_CONNECTION=DEFER"
  END #line layer

  LAYER
    NAME "polygonLayer"
    CONNECTIONTYPE postgis
    CONNECTION "user=postgres dbname=ohio host=localhost"
    DATA "the_geom FROM block_plan2_p1 USING UNIQUE gid using srid=900913"
    FILTER "gid in (78831,83163,81740,80948,76701,87587,80947)"
    STATUS ON
    TYPE polygon
    CLASSITEM "cd110th"
    CLASS
      NAME "15"
      EXPRESSION "15"
      COLOR 255 0 0
    END
    CLASS
      NAME "12"
      EXPRESSION "12"
      COLOR 0 255 0
    END
    PROCESSING "CLOSE_CONNECTION=DEFER"
  END #polygon layer
END #map

Appendix D: User’s Manual

D1 Exploring the census

1. Choose a census level. This is the census layer to be created. If a map of the chosen census layer has already been created, the old one will be replaced.

2. Choose a spatial query method, either “map extent” or “draw geometry”. If the “draw geometry” option is chosen, use the editing tools at the upper right corner of the map to draw one or more features. These features will be used as query geometries. Use the Clear button to clear all drawings.

3. Define classification and labeling options for each type of representation of the layer. The point representation is a surface point guaranteed to lie within the polygon.

4. Click the “Query Census” button to query and generate the map. A legend of the map will be shown on the left side of the map.

5. Adjust the transparency or the drawing order of the map if necessary.

6. Check “click to identify” to identify each census unit on the map and view its detailed information.
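The spatial query in step 2 and the surface point in step 3 could be expressed in PostGIS roughly as follows. This is a sketch under stated assumptions, not the application's actual SQL: it assumes the block table stores the_geom in SRID 900913 (as the sample map file in Appendix C suggests), reuses the extent from that map file as the query rectangle, and relies on the PostGIS functions ST_MakeEnvelope, ST_Intersects, and ST_PointOnSurface.

```sql
-- Hypothetical "map extent" query against the block layer.
-- The rectangle is the EXTENT of the sample map file; SRID 900913 is assumed.
SELECT b.fips,
       b.the_geom,
       -- ST_PointOnSurface returns a point guaranteed to lie on the polygon,
       -- unlike a centroid, which can fall outside a concave unit.
       ST_PointOnSurface(b.the_geom) AS surfacepoint
FROM block b
WHERE ST_Intersects(
        b.the_geom,
        ST_MakeEnvelope(-9239765.902761, 4865841.483515,
                        -9239348.484927, 4866096.472664,
                        900913));
```

For the “draw geometry” option, the envelope would be replaced by the geometry drawn by the user.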

D2 Redistricting

1. Create a new plan or choose a previously created plan.

2. Check “select units from”. An editing tool will be added at the upper right corner of the map. Also specify which layer to choose units from. Use the editing tool to draw features on the map. Once a feature is drawn, all units from the specified layer that intersect the drawn feature will be selected.

3. Enter a number in the district number input box. Click “Apply” to update the plan table. Click “Finish Plan” to generate new district boundaries and calculate statistics.
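The selection and update described in steps 2 and 3 could be sketched in SQL as below. The statements are illustrative assumptions rather than the application's code: the polygon coordinates, the district number 15, and the choice of the block layer are hypothetical, and the geometry is assumed to be stored in SRID 900913.

```sql
-- Hypothetical "Apply": assign district 15 to every block that intersects
-- a polygon drawn by the user (coordinates are placeholders).
UPDATE plan1
SET newplan1 = 15
WHERE fips IN (
    SELECT b.fips
    FROM block b
    WHERE ST_Intersects(
            b.the_geom,
            ST_GeomFromText(
              'POLYGON((-9239765.9 4865841.5, -9239348.5 4865841.5,
                        -9239348.5 4866096.5, -9239765.9 4866096.5,
                        -9239765.9 4865841.5))',
              900913))
);

-- Hypothetical "Finish Plan": dissolve the assigned blocks into one
-- geometry per district with the ST_Union aggregate.
SELECT pl.newplan1           AS district,
       ST_Union(b.the_geom)  AS the_geom
FROM block b
JOIN plan1 pl USING (fips)
WHERE pl.newplan1 IS NOT NULL
GROUP BY pl.newplan1;
```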

D3 Evaluation

1. Choose a plan to load. A map and a statistical table will be created. Set the transparency and order of the plan map if necessary.

2. Check the constitutional constraint and geographic constraint options if the corresponding results are not shown.
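As one possible illustration of such an evaluation, the query below computes each district's percentage deviation from the mean district population. It assumes, as a labeled assumption and not as a statement of the application's method, that the constitutional constraint refers to population equality, and it reads district assignments from the cd110th column of the block_plan_p view in Appendix B.

```sql
-- Hypothetical population-equality check on the final plan.
WITH district_pop AS (
    SELECT cd110th    AS district,
           SUM(total) AS pop
    FROM block_plan_p
    GROUP BY cd110th
)
SELECT district,
       pop,
       -- Deviation from the mean district population, in percent.
       ROUND(100.0 * (pop - AVG(pop) OVER ())
             / NULLIF(AVG(pop) OVER (), 0), 2) AS deviation_pct
FROM district_pop
ORDER BY district;
```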

Appendix E: 1990s Districting Principles Used by Each State (a)

C = Required in congressional plans
L = Required in legislative plans
NC = Prohibited in congressional plans
NL = Prohibited in legislative plans
YC = Allowed in congressional plans
YL = Allowed in legislative plans

Columns, in order: Compact; Contiguous; Preserve Political Subdivisions; Preserve Communities of Interest; Preserve Cores of Prior Districts; Protect Incumbents; Voting Rights Act. Each row lists the state's non-empty entries in column order, separated by semicolons.

Alabama: C, L; C, L; C, L; C, L; C, L; C, L
Alaska: L; L; L; L
Arizona: C, L; C, L; C, L
Arkansas: C, L; C, L; YC, YL; C, L
California: L; L
Colorado: L; L; L; L
Connecticut: L; L
Delaware: L; NL
Florida: L
Georgia: C, L; C, L; C, L; YC, YL; C, L
Hawaii: L; L; L; L; NL
Idaho: C, L; C, L; C, L; C, L; NC, NL; C, L
Illinois: L; L
Indiana: L
Iowa: C, L; C, L; C, L; NC, NL; C, L
Kansas: C, L; C, L; C, L; C, L; C; NL; L
Kentucky: C; C; C; C; C
Louisiana: L; L; L; L
Maine: L; L; L
Maryland: C, L; C, L; C, L; C, L; C, L; YC, YL; C, L
Massachusetts: L; L
Michigan: L; L; L
Minnesota: C, L; C, L; C, L; C, L; C, L
Mississippi: C, L; C, L; C, L; C
Missouri: C, L; C, L; C; C; C; C
Montana: L; L; L; L; NL; L
Nebraska: C, L; C, L; C, L; C, L; NC, NL; C, L
Nevada: C, L; L; C, L; L; C, L
New Hampshire: L; L
New Jersey: L; C, L; L; C; C
New Mexico: L; L; L
New York: L; L; L
North Carolina: C, L; C, L; C; YC; C, L
North Dakota: L; L; L
Ohio: L; L; L
Oklahoma: L; L; L; L
Oregon: C, L; C, L; C, L; NC, NL; C, L
Pennsylvania: L; L; L
Rhode Island: L
South Carolina: C, L; C, L; C, L; C, L; C, L; YC, YL; C, L
South Dakota: L; L; L; L
Tennessee: L; L; L
Texas: L; L; C, L
Utah: C, L; C, L; C, L; C, L; NC, NL
Vermont: L; L; L; L; YL
Virginia: C, L; C, L; L; L; YL; L
Washington: C, L; C, L; C, L; C, L; NL
West Virginia: C, L; C, L; C, L
Wisconsin: L; L; L
Wyoming: C, L; C, L; C, L; L; NL; L

Table 5. Redistricting Principles by States

(a) This is Table 5 from redistricting 1990s, Chapter 3, Part 2:

http://www.senate.leg.state.mn.us/departments/scr/redist/red2000/Ch3part2.htm#Table%205
