Free and Open Source Geospatial Tools for Environ- mental Modeling and Management

A. Jolmaa , D.P. Amesb, N. Horningc, M. Netelerd, A. Racicote and T. Suttonf a Helsinki University of Technology, Espoo, Finland, [email protected] b Geographic Information Sciences Lab, Idaho State University, Idaho Falls, ID, USA Center for Biodiversity and Conservation, American Museum of Natural History, New York, NY, USA d Center for Scientific and Technologic Research, The Trentino Cultural Institute, Trento, Italy e Ecotrust, Portland, OR, USA f BiodiversityWorld Project

Abstract: Geospatial software tools (GIS) are used for creating, viewing, managing, analyzing, and utiliz- ing geospatial data. Geospatial data can include socio-economic, environmental, geophysical, and technical data about the Earth and societal infrastructure and it is pivotal in environmental modeling and manage- ment (EMM). Desktop, web-based, and embedded geospatial tools and systems have become an essential part of EMM. Environmental simulation models often require pre- or post-processing of geospatial data, or they can be tightly linked to a GIS, using it as a graphical user interface (GUI). Many local, regional, na- tional, and international efforts are underway to create geospatial data infrastructures and tools for viewing and using geospatial data. When environmental attribute data is linked to these infrastructures, powerful tools for environmental management are instantly created. The growing culture of free and open source software (FOSS) provides an alternative approach to software development also in the field of GIS (FOSS4G). For a systematic look at FOSS4G for EMM platforms, software stacks, and EMM workflows need to be analyzed. Platform is a service abstraction on which software stacks are built. A software stack for FOSS4G comprises system software, data processing tools, data serving tools, user interface tools, and end-user applications. Digital map creation, support for numerical modeling, and geospatial information systems are main areas of use for FOSS4G in EMM. The dividing line between FOSS and proprietary soft- ware is fuzzy, partly because it is in the interest of developers of proprietary software to make it fuzzy and partly because the end-users are getting reluctant to buy software. In the FOSS world the barriers to interop- erability are low and thus the software stack tends to be thicker than in the proprietary platform. The FOSS4G world thrives on the evolution of software stacks and platforms. Our examples show that it is pos- sible to build software stacks from current FOSS4G to support EMM workflows. In the examples we men- tion for example how a particular funding agency has chosen FOSS4G solutions because of the opportuni- ties to redistribute resulting modeling tools freely to end-users and to support general goals of openness and transparency with respect to modeling tools. Keywords: ; Open Source Software; Geospatial software; Geographic Information Systems; Environmental modeling and management

1. INTRODUCTION System (GIS)1. And, the development of such Geospatial software tools are used for creating, viewing, managing, analyzing, and utilizing geo- 1 We assume in this sentence a broad definition spatial data. An interoperable collection of such for a GIS. It is also possible to assume a more re- tools may be called a Geographic Information stricted definition and consider GIS separate from

tools and the study of their application to real- university curricula are taught around these soft- world problem solving may be called Geographic ware packages and many professional modelers, Information Science (GISci). Geospatial data can managers, and consultants depend on them in include socio-economic, environmental, geophysi- their work. We hypothesize here that this business cal, and technical data about the Earth and socie- model has resulted in a closed and monolithic tal infrastructure. The common characteristic of structure in GIS software and its development, all geospatial data is the presence of a spatial and that this has resulted in reduced interoperabil- component: objects are tied to actual places and ity, software transparency, and data transferabil- locations, referenced by geospatial coordinates. ity. Geospatial objects can encapsulate environmental The growing culture of free2 and open source soft- attributes and link to each other in models of envi- ware (FOSS) provides an alternative approach to ronmental networks. Geospatial objects can be software development also in the field of GIS. temporally static or dynamic; they can represent FOSS does not necessarily sacrifice business as- data at a variety of spatial scales and resolutions; pects of software development and application; and they can be very simple (e.g. points) or com- there are several successful companies built plex (e.g. 3-D triangulated irregular networks). around FOSS and companies are using FOSS Geospatial data and tools can be used to address components to a substantial degree [Bonnici, and answer many disparate types of questions in 2006]. Indeed, the FOSS model of software devel- most every field of study. For these and other rea- opment has produced a new breed of software and sons, a large number of tools have been developed associated business models. Certainly there are for working with geospatial data. One result of the many shinning examples of FOSS successes such proliferation of such tools is a coincident prolif- as the Linux/GNU operating system and its many eration of a large number of storage formats brands. The community that develops FOSS for which have been specifically designed for geospa- geoinformatics (FOSS4G) has recently gained tial data. While many of these formats are open, momentum3, making it a more viable option for i.e., their specifications have been published, oth- environmental modelers and managers. A survey ers are proprietary and do not support data inter- of FOSS4G has been presented by Ramsey [2005]. operability between tools. Published formal comparisons between FOSS4G and proprietary solutions are not common. Wik- Desktop, web-based, and embedded geospatial strøm and Tveite [2005] compared PostGIS and tools and systems have become an essential part of MapServer favorably against ArcSDE and Ar- environmental modeling and management cIMS. (EMM). With the incorporation of geospatial data sets these systems allow new and unique views The relationship between environmental models into the environment. This is particularly true in (as software), which re-create a simplification of the case of Internet-based tools that provide wide events in nature, and geospatial data (as stored in and relatively easy access to large geospatial data computers) has been studied for several decades. sets and useful analyses (e.g. driving directions, The initial phase of this research culminated in a environmental data summaries, etc.) Many local, series of conferences such as initial HydroGIS regional, national, and international efforts are [Kovar and Nachtnebel, 1993] and initial GIS/EM underway to create geospatial data infrastructures [Goodchild et al, 1993], which presented the then and tools for viewing and using geospatial data state-of-the-art in development and application of and have put geospatial technology in the interna- geospatial tools in EMM. One result of these ef- tional spotlight. In many cases, these efforts build forts has been the birth of new disciplines of hy- on free and open source tools and open, published data formats creating new challenges and oppor- tunities for the EMM software community. 2 We mainly use the term “free” to refer to the The GIS industry is growing rapidly, constituting freedoms [Anonymous, 2005] the developer and a $2.02 billion international market in 2004 [Da- user has with using the software or data, not to the ratech, 2004]. Many very successful commercial price tag. Many tools are free to use or download, companies have developed over the past 25 years, and they are sometimes useful, but the distinction built on the model of delivering proprietary GIS is important to note. 3 solutions and selling software licensing. Many We refer to the founding of OSGeo (www.osgeo.org) and to the successes of the Open Source Geospatial conferences, e.g., OSG’05: e.g., remote sensing image analysis tools and http://mapserver.gis.umn.edu/community/conferen CAD tools for geospatial data. ces/MUM3/.

droinformatics and geoinformatics. At the same are not necessarily used for remote sensing or time geocomputation has also emerged as a disci- CAD applications. pline, focusing on the study of geospatial phenom- (iii) Workflows – Each platform and software ena using computers. Despite these efforts, the stack is particularly suited for specific workflows. problem of efficiently connecting environmental We will describe some common workflows and models and geospatial tools still exists and is an show FOSS4G solutions for them. Analyzing typi- active area of research. cal “use cases” (software usage scenarios) by di- In this paper we examine the current state and verse user types is intended to show the strengths future prospects for free and open source geospa- and weaknesses of the current set of FOSS4G tial tools in EMM. We will analyze the current tools available and shed light on future opportuni- enabling technologies and tools along with future ties for improvement. directions. Analysis of the role of the community and of the community process is left for another 2. PLATFORMS paper in the same workshop [Gross et al. in prep.]. We also do not consider problems related An important trend in computing is the increased to intellectual property, such as licensing issues. availability of new platforms: e.g. different ver- We restrict our description and analysis to tools, sions of operating systems, different combinations which give modelers and managers the freedom of of hardware, and different client-server protocols adopting, using, learning, and extending the tools, for delivering tools to users. As software becomes i.e., they can be considered FOSS4. The descrip- increasingly important in everyday life, more peo- tion reflects our current understanding of what are ple use software, and more tasks are taken care of or will be the key building blocks of geospatially by software, the number of existing platforms aware information systems for environmental grows. The concept has been popularized most problem solving. Links to web pages of the vari- visibly by Ray Ozzie, CTO of Microsoft and key ous software tools mentioned in this paper are not developer of Lotus Notes, who has defined “plat- always given since they are usually very easy to form” in his weblog as “a relevant and ubiquitous find using Internet search tools. The discussed common service abstraction”5. FOSS4G tools are only a representative set of what is available. Portals, such as freegis.org, at- The benefit of a platform to a user is a common tempt to maintain comprehensive lists. look and feel and an interoperable set of function- ality. The benefit of a platform to a developer is For a systematic look at FOSS4G we divide our the ready availability of interoperable tools and analysis in three main parts: services, e.g., application programmer interfaces (i) Platforms – Platforms are defined as the media (APIs). A FOSS, as a platform (e.g. a Linux/GNU used to deliver FOSS4G solutions. We mainly distribution) provides initially a large set of ser- examine and compare the desktop and the web as vices and functionality, which is mainly controlled platforms. Despite this rather simplistic approach, by the distribution designers. Adding more ser- a platform is a very versatile, interesting, and use- vices and functionality is often technically easy at ful concept when software solutions are analyzed. least for technologically savvy users, but may in- Each platform offers a unique set of functionality troduce maintenance problems and thus in prac- and opportunity when applied to EMM. tice may be available only to a limited user base. In contrast, developing proprietary software for a (ii) Software Stack – Each platform has its own proprietary operating system (OS) is usually more specific needs and available tools for building a limited, often only to what the OS provides. working geospatial software stack. We will exam- ine various software stacks for these two platforms The set of available programming languages is an and discuss functionality as well as interoperabil- important characteristic of a platform. Java and ity. This should lead to a better understanding scripting languages are very good examples since what services these tools can provide to environ- applications written in them are usually limited to mental managers. Note that in this paper, we fo- using only services, which are written in those cus on tools, which are within the more restricted languages (or have a specific interface in the re- or traditional definition of a GIS, i.e., tools that spective language) and services written in another

5 4 Note that many proprietary tools can also be ex- http://www.ozzie.net/blog/stories/2002/09/24/soft tended programmatically. warePlatformDynamics.html

scripting language are typically unavailable6. The rapidly deliverable to end-users and more easily common division to Java and C/C++ camps is updated. An interesting compromise occurs when distinct also in FOSS4G but interesting practical a desktop application is developed to be “web- solutions exist: Geometry Engine Open Source aware” or “web-enabled”. In this case, the user (GEOS) is a C++ port of the JTS Java Topology gains the benefit of local data processing and stor- Suite (JTS). age while using the Internet to download software updates and share data with a wider user commu- From the user’s point of view a platform is, nity. Such web-enabled desktop tools are growing roughly, an OS or its user interface (UI), an appli- in popularity (e.g. and ESRI’s Ge- cation, or some specific hardware. A desktop, for ography Network). example, is a platform where computing mainly happens in one computer, within one OS but usu- The Internet has been the focus of much work in ally with more than one application. Application the geospatial community. Especially the specifi- suites, especially so called “office suites” become cations developed cooperatively in Open Geospa- very popular in the 1980’s and 1990’s and they tial Consortium (OGC) and the efforts to develop are essentially a platform within a platform. The national spatial data infrastructures illustrate this. office suites have been more or less proprietary The FOSS4G community has been active in this solutions, with only few exceptions, most notably work and it has adopted the specifications quickly. the OpenOffice.org. Spreadsheet tools, as part of an office suite, are a very popular platform for small-scale modeling and management support. 3. FOSS4G SOFTWARE STACK Office suites do not have any noteworthy support 3.1. Software stack for geospatial computation for working with geospatial data or doing geospa- on the desktop and web platforms tial analysis. A desktop connected to local net- works and to the internet is effectively an ex- The architecture of FOSS and FOSS4G is usually tended desktop platform. However, if a geospatial layered and thus makes up “software stacks”. system relies on services, e.g., on a spatial data- These software stacks can be very deep, the layers base server, which resides in another computer, at the bottom being, for example, the Linux kernel then it operates on a network or internet platform. and the GNU C library, libc. Alternatively, a FOSS4G stack can also have at its base a proprie- The web is often used as a platform for delivering tary product such as Microsoft Windows XP and content such as text, images, and simple applica- its associated run-time libraries. Indeed, FOSS4G tions, intended for wide distribution and use. co-exists with and adjusts to proprietary software Some web-based systems are targeting smaller easily. Thus it is possible to attract individuals groups and focus more on collaboration. Mobile who are forced to use or more comfortable using computing, which is characterized by small and the Microsoft operating system. light-weight devices such as cellular phones and personal digital assistants (PDAs), is another plat- Beside the actual run-time stack, there is also the form that is growing in importance and is espe- stack of development tools, like make, which con- cially interesting to geospatial computing because trol the compilation of software into binary form, of its mobile and location-aware nature. and Glade2, one of many graphical user interface (GUI) development tools. Open source integrated Each of these software platforms presents an in- development environments such as SharpDevelop teresting set of benefits and challenges to the user or Eclipse provide support for developers of Mi- or developer of geospatial software tools and sys- crosoft .NET or Java based FOSS tools. The run- tems. For example, the desktop platform typically time stack consists also of tools like the Apache has the advantage of allowing for more intensive HTTP server and various libraries. For mixing use of local disk space, memory, and processing FOSS and proprietary software on platform level power than does a web-based platform. So for several solutions exist. For example MinGW im- large, computationally intensive applications us- plements the GNU platform partially in Microsoft ing large data sets, it is generally preferable to Windows. Several FOSS applications are also deploy geospatial software solutions as desktop available for proprietary platforms and/or multiple based applications. However, web-based applica- platforms. tions generally have the advantage of being more Some software is intended exclusively for the web platform, some is usable on both the web and 6 The Microsoft .NET Framework and its FOSS desktop platforms, and other software is intended counterpart Mono, are notable efforts to overcome primarily for the desktop. Clearly, it is not possi- this problem.

ble to distinguish absolutely one platform from the tem, it is the most ubiquitous desktop OS and other. Rather, we simply attempt to define a typi- supports a wide array of FOSS and FOSS4G tools. cal desktop solution and a typical web solution.

The specific and unique software stack that is set up to support geospatial computations makes up 3.2 System Software the platform together with the hardware and the System software is the foundation on which the rest of the computational environment. complete stack is built, providing software inter- operability and a common look and feel for tools.

Generic Stack FOSS4G Stack Grouping In the web-based FOSS4G software stack, the Application Environmental modeling and data End User Linux operating system is the most common OS Extensions/Plug-ins analysis tools (e.g. US EPA’s Application BASINS 4, etc.) and has proven to be an exceptionally robust for Application Quantum GIS, GRASS, OSSIM, User Interface mapping applications. Linux based geospatial JUMP, uDig, MapWindow GIS Application Dev. Eclipse, QT, OpenGL, tools for the desktop are also readily available. Environment SharpDevelop High Level Utilities GeoTools, PostGIS, Data Serving The Microsoft Windows/.NET framework system MapWinGIS ActiveX High Level Scripting PHP, Perl, Python, , R spatial Data Processing software creates an interesting opportunity for Languages FOSS software developers and environmental Low Level Utilities Shapelib, JTS/GEOS, GDAL/OGR, GMT modelers to build free and open source tools on a Low Level C, C++, Java, Fortran, C#, System Software Languages VB.NET common proprietary platform. Operating System Linux, Darwin, Cygwin, Microsoft Windows System software is typically generic and not di- rectly relevant for geospatial computation. The Generic Stack FOSS4G Stack Gr ouping case of Java vs. other languages is one noteworthy Application Web Decision support syst em, End User Site environmental data viewer, etc. Application exception. Java is a high-level programming lan- Client Side B rowser Firefox, Safari, Netscape User Interface Client Side Scripting JavaScript, Java, Perl, Pytho n guage but it is also often used instead of C or C++ ~~~~~~~~~~~Internet~~~~~~ ~~~ ~ ~ for developing low-level libraries and tools. It is Server Side PHP, Python, Perl Data Serving Scripting possible to build Java interfaces to C libraries High Level Uti l ities MapServer, GeoServer, GRASS Low Level Utilities Shapelib, JTS/GEOS, GDAL/ Data Processing (Java is one of languages supported by Swig, a OGR, PostGIS, R spatial, GMT FOSS generic interface generator) but it is consid- High Level Scripting PHP, Perl Pyt hon Languages erably more difficult and not sensible to use Java Low Level C, C++, Java, Fortran System Software Languages from other languages. It is also often preferred to Operating Linux, Darwin, Cygwin System/Drivers have “pure” Java solutions for various reasons. The result has been that software stacks are often

divided to Java-based solutions and to other solu- Figure 1. The tool layers of the desktop platform tions. However, the internet constitutes such a (above) and the web platform (below) and group- strong dividing line that it is easy to mix Java so- ing of the layers. The software list is an example lutions on the server or client side with other solu- and in some cases is very incomplete. Somewhat tions on the other side. similar diagrams have been presented e.g. by Ticheler [2005]. 3.3 Data processing

In the data processing layer, data or essential in- In Figure 1. we present typical software stacks for formation about it is retrieved into system memory desktop and web platforms. The system software and is processed in some way. The way the data layers in the two stacks contain many common are accessed and what kind of data structures are items. This synergy between the software stacks used is often specific to different FOSS4G tools. shows great promise in intertwining capabilities. One benefit of the FOSS platform is the unob- For example, it is envisioned that future desktop structed access to these details. applications will rely more on web-services, while web-based applications will contain more func- Data processing is divided here into data man- tionality traditionally relegated to the desktop agement, format processing, and geo-analytical platform. The commonality of the lower layers of processing. Geoprocessing is examined more the software stacks allows for much of this inte- carefully since it is most important from the point gration. of view of environmental modeling. Data processing tools are quite common between We have included Microsoft Windows into the desktop and web platforms. stacks. Although it is not a FOSS operating sys-

The foundation of the domain specific interopera- guages like Python, Perl, and Ruby to the bility of the geospatial tools is in this layer. Solv- GDAL/OGR library. ing of complex problems in EMM requires com- plex workflows, which usually requires interop- eration of several tools in order to be effective. Geo-analytical tools An example piece of data processing software The geo-analytical functionality on the FOSS plat- specific to geospatial computations is a tool to form is very strong in theory since FOSS is often transform datasets from one geospatial coordinate the platform of choice for academic research pro- system to another. The most common FOSS4G jects. In practice the utilization of all that is possi- tool for this is PROJ.4. PROJ.4 contains a system ble is sometimes difficult because of interoperabil- for describing common projections and for con- ity problems and a sometimes steep learning curve verting data between projections. to use the tools. Examples of analytical and geo- analytical tools in the FOSS platform include

GSL, R, and CGAL. GSL is a general numerical Data management library. The R project develops a language and platform for statistical computing and associated Data management is a critical function of GIS for graphics. The R spatial project uses and extends R EMM. Both the number of required geospatial for spatial statistics. CGAL is a high quality li- datasets and their size are often voluminous. Geo- brary for computational geometry. spatial datasets are often stored within specialized file formats or data servers. Generally, geospatial JTS (Java Topology Suite) and its C++ port GEOS data are described and stored in either a vector or is a library for basic computational geometry op- raster based model and file format. Raster data are erations as binary predicates and overlays. Binary typically stored as a matrix of data, either in tiles predicates return a Boolean value indicating or in large singular files in a file system. Vector whether two spatial objects have a named spatial data are either stored as files or in tables in data- relationship. Examples of binary predicates are base management systems. Simple vector data “intersects” and “within”. Spatial overlays take does not consider the topology or connectivity of one or two spatial objects as arguments and return the spatial primitives and many formats support a new spatial object. Examples are “intersection” only this kind of data. Geospatial data sets and and “buffer”. GEOS is used by GDAL/OGR and information about them, i.e. meta data, are pro- PostGIS (see below). The JTS/GEOS library pro- vided for use from servers either through net- vides the OGC standard spatial data types. [The worked file systems or by the http protocol, typi- JUMP-Project.Org, 2006] cally using protocols standardized by the Open The main analytical method family for raster data- Geospatial Consortium (OGC). sets is map algebra (a term coined by Tomlin [1990]), which extends the standard algebra of scalar values to raster data. For example in map Format processing algebra the sum of two similarly projected and The GDAL/OGR library and associated utility sized rasters is a raster, whose cell values contain programs provide widely used basic functionality the sums of respective cell values of the argument on the FOSS platform. GDAL provides a general- rasters. This basic concept may be extended in ized API for raster data and a way to implement map algebra into spatial neighborhoods, zones, drivers for raster data formats and OGR does the the 3rd dimension (elevation/depth, voxels), and same for vector data. GDAL can be used to read, into raster time series analysis [Karssenberg, access, manipulate, and to write in over 50 raster 2005]. The GRASS function, “r.mapcalc”, sup- file formats; OGR enables programs to read, ac- ports the ordinary arithmetic and logical opera- cess, manipulate, and to write in over 20 vector tors, trigonometric etc. functions and it has the file formats and database systems. The concept of cell neighborhood, the r.series com- GDAL/OGR library is written in C++ with C mand supports time series statistics. libral is a C wrapper and it has been ported to several common library, which supports similarly basic map alge- operating systems. The OGR library can option- bra. The Geo::Raster module extends the Perl ally be compiled to directly include the GEOS programming language with map algebra using library. GEOS gives to OGR some topological libral. capabilities. A Swig interface has been developed More complex geospatial analytical methods in- and used to create bindings for scripting lan- clude spatial interpolation, terrain analysis (in- cluding hydrological analysis), applications of

graph theory (e.g. shortest path computation), processing layer and serve it to the user interface spatial data mining (e.g. discovery of spatial phe- layer. The data serving layer is thin but important nomena). GRASS is traditionally strong in these since it requires standardization and enables a areas (see e.g., Neteler and Mitasova [2004]) for wholly new type of software, i.e., collaborative example libral and TauDEM provide tools for applications. Collaborative applications are im- hydrological analysis. portant and will probably become even more im- portant for EMM. Tools such as MapServer and The most complex geo-analytical tools are those MapGuide, which mainly work in this layer have that support spatial modeling. These tools may be also become hugely popular in mainstream web- implemented with the help of basic geo-analytical mapping applications. methods such as map algebra but they may also work directly on geospatial data. These tools are The two main FOSS tools for serving geospatial typically implemented as plug-ins for desktop GIS data via the web are the MapServer and or as stand-alone applications. Examples include GeoServer. MapServer can be used to serve maps openModeller and SME. openModeller is a spatial (as images) and thus create interactive websites, distribution modeling library, providing a uniform but it can also be used to serve data according to method for modeling distribution patterns using a the WMS (map layer) and WFS (vector data) stan- variety of modeling algorithms. openModeller can dards. GeoServer supports WMS and WFS-T (“T” be used via programmatic interfaces, including denoting transactional, WFS-T supports add, de- SOAP and SWIG-python, as well as via a user lete, and update of features) specifications. friendly desktop graphical user interface and as a

Quantum GIS plug-in. [openModeller develop- ment team, 2006] SME is an integrated environ- Spatial databases ment for developing spatial simulation models In the desktop, this layer is often hidden within an [Maxwell et al., 1999]. application or does not exist. The only noteworthy Visualization is an essential element of geospatial exception is the attribute database connection, analysis. All GIS provide at least some kind of which often is based on standards (especially two- and possibly three-dimensional visualization SQL). capabilities, some with animation features. The A relational database management system boundary between mapping and analytical geo- (RDBMS) is in itself a platform for developing visualization is fuzzy. Visualization (mapping) is applications, functionality, and services. On the an important part of data serving and user inter- FOSS platform the two main RDBMSs are face (also analytical geovisualization). Tools for MySQL and PostgreSQL. A general perception is analytical geovisualization are often large stand- that MySQL is less featured (in SQL sense) than alone applications, like for example the Vis5D or PostgreSQL but faster. PostgreSQL offers a richer Paraview. array of DBMS features, such as triggers and it OSSIM is a high performance image processing supports a variety of procedural languages. Geo- library and ImageLinker is the GUI application spatial data are not easy to store in a standard providing access to the OSSIM libraries. Image- RDBMS, thus spatial extensions have been devel- Linker is capable of image visualization, of mo- oped and standardized by the OGC. PostGIS pro- saicing a large number of images, and of simple vides the spatial data types and spatial queries for image manipulation tasks like contrast enhance- PostgreSQL, and more recent versions of MySQL ments. ImageLinker lacks some important capa- have been adding spatial data handling capabili- bilities, for example image classification and vec- ties, too. tor capabilities.

Terralib and Terraview are a cross platform Geo- Scripting languages spatial analysis library and an accompanying front end GUI. Terralib stores all data (including raster) Scripting languages can also be considered as a within the RDBMS environment. part of the data serving layer. Scripting languages provide very powerful scripting capabilities be-

sides being general purpose programming lan- 3.4 Data Serving guages. Scripting languages can usually be ex- tended by modules, which can be interfaces to Web servers data access, graphics or other libraries or they can The data serving layer exists mainly in the web be extensions written in the scripting language platform as tools that receive data from the data itself. The module system and general purpose

programming capability make scripting languages and visualization components that can be used in very useful for “gluing” as an application devel- both desktop and web-based (using ASP.NET) opment methodology. These languages are typi- applications. cally interpreted, and thus programs are easy to Quantum GIS has the notable advantage of being write and the language can be used in an interac- fully cross-platform such that the same tools run tive fashion. Interactive use of a programming on Microsoft, Macintosh, and Linux/GNU operat- language makes it attractive for “use-developers” ing systems. An effort to make Quantum GIS [Rosson, 2005]. A use-developer is anybody, who usable as a GUI for GRASS is making it possible creates applications with spreadsheets, interactive to exploit GRASS’ analytical capabilities directly web-pages, or, as in this case, with a few lines from Quantum GIS. using a scripting language. and Gtk2::Ex::Geo are FOSS toolkits for Many scripting, interpreted programming lan- developing geospatially aware applications. Map- guages, such as Perl, PHP, Python, Ruby, Tcl, and nik focuses on cartographic quality and R have been developed as FOSS and thus the Gtk2::Ex::Geo focuses on providing a geovisuali- FOSS platform is well suited for their use. The R zation widget and combining GUI and CLI. Map- language is developed for statistical computing nik builds on a graphics library called AGG, but it is also a general purpose programming lan- which is FOSS. Gtk2::Ex::Geo is a collection of guage. These languages also support modular and Perl modules and it builds on Gtk2-Perl software object-oriented programming for creating larger and Geo::Raster and Geo::Vector modules, which applications. The openness of these languages has in turn build on libral and GDAL/OGR. attracted developers to create and publish exten- sions to them, thus making, e.g., database integra- OpenEV is a general viewer for geospatial data, tion, data import, and development of graphical and provides for example basic vector overlay applications relatively easy. The main problem is capabilities. OpenEV’s analytical capabilities are that the tools developed for Python for example, limited and for the most part they have to be ac- are not directly usable by people who use for ex- cessed via the command line. ample Perl. SAGA GIS is a desktop GUI environment and library for geoscientific analysis. 3.5 User Interface uDig represents a new approach to building a desktop GIS as it treats web-based (OGC compli- The user interface is often, but not necessarily, ant) data sets as similarly as possible to local data significantly different in the desktop and in the sets. uDig is also designed so that new, derived web platform, the web platform being dominated applications can be created by developers. by the web browser with its own set of menus, toolbar buttons and intrinsic functionalities. Some tools (e.g. uDig), because of their platform inde- 3.6 End-user applications pendent nature or their significant client server interaction have begun to blur the distinction be- At the top of the FOSS software stack are end-user tween a desktop and web end-user application. applications – the tools that are used by managers, modelers, stakeholders and others to better under- Lately two modern desktop GUI GIS that are stand the environment. These can be as simple as FOSS have appeared: Quantum GIS and Map- a customized map viewer on a web site showing Window GIS. Both have a standard menu system, the location of a particular environmental prob- graphical dialogs, and other desktop software GUI lem, or management area, or as complex as a fully elements common to proprietary GIS software. integrated dynamic simulation watershed model. Both have a plug-in architecture for extension In each case, it is at this top layer where end users developers, making them suitable for supporting interface with the software. Hence, this layer environmental models and modeling toolkits. needs to be a critical focal point for the FOSS4G MapWindow GIS is the first fully Microsoft .NET community, so that the tools developed at the un- compatible open source GIS and has the advan- derlying layers meet the actual (not only per- tage of providing developers with ActiveX and ceived) needs of end users. .NET components that can be used in many com- Interestingly many existing tools in this layer are mon programming languages (i.e. Visual Basic, already (and in many cases have been for years) C#, Delphi, VBA, etc.) The current MapWindow FOSS tools. For example, SWAT, HSPF, WASP, GIS development effort is centered on producing QUAL2E and other notable hydrologic simulation an enhanced suite of geoprocessing, data access

models have always been FOSS. However, due to mean simple assimilation of data or it may mean the lack of suitable FOSS4G tools to support these development of assessment skills. Descriptive and models (and because many were developed prior simulation models are often used for learning. to the advent of GIS) current implementations of Decision making is the task of selecting between these models on FOSS4G platforms is limited. In alternatives, but the more time consuming tasks of other words, it is possible to see the source code to inventing the alternatives and the assessment of the model, but not the to the proprietary GIS plat- their impacts are usually included in decision form upon which the model can run. This prob- making. Simulation and optimization models can lem is being addressed in many cases as efforts are be used to invent or refine alternatives. Simulation underway to generate FOSS4G interfaces for these models are routinely used for impact assessment. models. Optimization models can be used to suggest deci- sions once evaluation criteria and methods for

computing them are selected. 4. WORKFLOWS FOR ENVIRONMENTAL The requirements that environmental modeling MODELING AND MANAGEMENT and management set for geospatial tools and 4.1 Introduction methods may be organized according to the task at hand: Environmental modeling and management (EMM) are both activities, which increasingly • technical tasks: storage of data, format con- happen computationally with desktop and web- version, etc, based tools. In this chapter we examine the work- • supporting simple assimilation of data: view, flow concept and user classes, we define a few visual overlay, etc, representative EMM workflows that involve map- ping or geospatial analysis, and we analyze how • a formal language: writing of specifications, the FOSS tools are suited to the task. In the cases programming a model, etc, we also discuss some ongoing efforts to link • planning support: sketching of alternative FOSS4G and EMM. spatial plans, The goal in environmental modeling is to develop • and use a model (for our purposes, we use the analytical tasks: preparation of input data, term “model” to mean a computer based generali- execution of model, evaluation of model out- zation of a real system) of an environmental sys- put, tem. Important domains in environmental model- • support for assessment: expert advice, deci- ing include geospatial, temporal, geophysical, sion support, probabilistic reasoning, evalua- chemical, and ecological. How geospatial domains tion of plans. can and should be included in modeling has to be assessed by the modeler. The historical limited Other requirements are very diverse and stem availability of spatial data has caused severe limi- from the type of the project, its goals, and work tations to modeling; the situation is getting better habits: thanks to new remote sensing instruments and the • requirements on the user interface: what are development of standard datasets but problems the computer skills of the user of the tool remain due to reduced budgets and copyright re- • strictions. computation time: will the tools be used in interactive sessions for example, The motivation for modeling environmental sys- • tems may be for example scientific interest, im- support for cooperation: will one user do eve- pact assessment, planning, or environmental rything or will there be several users with dif- management. A model is in practice either a de- ferent skills. scriptive model, a simulation model, or an optimi- zation model. A descriptive model is for example a database, which has a schema and which can be 4.2 User Classes queried. A simulation model explains or predicts End users of software typically have the expecta- the behavior of a system over time (or space). An tion that there is a GUI which resembles in behav- optimization model can be used to find the best ior and appearance (as closely as possible) GUIs values for decision variables. of the other software that they use. The need to Environmental management typically has two study help texts and manuals is typically seen as a phases: learning and deciding. Learning may hindrance. One classification for users is the Ros- son’s [2005] developer, user-developer, and user.

Another classification is by the time the user can documented) sequence of individual commands devote to using a tool: 10 minutes, one hour, one with optional use of variables for task generaliza- day, etc. There is no universal classification of tion. Coupling these methods enables the informa- users, the issue has to be considered in the begin- tion system to support sophisticated data process- ning of each tool development, modeling, and ing workflows. management project. Developers are typically looking for a clean set of Nielsen [1993] classifies existing user interface API’s to which they can integrate custom pro- types and associated interaction styles. The inter- gramming interfaces. action styles he lists are no interaction, question- The current set of FOSS4G tools represents a answer, command language, function keys, form broad spectrum of support for various users. For- fill-in, menus, and direct manipulation (with a tunately, there is a clear trend in the community to mouse for example). The ease of use of a GUI develop tools that are easier to use and more simi- comes from the fact that there is only a limited set lar to other common software tools with respect to of possible actions and they are presented to the user interaction (e.g. they support standard mouse user. Command line interface (CLI) provides behavior and key sequences, and have common added functionality mainly because of the much types of “GIS oriented” tool bar buttons for map larger set of possible actions and the possibility to navigation, etc.) combine tasks. CLI is possible to combine with GUI using free-form text input. CLI also leads to scripting which is essentially the stored (and

Interpretation of Needs

Specifications Interviews Laws Data Gathering Artistic interpretation Models Web services Database interaction File system Data Formatting Custom data generation Web searches Scripting Field work Re-Projection Geo-Referencing Tiling Mosaic Re-Sampling

QGis Example using QGis and ArcMap to produce similar cartographic output ArcMap using same workflow Processing

Map Algebra Scripting Cartographic Formatting Application modeling

Artistic formatting Stylization Map Production Symbolization Integration of disparate data Format translation Meta data Resolution translation Packaging Integration to other tools

Figure 2. Example of the cartographic workflow producing map output in both FOSS and proprietary soft- ware. Data is depicting change in fishing effort by the Flatfish fishery off the California coast between 2001 and 2003, during which time a fishing closure was instituted along the continental shelf.

As the environmental modeling community begins Reasons for using web-based mapping to solve to adapt these tools to their specific modeling and problems in environmental modeling and man- end user needs, it is expected that the interfaces agement include: will continue to improve and be refined such that 1. Instant access to the latest version of analyti- they are more acceptable to a wider array of users cal analysis and models by geographically from all user classes. separated teams.

2. Collaboration in management can be 4.3 Case 1 – Cartographic map production achieved over the web. The workflow to produce a map is initiated by an 3. Interactiveness that can be built into the web- expressed need. In our case the need is related to mapping site. EMM. In this case we define a map as a static 4. Cross platform nature and ease-of-use of visualization, essentially an image file, which browser-based solutions. typically combines geospatial information from various sources. Map production has traditionally The FOSS4G solutions for web-based mapping been one of the main functions of a GIS. The are very good mainly due to the popularity of workflow consists of the following steps: MapServer and MapGuide Open Source. Also notable as free though not open source, is the 1. Interpretation of the expressed need for a Google Map API which currently allows for free map. hosting of simple maps (appropriate for basic 2. Gathering of required data. navigational needs) on any publicly accessible website. 3. Formatting of gathered datasets. The analytical capabilities of web-based mapping 4. Processing of datasets. solutions on the FOSS platform are limited by 5. Cartographic formatting of the requested technical problems. Solutions which use GRASS map. as a backend for MapServer exist but are not op- timal in a technical sense. The development of 6. Production of the requested map. scripting language interfaces to geo-analytical Maps are arguably more useful in management libraries and the development of the actual ana- than modeling, although graphical visualization lytical libraries both in C and in Java have a po- of all geo-spatial processing is useful for data tential to solve this problem. The former will en- validation and understanding. Maps are useful able server-side solutions and the latter (Java) will both in learning about the case or system at hand, also enable client-side solutions. and at supporting decision making.

In general the FOSS4G tools are good at steps 2 to 4.4 Case 3 – Numerical Simulation 4 but support for steps 5 and 6 are more limited. (Figure 2.) With tools like Mapnik and GIMP Environmental simulation models often require there is a good chance that these steps will be bet- pre or post-processing of geospatial data, or they ter supported in the future. Also, we expect that can be tightly linked to a GIS, using it as a GUI. many traditional mapping applications resulting Harvey and Han [2002] have presented an excel- in paper maps will be replaced by interactive web lent analysis of the relevance of FOSS to hydraulic based mapping in the future. and hydrological models. Several environmental models are available as FOSS, for example

MODFLOW, which is a groundwater simulation 4.4 Case 2 – Web-based mapping model, and SWAT, which is a river basin scale land management impact assessment model. It is Web-based mapping is effectively a simplified, or interesting to note that although SWAT is FOSS, canned, form of cartographic map production. The it uses a proprietary GIS for a GUI (although this creation of maps can be done within a framework is currently changing – see below). A general im- offering a limited number of datasets and layout pression (supported by discussions with modelers) options. Because of these limitations the process is that FOSS models are popular and get used. of creating a map is more straightforward than traditional cartographic map production (Figure The general workflow of modeling is 3.). 1. Setting of objectives and scope and scale of the modeled system.

2. Data collection. 6. Using the model. 3. Development or selection of a model. 7. Analysis of the model output. 4. Preparation of input data. Geospatial tools are typically needed at steps 2, 4, 7. (Figure 4.) 5. Calibration and validation of the model.

Capture User Needs

Zoom Simplified Web-Based Query Map Generation Workflow Interactive spatial capture Data Gathering

Web services Database interaction File system

Cartographic Formatting

Stylization Symbolization

Map Production

Example web-based Format production application utilizing Packaging Mapserver and KaMap AJAX driven UI DM Solutions Group

Figure 3. Example of web-based geo-spatial interface and workflow.

Three interesting and complementary efforts to migrating BASINS from a proprietary GIS system merge environmental modeling with FOSS4G to a FOSS4G GIS system (namely MapWindow tools are currently under way at the United States GIS). Similarly ORD is currently investing in the Environmental Protection Agency (US EPA) Of- adaptation of MapWindow, GDAL, JTS, and fice of Science and Technology (OST), US EPA other FOSS4G tools to support a wide variety of Office of Research and Development (ORD), and EPA environmental modeling tools, beginning the United States Department of Agriculture with the FRAMES/3MRA system. USDA through (USDA). it’s Texas A&M university collaborators are fol- lowing suit with the development of a new GIS In each case an existing environmental model or interface to its SWAT watershed model, again modeling system is being adapted to explicitly use based on MapWindow and other FOSS4G tools. geospatial data in a FOSS4G based GIS environ- ment. Specifically, OST has recently awarded a 4.5 Case 4 – Environmental management five year contract to a consortia of consulting Environmental management workflows consist of: firms and universities to improve and provide support for the BASINS watershed modeling sys- 1. Monitoring the state of the environment. tem. The awarded team specifically proposed 2. Planning of actions for improving the state.

3. Responding to actions, which affect the envi- flow is perhaps best supported with developing a ronment. comprehensive geospatial database of the system, which is managed. PostGIS is one example of a 4. Increasing awareness of people of the state of tool that is well suited for this purpose. It is possi- the environment ble to build various stacks on the top of PostGIS to The workflow of environmental management can support real-time or ongoing monitoring, analyti- be split into the above parts, all of which consist cal needs, decision making, and mapping for de- of independent workflows described in the first livering information. three cases of this section. This long-term work-

3-D Circulation modeling of the Objectives and Columbia River and Plume Scope Data Collection Community needs Scientific objectives Custom data generation Technology capabilities Field work (GPS) Model Development Web services Database interaction Capabilities File system Compatibility Web searches Technology Grid resolution

Prepare Inputs

Example FOSS Custom formatting Splice Environmental Modeling Interpolate Sensors Geo-Reference OHSU - OGI School of Science and Engineering Calibration and ccalmr.ogi.edu Validation

Statistical analysis Using Model Ground truth Interviews Application spatially Numerical analysis Analysis of Outputs Integration with DSTs Grids Deployment of technology Decision support Scientific advancement

Figure 4. Example FOSS based modeling system integrating spatial data and GIS tools such as PostGIS, OGR/GDAL, and Mapserver. 3-D circulation models are used to simulate and forecast physical parameters that are then used for environmental and ecosystem management.

In the domain of environmental management initiatives and data collection efforts. Many pro- most projects and project proposals currently in- jects similar to IGRAC’s GGIS exist. The poten- clude a component, which aims to develop soft- tial amount of data in such databases and accessi- ware and databases. These components may be ble by such information systems may be huge. quite large in large international cooperative pro- Also the complexity of the data may be over- jects. For example there is an International whelming calling for dedicated meta data projects. Groundwater Resources Assessment Centre Many international organizations and initiatives (IGRAC), which works under auspices of such as FAO [Ticheler, 2005] are currently inves- UNESCO and WMO. Perhaps the most important tigating FOSS4G for their needs in developing part of the work of IGRAC is the development of Internet-based information systems. Another ex- Global Groundwater Information System (GGIS). ample of a similar effort is the Managing Informa- This information system draws on several other tion for Local Environment in Sri Lanka (MILES)

project7. MILES has run a virtual seminar on is- on community development and on technical and sues concerning the linking of FOSS4G and envi- usability merit. ronmental management. Our examples show that it is possible to build software stacks from current FOSS4G to support EMM workflows. IT professionals have long em- 5. DISCUSSION braced the concepts of platforms and software As computing becomes more ubiquitous, the sig- stacks, now the same is happening in the domain nificance of one single tool becomes less impor- of EMM. It is important to include these concepts tant and focus shifts more towards software stacks into the communication between EMM and and platforms. It is difficult to assess how much of FOSS4G communities. This communication for this is conscious “empire building” by some soft- example conveys the needs and requirements of ware companies and how much is natural evolu- the EMM practitioners to software developers, tion. Organizations are clearly making strategic who can analyze them and understand what needs decisions on software platforms and stacks that to be done. Common platforms and interoperabil- they use and support. Although it is a believable ity are important for the users. When design is trend that more users and in fact user-developers, based on selection among proprietary products, the platform on which they develop is often very the software stack concept is usually not usable. restricted and dictated by institutional guidelines. The result is that decisions may be based on somewhat questionable analysis, for example The dividing line between FOSS and proprietary based on just supported platforms and brand software is fuzzy, partly because it is in the inter- names etc. est of developers of proprietary software to make it fuzzy and partly because the end-users are getting FOSS4G still caters in many cases to advanced more and more reluctant to buy software. People users. The FOSS4G solutions for application areas are expecting web browsers and other common are usually not as “packaged” as those offered for tools to be free of charge. Also, depending on li- proprietary products, although companies exist, cense of the particular FOSS tool, proprietary which specialize on delivering complete FOSS tools may include FOSS. Advances in standards solutions. This is partly a fundamental difference aiming for interoperability make it possible to co- but it may also change if and when people work- use free and proprietary tools. ing in application areas discover FOSS4G. Addi- tionally it is incumbent upon developers of It is easy for a tool developer to fall into the trap FOSS4G tools to improve the ease of use, installa- of slowly expanding the scope of his tool, promot- tion, and integration of such tools so that they can ing the solving of ever bigger and more complex be more readily adapted by the environmental problem with his tool. From the user’s point of modeling community. Potential improvements view the benefit in this evolution is that he can might include: “grow with the software” not having to go through the learning curve of other tools. The developer of • Providing compiled binaries for multiple plat- a proprietary product is often happy with this, forms and operating systems, since he or she does not have to share income with • Developing demonstration applications that others. After a long experience in this kind of fa- show how one might integrate FOSS4G tools miliar worlds, the FOSS world must look chaotic with legacy FORTRAN or other existing en- and complex. vironmental modeling code, In the FOSS world the barriers to interoperability • Generating simplified installation packages are much lower and thus the software stack tends that can be readily adapted and integrated to be thicker in FOSS platform than in the pro- with the installation package of a particular prietary platform. There is competition in the model, FOSS4G world, but it is not preventing evolution of individual tools, stacks, or platforms. Code • Enhancing existing user communities and sharing is encouraged, as exemplified by activities developing new discussion forums targeted within so called “foundation projects” in the OS- specifically at FOSS4G users in the environ- Geo Foundation. The competition in the FOSS4G mental modeling community, world seems to happen on two distinct domains: • Clarifying the meaning and interpretation of various FOSS license agreements, and

7 http://www.miles.geo.ar.tum.de

• Seeking out opportunities to adapt FOSS4G spatial Consortium should also be acknowledged tools to the specific needs of the EMM com- for its work in developing and publishing open munity. specifications, which has been a great motivation for many FOSS4G developers. In the examples we mention above in Case 3, en- vironmental numerical modeling, the particular funding agency has chosen FOSS4G solutions 8. REFERENCES because of the opportunities to redistribute result- ing modeling tools freely to end-users and to sup- Anonymous, The free software definition. port general goals of openness and transparency http://www.fsf.org/licensing/essays/free- with respect to modeling tools. Indeed, the main sw.html (URL accessed 26.4.2006) marketing advantages of FOSS4G are its low cost, Bonnici, K., To international markets with com- open licensing, and modular nature allowing di- petitive projects. Avopaikka, Open source verse tools. business news 1/2006. Finnish Centre for Open Source Software. On page 6. Certainly, FOSS4G has been around and evolved Daratech Inc., Press Release: Worldwide GIS for a very long time, GRASS is the prime example Revenue Forecast to Top $2.02 Billion in of that. The current challenge for FOSS4G is to 2004, up 9.7% over 2003, develop working software stacks, which provide http://www.daratech.com/press/releases/2004 solutions that are attractive to end-users and peo- /041019.html (URL accessed 25.4.2006) ple working in the application areas. The open Gross, T., Hood, R., Voinov, A., Building a Com- data formats and data exchange protocols are cur- munity Modeling Culture. (in prep.) rently shaping the industry and FOSS4G are prov- Harvey, H. and Han, D., The relevance of Open ing to be very good with them. Source to hydroinformatics. Journal of Hy- droinformatics. (4) 2002. pp 219-234. also available at 6. CONCLUSIONS http://public.hamishharvey.fastmail.fm/publi FOSS4G has many advantages, which should be cations/2002-10-jhydroinf-open-source/open- attractive to modelers and managers like low cost, source-hydroinformatics.pdf (URL accessed easy dissemination, code transparency, natural 10.3.2006) support for extensions and experimentation. The Karssenberg, D. and De Jong, K., Dynamic envi- greatest barriers for their increased use in the en- ronmental modelling in GIS: 1. Modelling in vironmental modeling and management commu- three spatial dimensions. Int. J. Geog. Inf. nity seem to be the (sometimes) perceived (low) Sci. 559-579. 5(19) 2005. importance of geospatial aspect, some technical Maxwell, T., Villa, F., and Costanza, R., Spatial obstacles, and low visibility. Modeling Environment. http://www.uvm.edu/giee/SME3/ (URL ac- In this paper we have identified and described cessed 26.4.2006). The SME software is elements in environmental modeling and man- available at SourceForge: agement, which require or benefit from geospatial http://sourceforge.net/projects/smodenv computation. We have also described geospatial (URL accessed 26.4.2006). software, focusing on FOSS tools and solutions, Neteler, M. and Mitasova, H., Open Source GIS: and we have discussed how these tools can be used A GRASS GIS Approach. Second Edition. to accomplish tasks of environmental modeling The Kluwer international series in Engineer- and management. ing and Computer Science (SECS): Volume 773. Kluwer Academic Publishers / Springer, 420 pp. Boston, 2004. 7. ACKNOWLEDGEMENTS Nielsen, J., Usability engineering. Morgan Kauf- The authors would especially like to thank the mann. 362 pp. 1993. people at the various email lists: e.g., Geowank- Rosson, M.B., The end of users. Keynote presen- ing, GRASS users, OSGeo discuss, OpenNR, tation at OOPSLA 2005 conference, October freegis list, etc. for their ideas, visions, and enthu- 16-20, 2005, San Diego, California. siasm. Many of these people have created we- Ramsey, P., The State of Open Source GIS, blogs, wikis, and websites which have provided http://www.refractions.net/white_papers/inde invaluable background information for this paper. x.php?file=2005-2-2_oss_briefing.data (ac- Some of the ideas presented in this paper no doubt cessed 24.4.2006) originate in some of these sources. The Open Geo-

The JUMP-Project.Org, Project overview. http://www.jump-project.org/project.php? PID=JTS&SID=OVER (URL accessed 10.3.2006) Ticheler, J., SDI Architecture diagram. http://193.43.36.138/relatedstuff/index_html/ document_view (URL accessed 10.3.2006) Tomlin, C.D., Geographic Information Systems and Cartographic Modelling. Prentice-Hall. Englewood Cliffs, NJ, 1990. Wikstrøm, M. and Tveite, H. 2005. Post- greSQL/PostGIS and MapServer compared to ArcSDE and ArcIMS in performance on large geographical data sets. Kart og plan (3) 2005. pp 185-192. (in norvegian)