Final Report

CDR Management Plan: Data Survey and System Design

Submitted to the

Texas General Land Office

by

The University of Texas at Austin Center for Space Research

Work leading to the report received funding from the Texas General Land Office through GLO Contract No. 18-229-000-B854.

October 2020

Table of Contents

Executive Summary ...... 4
Prologue ...... 8
Chapter 1. Survey of Data for Response, Recovery and Mitigation ...... 10
   Geodetic Control ...... 13
   Bathymetry ...... 14
   Topography ...... 15
   High-Resolution Baseline Imagery ...... 16
   Hydrography ...... 18
   Soils ...... 19
   Climate and Weather ...... 21
   Administrative Boundaries ...... 22
   Land Cover / Land Use / Zoning ...... 24
   Agriculture ...... 25
   Transportation ...... 26
   Critical Infrastructure ...... 27
   Street Addressing ...... 28
   Land Parcels ...... 29
   Demography ...... 31
   Economic Activity ...... 32
   Disaster Insurance Claims ...... 34
   Public Health ...... 35
Chapter 2. Data Sharing ...... 38
   FEMA HPA Team and Hurricane Harvey High Water Marks ...... 39
   UTA-UTSA 9-County Southeast Texas Project ...... 40
   Sharing data from the ADCIRC Storm Guidance System with the SOC and TTF1 ...... 41
   Sharing satellite imagery with the NWS West Gulf River Forecast Center ...... 43
   USACE Trinity River Investigation ...... 45
Chapter 3. Data System Security ...... 49
   Major Security Concerns related to Enterprise-Scale Disaster Information System ...... 50
   Federated Identity Management is the best means by which to handle a distributed TDIS ...... 53
   Data Sharing agreements among Organizations ...... 55
Chapter 4. Highest Priority Data ...... 58
   LiDAR-Based Topography ...... 59
   NOAA Atlas 14 ...... 60
   DHS HIFLD Products and Data Services ...... 62
   Radar Satellite Imagery ...... 63
   FEMA Individual Assistance / NFIP ...... 65
   Enhanced Demography / SVI ...... 67

Chapter 5. Data Shortfalls ...... 69
   Stream and Tide Gages ...... 70
   Height Modernization ...... 70
   Street Addressing and Geocoding ...... 71
   Building Footprints ...... 72
   Base Flood ...... 73
   Economic Activity ...... 74
   Land Parcels / Zoning ...... 75
   Land Cover ...... 77
Chapter 6. Best Practices in Data Curation ...... 80
Chapter 7. Cost-Benefit of Information System Design ...... 85
Chapter 8. Critical Data and Application Development ...... 91
   NOAA QPE ...... 92
   Texas Mesonet ...... 92
   USGS Hurricane Harvey High Water Marks Viewer ...... 93
   GLO Post-Ike Projects ...... 94
Chapter 9. Uses of Probabilistic and Deterministic Approaches ...... 96
Chapter 10. Recommendations ...... 100
Appendix 1. Reference Links ...... 106
Appendix 2. Acronyms ...... 121


Executive Summary

The University of Texas at Austin Center for Space Research (UT-CSR) has completed a final report to address the topics and answer the questions raised by the General Land Office (GLO) concerning concepts underlying the Texas Disaster Information System (TDIS) through work to develop a “CDR Management Plan: Data Survey and System Design.”

The investigation can be divided into three closely interrelated parts: The Data—The System— The Applications. For the data, the study provides a thorough review of 18 major data themes that cover the primary information sources used for disaster response, recovery and mitigation in Texas. Sources of the greatest value (highest priority data) are recognized and supported for expanded use and additional investment. The study also identifies significant data shortfalls, where deficiencies have hindered progress and measures need to be taken to fill the data gaps. Data content and data sharing standards are also reviewed and promoted as guidelines for adoption by TDIS and its participating organizations.

The report envisions that the system solution needed for TDIS will be based on an architecture of federated data systems in which the partnering organizations negotiate data sharing arrangements that permit the exchange of their data products, data services and web-based data applications. This degree of interaction by the TDIS core participants will require Federated Identity Management to maintain data system security. In the realm of data applications, the report points to examples of collaborations for data sharing during disaster response operations and for the development of recovery projects. Prototypes are presented for the kinds of data services and web-based data applications that TDIS could catalyze and host.

The report concludes with a series of recommendations centered on the initial development of the TDIS design, future data requirements and creation of data services and web application prototypes. Discussed in further detail in Chapter 10, abbreviated versions of the report’s recommendations are presented below.

Recommendations:

1. Negotiate data sharing partnerships with the major agencies (USACE, FEMA, NWS, USGS, TWDB, TDEM, GLO). The building of TDIS first requires that an understanding be reached with each partner in this core group of federal and state agencies. The understanding must go beyond a simple agreement to work together toward mutual objectives to a willingness to address in detail the methods and step-by-step procedures for sharing data, the maintenance of data system security and the adherence to data content and quality standards. Above all, TDIS partners need to be willing to experiment and test the boundaries of new ways to unite their interests and collaborate through data sharing.

2. Build the TDIS Architecture using primarily a data services design based on the concept of federated data systems acting in partnership. The TDIS architecture will need to be a very flexible environment. The optimal solution should be built using the concept of federated data systems in which separate organizations work as partners invoking and sharing public services and processes. Fundamentally, federated systems are about services built to agreed-upon standards. Among the partners, agreement is reached that certain data services will behave in certain ways and be updated with certain frequencies. The model empowers the users to build their own solutions on top of a predictable set of data and processes accessed through a services model. By leveraging proven models, TDIS can build novel software and partnerships through the richness of data sharing and collaboration.
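To make the services model concrete, the sketch below is illustrative only: the partner names, endpoints and refresh intervals are placeholders, not actual TDIS services. It shows how agreed service behaviors, such as update frequency and output formats, could be recorded in a small registry and checked against what a service actually delivers.

    from dataclasses import dataclass
    from datetime import datetime, timedelta, timezone

    @dataclass
    class ServiceAgreement:
        """One partner data service and the behavior the partners have agreed on."""
        partner: str                # hypothetical partner name
        endpoint: str               # hypothetical service URL
        update_interval: timedelta  # agreed refresh frequency
        formats: tuple              # agreed output formats

    def is_compliant(agreement, last_updated, now=None):
        """Return True if the service was refreshed within its agreed interval."""
        now = now or datetime.now(timezone.utc)
        return (now - last_updated) <= agreement.update_interval

    # Illustrative registry; the endpoints and intervals are placeholders, not real TDIS services.
    registry = [
        ServiceAgreement("TWDB", "https://example.org/stream-gage/wfs",
                         timedelta(hours=1), ("geojson", "shp")),
        ServiceAgreement("NWS", "https://example.org/qpe/wms",
                         timedelta(minutes=15), ("geotiff",)),
    ]

    if __name__ == "__main__":
        observed = datetime.now(timezone.utc) - timedelta(minutes=40)  # pretend last refresh time
        for svc in registry:
            status = "OK" if is_compliant(svc, observed) else "STALE"
            print(f"{svc.partner}: {svc.endpoint} -> {status}")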

3. Implement Data System Security using a Federated Identity Management solution. TDIS will be a complex hybrid collection of partners and internal processes that will have both public and restricted areas of concern. In such a system, one of the best methods for handling security involves Federated Identity Management. The proposed solution removes the burden of managing individual users in external organizations by shifting that management back to each organization. As part of the early work, a security team should be established that could build a reference implementation of the Federated Identity System.
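As an illustration of what a reference implementation might accept, the sketch below (an assumption, not a specification from this report) validates OpenID Connect bearer tokens issued by a partner organization's own identity provider; the issuer URL, audience name and claim handling are placeholders. It uses the PyJWT package.

    import jwt
    from jwt import PyJWKClient

    TRUSTED_ISSUERS = {
        # Hypothetical partner identity provider and its published signing keys (JWKS).
        "https://idp.partner-agency.example": "https://idp.partner-agency.example/.well-known/jwks.json",
    }

    def verify_partner_token(token, audience="tdis-data-services"):
        """Validate a bearer token issued by a trusted partner IdP and return its claims."""
        issuer = jwt.decode(token, options={"verify_signature": False})["iss"]
        if issuer not in TRUSTED_ISSUERS:
            raise PermissionError(f"Untrusted issuer: {issuer}")
        signing_key = PyJWKClient(TRUSTED_ISSUERS[issuer]).get_signing_key_from_jwt(token)
        claims = jwt.decode(token, signing_key.key, algorithms=["RS256"],
                            audience=audience, issuer=issuer)
        # Role and group claims stay under the partner's control; TDIS only maps them
        # to its own access levels.
        return claims

The design point is that TDIS never stores or resets passwords for a partner's staff; it only decides which trusted issuers and claims map to which access levels.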

4. Adopt Data Content and Quality Standards for the Data Accessed through TDIS. To develop the full capabilities of information accessed through TDIS, standards must be implemented. Quality standards reduce data discrepancies. For GIS-ready data, spatial data quality standards include measures of completeness, currentness, accuracy, precision and consistency. The data managed by TDIS must adhere to the FAIR guiding principles (Findable, Accessible, Interoperable and Reusable) through Digital Object Identifiers or other sources of comprehensive metadata.

5. Explore and implement the licensing of commercial data products. Any plan to establish a disaster information system should include a thorough exploration of commercial data products and data services. Although freely available, many public data sets suffer from data quality issues, currency concerns, differing data formats and the lack of a unified data schema. Whereas the licensing arrangements for access to commercial data introduce a layer of complexity (and expense), the proposed design for TDIS data management is well suited to handle commercial data sources.

6. Develop Prototype Web Services for TDIS. TDIS data services will feed the collaborations and applications built by the partners and users. It follows that prototype web services will play an outsized role as development begins. Consideration and priority should be placed on building examples of both reference services and targeted services. Reference services are those that help ground or locate the other resources being used by TDIS. Inspiration for the targeted data services could come from any one of the high-value data sets considered in the report. Ultimately, the prototype data services will serve as an early validation of the conceptual architecture and infrastructure of TDIS. The services can also begin to lay the groundwork for access level assignment. The documentation for the services and the specifications used will feed proof-of-concept applications.

7. Collect new, higher quality and more comprehensive high priority data. TDIS should be an advocate for the continuous improvement of data resources needed for disaster response, recovery and mitigation. As technology advances, new information sources emerge and older sources can be refined and updated. The high priority data sets cited in the report, including LiDAR topography, radar satellite imagery and enhanced demography, represent resources that need to be maintained and extended because of their recognized utility and importance. On the other hand, the critical data shortfalls discussed are ready-made subjects for new investment to create suitable data updates and replacements where deficiencies currently exist. The TDIS partners should confer and articulate their own priorities for investment in new data. An agreement among the partners will advance the cause of renewable digital infrastructure as a necessary complement to the new physical infrastructure now in development for recovery projects in Texas.

8. Develop and expand the use of social impact data including Public Health data. Texas-focused social impact demographic data should be developed and its use promoted in disaster response, recovery and preparedness because a disaster frequently impacts vulnerable populations differently than more resilient communities. Anonymized public health information should be an integral facet of data compilation. The CDC has already made the case for the development of its Social Vulnerability Index. TDIS should expand upon the SVI’s 15 social factors, attach suitable environmental and health data and update the data to be as current as possible. With the vulnerabilities exposed, TDIS better supports the development of community resilience and adaptability. Programs become more effective when targeted to meet identified gaps of this kind.

9. Adopt procedures to handle the arrival and use of new and novel data sources during disasters. The data generated during disasters in Texas is expanding exponentially. TDIS must anticipate an increasingly complex information environment during future disasters, with ever-growing volumes of relevant data and the arrival of novel data sources never dealt with before. To capture the full range of data sources that come into play during the activation of state resources for emergency response, TDIS needs to participate in the creation of the information products required to assist the emergency management community. That participation will expose TDIS to the enormous variety of potentially actionable information that could be filtered from the chaotic stream of loosely structured and unstructured data, along with the well-structured data arriving from unfamiliar sources. The sudden introduction of new, unfamiliar information sources frequently occurs in the midst of disaster response, and these data sets can be extremely useful if recognized and acted on, so TDIS should expect their arrival and be prepared to react.

10. Perform an online data survey of the core TDIS partners to develop a better understanding of their needs for data, data services and applications and their perceptions of data shortfalls. An online survey could be employed to gather responses from the core group of TDIS partners (GLO, TWDB, TDEM, USACE, FEMA, NWS, USGS) to questions concerning each of the 18 data themes discussed in the report and to record each organization's expert opinion. The survey questions would be designed to identify significant data gaps that now inhibit progress and the steps that could be taken to remove the data shortfalls. The metrics developed from the responses will help to prioritize future data needs across the different agencies and solidify a group consensus for high priority data. In addition, the survey would promote the building of a coalition needed to advocate for the future digital infrastructure.


Prologue

The discussion in this final report addresses the topics and questions targeted in a contract to produce a “CDR Data Management Plan: Data Survey and System Design” through work conducted for the General Land Office by the University of Texas at Austin Center for Space Research during 2019 and 2020. The report distills our efforts to understand the issues involved in the creation of the Texas Disaster Information System (TDIS) and proposes solutions that lead to a scalable system architecture containing the essential disaster information, data analytics and data distribution components needed to support response, recovery and planning.

The report consists of ten chapters, with each chapter focusing on a different aspect of the concept for TDIS, from data content review through data sharing and data distribution processes, along with recommendations for best practices and the steps necessary to build an agile information management system in partnership with external organizations of data providers and users. A detailed survey of the major data themes appears in Chapter 1, which discusses the primary data sets with content descriptions and typical applications. The themes cover useful sources for both physical and social data. Chapter 2 is devoted to data sharing partnerships and procedures, with several examples of arrangements that have proven successful and could be used as models for future development. To meet the rising challenge of security for enterprise data systems, Chapter 3 proposes a responsive method to maintain security within the TDIS enterprise and across a series of future data sharing partnerships. The highest priority sources of disaster-related data are identified in Chapter 4. These “framework” data sets are commonly referenced regardless of the kind of disaster that occurs, but may contain some unique examples linked to a single kind of disaster, such as wildfire detection from data collected by satellite thermal sensors. Chapter 5 reverses the perspective and describes where the data survey discovered shortfalls: currently available data sources that prove inadequate for several reasons, including being incomplete, inconsistent and outdated. A summary of the best practices in data curation appears in Chapter 6 and points to the most effective methods for data collection, identification, organization, preservation and sharing.

Turning attention to different system development strategies, Chapter 7 examines the costs and benefits of building alternative data system architectures and underscores the distinction to be made between centralized data clearinghouses and federated information systems that share their data primarily through web-based data services. In Chapter 8, examples of the most critical data sources for disaster information and analytics are considered along with some use cases to emphasize effective pathways for future development. Probabilistic approaches to predicting future disaster impacts are explored in Chapter 9 with an eye toward the time when numerical modeling and simulation methods may improve to become trustworthy sources for damage estimation before a particular disaster strikes. Chapter 10 provides the overall recommendations for the future design and implementation of the TDIS architecture and focuses on the steps that will be most important to guarantee its success.

Work leading to this final report received funding from the General Land Office through GLO Contract No. 18-229-000-B854.

Chapter 1. Survey of Data for Response, Recovery and Mitigation

To determine what data resources are most often used to document and analyze the impacts arising from different kinds of natural and man-made disasters and the processes that occur during post-event recovery, we performed an exhaustive survey of the digital archives and data services hosted by federal, state and local organizations. The exploration identified hundreds of data sets having frequent application to disasters, far more examples than could be discussed in detail in this chapter for the many data themes. Our selection of significant data themes has its own schematic limitations and may omit a few data sets that are not easily placed into a single category.

In this review, eighteen data themes were chosen as a method to organize the information used to describe the physical environment, such as weather and climate data, and features of the social and political landscape, such as demographic and economic data. Some of the themes are generalized and broad, whereas others are more limited. All are selected in this instance as distinctively meaningful for their frequent use in disasters. The individual data themes are listed here:

• Geodetic Control
• Bathymetry
• Topography
• High-Resolution Baseline Imagery
• Hydrography
• Soils
• Climate/Weather
• Administrative Boundaries
• Land Cover/Land Use/Zoning
• Agriculture
• Utilities/Critical Infrastructure
• Transportation
• Street Addressing
• Parcels
• Demography
• Economic Activity
• Disaster Insurance Claims
• Public Health

For several especially relevant categories of data, we compiled extensive lists of the primary data sets while performing an assessment of the characteristics and overall data quality of each example. The analysis applied a uniform set of criteria describing the level of feature attribution, metadata, spatial accuracy, completeness and other quality factors described in the table below. The full results are contained in electronic spreadsheets at this link (#1.i.1).

Table 1. Data Product Quality Assessment

Accessibility. How accessible is the data? Is the data readily available online or must it be requested from the host? Are only authorized users allowed to access the data? Are there tools to easily download the files? Is the data easy or difficult to find in the web service? (Some website tools require a significant effort to locate specific data sets.)
Scoring: 1 – not available at all to external users; 2 – not accessible, must e-mail or transfer on media; 3 – raw, unprocessed, available on web; 4 – zipped spatial files, available on web; 5 – has web services for using data directly.

Availability. Is the data available in multiple formats including shp, gdb, kml, scientific (NetCDF, etc.), and geojson? Some data may be available only in tabular format, which may require additional preparation to transform into a spatial data format (for example, having to manipulate the coordinate fields to enable creation of a point layer file).
Scoring: 1 – only non-spatial tabular data available; 2 – at least 1 format available; 3 – at least 2 formats available; 4 – at least 3 formats available; 5 – 4 or more formats available.

Currentness. How current is the data? How often is it updated? This mainly applies to the non-static data sets that are expected to change through time. For instance, a data set that should be updated monthly, but has not been updated in five years, should receive a lower score.
Scoring: 1 – unknown; 2 – never updated; 3 – infrequently updated; 4 – moderately updated; 5 – frequently updated.

Attribution. Does the data set have an attribute table with an appropriate number of fields describing the features? Are the fields useful or largely uninformative? Note, some data sets are simpler than others and may not require many descriptive fields. A judgment is needed to determine whether the fields offer a sufficient description of the data set.
Scoring: 1 – no useful attribution or many null values; 3 – some attributes available; 5 – thorough attribution.

Completeness. Is there obviously incomplete spatial data, such as missing points, lines and polygons? Is production of the data set finished or still in progress?
Scoring: 1 – missing significant features; 3 – unfinished, moderately complete; 5 – mostly complete.

Spatial Accuracy. When overlaid upon basemap reference data and aerial orthoimagery, are the feature locations spatially accurate or is there significant displacement from the actual positions?
Scoring: 1 – 10% accurately overlay basemaps; 2 – 30% accurately overlay basemaps; 3 – 50% accurately overlay basemaps; 4 – 70% accurately overlay basemaps; 5 – 90+% accurately overlay basemaps.

Level of Detail. What level of detail is represented by the linework? This factor is typically related to spatial scale; however, many data sets do not specify a feature collection scale, and this may be hard to estimate without additional information. What is the appropriate scale for the use of the data set? At the neighborhood level, with features reflecting streets and structures? At the county scale? At the state scale?
Scoring: 1 – state; 2 – regional; 3 – county; 4 – city; 5 – neighborhood.

Table 1. Data Product Quality Assessment (continued)

Metadata. Is the data set accompanied by a separate metadata document describing the data? Does the metadata contain the most important information, such as description, date updated, field contents, accuracy, and source information? Some data sets have auto-generated metadata. Other data sets may have no metadata at all. Some have very abbreviated metadata, such as those typically provided by ArcGIS Online. Still others are machine-generated descriptions that may contain feature counts and processing steps, but do not have useful information written by an author.
Scoring: 1 – none; 3 – partial; 5 – detailed.

• Each category is assigned a score (1 to 5) for purposes of ranking
• From the categorical scores, a field is created that describes overall quality
• Provide overall average score
• Highlight categories with low scores
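To make the scoring notes above concrete, the short sketch below (illustrative only, with hypothetical scores for a single data set) rolls the categorical scores into an overall average and flags the categories that need attention.

    # Hypothetical 1-to-5 scores for one data set, following the Table 1 categories.
    scores = {
        "Accessibility": 5,
        "Availability": 4,
        "Currentness": 2,
        "Attribution": 3,
        "Completeness": 5,
        "Spatial Accuracy": 4,
        "Level of Detail": 3,
        "Metadata": 1,
    }

    overall = sum(scores.values()) / len(scores)                       # overall average score
    low_categories = [name for name, value in scores.items() if value <= 2]

    print(f"Overall quality score: {overall:.1f} out of 5")
    print("Categories needing attention:", ", ".join(low_categories) or "none")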

When attempting to perform a survey of this nature, the enormous variety of data themes and associated data sets becomes overwhelmingly apparent. Additionally, the variety of different sources for data and their organizational structure and methods for data distribution are equally diverse and complex. At the national level, Data.gov (#1.i.2) exemplifies an admirable effort to create a single entry point to facilitate discovery and access to the myriad data themes managed by the federal government. The National Map (#1.i.3) platform maintained by the U.S. Geological Survey is a comparable effort that allows direct access to many sources of framework geospatial data having uniform metadata defined by the Federal Geographic Data Committee (FGDC). The care taken to construct and maintain these online resources and data services merits our continued support, but also draws attention to the limitations of the federal archives, where some themes are infrequently updated and do not reflect the higher quality of more recent production methods.

Indeed, when the pressing need to replace outdated information becomes paramount, the federal system may not be responsive, which creates a market for commercially licensed data sources from private sector vendors, as is the case for a large portion of the best data available for transportation and navigation. In an opposite reaction, it also generates a demand for crowd-sourced data, such as OpenStreetMap (#1.i.4), created by a volunteer community of nongovernment cooperators whose production methods are variable and often loosely managed by an agreed adherence to guidelines set by the Open Geospatial Consortium (OGC). Another complicating factor in data repository organization is the arrival of completely new data sources. Images taken by cameras on small aerial drones can be extremely useful documentation for disasters, but the production standards, data formats, metadata requirements and overall best practices are still evolving. At this time, despite the proliferation of camera-equipped quadcopters, these inadequacies make a comprehensive search for drone photography nearly impossible because the data resource is not yet coherent.

With regard to the geospatial data used in geographic information systems (GIS), the Texas Natural Resources Information System (TNRIS) (#1.i.5), a division of the Texas Water Development Board (TWDB), operates as the gateway to statewide framework data sets used by all levels of government, the private sector and the general public. Most of the data themes are available at comparable spatial scales for any area of Texas and often result from cooperative funding agreements with federal agencies to build high-quality national data products. TNRIS is the home of the Strategic Mapping Program, or StratMap (#1.i.6), which leads the state initiative to build and maintain framework data layers covering orthoimagery, topography, hydrography, land parcels and address points in coordination with federal, local and state agencies.

In the following sections, we focus our attention on each of the 18 data themes by describing their characteristics, use cases and primary sources.

Geodetic Control

Geodetic control forms the reference frame upon which all other geospatial data themes are constructed. The control network of highly accurate point locations anchors features in three-dimensional space with respect to their horizontal positions according to the geodetic reference datum in a map coordinate system and their vertical positions in relation to the Earth ellipsoid approximating the true shape of the Earth. The ellipsoidal heights are normally converted to heights (and depths) measured above or below mean sea level. The National Geodetic Survey (NGS) administers the national program to collect and preserve geodetic control points used by land surveyors and others who need reliable reference points upon which to base their measurements (#1.1.1). NGS provides a Survey Data Explorer that can be used to find the many sources of geodetic control in Texas. These benchmarks are monitored and validated by periodic field inspections and are reoccupied using improved survey instruments to record updated positions to account for the gradual shifts of the ground surface over time. Modern survey devices are capable of establishing geodetic positions having horizontal and vertical accuracies of 1-2 centimeters, or less than an inch. This remarkable accuracy is extremely important when considering the hazard of floods, where a difference of under a foot can result in property loss. Expensive efforts to collect high-resolution orthoimagery and LiDAR-based digital terrain models can be undermined by tying the data to faulty horizontal and vertical control. Much of the value of all geospatial information rests upon having a stable, accurate framework of geodetic control. For this reason, federal, state and local agencies reference their own survey networks to the positions of NGS geodetic control points wherever possible.

Advances in satellite positioning, drawing on constellations of global positioning satellites (GPS), have stimulated recent work started under the NGS Height Modernization Program, which is leading to better datums (#1.1.2). Soon the prevailing North American Datum of 1983 (NAD 83) and North American Vertical Datum of 1988 (NAVD 88) will be replaced with new horizontal and vertical reference frames. The program aims to achieve 2-centimeter orthometric heights (heights relative to sea level) across the entire conterminous United States.

In Texas, the NGS regional advisor is affiliated with the Conrad Blucher Institute (#1.1.3) at Texas A&M Corpus Christi, which hosts the Texas Spatial Data Center and coordinates activities related to the federal height modernization program.

• A comprehensive discovery tool for finding sources of horizontal and vertical geodetic control in Texas can be accessed through the NGS Survey Data Explorer (#1.1.4).

Bathymetry

Bathymetry data include a wide variety of depth measurements extending below the water surface in the marine, near-shore, bay and estuary, river and reservoir environments. Bathymetry data sets are used to represent underwater topography, including the shapes of features on the seabed and the bottoms of bays, natural lakes and reservoirs. In the context of inland hydrologic processes that convey water downslope and downstream, bathymetry includes the shapes of river channels along with natural and dredged channels located near the coast. Bathymetric measurements involve a broad range of techniques, including rope-line sinker soundings, single-beam echo soundings, multi-beam sonar surveys, total station cross-section surveys, water-penetrating topographic/bathymetric LiDAR and other methods.

The National Oceanic and Atmospheric Administration (NOAA) serves as the authoritative source for bathymetric data along the Texas Gulf Coast. Data sets are available through NOAA and may be explored using the National Geophysical Data Center (NGDC) Bathymetry Data Viewer (#1.2.1). Digital bathymetric grids created using a variety of technologies, from historical sounding records to collections from aerial surveys by the latest topo-bathymetric LiDAR (Light Detection and Ranging) sensors, are available from the NOAA Office for Coastal Management Digital Coast (#1.2.2). The bathymetric data produced for river channels and other underwater features located inland tend to be created and distributed for individual project areas. For many flood protection and mitigation projects conducted by the U.S. Army Corps of Engineers (USACE), detailed ground-surveyed cross-sections of channel profiles are needed to estimate river discharge volumes and current velocities using the Hydrologic Engineering Center River Analysis System (HEC-RAS) hydraulic model developed by USACE and similar one- and two-dimensional computational fluid dynamics models. Studies of site-specific risks from inland flooding are highly dependent on the quality of data representing the characteristics of channel geometry. Advances in the collection and analysis of aerial LiDAR data permit the development of digital cross-sections of channels and adjacent floodplains under low-water conditions, leading to the better definition of drainage networks in flood-prone areas.

Along the shoreline and in the bay and estuary environment near the mainland, accurate bathymetry is required to estimate the inundation that occurs during the storm surge and wave impacts generated by hurricanes and tropical storms during landfall. Characterization of the near-shore bathymetry and bottom friction are essential factors in the numerical forecasts of the water volume that will be pushed over dry land in the storm surge forecasts made by NOAA’s Sea, Lake and Overland Surges from Hurricanes (SLOSH) model and the more detailed Advanced Circulation (ADCIRC) Storm Guidance System developed by the ADCIRC Development Group (#1.2.3) of academic researchers. Future advances in the production of bathymetric data for hydrodynamic modeling are closely linked to the national 3D Elevation Program (3DEP) coordinated by the U.S. Geological Survey, which integrates high-resolution topographic and bathymetric data in a seamless elevation grid.

• An example of seamless digital bathymetric data resulted from a project by the Texas Parks and Wildlife Department to create uniform coverage of shoreline bathymetry and topography. (#1.2.4)

Topography

Topography describes the shape and features of land surfaces and is usually represented either by vector data, such as contour lines, or by raster data, commonly referred to as Digital Elevation Models (DEMs). Topography data play a critical role in flood inundation modeling because they determine the flow direction of water downslope and into drainage networks. DEMs can be grouped into different categories based on the spatial scales of DEM coverage (global, national, and regional).

Accurate topography is critical for making decisions to mitigate damage from river flooding. One arc-second (~30 meter) resolution DEM products are available at the national scale from the U.S. Geological Survey (USGS) (#1.3.1), including the National Elevation Dataset (NED) (#1.3.2) and the Shuttle Radar Topography Mission (SRTM) (#1.3.3). 1/3 arc-second (~10 meter) NED data products cover most of the United States. The NED 30 m products are consistent and seamless topography datasets available at the national level, which are good for hydrological modeling at a large scale, but they are not suitable for fine-scale or street-level flood modeling. For example, a 30 m resolution pixel cannot distinguish small rivers with a bank-full width narrower than 30 m.
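As a point of reference for the resolutions quoted above, the short calculation below (an added illustration assuming a spherical Earth of mean radius 6,371 km) shows why 1 arc-second corresponds to roughly 30 meters on the ground and how the finer NED products scale.

    import math

    EARTH_RADIUS_M = 6_371_000
    METERS_PER_DEGREE_LAT = math.pi * EARTH_RADIUS_M / 180   # about 111 km per degree
    METERS_PER_ARCSEC_LAT = METERS_PER_DEGREE_LAT / 3600     # about 30.9 m per arc-second

    latitude_tx = 30.0  # roughly central Texas
    meters_per_arcsec_lon = METERS_PER_ARCSEC_LAT * math.cos(math.radians(latitude_tx))

    print(f"1 arc-second of latitude  ~ {METERS_PER_ARCSEC_LAT:.1f} m")
    print(f"1 arc-second of longitude ~ {meters_per_arcsec_lon:.1f} m at {latitude_tx} deg N")
    print(f"1/3 arc-second            ~ {METERS_PER_ARCSEC_LAT / 3:.1f} m")
    print(f"1/9 arc-second            ~ {METERS_PER_ARCSEC_LAT / 9:.1f} m")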

At the regional scale, the NED offers some areas with 1/9 arc-second (~3 meter) resolution data, but this coverage is largely limited to urban areas mapped using aerial LiDAR systems (#1.3.4). These different spatial resolutions of the DEMs lead to discontinuities where the elevation grids are joined, and these problems impact the performance of the models. To resolve this issue, the USGS 3D Elevation Program (3DEP) (#1.3.5) plans to complete acquisition of nationwide high-resolution topographic elevation data by 2023 by using Interferometric Synthetic Aperture Radar (IfSAR) over Alaska and LiDAR (bare earth and 3D point cloud elevations) for the rest of the nation. LiDAR topography will be discussed in more detail in a separate section later.

DEMs provide “absolute” elevations referenced to surveyed geodetic points in the local terrain. A lower elevation location is not necessarily more likely to be flooded than a higher elevation location because the directions of water flow are determined by the local slope and aspect of the terrain. When modeling flood inundation, a DEM is not used directly in the model. Instead, it is usually de-trended, so that the heights of the surface relative to the river channels can be derived. These relative heights can be applied to mark water levels predicted by numerical hydrologic and hydraulic flow models. Height Above the Nearest Drainage (HAND) (#1.3.6) is a hydrologically relevant new terrain model that implements the “relative” elevations. By normalizing topography based on the local relative heights found along the drainage network, the HAND model represents the topology of the local draining potential.

The National Water Model (NWM) (#1.3.7) is the first hydrologic model to simulate observed and forecast streamflow over the entire continental United States (CONUS). Together with HAND-derived rating curves for discharge, the NWM uses the predicted stream flow to estimate water depths. The first version of the HAND used for the NWM was generated from the 1 arc-second (~30 meter) resolution National Elevation Dataset (NED). Due to the drawbacks mentioned above, a newer version of the HAND was derived from the 1/3 arc-second (~10 meter) NED for most parts of the US by the NOAA National Water Center using Esri’s newly developed ArcHydro Tools. The completion of the USGS 3DEP’s coverage of consistent, high-resolution elevation data will permit the HAND analysis technique to make a significant contribution to national flood inundation modeling.
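The sketch below (an illustration using a hypothetical grid, not code from the NWM or this report) shows the core of the HAND approach: once each cell's height above its nearest drainage is known, a forecast water level is converted to an inundation depth wherever the predicted stage exceeds that relative height.

    import numpy as np

    # Hypothetical 4x4 HAND grid: height (m) of each cell above its nearest drainage cell.
    hand = np.array([
        [0.0, 0.5, 1.2, 3.0],
        [0.2, 0.8, 1.5, 2.5],
        [0.0, 0.4, 2.0, 4.0],
        [0.1, 0.9, 2.8, 5.0],
    ])

    forecast_stage_m = 1.0  # water level above the channel predicted by a hydraulic model

    depth = np.clip(forecast_stage_m - hand, 0.0, None)  # water depth where stage exceeds HAND
    flooded = depth > 0.0

    print("Flooded cells:", int(flooded.sum()), "of", hand.size)
    print("Maximum depth:", depth.max(), "m")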

High-Resolution Baseline Imagery

Once exact geodetic control is established for an area and used to anchor digital elevation models of the local topography, the next step in the creation of an accurate digital map is the production of orthoimagery from conventional aerial photographs or from the imagery collected by aerial and satellite sensor systems. The original images are combined with the digital elevation models to produce orthoimagery, which adjusts the positions of the photographed features to match their locations in the digital elevation grid containing the correct orthometric heights. Every feature in the resulting orthoimage appears as if it were viewed directly from above, without the distortions caused by the different heights of objects, the camera tilt and lens aberrations. When presented in a standard map coordinate system, orthoimagery can be used to measure features and to extract points and polygons from a defined reference frame. With the introduction of advanced airborne GPS navigation systems coupled with inertial measurement units, high-quality orthoimagery can now be produced within hours after the source imagery is collected by aerial surveys. This entire process enables the development of a time series of orthoimages in which the baseline is rigorously maintained, allowing changes in surface features to be accurately displayed, analyzed and mapped.

High-resolution baseline imagery commonly used today in Texas captures feature details with a typical spatial resolution of 1-meter for statewide coverage and improving to 15 centimeters in large urban areas. Aerial multiband camera systems record visible color images and composite images that add a near infrared band to the visible red and green. Comparable high-resolution multispectral satellite instruments collect data in the visible, near infrared and shortwave infrared parts of the spectrum. The additional spectral bands recorded by satellites can be used to differentiate vegetation, water and man-made structural features, allowing the development of image classification algorithms that can be used to automate land cover classification, enhance the detection of waterborne pollutants and distinguish the burn scars left by wildfires. In recent years, the imaging systems carried by satellites have become miniaturized. Constellations of small orbiting satellite sensors capable of recording 1- to 3-meter resolution images are now operated by Planet Labs and other private firms that offer the possibility of nearly daily orthoimage production for any location in Texas, if weather conditions allow.
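The near-infrared band mentioned above supports the simple band arithmetic that underpins many automated classification methods. The sketch below (with illustrative reflectance values only) computes two widely used indices: NDVI, which is high over vegetation, and NDWI, which is high over open water.

    import numpy as np

    # Hypothetical surface reflectance values for three pixels: vegetation, water, bare soil.
    red   = np.array([0.05, 0.06, 0.25])
    green = np.array([0.08, 0.07, 0.22])
    nir   = np.array([0.45, 0.02, 0.30])

    ndvi = (nir - red) / (nir + red)      # Normalized Difference Vegetation Index
    ndwi = (green - nir) / (green + nir)  # Normalized Difference Water Index

    print("NDVI:", np.round(ndvi, 2))  # highest for the vegetated pixel
    print("NDWI:", np.round(ndwi, 2))  # highest for the water pixel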

Figure 1.1. A NOAA Digital Modular Camera orthoimage shows floodwaters in Rose City, Texas, from Hurricane Harvey on September 1, 2017.

High-resolution baseline imagery serves a variety of purposes in support of disaster response and recovery activities as well as the mitigation of hazards. The imagery helps to map the locations of critical infrastructure, industrial facilities, retail businesses and residences in areas that may be threatened by heavy rainfall, river flooding, storm surge, windstorms, wildfires and other damaging events (Figure 1.1). When a disaster does occur, the baseline orthoimagery serves as a pre-event reference frame with which to determine the extent, magnitude and style of the damage recorded in post-event orthoimagery. As image processing and machine learning techniques improve, remote sensing scientists anticipate advances in computer-assisted techniques to perform wide area damage assessment. Technical breakthroughs of this kind would significantly accelerate the documentation of disaster impacts in the future.
• The Texas Orthoimagery Program (TOP), managed by the Strategic Mapping Program of the Texas Natural Resources Information System (TNRIS), maintains statewide collections of aerial orthoimagery dating from the mid-1990s, including the most recent 50-centimeter resolution products. (#1.4.1)
• The Hazards Data Distribution System (HDDS) administered by the U.S. Geological Survey provides access to many collections of satellite and aerial orthoimagery collected during major disasters in the United States and worldwide. While the main purpose is to distribute post-event imagery showing disaster damage, the archive often holds data sets of pre-event baseline reference imagery for comparison. (#1.4.2)
• One of the most effective sources for rapid orthoimagery production during disaster response operations is the NOAA Emergency Response Imagery program conducted by the National Ocean Service using high-resolution multiband aerial camera systems. (#1.4.3)

Hydrography

Hydrography is the science dealing with the measurement and description of the physical features of water bodies and the land areas adjacent to them and predicting their change over time.

In the United States, national surface water and hydrographic geospatial data sets are managed by the U.S. Geological Survey (USGS). These include the National Hydrography Dataset (NHD) (#1.5.1), the Watershed Boundary Dataset (WBD) (#1.5.2), and NHDPlus High Resolution (NHDPlus HR) (#1.5.3). Hydrography data are widely used by federal, state, regional, county and local government agencies, and they play an important role in stream flow and storm water management, flood risk modeling, coastal hazards assessment, landscape conservation and water quality management.

The National Hydrography Dataset (NHD) is the most up-to-date and comprehensive hydrography dataset for the United States. The NHD High Resolution (NHD HR), mapped at a scale of 1:24,000 or larger, is the most recent version of the data. NHD HR is a vector-based data set representing the river and stream drainage network of the country, including multiple feature datasets, feature classes, event feature classes, attribute tables, relationship classes, domains, and feature-level metadata. An improved version that integrates the NHD with the Watershed Boundary Dataset (WBD) and 3D Elevation Program (3DEP) data is contained in the NHDPlus High Resolution (NHDPlus HR) data set.

NHDPlus HR Beta is modeled after the highly successful medium resolution NHDPlus v2 (#4). Like the NHDPlus v2, the NHDPlus HR comprises a seamless national network of stream reaches, elevation-based catchment areas, flow surfaces, and value-added attributes that enhance stream network navigation, analysis, and data display. However, the NHDPlus HR increases the number of features from the 2.7 million in NHDPlus Version 2 to over 30 million, and it will also allow the selection of more generalized stream networks for regional or national analysis, while retaining the spatial accuracy of the highest-resolution, nationally available data sets. NHDPlus HR Beta generation is ongoing and reaching completion. When finished, the NHDPlus HR will provide a common geospatial framework that is open and accessible to the public, including government, private citizens, and industry (#1.5.4).

Stream gages are important sources for hydrologists and environmental scientists monitoring water stage and flow discharge. However, fewer than 8,000 gage stations are in service, and the official NWS river forecasts are only available at approximately 4,000 locations across the CONUS (#1.5.5). In Texas, there are fewer than 1,200 gages available across the huge territory. This greatly limits the capability of river forecasting in the state. The NWM can make predictions for about 100,000 river reaches with the NHDPlus v2 dataset, which greatly complements the gage measurements. NHDPlus HR Beta is currently available for Texas with about two million river reaches. It will significantly improve existing riverine flood inundation modeling because it provides far more detailed spatial attributes than NHDPlus v2 to support local analysis and modeling. These improvements to hydrography will help emergency response activities and the recovery from future flood disasters.

Soils

Soils data products contain a wide variety of descriptive attributes for the soils covering a particular area, including taxonomic classification, chemical and physical properties, water capacity, frequency of flooding, and land development. In addition to the primary descriptive elements, the databases also often include interpretations for engineering purposes, natural resource management and disaster response. The soils data are usually collected by ground surveys conducted by soil scientists. Soil samples collected in the field are further analyzed in laboratories. The data are typically represented in publications and digital data sets at multiple scales and resolutions in both tabular and geospatial formats.

The Natural Resources Conservation Service (NRCS), an agency under the U.S. Department of Agriculture (USDA), serves as the primary source of soils data in Texas. The NRCS maintains the largest natural resource information system in the world and provides soil maps and data for over 95 percent of the counties in the United States. The Soil Survey Geographic Database, otherwise known as SSURGO (#1.6.1), represents the NRCS’s flagship soils data product. The database contains over 36 million mapped features that link to over 70 tables describing hundreds of soil feature attributes. These data constitute the agency’s highest resolution product for most areas and are typically published at scales ranging from 1:12,000 to 1:24,000.

The State Soil Geographic (STATSGO) data set provides lower resolution, more generalized mapping of the soils in the United States (#1.6.2). STATSGO products are released at a scale of 1:250,000 and are appropriate for broad-based planning at a state or regional level. More detailed soil survey maps are generalized to contribute to the STATSGO data. In areas where no detailed mapping exists, the NRCS analyzes geology, topography and vegetation maps in conjunction with Landsat satellite images to classify the soil associations. A 2006 revision of the STATSGO spatial and tabular data resulted in the U.S. General Soil Map (STATSGO2).

The Raster Soil Survey (RSS) product more precisely represents soil concepts in the landscape than the more conventional STATSGO and SSURGO products. The RSS stores data as an array of raster cells, with each cell being assigned soil taxonomy and other soil map unit components corresponding to the descriptions in the SSURGO database. The available RSS data covers only a very small portion of the United States.

The Gridded National Soil Survey Geographic Database (gNATSGO), a composite of the SSURGO, STATSGO, and RSS databases, provides the best soils data available from a single online resource (#1.6.3). In this database, soil map units have been converted to 10-meter rasters. SSURGO represents the largest percentage of the gNATSGO database, with STATSGO used only in places where SSURGO and RSS do not exist, essentially filling in gaps in the national coverage.

All of the soils data described here can be accessed through the USDA Geospatial Data Gateway (#1.6.4). The data are generally available in file geodatabase format, enabling the user to visually display and link to the database using Geographic Information Systems (GIS) software. To assist an analyst in mapping the various soil properties, an Esri ArcGIS Toolbox called the Soil Data Development Toolbox (#1.6.5) (Figure 1.2) allows the user to create many different queries of the data.



Figure 1.2. ESRI’s Soil Data Development Toolbox.

The physical characteristics of soils, particularly their mechanical properties, control the rate of infiltration when rainfall or floodwater accumulates upon the landscape. An accurate characterization of soil properties represents a critical input to numerical models of flooding. Land management practices that account for the type of soil on a property can increase the soil’s infiltration of rainwater, thereby decreasing soil erosion and reducing the runoff potential for flooding. Soils that retain moisture can be beneficial during extended periods of drought. Instrumentation for soil moisture measurement is used in the Texas Soil Observation Network (TxSON) to monitor drought conditions through a network of soil moisture observing stations and Hydromet stations (#1.6.6). Other uses of soil data in disaster planning include stabilizing soils in flood-prone areas, determining the environmental suitability of proposed locations for the disposal of debris and animal carcasses, and managing soil contamination from pollution events.

Climate and Weather

As weather impacts on Texas residents and commerce increase year after year, climate and weather data become increasingly vital to the emergency response, recovery, and planning community. The National Oceanic and Atmospheric Administration (NOAA) provides data to the public for free, but many private vendors market weather products as a supplement to the federal data.

With respect to climate data, a National Weather Service (NWS) study of historical rainfall, published as NOAA Atlas 14, Volume 11, Version 2.0, provides updated precipitation frequency estimates that quantify the degree of risk of flooding at a location. Additionally, NOAA’s Climate Prediction Center (CPC) supports operational predictions of climate variability, real-time monitoring of climate and related databases, and assessments of the origins of major climate anomalies. Analysts use CPC information as an aid in long-range disaster planning. Applications include the mitigation of weather-related natural disasters and uses for social and economic welfare. The products cover time scales from a week to seasons and cover the land, the ocean, and the atmosphere. GIS data in Shapefile and raster formats are available for download from the CPC for drought monitoring, precipitation analysis, sea surface temperature and weather hazard assessments (#1.7.1).

NOAA’s National Centers for Environmental Information (NCEI) serves as the world’s largest provider of weather and climate data. With regard to disaster impacts, NCEI’s severe weather archive of destructive storm and other weather data and information includes local, intense and damaging events, such as thunderstorms, hailstorms and tornadoes. Additionally, the NCEI Severe Weather Data Inventory (SWDI) includes data from a variety of sources in NCEI's archive. SWDI provides the ability to search through all data to find records covering a time period and geographic region, with the ability to download search results in a variety of formats. The formats currently supported are Esri Shapefile, Google Earth KMZ, CSV, and XML. For example, the NCEI Lightning Products and Services provide the number of cloud-to-ground lightning flash detections summarized for each day in 0.10-degree tiles. The tiles are part of the NCEI SWDI and are accessible from a basic Google Maps-based search tool as well as RESTful web services. Web services are also available as gridded summaries, WMS and WCS in NetCDF format by way of FTP data servers (#1.7.2).

For weather data, the NWS GIS Portal is a geospatial data portal that provides current weather and forecasts in addition to past weather and climate data. The GIS Portal can be leveraged for disaster management applications by developing automation methods using RESTful web services or file data downloads. Some useful examples include RADAR, watch/warning area polygons, hurricane forecasts, U.S. hazards/drought/wildfire, weather features, precipitation forecasts/hazards/winter, Advanced Hydrologic Prediction Service (AHPS) river gage observations, National Significant River Flood Outlook, Quantitative Precipitation Forecast (QPF), and Storm Prediction Center (SPC) fire weather outlooks (#1.7.3).
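As one concrete example of this kind of automation, the sketch below pulls active watch and warning information for Texas as GeoJSON from the public api.weather.gov service; this endpoint is an assumption about one convenient access point rather than a description of the NWS GIS Portal itself, and the sketch requires the requests package.

    import requests

    def active_texas_alerts():
        """Fetch active NWS alerts for Texas; each feature carries event type and geometry."""
        url = "https://api.weather.gov/alerts/active"
        response = requests.get(url, params={"area": "TX"},
                                headers={"User-Agent": "tdis-prototype (contact@example.org)"},
                                timeout=30)
        response.raise_for_status()
        return response.json().get("features", [])

    if __name__ == "__main__":
        for feature in active_texas_alerts()[:5]:
            props = feature["properties"]
            print(props["event"], "-", props.get("headline", ""))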

NOAA’s mapping portal titled nowCOAST Web Services provides access to real-time coastal observations, forecasts and warnings. This site is useful for emergency managers or analysts that need current, real-time or near real-time perishable meteorological data for operational decision-making and situational awareness (#1.7.4). Surface meteorological and oceanographic observations are obtained from the NCEP Meteorological Assimilation Data Ingest System (MADIS) every 10 minutes and every hour from the National Environmental Satellite, Data, and Information Service (NESDIS). From geostationary weather satellites, the GOES Visible and GOES IR layers receive updates with the latest imagery from NESDIS every 15-30 minutes. MRMS Doppler radar mosaics are updated every 4 minutes. The warnings for short-duration events are updated every minute, while the watches for short-duration events as well as the watches, warnings, and advisories for long-duration events are updated every 10 minutes. Meteorological data include sea surface temperatures, thunderstorm outlooks, significant wave heights, NEXRAD radar, QPE and QPF. nowCOAST provides direct access to data layers through both ArcGIS RESTful Map Services and OGC WMS (#1.7.5).
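Because nowCOAST layers are exposed through standard OGC WMS, a client can request map images with ordinary GetMap parameters. The sketch below builds such a request; the endpoint and layer name are placeholders, and the actual service URLs and layer identifiers should be taken from the nowCOAST documentation (#1.7.5).

    from urllib.parse import urlencode

    def wms_getmap_url(base_url, layer, bbox, width=1024, height=768):
        """Build a WMS 1.3.0 GetMap URL for one layer over a geographic bounding box."""
        params = {
            "service": "WMS",
            "version": "1.3.0",
            "request": "GetMap",
            "layers": layer,
            "styles": "",
            "crs": "EPSG:4326",
            # WMS 1.3.0 with EPSG:4326 uses lat/lon axis order: minlat,minlon,maxlat,maxlon.
            "bbox": ",".join(str(v) for v in bbox),
            "width": width,
            "height": height,
            "format": "image/png",
            "transparent": "true",
        }
        return f"{base_url}?{urlencode(params)}"

    # Example: a radar mosaic layer over the Houston area (placeholder endpoint and layer name).
    print(wms_getmap_url("https://nowcoast.example/wms", "radar_mosaic",
                         bbox=(28.5, -96.5, 30.5, -94.0)))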

As seen in the above summaries, abundant climate and weather data exist for Texas from authoritative sources, most of which can be downloaded, while many are communicated in near-real time as data services.

Administrative Boundaries

Geographic boundaries that are not part of the natural environment and are defined by legal statute, treaty, or other anthropogenic definitions can be described as administrative boundaries. They are often placed to mark the divisions between sovereign governments and political jurisdictions, such as countries, states, counties, cities or local zoning units. These geographic entities may be delineated based on a common culture and history, language, systems of governance, or a long-standing result of trade, conflict or war.

Other types of administrative boundaries represent a governmental unit’s scope of responsibility within a defined area to deliver some aspect of service or taxing authority. Councils of government (COGs) in Texas represent a collection of counties that addresses issues such as regional and municipal planning, economic and community development, pollution control, transit administration, transportation planning, human services and water supply. COGs are also involved with emergency planning and management and often provide sources of local demographic, economic, public health and geospatial data. The table below (Table 2) lists some typical administrative boundaries, along with their sources, that may contribute to the information needed for an emergency recovery and mitigation project.

Table 2. Administrative Boundary Sources. Data offerings may change over time. Each of the boundary data sets listed below is available from one or more of the following sources: TNRIS, TxDOT, FEMA, TWDB and the COGs.

• Federal Emergency Management Agency (FEMA) Regions and Offices
• FEMA Current and Historical Disaster Declarations
• Texas Councils of Government
• Cities
• U.S. Congressional Districts
• Texas House and Senate Districts
• TxDOT Urbanized Areas
• Texas Metropolitan Planning Organizations
• Federal and State Agency Regional Boundaries
• Texas Regional Mobility Authorities
• Texas Education Agency Boundaries
• Texas Military Boundaries
• Texas State Boundaries
• Counties
• Texas Groundwater Conservation Districts
• Texas River Authorities and Special Law Districts
• Texas Flood Planning Regions
• 2010 U.S. Census Tracts, Block Groups, Blocks
• Texas Parks
• Texas State Board of Education Districts
• U.S. Postal Zip Codes
• Local Zoning
• Subdivisions

As Table 1 shows, numerous government organizations maintain public data sources for administrative boundaries. In addition, the Texas Natural Resources Information System (TNRIS) DataHub (#1.8.1) offers boundary data through a web-based interface. TNRIS serves Texas agencies and residents as a centralized clearinghouse and referral center for natural resource data, census data, data related to emergency management, and other socioeconomic data (#1.8.2).

Other sources for administrative boundary data include TxDOT (#1.8.3), FEMA (#1.8.4), TWDB (#1.8.5), and in particular three COGs that have extensive data offerings freely available to the public, with some specialty data available for purchase. These three COGs represent approximately 60 percent of the total Texas population and cover 39 counties:
1. North Central Texas Council of Governments (NCTCOG) – 16 counties (#1.8.6)
2. Houston-Galveston Area Council (H-GAC) – 13 counties (#1.8.7)
3. Capital Area Council of Governments (CAPCOG) – 10 counties (#1.8.8)
COGs provide local data not commonly hosted by state agencies. For example, CAPCOG's Regional Open Data Portal includes geospatial boundary data for school districts, local parks and open spaces, city limits, and locally extracted U.S. census tracts, block groups and blocks for 2010.

For disaster response, the use of administrative and census boundaries separates jurisdictional command and control, delineates the scope of responsibility for impacted geographic areas and also provides information sources for societal impact estimates through demographic aggregations at the census tract or block levels. Geospatial boundary data can also be staged and prepared in advance of an event and later used for mitigation and planning purposes.

Land Cover / Land Use / Zoning

Land use and land cover are commonly referenced in the same conversation because they are inherently related. Land cover typically refers to the physical landscape and its dominant vegetation, whereas land use describes the modification and management practices that humans apply to the land for economic purposes. The two prevailing methods for determining land cover characteristics are field surveys and the interpretation of imagery collected by satellites and aircraft. Examples of land cover include grassland, water, urban or agricultural cover types. By analyzing several years or decades of changing land cover, it is possible to create land conversion models that show how man (or nature) has altered the landscape. Land conversion models can help to predict, or even guide, how the land cover will develop in the future. Zoning is a method to administer land use in which a jurisdiction, typically a city, designates areas or zones where certain land uses are permitted or prohibited. Some zones are single use (residential, industrial) and some are designated for mixed use.

The Multi-Resolution Land Characteristics (MRLC) consortium (#1.9.1) of federal agencies coordinates to create a nationwide land cover data product known as the National Land Cover Database (NLCD). The 30-meter resolution product represents 16 classes based on a modified Anderson Level II classification system of land cover characteristics. Through 2016 NLCD was updated in five-year cycles. The NLCD 2016 version (based on 2016 Landsat satellite data) enables scientists to assess land cover changes and trends between 2001 and 2016 and can be downloaded from the MRLC website (#1.9.2). In the case of land use zoning, spatial data and map services are usually posted on city websites, such as the City of Round Rock (#1.9.3). Unfortunately, there is no central source for zoning data in Texas.
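As a simple illustration of working with the downloaded product, the Python sketch below tabulates the percentage of each NLCD class within an area of interest. The file name is hypothetical and assumes the raster has already been clipped to the study area.

```python
"""Minimal sketch: tabulate NLCD land cover class percentages for an area of interest.

Assumes a locally downloaded NLCD GeoTIFF already clipped to the study area;
the file name is hypothetical. Requires rasterio and numpy.
"""
import numpy as np
import rasterio

NLCD_PATH = "nlcd_2016_study_area.tif"  # hypothetical clipped NLCD raster

with rasterio.open(NLCD_PATH) as src:
    data = src.read(1)
    nodata = src.nodata

valid = data[data != nodata] if nodata is not None else data.ravel()
classes, counts = np.unique(valid, return_counts=True)

# Report the share of each land cover class code (e.g., 21-24 are the developed classes).
for code, count in zip(classes, counts):
    print(f"NLCD class {int(code)}: {100.0 * count / valid.size:.1f}%")
```

Running the same tabulation against the 2001 and 2016 rasters and differencing the class shares is one simple way to quantify the land conversion trends described above.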

Natural hazards and major land-use/land-cover change can have significant adverse impacts on urbanized areas in Texas. By understanding the relationship between land use, land cover and disaster impacts, urban planners can propose changes through zoning to prevent or mitigate damage to communities in the future. An effective Land Use Planning (LUP) policy should consider the spatial distribution of disaster risks and take measures to encourage more sustainable land development while reducing the vulnerability of lower income communities, which often occupy locations in the higher-risk areas.

With increased urbanization becoming a global trend, the natural environment is rapidly being converted to man-made environments that have higher percentages of impervious cover types (roads, structures, etc.). This vastly increases the vulnerability of residents to floods and property loss. According to one study, "The Effect of Land Cover Change on Flooding in Texas" (#1.9.4), flood damage decreased in areas that remained forested and increased in developed urban areas as land cover changed, resulting in greater property damage from flood events.

In coastal areas, planners can use land cover to predict and assess impacts from floods and storm surges during hurricanes. After Hurricane Katrina, the accelerating loss of Louisiana's coastal marshes and swamps contributed to the overall damage. Research published in the Journal of Geophysical Research (JGR): Oceans showed that accurate representation of land cover characteristics leads to a major improvement in the results produced by numerical storm surge simulations (#1.9.5).

Agriculture

A wide range of periodic statistical reports and geospatial data are contained in the agriculture data theme, but one of the primary data sets, the Common Land Unit, carries use restrictions. The United States Department of Agriculture (USDA) issues county-level data in voluminous, well-structured reports every five years through the National Agricultural Statistics Service (#1.10.1). The most recent Census of Agriculture appeared in 2017, with the next release in preparation for 2022. The census data covers agricultural production by farmers and ranchers for all commercial crops and animals raised and sold and contains data on the numbers of farms/ranches and acreage by county and the types of production. Extensive supplementary information also appears and is very useful in tracing changes in agricultural trends through time, because the census has five-year data compilations dating back to the mid-nineteenth century. In cooperation with the Texas Department of Agriculture (TDA), the USDA provides Texas-specific production reports for many categories with annual updates by county (#1.10.2). The USDA Forest Service issues state and county level statistics on forested lands and commercial timber production through its Forest Inventory and Analysis Program (#1.10.3) that are relevant to the counties in East Texas.

Whereas data sources are plentiful at the county level, more specific information about current agricultural practices and production for individual land holdings can be difficult to acquire and share. In descriptions of agricultural practices, the USDA Farm Services Agency (FSA) designates the Common Land Unit (CLU) as the smallest contiguous parcel of land with a relatively permanent boundary having particular land cover characteristics and land management practices. CLUs are often used as the administrative areas for crop insurance programs and other farming uses that receive federal reimbursement for prescribed land management, such as the acreage enrolled in the Conservation Reserve Program. The FSA offered public access to GIS data products that represent the CLUs until 2008, when congressional action caused their removal on the basis of protecting landowner privacy. CLU data sets continue to be updated, but the FSA must approve any data transfer for use by external organizations. To assist field inspections and perform updates to the CLUs, FSA sponsors the National Aerial Imagery Program (NAIP), which acquires detailed (1-meter or better) multispectral orthoimagery of cropland, ranchland and other areas monitored by USDA programs. In Texas, the Strategic Mapping Program administered by TNRIS has partnered with the FSA to produce 60-centimeter NAIP orthoimagery collected during the 2018 growing season over the entire state. At a less

detailed scale, the National Agricultural Statistics Service compiles the Cropland Data Layers and Cultivated Layer data sets using 30-meter spatial resolution satellite imagery. While the satellite-based products lack the complete attribution contained in the CLUs, the GIS-ready data sets are current and comprehensive for the entire United States and continue to be updated.

Agricultural data are pertinent for disaster response and recovery mainly through the estimation of damage losses. Hailstorms, ice storms, snowstorms and floods disrupt agriculture. Forests and commercial woodlands are frequently damaged by tornadoes, downbursts and straight-line winds, as are fruit orchards and greenhouse horticulture. Over longer time spans, the greatest agricultural losses are the result of prolonged periods of severe drought accompanied by the wildfires that ignite when the landscape is covered in dry vegetation. While the county-level USDA statistics support the assessment of losses to agricultural production from disasters, their contribution could be more effective if the current Common Land Unit data products were placed in the public domain.
• Several generations of NAIP county orthoimagery mosaics of Texas are available from the USDA Natural Resources Conservation Service, with data sets archived from 2003 to 2018-19. (#1.10.4)

Transportation

Transportation systems provide a means to convey people or goods from a point of origin to one or more destinations. The means by which transportation is accomplished can be divided into three basic types: land, such as road, rail and pipelines; water, which includes shipping; and air travel and shipping. Land-based transportation networks and their related infrastructure are the central factors impacting emergency management response, recovery and mitigation. To be useful for planning, analysis and decision-making, transportation systems must be converted to digital networks that represent a framework of routes and hubs linking locations. The networks form the basis of spatial data sets residing in a Geographic Information System (GIS) that supports geospatial analysis functions, such as data and geodatabase management, spatial and network analysis, transportation modeling and presentation of the data. A visual representation of the data allows organizations to envision and discover emerging patterns within the transportation network that reflect different or changing situations, making more informed decisions possible.

Transportation data can be acquired for commercially licensed use or through local and state governments that typically provide free data access to the public. Commercial transportation data often excels when the user requires accuracy, robust and quality-controlled attribution, specialty tools, routing analysis, complementary data layers and timely updates. TomTom, a leading location specialist headquartered in Amsterdam with offices in 30 countries (#1.11.1), provides an example of a qualified private sector vendor offering comparable products with certified quality levels.

In Texas, public sources of transportation data include the Texas Department of Transportation (TxDOT) and various councils of government (COGs), such as the Houston-Galveston Area Council and the North Central Texas COG. The major COGs offer public access to common data sets available through web interfaces that include downloadable data for statewide and local roadways, airports and runways, and major rail and transit rail lines.

At the state level, TxDOT’s Open Data Portal platform serves as a comprehensive statewide transportation data source. The Open Data Portal, developed and maintained by TxDOT’s Transportation Planning and Programming Division, can be used to explore and download GIS datasets. One example is the TxDOT Roadways polyline data set that includes interstate highways, U.S. and state highways, farm, ranch and county roads and local streets. The data set also contains measures, which are stored as M-values within each vertex along the line, or road segment, in the same way that some data sets store z-values for the elevation, except that measures store the distance from the origin point of the line. M-enabled networks serve as the framework for locating roadway assets along the network using a linear referencing system. TxDOT makes available an application programming interface (API) for GIS developers to leverage the ArcGIS REST API for server-based applications that consume these transportation data (#1.11.2).
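To illustrate how measures support linear referencing, the Python sketch below interpolates the coordinates of a point at a requested measure along an M-enabled road segment. The segment vertices and measure units are invented for the example and are not taken from the TxDOT data.

```python
"""Minimal sketch: locate a point along an M-enabled road segment by its measure.

Each vertex is (x, y, m), where m records the distance from the route origin.
The sample segment and units (miles) are invented for illustration.
"""
def locate_by_measure(vertices, target_m):
    """Return (x, y) interpolated at measure target_m along (x, y, m) vertices."""
    for (x0, y0, m0), (x1, y1, m1) in zip(vertices, vertices[1:]):
        if m0 <= target_m <= m1:
            t = 0.0 if m1 == m0 else (target_m - m0) / (m1 - m0)
            return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
    raise ValueError("measure falls outside the segment's M range")

# Hypothetical road segment: three vertices with measures in miles from the origin.
segment = [(-97.70, 30.28, 0.0), (-97.68, 30.29, 1.2), (-97.65, 30.30, 2.5)]
print(locate_by_measure(segment, 1.8))  # e.g., an asset located 1.8 miles along the route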

For flood mitigation planning, analysis using the highway network data could identify areas where road closures may create problems during flood events. For example, by overlaying and intersecting post-Hurricane Harvey historical flood data from the Dartmouth Flood Observatory (#1.11.3) with Houston area highways and streets data, a GIS analyst could identify which roads are likely to be impacted in future events. Additionally, high water mark data may provide further granularity to describe an historical event by documenting measured flood depths with an array of points. The results of this analysis would assist emergency planners in the development of improved evacuation and transportation routing plans. A graduate research project (#1.11.4) at the University of Texas at Austin conducted after 2017’s Hurricane Harvey investigated a similar flood impact scenario.
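A minimal sketch of that overlay, assuming hypothetical file names for the flood extent and roadway layers, might look like the following; any flood polygon and road centerline data sets in compatible coordinate systems could be substituted.

```python
"""Minimal sketch: flag road segments that intersect a historical flood extent.

File names are hypothetical; any flood polygon layer (such as a Dartmouth Flood
Observatory extent) and a roadway centerline layer could be substituted.
Requires geopandas.
"""
import geopandas as gpd

roads = gpd.read_file("houston_roadways.shp")        # hypothetical roadway centerlines
flood = gpd.read_file("harvey_flood_extent.shp")     # hypothetical flood polygons

# Reproject the flood polygons to the road layer's coordinate system before joining.
flood = flood.to_crs(roads.crs)

# Keep roads that intersect any flood polygon.
flooded_roads = gpd.sjoin(roads, flood[["geometry"]], predicate="intersects", how="inner")
print(f"{len(flooded_roads)} road segments intersect the flood extent")
flooded_roads.drop(columns="index_right").to_file("roads_in_flood_extent.shp")
```

High water mark points could be joined to the same road layer in a second step to attach observed flood depths to the affected segments.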

Critical Infrastructure

The Department of Homeland Security (DHS) defines critical infrastructure as “the physical and cyber systems and assets that are so vital to the United States that their incapacity or destruction would have a debilitating impact on our physical or economic security or public health or safety” (#1.12.1). Examples of Critical Infrastructure include electrical power generation systems and transmission lines, water supply systems, water treatment facilities, dams, oil and gas production sites and pipelines, petroleum refineries and chemical plants, chemical and hazardous waste storage, nuclear reactors, emergency services, telecommunications and transportation networks.

The DHS Homeland Infrastructure Foundation-Level Data (HIFLD) program (#1.12.2) assembles by far the most reliable sources of critical infrastructure spatial data. Established in 2002 under the Homeland Security Information Network (HSIN), the HIFLD subcommittee sought to improve the collection and sharing of infrastructure data. Contributors to the HIFLD data repository include federal, state and local government agencies as well as private sector partners. HIFLD provides data in two categories: 1) Open Data, which is freely available to the public (#1.12.3), and 2) Secure Data, which requires registration and assignment of an HSIN user account (#1.12.4).


Of the more than 500 different HIFLD data sets, about half are accessible through ESRI’s ArcGIS Online program, which organizes the collections. The Open Data portal provides a search tool to locate data sets of interest or a user can browse the data categories. Each data set carries a description, the date it was last updated, access constraints and a listing of the associated fields. The search results can be further filtered by location. A drop-down list provides the ability to download data in multiple formats including Esri Shapefiles, geodatabases, spreadsheets, or Google Earth KMLs. For most of the data sets, map service URLs are provided including ArcGIS map services, Web Map Service (WMS) and GeoJSON, an open data format for representing geographic features. Common examples of data found on the HIFLD site include electric substations, drinking water treatment plants and power generation plants.
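As an illustration of programmatic access, the sketch below reads a HIFLD Open Data layer published as GeoJSON into a GeoDataFrame and filters it to Texas. The URL and the STATE field name are placeholders; the actual links and attribute names should be taken from the layer's Open Data page.

```python
"""Minimal sketch: load a HIFLD Open Data layer published as GeoJSON.

The URL and the STATE field name are placeholders for illustration; copy the
actual GeoJSON (or feature service query) link from the layer's Open Data page.
Requires geopandas.
"""
import geopandas as gpd

GEOJSON_URL = "https://example.invalid/hifld/electric_substations.geojson"  # placeholder

layer = gpd.read_file(GEOJSON_URL)              # geopandas reads GeoJSON directly from a URL
texas = layer[layer["STATE"] == "TX"]           # attribute name assumed for illustration
print(f"{len(texas)} features listed for Texas")
texas.to_file("tx_subset.geojson", driver="GeoJSON")
```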

Critical Infrastructure data sets maintained by agencies and organizations in Texas are more limited and often require submitting a special request for access. For instance, the Texas Commission on Environmental Quality (TCEQ) maintains the location data for dams; however, the information does not appear on their public-facing website. TCEQ also maintains a point data layer for Tier II reports that track businesses storing over a certain amount of hazardous chemicals, though access to this data is available only to a very limited number of authorized contacts (#1.12.5). The locations of oil and gas wells can now be downloaded from a map viewer (#1.12.6) accessible through the Texas Railroad Commission (RRC) website; however, a special request must be submitted to receive the data covering the entire state.

Critical Infrastructure data sets are commonly needed during disaster response activities and planning for events, such as hurricanes, flooding, tornadoes, and fires. For instance, storage facilities containing hazardous materials may be located in areas prone to flooding. Knowing the vulnerable locations can help local officials and planners take measures to mitigate the risk posed by facilities located in areas where floods occur.

In March 2019, powerful storms swept through Dallas and Fort Worth, causing widespread damage and power outages across the metroplex. During this event, UT-CSR was able to map all of the electric substations and associated transmission lines that Oncor reported as damaged. From this information, it was possible to provide Civil Air Patrol pilots with the targets needed to perform aerial surveys of the damaged locations.

Street Addressing

Street addressing is a method for locating a building or entrance using a system of digital maps and road signs to indicate the physical locations of address points. The street addressing data theme provides an indispensable foundation for urban, suburban and rural community administration and planning. Addressing systems are managed by local jurisdictions and incorporate a fixed, numerical scheme assigned to all street and roadway segments. The numerical street values, their relative orientation and map coordinates are stored digitally as address points in a spatial database. The address points can be used for emergency services, mail routing, consumer mapping applications and location analysis. Local emergency 9-1-1 coordinators who have an urgent public safety requirement to find an address point location,

such as a residence, office or storefront, frequently manage the most complete, up-to-date and accurate addressing databases. The 9-1-1 system coordinators across Texas serve as authoritative data sources for the address point compilation.

Public or commercial acquisition of complete, up-to-date, and accurate address point data for Texas covering large areas or statewide is possible, but with certain limitations. In many areas, the user will obtain data, whether free or for purchase, that carries no guarantee to be complete, recently updated, or standardized regarding where the address point is placed in relation to the address (roof-top, driveway, parcel centroid, street, etc.).

Data acquired from commercial sources are proprietary and aggregated from authoritative government and private sources. These products often include additional features, such as comprehensive attribution, and other benefits, such as geocoding and routing tools, presentation quality cartographic map displays and address points adjusted to match the correct locations (#1.13.1).

Address point data are also available to the public at no cost through the Strategic Mapping Program (StratMap) managed by the Texas Natural Resources Information System (TNRIS). TNRIS address point data includes a combination of city, county and regional sources that vary by area. TNRIS created partnerships with the stakeholders and their authorized aggregators to compile and share the statewide address point data set. In collaboration with these groups, TNRIS also created a standardized GIS address point database structure. The addressing information and structure are recorded and maintained at the city and county level in Texas and aggregated to regional planning commissions or emergency communications districts (#1.13.2). The data contributed from 9-1-1 Service Entities were translated into the common database structure and are currently available from the online TNRIS DataHub (#1.13.3). Data refresh requests are made annually by TNRIS; however, at no one time will there be a completed or final version of the data because each data source follows a different update schedule. Currently, 9.17 million address points are available for 249 of 254 counties in Texas.

Common users of address points include local emergency management services (EMS) that run extensive dispatch systems for police, fire and ambulance routing and other local and state emergency management agencies, who work to mitigate emergency situations. The Texas Water Development Board also makes extensive use of address point data to predict if structures will be affected by flood conditions for early evacuation and to determine which properties will be affected by new dam construction.

Land Parcels

A land parcel represents the legal boundaries of a unit of property defined for the purpose of tracking ownership and real estate title, land use, land value and taxation. In Texas, the land parcel polygons, which are frequently edited and updated using geographic information systems, are maintained by central appraisal districts at the local county level often with the assistance of private sector vendors. An associated attribute table contains the basic descriptive information

for each unit, which has a unique ID that can be linked to information collected from jurisdictions and private organizations.

Texas Natural Resources Information System (TNRIS) compiles an annual statewide collection of land parcel boundaries (#1.14.1). The data is collected directly from the appraisal districts or their commercial services providers. None of the feature geometry is edited by TNRIS, but in cases in which the attribute data is imported, a translation tool converts each of the fields into a common statewide parcel data schema that makes it easier to interpret when working with parcels from multiple counties.
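The sketch below illustrates the kind of field translation step described above, mapping one county's attribute names into a simplified common schema. The source field names, target schema and file names are hypothetical and do not represent the actual TNRIS translation tool.

```python
"""Minimal sketch: translate one county's parcel attributes into a common schema.

The source field names, the target schema and the file names are hypothetical
and do not represent the actual TNRIS translation tool. Requires geopandas.
"""
import geopandas as gpd

# Hypothetical mapping from a county appraisal district export to a common schema.
FIELD_MAP = {
    "PROP_ID": "parcel_id",
    "OWNER_NM": "owner_name",
    "SITUS_ADDR": "situs_address",
    "LAND_VAL": "land_value",
    "IMPR_VAL": "improvement_value",
}

parcels = gpd.read_file("county_parcels.shp")      # hypothetical county export
parcels = parcels.rename(columns=FIELD_MAP)

# Keep only the standardized columns plus geometry, and record the source county.
parcels = parcels[list(FIELD_MAP.values()) + ["geometry"]]
parcels["county"] = "Example County"
parcels.to_file("parcels_common_schema.gpkg", layer="parcels", driver="GPKG")
```

Applying one mapping per county and appending the results is what makes a multi-county parcel analysis tractable.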

Geospatial data for parcels can also be accessed directly from many county central appraisal district websites. In Williamson County, the Central Appraisal District (WCAD) provides a link to download shapefiles of their parcel data (#1.14.2). The shapefiles contain both residential and commercial parcels that can be filtered by land use type and other characteristics. Parcel data from the central appraisal district has the advantage of being more frequently updated. The WCAD updates their data every night.

Other sources of land parcel geospatial data can be acquired from the Homeland Infrastructure Foundation-Level Data (HIFLD) Subcommittee (#1.14.3). The data set, produced by the vendor CoreLogic, is a license-restricted product acquired through an enterprise license agreement with the National Geospatial Intelligence Agency (NGA), a member of the HIFLD Subcommittee. The NGA is authorized to redistribute the parcel data to other agencies that support Homeland Security and Emergency Response and Recovery activities. To obtain access to this data, users must complete a Data Use Agreement (DUA) and have a federal sponsor. The geodatabases are downloadable by county and are separated into residential and non-residential files. The attribute table contains both public and proprietary data, including addresses, coordinates, land value, square footage of buildings and houses, year built and the number of stories and rooms. In addition to land parcel polygon boundaries, the CoreLogic parcel data also includes a point layer representing the centroids of the parcels.

Land parcel information has several uses in disaster response and planning. Rapid access to land ownership information can greatly improve the effectiveness of emergency response after storms, fires, and floods. In the absence of readily accessible data, it could take days to find the information. Having current and accurate parcels data would permit rapid property damage estimation in Texas. When creating land use plans, city planners may want to address areas occupied by more vulnerable, low income groups. These locations are typically areas with aging or insufficient infrastructure and are more prone to damage from disasters. As an example, Florida State University researchers used land parcel polygons and property tax-roll data to assess the hurricane hazard exposure in Okaloosa County, Florida, reported in a journal article on “A parcel-based GIS method for evaluating conformance of local land-use planning with a state mandate to reduce exposure to hurricane flooding” (#1.14.4).

Demography

Demography is essentially the study of human population in terms of its size, composition and distribution over space and time. Primary sources of demographic data include census enumeration, official birth and death records and population surveys. Population size can be expressed at multiple levels, ranging from global to national, regional, to local, and as fine as the smallest enumeration geography, which is a census block in the United States. Population composition refers to broad classifications such as gender and age group, as well as more specialized topics, such as income level, educational attainment and ethnicity. Population dynamics, the geographical distribution of population over time, track ongoing changes in all aspects of demography.

In the United States, the primary demographic data sets are derived from the decennial census (#1.15.1) of the nation required by the U.S. Constitution and from the American Community Survey (#1.15.2), an annual survey, both conducted by the U.S. Census Bureau. The official U.S. census count, the decennial census, is performed every ten years on April 1 in years ending in zero. The census collects demographic and housing data, ideally for the location of every person on the specified date. Basic data categories include age, race, sex, household relationship and housing tenure. Historically, one in six households was selected to fill out a longer census form that sought more granular information related to demographics, housing and economics. Such data inform governmental programs at many levels. Given the dynamic nature of the American population, the U.S. Census Bureau established the American Community Survey (ACS), first launched in 2005, to gather this so-called long-form data from a scientifically determined sample of the total population at annual intervals. The decennial 'short-form' census remains the authoritative count of the U.S. population used for the purposes of apportioning congressional representation as specified in the U.S. Constitution, whereas the ACS is collected, compiled and processed as a planning data set. As such, ACS-derived data sets provide a continuously updated picture of community needs pertinent to disaster information.

The U.S. Census Bureau (#1.15.3) is the agency responsible for the collection, compilation and distribution of national demographic data sets. In Texas, the Texas Demographic Center (TDC) (#1.15.4), formerly known as the Texas State Data Center and directed by the State Demographer, provides state-focused demographic services for Texas that are based on U.S. Census data. TDC also generates Texas-focused population estimates and projections for the state as a whole and for geographic subdivisions (counties, cities, etc.).

The U.S. Census Bureau provides all decennial population data and American Community Survey data in tabular format (#1.15.5). The bureau compiles four ACS Data Profiles containing frequently requested estimates and statistics organized by year, by geography and by social, economic, housing and demographic characteristics (#1.15.6). 5-year Data Profiles are the most statistically robust compilations. The agency also offers geographic information in multiple data formats, including GIS Shapefiles, Esri geodatabases, Google Earth KML and TIGER/Line ASCII files (#1.15.7), which refer to the Census Bureau’s own data format for Topologically Integrated Geographic Encoding and Referencing (TIGER). Shapefiles and TIGER/Line ASCII files include unique geographic entity codes (GEOIDs) that link to tabular data. Geography files

are released up to twice annually, with the most recent annual compilation available for the year preceding the year of its release. For example, 2019 geography data is released in 2020.
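A minimal sketch of linking the tabular and geographic products by their shared GEOID keys is shown below; the file names and the ACS table are hypothetical, and the key point is that GEOIDs must be handled as zero-padded text rather than numbers.

```python
"""Minimal sketch: join an ACS table to TIGER/Line tract geography on GEOID.

File names and the ACS table are hypothetical. GEOIDs must be handled as
zero-padded strings so that keys match. Requires geopandas and pandas.
"""
import geopandas as gpd
import pandas as pd

tracts = gpd.read_file("tl_2019_48_tract.shp")                # TIGER/Line tracts for Texas (FIPS 48)
acs = pd.read_csv("acs_social_characteristics_tract.csv",     # hypothetical extracted ACS table
                  dtype={"GEOID": str})

tracts["GEOID"] = tracts["GEOID"].astype(str)
joined = tracts.merge(acs, on="GEOID", how="left")

# The joined layer is ready for mapping, e.g., a disability-status estimate by tract.
joined.to_file("tracts_with_acs.gpkg", layer="tracts", driver="GPKG")
```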

With tabular data linked to geography, demographic variations can be displayed and analyzed to support disaster response, recovery, preparedness and long-term planning. For example, civilian disability status by age and by census tract, extracted from the Social Characteristics Data Profile, is useful for evacuation planning at a neighborhood level. The Housing Characteristics Data Profile compiles information about housing stock age, whether owner- or renter-occupied, and median mortgage or rent costs, all useful data for recovery program targeting. Preparedness efforts are better informed with data about employment status, income, health insurance coverage and poverty derived from the Economic Characteristics Data Profile. All major aspects of disaster information requirements are supported by demographic information about sex, age, race and ethnicity distribution, leading to a better understanding of where vulnerable populations may be concentrated in cities or dispersed in rural areas.
• The Texas Demographic Center offers a helpful population projection tool that estimates future population totals with key demographics for several geographic divisions, such as counties, with forecasts available through the year 2050. (#1.15.8)

Economic Activity

Economics is the study of how society allocates limited resources. Agencies that manage disaster recovery programs can benefit from understanding how disasters affect local and regional economic activity and incorporate this awareness into improved planning and mitigation strategies. In Texas, natural disasters have repeatedly caused economic losses in the billions of dollars. Local impacts require a micro-economic analysis to help understand the dynamics underlying the economic toll resulting from an event. The analysis should examine which industries and markets are most heavily impacted, which are most likely to gain an advantage, and how human activity, occupations and employment are affected (#1.16.1).

Dun and Bradstreet (D&B), an industry leader in business analytics, partnered with the National Emergency Management Association (NEMA) to develop business risk and economic impact assessments that were used by state, local and federal agencies to guide response and recovery efforts for Hurricanes Harvey, Irma and Maria. Assessments included a review of job concentration by business type, location and revenue that supported interagency efforts to minimize job loss or prevent business failure. With millions of jobs impacted, the information was used to prioritize support for anchor businesses in the regional economy. Localities that prepare with reliable foundational information are able to build resilience in their communities, make data-driven policy decisions and address economic vulnerabilities before a disaster strikes (#1.16.2). D&B provides an array of reports for geographies at the state, county, city, MSA and ZIP Code level, made available in text, PDF and HTML formats, including their Business Information Report and Market Analysis Report. D&B's market analysis business data can also be joined with geospatial data and are available for purchase, delivered as a batch transfer or through an application programming interface (API) using Simple Object Access Protocol (SOAP) or Representational State Transfer (REST) formats.

In addition, economic profiles of Texas industries can be studied categorically to gain a better understanding of regional industry dynamics. Available by license, Esri's ArcGIS Business Analyst data enables complex geospatial study of industry dynamics and economic analysis. The data includes 13 million U.S. business locations with employee and sales information, in addition to nearly 7,000 major shopping centers with attributes, such as retail sales and leasable area. Esri updates their business data semi-annually or annually (#1.16.3).

Public information sources in Texas also provide economic data; however, only a small portion of these data is GIS-ready. For example, the largest council of government (COG), the North Central Texas COG serving 7.7 million residents, provides only employer point data, employment estimates and demographic census data as a feature web service and in Shapefile, KML and CSV formats (#1.16.4). The remaining information is available as a PDF report titled Comprehensive Economic Development Strategy. Likewise, the City of Houston, with Texas's largest urban population of 2.3 million, does not offer geospatial economic trending or financial data beyond census demographic data. However, data such as operating funds budget vs. actuals, vendors and suppliers, registered small, minority, and women disadvantaged business enterprises (SMWDBE), and fund accounting can be downloaded in Excel format (#1.16.5). The Texas Comptroller also provides Excel spreadsheets through a tool called the State Revenue and Expenditure Dashboard, enabling tabular report building with a wide range of revenue choices that can be compared against long lists of other items, namely agencies, appropriated funds and comptroller tax and fee items (#1.16.6).

The U.S. Bureau of Labor Statistics (BLS) hosts another tool, the QCEW (Quarterly Census of Employment and Wages) State and County Map, that tracks employment and wages reported by employers covering more than 95 percent of U.S. jobs, available at the county, MSA, state and national levels by industry. The results can be viewed as a map display or downloaded as a CSV or XML file (#1.16.7).
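The sketch below illustrates pulling a single county's QCEW records as CSV and summarizing them with pandas. The URL pattern and the own_code/column conventions follow the QCEW open data documentation as understood here and should be verified against the current BLS documentation before use.

```python
"""Minimal sketch: retrieve one county's QCEW records as CSV and summarize them.

The URL pattern and the own_code/column conventions are assumptions drawn from
the QCEW open data documentation and should be verified against the current BLS
documentation before use. Requires pandas.
"""
import pandas as pd

year, quarter, area = 2020, 1, "48201"   # Harris County, Texas (FIPS 48201)
url = f"https://data.bls.gov/cew/data/api/{year}/{quarter}/area/{area}.csv"  # verify pattern

qcew = pd.read_csv(url)
totals = qcew[qcew["own_code"] == 0]     # own_code 0 is assumed to mark all-ownership totals
print(totals[["area_fips", "year", "qtr", "month3_emplvl", "total_qtrly_wages"]])
```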

Recent breakthroughs in geospatial technologies have begun to produce solutions that bring economic activity data from the past into the present. SafeGraph, a commercial vendor, leads in this emerging technical space by collecting high quality geospatial information and tracking mobile devices to produce data that can be used by companies with expertise in artificial intelligence and machine learning (#1.16.8). With SafeGraph's Places data offered as a hosted feature service, geospatial professionals can analyze new markets with point of interest (POI) data. The Places data set contains business listings and location point and geometry data for approximately five million POIs in the U.S. The POI data can be enhanced by purchasing foot-traffic data derived from 35 million anonymized mobile devices (#1.16.9).

More recent and concentrated point data will help recovery managers better understand changing patterns of economic activity, which will lead to better decisions, more accurate risk and vulnerability analysis and a deeper understanding of the locations where people spend money.

Disaster Insurance Claims

Disaster insurance providers, who sell insurance for damage caused by flooding, hurricanes, hailstorms, windstorms and wildfires, maintain information about policyholders in their internal databases. The information usually includes attributes, such as names, addresses, policy types, and date of loss. In most cases, the records are not maintained in geospatial format, nor do the databases contain specific map coordinate information. As a consequence, a GIS analyst must often geocode the addresses to obtain a geospatial point layer. Furthermore, due to the personally identifiable information (PII) in the databases, they are considered to be confidential and restricted from use online.
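A minimal geocoding sketch is shown below using the openly available Nominatim service through the geopy library purely for illustration; a workflow handling confidential claims data would instead use an internal or licensed geocoder so that addresses never leave the secured environment. The sample addresses are arbitrary.

```python
"""Minimal sketch: geocode a small table of addresses into latitude/longitude.

Uses the public Nominatim service via geopy purely for illustration; the sample
addresses are arbitrary. Confidential claims records should be geocoded with an
internal or licensed service instead. Requires pandas and geopy.
"""
import time

import pandas as pd
from geopy.geocoders import Nominatim

claims = pd.DataFrame({
    "policy_id": ["A-001", "A-002"],
    "address": ["1100 Congress Ave, Austin, TX", "901 Bagby St, Houston, TX"],
})

geocoder = Nominatim(user_agent="claims-geocoding-example")

def geocode(addr):
    location = geocoder.geocode(addr)
    time.sleep(1)  # respect the service's usage policy
    return (location.latitude, location.longitude) if location else (None, None)

claims[["lat", "lon"]] = claims["address"].apply(lambda a: pd.Series(geocode(a)))
print(claims)
```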

The National Flood Insurance Program (NFIP), managed by the Federal Emergency Management Agency (FEMA), provides flood insurance to property owners, renters and businesses who live within approximately 23,000 participating communities (#1.17.1). Homes and businesses located within high-risk flood areas and financed by mortgages from government- backed lenders are required to have flood insurance. The NFIP program also assists communities in adopting and enforcing floodplain management regulations to mitigate the consequences of flooding. The NFIP data, provided by secure data exchange methods, must be formally requested and is available only to responders and disaster management staff who have a justified need for the data, such as operational support during a disaster or for the administration of recovery programs.

The National Insurance Crime Bureau (NICB) is a national leader in preventing, detecting and defeating insurance fraud and crime (#1.17.2). The bureau employs specially trained professionals who use data analytics to work with law enforcement in the investigation of suspicious claims. As with information from NFIP, the NICB data must be formally requested and may involve a license arrangement. The data contains street addresses, loss type and policy type, which allows the user to filter by type of insurance.

Following Hurricane Celia in 1970, the Texas Legislature created the Texas Windstorm Insurance Association (TWIA) with the intention to provide windstorm and hail insurance primarily along the Texas Gulf Coast (#1.17.3). TWIA restricts access to the insured property addresses and other specific location information, but the agency does collate the policyholders by ZIP Code Tabulation Area (ZCTA), which can be linked to GIS data for mapping and visualization.

Many uses arise for insurance claims data during and after a disaster. In one case study (#1.17.4), analysts with Amica Mutual Insurance used real-time streaming weather data (including forecasted wind speeds) during Hurricane Irene, together with the property locations of their policyholders, to quickly determine areas that were likely to generate large numbers of claims. This allowed the company to plan proactively and provide adequate resources to areas of predicted high claim volume rather than wait for first responder reports, impact models and aerial damage surveys.

The insurance claims data can also be used for future planning. At the University of Texas Center for Space Research (UT-CSR), spatial scientists applied the NFIP claims data in Texas to identify areas of repetitive loss over an extended period of time. Addresses with repetitive losses were mapped by census block group and normalized by area to show concentrations of repetitive loss over time (Figure 1.3). By understanding the conditions affecting areas that accumulate greater numbers of claims after repeatedly experiencing damage caused by floods and storms, it is possible to place more resources in those areas before an event as well as take measures to mitigate future loss.

[Map legend: NFIP repetitive loss properties per square mile (2001-2018) by block group, classified by geometric interval.]

Sources: 2017 5-year American Community Survey geography and data files containing estimates for total population, housing units and tenure (housing occupancy); insurance claims compiled from April 2019 National Flood Insurance Program (NFIP) records.

Figure 1.3. Map showing NFIP Repetitive Loss Properties by U.S. Census Block Group.
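A sketch of the normalization step described above, counting claim points per block group and dividing by area, might look like the following. The file names are hypothetical, and the equal-area projection is an illustrative choice rather than the one used for Figure 1.3.

```python
"""Minimal sketch: count repetitive loss points per block group and normalize by area.

File names are hypothetical, and EPSG:3083 (a Texas-centric equal-area projection)
is an illustrative choice; any equal-area CRS would serve. Requires geopandas.
"""
import geopandas as gpd

points = gpd.read_file("nfip_repetitive_loss_points.gpkg")      # geocoded claim points (hypothetical)
bgs = gpd.read_file("tx_block_groups.gpkg").to_crs(epsg=3083)   # project so areas are meaningful
points = points.to_crs(bgs.crs)

joined = gpd.sjoin(points, bgs[["GEOID", "geometry"]], predicate="within", how="inner")
counts = joined.groupby("GEOID").size().rename("rl_count").reset_index()

bgs = bgs.merge(counts, on="GEOID", how="left").fillna({"rl_count": 0})
SQ_METERS_PER_SQ_MILE = 2_589_988.11
bgs["rl_per_sq_mile"] = bgs["rl_count"] / (bgs.geometry.area / SQ_METERS_PER_SQ_MILE)
bgs.to_file("rl_density_by_block_group.gpkg", layer="block_groups", driver="GPKG")
```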

Public Health

Public Health data can be considered a specialized subset of demographic data in that demography provides the population context and the geographic backbone supporting analysis and interpretation. Other critical resources for the collection of public health data include hospitals and healthcare providers for disease, general health and insurance information, environmental regulatory agencies, and birth and death registries.

In the United States, the Centers for Disease Control and Prevention (CDC) (#1.18.1) is the premier resource for national public health data sets and information. The Texas Department of State Health Services (DSHS) (#1.18.2) provides similar services at the state level. DSHS houses the Center for Health Statistics (#1.18.3), as a portal for comprehensive health data in Texas.

CDC and DSHS rely on and link to the U.S. Census Bureau and the Texas Demographic Center described in the Demography section of the current chapter. American Community Survey Data Profiles offer limited summary data about general health characteristics extending from the national to the census tract level. Topics include the number of women 15 to 50 years old who gave birth in the past 12 months, disability status by age group (minors, adults and seniors), health insurance coverage by age, insurance type and work status, and family composition by household. The Census Bureau maintains a list of additional surveys and programs that feature health related data (#1.18.4). An example of a health-related topic is tobacco use, compiled in the Current Population Survey (CPS) supplement.

Although epidemiologists, who are tasked with tracing patterns of the spread of disease, have promoted the marriage of health data with Geographic Information Systems (GIS) for at least the past quarter century (#1.18.5), challenges are still encountered when seeking to acquire public health data from a central distribution source. Given the sensitive nature of individual health information and the frequently very small sample sizes found within census tracts and blocks, agencies such as the CDC and DSHS may compile GIS data for study and program use but rarely disseminate this information for public access or sharing with other organizations. CDC maintains a web page devoted to its collection of interactive web applications (#1.18.6). It defines the categories of applications as chronic disease, infectious disease, social/behavioral/environmental, and general. DSHS provides some public health data in tabular and interactive map format compiled statewide and by county. The Texas Cancer Registry also maps data by health region, Council of Government, Metropolitan Statistical Area (MSA) and Micropolitan Statistical Area (mSA).

The CDC's 500 Cities Project (#1.18.7) demonstrates how public health data can enrich a standard census places data set and assist epidemiologists responding to a local epidemic outbreak or a pandemic. Epidemiologists and biostatisticians need data about the presence of comorbidities that can alter the course of an infectious disease in differing populations. Information about gender, race, economic status and household characteristics also informs efforts to model the spread of infection within susceptible populations. With reliable data in hand, pandemic epidemiologists can create numerical models that offer insights into the transmission rates and spatial patterns caused by the spread of infection. Another data set useful for disaster response, recovery and preparedness is the CDC Social Vulnerability Index (SVI) (#1.18.8), which tracks 15 social factors by census tract. The SVI fact sheet (#1.18.9) details the ways that the SVI can be used to identify and map communities most vulnerable to natural and man-made disasters based on data as recent as 2016.

The broad spectrum of data themes having different data types, originating sources and varying contexts for data application, as shown in this chapter, reveals the many challenges confronting the construction of the TDIS data management system. Authoritative sources of timely, accurate and easily accessible information must be identified and continuously evaluated for their suitability in the realms of disaster response, recovery, mitigation and long-range planning. To be truly useful, the chosen framework data products need to be uniformly integrated in continuous coverage over large regions, if not statewide, and readily available beyond a broken patchwork of individual project areas.


In Texas, organizational structures already exist to promote the selection and cooperative funding for the production and maintenance of many key data sets discussed in the preceding sections and for the development of new data sets as production technologies improve. As an example, the Strategic Mapping Program coordinated by TNRIS has helped to foster cost-sharing arrangements with federal, state and local stakeholders for the creation of consistent, high-value geospatial data made freely available in the public domain. More could be done to expand the current data catalog of reliable, high-quality data. Also, new data sources emerging from their research-and-development phase should be explored and tested for adoption in the areas underserved by the data products now in use. Arguments for the selection and sustained development of certain high-priority data sets are considered in Chapter 4, while examples of the most significant data shortfalls uncovered by our survey are presented in Chapter 5.



Chapter 2. Data Sharing

Data sharing relationships are built both internally within different parts of an organization and externally between separate organizations that hold a common interest in making their own data more easily accessible and in receiving a flow of timely, informative data that they are not creating themselves.

To examine the traits associated with data sharing arrangements, we reflect in this chapter on how we worked with several different organizations to expand and improve data sharing processes that are representative of the data partnerships likely to be promoted during early stages in the development of TDIS. These various experiences enabled UT-CSR to reach a better understanding of the data collection and production techniques pursued by different organizations along with their current data access and sharing policies. From the beginning, some of the exchanges were frictionless and quickly led to satisfactory outcomes for everyone. In other instances, substantial obstacles needed to be surmounted, which required changes in procedures that no one had anticipated during the initial preparations.

The restrictions to sharing data products or data services by a potential TDIS partner may have several origins. Some organizations have only recently begun to consider allowing access to their digital data sets through methods other than by connecting to a simple file server system or by transfer on physical media. In this case, the use of automated procedures to gather frequently updated data in real-time might require substantial changes to server and network control systems. Some agencies operate in extremely restricted environments with tightly controlled security, such as the USACE as a component of the United States military. For organizations adhering to homeland security or military network protocols, access to servers by external users may be blocked entirely, and it may not be possible for the agency to deploy server infrastructure beyond their firewall. In these cases, data sharing can still occur, primarily through read-only or other data conduits that prevent tampering with either the data products or the secured networks. Despite the limitations, and even if the data exchange is less than an optimal two-way street, there are mutual benefits to the pursuit of data sharing agreements with these organizations. In the course of dialog, the discussion of how to exchange data opens a window into how and why

an organization requires different sources of information, which leads to a better overall understanding of data uses and needs.

The following sections relate several experiences in building data sharing arrangements across a range of agencies and other groups.

FEMA HPA Team and Hurricane Harvey High Water Marks

In recent years, UT-CSR’s MAGIC (Mid-American Geospatial Information Center) group has collaborated with the FEMA Hazard Performance and Analysis (HPA) team at their Joint Field Office (JFO) location in Austin. The JFO houses the Texas Long-Term Recovery Office, which focuses on the post-Hurricane Harvey recovery. The HPA team collected data following Hurricane Harvey as part of the Texas Strategic Phased Watershed Study (TSPWS). The TSPWS study concentrated on 21 sub-basins in Texas and included 26 categories of data.

During the collaboration with the HPA team, members of MAGIC and HPA shared information about Hurricane Harvey-related data sets maintained by each agency and discussed methods of data product sharing. Though much of FEMA’s data is publicly available, some of the data, including individual assistance, insurance claims data and grant projects, is restricted for reasons of confidentiality, and transfer to any external group must be approved through the FEMA Information Management Office.

One of the HPA team's goals was to share their TSPWS project data through a web interface that would allow users to select a sub-basin of interest and view the geographic data in a map display. The web application was nearing completion before being placed on hold. As a consequence of data sharing restrictions, the HPA team could share only a vector geodatabase of an earlier version of their flood inundation data with UT-CSR. They were not permitted to share the final, more detailed version of the inundation data nor their data processing steps. In contrast, UT-CSR shared many Hurricane Harvey data sets and ArcGIS data services, including aerial and satellite imagery, Civil Air Patrol (CAP) handheld photography and precipitation data, through the MOVES web application and ArcGIS Portal data services created by UT-CSR.

The FEMA HPA team also expressed an interest in acquiring a copy of a USGS High Water Marks (HWM) data set that UT-CSR had reviewed, highlighting areas of concern. Documentation included with the data described the results of the UT-CSR assessment. In addition to sharing the data, UT-CSR provided links to its USGS Harvey High Water Marks

ArcGIS Portal viewer (#2.1.1). During discussions in meetings and message exchanges, both groups were able to share information about the recognized deficiencies in the HWM data. The HPA team discovered measurement fields with no data entry in the USGS attributes. UT-CSR determined that a percentage of the HWMs were assigned to types that did not agree with the descriptions in the feature tables.

During the collaboration, several problems occurred when attempting to share and receive data from the FEMA JFO. FEMA employees could not access the Box cloud storage system used by the University of Texas at Austin. Data also could not be transferred by using physical media,

such as a flash drive or external hard drive. FEMA staff had to transfer the data from UT-CSR directly to a laptop using an IronKey encrypted flash drive. In addition, the JFO facility is kept under high security, which makes meeting in person difficult. Staff at the JFO must also use teleconferencing applications that are not in common use at UT-CSR. In the end, FEMA designated much of their most useful data as confidential and restricted from transfer to third parties such as UT-CSR.

Despite the limitations, the two groups succeeded in establishing a collaboration to review and understand the Hurricane Harvey HWM data. They discussed and attempted by many means to develop an optimal pathway to exchange data between the groups. The experience places a spotlight on the kinds of data sharing impediments that TDIS will need to resolve.

UTA-UTSA 9-County Southeast Texas Project

In response to work conducted for the GLO by a project team from the University of Texas at Arlington (UTA) and the University of Texas at San Antonio (UTSA), UT-CSR reached out to them to determine which kinds of geospatial data and data distribution methods were under development. The UTA-UTSA project covered portions of three river basins across nine counties in Southeast Texas, including Jefferson County in the Beaumont-Port Arthur area. UT-CSR pursued two main goals: learning about the inventory of data layers to be used in the project and exploring the possible approaches to data services that UT-CSR could share with the UTA-UTSA team.

Direct interactions with the UTA-UTSA group were very positive. The group was very receptive to and interested in reactions to their project development work. They also were eager to hear constructive criticisms and insights into possible improvements that could be made to the project. The heart of their project involved the aggregation of many data sets across a nine-county region that would then be displayed using visualization techniques in a spatial data explorer. The process required gathering and preparing data, often in disparate formats, from various providers with the goal of achieving a seamless experience while viewing any particular data set. One pitfall of framing the project in this manner is that the researchers did not recognize that, at its core, the project is about data and about data access for any and all who wish to acquire it. While the creation of a visualization tool deserves admiration, it is not sufficient by itself to produce a robust end result.

One of the major action items to emerge from our discussions was the creation of accessible data services across the web. The UT-CSR and UTA-UTSA teams discussed the advantages of data services in depth, and this became one of the primary goals for creating a truly useful data information system. Issues began to appear at this stage. Since the original concept was the aggregation of the data to be visualized in a viewer, the UTA-UTSA team was unaware of web services that could be used to decouple the visualization tool from the data. As the collaboration continued, the IT infrastructure of UTSA caused issues with the procedures that needed to be resolved. Many of the challenges became more about process than about technical feasibility. Time had to be spent developing the access and authentication needed to build and expose the data as a set of services.


Another interesting problem revealed by the work arose because the data structure was not sufficiently intuitive without the context of the application. Part of the deficiency can be remedied by good metadata, but not all of it. Understanding that the data could be uncoupled from the application and should, instead, be delivered in a web service model leads to the realization that most of the data should have easy-to-follow data types and field names. Toward the conclusion, the collaborating teams were able to use a remote visualization tool to access and display the data, an achievement that proved the viability of the data services. The issues were all resolved by the project's end. Looking ahead to future interaction between the teams, a few things can be learned from the experience. One is that early discussion and recognition of the importance of web data services should inform much of the rest of the work. Knowing this can shape the development of IT infrastructure and software to build an architecture that can support both local and remote access to data services.
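To make the decoupling idea concrete, the sketch below exposes a hypothetical aggregated layer as a small GeoJSON web service that any local or remote viewer could consume. Flask and the file name are illustrative choices and are not the stack used by either team.

```python
"""Minimal sketch: expose a data layer as a small GeoJSON web service.

Flask and the file name are illustrative choices, not the stack used by either
team; the point is that any local or remote viewer can consume the endpoint
independently of the application that produced the data. Requires flask and geopandas.
"""
import geopandas as gpd
from flask import Flask, jsonify

app = Flask(__name__)
LAYER = gpd.read_file("nine_county_layer.gpkg")   # hypothetical aggregated layer

@app.route("/layers/nine-county", methods=["GET"])
def nine_county():
    # Serve the layer as GeoJSON; clients decide how to symbolize and display it.
    return jsonify(LAYER.__geo_interface__)

if __name__ == "__main__":
    app.run(port=8080)
```

Because the endpoint returns plain GeoJSON, the visualization tool, the data structure and the hosting infrastructure can each evolve independently.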

Sharing data from the ADCIRC Storm Guidance System with the SOC and TTF1

When hurricanes and large tropical storms make landfall along the Texas Gulf Coast, the greatest damage and loss of life often occurs as the result of the inundation caused by powerful storm surge accompanied by large waves. The depth of water pushed inland by the surge coupled with the battering impacts of wave fronts riding atop the surge have resulted in billions of dollars in property damage and cost hundreds of lives. As a result, when tropical cyclones threaten coastal communities, state and local emergency managers need the best possible predictive information to tell them exactly when and where storm surge will strike and the estimated magnitude of the impacts. For the past two decades, a national team of academic researchers has worked to develop a high-resolution hydrodynamic model that can be run on supercomputers to create predictive guidance.

The ADCIRC (Advanced Circulation) Storm Guidance System (ASGS) (#2.3.1) runs on the Frontera, Stampede and Lone Star supercomputers at the UT Texas Advanced Computing Center (TACC) and issues hydrodynamic model products that estimate the maximum water levels, inundation depths, surface wind velocities and other physical effects of an impending storm surge approximately one hour after the National Hurricane Center releases a storm advisory. The numerical results are represented in the Coastal Emergency Risks Assessment (CERA) (#2.3.2) web application developed by an ASGS research team led by Louisiana State University. The web application displays visualizations of the approaching storm surge and integrates real-time NOAA tide gage observations and GIS data for coastal protective infrastructure. The forecast products from the ASGS available through the CERA website are used by federal and state emergency managers from the Gulf of Mexico and Texas to states along the Atlantic Seaboard.

In the Texas State Operations Center (SOC), UT-CSR monitors hurricane forecasts issued by a wide range of sources and works with the disaster response coordinators from state agencies participating in the Emergency Management Council and from the Texas Division of Emergency Management (TDEM). The agency coordinators have many questions in the period leading up to landfall that can be answered by the forecast information from ASGS that UT-CSR facilitates. For example, the Texas Department of Transportation (TxDOT) may request data on the

predicted impacts to bridges, causeways and highways near the coast and on the times to cease ferry services at Port Aransas and Bolivar and to open the swing-gate bridge at Sargent on the Intracoastal Waterway. The Texas Department of Criminal Justice (TDCJ) may inquire about the need to evacuate state prisons in low-lying coastal areas, while the Texas Commission on Environmental Quality may need to know which industrial facilities storing hazardous materials may become inundated by the storm surge.

Figure 2.1. In the morning of August 26, 2020, the ASGS forecast the timing and magnitude of high winds in Port Arthur, Texas, from Hurricane Laura. Visualization from the ASGS team via the LSU Coastal Emergency Risks Assessment web service.

Most significantly, the search-and-rescue teams sent in advance into a hurricane landfall region by the Texas A&M Engineering Extension Service (TEEX), Texas Military Department (TMD), Texas Parks & Wildlife Department (TPWD) and local emergency managers need the best available guidance for their life-saving activities. In the case of Hurricane Laura, which struck near the Texas-Louisiana border on August 27, 2020, UT-CSR worked with the director of Texas Task Force 1 (TTF1) to use ASGS products to understand the storm surge and high windspeed impacts of the major hurricane on the Port Arthur-Beaumont-Orange area of Southeast Texas, as the storm approached the coast. An accurate forecast allowed TTF1 to occupy staging sites in safe locations very near the areas that were predicted to receive the greatest damage (Figure 2.1). One of the most useful ASGS guidance products that UT-CSR has helped to develop is the CERA charting tool for windspeed time series at any selected location. For Hurricane Laura, TTF1 positioned search-and-rescue boat squads at Lumberton Dome, Ford Park near Beaumont and in southwestern Port Arthur. UT-CSR provided charts that displayed the timing of high winds the evening before landfall so that the TTF1 director could determine how long his strike teams would need to shelter in place, and when wind velocities would fall below tropical storm strength to allow operations to begin (Figure 2.2).

Figure 2.2. Predicted wind speed time series for southwest Port Arthur (location -93.955855, 29.873607; ADCIRC node 932344, elevation 2.13 ft above NAVD88), showing the timing and magnitude of high winds from Hurricane Laura as forecast by the ASGS on the morning of August 26, 2020, with a predicted maximum of 102.41 mph. Visualization from the ASGS team via the LSU Coastal Emergency Risks Assessment web service.
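The operational question answered by charts of this kind reduces to a simple timing calculation: given a forecast wind-speed series at a staging site, find when sustained winds first fall back below tropical storm strength (about 39 mph) after the peak. The sketch below uses synthetic values, not actual ASGS output.

# Hour offsets from the advisory and predicted sustained winds (mph); synthetic values.
forecast = [(0, 25.0), (6, 58.0), (12, 102.4), (18, 71.0), (24, 36.0), (30, 18.0)]
ts_threshold_mph = 39.0  # approximate tropical-storm-strength threshold

# Index of the forecast peak, then the first post-peak hour below the threshold.
peak_idx = max(range(len(forecast)), key=lambda i: forecast[i][1])
resume_hour = next(h for h, v in forecast[peak_idx:] if v < ts_threshold_mph)
print(f"Winds fall below tropical storm strength about hour {resume_hour}.")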

Having an agile planning tool of this kind permits the SOC coordinators and incident commanders in the field to make reliable plans and explore their mission options well before the eye of the storm reaches the mainland. The original ADCIRC and later ASGS forecast products have proved their accuracy for over a decade during many storms, particularly in Hurricanes Ike, Isaac, Harvey, Hanna and Laura, where the predicted storm surge estimates have been compared with actual NOAA tide gage measurements. Several aspects of the research and development underlying ADCIRC received funding from the U.S. Army Corps of Engineers (USACE) in the storm surge reconstruction done after Hurricanes Katrina and Rita in 2005. The use of the ASGS as a method for hindcasting opens opportunities for its application in recovery and planning, where it can also test possible future protective coastal infrastructure against potential future hurricanes (or design storms) in the numerical model domain before any physical structures are built.

The results of data sharing with ASGS demonstrate the value of collaborative research among academic partners that meets the long-range planning needs of USACE and other organizations involved in recovery and mitigation, as well as the immediate response requirements of first responders, such as TTF1. It stands as an ideal example of collaboration that signals the value of data sharing partnerships that take information from modeling into operations.

Sharing satellite imagery with the NWS West Gulf River Forecast Center

For the majority of the river basins in Texas, the West Gulf River Forecast Center (WGRFC) is responsible for monitoring weather conditions that lead to river flooding and for initializing

hydrologic models that forecast the estimated river stage heights at gage station locations along channel segments during flood events. In most instances, the process of prediction is informed by a long history of previous floods, where gage measurements have recorded the stage heights used to infer the onset and magnitude of local inundation and potential damage to surrounding properties. While predictive models have improved over time and modern weather radars now deliver more accurate estimates of rainfall rates and accumulation, circumstances can make accurate flood forecasts difficult and local conditions hard to assess in sufficient time to transfer useful information to emergency managers. Such conditions occurred in late July 2020 after Hurricane Hanna made landfall in South Texas and swept across the border into the mountainous terrain of the Sierra Madre Oriental in northern Mexico. Torrential rains were followed by floodwaters entering the Rio San Juan, a major tributary to the Rio Grande. Very few real-time reporting river gages exist in northern Mexico, thus limiting the inputs to the WGRFC hydrologic modeling system and placing the forecasters in a difficult position. The Rio San Juan passes floodwater through the El Cuchillo and Marte Gomez reservoirs before flowing into the Rio Grande below Falcon Reservoir. If the Mexican reservoirs cannot safely store the inflows, then emergency releases will occur, and a flood wave will reach the Lower Rio Grande Valley, requiring diversion into the International Floodway system and forcing evacuations in low-lying areas near the river.

In the final week of July 2020, the lack of river gage measurements recording the movement of Hurricane Hanna floodwaters down the Rio San Juan narrowed the options available to WGRFC to confirm their flood forecast, which had predicted that the estimated river discharge could be held in stormwater storage in the two large Mexican reservoirs. If the WGRFC forecast proved to be correct, then no water would be released downstream into the Rio Grande, and the Lower Rio Grande Valley would be safe from flooding. On the other hand, if the forecast turned out to be wrong, there would be little time to move residents from the threatened low-lying areas along the Rio Grande on both sides of the border.

In this case, another source of information about flood conditions in Mexico could be provided by the UT Center for Space Research, which continually receives synthetic aperture radar imagery of Texas and adjacent regions from the European Space Agency (ESA) Sentinel-1A and -1B satellites (Figure 2.3). Data from the Sentinel C-band imaging radars can be processed to determine where areas of open water and very wet soils occur. By having a large number of pre-event images collected under dry conditions before a flood, a differencing algorithm can be applied to the images collected at the time of flooding to detect the previously dry areas that have gone underwater. The image processing technique allows the extent of flooding to be identified with remarkable definition. A Sentinel-1A radar image collected on July 30, 2020, was combined with pre-flood imagery to show the region where the Rio San Juan flows from the Marte Gomez Reservoir dam to the Rio Grande near Rio Grande City. While the features away from the river displayed traces of localized flooding, the river channel itself remained within its banks with no signs of a flood wave underway. This image and others were provided to the WGRFC staff and proved to be a significant source of evidence used to validate their forecast.

Figure 2.3. A European Space Agency Sentinel-1A synthetic aperture radar composite image collected on July 30, 2020, reveals that the Rio San Juan is flowing within its banks northward toward the Rio Grande from Marte Gomez Reservoir in Tamaulipas, Mexico.
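A minimal sketch of the differencing idea described above, assuming two co-registered Sentinel-1 backscatter arrays in decibels: a dry-condition reference mosaic and a flood-date acquisition. The thresholds are illustrative placeholders, not the calibrated values used operationally by UT-CSR.

import numpy as np

def detect_new_water(pre_db, post_db, water_db=-17.0, min_drop_db=3.0):
    """Flag pixels that look like open water on the flood date but not before."""
    dark_now = post_db < water_db                 # weak backscatter suggests open water
    was_dry = pre_db >= water_db                  # not water in the dry reference
    dropped = (pre_db - post_db) >= min_drop_db   # backscatter fell noticeably
    return dark_now & was_dry & dropped

# Tiny synthetic example: only the upper-right pixel goes under water.
pre = np.array([[-8.0, -9.5], [-20.0, -7.0]])
post = np.array([[-8.5, -19.0], [-20.5, -7.2]])
print(detect_new_water(pre, post))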

The incident is not unique. In response to the major floods of 2010, Hurricane Harvey (2017), Tropical Storm Imelda (2019) and other major floods in Texas, UT-CSR has relayed high-resolution radar and optical satellite imagery to the WGRFC to use in their analyses and to contribute to the validation of model forecasts. The combination of near real-time satellite observations with numerical hydrologic modeling opens new avenues to improve the quality of flood warning products and the guidance delivered to decision makers. Timely information about surface conditions from satellite sensors covering areas where conventional data sources are scarce or unavailable can increase confidence in our understanding of the complex nature of flood disasters wherever the events occur. The insights gained from satellite sensor observations should lead to better-informed operational decisions.

USACE Trinity River Investigation

Between 2015 and 2019, the Trinity River experienced a series of major flood events, breaching a number of agricultural levees, damaging oil and gas production sites, disrupting wastewater treatment facilities and causing damage to residences and commercial buildings along the middle reach of the river from Trinidad in Henderson County downstream to Lake Livingston. The Fort Worth District of the U.S. Army Corps of Engineers (USACE) manages flood control dams surrounding Dallas and Fort Worth in the northern portion of the Trinity River Basin and regulates the discharge of stored floodwater into the lower basin. Although USACE flood control operations focus primarily on the heavily populated DFW Metroplex, the damage caused by floodwater releases into the rural downstream areas has become an increasing

concern. Since 2015, floods have repeatedly damaged properties from the area of Riverside near the entrance to Lake Livingston upstream through Anderson and Freestone counties. The expansion of a sand and gravel excavation site, accompanied by a new levee structure near the Texas Highway 7 bridge, has raised concerns about local flooding near the earthworks.

The continuing threat from river flooding along the Trinity River led the USACE to begin studies for mitigation of locations on the middle reach and prompted a request to the University of Texas Center for Space Research (UT-CSR) for assistance in documenting the impacts of the recent flood events. UT-CSR offered to explore a time series of medium resolution satellite image collections to determine changing conditions in the flood-prone area (Figure 2.4).

A UT-CSR remote sensing scientist selected the satellite imagery to be used for the purpose of detecting inundated surfaces. The selection included a combination of synthetic aperture radar (SAR) and optical satellite imagery. Due to its all-weather and daytime/nighttime imaging advantages, SAR imagery has become the most effective type of imagery for flood detection. For the middle reach of the Trinity River (#2.5.1), three regularly collected European Space Agency (ESA) Sentinel-1A scenes cover the area: two ascending scenes with imaging times in the early evening and one descending scene with an imaging time in the morning. Each individual scene has a 12-day revisit cycle, but the combination of all three shortens the interval to between four and seven days.

Figure 2.4. A European Space Agency Sentinel-1A synthetic aperture radar image collected on January 10, 2019, shows floodwater inundation along the Trinity River near Riverside, Texas, during a high river stage recorded by a USGS flood gage at the Riverside bridge.


The ESA Sentinel-1 product chosen for the flood analysis is the Level-1 Ground Range Detected High-Resolution dual-polarization (GRD-HD) data (#2.5.2). The original images are first radiometrically calibrated from intensity to Sigma Naught, speckle-removed, and then terrain-corrected using the SRTM 1 HGT product (#2.5.3). Finally, the Sigma Naught values are converted into decibel (dB) records. For the middle reach of the Trinity River, the available polarization combination is VV+VH. The co-polarization band VV is the critical band used to detect floodwater because the backscattering from open water is very weak when compared to that from the soil surface before flooding occurred. Flooded image pixels tend to be very dark. The cross-polarization band VH is sensitive to vegetation. When comparing the VH band under non-flood and flood conditions, brighter pixels occurring in the VH band in vegetated areas usually indicate floodwater underneath canopy due to the double-bounce scattering mechanism of the incident radar beam. Polarization bands from different dates (before and after an event) are regularly used to make false color composite images to visualize floods.
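Two of the processing steps named above, conversion of calibrated Sigma Naught to decibels and construction of a before/after false-color composite, can be sketched as follows; the display scaling limits are assumptions for illustration, and the calibration and terrain correction are presumed to have been done beforehand.

import numpy as np

def to_db(sigma0):
    """Convert linear Sigma Naught to decibels, guarding against log of zero."""
    return 10.0 * np.log10(np.maximum(sigma0, 1e-6))

def false_color(pre_vv_db, post_vv_db, low=-25.0, high=0.0):
    """R = post-event VV, G = B = pre-event VV, scaled to 0-1 for display.
    Pixels that darken after the event (new open water) appear cyan."""
    scale = lambda a: np.clip((a - low) / (high - low), 0.0, 1.0)
    return np.dstack([scale(post_vv_db), scale(pre_vv_db), scale(pre_vv_db)])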

In addition to the radar satellite imagery, cloud-free or low cloud cover optical imagery from Sentinel-2 (10-meter resolution) (#2.5.4), Landsat 8 Operational Land Imager (30-meter) (#2.5.5), NASA Terra and Aqua MODIS (250-meter) (#2.5.6), and Sentinel-3 (300-meter) (#2.5.7) was also acquired for the study area. In the optical satellite images, dark pixels imaged in the near infrared bands are usually associated with water because water bodies absorb near infrared wavelengths. False color composite images and Normalized Difference Water Index (NDWI) (#2.5.8) images are also generated to discriminate water bodies.
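The Normalized Difference Water Index mentioned above is a simple band ratio; the sketch below assumes the common McFeeters formulation using a green band and a near-infrared band (for Sentinel-2, bands 3 and 8), with a zero threshold that is a convention rather than a value prescribed by the study.

import numpy as np

def ndwi(green, nir):
    """NDWI = (green - NIR) / (green + NIR); water pixels tend toward positive values."""
    green = green.astype("float64")
    nir = nir.astype("float64")
    return (green - nir) / np.maximum(green + nir, 1e-6)

# Synthetic reflectances: the first pixel looks like water, the second like vegetation.
print(ndwi(np.array([0.10, 0.04]), np.array([0.02, 0.30])) > 0.0)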

UT-CSR sent the results obtained from the satellite observations to the USACE and Trinity River Authority, who then used the analysis to identify inspection sites along the river. The investigation successfully detected the passage of flood waves between river gage locations and showed the timing and extent of local inundation. The study provides another example of how cooperating TDIS partners can share new data and analytical techniques to monitor flood damage in near real-time, thereby providing useful guidance to engineering teams and first responders.

. . . . . The examples of data sharing explored in the preceding sections include partnerships with local, state and federal stakeholders, as well as with the academic research community. Each has a particular interest in pursuing the kinds of data exchange described for the different use cases. Some information needs are truly urgent, as in the case of Texas Task Force 1 requesting assistance for an accurate description of storm surge and high wind timing and magnitude in a hurricane landfall region, where TTF 1 teams will perform search-and-rescue operations. Other exchanges are less immediate and imperative but carry the desire to receive an information product that can offer new insights that are otherwise unavailable from traditional sources. For the USACE investigation of flood-prone segments of the Trinity River, the collection of optical and radar satellite imagery revealing changes in the areas of inundation contributes information that cannot be efficiently gathered from other evidence. As experience grows in the uses of a new data source, and investigators confirm that the information is trustworthy, other applications

may soon follow in other regions for different purposes. Data sharing of this kind expands our knowledge base.

The long-established rules and habits of organizations can make the attempts to share data awkward, tedious, and frustrating for everyone involved, but the many benefits of finding better pathways for data sharing are well worth the effort required.

Chapter 3. Data System Security

The prospects for the growth of a viable Texas Disaster Information System (TDIS) depend in large part on the trust relationships developed among organizations that choose to become partners and participants in data sharing arrangements. That trust will be anchored by the enactment of robust, well-tested procedures to monitor and maintain the kind of data system security that meets the high expectations of partners.

“Society’s overwhelming reliance on this complex cyberspace . . . has exposed its fragility and vulnerabilities that defy existing cyber-defense measures.” Secure and Trustworthy Cyberspace (#3.1)

Security threats and breaches, whether uncovered through internal research or actively exploited in the wild, are now a given; their escalation over the last several years has turned what was once an episodic concern into a constant one. With increasing numbers of new devices connecting to the internet, such as internet-of-things (IoT) devices, and the rise of both virtual and containerized servers, the scale of this threat is clearly not going to decrease anytime soon. Some of the increasing problems lie outside of the control of any organization, but many others are directly related to decisions made during the design and execution of a project. “The rise of APIs also [raises] the potential for more security holes…the challenges start with a programmers’ priority lists”. (#3.2) As a case in point, take the Facebook-Cambridge Analytica scandal (#3.3). Facebook did not take security or privacy into account when building its application programming interface (API), and the consequent failures can be traced directly to that disregard for these critical concerns. Security, as it has been practiced, is generally implemented as an add-on to an already existing application, or as a second phase of project development to be completed in later stages of work. While there is some level of justification for this approach, the unfortunate truth is that it is much more difficult to make an application thoroughly secure after the majority of the coding is done than to treat security as a first-level consideration in the process of creating an application. In a world of hyper-interconnectedness, and after a long period of inadvertent trial-and-error, we now see the emergence of the science of cybersecurity along with the integration of cybersecurity research into expressions of best practices, as seen in recent research awards made by the

National Science Foundation (#3.4). From the recognition of the dimensions and serious nature of the problems, the beginnings of a road map for handling these issues can be drawn.

In the context of TDIS, the fundamental question of security involves several parallel questions. Data Services, Data Systems, Applications and Users all have equal importance where security is concerned. These are fluid, complex questions: until a specific implementation of the system is made explicit, it is not useful to discuss the level of detail necessary to build the security models or to define a comprehensive list of security concerns and answers. Even so, several issues deserve discussion now to keep security concerns in sharp focus. “Examining the fundamentals of security and privacy as a multidisciplinary subject can …protect existing infrastructure” (#3.5). This chapter provides guidelines for the security concerns posed here, but not necessarily the answers for a specific implementation.

In Towards a Safer and More Secure Cyberspace (Chapter 5), the authors offer another way to separate security vulnerabilities into confidentiality, integrity and availability. They go on to assert that the pervasiveness of the technology and its inroads into everyday life means that the world and its systems are connected “Everywhere, all the time.”

Major Security Concerns related to Enterprise-Scale Disaster Information System “Users and Infrastructure take center stage”

A discussion regarding an enterprise at the scale of the Texas Disaster Information System (TDIS) generates many security concerns. In discussing and visualizing a complex hybrid distributed system (i.e., a federated system) of resources, which are demanded by the operational scale of TDIS, security is a concern that must be addressed and focused on from the outset. Most of this discussion will concentrate on users, especially their authentication and authorization. A system that requires security in depth is only ever as good as the user policies and business practices enforced. We must review the unique challenges of creating users when considering that there will be both an internal set of users and numerous external groups of users that require access to resources available from the system.

Beginning with user creation, there will need to be a process by which new users can register with TDIS, verify their identities and then be allowed access. Since a disaster is unpredictable, we oftentimes are unable to anticipate and control all of the factors that might increase the challenges associated with assuring secure access. For this reason alone, there should be some level of automation that governs the process of registering. The need for automation requires a hierarchy of assigned roles to which a user might belong. Planning ahead, we could make the default new user a member of a read-only public data group or groups that may not alter or add data.

The premise behind the restriction would be that the user is only accessing data sets, data processing and tools that they could access simply by virtue of this data being openly available in the public domain. While the data in question might be difficult to find and retrieve, if it is public domain data, then it should be accessible with the default user role (Figure 3.1).

Figure 3.1. Conceptual Model of Data Types. Different access rights can be granted by different roles for different types of data.

Moving up the hierarchy of restriction, there would be roles assigned that are related to commodity or commercial data sources perhaps driven by licensing restrictions. Other user roles would permit access to sensitive data, including personally identifiable information (PII). Some restricted data could potentially have only a single or a few approved data roles. Consequently, users would collect roles that allow them to see different data based on a vetting and verification process. Since this step follows the initial role provided upon registration, the process would require human intervention and administration. This also ensures a comfort level for those managing the system that the process allows a specific level of access only to users known to be appropriate (Figure 3.2).

Figure 3.2. Conceptual Diagram of Users, illustrating many roles. Each role could have originated with a separate group, and each has specific data that can be accessed by an authenticated user.
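The layered role model described above can be pictured as a mapping from roles to the data sets they expose, with every registered user holding the read-only public role by default. All role and data set names below are hypothetical.

PUBLIC_ROLE = "public_read"

# Hypothetical roles and the data sets each one exposes.
ROLE_DATASETS = {
    "public_read": {"administrative_boundaries", "stream_gages"},
    "licensed_commercial": {"parcel_data_licensed"},
    "critical_infrastructure": {"electric_substations"},
    "pii_restricted": {"ia_claims_with_pii"},
}

def datasets_for(user_roles):
    """Union of data sets visible to a user's assigned roles, public data included."""
    allowed = set(ROLE_DATASETS[PUBLIC_ROLE])  # default granted at registration
    for role in user_roles:
        allowed |= ROLE_DATASETS.get(role, set())
    return allowed

print(datasets_for({"critical_infrastructure"}))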

The described user authentication system will require administrative overhead for maintenance. Given the unique situations that occur in any given disaster, the number of users requesting access to TDIS could swell considerably within a very brief interval. Some users that justifiably need access during a specific event might never need access again. Also, users will inevitably forget their passwords or usernames. Some users might even change job responsibilities. These factors, in turn, could cause their roles within the system to require change. The churn within the user system will require human administration to maintain accurate, up-to-date security.

Moving on from users, consider the resources that drive the interest in using TDIS. The system will only thrive if it constantly receives up-to-date, pertinent, accessible data. It will also be used to integrate resources from the many different organizations that establish data sharing agreements, and any one of the sources at different locations could experience security issues. Business processes will need to be in place that allow staff not only to verify accessibility but also to focus security measures on those external resources that require special authorizations.

Besides the remote sources of data, TDIS will require constant monitoring to ensure that the security of the system itself is performing as expected. Internal data sources from the centralized system model will also need verified security measures. The internal data might be specific to GLO or to any part of the data sharing agreement for those groups that provide data without depending on web-driven data services. As new data sets come online, they will need to be tested

and secured with the appropriate access. Beyond these concerns are the health and maintenance of the infrastructure of the system itself. Software and firmware updates, along with OS-level and application-level patches, will need to adhere to a rigorous maintenance schedule.

Adjacent to the system infrastructure is the requirement for network monitoring. Appropriate tools, appliances and other hardware should monitor traffic on the system network in order to recognize and flag suspicious behavior. Efficient and reliable communication of threats and other problems encountered by network monitoring systems is critical for success. Mechanisms need to be in place to report issues to participating organizations and other responsible parties in a timely manner.

Federated Identity Management is the best means by which to handle a distributed TDIS “Each partner maintains their own security models and access privileges”

Today we face an increasingly challenging security environment on the internet, a challenge that comes in addition to maintaining internal security. There is a constant and ever-present threat of security exploits, and one of the most successful methods for breaching a system is through a credentialing or authorization exploit that unintentionally delivers controlled data to unauthorized users. Many approaches exist for maintaining secure access, from individual-based access controls to integration with an LDAP (Lightweight Directory Access Protocol) directory or some other database of users, roles and access privileges.

If we consider the most scalable and resilient options, then the current technical trend is to use federated identity management (Figure 3.3).

Providers can sync with a federated identity service to provide users with authentication and authorization rights to the wide variety of data sources distributed through TDIS. This process allows an organization with a set of designated users to authenticate and then acquire access to networks of different enterprises through an agreed upon trust among the organizations that are responsible for those enterprises. Partners in the federated identity management are responsible for their own users and their own authentication. Each partner vouches for their individual user’s access to the other partner’s enterprise resources.

Some of the benefits arising from this approach are obvious. It means that individuals only need to authenticate once to receive the privilege to access all the resources that their rights and roles afford them. Also, it means each group can maintain their own directories of authenticated users with their own business processes and security constraints. Individual partners are then responsible only for the management of their own users and not for the users of every partner involved across TDIS.

Figure 3.3. Example Diagram of Federated Identity Management.

Since access for individuals is negotiated through server interactions decided at the partner agreement level, not on a per individual basis, the entire process becomes much simpler than having to maintain different credentials for each member enterprise. In this arrangement, access would be granted in the form of groups to which users would be assigned. These groups, or roles, are given authorization to see specific data sets or resources. Individuals can be assigned groups through federated identity management services to be hosted by GLO or by a GLO designated partner.

A simple and effective method will be needed for maintaining lists of the roles that are shared between different member enterprises for this work. Since we are essentially passing around roles instead of individual users, such synchronization is paramount. Each role should carry information about the kinds of resources it is permitted to access and whatever restrictions on data usage exist for the data shared. The implementation of the procedures will have to be defined by the business practices that best suit each partner enterprise's requirements and agreements.
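The role synchronization discussed above can be pictured as a small translation table applied at the trust boundary: group names asserted by a partner's identity provider are mapped to local TDIS roles, and unrecognized groups confer no access. The partner identifiers, group names and mapping below are hypothetical.

# Hypothetical mapping from (partner, asserted group) to a TDIS role.
PARTNER_ROLE_MAP = {
    ("partner_env_agency", "impacted-water-analysts"): "thematic_impacted_water",
    ("partner_utility", "transmission-planners"): "critical_infrastructure",
}

def tdis_roles(partner_id, asserted_groups):
    """Translate groups from a federated login assertion into local TDIS roles."""
    return {
        PARTNER_ROLE_MAP[(partner_id, group)]
        for group in asserted_groups
        if (partner_id, group) in PARTNER_ROLE_MAP
    }

print(tdis_roles("partner_env_agency", ["impacted-water-analysts", "hr-staff"]))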

Data Sharing agreements among Organizations “Read-Only simplifies security concerns across Organizations”

Data Sharing Agreements should be as uncomplicated as business processes allow. We can begin by examining access by remote data services. This is the preferred method of data sharing because it keeps the data as up to date as possible and ready on demand on unpredictable time frames. The bulk of the access should be read-only rights for groups. The resources and data can then be grouped into categories, based either on thematic lines, on the level of security concern or on other criteria yet to be determined. These categories would become the basis for the roles/groups that would be built into the organization and shared with the GLO Federated Identity Management System.

In terms of data access, if a security model, such as a federated identity management system, can be agreed upon from the beginning then that model can inform the structure of the sharing agreement in powerful and meaningful ways. Having an understanding that one will be building a set of shared roles provides clarity about the groupings that an organization must build in order to maintain appropriate security levels. It also allows for easier consideration when new data sets are desired or created. New data would simply need to be included in one of the existing groups or be attached to one of the groupings as a model for the new data set.

The argument presented above can be illustrated by examples:

Thematic – Impacted rivers/creeks, lakes, and concentrated animal feeding operations (CAFOs) from environmental partners might be grouped into an Impacted Water and Sources thematic category. This category might be built with the expectation that it is read-only, as mentioned previously. Next, a role would be built into the security model by the environmental partner and shared with the GLO Federated Identity Management Service. Later, GLO or a GLO-designated partner would populate that role with individuals who require access to that category. Then, when authorized individuals log into TDIS, they can browse the list of resources in the thematic category of Impacted Water and Sources to select the data to which they need access.

Critical Infrastructure – represents a category connected to sensitive data. It might include electric power transmission lines, nuclear waste facilities, fire stations and Red Cross shelters. In this instance, there could be a need for two different roles – one for transmission lines and facilities and another for fire stations and Red Cross shelters. Individuals would be placed into either or both roles depending on the access requirements for a given user. This raises the question that also arises in user creation and user maintenance – how will it be determined which users are permitted access to which data sets? A resolution will need to be worked out through data sharing and security agreements with participating organizations.

Instances will occur in which the data cannot always be facilitated or exposed by means of web-based data services. In the case of data dumps or snapshot data files, the access level agreements will still need to be negotiated with partners so that everyone understands the levels of access granted and the criteria for granting the access. In this instance though, data access has no direct link back to the data sharing partner. Instead, the requested data resource is housed within the GLO-sponsored TDIS infrastructure.

Such ad hoc data sets should still follow the same data access models presented above (thematic, security sensitivities, etc.) and build out similar roles based on those models. And to carry the description forward, individuals would be placed into those different roles as needed. The benefits received from this process are similar to those above with the added feature of future proofing the security model. If at some later time, a member group decides to change their agreement model to permit a method for more direct access, it would require less effort to migrate access from the original model.

If we accept the premise that partner data access is read-only at the partner site, then the flow of data enabled by the data sharing agreements is primarily into TDIS. The majority of the security responsibilities fall on the organization that supervises the operation of TDIS. However, each member remains responsible for maintaining and operating its own security authorization infrastructure to allow the federated identity management system to call upon it for authenticating a particular role when accessing data sets. Under these procedures, the protocols for data access to be developed through data sharing agreements should reflect the principle that partners maintain control of their data, such as PII, and can enforce restrictions on access to the data so designated.

Another important consideration with the negotiated data sharing agreements relates to transparent operations. A recommendation is that GLO build into the maintenance of TDIS a requirement for security reports released at regular intervals. The reports could document all enhancements and operations done in support of data security. The numbers and types of users, the population of each role, and other network traffic metrics should be provided to all partners on a regular basis. The reporting requirements might need to be tailored to the considerations of particular partners, if there are specific security concerns. Otherwise, a general report that discusses all topics would be delivered to every participating organization.

As one can see, issues of security are salient, pervasive, complex, and highly dependent on the system implementation, the adopted user model and the interrelationships among the system architectures of the partners. Time and again, the weakest link in a security model is the human

factor. In large systems with hundreds or more users, the nature of the user magnifies this issue. Understanding this and considering how to deal with users across a wide spectrum of enterprises and industries means that we must make careful choices about how to implement user security. Because the problem can grow so large, the only effective answer is to distribute a large portion of the administrative load among the partners in the architecture. Federated identity management is the means to do this. It turns the Gordian knot of entangled, unrealistic, bureaucratic processes into something that is widely and efficiently used by industry today. Affirming from the outset that there will be many kinds of access to resources across data sharing partners is realistic and resilient. It also allows the data providers, the ones most invested in the security of their own data, to be the final arbiters of access to that data.

A significant portion of the security concerns must follow the architecture, functionality, and implementation details of a project, which leads to another phase beyond the one discussed here that will become even more comprehensive. The nature and ubiquity of security threats to the systems designed today demand that the attention of any project focus significant energy on addressing these issues. The resultant effort in the near-term and in the future will help ensure a meaningful, secure, trusted collaborative system.

Chapter 4. Highest Priority Data

The outcome of our broad data survey of the different information sources used during and after disasters leads to a recognition that certain data products deserve to be spotlighted for their importance to many stakeholders. The data that we underscore in this chapter are fundamental and foundational to the entire process of dealing with disasters through building a better understanding of disaster impacts to the physical environment and to the societal landscape of human population.

The data sets described in the following sections are significant because they often serve as the primary components in the creation of other valuable information products. For example, high-quality LiDAR data can be used to produce more accurate three-dimensional terrain models for the processing of baseline orthoimagery and also better representations of topographic slopes for incorporation into the rainfall/runoff processes used by hydrologic models. In addition, techniques for filtering LiDAR point-clouds can extract building footprints and estimate the ground floor elevations of structures located in flood risk areas, as well as define the height of vegetation canopy and calculate the fuel loads that could be consumed in a wildfire. A single high-quality data set can have an exceptional number of applications.

One could argue that other high priority data sets exist beyond the ones selected here. That is certainly true, but many of those examples have limited use cases. For instance, satellite and airborne thermal sensor detections of the hotspots in forest fires play an absolutely crucial part in wildfire suppression, but observations recording thermal emissions have a more limited role in other disasters, in contrast to the uses of LiDAR data that contribute so prominently to many applications. The high priority data presented in this chapter are notable for their frequent, widespread use across organizations having very different missions. The groups might not otherwise have reasons for interaction and discussion beyond their shared interest in a particular high-quality data product.

Recognition of the highest priority data paves the way for a community agreement to emphasize the importance of maintaining and enhancing key data sets. In the early stages of the transformation to digital data sources in the 1990s, Texas stood out as the national leader in geospatial data production. Following the passage of Senate Bill 1 in 1996, no other state

invested more funding in the creation of foundational data through statewide collaborative projects with federal partners. Has Texas continued in recent years to uphold an equal commitment of resources to sustain its position of national geospatial leadership? Funding the highest priority data on a statewide basis will increase production efficiency and result in more timely replenishment of these essential data sources.

LiDAR-Based Topography

The highest resolution and finest quality elevation measurements allowing the detailed representation of the different terrains and landscapes in Texas originate from aircraft surveys carrying Light Detection and Ranging (LiDAR) (#4.1.1) sensor systems. Two pioneering experiments for the collection of aerial LiDAR data involved Texas research institutions: the University of Texas Bureau of Economic Geology (UT-BEG) conducted river surveys in Honduras following Hurricane Mitch in 1998 and the Houston Advanced Research Center (HARC) mapped the Greater Houston Region after Tropical Storm Allison in 2001. The results from these and other early surveys quickly convinced professional hydrologists, civil engineers and floodplain managers that the elevation products made using LiDAR data can cover large geographic areas and yield topographic grids that are superior to those made by any other method for modeling flood events and other applications.

After a dense point cloud of laser reflection measurements is acquired, LiDAR data can be used to create high-resolution digital elevation models (DEMs) with vertical accuracy as good as 10 centimeters over non-vegetated surfaces. LiDAR sensors record multiple returns. Typically, the first return pulses are associated with vegetation canopy and building roofs, and the last returns are reflected from the ground. Usually a LiDAR topography product contains both estimated canopy height and bare ground elevation. LiDAR topography is extremely useful in the production of street-level flood inundation mapping and flood modeling to identify threatened properties (homes, commercial buildings, industrial facilities) (Figure 4.1). When used in conjunction with an accurate raster image of flood water or a derived flood extent polygon, which could be extracted from synthetic aperture radar (SAR) and/or optical imagery, one can estimate the water depth for any point inside the water boundary.
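The depth estimate described above amounts to a grid subtraction: a water-surface elevation minus the LiDAR-derived bare-earth elevation, evaluated only inside the mapped flood extent. The sketch assumes co-registered grids in a common vertical datum; the values are synthetic.

import numpy as np

def flood_depth(dem, water_surface, flood_mask):
    """Water depth in the units of the inputs, zero outside the mapped flood extent."""
    depth = np.clip(water_surface - dem, 0.0, None)
    return np.where(flood_mask, depth, 0.0)

dem = np.array([[10.2, 11.0], [9.8, 12.5]])      # bare-earth elevations
wse = np.full_like(dem, 11.4)                    # flat water surface for the sketch
mask = np.array([[True, True], [True, False]])   # flood extent from SAR or optical imagery
print(flood_depth(dem, wse, mask))               # approximately [[1.2 0.4] [1.6 0. ]]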

The terrain data used to represent river topography and floodplain form an essential aspect of hydraulic modeling. With a relatively low-resolution DEM (10- to 30-meter elevation postings), a model can perform well in topographically simple areas, such as in rural areas where the terrain changes gradually, even though important surface features and properties are not captured in full detail. In urban areas, however, more complex features, such as roads, buildings, riverbanks and dikes, need to be incorporated in the model set-up, because they can produce significant effects on flow dynamics and flood propagation. Because LiDAR topography offers a very detailed representation of terrain features, vegetation and structures in the built environment, LiDAR-based data products have become highly desirable for regional hydraulic modeling.

Figure 4.1. A high-resolution LiDAR terrain image of an area in Madisonville, Texas, shows parkland and buildings below a small flood control reservoir at the top of the image.

Beyond flood modeling, aerial LiDAR collection offers valuable information for post-disaster damage assessment. Damaged features often display changes in height and volume. For instance, tree canopy may be destroyed by a tornado or hurricane as well as by wildfire. Heavily impacted areas will show large losses in vegetation volume, and LiDAR data collection can be an ideal method to measure the volume loss with high accuracy. Similarly, houses damaged by water, wind or fire may have torn roofs, fallen walls and debris piles of household contents, all of which can be detected by comparing detailed LiDAR-based topography acquired before and after an event.

NOAA Atlas 14

NOAA Atlas 14, Volume 11, Version 2.0 (Atlas 14) is a peer-reviewed National Weather Service study of historical rainfall and is the de facto national standard and source of precipitation frequency estimates and associated information for the U.S. Government and affiliated territories. The National Oceanic and Atmospheric Administration (NOAA) Hydrometeorological Design Studies Center published Atlas 14 in September of 2018. The Atlas is divided into volumes based on geographic sections of the country (#4.2.1). The historical

record for the previous rainfall study ended in 1994. Atlas 14 extends the rainfall record through 2017 (#4.2.2).

Volume 11 of Atlas 14 contains the Texas project area and publishes updated precipitation frequency estimates that quantify the degree of flood risk at a location. Atlas 14 uses improved statistical techniques with longer record lengths that provide more reliable estimates; however, additional research is needed on area reduction factors because Atlas 14 estimates are point-based. The effects of climate change also require further study. A comparison of maps published in the current Atlas 14 study and past studies conducted by the United States Geological Survey (USGS), both showing Texas 100-year 24-hour rain depths, illustrates the dramatic increases in precipitation depths over most of the state (#4.2.3) (Figure 4.2). Key findings in the updated Atlas 14 detail significantly higher rainfall frequency values in parts of Texas that redefine the amount of rainfall required to qualify as a 100-year or 1,000-year event. For example, Austin’s 100-year rainfall values for 24 hours surpassed the earlier estimates by as much as three inches, reaching totals of up to 13 inches. More alarmingly, the single-day 100-year estimates for Houston increased from 13 inches to 18 inches. The higher rainfall values result in event classification changes, with events formerly classified as 100-year events now more frequently designated as 25-year events (#4.2.4).

Figure 4.2. Comparison of Maps. Texas 100-Year 24-Hour Precipitation Depths. Atlas 14 vs. USGS.

All published data and resources are located on the Precipitation Frequency Data Server (PFDS) (#4.2.5), a point-and-click interface developed to deliver precipitation frequency estimates in several formats, including point-based ESRI Shapefiles and ArcInfo ASCII Grids, cartographic maps, temporal distributions and seasonality analysis of annual maxima. PFDS users have the option to choose from a list of stations, indicating preferred time series type (partial duration, and

annual maximum) and data types (precipitation depth in inches, and intensity in inches per hour). Street addresses can also be entered to reveal the table of estimates and depth curves in multiple time durations and recurrence intervals. Since precipitation frequency estimates from this atlas are estimates for a point location, they do not directly relate to an area.
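As one illustration of how the downloadable grid products might be used programmatically, the sketch below reads a standard ESRI ASCII grid and samples it at a longitude/latitude point. The file name is hypothetical and no PFDS-specific interface is implied; the header layout follows the ordinary ESRI ASCII grid convention.

import numpy as np

def read_ascii_grid(path):
    """Read the six-line ESRI ASCII header and the data array that follows it."""
    header = {}
    with open(path) as f:
        for _ in range(6):  # ncols, nrows, xllcorner, yllcorner, cellsize, NODATA_value
            key, value = f.readline().split()
            header[key.lower()] = float(value)
    return header, np.loadtxt(path, skiprows=6)

def sample(header, data, lon, lat):
    """Nearest-cell lookup for a point given in the grid's coordinate system."""
    col = int((lon - header["xllcorner"]) / header["cellsize"])
    row = int(data.shape[0] - 1 - (lat - header["yllcorner"]) / header["cellsize"])
    return data[row, col]

# Example usage with a hypothetical 100-year 24-hour grid file:
# hdr, grid = read_ascii_grid("tx100yr24ha.asc")
# print(sample(hdr, grid, lon=-95.37, lat=29.76))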

Atlas 14 data help state and local communities prepare for potential flooding and minimize the impacts of weather on lives and livelihoods. Engineers and planners use precipitation frequency estimates to bring knowledge of flood hazards into land use and development decisions, including managing and designing stormwater conveyance systems and other infrastructure. Hydrologic models also include the estimates in order to delineate flood risks and assist floodplain management under guidelines established by the Federal Emergency Management Agency (FEMA) National Flood Insurance Program (#4.2.6). The NOAA Atlas 14 update demonstrates that historical rainfall data is essential for long-term planning to mitigate flood risk and to understand how flooding can impact communities during disaster response and recovery operations.

DHS HIFLD Products and Data Services

As described in Chapter 1, the Homeland Infrastructure Foundation-Level Data (HIFLD) Subcommittee within the Department of Homeland Security (DHS) provides access to a large collection of critical infrastructure data sets, licensed data and other spatial data essential to disaster response and recovery. While their open data portal is accessible to the public, the secure portal (#4.3.1) linking to “For Official Use Only” (FOUO) data and licensed data sets is available only to individuals involved in the support of homeland security and homeland defense missions.

The HIFLD products merit consideration as high priority data because they represent the most comprehensive collection of specialized information dedicated to critical infrastructure and key resources (CI/KR). Although various agencies and private sector entities offer certain elements of CI/KR data for Texas, HIFLD provides the broadest data range, available in common formats and supplemented with metadata and other documentation about collection method, date of production, etc. The assembly of a similar massive data compilation ready for urgent use during a crisis is a daunting challenge. Rather, emphasis should be placed on maintaining the currency and accuracy of HIFLD and on resolving any remaining obstacles to its effective use.

Until recently, access to secure HIFLD products was possible only by contacting staff in the Geospatial Management Office who were responsible for authorization. Under the former process, obtaining credentials could take several weeks. An improved process now allows individuals to apply for access credentials online, resulting in speedier access. For approved users, secure resources are made available through the Homeland Security Information Network (HSIN). To request access to licensed data sets, including land parcels, transportation data, Dun & Bradstreet Business Points Data, etc., an HSIN user must log in and complete an online Data User Agreement (DUA).

During recent responses to natural disasters in Texas, such as hurricanes, tornadoes, floods and the COVID-19 pandemic, UT-CSR has obtained valuable HIFLD data sets for electric substations, nursing homes, animal processing plants and pharmacies. While HIFLD supplies a very useful collection of data for disaster response and recovery, these data are not easily shared rapidly with other agencies, an unfortunate drawback caused by the requirement to request special permission from HIFLD representatives. Another limitation is the currency of the data, which may not be as recently collected as some locally available data sets. An annual update cycle for land parcels data contained in HIFLD lags considerably behind other sources in which some Texas central appraisal districts refresh their parcels data daily. As a high priority data source contributing essential support to response operations, HIFLD deserves continued maintenance and improvement with an emphasis on accelerated data delivery.

Radar Satellite Imagery

For decades, remote sensing satellites have played an important role in monitoring Earth’s resources, weather and biogeophysical processes. In recent years, satellite observations have become an effective means for disaster monitoring and damage assessment on a global scale through the work of international space agencies and affiliated organizations including the International Charter for Space and Major Disasters. Multispectral imagery acquired from satellites with optical sensors, such as Landsat 8, Sentinel-2, Terra and Aqua, is in wide use for flood detection and mapping because open surfaces of water can be easily discriminated from land surfaces as a result of their characteristic differences in spectral reflectance in the visible and infrared spectrum. However, weather conditions significantly limit collection of cloud-free optical imagery, especially during and after a weather-related disaster. Since severe flooding usually occurs as a result of heavy rainfall, optical imagery may not be available for real-time flood monitoring.

By contrast, remote sensing satellites that carry a synthetic aperture radar (SAR) (#4.4.1) have an active microwave sensor that works under all weather conditions and at all times during the day or night completely independent of illumination by the sun. As a consequence of their long wavelengths, radar microwaves penetrate clouds and interact with land surfaces and water bodies. The critical band used to detect water bodies is the co-polarization VV or HH band because the backscattering from open water tends to be very weak and produces dark areas in radar imagery, where flooding may be underway. The European Space Agency (ESA) Sentinel-1A and -1B (#4.4.2) satellites carry SAR instruments that can differentiate very effectively between land and water (Figure 4.3). Each satellite follows an orbit with a 12-day revisit cycle to cover locations in Texas. When both satellites are collecting SAR data, the repeat coverage cycle becomes six days.

Figure 4.3. Flood detection during Hurricane Harvey 2017: (a) Sentinel-1 SAR; (b) UAVSAR (Freeman 3-Component Decomposition).

Radar satellite imagery collections have made important contributions to emergency response operations and decision making during recent disasters affecting Texas, as discussed in Chapter 2 with the coordination that occurred between UT-CSR and the NWS West Gulf River Forecast Center (WGRFC) following the landfall of Hurricane Hanna. Another example happened during the widespread flooding from Tropical Storm Imelda in September 2019 (Figure 4.4). During that event, Sentinel SAR data revealed the extent and magnitude of the flooding in detail and showed that most of the inundation occurred as the result of the local accumulation of heavy rainfall directly on the surface and not from floodwaters discharged downstream along rivers and tributaries. A broad view of the nature of a disaster can make a strong impact on how field operations are planned and conducted.

Another reason for the state to support the high priority of radar satellite imagery recognizes that the technology is rapidly evolving and expanding. We are in the early stages of an era that promises breakthroughs in the application of new and emerging SAR data sources. In 2022, NASA and the Indian Space Research Organization (ISRO) will launch NISAR (#4.4.3), a joint mission to deploy the next generation of L-Band and S-Band radar sensors. The L-Band instrument was flown by the NASA Jet Propulsion Lab (JPL) UAVSAR (#4.4.4) for four days during the flooding from Hurricane Harvey and produced outstanding results. Today, commercial microsatellite operators (#4.4.5) are beginning to collect SAR data that may become

64 increasingly useful during disasters. For all these reasons, radar satellite imagery deserves our continued attention.


Figure 4.4. A European Space Agency Sentinel-1A synthetic aperture radar composite image collected on September 19, 2019, indicates widespread flooding (blue) in the suburban areas northeast of Houston, Texas.

FEMA Individual Assistance / NFIP

The FEMA Individual Assistance Program (IA) provides disaster assistance to homeowners who have no flood insurance and whose property has been damaged as the result of a federally declared disaster (#4.5.1). It includes the Individuals and Households Assistance Program (IHP). Among other services, the program provides rental assistance and temporary housing, such as Manufactured Housing Units (MHUs). UT-CSR maintains an ArcGIS Portal application showing the locations of currently active MHUs and nearby river gages (#4.5.2). In the months following Hurricane Harvey, UT-CSR created and maintained a Hurricane Harvey Dashboard application, which displayed the current number of IA registrations for the event (#4.5.3).

Long-term recovery planning and disaster analytics benefit from access to the IA data. In a report prepared by UT-CSR, the team performed geospatial analysis to determine the localities in Texas that recorded different concentrations of IA claims. The IA claims were combined with different census boundaries to produce several map products. Point densities were created in ArcGIS to highlight areas of highly concentrated claims (Figure 4.5). Using the same data input, these areas could also be visualized as three-dimensional Point Density maps (Figure 4.6).

Figure 4.5. IA Point Density Map.
Figure 4.6. IA 3D Point Density Map.

Claims filed under the National Flood Insurance Program (NFIP) were combined with census information and the boundaries of ZIP Code Tabulation Areas to show the highest concentrations of claims per 1,000 housing units. Analysis of NFIP data representing a period of several years permits the visualization of the areas in Texas where the incidence of NFIP claims is highest over the time span. Properties with two or more claims were mapped to show areas of higher repetitive loss in Texas from the years 1976 to 2018 (Figure 4.7).

[Figure 4.7 map legend: addresses with two or more claims (natural breaks): 3-4 (11,165), 5-8 (1,766), 9-17 (155), 18-35 (13). Source: insurance claims compiled from the most recent National Flood Insurance Program (NFIP) records.]
Figure 4.7. NFIP Repetitive Loss Properties in Texas (1976 – 2019).


Geospatial analysis of insurance claims data, such as the NFIP and IA records, offers insightful support to long-term mitigation planning and zoning strategies in communities that frequently experience damaging storms. The examples from disaster analytics clearly illustrate vulnerable areas that repetitively experience problems in disasters and can help guide city planners in addressing areas of concern that may require rezoning or other changes to reduce future damage.

The state should place a high priority on access to these data sets. The experience of UT-CSR reveals that the records are not easy to acquire and are often transferred in a format that cannot be readily ingested for geospatial analysis. Having the data immediately available and processed in advance for geospatial use will assist crucial planning efforts for future storms and other disasters.

Enhanced Demography / SVI

As the 2020 decennial U.S. Census approaches, Texas once again will claim a place as a leading state for growth over the previous decade, with the population rising at a pace that could overtake California as the most populous state by 2050. With the rapidly expanding population come many demographic changes to the current conditions represented by age stratification, household economic status, racial and ethnic diversity, language spoken and educational attainment. Demographic shifts in different regions of Texas will likely increase the risks posed by disasters of different kinds, as increasing urban and suburban growth raises rents and housing costs and pushes low income groups into places where living costs are lower, but also where disasters are more damaging and more frequent.

The extraordinary diversity of the population of Texas and its changing character present challenges to the collection and analysis of demographic data required to detect trends and to make reliable forecasts for long-term planning. The U.S. Census Bureau’s American Community Survey (ACS) (#4.6.1) offers the best currently available metrics to observe annual trends, but it may miss important shifts especially during their early stages. The success of TDIS will require an understanding of the demographic forces that shape the future population landscape and its relationship to the geography of disasters. That understanding will require enhanced demographic data capable of documenting a broader range of socioeconomic and other variables. In addition to reaching out to the academic research community, TDIS should begin a dialogue with the Texas Demographic Center (#4.6.2) to formulate a plan for the development of suitably enriched demographic data centered on the societal impacts caused by disaster.

A recent report (#4.6.3) by the Texas Demographic Center uses data compiled from the 2019 American Community Survey to identify the state's most vulnerable populations in the wake of the hardships caused by the COVID-19 pandemic and subsequent disruption to the national economy. The report focuses on disparities in minority household income, housing insecurity, food insecurity and lack of health insurance coverage among other factors. While the coronavirus pandemic may reflect a rare combination of circumstances, the analytical techniques of the investigation have wider application to the consequences of other kinds of natural disasters. The study also reinforces the need for a Texas-based Social Vulnerability Index (SVI or SoVI) (#4.6.4). Originally developed by the Hazards & Vulnerability Research Institute of the University of South Carolina, the SVI applies ACS data measuring a selection of socioeconomic variables to produce a national index at the census tract level of the degrees of population vulnerability when disasters strike. The national SVI deserves recognition for its approach and can offer insights, but Texas should have its own SVI based upon the particular set of variables that best captures characteristics of the Texas population, where the weighting of several factors, such as primary language spoken, may be more influential than in a national context. A Texas-based SVI should also take advantage of the recent work done by the Centers for Disease Control (CDC) in their reanalysis of the national SVI with relation to Public Health (#4.6.5).
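
The ranking approach behind such an index can be sketched in a few lines. The example below follows the general spirit of the CDC percentile-rank method but is purely illustrative: the input file, the indicator variables and the weights are invented, and a production Texas SVI would rest on a vetted variable set and methodology.

```python
# Simplified sketch of a percentile-rank vulnerability index, in the spirit of
# the CDC/ATSDR SVI method. Variable names, weights, and the input file are
# hypothetical; a production Texas SVI would use a vetted set of ACS variables.
import pandas as pd

acs = pd.read_csv("tx_tract_acs.csv")   # one row per census tract (hypothetical extract)

# Higher values of each indicator are assumed to mean higher vulnerability.
indicators = ["pct_below_poverty", "pct_no_vehicle", "pct_limited_english", "pct_age_65_up"]

# Example weights; a Texas-specific index might weight language higher than a
# national index would (purely illustrative numbers).
weights = {"pct_below_poverty": 1.0, "pct_no_vehicle": 1.0,
           "pct_limited_english": 1.5, "pct_age_65_up": 1.0}

# Percentile-rank each indicator across tracts (0 = least, 1 = most vulnerable),
# then combine into a weighted sum and re-rank the sum to obtain the final index.
ranked = acs[indicators].rank(pct=True)
acs["svi_raw"] = sum(ranked[col] * w for col, w in weights.items())
acs["svi"] = acs["svi_raw"].rank(pct=True)

print(acs[["GEOID", "svi"]].sort_values("svi", ascending=False).head())
```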

Enhanced demographic data products and data services designed for Texas need to be considered among the highest priority data for TDIS.

. . . . .

The many applications supported by the several exceptional resources described in the preceding sections serve to demonstrate the importance of a commitment to advancing the future development of high priority data products. Recognition of their widespread use along with other versatile, high-quality data products should elevate the need for a discussion amongst the agencies partnering with TDIS to make firm plans for the maintenance and enhancement of a selected core of high priority data in coordination with national initiatives.

The need to rebuild and make improvements to physical infrastructure following a major disaster is taken for granted as a necessity for recovery to occur. By contrast, the need to create and enrich digital infrastructure is often neglected, or not even considered to be an element required to bring about effective recovery and to generate resilience through better-informed planning.

In addition to refining data quality through higher production standards and better procedures, there should be a consensus to pursue continuous data improvement with an eye toward future needs. Data having higher spatial resolution or extended feature attribution would be desirable for most current disaster-related applications and could make some new approaches to disaster analytics feasible. Just as the collection and processing of aerial LiDAR data was regarded as an experimental curiosity in the 1990s only to become the central component of multiple applications today, the products from technologies now emerging may be added to the future list of high priority data. In fact, new data sources having a range of unexplored possibilities have become available very recently in the form of mobility tracking data through smartphone and other GPS-equipped devices in which aggregated, anonymized data can reveal the frequency and patterns of population movement within the area of a census block group. Mobility tracking data has already contributed to new kinds of epidemiological modeling of pandemic disease, and its use in other kinds of disasters is only beginning.

Chapter 5. Data Shortfalls

All data can be improved, but the improvements needing to be made to some data collections used by applications in disaster response and recovery have more significance than others and should be addressed. Our data survey uncovered examples of data shortfalls of various kinds.

The reasons for the inadequacies can be either straightforward or complicated depending on the data set in question. Some data products are re-collected and replaced too infrequently and are simply outdated. Others were originally collected at a coarse spatial resolution that is no longer sufficiently detailed to meet current uses. In some instances, data maintenance has declined over time, and there may be a lack of continuing stewardship by the originator. One of the most common data gaps is the lack of comprehensive statewide coverage consisting of uniform data collected during the same time frame, which tends to lead to other problems with data subject to different quality control standards or the blending of data obtained using different collection methods or production technologies.

Some inadequacies result from circumstances that seem almost arbitrary. For instance, in the 1990s, the U.S. Geological Survey (USGS) began to produce digital elevation models (DEMs) using the elevation contours and supplementary spot elevations extracted from their 1:24,000 scale 7.5-minute topographic maps. The resulting digital matrices could be generated using a 10-meter separation between the elevation points within the grid or a less detailed 30-meter spacing interval. Obviously the 10-meter product captured the more accurate representation of topography, but nevertheless the USGS proceeded to create a national coverage of 30-meter DEMs. Why was this done? For purposes of distribution, the file sizes of many individual 10-meter DEMs would not fit on a standard 3.5-inch floppy disk. All 30-meter DEMs would fit, and so the physical media in common use for distribution at the time dictated the quality of the data product. It would take well over a decade for the USGS to replace the original DEM data sets with their superior 10-meter versions.
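
A rough back-of-the-envelope calculation shows how tight the constraint was. The quad dimensions and the bytes-per-value figure below are approximations rather than the exact USGS format specification:

```python
# Back-of-the-envelope check on the floppy-disk constraint (all figures are
# rough assumptions: a 7.5-minute quad of ~11 km x 14 km and ~6 bytes stored
# per elevation value in the old ASCII DEM format).
quad_width_m, quad_height_m = 11_000, 14_000
bytes_per_value = 6
floppy_bytes = 1.44 * 1024 * 1024          # standard 3.5-inch high-density floppy

for spacing in (10, 30):
    cols = quad_width_m // spacing
    rows = quad_height_m // spacing
    size_mb = cols * rows * bytes_per_value / 1024 / 1024
    fits = "fits" if size_mb * 1024 * 1024 <= floppy_bytes else "does not fit"
    print(f"{spacing}-m DEM: ~{cols * rows:,} points, ~{size_mb:.1f} MB ({fits})")
# Roughly: the 30-meter grid comes in under 1.44 MB, while the 10-meter grid is
# several megabytes and would have required multiple disks.
```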

Finally, data shortfalls can result from policy decisions that particular data products are not considered to be the responsibility of government agencies to make openly available to the public. The following sections focus attention on several data themes and individual data sets that contain shortfalls that should be recognized and should be improved.


Stream and Tide Gages

Rivers are the veins and arteries of the state's hydraulic system. As such, monitoring the flows, levels and composition of the water provides the state with important actionable information for environmental quality, water supply and disaster response. The historical record of these observations and their statistical analyses form part of the foundation to answer the questions asked during disaster recovery, planning studies and projects. Similar information is collected along the coast. Real-time measurements of tides, storm surge and other important coastal ocean statistics are fed into an archive. Both the river and ocean sentinel systems use gages to measure and convey their results. Consider for a moment how these are used. A group of first responders heading into an area near a river overflowing its banks could easily be overwhelmed by rapidly changing conditions that might have required different equipment. The harsh environment in which these instruments operate requires maintenance and regular replacements, not to mention reconstruction of the sensor and communication installations lost through misadventure or unforeseen circumstances.

Beyond those concerns, there is a real need to expand the network of both the tide gages and the river gages. When the 2015 Blanco River Flood caused devastation in Wimberley, Texas, the river gage was destroyed. Upstream gages did not exist, so the flood struck with little warning of its historic size. In the aftermath, the partial record from the destroyed gage at Wimberley showed a flood of unprecedented magnitude. More gages give more lead-time to react to fluid, life-threatening conditions. Using data from recent storms and floods, new gages strategically placed will allow responders, planners and others to continue to combat new challenges as storm frequency and severity increase. Also, better placement of the gages and the elevation of the monitors would increase their usefulness. Continuous improvement should be made to the connectivity between the gages and the system of record for the data. The resiliency of that pulse of data ensures its consistency and dependability.

Many have long recognized the lack of investment in stream gage infrastructure and the consequences of that failure in lost lives and property damage (#5.1.1). The state should not permit the river gage and tide gage networks to continue to decline during a time of intensifying weather-related disasters.

Height Modernization

The discussion of geodetic control in Chapter 1 introduced the topic of height modernization, which is a central part of a recent program led by the National Geodetic Survey (NGS) to create a more accurate National Spatial Reference System (NSRS) (#5.2.1) to replace the current system of NAD 83 horizontal control and NAVD 88 vertical control used to define locations and measure elevations. Field surveys conducted by NGS have demonstrated that the majority of geodetic benchmarks in Texas contain vertical errors of 10 to 15 centimeters (4 to 6 inches) and many have greater discrepancies when compared to the values obtained by modern high-accuracy techniques based on global positioning satellite (GPS) technology and new models of the Earth's geoid. Common errors of 10 to 15 centimeters become serious problems for some locations, such as Texas coastal counties affected by the combination of local ground subsidence and global sea level rise. For floodplain management studies incorporating LiDAR-based topography that attains a maximum vertical accuracy of approximately 10 centimeters over non-vegetated surfaces, the 10- to 15-centimeter error in local geodetic control can produce a foot of total uncertainty in the LiDAR elevations. The result is disturbing given the current multi-agency initiative to create more accurate flood risk maps for Texas.

Until existing benchmarks are reoccupied and systematically corrected by surveys using high-fidelity GPS instruments, and new benchmarks are established to replace destroyed and missing monuments, a shortfall will exist that reduces the benefit that could be realized from regional flood management projects in major river basins. Fortunately, the national effort is underway for Height Modernization benchmark campaigns to produce GPS-derived orthometric heights with a vertical accuracy of 2 centimeters. Those concerned with reducing disaster impacts should support Height Modernization as an essential component in the development of statewide foundational data.

Street Addressing and Geocoding

One of the first steps to understanding the risk associated with a location is to know precisely where that location is found. One such location is an address point, which is a data point linked to a map coordinate position representing a street address. Address point locations are frequently estimated using software that matches and interpolates the location through a process called geocoding. The geocoding method is helpful when gaps occur in the address point data available from trustworthy sources.

Addressing systems are managed by local jurisdictions and aggregated by regional organizations. The 9-1-1 system coordinators across Texas serve as the state’s authoritative data sources for address point compilation. Generally, statewide address point data for Texas can be acquired from public or commercial sources, but such data contain gaps and shortfalls that should be investigated before using the data for emergency management purposes. The three common shortfalls of data quality that affect address point data, regardless of the source, are completeness, currency and accuracy.

Maintaining data completeness for 254 counties is a challenge, as gaps will occur wherever existing or new addresses have not been digitally collected and coded. In recent years, the Texas Strategic Mapping Program (StratMap) (#5.3.1) has partnered with local jurisdictions to produce a compilation of address points in 249 Texas counties that is now available for public use. Unfortunately, a data gap remains, representing approximately 546,000 residents in Brazos, Ector, Howard, Rockwall, and Somervell counties. Data submissions are requested annually, but the address point currency cannot be guaranteed at this temporal scale because of the large number of jurisdictions involved, each with its own production schedule.

Geocoding can be used to fill data gaps, but it is important to note that the geocoding process has its own shortcomings. First, the geocoding coverage is only as good as the reference street network data coverage. For example, new subdivisions or rural residences may not yet be included in a streets database. Geocoding fails from the outset when a street is not available for address matching. Addresses that pass the matching step are processed by the interpolator, which produces a latitude and longitude coordinate for the address point. Geocoding software uses an algorithm to score an interpolated point for locational accuracy. Any points that score below a predetermined threshold should be reviewed for accuracy using other tools such as Google Street View (#5.3.2).
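
The match-interpolate-score sequence can be illustrated with a toy example. Everything below (the street segment, its address range and the match score) is invented and stands in for the behavior of real geocoding software:

```python
# Illustrative sketch of the geocoding steps described above: match an address
# to a street segment, interpolate a coordinate along the segment's address
# range, and flag low-confidence results for review. All data are made up.
from dataclasses import dataclass

@dataclass
class Segment:
    street: str
    low: int            # low house number on the segment
    high: int           # high house number on the segment
    start: tuple        # (lon, lat) at the low-number end
    end: tuple          # (lon, lat) at the high-number end

segments = [Segment("OAK ST", 100, 198, (-95.3700, 29.7600), (-95.3690, 29.7610))]

def geocode(number: int, street: str, threshold: float = 0.80):
    for seg in segments:
        if seg.street == street and seg.low <= number <= seg.high:
            # Linear interpolation of the house number along the segment.
            t = (number - seg.low) / (seg.high - seg.low)
            lon = seg.start[0] + t * (seg.end[0] - seg.start[0])
            lat = seg.start[1] + t * (seg.end[1] - seg.start[1])
            score = 0.95          # placeholder match score from the matcher
            needs_review = score < threshold
            return (lon, lat), score, needs_review
    return None, 0.0, True        # no candidate segment: the match fails outright

print(geocode(150, "OAK ST"))
```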

Address points missing at the time of future disaster events will impact an emergency manager’s ability to respond effectively, conduct adequate risk analysis in the community and make informed decisions. Stop-gap solutions, such as purchasing commercial data and geocoding software tools, are available. These products advertise a high level of quality control, which if verified, could reduce the gaps in data completeness, currency and accuracy.

Building Footprints

The mapped outlines of structures (houses, retail stores, factories, outbuildings, etc.) are often represented as vector objects in GIS layers created to support applications for public works, zoning and tax appraisal. Commonly called “building footprints,” they are especially useful when attempting to estimate damage costs from a disaster or determining where residential properties are located in hazardous areas. Texas contains millions of habitations, with more than eight million detached houses. The state currently lacks a comprehensive source for statewide data containing building footprints that could be considered reliable and updated on a regular basis. In a state experiencing rapid growth and development over many areas, the shortfall of building footprint data affects both disaster response operations during a crisis and post-event recovery activities together with planning for mitigation.

Several methodologies can produce building footprint data sets that meet acceptable standards for accuracy. Each technique involves semi-automated or fully automated processes that typically begin with imagery. Stereo photogrammetric analyses that measure structure heights as well as two-dimensional shapes allow buildings to be extracted from aerial survey photography. In 2018, Microsoft released a national building footprint data set having over 125 million polygons using the aerial photography collected for their Bing search engine mapping feature (#5.4.1). The results meet accuracy requirements, with the exception of buildings with atypical shapes or roofs covered by tree canopy. While the Microsoft data resides in the public domain, commercial data sets have appeared recently that require a license fee. EarthDefine (#5.4.2) offers a national product with 145 million building footprints acquired and processed in much the same way as the Microsoft release (Figure 5.1). Multispectral satellite imagery offers another technique for building footprint extraction using image processing capabilities for the edge detection and spectral characterization. DigitalGlobe (#5.4.3) has created building footprints for many large urban and suburban areas in the United States. The third and most accurate method for structural feature extraction involves the processing of LiDAR point clouds generated by aerial surveys. The three-dimensional laser beam reflection data captures the distinctive differences in building edges, roof profiles and construction materials and can penetrate vegetation canopy that often obscures residential roofs.


Figure 5.1. A building footprint product licensed by EarthDefine covers a part of the University of Houston campus and nearby commercial buildings and residences.

Each method has flaws, but the technology needed to collect building footprints at reasonable costs over large areas exists today and will improve as Machine Learning (e.g., CNN deep learning-based computer vision) techniques are applied.

The use of building footprint products that are enhanced and updated on a regular basis would have a significant impact on disaster damage estimation by accelerating the determination of impacted structures in floodplains, along tornado tracks and within the burn scars left by wildfires. When used in conjunction with risk assessment models, the building footprint data would lead to more reliable predictions of the total number of structures at risk, especially in high-growth areas of Texas.
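
One common damage-estimation step, counting the structures that fall within a mapped hazard extent, can be sketched as follows; the file names are placeholders and any of the footprint sources described above could be substituted:

```python
# Minimal sketch: intersect building footprints with a mapped flood extent to
# count potentially impacted structures. File names are assumptions; any
# footprint source (Microsoft, EarthDefine, LiDAR-derived) could be substituted.
import geopandas as gpd

footprints = gpd.read_file("building_footprints.gpkg")   # assumed footprint layer
flood = gpd.read_file("flood_extent.shp")                # assumed inundation polygons

footprints = footprints.to_crs(flood.crs)
impacted = gpd.sjoin(footprints, flood, how="inner", predicate="intersects")

# A footprint touching more than one flood polygon appears more than once;
# deduplicate on the footprint index before counting.
impacted = impacted.loc[~impacted.index.duplicated()]

print(f"{len(impacted):,} structures intersect the mapped flood extent")
impacted.to_file("impacted_structures.gpkg", driver="GPKG")
```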

Base Flood Elevations

Reliable flood risk assessments depend on accurate measurements of elevation over the entire landscape of floodplains and nearby areas. Height Modernization surveys furnish geodetic benchmarks having 2-centimeter vertical accuracy, as described earlier in this chapter, while improvements to the collection and processing of LiDAR-based topography offer enhanced digital elevation models with elevation grids that can be referenced to the highly-accurate geodetic control points. By using this verified digital terrain, teams of climatologists, meteorologists, hydrologists, hydrodynamic modelers, and others can create numerical models that estimate the thresholds for flooding and surface inundation over whole river basins. Base Flood Elevations comprise one of the primary outcomes from the numerical models and can be used to inform the much-needed flood risk assessment of areas in Texas experiencing rapid population growth and development.

FEMA defines the Base Flood Elevation (BFE) (#5.5.1) as the surface water elevation resulting from a flood that has a one percent chance of equaling or exceeding that level in any given year. The one-percent annual occurrence flood is also referred to as the base flood, or commonly as the 100-year flood. Using detailed hydraulic analysis and modeling, FEMA publishes Flood Insurance Rate Maps (FIRMs). The FIRMs show the land area under the base flood elevations as Special Flood Hazard Areas (SFHA) (#5.5.2), using the one percent annual recurrence interval of flooding to designate areas as "high risk." Low-risk areas lie outside both the SFHA and the areas subject to the 0.2 percent annual recurrence interval (500-year) flood. Moderate flood hazard areas occupy the zone between the base flood elevation boundary and the 500-year flood elevation boundary.

Knowledge of the BFE is important for conducting an accurate risk analysis. For instance, properties built at elevations equal to or lower than the local BFE lie within the high-risk zone (one percent annual recurrence interval of flooding). To reduce the damage to a minimum, the lowest habitable areas (floors with living areas) in buildings should be built above the BFEs. In a SFHA without a designated BFE, new construction and improvements made to existing structures shall have the lowest floor elevated no less than two feet above the highest adjacent grade at the building site (#5.5.3).

Base Flood Elevations can also serve as warning thresholds for flood inundation modeling. As an example, the National Water Model (NWM) (#5.5.4) predicts the discharge for about three million river reaches across the nation. The discharge is converted to an estimated water surface level using the enhanced Height Above Nearest Drainage (HAND) grid. A river reach will be highlighted by a high-risk warning whenever the NWM-derived water level exceeds the BFE.
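
That threshold comparison is conceptually simple, as the sketch below suggests. The reach identifiers, water surface elevations and BFE values are hypothetical, and a real implementation would draw the forecast elevations from the NWM/HAND workflow rather than a hand-built table:

```python
# Sketch of the warning-threshold comparison described above: flag river reaches
# where a forecast water surface elevation (e.g., NWM discharge converted through
# a HAND relationship) exceeds the Base Flood Elevation. Columns are hypothetical.
import pandas as pd

reaches = pd.DataFrame({
    "reach_id":        [101, 102, 103],
    "forecast_wse_ft": [431.2, 505.7, 212.0],   # forecast water surface elevation
    "bfe_ft":          [432.0, 504.5, None],    # Base Flood Elevation (None = no BFE mapped)
})

reaches["high_risk"] = reaches["forecast_wse_ft"] > reaches["bfe_ft"]
reaches["no_bfe"] = reaches["bfe_ft"].isna()     # the data-shortfall case discussed below

print(reaches)
# Reaches with no_bfe == True cannot be screened this way, which is exactly the
# gap the InFRM watershed assessments aim to close.
```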

Texas lacks BFE data for about two-thirds of the state, which represents a major data shortfall. The absence of BFE data means that areas at high risk of flooding cannot be accurately identified. Consequently, local flood risk cannot be assessed with confidence. A multi-agency effort to remedy the data shortfall is underway. The goal of the Interagency Flood Risk Management (InFRM) (#5.5.5) program is to perform Watershed Hydrology Assessments to produce consistent one percent recurrence interval (100-year) and other frequency flows across river basins in Texas based on the best available hydrologic information.

Economic Activity

Locating economic activity data in geospatial formats is a challenging exercise at best, especially when searching through government agency record holdings. These specialized data contain geolocation information merged with business intelligence that may be useful for the purposes of building community resilience and addressing economic vulnerabilities before a disaster strikes. Recently, new geospatial technologies have emerged in the commercial data and software space that can infer economic activity by measuring human activity in near real-time. One such example is SafeGraph (#5.6.1). Better business and economic geospatial software is also available, namely the ArcGIS Business Analyst extension (#5.6.2).

Public data records commonly take the form of reports available as PDF or other document file types that are not readily ingested for geospatial analysis or mapping applications. In the absence of geospatial economic or business data, geospatial analysts must use a geographic information system (GIS) to preprocess and build the specialized data by merging nonspatial records with spatial data in the desired geographies.
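
The preprocessing step usually amounts to a tabular join between nonspatial records and the desired geography, as in the following sketch; the file names, field names and the wage attribute are placeholders:

```python
# Sketch of the preprocessing step described above: join a nonspatial economic
# table (e.g., county-level employment exported from a public report) to county
# boundaries so it can be mapped. File and column names are hypothetical.
import pandas as pd
import geopandas as gpd

counties = gpd.read_file("tx_counties.shp")                 # boundaries with a FIPS column
employment = pd.read_csv("county_employment.csv",
                         dtype={"fips": str})               # nonspatial records

econ = counties.merge(employment, left_on="FIPS", right_on="fips", how="left")
econ.plot(column="avg_weekly_wage", cmap="Blues", legend=True)
econ.to_file("county_employment.gpkg", driver="GPKG")
```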

Data gap examples not readily found in public sources include business locations with employee and sales information and major shopping centers with attributes, such as retail sales and leasable area. Other missing geospatial data cover consumer spending categories, crime indexes, traffic volume counts and market potential data. Additional public data shortfalls involve socioeconomic and demographic compositions of neighborhoods that can only be found in reports or spreadsheets that must be downloaded and manually transformed.

In the public sector, some intermediate solutions fill the data gaps, but they often fail to match commercially licensed data offerings. One example is the U.S. Bureau of Labor Statistics (BLS) interactive map application (#5.6.3). The web-based thematic map shows variations over time in the employment and wages reported by employers in many industries at the county level. To access the raw data referenced by the website, the user must download a CSV or XML file. A similar condition exists with the Texas Comptroller's Office, which publishes the online State Revenue and Expenditure Dashboard displaying tabular data, charts and graphs that can be created on the fly, with the results limited to export as an Excel spreadsheet (#5.6.4).

Private sector sources of economic data have modernized their platforms to use RESTful web services and spatial data types that allow quick integration and use of the data. The licensed commercial data can be used in disaster response and recovery planning to provide a richer understanding of populations that are at risk and to identify business activity patterns. In the absence of public sector economic activity data prepared with necessary spatial components and made available to the public as a web data service, commercial licensing of economic data packages remains the only viable option available for disaster planning and recovery efforts.

Land Parcels / Zoning

The availability of geospatial data for land parcels and zoning in Texas has improved markedly over the past decade, as more jurisdictions and central appraisal districts (CADs) move from paper-based and Mylar records to digital formats coupled with database maintenance within a geographic information system (GIS). However, while much improvement has occurred, many data gaps remain along with other concerns that may reduce the effectiveness of the data for disaster response, recovery, and planning.

As a consequence of the dynamic nature of land parcels, many datasets are incomplete, out-of-date or inaccurate. The land parcel data set maintained by TNRIS currently contains geospatial data representing only 228 of the 253 appraisal districts in Texas. For many of the missing CADs, work to produce digital parcel data is in progress, parcels remain available only as paper maps, or the parcel records are stored in non-ESRI formats, such as MicroStation's DGN file format, that need to be converted to be shared effectively. Parcel maps change continuously as new subdivisions are built, corrections are made and ownership and land value information are updated. A significant number of jurisdictions lack the software, hardware, personnel and automation to maintain their parcel changes.

Data quality also raises concerns with geospatial land parcel data. In the majority of CADs, the issues stem from the parcel geometry not aligning correctly with roads, county and municipal boundaries, or with aerial orthoimagery. Spatial analysis of two or more adjacent counties often detects boundary matching problems, where no apparent coordination has occurred between neighboring counties (Figure 5.2). CADs also use different map projections and coordinate systems, which further amplifies the feature alignment issues.


Figure 5.2. Land parcel overlap issues between Harris and Fort Bend Counties.

Feature attribute information can also be problematic. At TNRIS, a statewide standardized data schema is applied to land parcel data that can be very helpful. Unfortunately, during the data import and conversion process, some fields can be omitted and not appear in the final version. In a further complication, some CADs only publish their data as shapefiles, which limit field names to ten characters. As a result, the fields in the original data become truncated, which may cause confusion and additional complications. The irregularities in address field data structures together with inaccuracies in the addresses themselves create still more problems. This can lead to difficulties in merging records with other parcel data.
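
One practical mitigation is to map truncated shapefile field names back to the statewide schema as part of ingest, as sketched below; the field-name pairs are hypothetical and would need to be confirmed against each CAD's data dictionary:

```python
# Sketch: restore truncated shapefile field names (10-character limit) to the
# longer names used by a statewide parcel schema. The mapping is hypothetical
# and would need to be confirmed against each CAD's original data dictionary.
import geopandas as gpd

parcels = gpd.read_file("cad_parcels.shp")      # assumed CAD export as a shapefile

field_map = {
    "OWNER_NAME": "owner_name",
    "SITUS_ADDR": "situs_address",
    "LEGAL_DESC": "legal_description",
    "MKT_VAL": "market_value",
}
parcels = parcels.rename(columns={k: v for k, v in field_map.items() if k in parcels.columns})

# Writing to GeoPackage (or a database) avoids re-truncating the names.
parcels.to_file("cad_parcels.gpkg", driver="GPKG")
```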

Centralized data sources, such as TNRIS and HIFLD, increase access to land parcel data, but these sources are updated only once a year, whereas the CADs often publish their databases daily or monthly. Depending on the publication date, records may be months out-of-date.

Zoning data suffers many of the same problems encountered with land parcel data. Accurate, easily accessible zoning data is critical to land use planning and to hazard mitigation plans. No centralized source for zoning data exists, and no adoption of common data standards has occurred amongst the various jurisdictions. Many of these have not established an effective GIS solution, a technology that enables the linking of spatial data to zoning requests, building code information and site plans that could be useful in disaster response and planning. Gaps in land parcel and zoning data can significantly cripple efforts of city planners to effectively mitigate future damages to infrastructure and impacts to vulnerable populations. Data gaps also hinder recovery efforts that require rapid access to ownership and related information immediately following storm, fire, or flood events.

Land Cover

Land cover is included in the current chapter for several reasons. First and foremost, no concerted effort has been made to create a dedicated fine spatial resolution land cover product for Texas. The most current available dataset is the 30-meter 2016 National Land Cover Database (NLCD) described in Chapter 1. The 2016 NLCD is a more sophisticated and robust data set than preceding versions and provides a more consistent representation of landscape dynamics and change, but as a national dataset, of necessity it does not reflect the unique characteristics of Texas ecoregions and land development practices. Also, because of the complexity of the NLCD project, database updates frequently lag years behind disaster preparedness needs for representations of current conditions. One encouraging development is the anticipated release of the 2019 NLCD update in late 2020 (#5.8.1). The emphasis on replicability of outcomes and process automation (#5.8.2) has resulted in more rapid updates than previously possible.

Another shortfall of NLCD data is the broad definition of land cover categories. Such categories are useful for national and regional modeling and mapping efforts but fall short in supporting disaster response, recovery and preparedness efforts. For example, NLCD includes four developed land categories: open, low-intensity, medium intensity and high intensity. These data classifications may better meet disaster planning requirements when combined with land use categories and characterized by city and county zoning information. The challenge for Texas is that no statewide compilation exists for land use data or zoning maps. Similar to land parcel mapping, land use and zoning information are maintained locally.

Texas Parks and Wildlife Department (TPWD) compiles and maintains Texas Ecological Mapping Systems (EMS) vegetation data derived from aerial photography and sophisticated image classification, a rich data set that encompasses the diversity of the Texas natural landscape. Area updates, dependent on funding availability, are rare and limited in geographic scope, with the last two made in 2014 and 2015 (#5.8.3).

The anticipated release of new 2019 NLCD impervious cover products may increase the effectiveness of the NLCD, particularly for flood, fire and heat-related disasters. The need in Texas will be to examine the data for accuracy and to develop recommendations for data use. Current and accurate data describing land cover contribute important inputs for numerical hydraulic and hydrologic models that require estimations of surface roughness and other physical characteristics associated with different land cover types. The lack of updated, high quality land cover data has negative impacts on the performance of river flooding and storm surge models. In general, spatial resolution, update schedules, and data set enrichment to meet disaster needs constitute the primary land cover data shortfalls that should be addressed by the state.

. . . . .

An open discussion should be encouraged to identify the inadequacies detected in the key data sets important to understanding and addressing the impacts of disasters in Texas. The examples given in the preceding sections do not capture the full range of data deficiencies, but they do target particular cases that can serve as guidance for the broader debate about which data products reflect the most serious shortfalls.

The organizations partnering with TDIS to develop data sharing relationships would be natural stakeholders in a collaborative effort to remove the major data gaps, first by building and prioritizing an expanded list of the substandard data products that are inhibiting progress and next by proposing solutions to improve data quality in each of the areas recognized. Through this process, it would be especially helpful to survey the TDIS partners to record each group's perception of its critical data gaps from the perspective of its own activities and plans. The partner survey could lead to a consensus view of the scope of the problem involving data shortfalls and establish progressive steps for data remediation.

An advisory group or technical committee could also consider the issue of data shortfalls. A candidate group might be the academic council advising the Interagency Flood Risk Management (InFRM) team involving USACE, FEMA, the National Weather Service and U.S. Geological Survey. InFRM members are participants in many current regional studies of coastal and inland flooding in Texas, which brings them into direct contact with data discontinuities and other shortcomings. The membership of the InFRM academic council also contains scientists and technical experts who are well-suited to evaluate new data collection and production technologies capable of filling certain data gaps in which the inadequacies stem from a failure to keep pace with technology.

The organizations supporting TDIS need to anticipate the challenges arising from the continued use of inadequate data sources that result in adverse effects on future projects and plans. As opportunities emerge, the TDIS partners should prepare to make recommendations to fund the required data improvements based upon their assessment and a consensus agreement on the data shortfalls.

Chapter 6. Best Practices in Data Curation

Organization matters.

At the level of the internal design for the TDIS data system architecture as well as in the public-facing data services and data access portals to be offered by TDIS to users, organization will matter. In the digital world, the expression of organization emerges from how the system's many parts mesh to form an efficient, functional whole reflecting a regular pattern that is immediately recognized by the user community.

In a responsive information system, organization builds confidence in the user to expect to receive well-curated data products supported by complete documentation and made available on demand by following a stable, logical set of procedures that should soon become intuitive and transparent to the user. To win the trust of diverse groups of users from the beginning, TDIS will need to adopt the best practices of enterprise data management and modern information science.

In this chapter, we emphasize the main aspects of data curation and the system infrastructure design that shares and distributes data products. The first steps involve the development of methods for effective data collection, indexing, preservation and sharing. Data descriptions must contain consistent metadata that identifies the source, date of origin, version, production techniques and other essential information required to understand the intended uses for the data and its limitations.

Not all data engaged by TDIS will be well structured, especially data arriving at the time of an emergency. Coherent information management requires a systematized approach to handling structured as well as loosely structured and unstructured data along with procedures to recognize and deal with new data sources that appear unexpectedly and without precedent during disaster response operations. Disasters often provide data rich environments, as many agencies and organizations work together to solve unique problems through informal communication networks. Unfortunately, the opportunity to collect this information fades with time, as the participants return to their normal routines. TDIS has an unmatched opportunity to capture, preserve and make available data that might otherwise disappear into inaccessible domains.

Incorporating the concepts and best practices of scientific data management allows TDIS to attain the rigor and the adaptability needed to achieve efficient data organization.

Data File Format Standards

A collection of data stored digitally and identified by a filename is called a file. A file has a format that describes the structure and data type within the file. The structure of a typical file often includes a header, metadata, saved content and an end-of-file (EOF) marker. A file format also defines whether the data is stored in binary format or plain text. Files stored in a binary format are constructed with a sequence of ones and zeros and are designed to be used for multiple purposes. Binary files are deployed as executables, databases, application data, media, documents, configuration files, libraries, drivers, file encryption and compression (#6.1.1). Plain text files, a universal open format, can be opened and viewed using a standard text editor. Text-based files are highly compatible with most applications and data processes; however, they often require more digital storage volume than comparable binary files.

In choosing a file format, data collectors should select a format that is useable, open and likely to be readable well into the future. Microsoft Excel, as an example, is a useful tool for data manipulation and data visualization, but versions of Excel files may become obsolete and may not be easily readable in the future. Similarly, database management systems (DBMS) such as PostgreSQL, MongoDB, MySQL and others are an effective way to store and query data; however, the raw formats are likely to change over time, sometimes within only a few years. As a best practice, organizations that use open-source or proprietary DBMS frameworks should create a plan for exporting the data in an open, stable, well-documented and non-proprietary format (#6.1.2). By the same logic, organizations should avoid using proprietary file formats that are difficult to export or cannot be translated into another compatible format.
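
A minimal version of such an export plan might look like the following sketch, which copies a table out of a PostgreSQL database into CSV; the connection string and table name are placeholders:

```python
# Minimal sketch of the export practice recommended above: copy a table out of a
# DBMS into an open, non-proprietary format (CSV) on a regular schedule. The
# connection string and table name are placeholders.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql+psycopg2://user:password@dbhost/disaster_data")

df = pd.read_sql("SELECT * FROM public.shelter_locations", engine)
df.to_csv("shelter_locations_export.csv", index=False)
```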

Table 6.1 is a compilation that includes some of the suggested compression, text, tabular, image, video and geospatial data file formats suitable for long-term and short-term archiving (#6.1.3) (#6.1.4) (#6.1.5) (#6.1.6).

Table 6.1. Suggested File Types for Long and Short-Term Archiving.

Usage                 Long-Term                          Short-Term
Audio                 WAV                                AIFF, MP3, WMA
Compression           7Z, GZIP, TAR                      ZIP
Database, Tabular     CSV, XML                           XLS, XLSX
Geospatial Vector     DBF, GeoJSON, KML, NetCDF          DWG, GDB, SHP
Geospatial Raster     GeoTIFF, HDF, NetCDF, TIFF         GDB, GRB
Presentation          ODP                                PDF, PPT, PPTX
Still Images          BMP, PNG, TIFF                     GIF, JPEG, JPEG2000, PSD
Text                  ASCII, ODF, PDF/A, UTF-8, XML      DOC, DOCX, PDF
Video                 AVI, MJ2, MOV                      MPEG-4
Web                   WARC                               HTML, JSON


Metadata

Any reliable geospatial data archive or online data service should have proper documentation (or metadata) that describes its data sets. By the most simplistic definition, metadata is information about data. It represents the who, what, when, where, why and how of a particular resource. Metadata may describe geospatial data sets, imagery, data catalogs, mapping applications, data models and other records.

Common elements found in a metadata document include the originator, publication date, overall description, process steps, quality assessments and attribute descriptions with explanations of the domain values for each attribute. Depending on the software used to generate or compile data, some metadata elements may be auto-generated, such as the date, map projection or the number of features. Esri's ArcGIS Desktop and ArcGIS Pro have built-in metadata creation functionality. Because the metadata connects directly to the data set, each time the data file is edited, elements in the associated metadata are updated automatically. In GIS applications, this function is referred to as feature-level metadata. Tools are provided to enter manual updates of non-automated sections, such as the description or contact information.
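
For illustration only, the common elements listed above can be captured in a compact record like the one below; a real metadata document would follow the full CSDGM or ISO 19115 schema rather than this abbreviated structure:

```python
# Illustrative only: the common metadata elements listed above expressed as a
# small JSON record. A real record would follow the full CSDGM or ISO 19115
# schema rather than this abbreviated structure.
import json

metadata = {
    "title": "Example Flood Extent Polygons",
    "originator": "Example Agency GIS Section",
    "publication_date": "2020-09-30",
    "abstract": "Flood extent polygons derived from SAR imagery (illustrative).",
    "process_steps": ["SAR preprocessing", "water classification", "manual QA review"],
    "spatial_reference": "EPSG:4269 (NAD83)",
    "attributes": {"flood_class": "1 = flooded, 0 = not flooded"},
    "contact": "gis@example.org",
    "use_limitations": "Not for navigation; see quality assessment section.",
}
print(json.dumps(metadata, indent=2))
```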

In the online world, publication of metadata and the use of keywords are crucial because they give users the ability to find relevant data using web search engines. For geospatial data, online repositories, such as ArcGIS Online and the Open Data Portal, require at a minimum basic metadata describing each data set and usually offer an additional link to more detailed metadata. In general, all levels of government should adhere to a standard for writing metadata. The most widely-adopted geospatial standards are published by the Federal Geographic Data Committee (FGDC), a component of the National Spatial Data Infrastructure (NSDI) (#6.2.1). The FGDC is “a United States government committee which promotes the coordinated development, use, sharing, and dissemination of geospatial data on a national basis.” First adopted in 1994 (and revised in 1998), the Content Standard for Digital Geospatial Metadata (CSDGM) has long served as the primary format used by agencies and organizations that publish geospatial data (#6.2.2).

In 2010, the FGDC formally endorsed a metadata standard called ISO 19115 published by the International Organization for Standardization (ISO) and encouraged federal agencies to make the transition to this more recent standard. The ISO standard improves the documentation for spatial data, services, models, sensor technologies, quality assurance procedures, collection methods, and many other features. Enhancements in 2016 improved search and discovery, access to the data and more direct access to online data services.

Digital Object Identifiers (DOIs) and FAIR Principles

Scientific data management offers a roadmap to the curation of the many forms of data that TDIS will collect and serve. Specifically, in the age of digital publishing, the adoption of FAIR Data Principles helps to enable data discovery by making resources Findable, Accessible, Interoperable and Reusable (#6.3.1). To meet the requirements to be Findable, data (or its metadata) must be carefully referenced through the assignment of a digitally unique and persistent identifier and registered in a searchable resource. Data (or metadata) that are Accessible must be retrievable by their identifier using a standardized communications protocol. To be Interoperable, data (or metadata) must use a formal, accessible, shared and broadly applicable language. And Reusable data (or metadata) are described with accurate and relevant attributes. Without specifying an absolute standard, the FAIR Principles enable data products of many kinds and their associated metadata to be machine searchable across global networks.

TDIS will be expected to deal with large volumes of geospatial information and other highly structured digital data used interactively by data application services. Furthermore, traditional forms of records (PDF reports, schematic graphics, scanned papers, field notes, etc.) will also require organization by TDIS. For this purpose, the FAIR guidelines offer a coherent strategy for data curation of many data types. A wide variety of different media and document formats can be tracked and retrieved through the attachment of a Digital Object Identifier (DOI) linking to the object's metadata, thus permitting electronic discovery by automated search engine routines. DOIs allow data to enter into communities using open data exchange standards and open-source data repository software, such as Dataverse (#6.3.2).
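
Because DOIs support standard content negotiation, the metadata behind an identifier can be retrieved programmatically, as in the sketch below; the DOI shown is a placeholder:

```python
# Sketch: retrieve the metadata behind a DOI using standard DOI content
# negotiation. The DOI shown is a placeholder; any registered DOI that supports
# content negotiation (Crossref/DataCite) should respond similarly.
import requests

doi = "10.1234/example-doi"   # placeholder identifier
resp = requests.get(f"https://doi.org/{doi}",
                    headers={"Accept": "application/vnd.citationstyles.csl+json"},
                    timeout=30)
if resp.ok:
    record = resp.json()
    print(record.get("title"), "-", record.get("publisher"))
else:
    print("DOI could not be resolved:", resp.status_code)
```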

Scientific data management is a rapidly evolving field. It can make enormous volumes of “big data” available to an individual who searches for a single piece of information as well as to the agents of high performance computing systems that scan and merge data from repositories around the world.

Data Sharing Standards

Data sharing standards are essential when any organization attempts to build access to data. Choosing to share something without adopting a standard or with undocumented standards arguably is not sharing data at all. As indicated in the preceding section regarding the FAIR Principles, to be discoverable data must adhere to at least the minimum guidelines for accessibility. If the goal of providing access to data is the use of that data, then the Data Sharer must adopt data sharing standards. Many standards are worth consideration, and methods used to decide which one is the best for a given implementation are well documented.

"What is the interface for sharing?" is a good question to begin the discussion. TDIS is a collaboration of partners who are trying to facilitate access interactively. That is, the access is often made and parsed as part of a workflow, model or application that calls out to its many partners. Current methodologies for data services often use REST or RESTful mechanisms and transfer data via JSON. Well-documented standards are readily available for current uses of both methods. Libraries for parsing JSON exist in all major development languages, and REST mechanisms are built upon the foundational processes that power the Internet. Also, all major languages and frameworks offer libraries to connect and create REST-based services.
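
A few lines are usually enough to consume such a service, which is part of the appeal. In the sketch below, the endpoint, query parameters and response fields are hypothetical stand-ins for a future TDIS partner service:

```python
# Sketch of consuming a REST data service that returns JSON. The endpoint and
# response fields are hypothetical stand-ins for a future TDIS partner service.
import requests

url = "https://example.org/api/v1/gages"          # hypothetical partner endpoint
resp = requests.get(url, params={"basin": "trinity", "status": "active"}, timeout=30)
resp.raise_for_status()

for gage in resp.json().get("features", []):
    props = gage.get("properties", {})
    print(props.get("site_id"), props.get("stage_ft"))
```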

These factors combine to invite partners and users to discover and use data services. In the context of TDIS, this also creates a low bar to entry. It follows that ease of access translates into more adoption. How can it not? Ancillary benefits that come from adopting standards of this kind include ubiquitous example code and documentation for both creating and consuming the data being shared. Also, as part of the many libraries alluded to above, tools exist to transform data from other formats into JSON. An agreement to a set of standards means that the partnering organizations can develop expertise in working with the different standards and models to build community and reinforce their collaboration within TDIS.

. . . . .

Operating under the principles of best practices for data management, TDIS will host a scalable system architecture with data system security assured by the processes described in Chapter 3. The agreements with data sharing partners should permit access at the database level whenever possible in order that seamless data transfers can occur without the danger of exposing sensitive information. A transparent, effective system traces all data from both internal and external sources with equal attention paid to the metadata descriptions for all information and thorough records for data sources. The data service applications constructed for TDIS must avoid the failures of incomplete data documentation and process description. To the greatest degree possible, there should be no black boxes left behind to perplex the user.

In the concepts presented here, data discovery reigns preeminent and can be accomplished through the application of scientific data management across the federated system that TDIS enables.

Chapter 7. Cost-Benefit of Information System Design

There are many ways to build out the architecture of a system. A discussion of two frequently used designs will compare some of the features of each. For the scope of this discussion, monolithic or centralized systems build the components of the system's architecture and host them on premises or in the cloud through a cloud provider. Other solutions, such as distributed or federated systems, are quite popular and have a different set of characteristics. Federated systems present a flexible architecture solution addressing complex problems of scale and resilience. As is often the case, the fluid nature of information technology terminology requires us to provide a definition for purposes of this discussion. Here a federated system typically means that different organizations can work as partners by invoking and sharing public services and processes. There is an implication that these different groups are also distributed and distinct from one another. As we know, TDIS by one broad view is a collection of partnerships. So, a federated system here implies that these partners would then have public services that each could invoke. Some of these services might require different levels of authorization, which is to say that some might be completely accessible to all and others not.

Contrast a federated model with a centralized system. Recall that a centralized system is one where the services are deployed and managed by one entity. Hidden within that definition are some important details. Are these services publicly accessible or only accessible to the central entity? Is the entity using a single data center for the system, or is the system hosted in the cloud? By centralizing the deployment of the services there are tangible benefits. Such a system can reduce the complexity of the architecture and the discovery and access of services. Also, a single entity often simplifies the designation of points of contact for access or information. Further, a sole source also means that the points of contact for issue resolution are more intuitive. There is a perception that centralized systems offer a one-stop shop for the services that the system provides.

If we step back though and look more closely, what we are really considering here is where the data, data services and processes behind those services mentioned are housed. Services are most often data driven. The most fundamental questions about system architecture concern data, ranging from the conceptual (data model) to process (data inputs and updates) to reliability (data access and archiving). Each of these may have many levels of consideration that must be addressed.

Thinking more on this, one can see that even in a centralized system (Figure 7.1), some of the initial concerns must still be addressed by whoever acts as the data owner, whether data is internal or external to the central system. That is, centralized systems might create some data and collect and migrate other datasets as part of partnerships and collaboration. In either case the implementation of the services based on those datasets is still affected by the data model as are the intent, scope, resolution, and age of the data.

[Figure 7.1 diagram: representation of a centralized system, including data rehosted from some partners and workflows for processing rehosted data.]

Figure 7.1. Centralized System Conceptual Model.

At their most fundamental, federated systems are about services built to agreed-upon standards. In general, the expectation is that certain federated services will behave in certain ways and be updated with certain frequencies. The federated model empowers users to build their own solutions on top of a predictable set of well-maintained data and processes accessed through integration with multiple sources via a services model. Often different levels of access for different datasets and services will be implemented. Data owners can continue to maintain and update data that feeds data services. No other partner need concern themselves with the steps needed in the creation or maintenance of the data.

Looking more deeply at data maintenance provides additional insight. Most often as part of the agreement of the partners, data age will be documented and understood. Understanding the refresh cycle of the data though means that certain decisions can be made about the appropriateness of a dataset for solving a given problem. The established protocol that the data owner is updating and providing a data service ensures that the latest available data is retrieved when accessing the service.

All of these are good reasons to deploy federated systems. But there would be no other solutions if this one had no drawbacks. By their nature, federated services hosted on partner architecture introduce a potential bottleneck into the collective system. For example, a data service might be brought down by circumstances beyond the control of managing partners. Other infrastructure, policy changes or unforeseen partner situations could all affect the availability of data that the rest of the partners might require. There might even be other business processes opaque to partners that could impact access. Whenever one group controls a dataset there is at least a potential for access issues. Among the possible approaches to mitigate some of these problems, rehosting the data is the most immediate and obvious solution. Rehosting presents issues we will consider later with a centralized data services model.

Are these drawbacks enough of an argument to reject system federation? Despite the potential problems described in the previous paragraph, a federated system is suitable for most data serving needs. Yet in recognition that one sole solution cannot meet the complex issue of data sharing, we must consider other alternatives.

Some potential TDIS partners will want to collaborate but for a variety of reasons will be unable to act as their own hosts for data services. Such partners could logically participate as data creators. Data creator partners might continue to host data as prepackaged datasets. Some of these data creators might not even want to broadly host their data for download even if they are willing for other partners to access their data. These likely scenarios highlight where federation is not a suitable answer.

In contrast, what are the issues related to centralizing data and data distribution? Timeliness of the data, lack of dataset comprehension and a host of other metadata-related issues come to the fore. If an entity is not the data creator for a needed dataset, then a lag between creation and access is unavoidable. Also consider the case of a data owner whose data is desired by a large user base but who does not have the means or, perhaps, the desire to host access to that data asset. The workflow necessary to transfer that data is the weakest link. Recognizing that fact and ensuring that the process is the most efficient possible is the only way to minimize the effects of this issue. There is a similar but lesser concern if the data owner is hosting data but not providing a data service. At a minimum, the data owner should commit to a predictable data update pattern. There should be at least an understanding of a commitment to some pattern for updating the data. Processes are also required on the eventual host to collect, check and then rehost the data. The difference between these two scenarios is one of degrees rather than type.


A bigger concern when a data creator is hosting versions of data can be called version fragmentation. Imagine the following scenario: two different groups are interested in using data from a creator. Group A sets up a mechanism to acquire the data and process it into an accessible data service. Group B acquires the same data but at a different frequency or perhaps only once. Each group integrates the dataset into their decision making and workflows for business processes. Notice what has happened. While both groups acquire data from the same data creator, a version drift that we are calling fragmentation could quickly develop. During a critical response, the existence of multiple versions of the same data could lead to faulty decision making. Decisions based on out of date data are possible. Confusion arises when different analyses lead to conflicting results caused by the use of data of different vintage. This problem is very real and shows itself in many unexpected ways.
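To make the fragmentation problem concrete, the following minimal Python sketch compares checksums of the copies held by different groups to detect version drift. The file paths and group names are hypothetical placeholders; this is illustrative only and not part of any existing TDIS tooling.

```python
import hashlib
from datetime import datetime, timezone


def dataset_fingerprint(path: str) -> str:
    """Return a SHA-256 checksum of a local copy of a dataset file."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_for_fragmentation(copies: dict[str, str]) -> bool:
    """Warn when two or more groups hold different versions of the 'same' dataset.

    `copies` maps a group name to the path of that group's local copy.
    """
    fingerprints = {group: dataset_fingerprint(path) for group, path in copies.items()}
    if len(set(fingerprints.values())) > 1:
        print(f"{datetime.now(timezone.utc).isoformat()}: version fragmentation detected")
        for group, fp in fingerprints.items():
            print(f"  {group}: {fp[:12]}...")
        return True
    return False


# Hypothetical local copies held by two partner groups:
# check_for_fragmentation({"Group A": "parcels_a.gpkg", "Group B": "parcels_b.gpkg"})
```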

When data is rehosted, the separation between creation and hosting often results in misunderstandings about the appropriate uses of the data and about the dataset itself, particularly when metadata is incomplete or limited. Is the data only appropriate at particular scales or for a particular set of circumstances? These and other details are critical for appropriate use. Often when decisions need to be made quickly on the inputs for a model or analytical process, it is cumbersome to gather the data experts and to reach consensus on the data. When the data owner is not the data host, a disconnect is more likely. Incomplete knowledge about the data model, the data coverage or the data attributes can all play a role in this disconnect or misuse.

Standard considerations in building out an architecture are resiliency and continuity of service. Here centralized systems have an advantage over federated systems. It is simply easier to agree on a single plan for data archives, backups, scaling and clustering than to coordinate many plans. A system like TDIS will have to address these issues. Data archiving at the very least means that there will be a historical record of different event-based datasets, which will be used in recovery projects and studies. There is no shortcut here. Archiving historical inventories will be a priority in TDIS. Closely related to archives conceptually are backups. While archives are designed to be accessed by users, backups are more closely related to the administration of an architecture. If a data service fails or an archive becomes unavailable, then a restoration from backup might be the remedy. How does one resolve these questions in a federated system?

A federated system need not be merely a collection of equal peers. By necessity there will be one person or group of responsible persons serving as First Peer, a concept discussed hereafter. Even without such a formal designation or recognition, each partner or peer will likely invest in its own archive and backup processes. With even a handful of partners, it becomes difficult to manage those independent processes and to orchestrate the remedies they are designed to provide; the lag time alone makes such an arrangement unrealistic. That leaves us with the idea of First Peer, and one of the best mitigation strategies is to have this partner act as a centralized system for these functions. This partner will ultimately be responsible for the archive and backup processes. While partners will surely continue their own archives and backups, the First Peer should strive for as cohesive a process for each action as possible. And while not required, the First Peer is the logical host for the data creators already mentioned that need help hosting data or data services.

Exploring this concept of First Peer yields other results. The idea behind an architecture and system like TDIS requires direction, instructions, and discovery mechanisms. Using the centralized model for these functions makes sense, if only because of the unrealistic alternative of trying to use many different sites to piece together how to use or access specific data services or data sets. Among the necessary services, the First Peer acts as a switchboard (Figure 7.2) for users and partners. One example could be a partner or user searching for datasets and services by specific characteristics and spatial extents. A related case could be filtering desired data by current availability or date range. The First Peer is also the entity providing comprehensive help and documentation.

Figure 7.2. Concept Diagram for Federated Services with TDIS. (Diagram: the First Peer acts as a switchboard/directory of resources, connecting partners and users across the Internet and partner networks.)
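As an illustration of the switchboard concept, the following minimal Python sketch filters a small, hypothetical catalog of registered data services by theme, spatial extent and data currency. The entries, field names and host labels are invented for the example and do not represent actual TDIS services.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CatalogEntry:
    """One registered data service in the switchboard directory (illustrative fields)."""
    name: str
    theme: str
    bbox: tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)
    last_updated: date
    host: str  # "first-peer" or the federated partner responsible for the service


def overlaps(a, b) -> bool:
    """True when two bounding boxes intersect."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])


def search(catalog, theme=None, bbox=None, updated_since=None):
    """Filter the directory by theme, spatial extent, and data currency."""
    results = catalog
    if theme:
        results = [e for e in results if e.theme == theme]
    if bbox:
        results = [e for e in results if overlaps(e.bbox, bbox)]
    if updated_since:
        results = [e for e in results if e.last_updated >= updated_since]
    return results


# Two invented directory entries, one hosted centrally and one by a partner.
catalog = [
    CatalogEntry("QPE 24-hour", "precipitation", (-106.6, 25.8, -93.5, 36.5), date(2020, 10, 1), "first-peer"),
    CatalogEntry("High Water Marks", "flooding", (-97.5, 28.0, -93.5, 31.5), date(2018, 6, 1), "partner"),
]
print(search(catalog, theme="flooding", bbox=(-96.0, 29.0, -94.0, 30.5)))
```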

Is there a working proof of concept for this idea of a switchboard? One of the most intuitive and useful interfaces for accessing data within an organization is a commercial off-the-shelf product from software vendor Esri called Open Data Portal. Open Data Portal provides standard processes to register services to populate the portal with data services. The search tools allow users to either browse by category or search by keywords. Results can be further filtered by geographic areas, thematic categories, or data format. Organizations may also customize the portal to ensure a consistent branding.

Weighing all of the complex questions raised above shows that choosing a single approach means accepting its particular strengths and weaknesses. But nothing prevents a hybrid solution that combines the strengths of both approaches to address the issues found in each. Broadly, TDIS could host a catalog of data services that are discoverable by users and partners. The data

behind the services could be local to TDIS' First Peer or hosted by a Partner. The switchboard could house data documentation, process documents, and run books or instruction guides for each data service that help users understand metadata about the data service and the underlying data. The architecture for the data services will hinge on the responsible partner. Some will be local to the First Peer and some will be distributed by the other partners. Distributing the load to the members where possible ensures timely data built near the data owners. Centralized data services provide those partners who cannot host their own data a method to contribute valuable information to TDIS. The centralized system can unify how to archive and back up the data so that the threat of data loss is minimized. Table 7.1 illustrates a comparison of the pros and cons of both federated and centralized systems.

By breaking out of the dogma that one approach or another must be used, a vision of a resilient, timely, accessible, and comprehensive system materializes. Enabling and including all levels of data providers rewards collaboration and strengthens partnerships. Flexibility also makes the bar of entry much lower and invites new connections. The primary ideas of timely data served to all and of data services as an API create momentum for TDIS, proof that the system is built not for a single event but for many events, not for a single use but for many uses.

Table 7.1. Comparison of the Pros and Cons of Federated and Centralized Systems.

Federated System – Pros:
- Data Currency
- Distributed Loads – partner services hosted remotely
- Data Creator also Serves Data
- Easier to implement for some Data Owners
- Scalability is cheaper
- Easier for services to be multipurpose – not strictly for TDIS
- Overt Expression of Collaboration & Partnerships

Federated System – Cons:
- Can be more complex
- Opaque Data processes
- More Points of Failure
- Fail-over is complicated and not orchestrated

Centralized System – Pros:
- Lower Bar for inclusion of Data Owner
- Single Source for Contacts
- Easier for Archives and Backups
- Can be Simpler than Federated
- Easier to set up Failover/Clustering

Centralized System – Cons:
- Data Timeliness – Data disconnected from Data Owner
- Data Fragmentation – Many versions of data may exist
- Data Unfamiliarity at Data Hosts
- More Expensive for Scaling Systems – single agency
- Tendency for Lack of System Transparency
- Requires many internal processes to build out partner Data Services



Chapter 8. Critical Data and Application Development

The early stages of building TDIS will offer an opportunity to test the foundational principle of a hybrid federated data management system with the initial core of partners who are willing to share their data and data services through secure data protocols. Following the partnership agreements to implement data system security and adhere to the best practices of data curation to enable effective data sharing, TDIS and its partners will develop the first series of prototype data applications that illustrate what can be achieved through cooperative effort in terms of real-world results.

In this chapter, we discuss some early prototypes developed by UT-CSR as simple examples of recurrent data retrieval from a partner's active archive or the integration of a data feed from a partner's data service. The resulting data services match those that are typically deployed during a crisis, when demand rises quickly for a particular combination of data sets to be analyzed in context with one another. For instance, moving beyond the examples described in the following sections, one could imagine a use case calling for a web service mapping application in which the NEXRAD-derived Quantitative Precipitation Estimate regularly updated by the National Weather Service is displayed with the rain gage observations reported by the TWDB Texas Mesonet and overlaid upon the most recently collected radar satellite imagery processed by UT-CSR to reveal the cause and pattern of a regional flood event in progress.

The first TDIS application prototypes should be uncomplicated examples constructed on a test bed to familiarize the collaborating partners with how data sharing could be efficiently implemented through the federated system in which the partners manage their own data and make their choices to share selected products or data services. Initial concepts for application development could be evaluated to find the best pathways between internal and external data sources and the most frictionless ways to exchange data with the partners using data services or web service applications. Where necessary, data can be transferred to an agreed upon partner to process and host the data for partners unable to do so. Once the TDIS partners reach agreement in general principle, the creativity released by the seamless interactions may lead to immediate benefits for all and spark further collaboration.

NOAA QPE

NOAA through the National Weather Service issues estimates of observed rainfall amounts in hourly, 3-hourly, daily and longer time series using data collected by the Next Generation Weather Radar (NEXRAD) installations with additional data and verification provided by local networks of rain gages. The Quantitative Precipitation Estimate, or QPE, is the primary data product distributed on a recurrent basis and is publicly available from an NWS archive.

The NOAA QPE is a useful and important data set in the context of flooding caused by heavy rainfall. When flood disasters occur, the most useful QPE version tends to be the 24-hour rainfall accumulation data set. Intersecting the estimated rainfall over the past 24 hours with areas where ground reports of flood conditions are received indicates the extent and magnitude of possible flooding, by comparing the reported locations with the coincident estimated rainfall amounts. NOAA through NWS provides access daily to the 24-hour QPE for the conterminous United States. The data sets are posted every morning to a file repository.

For the QPE data, UT-CSR built a procedure to gather the data set and run it through 13 processing steps to convert it into a pair of data services that can be accessed locally or remotely. As a proof of concept, the data is regularly included as part of the UT-CSR Modeling, Observation and Visualization for Emergency Support (MOVES) application in near real-time data feeds. NWS provides QPE in a specialized format, necessitating the intensive post-processing prior to display in MOVES. The need to develop the QPE workflow illustrates that there will likely be multiple methods of access for data inclusion in a dynamic data information system. Some of the methods require processes to be developed and implemented that can re-home the data local to the information system. The QPE is a great example of that kind of data.
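As a simplified illustration of the recurring retrieval step only (not the 13-step UT-CSR workflow), the following Python sketch shows how a daily 24-hour QPE file might be collected and archived on a schedule. The repository URL and file naming convention are hypothetical placeholders; the real NWS archive layout and file format would drive the downstream processing.

```python
import shutil
from datetime import date
from pathlib import Path

import requests

# Hypothetical repository location and file name; the actual NWS archive differs.
QPE_REPOSITORY = "https://example.noaa.gov/qpe/24hr"


def fetch_daily_qpe(day: date, out_dir: Path = Path("qpe_archive")) -> Path:
    """Download the 24-hour QPE posted for a given day and keep a local archive copy."""
    out_dir.mkdir(parents=True, exist_ok=True)
    url = f"{QPE_REPOSITORY}/qpe_24hr_{day:%Y%m%d}.nc"
    destination = out_dir / url.rsplit("/", 1)[-1]
    with requests.get(url, stream=True, timeout=60) as response:
        response.raise_for_status()
        with open(destination, "wb") as f:
            shutil.copyfileobj(response.raw, f)
    return destination  # subsequent steps would reproject and publish as a data service


# fetch_daily_qpe(date.today()) would be run from a daily scheduled job.
```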

Challenges occur in the data delivery method, in the processes required to turn the data into useful data services and in archiving the data. Understanding that this is going to be the only method for some partners to provide data, and that subsequent, unique steps are necessary to bring the data in line with the rest of the project, is an important consideration. Although we envision scripting and workflows common to many data sets, it is inevitable that some tasks must be tailored to individual data requirements, given the wide variety of data resources. This still requires understanding and cooperation between the data provider and the information system. It should also be understood that space must be allocated for reworked data and data services in the context of data archiving.

Texas Mesonet

The Texas Mesonet API and its data, developed by the Texas Water Development Board (TWDB), represent a great example of forward-thinking web services. To begin, the website has documentation for the web services and example syntax. A crucial step in building data services that is often overlooked, documentation empowers both known users and future or unexpected users to explore the data. This is the kind of accessibility that can ensure viability and lead to recommendations from users and partners. While the Texas Mesonet site has

great data, the developers did not stop there. TWDB also created an application that exists loosely coupled to the API. That is, they built the API data services and the application as separate modules. What better way to prove confidence in and further test the API than to build the production viewer as a proof of concept for the data services? The data format is JSON, a modern de facto format for use on web and mobile applications. A text-based data format, JSON can be processed by most programming languages in use today. There is room for improvement though, even with the Mesonet, which we will discuss later.

The Texas Mesonet API subdivides its data into eight categories. The first two are the stations themselves and the current data. The other six are the different weather metrics broken out to allow historical queries against the data. This approach illustrates an important consideration that is often overlooked in building data services: the concept of time. The historical data is particularly important in disaster recovery. The data really serves two different user bases here. Recovery most often uses historical data, while response efforts ideally require real-time data. Data used in response quickly becomes stale, and the tactical nature of response means that the older the data is, the less important it is for 'taking action right now'. From a recovery standpoint, however, having a historical archive of data means that workers can step back into the event and look at what the conditions were at an arbitrary point in time.

The Texas Mesonet API epitomizes the kind of data service that UT-CSR recommends as suitable for TDIS. Implementing the data into the UT-CSR MOVES application proved to be straightforward; with the data service documentation and API URLs published, it became only a question of implementation details. UT-CSR recommends some minor improvements. The API uses prior minutes as the means to retrieve archival data, a time unit that is often less intuitive for a user than dates or date/time stamps. Users must convert a time query into prior minutes in order to access a desired slice of time. In addition, data for the different stations is not standardized. Although less than desirable, the lack of standardization does not diminish the usefulness of the service overall. Rather, this limitation encourages future iterations of this API and other data services to seek data standardization as a goal. Overall, the Texas Mesonet API data services are well constructed, useful and pertinent to disaster-related applications, a good model for future TDIS development.
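To illustrate the prior-minutes conversion described above, the following Python sketch converts an absolute timestamp into elapsed minutes before requesting archival observations. The endpoint path, parameter names and station identifier are hypothetical stand-ins rather than the documented Texas Mesonet API values.

```python
from datetime import datetime, timezone

import requests

# Hypothetical base URL and endpoint; the documented Texas Mesonet API URLs should be used.
MESONET_BASE = "https://example.texmesonet.org/api/v1"


def minutes_since(timestamp: datetime) -> int:
    """Convert an absolute timestamp into the 'prior minutes' value the API expects."""
    elapsed = datetime.now(timezone.utc) - timestamp
    return max(0, int(elapsed.total_seconds() // 60))


def fetch_rainfall_since(station_id: str, start: datetime) -> dict:
    """Request archival rainfall observations for a station from a given start time."""
    # "priorMins" is an illustrative parameter name, not the documented one.
    params = {"station": station_id, "priorMins": minutes_since(start)}
    response = requests.get(f"{MESONET_BASE}/precipitation", params=params, timeout=30)
    response.raise_for_status()
    return response.json()  # JSON payload, as served by the Mesonet data services


# Example: observations since a chosen instant at a hypothetical station id.
# fetch_rainfall_since("TX123", datetime(2020, 8, 26, tzinfo=timezone.utc))
```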

USGS Hurricane Harvey High Water Marks Viewer

The USGS Hurricane Harvey High Water Marks Viewer (#8.3.1) exemplifies one of the many mapping applications developed by UT-CSR. After the landfall of Hurricane Harvey, USGS and other federal agency field teams collected 2,123 high water marks (HWM) in Southeast Texas. The collection of HWMs was later used to create 19 inundation maps to document the extent and depth of the regional flooding.

UT-CSR obtained the point layers, reviewed them for quality assurance and published the results as a publicly available ArcGIS Portal feature layer. Once published, the HWM feature layer and the related Peak Summary Points data were used to create an interactive ArcGIS Portal mapping

application using the Web AppBuilder feature to make the HWM data more accessible to the disaster recovery community studying the effects of Hurricane Harvey.

The viewer shows the spatial extent of the HWMs and Peak Summary Points in Southeast Texas in conjunction with an Esri Streets base map. A link to a user guide can be found at the top of the viewer window. Navigation tools are provided to zoom and pan when navigating the map display. As an option, the user can search by address or place. When zoomed in for detail in the coastal region, the elevation contours derived from the StratMap hypsography layer become visible.

Two tools enable the user to examine the individual data layers in the viewer. A Legend tool allows visualizing the symbology. A Layer List tool gives the user the ability to turn layers on and off, set layer transparency, choose the map scale at which data turns on and view the attribute table. A Show Item Details option takes the user to the metadata and underlying map service for that dataset. Additional tools allow the user to change the base map or add other feature layers from ArcGIS Online or from spatial data layers stored in a local GIS. A filter tool permits users to filter by the data quality of the HWM or height above ground.

High water marks are collected after storm events to determine the peak water surface height, especially in areas where no stream gages exist to record measurements. Since the teams of high water mark collectors are rarely in the field during the flood peak, they must look for visual signs (or HWMs), such as debris or seed lines on buildings and other surfaces or still water stains inside structures. The location coordinates, type of HWM, the date collected, the assessed quality of the HWM, elevation and other details are attributed for each point. Hydrologists and other scientists use the data for validating and calibrating points in their hydraulic engineering models. These attributes in combination with hypsography and other digital terrain data can be used to estimate how much land in the local vicinity would become inundated at various flood levels. Information of this kind is vital for the development of inundation maps that reveal the impacts of flooding in an area.
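As a simplified illustration of combining an HWM with terrain data, the following Python sketch compares a surveyed water-surface elevation with nearby ground elevations to estimate inundation depths; all values are invented for the example and do not come from the Harvey dataset.

```python
def inundation_depths(water_surface_elev_ft, ground_elevs_ft):
    """Estimate inundation depth at sampled ground points for one high water mark.

    water_surface_elev_ft: elevation of the water surface recorded at the HWM.
    ground_elevs_ft: ground elevations sampled from terrain data near the mark.
    Returns depth above ground (feet) where the ground lies below the water surface, else 0.
    """
    return [max(0.0, water_surface_elev_ft - g) for g in ground_elevs_ft]


# Illustrative values only: an HWM surveyed at 22.4 ft and four nearby terrain samples.
depths = inundation_depths(22.4, [18.1, 20.9, 22.8, 25.0])
print([round(d, 1) for d in depths])  # [4.3, 1.5, 0.0, 0.0] -> first two points under water
```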

GLO Post-Ike Projects

Following Hurricanes Ike and Dolly in 2008, the GLO, as part of the Texas Coastal Resiliency Study (TCRS), contracted the Chicago Bridge & Iron Company (CB&I) to identify and document projects that would facilitate the protection of critical infrastructure in the coastal area from future storms. After the critical infrastructure assets were recorded within 22 of the coastal counties most prone to hurricane damage, existing projects were identified, and new projects were initiated with the goal of mitigating future damage to the assets. The projects were compiled in a master list, ranked, and categorized by a risk score based on their vulnerability and other factors.

UT-CSR created an ArcGIS Portal application entitled the GLO Identified Projects Viewer to visualize the geographic distribution of the identified projects (#8.4.1). Built with the Web AppBuilder technology, the viewer contains many of the standard tools for navigation within the map display, searching by address and place, inspecting the attribute table and viewing the

metadata and underlying map service information. The points on the map display can be clicked to open a pop-up window containing basic information, such as the project name, description, the type of infrastructure, status and risk score. A link in the pop-up window will take the user to a PDF document of the project summary that includes additional information, such as the project ranking details. A key is also provided for further information describing the consequence ranking and risk score.

The viewer also provides three ways to query the data and three charting options. Under the query tool, drop-down lists are provided allowing the user to choose a particular risk score, type of infrastructure and whether the project is new or existing. The locations that meet the user's chosen criteria are highlighted on the map, and statistics on the number of selected projects are provided. The Chart tool displays pie charts showing the distribution of values for risk score, vulnerability zone, and consequence scale.

While the TDIS prototype data applications introduced in this chapter are small-scale examples, they point the way toward more ambitious collaborative projects. Creating an environment within TDIS for multi-agency cooperation to build custom data applications should become a major goal. Achievement of that goal will be both a validation of the underlying data services, and a key to success that leads to expanding opportunities through the participation of both public and private sector organizations and academic institutions.

Chapter 9. Uses of Probabilistic and Deterministic Approaches

Recent advances in numerical modeling techniques appear to have opened the possibility of creating accurate forecasts of future disasters for real-time response operations and for long-range planning in disaster-prone regions. Whether we have arrived at the stage in which probabilistic and/or deterministic modeling approaches to the outcomes of future disasters could be applied with a high level of confidence to direct recovery and mitigation plans remains an open question. In this essay, the points to be examined will consider the validity and usefulness of employing numerical forecast models to generate the results of potential disaster impacts and their physical and socioeconomic consequences in Texas. One should take into account that the discussion here reflects only the current state-of-the-art, which could advance rapidly through the introduction of new techniques. The examination in this discussion could also be expanded to include many additional examples and more detailed analyses of some of the particular aspects presented, but a truly comprehensive summary would not easily fit within the scope of a single chapter.

Our focus is placed on the comparative value of different numerical approaches. For the modeling of disasters, deterministic approaches apply the laws of physics to solve the equations governing a physical process in the real world, such as the shallow-water equations derived to resolve the mechanics of storm surge and wave dynamics. Often relying on computational fluid dynamics, deterministic models are developed to leverage the power of high performance computing using massive parallel processing to calculate their results over a spatial domain defined by a grid of points. Whether the process to be described involves a coastal storm surge, inland river flood or wildfire behavior, the results of each processing step are passed through the grid points to produce a time series of snapshots that represent changing physical impacts.

Probabilistic approaches apply a different mathematical description of the real world that does not necessarily invoke a detailed description of the physical interactions occurring in the environment. Probabilistic analyses attempt to account for randomness and the fact that chance occurrences can dramatically alter the final outcomes obtained from exactly the same set of initial conditions. In the study of disasters, probabilistic numerical models often begin with a set of historical observations, which will include extreme events, and then determine the likelihood of future events taking place. This approach has been most fully developed by applying Bayesian statistics, in which a prior distribution is chosen to describe the possible outcomes and their probabilities, and the observed data are used to construct a statistical model of event likelihood. The prior and likelihood distributions are then combined to yield a posterior distribution that can be used to make predictions. Bayesian techniques are most often applied to large spatial domains, such as entire geographic regions, and are used to account for trends that may indicate increasing or decreasing probabilities of an occurrence.
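The combination described above can be written compactly as Bayes' rule, shown here for a generic quantity of interest x (for example, event intensity) given observations D:

```latex
% Posterior = (likelihood x prior) / evidence
P(x \mid D) = \frac{P(D \mid x)\,P(x)}{\int P(D \mid x')\,P(x')\,dx'}
```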

Both probabilistic and deterministic modeling techniques take advantage of computing power to develop ensemble results that include many computational runs that contain degrees of randomness in the case of probabilistic modeling or that make changes to the physical parameters in the case of deterministic models. In either instance, the ensemble results are useful to describe the wide range of possible outcomes and their likelihood.

An example of probabilistic modeling used with a rich set of historical observations can be viewed in the NOAA Atlas 14 products discussed in Chapter 4. The probabilities of extreme rainfall events of different durations are calculated over a range of point locations and used to define the events to be included in the deterministic rainfall-runoff models and hydraulic models of river flows to develop flood inundation maps. Another use of the probabilistic approach is applied in the P-Surge product computed by the National Hurricane Center, when storm surge warnings are issued for hurricane landfall. The results of an ensemble of deterministic storm surge runs containing variations in track location, wind speed, wind field size and storm movement are combined in the P-Surge product to calculate the probabilities of surface water heights and inundation depths for different locations in the landfall region.
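As a simplified illustration of how ensemble results translate into probabilities (not the actual P-Surge algorithm, which also weights ensemble members by their likelihood), the following Python sketch computes the probability of exceeding a water height threshold at one location from a set of deterministic surge runs with invented values.

```python
def exceedance_probability(ensemble_heights_ft, threshold_ft):
    """Fraction of ensemble members whose simulated water height exceeds a threshold.

    ensemble_heights_ft: peak water heights at one location, one value per member.
    """
    exceed = sum(1 for h in ensemble_heights_ft if h > threshold_ft)
    return exceed / len(ensemble_heights_ft)


# Illustrative ensemble of ten deterministic surge runs at a single coastal point.
heights = [3.1, 4.8, 5.5, 6.2, 4.1, 7.0, 5.9, 3.7, 6.6, 5.2]
print(exceedance_probability(heights, threshold_ft=5.0))  # 0.6
```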

The most significant use of the probabilistic approach to disaster impacts occurs in the financial sector, particularly in the insurance and global re-insurance industries. For these purposes, the insurers must determine how to manage their liabilities in a world of increasing risk from extreme weather events. Probabilistic modeling approaches are well designed to provide guidance in these situations, particularly on a regional scale. Private sector risk consultants, such as AIR Worldwide, have developed Bayesian approaches to forecast potential impacts from different natural disasters and offer their analyses to the insurers.

In this case, the probabilistic models generate results that represent the changing risks created by the insurers’ exposure to the potential losses of their policyholders. The insurers can then make decisions to hedge their risk by setting higher future rates, excluding areas from their programs, or simply refusing to sell future policies in a region. The latter decision can be sudden and disruptive, as in the case of the state of Florida following the 2004 hurricane season, when four strong hurricanes caused extensive damage. Private insurers recalculated their risk and withdrew their windstorm insurance from the Florida market, effectively forcing the state to become the insurer of last resort. Note that the 2004 decision by multiple companies applied to the entire state. The factor of an all-or-nothing tipping point of this kind seems to be one of the negative consequences of the adoption of probabilistic numerical modeling by the insurance industry.

The non-location-specific character of most probabilistic approaches raises a concern for their application to response operations and disaster recovery, where a site-specific location can be a very important factor. Consider an incident from the 2016 Sabine River Flood, when the

river set a new record exceeding the level reached by the previous record flood in 1884. As the floodwaters were rising, the Sabine River Authority attempted to save a major pumping station in Newton County by building a large sandbag structure around the pump house. The water rose high against the sandbag walls, and the defense almost worked, but in the end the station with a value of over $50 million was destroyed, and local industries, such as International Paper, lost their water supply. A single-point failure created the majority of all public sector damage in the entire county. While the approach might be suitable for other purposes, it is difficult to imagine how a probabilistic numerical model would evaluate an event of this kind.

By contrast, a deterministic modeling approach is well suited to the analysis of a specific site. An example of this success appears in Chapter 2, which reviews the results of the ADCIRC Storm Guidance System (ASGS) for Hurricane Laura in the area surrounding Port Arthur, Texas. In this instance and many other cases, the ASGS provided timely, accurate forecasts that aided first responders in the field. ADCIRC model simulations were also used to design the protective infrastructure in southern Louisiana following Hurricane Katrina and to determine the height above grade for construction of new hospital facilities at the University of Texas Medical Branch in Galveston following Hurricane Ike. The ADCIRC model can recreate the impacts of historical storms along their tracks, or the storm tracks can be shifted so that landfall occurs at different locations to compare the damage results. The parameters for design storms, which have never existed, but which are conceivable, can be used by ADCIRC to assemble a library of potential storm surge and wave impacts for an area. A particular strength of ADCIRC comes from the ability to create ensembles containing many different members representing a broad range of high probability and lower probability events. At the UT Texas Advanced Computing Center, new supercomputing resources are available that could build libraries of ensemble storm impacts for both regional and site-specific studies.

The strengths of deterministic methods can balance some of the weaknesses found in the probabilistic approaches. Neither technique offers a complete solution. The missing factors involve the changing human landscape that is acted upon by natural disasters. One of the limiting factors is our inability to predict where future populations will live and what demographic and socioeconomic characteristics they will share. Determining the resilience thresholds linked to societal fragility is a major challenge for currently existing communities. If long-term recovery and mitigation of future disaster impacts is to occur, there will need to be better predictive models that go beyond merely extending the current trends of settlement and development.

For whatever method happens to be applied, one large question remains. How long into the future would any numerical forecast be viable given the escalating rates of change now observed? Whether the change is sea level rise, storm intensity, rainfall event magnitude or other disruptions to the environmental equilibrium, our modeling techniques are being applied to a moving target. While the large question is impossible to ignore, progress can still be made on a smaller scale. One of the greatest benefits could stem from the use of numerical simulations by cooperating groups to design their plans for recovery and mitigation in the context of different ensemble results that demonstrate a range of likely and extreme outcomes. In this regard, the probabilistic and deterministic approaches would be guides to the fundamental considerations for

the development of new projects. The central point would reinforce a message that all projects have an Achilles' heel, and it is better to learn this before they are constructed.


Chapter 10. Recommendations

In the course of our work, we have considered the different elements and capabilities necessary to create and sustain the Texas Disaster Information System. The components of the system and their connections are examined in the chapters addressing data themes and data applications (Chapters 1, 2, 4, 5 and 8), data system security (Chapter 3), data standards (Chapter 6) and data system architecture (Chapter 7). From our review, we have chosen to list ten recommendations that summarize our findings. The list is not meant to be comprehensive but contains the most significant and frequently discussed issues. While the listed recommendations do not appear in an order of priority, the first four are especially important to the foundation of TDIS, while the remainder highlight other factors requiring attention.

The recommendations emphasize a set of actions established from our viewpoint. While they seem to be rational steps to take in a deliberative process, not all of them may be completely practical taken in the context in which TDIS must operate. For this reason and others, one would expect further discussion with the TDIS partners to shape the eventual outcome. Not all other plans that could be enacted would necessarily lead to failure, but the one presented here has the strongest possibility of success based on our experience.

The Recommendations:

1. Negotiate data sharing partnerships with the major agencies (USACE, FEMA, NWS, USGS, TWDB, TDEM, GLO). The building of TDIS first requires that an understanding be reached with each of the partners in a core group of participating federal and state agencies, including USACE, FEMA, NWS, USGS, TWDB, TDEM and GLO. The understanding needs to cover more than a simple agreement to work together toward mutual objectives. It calls for a willingness to address in detail the methods and step-by-step procedures for sharing data, the maintenance of data system security and the adherence to data content and quality standards. A negotiated agreement focused on these topics and other concerns, which may vary by organization, will lay the foundation for a viable TDIS.


All TDIS partners should recognize the advantages to be gained through seamless data sharing with TDIS in as direct a method as feasible. As discussed in Chapter 8, some agencies have already adopted the practice of allowing open access to critical information through their data services. Their experiences should encourage others to explore this approach. In the meantime, the routine business practices of some TDIS partners may resist change, especially if there are perceived security threats. Proposed solutions will need to recognize these obstacles and demonstrate their ability to avoid serious disruption to existing procedures.

Above all, TDIS partners need to be willing to experiment and test the boundaries of new ways to unite their interests and collaborate through data sharing. The effort will require patience from all participants in finding the pathways to move forward and to develop trust in TDIS as a concept that needs to become reality.

2. Build the TDIS Architecture using primarily a data services design based on the concept of federated data systems acting in partnership. The TDIS architecture will need to be a very flexible environment. By leveraging proven models as discussed in Chapter 7, TDIS can build novel software and partnerships through the richness of data sharing and collaboration.

Resiliency and scalability. Archiving and system back-ups. Separation of concerns through an N-tier server architecture. These are the stable, battle-tested ideas that should be invoked from the earliest stages. Kubernetes or a similar container orchestration and deployment system would provide solutions for maintaining system uptime and the ability to expand resources during peak demand times. Virtual machines would ensure maximum efficiency in hardware utilization. Further, if containers (a lightweight form of operating-system-level virtualization) were used, their adoption would facilitate continuous integration along the lines of a robust development-operations model. Such a strategy would mean that changes could be pushed out and tested with greater frequency to mitigate bugs or to meet other requirements for upgrades and updates. Persistence with an enterprise database, such as PostgreSQL, would allow the maximum flexibility in data model storage, performance and features.

3. Implement Data System Security using a Federated Identity Management solution. As described in Chapter 3, TDIS will be a complex hybrid collection of partners and internal processes that will have both public and restricted areas of concern. In such a system, one of the best methods for handling security involves Federated Identity Management.

The proposed solution removes the management of the individual users in particular external organizations by shifting the management of an organization’s users back to the organization. For TDIS, system security must be considered early and often. As part of the early work, a security team should be established that could build a reference implementation of the Federated Identity System. At the minimum, this system should include internal users belonging to the managing Partner, one or two federated partners and some individual users that have no partner affiliation. Each type of user represents one of the user security models that will exist in TDIS.

Importantly, these procedural steps refer to providing authorization to resources in the system. Once authorization is successfully tested, access levels will need to be assigned. Some users will seek access to public data while others will require different levels of access to restricted data resources. Building out these groups or at least an initial subset of these groups will cement security as a recognized priority for TDIS.
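As an illustration of the access-level assignment described above, the following Python sketch maps the organization asserted by a federated identity provider to the access levels a user may request. The organization names, claim fields and access levels are hypothetical and only stand in for the groups that TDIS would eventually define.

```python
# Hypothetical mapping from a partner organization (asserted by the identity provider)
# to the TDIS access levels its users may request.
PARTNER_ACCESS = {
    "managing-partner": {"public", "restricted", "administrative"},
    "federated-partner-a": {"public", "restricted"},
    "unaffiliated": {"public"},
}


def allowed(identity: dict, requested_level: str) -> bool:
    """Return True when an authenticated identity may use the requested access level.

    `identity` stands in for the claims released after federated authentication,
    e.g. {"subject": "jdoe", "organization": "federated-partner-a"}.
    """
    organization = identity.get("organization", "unaffiliated")
    return requested_level in PARTNER_ACCESS.get(organization, {"public"})


print(allowed({"subject": "jdoe", "organization": "federated-partner-a"}, "restricted"))  # True
print(allowed({"subject": "guest"}, "restricted"))  # False
```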

4. Adopt Data Content and Quality Standards for the Data Accessed Through TDIS. To fully develop the capabilities of information accessed through TDIS, data quality standards must be implemented. Quality standards reduce data discrepancies, and for GIS-ready data spatial data quality standards include measures of completeness, currency, accuracy, precision, and consistency.

As discussed in Chapters 1 and 6, completeness can be measured by the proportion of database attribute values that are populated, or by the number of spatial features that are completed in the form of points, lines, or polygons. Currency describes how recently the event represented by the data occurred. Accuracy (attribute, locational) refers to how well the data describes the real-world conditions, object, or event being described. Consistency is the absence of conflicts in a database, most often applying to the rules for geospatial data concerning the relationships between points, lines, and polygons, also known as geospatial topology.
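As a simple illustration of one such measure, the following Python sketch computes attribute completeness for a small, invented set of address-point records; the field names and values are examples only.

```python
def attribute_completeness(features, required_fields):
    """Percent of required attribute values that are populated across a feature set.

    `features` is a list of attribute dictionaries, e.g. rows from a GIS layer.
    """
    total = len(features) * len(required_fields)
    if total == 0:
        return 100.0
    filled = sum(
        1
        for feature in features
        for field in required_fields
        if feature.get(field) not in (None, "")
    )
    return 100.0 * filled / total


# Illustrative address-point records with one missing street address.
records = [
    {"address": "100 Main St", "city": "Austin", "zip": "78701"},
    {"address": None, "city": "Austin", "zip": "78702"},
]
print(attribute_completeness(records, ["address", "city", "zip"]))  # 83.33...
```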

Finally, the data managed by TDIS must adhere to the FAIR guiding principles described in Chapter 6 and be Findable, Accessible, Interoperable and Reusable through Digital Object Identifiers or other sources of comprehensive metadata.

5. Explore and implement the licensing of commercial data products. Any plan to establish a disaster information system should include a thorough exploration of commercial data products and services. Although freely available, many public data sets suffer from data quality issues, currency concerns, differing data formats and the lack of a unified data schema.

In the geospatial realm, private firms that specialize in transportation and navigation data services often make available superior products for transportation-related applications. Having a common transportation network at all levels across the state not only ensures matching of data across boundaries, but also provides a consistent and reliable schema for the use of geocoding tools. In a similar vein, private sector vendors of commercial high-resolution orthoimagery provide quality products through licensed data feeds. Their imagery coverage can provide higher spatial resolution and more current data than comparable public sources for areas that are rapidly changing and need more frequent collections. Whereas the licensing arrangements for access to commercial data introduce a layer of complexity and expense, the proposed design for TDIS data management is well suited to handle commercial data sources.

6. Develop Prototype Web Services for TDIS. TDIS data services will feed the collaborations and applications built by the partners and users. It follows that prototype web services will play an outsized role as development begins. Web services are both a reflection of the data being served and the means of access for the system. Just starting a few services can prove the viability of the concept and the usefulness of the system. Consideration and priority should be on building out examples of both reference services and targeted services.

Reference services are those that help ground or locate the other resources being used by TDIS. Examples might be a vector-based data service that collects common and well-known features, such as streets, political boundaries, points of interest, etc., or a satellite image base map in conjunction with labels. These represent common practices in mapping applications and provide a reference for other data sources. The reference data sources could be consumed from a federated source or local to the managing partner. The former would be a better choice to validate the experience of the federated data service.

Inspiration for the targeted data services could come from one of the high value data sets reviewed in Chapters 2, 4 or 8. The USGS river gage observations are a good example of a data service that underscores a few different points. These observations come from a data provider that hosts data but not the data service. In this case, a process can create the necessary centralized data to build the data service. Also, since the data is disconnected from the hosted data service, the workflow could be refined. Other choices for prototype development are nearly boundless. NFIP data, high water marks, critical infrastructure or land parcels could also stimulate high interest and yield high value for the process. Many of these could also become sources from federated partners or hosted centrally.
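As an illustration of building a targeted service from such a source, the following Python sketch retrieves recent gage-height observations from the publicly documented USGS Instantaneous Values web service; the site number shown is only an example, and the rehosting step that would feed a centralized TDIS data service is left as a comment.

```python
import requests

# USGS Instantaneous Values service; parameter code 00065 is gage height and the
# site number below is an example stand-in for a Texas gage of interest.
USGS_IV = "https://waterservices.usgs.gov/nwis/iv/"


def fetch_gage_height(site: str, period: str = "P1D") -> dict:
    """Retrieve recent gage-height observations for one site as JSON."""
    params = {"format": "json", "sites": site, "parameterCd": "00065", "period": period}
    response = requests.get(USGS_IV, params=params, timeout=30)
    response.raise_for_status()
    return response.json()


# The returned JSON could then be loaded into a centralized table that backs
# a TDIS data service, refreshed on a schedule agreed upon with the provider.
# data = fetch_gage_height("08067070")
```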

Ultimately, the prototype data services will be an early validation of the conceptual architecture and infrastructure of TDIS. Federated and centralized, data hosted locally and at remote locations, are all essential to that idea. The services can also begin to lay the groundwork for data security and access level assignment. The documentation for the services and the specifications used will feed proof of concept applications.

7. Collect new, higher quality and more comprehensive high priority data. TDIS should be an advocate for the continuous improvement of data resources needed for disaster response, recovery, and mitigation. As our technology advances, new information sources emerge, and older sources can be refined and updated. The optimal data used for topography presents an example, where over the course of twenty years, the collection and processing of aerial LiDAR terrain data has moved from being experimental to becoming essential for a wide range of common applications.

The high priority data sets identified in Chapter 4 are a selection of resources that need to be maintained and extended for reasons of their recognized utility and importance. They also include sources that stand to gain from technological innovation during the next several years, as in the case of radar satellite imagery. On the other hand, the critical data shortfalls discussed in Chapter 5 are readymade subjects for new investment to create suitable data updates and

replacements where deficiencies currently exist. The TDIS partners should confer and offer reasons concerning their own priorities for investment in new data. An agreement among the partners will advance the cause of renewable digital infrastructure as a necessary complement to the new physical infrastructure now in development for recovery projects in Texas.

8. Develop and expand the use of social impact data including Public Health data. Texas-focused social impact demographic data should be developed, and its use promoted in disaster response, recovery and preparedness because a disaster frequently impacts vulnerable populations differently than more resilient communities. Anonymized public health information should be an integral facet of data compilation. The CDC has already made the case for the development of its Social Vulnerability Index discussed in Chapter 1. TDIS should expand upon the SVI's 15 social factors (#10.8.1), attach suitable environmental and health data and update the data to be as current as possible. The SVI's current social factors, all collected in the annual American Community Survey, are: being below the poverty level, being unemployed, income, lack of high school diploma, aged 65 or older or 17 or younger, disability status for those older than 5, single-parent household, minority status, speaks English less than well, multi-unit structures, mobile homes, crowding, no vehicle, and group quarters, supplemented by lack of health insurance and daytime population density.
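As a simplified illustration of the percentile-ranking idea behind composite indices such as the SVI (not the CDC's full methodology), the following Python sketch ranks invented census tracts on two illustrative indicators and sums the ranks to order them by relative vulnerability.

```python
def percentile_ranks(values):
    """Rank each value against the others on a 0-1 scale (higher = more of the indicator)."""
    n = len(values)
    if n < 2:
        return [0.0] * n
    return [sum(1 for other in values if other < v) / (n - 1) for v in values]


def composite_index(tracts):
    """Sum percentile ranks across indicators to order tracts by relative vulnerability.

    `tracts` maps a tract id to a list of indicator values (e.g. percent below poverty,
    percent without a vehicle); the indicators are illustrative, not the full SVI set.
    """
    ids = list(tracts)
    indicators = list(zip(*tracts.values()))  # one tuple of values per indicator
    ranked = [percentile_ranks(list(ind)) for ind in indicators]
    scores = {tid: sum(r[i] for r in ranked) for i, tid in enumerate(ids)}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented tract ids and indicator values for the example.
tracts = {"48453001100": [32.0, 18.5], "48453001200": [11.0, 4.2], "48453001300": [24.0, 9.1]}
print(composite_index(tracts))
```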

With social vulnerabilities exposed, TDIS better supports the development of community resilience and adaptability. Programs become more effective when targeted to meet identified gaps. Frequently, the most vulnerable populations live in some of the most hazardous environments or have the least available resources to prepare for or resume normal activities after a disaster. If this recommendation is ignored or only partially implemented, TDIS might be able to map and identify the where and why of Texas disasters but miss the opportunity to direct recovery and preparedness efforts to the areas of greatest need.

9. Adopt procedures to handle the arrival and use of new and novel data sources during disasters. The data generated during disasters in Texas is expanding exponentially. TDIS must anticipate an increasingly complex information environment during future disasters with ever-growing volumes of relevant data and the arrival of novel data sources never dealt with before. To capture the full range of data sources that come into play during the activation of state resources for emergency response, TDIS needs to participate in the creation of the information products required to assist the emergency management community. The participation will expose TDIS to the enormous variety of potentially actionable information that could be filtered from the chaotic stream of loosely structured and unstructured data, along with well-structured data arriving from unfamiliar sources.

TDIS should be ideally suited for dealing with the complexities of a rapidly evolving data-rich environment from its close association with the high-performance computing capabilities of academic research groups that handle large volumes of data and generate numerical model results. Current academic research in artificial intelligence, particularly the application of machine learning, could contribute to finding useful data during a crisis. The sudden introduction of new, unfamiliar information sources frequently occurs in the midst of disaster response, and

these data sets can be extremely useful if recognized and acted on. The appearance of SafeGraph data and other anonymized mobility tracking information in the early stages of the COVID-19 pandemic has guided many insights, as mentioned in the Economic Activity discussion in Chapter 1. During a disaster, TDIS should expect the arrival of novel data sources and be prepared to react.

10. Perform an online data survey of the core TDIS partners to develop a better understanding of their needs for data, data services and applications and their perceptions of data shortfalls. In Chapters 1, 4 and 5, we report our own conclusions about these topics that were developed by analyzing the actions of the community of disaster data producers and data consumers. An online survey could be employed to gather responses to questions concerning each of the 18 data themes discussed in Chapter 1 from the core group of TDIS partners (GLO, TWDB, TDEM, USACE, FEMA, NWS, USGS) to record each organization’s expert opinion. The survey questions would be designed to identify significant data gaps that now inhibit progress and what steps could be taken to remove the data shortfalls. The metrics developed from the responses will help to prioritize future data needs across the different agencies and solidify a group consensus for high priority data. In addition, the survey would promote the building of a coalition needed to advocate for the future digital infrastructure.

Appendix 1. Reference Links

Chapter 1. Survey of Data for Response, Recovery and Mitigation

Introduction #1.i.1: Data.gov https://www.data.gov

#1.i.2: National Map https://www.usgs.gov/core-science-systems/national-geospatial-program/national-map

#1.i.3: OpenStreetMap https://www.openstreetmap.org/about

#1.i.4: TNRIS https://www.tnris.org

#1.i.5: Strategic Mapping Program https://www.tnris.org/stratmap/

Geodetic Control #1.1.1: National Geodetic Survey https://geodesy.noaa.gov

#1.1.2: NGS Height Modernization Program https://geodesy.noaa.gov/INFO/OnePagers/NewDatumsOnePager.pdf

#1.1.3: Conrad Blucher Institute https://cbi.tamucc.edu/Geospatial-Computing/tsrc/

#1.1.4: National Geodetic Survey Data Explorer https://geodesy.noaa.gov/NGSDataExplorer/

Bathymetry #1.2.1: National Geophysical Data Center Bathymetry Data Viewer https://www.ngdc.noaa.gov/maps/bathymetry/

#1.2.2: NOAA Office for Coastal Management Digital Coast https://coast.noaa.gov/dataviewer/

#1.2.3: ADCIRC Development Group http://adcirc.org/community/developers/development-group/

#1.2.4: TPWD Coastal Bathymetry https://data.tnris.org/collection/8fe992d8-1019-492a-b36b-9cfc3293ac6b

Topography #1.3.1: U.S. Geological Survey https://www.usgs.gov/

#1.3.2: National Elevation Dataset https://www.usgs.gov/core-science-systems/national-geospatial-program/national-map

#1.3.3: Shuttle Radar Topography Mission https://www2.jpl.nasa.gov/srtm/

#1.3.4: LiDAR https://oceanservice.noaa.gov/facts/lidar.html

#1.3.5: USGS 3DEP Elevation Program https://www.usgs.gov/core-science-systems/ngp/3dep

#1.3.6: Height Above the Nearest Drainage https://www.sciencedirect.com/science/article/abs/pii/S0022169411002599

#1.3.7: National Water Model https://water.noaa.gov/about/nwm

High Resolution Baseline Imagery #1.4.1: TNRIS Orthoimagery https://tnris.org/stratmap/orthoimagery/

#1.4.2: USGS HDDS Explorer https://hddsexplorer.usgs.gov

#1.4.3: NOAA Emergency Response Imagery https://storms.ngs.noaa.gov

Hydrography #1.5.1: National Hydrography Dataset https://www.usgs.gov/core-science-systems/ngp/national-hydrography

#1.5.2: Watershed Boundary Dataset https://www.usgs.gov/core-science-systems/ngp/national-hydrography/watershed-boundary-dataset

#1.5.3: NHDPlus High Resolution https://www.usgs.gov/core-science-systems/ngp/national-hydrography/nhdplus-high-resolution

#1.5.4: National Hydrography Dataset Plus https://www.epa.gov/waterdata/nhdplus-national-hydrography-dataset-plus

#1.5.5: National Weather Service River Gages https://water.weather.gov/ahps/

Soils #1.6.1: Natural Resources Conservation Service Soils https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/survey/?cid=nrcs142p2_053627

#1.6.2: State Soil Geographic Database https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/survey/geo/?cid=nrcs142p2_053629

#1.6.3: Gridded National Soil Survey Geographic Database https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/survey/geo/?cid=nrcseprd1464625

#1.6.4: USDA Geospatial Data Gateway https://gdg.sc.egov.usda.gov/

#1.6.5: USDA Soil Data Development Toolbox https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/survey/office/ssr11/?cid=nrcseprd1327265


#1.6.6: Texas Soil Observation Network https://www.beg.utexas.edu/research/programs/txson

Climate and Weather #1.7.1: Climate Prediction Center GIS Data https://www.cpc.ncep.noaa.gov/products/GIS/GIS_DATA/

#1.7.2: NOAA Severe Weather Data https://www.ncdc.noaa.gov/data-access/severe-weather

#1.7.3: NOAA NWS OGC Web Service https://www.weather.gov/gis/WebServices

#1.7.4: nowCOAST Web Mapping Portal https://nowcoast.noaa.gov/

#1.7.5: nowCOAST ArcGIS REST Services https://nowcoast.noaa.gov/arcgis/rest/services/nowcoast

Administrative Boundaries #1.8.1: TNRIS DataHub https://data.tnris.org/

#1.8.2: About TNRIS https://tnris.org/about/

#1.8.3: TxDOT Open Data Portal https://gis-txdot.opendata.arcgis.com/

#1.8.4: FEMA Data Feeds https://gis.fema.gov/DataFeeds.html

#1.8.5: TWDB GIS Data https://www.twdb.texas.gov/mapping/gisdata.asp

#1.8.6: North Central Council of Governments https://data-nctcoggis.opendata.arcgis.com/

#1.8.7: Houston-Galveston Area Council http://www.h-gac.com/gis-applications-and-data/datasets.aspx

#1.8.8: Capital Area Council of Governments https://regional-open-data-capcog.opendata.arcgis.com/

Land Cover / Land Use / Zoning #1.9.1: Multi-Resolution Land Characteristics https://www.mrlc.gov/

#1.9.2: Multi-Resolution Land Characteristics Data https://www.mrlc.gov/data

#1.9.3: Round Rock Zoning Districts https://geohub.roundrocktexas.gov/datasets/zoning-districts-2/data

#1.9.4: The Effect of Land Cover Change on Flooding https://www.researchgate.net/publication/319489661_The_Effect_of_Land_Cover_Change_on_Flooding_in_Texas/fulltext/59aea0de458515d09ce7beb5/The-Effect-of-Land-Cover-Change-on-Flooding-in-Texas.pdf

#1.9.5: Uncertainty in Hurricane Surge Simulation https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/2013JC009604

Agriculture #1.10.1: USDA Census of Agriculture https://www.nass.usda.gov/AgCensus/index.php

#1.10.2: USDA Agricultural Statistics https://www.nass.usda.gov/Statistics_by_State/Texas/

#1.10.3: Forest Inventory and Analysis https://www.fia.fs.fed.us

#1.10.4: USDA NRCS NAIP Data https://nrcs.app.box.com/v/naip

Transportation #1.11.1: TomTom MultiNet Product Info Sheet https://www.tomtom.com/products/

#1.11.2: TxDOT Roadways Service https://gis-txdot.opendata.arcgis.com/datasets/txdot-roadways

#1.11.3: Dartmouth Flood Observatory https://floodobservatory.colorado.edu/Events/2017USA4510/2017USA4510.html

#1.11.4: Flooded Roadways from Hurricane Harvey http://www.geo.utexas.edu/courses/371c/project/2017F/Roberts_GIS_Project.pdf

Critical Infrastructure #1.12.1: DHS Critical Infrastructure Security https://www.dhs.gov/topic/critical-infrastructure-security

#1.12.2: DHS Homeland Infrastructure Foundation-Level Data https://gii.dhs.gov/hifld/

#1.12.3: HIFLD Open Data https://hifld-geoplatform.opendata.arcgis.com/

#1.12.4: HIFLD Secure Data https://gii.dhs.gov/HIFLD/data/secure/

#1.12.5: TCEQ STEERS https://www3.tceq.texas.gov/steers/

#1.12.6: Railroad Commission Map Viewer https://gis.rrc.texas.gov/GISViewer/

Street Addressing #1.13.:1 TomTom Address Points https://www.tomtom.com/lib/doc/licensing/I.ADDRESSPOINTS.EN.pdf

#1.13.2: StratMap Address Points https://tnris.org/stratmap/address-points/

#1.13.3: TNRIS DataHub https://data.tnris.org/

Land Parcels #1.14.1: TNRIS Land Parcels 2019 https://data.tnris.org/collection/2679b514-bb7b-409f-97f3-ee3879f34448

#1.14.2: Williamson Central Appraisal District https://www.wcad.org/gis-data/

#1.14.3: Homeland Infrastructure Foundation-Level Data https://gii.dhs.gov/hifld/

#1.14.4: Land Parcel Hurricane Hazard Exposure Study https://journals.sagepub.com/doi/pdf/10.1068/b32114

Demography

#1.15.1: United States Decennial Census https://www.census.gov/programs-surveys/decennial-census/about.html

#1.15.2: American Community Survey https://www.census.gov/programs-surveys/acs/about.html

#1.15.3: United States Census https://www.census.gov/

#1.15.4: Texas Demographic Center https://demographics.texas.gov/

#1.15.5: U.S. Census Tabular Data https://data.census.gov/cedsci/

#1.15.6: U.S. Census American Community Survey https://www.census.gov/acs/www/data/data-tables-and-tools/data-profiles/

#1.15.7: U.S. Census Geography Program https://www.census.gov/programs-surveys/geography.html

#1.15.8: TDC Texas Population Projections https://demographics.texas.gov/Data/TPEPP/Projections/Tool

Economic Activity

#1.16.1: Economic Applications in Disaster Research paper https://training.fema.gov/hiedu/docs/disciplines%20disasters%20and%20em%20book%20-%20chapter-econ%20appli%20in%20disasters%20research.doc

Terry L. Clower. Economic applications in disaster research, mitigation, and planning. Denton, TX: Center for Economic Development and Research, University of North Texas. Accessed October 26, 2020 at https://training.fema.gov/hiedu/docs/disciplines%20disasters%20and%20em%20book%20-%20chapter-econ%20appli%20in%20disasters%20research.doc

#1.16.2: Dun and Bradstreet Emergency Management https://www.dnb.com/perspectives/government/emergency-management-software-for-resource-allocation.html

#1.16.3: ESRI Data and Infographics https://www.esri.com/en-us/arcgis/products/arcgis-business-analyst/data-infographics

#1.16.4: North Central Texas COG https://data-nctcoggis.opendata.arcgis.com/search?tags=economy


#1.16.5: City of Houston Data http://data.houstontx.gov/dataset?_tags_limit=0&page=1

#1.16.6: Texas Comptroller Revenue and Expenditure Dashboard https://bivisual.cpa.texas.gov/CPA/opendocnotoolbar.htm?document=documents%5CTR_Master_UI.qvw

#1.16.7: U.S. Bureau of Labor Statistics QCEW Map https://data.bls.gov/maps/cew/TX?period=2020-Q1&industry=10&geo_id=48000&chartData=3&distribution=1&pos_color=blue&neg_color=orange&showHideChart=show&ownerType=0

#1.16.8: Safegraph Data https://www.safegraph.com/

#1.16.9: ESRI Safegraph POI data https://www.esri.com/en-us/landing-page/product/2019/esri-partner-data-safegraph

Disaster Insurance Claims

#1.17.1: FEMA Flood Insurance Program https://www.fema.gov/flood-insurance

#1.17.2: National Insurance Crime Bureau https://www.nicb.org/

#1.17.3: Texas Windstorm Insurance Association https://www.twia.org/

#1.17.4: Amica Mutual Insurance ArcNews Article https://www.esri.com/news/arcnews/summer12articles/amica-mutual-insurance-maps-real-time-data.html

Public Health

#1.18.1: Centers for Disease Control and Prevention https://www.cdc.gov/

#1.18.2: Texas Department of State Health Services https://www.dshs.state.tx.us

#1.18.3: DSHS Center for Health Statistics https://www.dshs.state.tx.us/chs/

#1.18.4: U.S. Census Surveys and Programs Contributing to Health https://www.census.gov/topics/health/surveys.html

#1.18.5: CDC On Epidemiology Article https://wwwnc.cdc.gov/eid/article/2/2/96-0202_article

#1.18.6: CDC Interactive Web Applications & Data https://www.cdc.gov/gis/interactive-applications.htm

#1.18.7: CDC 500 Cities https://www.cdc.gov/500cities/

#1.18.8: CDC Social Vulnerability Index https://www.atsdr.cdc.gov/placeandhealth/svi/index.html

#1.18.9: CDC SVI Fact Sheet https://www.atsdr.cdc.gov/placeandhealth/svi/fact_sheet/fact_sheet.html

Chapter 2. Data Sharing

FEMA HPA Team and Hurricane Harvey High Water Marks

#2.1.1: USGS Harvey High Water Marks Viewer https://agportalw-sec-green7.csr.utexas.edu/portal/apps/webappviewer/index.html?id=c5593f6de081458f8605ba30d75c32de

Sharing data from the ADCIRC Storm Guidance System with the SOC and TTF1

#2.3.1: DHS ADCIRC Storm Guidance System https://www.dhs.gov/science-and-technology/news/2019/07/30/snapshot-adcirc-prediction-system

#2.3.2: Coastal Emergency Risks Assessment http://coastalrisk.live

USACE Trinity River Investigation

#2.5.1: Trinity River Authority of Texas http://www.trinityra.org/

#2.5.2: ESA Sentinel-1 Data Products https://sentinel.esa.int/web/sentinel/missions/sentinel-1/data-products

114

#2.5.3: USGS EROS SRTM Digital Elevation https://www.usgs.gov/centers/eros/science/usgs-eros-archive-digital-elevation-shuttle-radar-topography-mission-srtm-1-arc?qt-science_center_objects=0#qt-science_center_objects

#2.5.4: ESA Sentinel-2 Mission https://sentinel.esa.int/web/sentinel/missions/sentinel-2

#2.5.5: USGS Landsat 8 Mission https://www.usgs.gov/core-science-systems/nli/landsat/landsat-8?qt-science_support_page_related_con=0#qt-science_support_page_related_con

#2.5.6: NASA MODIS Information https://modis.gsfc.nasa.gov/

#2.5.7: ESA Sentinel-3 Mission https://sentinel.esa.int/web/sentinel/missions/sentinel-3

#2.5.8: Normalized Difference Water Index https://en.wikipedia.org/wiki/Normalized_difference_water_index

Chapter 3. Data System Security

#3.1: NSF Secure and Trustworthy Cyberspace Quote https://www.nsf.gov/funding/pgm_summ.jsp?pims_id=504709

#3.2: TechBeacon Best Practices for API Security https://techbeacon.com/app-dev-testing/8-essential-best-practices-api-security

#3.3: Facebook-Cambridge Scandal https://www.theguardian.com/technology/2018/jul/11/facebook-fined-for-data-breaches-in-cambridge-analytica-scandal

#3.4: NSF award for interdisciplinary cybersecurity https://nsf.gov/news/news_summ.jsp?cntn_id=136481&org=NSF&from=news

#3.5: NSF Secure and Trustworthy Cyberspace Quote https://www.nsf.gov/funding/pgm_summ.jsp?pims_id=504709

Chapter 4. Highest Priority Data

LiDAR-Based Topography

#4.1.1: What is lidar? https://oceanservice.noaa.gov/facts/lidar.html

NOAA Atlas 14

#4.2.1: NOAA Precipitation-Frequency Atlas https://repository.library.noaa.gov/view/noaa/22619

#4.2.2: Flood Risk and Atlas 14 https://www.austintexas.gov/page/flood-risk-and-atlas-14-details

#4.2.3: NOAA Atlas Volume 11: Texas project area https://www.nctcog.org/nctcg/media/Environment-and-Development/Documents/Watershed%20Management/CRS%20User%20Group/022719-Combined-Presentations.pdf

#4.2.4: NOAA Texas Rainfall Frequency Update https://www.noaa.gov/media-release/noaa-updates-texas-rainfall-frequency-values

#4.2.5: NOAA Precipitation Frequency Data Server (PFDS) https://hdsc.nws.noaa.gov/hdsc/pfds/

DHS HIFLD Products and Data Services

#4.3.1: DHS HIFLD Secure Data Portal https://gii.dhs.gov/HIFLD/data/secure/

Radar Satellite Imagery

#4.4.1: International Charter Space and Major Disasters https://disasterscharter.org/web/guest/home

#4.4.2: What is SAR? https://asf.alaska.edu/information/sar-information/what-is-sar/

#4.4.3: Sentinel-1 Mission https://sentinel.esa.int/web/sentinel/missions/sentinel-1

#4.4.4: NASA-ISRO NISAR SAR mission https://nisar.jpl.nasa.gov

#4.4.5: JPL UAVSAR https://airbornescience.jpl.nasa.gov/news/uavsar-–-uninhabited-aerial-vehicle-synthetic-aperture-radar-20

#4.4.6: ICEYE Microsatellite SAR https://www.iceye.com

FEMA Individual Assistance / NFIP

#4.5.1: FEMA Individual Disaster Assistance https://www.fema.gov/individual-disaster-assistance

#4.5.2: Manufactured Housing Units https://agportalw-sec-green7.csr.utexas.edu/portal/apps/webappviewer/index.html?id=344e6ee86b3b4afa9254d1d131661318

#4.5.3: Hurricane Harvey Dashboard IA Registrations http://magic.csr.utexas.edu/HurricaneRecoveryDashboard/views/

Enhanced Demography / SVI

#4.6.1: US Census American Community Survey https://www.census.gov/programs-surveys/acs/about.html

#4.6.2: Texas Demographic Center https://demographics.texas.gov/

#4.6.3: Texas’s Most Vulnerable Populations https://demographics.texas.gov/Resources/publications/2020/20200918_ACS2019Brief_TexasMostVulnerablePopulations.pdf

#4.6.4: South Carolina Hazards & Vulnerability http://artsandsciences.sc.edu/geog/hvri/sovi®-0

#4.6.5: CDC Social Vulnerability Index https://www.atsdr.cdc.gov/placeandhealth/svi/index.html

Chapter 5. Data Shortfalls

Stream and Tide Gages

#5.1.1: USGS Decline in Long-record Stream Gages Citation https://agupubs.onlinelibrary.wiley.com/doi/pdf/10.1029/99EO00406

Height Modernization

#5.2.1: NOAA New Datums https://geodesy.noaa.gov/INFO/OnePagers/NewDatumsOnePager.pdf

Street Addressing and Geocoding

#5.3.1: TNRIS StratMap https://tnris.org/stratmap/address-points/

#5.3.2: Google Street View https://www.google.com/streetview/explore/

Building Footprints

#5.4.1: Bing Building Footprints Open Data https://blogs.bing.com/maps/2018-06/microsoft-releases-125-million-building-footprints-in-the-us-as-open-data/

#5.4.2: EarthDefine Building Footprint https://www.earthdefine.com/buildings/

#5.4.3: DigitalGlobe Building Footprints https://www.digitalglobe.com/products/building-footprints

Base Flood Elevations

#5.5.1: FEMA Base Flood Elevation Definition https://www.fema.gov/node/404233#:~:text=The%20elevation%20of%20surface%20water,level%20in%20any%20given%20year

#5.5.2: Flood Zones Definition https://www.fema.gov/glossary/flood-zones

#5.5.3: Floodplain Management Requirements https://www.fema.gov/media-library-data/1481032638839-48ec3cc10cf62a791ab44ecc0d49006e/FEMA_480_Complete_reduced_v7.pdf

#5.5.4: NOAA National Water Model https://water.noaa.gov/about/nwm

#5.5.5: Interagency Flood Risk Management https://webapps.usgs.gov/infrm/

Economic Activity

#5.6.1: SafeGraph Places Data https://www.safegraph.com/

#5.6.2: ArcGIS Business Analyst https://www.esri.com/en-us/arcgis/products/arcgis-business-analyst/data-infographics

#5.6.3: U.S. Bureau of Labor Statistics – Texas QCEW https://data.bls.gov/maps/cew/TX?period=2020-Q1&industry=10&geo_id=48000&chartData=3&distribution=1&pos_color=blue&neg_color=orange&showHideChart=show&ownerType=0

#5.6.4: Texas Comptroller Revenue Dashboard https://bivisual.cpa.texas.gov/CPA/opendocnotoolbar.htm?document=documents%5CTR_Master_UI.qvw

Land Parcels and Zoning

Land Cover

#5.8.1: USGS National Land Cover Database Update https://www.usgs.gov/center-news/nlcd-readies-improvements-upcoming-release-2019-product-suite?qt-news_science_products=1#qt-news_science_products

#5.8.2: TPWD Ecological Mapping Systems https://tpwd.texas.gov/landwater/land/programs/landscape-ecology/ems/

Chapter 6. Best Practices in Data Curation

Data File Format Standards

#6.1.1: Types of Binary Files https://simplicable.com/new/binary-file

#6.1.2: Data and File Formatting https://www.axiomdatascience.com/best-practices/DataandFileFormatting.html

#6.1.3: Georgia Southern University File Formats for Archival https://georgiasouthern.libguides.com/c.php?g=833713&p=5953146

#6.1.4: Stanford Libraries File Format Best Practices https://library.stanford.edu/research/data-management-services/data-best-practices/best-practices-file-formats

#6.1.5: Social History Portal File Formats https://socialhistoryportal.org/bestpractices/fileformats

#6.1.6: Federal Records Management File Formats https://www.archives.gov/records-mgmt/policy/transfer-guidance-tables.html#geospatialformats

Metadata Standards

#6.2.1: FGDC Geospatial Metadata Standards https://www.fgdc.gov/metadata/geospatial-metadata-standards

#6.2.2: FGDC Digital Geospatial Metadata Standard https://www.fgdc.gov/standards/projects/metadata/base-metadata/v2_0698.pdf

Digital Object Identifiers (DOIs) and FAIR Principles

#6.3.1: FAIR Guiding Principles article https://www.nature.com/articles/sdata201618

#6.3.2: The Dataverse Network article http://www.dlib.org/dlib/january11/crosas/01crosas.html

USGS Hurricane Harvey High Water Marks Viewer

#8.3.1: MOVES USGS Harvey High Water Marks https://agportalw-sec-green7.csr.utexas.edu/portal/apps/webappviewer/index.html?id=c5593f6de081458f8605ba30d75c32de

GLO Post-Ike Projects

#8.4.1: UT-CSR GLO Identified Projects Viewer https://agportalw-sec-green7.csr.utexas.edu/portal/apps/webappviewer/index.html?id=b9b2758d91df44e0a9af7aecf3b4fc89

Chapter 10. Recommendations

Develop and expand the use of social impact data

#10.8.1: CDC SVI 2018 Documentation https://www.atsdr.cdc.gov/placeandhealth/svi/documentation/SVI_documentation_2018.html

Appendix 2. Acronyms

3D – 3 Dimensional – Spatial term
3DEP – 3D Elevation Program – USGS program
7Z – 7-Zip – File format
ACS – American Community Survey – US Census Bureau program
ADA – Americans with Disabilities Act – Anti-discrimination legislation
ADCIRC – ADvanced CIRCulation – Ocean current model
AHPS – Advanced Hydrological Prediction Service – NWS program
AIFF – Audio Interchange File Format – File format
API – Application Programming Interface – Software development resource
ASCII – American Standard Code for Information Interchange – Text format
ASGS – ADCIRC Storm Guidance System – Storm surge modelling system
AVI – Audio Video Interleave – File format
BEG – Bureau of Economic Geology – University department
BFE – Base Flood Elevation – FEMA geographic designation
BLS – U.S. Bureau of Labor Statistics – Federal agency
BMP – BitMaP – File format
CADs – Central Appraisal Districts – Appraisal districts
CAP – Civil Air Patrol – Volunteer organization
CAPCOG – CAPital area Council Of Governments – Regional entity
CB&I – Chicago Bridge & Iron Company – Private contractor
CDC – Centers for Disease Control and Prevention – US agency
CDR – Community Development and Revitalization – Plans for improving civil infrastructure
CERA – Coastal Emergency Risks Assessment – Web application
CI/KR – Critical Infrastructure and Key Resources – Civil infrastructure designation
CLU – Common Land Unit – Farm service land unit definition
COG – Council Of Governments – Texas regional planning group
CONUS – CONtinental United States – Geographic region

COVID – COrona VIrus Disease – Pandemic virus
CPC – Climate Prediction Center – NOAA climate group
CPS – Current Population Survey – US Census Bureau program
CSDGM – Content Standard for Digital Geospatial Metadata – US data standard
CSR – Center for Space Research – University department
CSV – Comma Separated Value – Data file format
D&B – Dun and Bradstreet – Private corporation
DBF – Data Base Format – File format
DBMS – DataBase Management System – Software
DEM – Digital Elevation Model – Terrain dataset
DFW – Dallas Fort Worth – Metropolitan area
DHS – Department of Homeland Security – US agency
DOC – DOCument – File format
DOCX – DOCument open Xml – File format
DOI – Digital Object Identifier – Persistent identifier
DSHS – Texas Department of State Health Services – Texas agency
DUA – Data Use Agreement – Specification for data use
DWG – DraWinG – File format
EMS – Emergency Management Service – Local emergency responder
EMS – Texas Ecological Mapping Systems – Texas agency program
EOF – End Of File marker – Electronic file code
ESA – European Space Agency – Space agency
FAIR – Findability, Accessibility, Interoperability and Reusability – Data management principles
FEMA – Federal Emergency Management Agency – US agency
FGDC – Federal Geographic Data Committee – Data format management agency
FIRMs – Flood Insurance Rate Maps – FEMA dataset
FOUO – For Official Use Only – Sensitive information
FSA – USDA Farm Service Agency – Federal agency
FTP – File Transfer Protocol – Web service protocol
GDB – Geographic DataBase – File format
GEOID – Geographic entity code – Unique identifier for Census data
GeoJSON – Geographic JavaScript Object Notation – File format
GeoTIFF – Geographic Tagged Image File Format – File format
GIF – Graphics Interchange Format – File format
GIS – Geographic Information System – System for managing geospatial data
GLO – General Land Office – Texas agency
gNATSGO – Gridded National Soil Survey Geographic Database – USDA dataset
GOES – Geostationary Operational Environmental Satellite – Geostationary satellite system
GPS – Global Positioning System – Satellite based location system
GRB – GRidded Binary – File format
GRD-HD – Ground Range Detected High-res Dual-pol – Sentinel SAR data products

GZ – GNU Zip Compressed Archive – File format
GZIP – GNU ZIP – File format
H-GAC – Houston-Galveston Area Council – Area council
HAND – Height Above the Nearest Drainage – Relative terrain model
HARC – Houston Advanced Research Center – Independent research hub
HDDS – Hazards Data Distribution System – USGS web portal
HDF – Hierarchical Data Format – File format
HEC-RAS – Hydrologic Engineering Center River Analysis System – Hydraulic model
HGT – SRTM Height product – File format
HH – Horizontal-Horizontal – SAR polarization data
HIFLD – Homeland Infrastructure Foundation-Level Data – DHS program
HPA – Hazard Performance and Analysis – FEMA program
HSIN – Homeland Security Information Network – Sensitive information system
HTML – Hyper Text Markup Language – Web data format
HWM – High Water Marks – USGS dataset
IA – Individual Assistance – FEMA program
ID – IDentifier – Reference string
IfSAR – Interferometric Synthetic Aperture Radar – Remote sensing datatype
IHP – Individuals and Households Program – FEMA program
InFRM – Interagency Flood Risk Management – Proposed advisory group
IOT – Internet Of Things – Internet paradigm
IR – Infra-Red – Range of light wavelengths
ISO – International Organization for Standardization – International standards body
ISPRS – International Society for Photogrammetry and Remote Sensing – Remote sensing organization
ISRO – Indian Space Research Organization – Space agency
ISSN – International Standard Serial Number – Publication identification
IT – Information Technology – Type of infrastructure
JFO – Joint Field Office – FEMA office
JGR – Journal of Geophysical Research – Journal publication
JPEG – Joint Photographic Experts Group – File format
JPEG2000 – Joint Photographic Experts Group 2000 – File format
JPL – Jet Propulsion Laboratory – NASA research center
JSON – JavaScript Object Notation – File format
KML – Keyhole Markup Language – Geospatial data format
KMZ – Keyhole Markup language Zipped – Geospatial data format
LDAP – Lightweight Directory Access Protocol – Directory access protocol
LiDAR – Light Detection and Ranging – Geospatial data format
LUP – Land Use Policy – Government regulation
MADIS – Meteorological Assimilation Data Ingest System – NCEP data processing system
MAGIC – Mid-American Geospatial Information Center – UT program
MHUs – Manufactured Housing Units – FEMA program

MJ2 – Motion JPEG 2000 – File format
MODIS – Moderate Resolution Imaging Spectroradiometer – Remote sensing instrument
MOV – MOVie – File format
MOVES – Modeling-Observation and Visualization for Emergency Support – UT application
MP3 – Motion Picture experts group 3 – File format
MPEG – Motion Picture Experts Group – File format
MPEG-4 – Motion Picture Experts Group 4 – File format
MRLC – Multi-Resolution Land Characteristics – US program
MRMS – Multi-Radar Multi-Sensor – Rainfall remote sensing system
MSA – Metropolitan Statistical Area – Census enumeration area
mSA – Micropolitan Statistical Area – Census enumeration area
NAD 83 – North American Datum of 1983 – Geospatial datum
NAIP – National Agriculture Imagery Program – USDA aerial imagery program
NASA – National Aeronautics and Space Administration – Space agency
NATSGO – Gridded National Soil Survey Geographic Database – USDA dataset
NAVD 88 – North American Vertical Datum of 1988 – Vertical datum
NCEI – National Centers for Environmental Information – Source of weather and climate data
NCEP – National Centers for Environmental Prediction – Federal agency for global forecast
NCTCOG – North Central Texas Council of Governments – Regional entity
NDWI – Normalized Difference Water Index – Remote sensing product
NED – National Elevation Dataset – Elevation model data
NEMA – National Emergency Management Agency – Federal agency
NESDIS – National Environmental Satellite Data and Information Service – Federal satellite data service
NetCDF – Network Common Data Form – Scientific data format
NexRAD – Next-Generation Radar – Ground based weather radar
NFIP – National Flood Insurance Program – FEMA program
NGA – National Geospatial-Intelligence Agency – US agency
NGDC – National Geospatial Data Center – Federal data management agency
NGS – National Geodetic Survey – Geospatial framework
NHD – National Hydrography Dataset – USGS dataset
NHD HR – National Hydrography Dataset High Resolution – USGS dataset
NHDPlus HR – National Hydrography Dataset Plus High Resolution – USGS dataset
NICB – National Insurance Crime Bureau – Insurance industry organization
NISAR – NASA-ISRO SAR mission – Joint satellite mission
NLCD – National Land Cover Database – USGS program
NOAA – National Oceanic and Atmospheric Administration – Federal agency
NRCS – Natural Resources Conservation Service – USDA agency
NSDI – National Spatial Data Infrastructure – National geospatial framework

NSRS – National Spatial Reference System – Geographic spatial standard
NWM – National Water Model – Hydrologic model
NWS – National Weather Service – Weather forecast organization
ODF – OpenDocument Format – File format
ODP – OpenDocument Presentation format – File format
OGC – Open Geospatial Consortium – Geospatial standards consortium
OS – Operating System – Computer software
PDF – Portable Document Format – Document format
PDF/A – Portable Document Format for Archiving – File format
PFDS – Precipitation Frequency Data Server – Weather web service
PII – Personally Identifiable Information – Sensitive information
PNG – Portable Network Graphics – File format
POI – Point Of Interest – Business locations
PPT – PowerPoinT format – File format
PPTX – PowerPoinT XML format – File format
PSD – PhotoShop Document – File format
QCEW – Quarterly Census of Employment and Wages – Labor metric
QPE – Quantitative Precipitation Estimate – Weather data
QPF – Quantitative Precipitation Forecast – Weather forecast
RADAR – RAdio Detecting And Ranging – Detection and location system
REST – Representational State Transfer – Web service architectural style
RRC – Texas Railroad Commission – Texas agency
RSS – Raster Soil Survey – USDA dataset
SAR – Synthetic Aperture RADAR – Remote sensing datatype
SFHA – Special Flood Hazard Areas – Land type designation
SHP – SHaPe – File format
SLOSH – Sea, Lake and Overland Surges from Hurricanes – Storm surge model
SMWDBE – Small, Minority and Women Disadvantaged Business Enterprises – Business demographic
SOAP – Simple Object Access Protocol – Web programming protocol
SOC – State Operations Center – Texas center for managing hazard response
SoVI – Social Vulnerability Index – University of South Carolina index
SPC – Storm Prediction Center – NOAA/NWS service
SRTM – Shuttle Radar Topography Mission – Elevation model data
STATSGO – State Soil Geographic database – USDA dataset
StratMap – Strategic Mapping Program – Division of TNRIS
SSURGO – Soil Survey Geographic Database – USDA dataset
SVI – Social Vulnerability Index – CDC program
SWDI – Severe Weather Data Inventory – Historical storm damage database
TACC – Texas Advanced Computing Center – University computing group
TAR – Tape ARchive – File format
TCEQ – Texas Commission on Environmental Quality – Texas agency

TCRS – Texas Coastal Resiliency Study – Coastal planning study
TDA – Texas Department of Agriculture – Texas agency
TDC – Texas Demographic Center – State demographics data source
TDCJ – Texas Department of Criminal Justice – Texas prisons
TDEM – Texas Division of Emergency Management – Texas agency
TDIS – Texas Disaster Information System – Proposed information system
TEEX – Texas A&M Engineering Extension Service – University entity
TIFF – Tagged Image File Format – File format
TIGER – Topologically Integrated Geographic Encoding and Referencing – Geospatial data format
TMD – Texas Military Department – Texas agency
TNRIS – Texas Natural Resources Information System – State geographic data clearinghouse
TOP – Texas Orthoimagery Program – Texas aerial orthoimagery dataset
TPWD – Texas Parks and Wildlife Department – Texas agency
TSPWS – Texas Strategic Phased Watershed Study – UT program
TTF1 – Texas A&M Task Force One – University affiliated search and rescue group
TWDB – Texas Water Development Board – State agency
TWIA – Texas Windstorm Insurance Association – Texas insurance program
TX-TF1 – Texas A&M Task Force One – University affiliated search and rescue group
TxDOT – Texas Department of Transportation – Texas agency
TxSON – Texas Soil Observation Network – Soil moisture observation network
UAVSAR – Uninhabited Aerial Vehicle Synthetic Aperture Radar – JPL SAR mission
URL – Uniform Resource Locator – Internet address
US – United States of America – Country
USACE – United States Army Corps of Engineers – US Army engineer formation
USDA – U.S. Department of Agriculture – Federal agency
USGS – United States Geological Survey – Federal agency
UT – University of Texas – University
UT-BEG – University of Texas Bureau of Economic Geology – University department
UT-CSR – University of Texas Center for Space Research – University space science research department
UTA – University of Texas at Arlington – State university
UTF-8 – 8-bit Unicode Transformation Format – Character encoding
UTSA – University of Texas at San Antonio – State university
VH – Vertical-Horizontal – SAR polarization data
VV – Vertical-Vertical – SAR polarization data
WARC – Web ARChive – File format

WAV – WAVeform – File format
WBD – Watershed Boundary Dataset – USGS hydrographic dataset
WCAD – Williamson Central Appraisal District – Appraisal district
WCS – Web Coverage Service – Geospatial web protocol
WGRFC – West Gulf River Forecast Center – NWS river forecast center
WMA – Windows Media Audio – File format
WMS – Web Map Service – Geospatial web protocol
XLS – eXceL Spreadsheet – File format
XLSX – eXceL Spreadsheet Open XML – File format
XML – eXtensible Markup Language – Meta language
ZCTA – ZIP Code Tabulation Area – Geographic area
ZIP – ZIP compressed file – File format
