White Paper

Using Analytics With Open Data to Build a Stronger Government

Contents

Open Data: What It Is, Where It Comes From and Why It Matters
What Are APIs?
How SAS® Can Help
Open Data in Action
Public Health Surveillance: An Example
The Future Looks Brighter With Open Data and Analytics
Learn More

From Washington, DC, to cities, states and countries all around the world, open government is shaping a new definition of what it means to serve the citizens of a country, and how to be an engaged citizen. Open government is revolutionizing the way citizens interact with government leaders. It connects like-minded citizens with each other, with government agencies and with many other types of organizations.

In the US, the open and accountable government President Obama mandated from his first day in office intentionally involves citizens in a way that's participatory, collaborative and transparent. It even calls on citizens to help solve national problems. Citizen scientists, for example, have created platforms to collect and aggregate landslide tracking data, which can provide warnings about landslide triggers.¹

"Government is more effective when it gathers input from the public as it makes decisions. By harnessing input and expertise from a wide array of voices, we can continue to strengthen government."²

Open Data: What It Is, Where It Comes From and Why It Matters

To support open government initiatives and uphold the values of transparency, participation and collaboration in the US, federal agencies now make their data open. This means making data publicly accessible in a format that can be shared. Highlighting the significance of open data, President Obama signed the nation's first legislative mandate for data transparency, the Digital Accountability and Transparency Act (DATA Act), in May 2014.

Open data from the government gives citizens the information they need to hold government leaders accountable. On the flip side, soliciting expertise from people outside of government helps leaders form policies based on the latest, most comprehensive information possible. Open data fosters collaboration between government leaders and citizens, and encourages cooperation internally among government entities. The results can be far better decisions that have the potential to change our lives drastically.

FDA Adverse Drug Event Data

Each year, health care professionals and consumers submit millions of individual reports on drug safety to the Food and Drug Administration (FDA). These anonymous reports are a critical tool to support drug safety surveillance. Today, this data is only available through limited quarterly reports. But the administration will soon be making these reports available in their entirety so that software developers can build tools to help pull potentially dangerous drugs off shelves faster than ever before.³

1 See: 2014.spaceappschallenge.org/project/landslide-tracker and landslidetracker.com.
2 See: whitehouse.gov/blog/2014/04/30/open-government-public-participation-we-can-t-do-it-without-you.
3 See: whitehouse.gov/blog/2014/05/09/continued-progress-and-plans-open-government-data and open.fda.gov/drug/event.

What Are APIs?

Application programming interfaces (APIs) are connectors between disparate computer systems. You can distribute (or connect) data to various types of applications through APIs. The first CTO of the US federal government, Aneesh Chopra, says APIs make government more transparent because they are the key to unlocking government data. APIs, he explains, dramatically lower the barriers of entry to information sharing because they allow data and multiple, diverse systems to connect. You can also use APIs to automate business processes and make them uniform and repeatable. For these reasons, Chopra believes APIs will have a huge effect on how the government functions over the next few years.⁴

One fertile source of open data from the federal government, as well as APIs from a variety of federal agencies and other resources, is data.gov. Some other rich open data sources are:

• USAID, a large organization responsible for international development: usaid.gov/open.
• The United Nations, which shares information about populations and more: data.un.org.
• The US Census Bureau, which has started rolling out data sets via APIs: census.gov/developers.
• World Bank Institute (WBI), which provides data related to the issue of poverty: data.worldbank.org/indicator.
• HHS, which has many efforts underway to support the open government agenda: hhs.gov/open/ and the HIW: healthindicators.gov.

As open data continues to emerge from multiple sources and in many varieties, citizens can use it in exciting new ways. For example, open data can help you assess college affordability, the economy, educational issues, environmental damage, health care, taxes, agriculture and the climate, and the list goes on.

How SAS® Can Help

SAS recognizes that the way people see the world's gigantic data stores must evolve, and SAS is committed to supporting open data initiatives. But first, we have to make data understandable and usable. To do that, data has to be clean, consolidated and available in a format that we can easily comprehend. This is where SAS excels.

SAS Visual Analytics is a platform for visualizing, exploring and analyzing data. Through SAS Stored Processes, which are programs that run on the web, SAS Visual Analytics can integrate with open data and related APIs. In turn, you can more easily spot trends and patterns in data and obtain new insights about all sorts of challenges. Using SAS Stored Processes inside of SAS Visual Analytics means your organization can capitalize on investments you've already made in SAS skills and technology. This helps you accelerate project development that involves open data, because you can produce results quickly and then publish easy-to-understand summaries in reports or dashboards.

SAS also lets you build, test and refine predictive models quickly so you can find a champion model: the one that will give you the most predictive power. Then you can use SAS to create and share simple, interactive reports that display your insights in whatever format is best suited to your data and your intentions.

Federal Health Data

The National Center for Health Statistics' (NCHS) Health Indicators Warehouse (HIW) is part of the US Department of Health and Human Services' (HHS) response to open government efforts to make federal data more accessible to all users. Through it, users can view and download data and metadata for more than 1,200 indicators on health status, outcomes and determinants from more than 180 federal and nonfederal sources. These sources include NCHS data systems, Centers for Disease Control and Prevention (CDC) surveillance data, census data, Medicare and Medicaid administrative data and much more. HIW provides access to data through an API.⁵
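To make the idea of pulling open data through an API more concrete, here is a minimal sketch in Python (not SAS). It is illustrative only: the endpoint URL, the query parameter names and the JSON field names are all hypothetical placeholders, not the interface of HIW or any other real agency. The sample response is made up so the example runs without a network connection.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint: a placeholder, not a real agency URL.
BASE_URL = "https://api.example.gov/v1/indicators"


def build_request_url(indicator, country, year):
    """Assemble a query URL for one indicator, country and year."""
    query = urlencode(
        {"indicator": indicator, "country": country, "year": year, "format": "json"}
    )
    return f"{BASE_URL}?{query}"


def parse_records(payload):
    """Turn a JSON response body into flat records, skipping missing values."""
    rows = json.loads(payload)["data"]  # "data" is an assumed field name
    return [
        {"country": r["country"], "year": int(r["year"]), "value": float(r["value"])}
        for r in rows
        if r.get("value") is not None
    ]


# A made-up response body, used here instead of a live network call.
SAMPLE_PAYLOAD = json.dumps(
    {
        "data": [
            {"country": "AAA", "year": "2012", "value": "61.5"},
            {"country": "BBB", "year": "2012", "value": None},
        ]
    }
)

if __name__ == "__main__":
    print(build_request_url("life_expectancy", "AAA", 2012))
    print(parse_records(SAMPLE_PAYLOAD))
```

In practice, each open data source documents its own URL scheme and response layout, so both functions would need to be adapted to the API you actually call.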

4 See: govtech.com/data/Former-US-CTO-Says-APIs-Are-Key-to-Unlocking-Government-Data.html.
5 See: healthindicators.gov.

Open Data in Action

Connecting directly to open data via APIs, you can pull data into SAS Visual Analytics to help you pinpoint, analyze, solve and predict both expected and unexpected problems.

Consider public health surveillance, for example. This activity is not as much about finding a bad guy as it is about monitoring and evaluating indicators that point to problem areas. The aim is to prevent diseases and outbreaks. Public health surveillance can also serve as an early warning system for impending emergencies, document the impact of an intervention, track progress toward public health goals, and clarify health problems to inform public health policies and strategies.

Public Health Surveillance: An Example

Imagine that you're going to look at more than 200 countries divided into eight regions over a period of 53 years. Your goal is to determine which locations have a high risk of low life expectancy. To complete this exercise, you'd want to look at open data sources containing government indicators like life expectancy, fertility rate and gross domestic product (GDP). You'd also want to capture general population information, including age dependency ratio, which compares the number of people who are not likely to be in the labor force (the majority of people under 16 and over 64) with the working-age population. Finally, you could include an indicator related to HIV and/or AIDS.

SAS has a number of ways to examine these indicators and shed light on where the high-risk areas are, and on where and how you need to focus your efforts. Let's see how it's done.
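As a small illustration of one of these indicators, the age dependency ratio can be computed from population counts by age. The Python sketch below is not taken from any SAS product. The function name and the "per 100 working-age people" scaling are assumptions, the age cutoffs follow the under-16 and over-64 bands described above, and the head counts are invented for the example.

```python
def age_dependency_ratio(population_by_age, young_below=16, old_above=64):
    """Return the number of dependents per 100 working-age people.

    population_by_age maps an age in years to a head count.
    """
    dependents = sum(
        count
        for age, count in population_by_age.items()
        if age < young_below or age > old_above
    )
    working_age = sum(
        count
        for age, count in population_by_age.items()
        if young_below <= age <= old_above
    )
    if working_age == 0:
        raise ValueError("no working-age population to compare against")
    return 100.0 * dependents / working_age


# Invented head counts keyed by age, for illustration only.
SAMPLE_POPULATION = {5: 300, 10: 200, 30: 600, 50: 400, 70: 100}

if __name__ == "__main__":
    print(age_dependency_ratio(SAMPLE_POPULATION))
```

Published data sources may use slightly different age bands, so the cutoffs are left as parameters.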

Figure 1: A correlation matrix displays the degree of correlation between measures as a series of colored rectangles. The dark blue rectangles here indicate a strong correlation. In this case, fertility rate and life expectancy exhibit a strong correlation.
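For readers who want to see what sits behind a correlation matrix like the one in Figure 1, the following Python sketch computes Pearson correlation coefficients for every pair of measures. It is an assumption that Pearson correlation is the statistic being visualized. The function names are hypothetical, and the values are synthetic, constructed to be perfectly linearly related so that the coefficient comes out at -1.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


def correlation_matrix(measures):
    """Map each pair of measure names to its correlation coefficient."""
    names = list(measures)
    return {a: {b: pearson(measures[a], measures[b]) for b in names} for a in names}


# Synthetic values for illustration, not real country statistics.
SAMPLE_MEASURES = {
    "fertility_rate": [6.0, 5.0, 4.0, 3.0, 2.0],
    "life_expectancy": [50.0, 55.0, 60.0, 65.0, 70.0],
}

if __name__ == "__main__":
    for name, row in correlation_matrix(SAMPLE_MEASURES).items():
        print(name, row)
```

A coefficient near 1 or -1 indicates a strong linear relationship, and the sign shows whether the two measures move together or in opposite directions.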

Figure 2: Using open data with SAS Visual Analytics can help identify trends and patterns that reveal new insights. In this cluster of sub-Saharan African countries, SAS Visual Analytics identifies trends in life expectancy and fertility rate over time. Note that Rwanda, denoted by the green bubble, shows a significant dip in life expectancy and family size due to the civil war in the early 1990s; but it dramatically improves on these health indicators by 2012.

Figure 3: SAS Visual Statistics is an add-on to SAS Visual Analytics. It extends the capabilities of SAS Visual Analytics by creating, testing and comparing models based on the patterns SAS Visual Analytics discovers. The predictive (linear regression) model in the screenshot above identifies and predicts the risk of low life expectancy in our collection of countries.
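The caption above mentions a linear regression model. As a rough illustration of the technique, and not of how SAS Visual Statistics fits its models, the Python sketch below fits a one-predictor ordinary least squares line and flags predictions that fall under a cutoff. The function names, the single predictor, the cutoff of 55 years and the data values are all assumptions made for the example.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predicted outcome for predictor value x."""
    slope, intercept = model
    return intercept + slope * x


def is_high_risk(model, x, cutoff=55.0):
    """Flag a case whose predicted life expectancy is below the cutoff.

    The cutoff is a hypothetical threshold chosen for illustration.
    """
    return predict(model, x) < cutoff


# Synthetic training data for illustration, not real country statistics.
FERTILITY_RATE = [2.0, 3.0, 4.0, 5.0, 6.0]
LIFE_EXPECTANCY = [70.0, 65.0, 60.0, 55.0, 50.0]

if __name__ == "__main__":
    model = fit_line(FERTILITY_RATE, LIFE_EXPECTANCY)
    print(model)
    print(predict(model, 7.0), is_high_risk(model, 7.0))
```

A real risk model would typically use several predictors and be validated against held-out data before its scores are published.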

Figure 4: In SAS Visual Analytics, you can use geomaps to plot risk scores that are a result of a model-building exercise. Then you can publish them across your organization through a dashboard. The dark red bubble here indicates a high-risk area for low life expectancy. In this case, it’s Sierra Leone, which has been at the epicenter of the Ebola outbreak. Based on insights from this map, some global health organizations might decide to reallocate resources to this region.

The Future Looks Brighter With Open Data and Analytics

What is effective decision making? For global health organizations, it means being able to confidently allocate resources and respond rapidly and effectively to problems all around the world. It means improving program prioritization. It means being able to validate existing investments.

Through public health surveillance, in conjunction with open data and analytics, government agencies can improve resource allocation by targeting problem spots. They can focus efforts on the programs shown to be most critical in the bigger scheme of things. They can understand how to allocate existing investments most appropriately across different regions and countries. And they can continually evaluate whether existing processes and focus areas are the best choices for the issues at hand.

The perceptions health organizations glean from open data and analytics are just glimmers of how different our world could be. Open data and analytics, grounded by human judgment, are invaluable at providing insights and enabling good decisions across all levels of government, at every agency and for every process. Now that we can extend our pool of data almost indefinitely, the possibilities for the future are almost limitless.

Learn More

Find out how SAS helps governments around the world improve the lives of citizens: sas.com/gov.

Try SAS Visual Analytics and SAS Visual Statistics for yourself: sas.com/tryva and sas.com/tryvs.

To contact your local SAS office, please visit: sas.com/offices

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies. © 2015, SAS Institute Inc. All rights reserved.