Info Paper AI Data Marketplace

DATA FUELS DIGITAL AGENCIES

ACHIEVING YOUR MISSION: THE AI DATA MARKETPLACE SOLUTION AT EQUINIX®

Enabling trusted, effective data sharing in a scalable, secure, cost-effective environment to support AI innovation and meet mission needs.

Introduction

Government agencies increasingly depend on data to support decisions and automate actions. Artificial intelligence (AI) is used to predict trends, optimize processes, speed up decision-making, enhance accuracy, improve pattern recognition and much more. For AI to learn how to reason, it must be able to access large, diverse data sets that ensure its functionality and accuracy. The demand for data from within and outside of agencies is increasing exponentially: on average, 75 percent of enterprise applications use 10 external data sources.¹ Agencies must utilize neutral, secure data marketplaces to exchange the data and algorithms that make AI-driven innovation possible.

Data marketplaces (also known as digital data marketplaces, or DDMs) allow data providers and data consumers to share, buy or sell data and algorithms privately and securely, without violating government regulations such as GDPR, using a programmable, community-owned, safe and secure infrastructure that organizes trust, with models that regulate how the members of the marketplace interact. The marketplace facilitates legal and asset registration that enable third-party services (data anonymization, conflict arbitration, analytics tools, etc.) and payment management. Sharing data assets via a secure, trusted and neutral data marketplace driven by a consortium established on the basis of a common benefit is a promising opportunity for government agencies as they advance their missions in the 21st century.

“The ability to merge physical models with digital content and conduct deep analysis at extraordinary speeds presents unprecedented opportunity to transform the productivity, capacity, and capability of the geospatial intelligence workforce.”
MeriTalk, in collaboration with USGIF¹

One Platform. One Solution.

Simplify Data Exchange and Monetization
Easy to share. Easy to consume. Easy to sell. Easy to buy. All in one platform.

Ensure Data Traceability and Integrity
Fully secured AI models and data lineage tracking.

Bring Algorithms to the Data*
Run analytics wherever data resides. No need to move data.

Enjoy Proven Trust and Neutrality
Turnkey data center and capabilities in ONE solution. No need to integrate point solutions.

Global Distributed Solution
Support for data sharing in different regions/markets for data residency/compliance.

* Depending upon the trust model, bring data to the algorithm or the algorithm to the data.

Equinix.com

[Figure 1 labels: Platform Equinix®; Governance, Marketplace, Analytics, Infrastructure; Agency 1, Agency 2, Agency 3; Public Cloud 1, Public Cloud 2; Data Broker. Each participant exchanges Data & Algorithm with the marketplace.]

Fig. 1. A highly secure data marketplace enables sovereign organizations to make assets available to others for mutual benefit.

Data marketplace overview

The basic role of the data marketplace is to organize and facilitate interactions between data suppliers and algorithm developers to explore, select, and agree to create, execute and complete data science transactions. A data marketplace is a global structure that enables sovereign organizations (many of which require absolute control of their data assets) to make assets available to others under strict conditions to achieve mutual benefits that no single organization could achieve on its own.

There are many types of data marketplace solutions already out there, but they aren’t fully addressing government agencies’ data sharing challenges and concerns. In many cases, they do not adequately satisfy the demands of data providers or data consumers, either. Agencies need a solution that gives them the confidence to share data and algorithms with each other and with partner organizations and service providers. They need a trusted, secure data marketplace.

The high-level architecture depicted in Figure 1 shows the data marketplace as an entity that is owned and operated by a membership organization through which data providers and data consumers can securely interact and transact based on community rules and individual agreements.

The AI Data Marketplace Solution at Equinix is different, and here’s why

1. It supports multiple data sharing and trust archetypes

Different data sets warrant different data sharing and trust models. The AI Data Marketplace solution at Equinix makes it easy to bring data and algorithms into a secure, software-definable and geo-distributed data exchange sandbox while preventing the data or algorithms from being exposed to other parties. AI algorithms can be trained on data from different owners at different locations via the data marketplace, making the AI Data Marketplace solution at Equinix right for most government agency use cases. The solution supports all three of these data sharing mechanisms:

Distributed Model
Bring the algorithm to the data
Data providers who are unwilling to let data leave their premises due to confidentiality or intellectual property concerns can utilize a private cage at Equinix to host an AI training stack and run federated, privacy-preserving algorithms that have higher power density requirements.

Federated Model
Bring the data and algorithms to a neutral exchange location
Data and algorithm providers who prefer that their assets remain inaccessible in each other’s locations can utilize secure, neutral exchange infrastructure cages inside Equinix data centers for data trading and algorithm use (see Figure 2). Governance is negotiable. Raw data and algorithms are never taken outside the shared cage.
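The federated arrangement described above and pictured in Figure 2 builds local models where the data lives and combines them at a neutral location. A minimal Python sketch of that idea follows, assuming a one-variable linear model and sample-weighted parameter averaging as the aggregation step. The `LocalUpdate`, `train_local` and `aggregate` names are hypothetical, and this is an illustration only, not the solution’s implementation or any particular framework’s API.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class LocalUpdate:
    """Parameters fitted inside one provider's private cage (hypothetical type)."""
    provider: str
    weights: List[float]  # [slope, intercept]
    num_samples: int


def train_local(provider: str, xs: Sequence[float], ys: Sequence[float]) -> LocalUpdate:
    """Fit y = w*x + b by least squares on data that stays with the provider."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return LocalUpdate(provider, [w, b], n)


def aggregate(updates: Sequence[LocalUpdate]) -> List[float]:
    """Combine local parameters into a global model, weighting by sample count."""
    total = sum(u.num_samples for u in updates)
    dim = len(updates[0].weights)
    return [
        sum(u.weights[i] * u.num_samples for u in updates) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two providers train locally; only parameters and counts are shared.
    u1 = train_local("customer-1", [0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
    u2 = train_local("customer-2", [3.0, 4.0, 5.0, 6.0], [7.0, 9.0, 11.0, 13.0])
    print(aggregate([u1, u2]))
```

In this sketch only the fitted parameters and sample counts cross the cage boundary; the raw `xs` and `ys` stay with each provider.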

Centralized Model
Bring the data to the algorithm
Data providers who are comfortable sharing low-risk or non-confidential assets can utilize a public cloud marketplace or a hybrid model where the data to be shared is stored in a persistent manner in a private cage at Equinix, then moved into the public cloud infrastructures used by data science organizations for sharing or model training purposes.

[Figure 2 labels: Customer 1 Private Cage, Customer 2 Private Data Center and Customer 3 Public Cloud (local model building, local models); Shared Consortium Cage in an Equinix Data Center (global model building, global model).]

Fig. 2. A federated learning framework.

2. Its multi-zone architecture ensures that data is always secure

The AI Data Marketplace solution at Equinix deploys an architecture with three separate security zones (domains): the control plane software zone, the provider/consumer data exchange zone, and the data provider’s permanent data storage location (secure repository). Among the security advantages:

• Protection against hacking – Cybercriminals cannot access data being exchanged between parties in the data exchange zone.

• Inability to take raw data out – AI/analytics pipelines are executed in a sandbox run on a Kubernetes cluster where ingress/egress is strictly controlled.

• Restricted access pattern monitoring – Providers do not have access to, or visibility into, how data consumers are using purchased data, and the IP of data consumer algorithms is protected.

• Flexibility in security hardware – Providers have several encryption options to ensure their data can never be accessed in the clear in the data exchange zone.

• Time-bound access to data – dProxy provides a layer of indirection that ensures a provider’s real data storage location is never divulged to a data consumer.

• Auditability and lineage tracking – Lineage tracking can be done for any AI model created in the secure sandbox (data sources, AI frameworks, and who did the model training).

From a mission standpoint, stakeholders believe GEOINT-related AI will have the greatest impact on:¹
• National security
• Urban planning and development
• Emergency response/natural disaster aid
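The time-bound access and indirection idea in the security list above can be illustrated with a small Python sketch. It assumes an opaque, expiring handle that a proxy maps to the provider’s real storage location. The `AccessProxy` class, its methods and the `storage://` location are hypothetical and do not represent the dProxy interface.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Grant:
    real_location: str  # known only to the proxy, never handed to consumers
    expires_at: float   # Unix timestamp after which the handle is rejected


class AccessProxy:
    """Hypothetical indirection layer; an illustration, not the dProxy API."""

    def __init__(self) -> None:
        self._grants: Dict[str, Grant] = {}

    def issue_handle(self, real_location: str, ttl_seconds: float,
                     now: Optional[float] = None) -> str:
        """Give the consumer a random handle instead of the real location."""
        now = time.time() if now is None else now
        handle = secrets.token_urlsafe(16)
        self._grants[handle] = Grant(real_location, now + ttl_seconds)
        return handle

    def resolve(self, handle: str, now: Optional[float] = None) -> Optional[str]:
        """Resolve a handle inside the exchange zone; None if unknown or expired."""
        now = time.time() if now is None else now
        grant = self._grants.get(handle)
        if grant is None or now >= grant.expires_at:
            self._grants.pop(handle, None)  # drop expired grants
            return None
        return grant.real_location


if __name__ == "__main__":
    proxy = AccessProxy()
    handle = proxy.issue_handle("storage://agency-1/records", ttl_seconds=60)
    print("consumer sees only:", handle)
```

The consumer holds only the random handle, so the storage location stays private, and the expiry check gives the provider a simple way to bound how long access lasts.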

3. Federated analytics enable privacy and efficient data handling

The AI Data Marketplace solution at Equinix is distributed, meaning that a marketplace can simultaneously manage multiple geo-distributed data exchange locations at any given point in time. This allows it to support federated learning frameworks where local AI models on private infrastructure stacks (as shown in Figure 2) can be built, then aggregated into a global AI model at a mutually trusted neutral location like Equinix. The AI Data Marketplace solution at Equinix allows data scientists to invoke third-party federated learning frameworks via Kubeflow.

A federated learning approach is primarily useful for two reasons:

• Privacy-preserving AI – When data providers want algorithm providers to ship their algorithm to the data location because they do not want to let raw data out of their security domain, federated learning can be leveraged to build a model locally, then share the anonymized model with the data consumer.

• Efficient handling of large datasets at the edge – Federated learning also solves the issue of having to transport large, edge-based datasets. Rather than sending raw data (which reaches into the terabytes) to a far-off, centralized location, the AI model (typically kilobytes) is built locally and sent instead. Since traffic doesn’t have to be backhauled from the edge to a core location, it is faster and less costly.

4. It just makes sense

Equinix IBX® data centers are within 10 ms (round trip) of end devices in most markets. They are secure, neutral interconnection hubs where data already flows between thousands of cloud service providers, network service providers, financial institutions, and media companies. As the amount of data generated at the edge increases, it makes sense to trade and process data at the metro edge, close to where it is generated, rather than backhaul it to a central cloud or private data center, for cost, performance and privacy reasons. Since Equinix has interconnection-rich data centers in 54 markets globally, it is the right place for data exchange.

Unlocking the Power of AI

The American Artificial Intelligence Initiative calls for government agencies to maximize AI resources, embrace trustworthy AI and remove barriers to AI innovation. To do so effectively, they must collaborate with each other and with non-federal entities to share the data and algorithms that feed robust AI models and drive innovation.

“The president’s FY21 budget commits to double AI research and development over the next two years.”
The White House Office of Science and Technology Policy, February 2020

Equinix partners with the world’s leading technology companies to ensure optimal performance in every layer of its data marketplace architecture: the security layer; the data marketplace and metadata storage layer; the analytics platform layer; and the infrastructure layer. The AI Data Marketplace solution at Equinix provides several ways to share data sets with different security requirements on a highly interconnected platform. With secure, neutral exchange hubs for inferencing available worldwide, it gives government agencies the ability to unlock the power of AI to achieve their missions.

For more information on the AI Data Marketplace at Equinix, including strategy, best practices and the steps involved in standing up a marketplace:

Read the White Paper
eqix.it/AIgovwhitepaper

Explore the Playbook
eqix.it/AIgovplaybook

Ready to get started?

Learn more about how the AI Data Marketplace at Equinix can help you achieve your agency mission.

eqix.it/DESBfederal

References

1. “Mapping AI to the GEOINT Workforce,” MeriTalk and the United States Geospatial Intelligence Foundation, April 2020.

2. David Freeman Engstrom, Daniel E. Ho, Catherine M. Sharkey, and Mariano-Florentino Cuéllar, “Government by Algorithm: Artificial Intelligence in Federal Administrative Agencies,” February 2020.

Questions? Equinix.com/Contact-Us

© 2020 Equinix, Inc. AI Data Marketplace_Info Paper_Value_US-EN | 0920