The AI Data Marketplace Solution at Equinix
Info Paper: AI Data Marketplace

DATA FUELS DIGITAL AGENCIES

ACHIEVING YOUR MISSION: THE AI DATA MARKETPLACE SOLUTION AT EQUINIX®

Enabling trusted, effective data sharing in a scalable, secure, cost-effective environment to support AI innovation and meet mission needs.

Introduction

Government agencies increasingly depend on data to support decisions and automate actions. Artificial intelligence (AI) is used to predict trends, optimize processes, speed up decision-making, enhance accuracy, improve pattern recognition and much more. For AI to learn how to reason, AI algorithms must be able to access large, diverse data sets to ensure their functionality and accuracy. The demand for data from within and outside of government agencies is increasing exponentially: on average, 75 percent of enterprise applications use 10 external data sources.[1] Agencies must utilize neutral, secure data marketplaces to exchange the data and algorithms that make AI-driven innovation possible.

Data marketplaces (also known as digital data marketplaces, or DDMs) allow data providers and data consumers to share, buy or sell data and algorithms privately and securely, without violating government regulations such as GDPR, using a programmable, community-owned, safe and secure infrastructure that organizes trust. Governance models regulate how the members of the marketplace interact. The marketplace facilitates legal contracts and asset registration that enable third-party services (data anonymization, conflict arbitration, analytics tools, etc.) and payment management.

Sharing data assets via a secure, trusted and neutral data marketplace driven by a consortium established on the basis of a common benefit is a promising opportunity for government agencies as they advance their missions in the 21st century.

One Platform. One Solution.

- Simplify Data Exchange and Monetization: Easy to share. Easy to consume. Easy to sell. Easy to buy. All in one platform.
- Ensure Data Traceability and Integrity: Fully secured AI models and data lineage tracking.
- Bring Algorithms to the Data*: Run analytics wherever data resides. No need to move data.
- Enjoy Proven Trust and Neutrality: Turnkey data center and technology capabilities in ONE solution. No need to integrate point solutions.
- Global Distributed Solution: Support for data sharing in different regions/markets for data residency/compliance.

* Depending upon the trust model, bring data to the algorithm or the algorithm to the data.

Equinix.com

"The ability to merge physical models with digital content and conduct deep analysis at extraordinary speeds presents unprecedented opportunity to transform the productivity, capacity, and capability of the geospatial intelligence workforce."
MeriTalk, in collaboration with USGIF [1]

[Figure 1 diagram, Platform Equinix®. Labels: Marketplace, Governance, Analytics, Infrastructure, Agency 1, Agency 2, Agency 3, Public Cloud 1, Public Cloud 2, Data Broker, with "Data & Algorithm" flows between them.]

Fig. 1. A highly secure data marketplace enables sovereign organizations to make assets available to others for mutual benefit.

Data marketplace overview

The basic role of the data marketplace is to organize and facilitate interactions between data suppliers and algorithm developers to explore, select, and agree to create, execute and complete data science transactions.

A data marketplace is a global structure that enables sovereign organizations (many of which require absolute control of their data assets) to make assets available to others under strict conditions to achieve mutual benefits that no single organization could achieve on its own.

There are many types of data marketplace solutions already out there, but they aren’t fully addressing government agencies’ data sharing challenges and concerns. In many cases, they do not adequately satisfy the demands of data providers or data consumers, either. Agencies need a solution that gives them the confidence to share data and algorithms with each other and with partner organizations and service providers. They need a trusted, secure data marketplace.

The high-level architecture depicted in Figure 1 shows the data marketplace as an entity that is owned and operated by a membership organization through which data providers and data consumers can securely interact and transact based on community rules and individual agreements.

The AI Data Marketplace Solution at Equinix is different, and here’s why

1. It supports multiple data sharing and trust archetypes

Different data sets warrant different data sharing and trust models. The AI Data Marketplace solution at Equinix makes it easy to bring data and algorithms into a secure, software-definable and geo-distributed data exchange sandbox while preventing the data or algorithms from being exposed to other parties. AI algorithms can be trained on data from different owners at different locations via the data marketplace, making the AI Data Marketplace solution at Equinix right for most government agency use cases.

The solution supports all three of these data sharing mechanisms:

Distributed Model
Bring the algorithm to the data
Data providers who are unwilling to let data leave their premises due to confidentiality or intellectual property concerns can utilize a private cage at Equinix to host an AI training stack and run federated, privacy-preserving algorithms that require higher power density.
Federated Model
Bring the data and algorithms to a neutral exchange location
Data and algorithm providers who prefer that their assets remain inaccessible in each other’s locations can utilize secure, neutral exchange infrastructure cages inside Equinix data centers for data trading and algorithm use (see Figure 2). Governance is negotiable. Raw data and algorithms are never taken outside the shared cage.

Centralized Model
Bring the data to the algorithm
Data providers who are comfortable sharing low-risk or non-confidential assets can utilize a public cloud marketplace or a hybrid model where the data to be shared is stored in a persistent manner in a private cage at Equinix, then moved into the public cloud infrastructures used by data science organizations for sharing or model training purposes.

[Figure 2 diagram: local model building at Customer 1, Customer 2 and Customer 3 sites; global model building in a shared consortium cage at an Equinix data center.]

Fig. 2. A federated learning framework.

2. Its multi-zone architecture ensures that data is always secure

The AI Data Marketplace solution at Equinix deploys an architecture with three separate security zones (domains): the control plane software zone, the provider/consumer data exchange zone, and the data provider’s permanent data storage location (secure repository). Among the security advantages:

- Protection against hacking: Cybercriminals cannot access data being exchanged between parties in the data exchange zone.
- Inability to take raw data out: AI/analytics pipelines are executed in a sandbox run on a Kubernetes cluster where ingress/egress is strictly controlled.
- Restricted access pattern monitoring: Providers do not have access to, or visibility into, how data consumers are using purchased data, and the IP of data consumer algorithms is protected.
- Flexibility in security hardware: Providers have several encryption options to ensure their data can never be accessed in the clear in the data exchange zone.
- Time-bound access to data: dProxy provides a layer of indirection that ensures a provider’s real data storage location is never divulged to a data consumer.
- Auditability and lineage tracking: Lineage tracking can be done for any AI model created in the secure sandbox (data sources, AI frameworks, and who did the model training).

From a mission standpoint, stakeholders believe GEOINT-related AI will have the greatest impact on:[1]

- National security
- Urban planning and development
- Emergency response/natural disaster aid

Unlocking the Power of AI

The American Artificial Intelligence Initiative calls for government agencies to maximize AI resources, embrace trustworthy AI and remove barriers to AI innovation. To do so effectively, they must collaborate with each other and with non-federal entities to share the data and algorithms that feed robust AI models and drive innovation.

"The president’s FY21 budget commits to double AI research and development over the next two years."
The White House Office of Science and Technology Policy, February 2020

3. Federated analytics enable privacy and efficient data handling

The AI Data Marketplace solution at Equinix is distributed, meaning that a marketplace can simultaneously manage multiple geo-distributed data exchange locations at any given point in time. This allows it to support federated learning frameworks where local AI models on private infrastructure stacks (as shown in Figure 2) can be built, then aggregated into a global AI model at a mutually trusted neutral location like Equinix. The AI Data Marketplace solution at Equinix allows data scientists to invoke third-party federated learning frameworks via Kubeflow.

A federated learning approach is primarily useful for two reasons:

- Privacy-preserving AI: When data providers want algorithm providers to ship their algorithm to the data location because they do not want to let raw data out of their security domain, federated learning can be leveraged to build a model locally, then share the anonymized model with the data consumer.

Equinix partners with the world’s