An Enterprise Architect's Guide to Big Data
Reference Architecture Overview

Oracle Enterprise Architecture White Paper | March 2016

Disclaimer

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Table of Contents

Executive Summary
A Pointer to Additional Architecture Materials
Fundamental Concepts
  What is Big Data?
  The Big Questions about Big Data
  What’s Different about Big Data?
Taking an Enterprise Architecture Approach
Big Data Reference Architecture Overview
  Traditional Information Architecture Capabilities
  Adding Big Data Capabilities
  A Unified Reference Architecture
  Enterprise Information Management Capabilities
  Big Data Architecture Capabilities
  Oracle Big Data Cloud Services
Highlights of Oracle’s Big Data Architecture
  Big Data SQL
  Data Integration
  Oracle Big Data Connectors
  Oracle Big Data Preparation
  Oracle Stream Explorer
  Security Architecture
  Comparing Business Intelligence, Information Discovery, and Analytics
  Data Visualization
  Spatial and Graph Analysis
Extending the Architecture to the Internet of Things
Big Data Architecture Patterns in Three Use Cases
  Use Case #1: Retail Web Log Analysis
  Use Case #2: Financial Services Real-time Risk Detection
  Use Case #3: Driver Insurability using Telematics
Big Data Best Practices
Final Thoughts

Executive Summary

Today, Big Data is commonly defined as
data that contains greater variety arriving in increasing volumes and with ever higher velocity. Data growth, speed and complexity are being driven by the deployment of billions of intelligent sensors and devices that are transmitting data (popularly called the Internet of Things) and by other sources of semi-structured and structured data. The data must be gathered on an ongoing basis and analyzed, and the results must then direct the business toward appropriate actions, thus providing value.

Most organizations are keenly aware that Big Data is at the heart of nearly every digital transformation taking place today. For example, applications enabling better customer experiences are often powered by smart devices and make it possible to respond in the moment to customer actions. Smart products being sold can capture an entire environmental context. Business analysts and data scientists are developing a host of new analytical techniques and models to uncover the value provided by this data. Big Data solutions are helping to increase brand loyalty, manage personalized value chains, uncover truths, predict product and consumer trends, reveal product reliability, and discover real accountability.

IT organizations are eagerly deploying Big Data processing, storage and integration technologies in on-premises and Public Cloud-based solutions. Cloud-based Big Data solutions are hosted on Infrastructure as a Service (IaaS), delivered as Platform as a Service (PaaS), or provided as Big Data applications (and data services) via Software as a Service (SaaS). Each must meet critical Service Level Agreements (SLAs) for the business intelligence, analytical and operational systems and processes that it enables. These solutions must perform at scale and be resilient, secure and governable. They must also be cost effective, minimizing duplication and transfer of data where possible. Today’s architecture footprints can now be delivered consistently to these standards.
Oracle has created reference architectures for all of these deployment models.

There is good reason for you to look to Oracle as the foundation for your Big Data capabilities. Since its inception 35 years ago, Oracle has invested deeply across nearly every element of information management, from software to hardware and to the innovative integration of both on-premises and Cloud-based solutions. Oracle’s family of data management solutions continues to solve the toughest technological and business problems, delivering the highest performance on the most reliable, available and scalable data platforms. Oracle continues to deliver ancillary data management capabilities including data capture, transformation, movement, quality, security, and management while providing robust data discovery, access, analytics and visualization software.

Oracle’s unique value is its long history of engineering the broadest stack of enterprise-class information technology to work together: to simplify complex IT environments, reduce TCO, and minimize the risk when new areas, such as Big Data, emerge.

Oracle thinks that Big Data is not an island. It is merely the latest aspect of an integrated enterprise-class information management capability. Looked at on its own, Big Data can easily add to the complexity of a corporate IT environment as it evolves through frequent open source contributions, expanding Cloud-based offerings, and emerging analytic strategies. Oracle’s best-of-breed products, support, and services can provide the solid foundation for your enterprise architecture as you navigate your way to a safe and successful future state.

To deliver to business requirements and provide value, architects must evaluate how to efficiently manage the volume, variety, and velocity of this new data across the entire enterprise information architecture.
Big Data goals are no different from the rest of your information management goals; it’s just that now the economics and technology are mature enough to process and analyze this data.

This paper is an introduction to the Big Data ecosystem and the architecture choices that an enterprise architect will likely face. We define key terms and capabilities, present reference architectures, and describe key Oracle products and open source solutions. We also provide some perspectives and principles and apply these in real-world use cases. The approach and guidance offered are the byproduct of hundreds of customer projects and highlight the decisions that customers faced in the course of their architecture planning and implementations.

Oracle’s architects work across many industries and government agencies and have developed a standardized methodology based on enterprise architecture best practices. This methodology should look familiar to architects who know TOGAF and other architecture best practices. Oracle’s enterprise architecture approach and framework are articulated in the Oracle Architecture Development Process (OADP) and the Oracle Enterprise Architecture Framework (OEAF).

A Pointer to Additional Architecture Materials

Oracle offers additional documents that are complementary to this white paper. A few of these are described below:

IT Strategies from Oracle (ITSO) is a series of practitioner guides and reference architectures designed to enable organizations to develop an architecture-centric approach to enterprise-class IT initiatives. ITSO presents successful technology strategies and solution designs by defining universally adopted architecture concepts, principles, guidelines, standards, and patterns.

The Big Data and Analytics Reference Architecture paper (39 pages) offers a logical architecture and Oracle product mapping.
The Information Management Reference Architecture (200 pages) covers the information management aspects of the Oracle Reference Architecture (ORA). It describes important concepts, capabilities, principles, and technologies, along with several architecture views (conceptual, logical, product mapping, and deployment) that help frame the reference architecture.

The security and management aspects of information management are covered by the ORA Security paper (140 pages) and the ORA Management and Monitoring paper (72 pages).

Other related documents in this ITSO library cover cloud computing, business analytics, business process management, and service-oriented architecture.

The Information Management and Big Data Reference Architecture (30 pages) white paper offers a thorough overview of a vendor-neutral conceptual and logical architecture for Big Data. This paper will help you understand many of the planning issues that arise when architecting a Big Data capability.

Examples of the business context for Big Data implementations at many companies and organizations appear in the industry white papers posted on the Oracle Enterprise Architecture web site. Industries covered include agribusiness, communications service providers, education, financial services, healthcare payers, healthcare providers, insurance, logistics and transportation, manufacturing, media and entertainment, pharmaceuticals and life sciences, retail, and utilities.

Lastly, numerous Big Data materials can be found on Oracle Technology Network (OTN) and Oracle.com/BigData.

Fundamental Concepts

What is Big Data?

Historically, a number of the large-scale