NoSQL in O&G
NoSQL in O&G
PPDM Q4 Data Management Luncheon – Fort Worth
© COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.

Agenda
Part 1 – What is the context?
Part 2 – What is the challenge?
Part 3 – What will cause a breakthrough?
Part 4 – How can NoSQL be applied?
Part 5 – Who is MarkLogic?
Thomas Tong, CTO - Energy

You are now the CEO of an O&G company
• WTI prices are at a 6-year low, from $110 to $44
• you are in a market share war
• growth portfolio has halted; risk averse
• cashflow is very tight; operations are constrained; CAPEX is reduced
• production is forecast to decline over the next 3 years
• reserves replacement ratio is down 23%
• loss of primary containment has increased 27%
• 13% of your leases are at risk
• new GHG restrictions will reduce Bakken profits by 14%
You have some hard decisions to make…
SLIDE: 3

You are now the CIO of an O&G company
• IT budgets have been rolled back to 2013 levels
• IT enterprise portfolio investments in BI, DQ, MDM have not realized ROI
• your relationship with the business is “limited”
• cyber-attacks have increased 3000% over the past 5 years
• 74% of next year's budget is OPEX
• most of your CAPEX is already allocated
• all new projects going through IT governance are scrutinized closely
• new compliance reporting must be on time, no option
• you must lay off 24% of your staff
You have some hard decisions to make…
SLIDE: 4

WHAT IS THE CHALLENGE?
SLIDE: 5
The “Great Divide” in O&G
• Complexity of business activity and operations has resulted in fragmented processes and “silos” of data/content, impacting leadership’s capacity to see the big picture
• Silo culture is inherently reflected along business lines, geographies, and KPIs, such as “operations + production is king + basin”
• O&G decentralized organizational structure supports localization, but not enterprise efforts
• Relationships between business, operations and IT have historically been constrained
The “Great Divide” threatens O&G performance capacities, operational excellence, and overall commitment to HSE
SLIDE: 6

The “Great Diversity” in O&G
• Complexity of business activity and operations demands specialization along the value chain of assets, producing unique data and content needs
• Size and complexity of data files/formats have inherently caused an application-centric mentality, resulting in a diverse IT portfolio
• Data and content are historically separated along core processes and applications
• Compliance needs continue to increase, from records management and GHG reporting to reverse reporting
The “Great Diversity” threatens O&G capacity to manage across the business and gain competitive advantage
SLIDE: 7

The “Great Divide and Diversity” in O&G
• Core processes segregated
• Application centric per core process
• Data type variance – size, format
• Location variance
• Tight schema adoption
• Temporal business rules
• Application-level security
• Poor practices - hard coupling
• Weak enterprise search
• Broken semantics
SLIDE: 8

Innovations you are experiencing
1. Three-dimensional seismic
2. Directional drilling
3. Horizontal drilling
4. Secondary and tertiary recovery
5. Hydraulic fracturing
6. Multistage fracturing
= Data
SLIDE: 9
Data is Growing at a Staggering Rate – Web 2.0 Needs
[Chart: data volume grows from 8 ZB in 2015 to 44 ZB in 2020]
Source: IDC
SLIDE: 10

Full Awareness – all sources of data
Location
• regions
• nodes
• zones
• hubs
Time
• hourly
• daily
• monthly
• quarterly
• yearly
Types
• natural gas molecules
• electrons
• pipeline
• transmission capacity
• generation capacity
• storage volumes
SLIDE: 11

Unprecedented Data Challenges
[Figure: 12% structured vs. 88% unstructured data; labels: Reference Data, Warehouse, OLTP, Data Marts, Archives, ?]
SLIDE: 12

Systems on Relational Databases – Not the Answer
[Figure, left panel: “Explosion of Heterogeneous Data”, structured vs. unstructured data on a 0–50 ZB axis, 8 ZB in 2015 and 44 ZB in 2020. Right panel: “Inability of O&G Companies to Store, Manage, and Search Their Data”; labels: Reference Data, OLTP, Warehouse, Data Marts, Archives, Unstructured Data, ?]
Source: IDC
SLIDE: 13

The Endless Cycle of Data Normalization
1. Take snapshot of current data
2. Build master data model based on initial view
3. Extract, transform, & load data into data model
4. Revise static model & restart process for new data
SLIDE: 14

The Endless Cycle of Data Normalization
1. Take snapshot of current data
2. Build master data model based on initial view
3. Extract, transform, and load data into data model
4. Revise static model and restart process for new data
2-5 years, $5M++
SLIDE: 15

Treadmill of Adding Data
[Figure labels: Articles, Books, Industry Data, Reports, ?]
1. Design infrastructure, services & applications
2. Analyze data formats
3. Define queries & service APIs
4. Define schemas, indexes and services
5. Build databases, middleware and services infrastructure
6. Define & implement ETL processes
7. Load and normalize data
8. Develop, integrate and test infrastructure & applications
SLIDE: 16

Well-Known SQL Database Issues
• Badly named columns
• Sparse data problem
• Brittle extension capabilities
• Slower query time for joins
• Extract, Transform, Load (ETL) changes
• Reporting tool changes
SLIDE: 17

WHAT WILL CAUSE A BREAKTHROUGH?
SLIDE: 18

First, a Quick History Lesson

Any Structure Era - NoSQL: “For all your data!”
• Massive scale
• Built for heterogeneous & unstructured data
• Faster time-to-results
• Commodity hardware
• Fraction of the cost

Enterprise - NoSQL: “Proven for mission critical”
• Hardened & certified security
• Integrated – search + app services
• Scalable + Hadoop integration
• ACID transactions
• High availability & disaster recovery

Open Source - NoSQL: “Contextually right”
• Scalable
• Designed for purpose
• Schema-agnostic
• Eventually consistent
• Less initial cost

Relational Era: “For all your structured data!”
• Bad for unstructured
• Difficult for heterogeneous
• Proprietary hardware
• Expensive

Hierarchical Era: “For your application data!”
• Proprietary hardware
• Very expensive
SLIDE: 19

Core Differentiator: Purpose-built for the Enterprise

                                        Relational   Enterprise NoSQL   Open Source NoSQL
ACID TRANSACTIONS                           ✔               ✔                  ✗
SECURITY                                    ✔               ✔                  ✗
HIGH AVAILABILITY & DISASTER RECOVERY       ✔               ✔                  ✗
SCHEMA-AGNOSTIC                             ✗               ✔                  ✔
SCALE-OUT                                   ✗               ✔                  ✔
ELASTIC                                     ✗               ✔                  ✗
TIERED STORAGE                              ✗               ✔                  ✗
SEMANTICS                                   ✗               ✔                  ✗
SLIDE: 20

A Change in DB Technology
SLIDE: 21

Reduced Complexity with Faster Speed to Value
SLIDE: 22
The Beauty of NoSQL
1. Take all of your data
2. Ingest your data as-is
3. Search and query everything
SLIDE: 23

Market Trends in the Energy Sector
• Costs to manage the complexity and volumes of data are accelerating
• Services to establish data agility are profound and growing
• Software will increase 6x in cost, but labor to make it work is much higher
SLIDE: 24

What You Can Do With NoSQL
• Bring all your data together – regardless of type
• Dynamically publish your content
• Analyze all of your data
• Tackle your most complex data challenges
SLIDE: 25

What NoSQL Databases Offer
• Schema flexibility
• Free of complex joins
• Horizontally scalable
• Compatible with commodity hardware
• Self-contained
• Rapid application development
SLIDE: 26

NoSQL Flavors
1. Key-value: Model data as a search/index key and a value represented as an uninterpreted sequence of bytes. You can quickly read a record based on its key, but you can’t search value data across multiple records.
2. Column-family: Like very large tables with zillions of rows and possible columns, but each row actually has a small number of columns compared with the total number possible. Programmers recognize this arrangement as a hash table or dictionary mapping a key to a set of key-value pairs.
3. Document: Similar to key-value, except that the value associated with the key contains structured and semi-structured data, which is labeled a “document”. You can query against the structure, as well as elements within that structure, and return only portions of the document.
4. Graph: The relationships among the various entities are the most important thing. In the graph, a node is a particular entity, and the edges between the nodes are labeled with a particular kind of relationship. An edge may have particular attributes.
SLIDE: 27
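
As a rough illustration of the four flavors described above, the following Python sketch mimics each model with plain in-memory structures. It is not any vendor's API: the well identifiers, field names, company name, and helper functions (`kv_get`, `columns_for`, `get_path`, `find_documents`, `neighbors`, `wells_operated_by`) are all hypothetical and exist only to show how each model is read and queried.

```python
import json

# All identifiers and values below are made up for illustration.

# 1. Key-value: the value is an opaque byte string; access is by key only.
kv_store = {
    "well:001": json.dumps({"name": "Alpha 1H", "basin": "Bakken"}).encode("utf-8"),
}

def kv_get(key):
    """Return the raw bytes stored under a key; the store never looks inside them."""
    return kv_store[key]

# 2. Column-family: each row key maps to its own, possibly different, set of columns.
column_family = {
    "well:001": {"name": "Alpha 1H", "basin": "Bakken"},
    "well:002": {"name": "Beta 2H", "operator": "ExampleCo", "status": "drilling"},
}

def columns_for(row_key):
    """List the column names actually present for one row."""
    return sorted(column_family[row_key])

# 3. Document: the value is structured, so queries can reach inside it.
documents = {
    "well:001": {"name": "Alpha 1H", "location": {"basin": "Bakken", "state": "ND"}},
    "well:002": {"name": "Beta 2H", "location": {"basin": "Permian", "state": "TX"}},
}

def get_path(doc, path):
    """Walk a dotted path such as 'location.basin'; return None if it is absent."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current

def find_documents(path, value):
    """Return ids of documents whose nested field at `path` equals `value`."""
    return sorted(
        doc_id for doc_id, doc in documents.items() if get_path(doc, path) == value
    )

# 4. Graph: nodes are entities; each edge carries a label and optional attributes.
edges = [
    ("well:001", "OPERATED_BY", "company:ExampleCo", {"since": 2013}),
    ("well:002", "OPERATED_BY", "company:ExampleCo", {"since": 2014}),
    ("well:001", "LOCATED_IN", "basin:Bakken", {}),
]

def neighbors(node, label):
    """Follow outgoing edges with the given label from a node."""
    return [dst for src, lbl, dst, _attrs in edges if src == node and lbl == label]

def wells_operated_by(company):
    """Follow OPERATED_BY edges backwards to find the wells of a company."""
    return sorted(
        src for src, lbl, dst, _attrs in edges if lbl == "OPERATED_BY" and dst == company
    )
```

The contrast to notice is where the query logic can reach: `kv_get` can only fetch by key and leaves decoding to the caller, `find_documents` filters on a field nested inside the value, and `wells_operated_by` answers a question purely by traversing labeled relationships.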