Data Vault & Ensemble Modeling
BI Podium Next Generation DWH Modeling 2013
Hans Hultgren (gohansgo)
Genesee Academy, LLC, 25568 Genesee Trail Rd, Golden, Colorado 80401, (303) 526-0340
Data Vault Modeling and Approach; DW2.0 and Unstructured Data; Master Data Management and Metadata
© 2013 Genesee Academy, LLC

Data Vault & Ensemble Modeling
• Welcome
• Quick audience poll:
– Data Warehousing Business Intelligence
– Data Vault Modeling
– Certification Course
• Session will cover:
– Data Vault
– Ensemble
– Unified Decomposition
– Data Warehousing
– Agility
• More information

Data Vault and Ensemble Modeling (Intro)
• About Data Warehousing - Characteristics
• Why do we need it?
Each layer of the architecture has its own requirements, constraints & variables
3 layer architecture…

About Data Vault, Ensemble & the EDW (Intro)
• Enterprise Data Warehousing
• Integrated, Non-Volatile, Time-Variant, Subject/Concept Oriented, Central data store.
• Core Features: Enterprise-Wide, Historized, Auditable, Central Data, Integrated across all forms of sources, internal and external.

Why do we use Data Vault (Intro)
• Integration
• Traceability
• History
• Incremental Build
• Agility
• Gracefully Adapts to New Sources
• Full Auditability - Source to Mart
• Enterprise View of Central Data
• Data Vault is optimized for modeling the EDW

Data Vault & Ensemble Modeling (Intro)
• Data Vault is the leading data modeling approach among new options for the flexible/agile data warehouse.
Data Modeling Approaches:
– Operational: 3rd Normal Form
– Data Warehouse: Data Vault
– Data Mart: Dimensional
• For data warehouse agility there are other techniques as well. The broader family of techniques are all flavors of Ensemble Modeling.
• In effect Ensemble modeling = EDW modeling.
• Ensemble is based on the premise: The flexibility required by the data warehouse needs a model that de-couples changing context from relationships from the business keys (Unified Decomposition).

Agenda
• Background Topics:
– Core Business Concepts
– Agility
• Unified Decomposition
• Ensemble Modeling
• Data Vault Agility
• The Data Vault Ensemble
• Data Vault Core Constructs
• Applying Data Vault
• Core Concepts and the Backbone
• DV Pattern applied
• Bottom Line and Summary

INTEGRATION & THE CORE BUSINESS CONCEPT

The Core Business Concept
• The Core Business Concept is the basis for our Data Vault Data Warehouse. It is similar to the Entity in 3NF or a Dimension in a Star Schema, so it commonly includes Customer, Product, Employee, etc.
• Important to note: 1) Business Driven, and 2) Enterprise Wide.
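
As a rough illustration of a Core Business Concept such as Customer, and of the earlier premise that changing context should be de-coupled from relationships and from business keys, the following Python sketch splits one flat source row into those three parts. It is a sketch under stated assumptions, not the deck's own notation: the class names `ConceptKey`, `Relationship`, and `Context`, the `decompose` helper, and the source row layout are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptKey:
    """The stable part: the business key of one instance of a concept."""
    concept: str        # e.g. "Customer" (hypothetical label)
    business_key: str   # e.g. a customer number


@dataclass(frozen=True)
class Relationship:
    """A relationship between concept keys; carries no descriptive data."""
    members: tuple


@dataclass
class Context:
    """Descriptive data about one key; the part expected to change."""
    key: ConceptKey
    attributes: dict


def decompose(flat_row: dict) -> tuple:
    """Split a flat source row (hypothetical layout) into key, relationship, context."""
    customer = ConceptKey("Customer", flat_row["customer_no"])
    employee = ConceptKey("Employee", flat_row["account_manager_no"])
    relationship = Relationship(members=(customer, employee))
    context = Context(customer, {"name": flat_row["name"], "city": flat_row["city"]})
    return customer, relationship, context


row = {"customer_no": "C-1001", "account_manager_no": "E-17",
       "name": "Acme Ltd", "city": "Golden"}
key, rel, ctx = decompose(row)

# A change of city produces a new context record only;
# the key and the relationship are not touched.
moved = Context(key, {**ctx.attributes, "city": "Denver"})
```

The point of the split is that a change in descriptive data produces a new `Context` while the key and the relationship stay exactly as they were.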

ABOUT AGILITY

Agile Data Warehousing (BI)
• Agility = Measure of ability to Adapt to Change
• The EDW is constantly needing to adapt to change
– New Sources
– New Attributes
– Changing Sources
– New and Changing Requirements
– New and Changing Business Rules
– New and Changing Deliveries
– Expanding Subject Areas
Data Warehousing = Adapting to Change

UNIFIED DECOMPOSITION™

Unified Decomposition™
Separate things that change from things that are not changing.
• Break things out into component parts for flexibility and to facilitate the capture of things that are either interpreted in different ways or changing independently of each other. Decomposition.
• These parts however need to be integrated to define the core business concept (the Entity, the Dimension, etc.). So they must be kept together. Unified.

Ensemble Modeling™
• The constellation of component parts acts as a whole – an Ensemble. All the parts of a thing taken together, so that each part is considered only in relation to the whole.
• With Ensemble Modeling the Core Business Concepts that we define and model are represented as a whole – an ensemble – including all of the component parts.
• An Ensemble is based on all things defining a Core Business Concept that can be uniquely and specifically said for one instance of that Concept.

Data Vault Agility
• The Data Vault Ensemble conforms to a single key embodied in the Hub construct.
• The component parts for the Data Vault Ensemble include:
– Hub: The Natural Business Key
– Link: The Natural Business Relationships
– Satellite: All Context, Descriptive Data and History

The Data Vault Ensemble (Core)
• Data Vault constructs have been broken out by type of data…
[Figure: Customer]

Hubs
– A Hub Construct in Data Vault
• contains Business Key
• only the Business Key
• contains No Context
• is always 1:1 with EWBK
– A Hub Table contains only
• Business Key
• Surrogate Key (Data Warehouse)
• Load Date / Time Stamp
• Record Source
Example table H_Customer: H_Customer_SID, Business Key, Date/Time Stamp, Record source

Links
– A Link Construct in Data Vault
• contains Relationship
• only a Relationship
• contains No Context
• is always 1:1 with Relationship
– A Link Table contains only
• 2-n FKs for the Relationship
• Surrogate Key (Data Warehouse)
• Load Date / Time Stamp
• Record Source
Example table L_Cust_Class: L_Cust_Class_SID, H_Sequence1_SID, H_Sequence2_SID, Date/Time Stamp, Record source

Satellites
– A Satellite Construct in Data Vault
• contains Context only
• has no FKs (no relationships)
• Designed by
* Rate of Change
* Type of Data
* System…
– A Satellite Table contains only
• Business Key FK + Load Date / Time Stamp
• Context Data…
• Record Source
Example table S_Customer: H_Customer, Date/Time Stamp, Context A, Context B, Context C, Context D, Record source

Applying the data vault modeling pattern

Data Vault Model – How it Looks
Data Vault Model for Customer Sales with Employee and Product.

Core Concepts
Six (6) Concept Keys

Data Vault Backbone
The core foundation, the skeletal structure of the data vault model
The model as viewed..
without the things that describe the key
without the things that change over time
Six (6) Concept Keys

The Complete Data Vault Model
Complete model with all context and history. Easily adapting to changes.

Applying the data vault modeling pattern

Tracking History: Time Slice Data

Impact of Change: New Attribute

The Bottom Line
• The Data Warehouse needs to adapt to change easily, be based on central business concepts, integrate data from several sources, track history of changing context, contain trusted and auditable information, and it needs to perform.
• Answering this call means a data warehouse program that is designed to meet these requirements with the people, processes, and the modeling techniques that support them.
• Data Warehouse modeling => Ensemble modeling. Techniques that are based on Unified Decomposition. There are several forms of Ensemble methods in play today.
• Data Vault modeling is the leading form of Ensemble modeling today.
• The Best Practice is Modeling Awareness

Data Vault Around the World
Estimated 750 Data Vault based Data Warehouses around the world

Data Vault Certification Course
The Genesee Academy CDVDM – Data Vault Modeling Course. The CDVDM is the data vault certification course covering all main topics of data vault modeling. The course is delivered in a blended learning method using online video lessons (2 weeks), classroom lectures, exercises, labs and small group modeling cases.
Public courses are offered on a regular schedule (www.GeneseeAcademy.com) and there are in-company options as well.
Data Vault Class, June 10-11, Amsterdam NL. Register Today!

About Hans Hultgren
• Hans Hultgren is an author, speaker, educator and advisor in the data warehousing and business intelligence space. He is an expert on data vault modeling and the author of Modeling the Agile Data Warehouse with Data Vault, where he introduced Ensemble Modeling and Unified Decomposition.
• Hans is the President of Genesee Academy, LLC (including also www.DataVaultAcademy.com), which provides the CDVDM data vault certification around the globe.
• For 20 years Hans was a professor at DU, where he was the founder and director of the Master of Science degree in business intelligence and data warehousing (MSBI).

Links and Information
CDVDM Training & Certification: www.GeneseeAcademy.com
[email protected]
gohansgo
Book: DataVaultBook.blogspot.com
HansHultgren.WordPress.com
HansHultgren
DataVaultAcademy
Online video-lesson training: DataVaultAcademy.com
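
As a closing illustration, the Hub, Link and Satellite constructs and the time slice and new attribute slides can be sketched with Python's built-in sqlite3 module. This is a minimal sketch under assumptions, not the course's reference implementation. The table names H_Customer, L_Cust_Class and S_Customer come from the slides; the column names Customer_BK and Load_DTS, the H_Class hub, the S_Customer_Credit satellite, the sample rows and the `context_as_of` helper are hypothetical. Handling a new attribute by adding a new Satellite is one common approach and is assumed here, since the slide's figure is not recoverable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE H_Customer (               -- Hub: business key only, no context
    H_Customer_SID INTEGER PRIMARY KEY, -- surrogate key (data warehouse)
    Customer_BK    TEXT NOT NULL UNIQUE,
    Load_DTS       TEXT NOT NULL,       -- load date / time stamp
    Record_Source  TEXT NOT NULL
);
CREATE TABLE H_Class (                  -- second Hub (hypothetical)
    H_Class_SID    INTEGER PRIMARY KEY,
    Class_BK       TEXT NOT NULL UNIQUE,
    Load_DTS       TEXT NOT NULL,
    Record_Source  TEXT NOT NULL
);
CREATE TABLE L_Cust_Class (             -- Link: relationship only, no context
    L_Cust_Class_SID INTEGER PRIMARY KEY,
    H_Customer_SID   INTEGER NOT NULL REFERENCES H_Customer (H_Customer_SID),
    H_Class_SID      INTEGER NOT NULL REFERENCES H_Class (H_Class_SID),
    Load_DTS         TEXT NOT NULL,
    Record_Source    TEXT NOT NULL,
    UNIQUE (H_Customer_SID, H_Class_SID)
);
CREATE TABLE S_Customer (               -- Satellite: context and history
    H_Customer_SID INTEGER NOT NULL REFERENCES H_Customer (H_Customer_SID),
    Load_DTS       TEXT NOT NULL,
    Name           TEXT,
    City           TEXT,
    Record_Source  TEXT NOT NULL,
    PRIMARY KEY (H_Customer_SID, Load_DTS)
);
""")

conn.execute("INSERT INTO H_Customer VALUES (1, 'C-1001', '2013-01-01', 'CRM')")
conn.execute("INSERT INTO H_Class VALUES (1, 'GOLD', '2013-01-01', 'CRM')")
conn.execute("INSERT INTO L_Cust_Class VALUES (1, 1, 1, '2013-01-01', 'CRM')")

# Time slices: each change in context is a new Satellite row, never an update.
conn.executemany(
    "INSERT INTO S_Customer VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2013-01-01", "Acme Ltd", "Golden", "CRM"),
        (1, "2013-03-15", "Acme Ltd", "Denver", "CRM"),
    ],
)


def context_as_of(customer_sid, as_of):
    """Return the (Name, City) time slice valid on the given ISO date, or None."""
    return conn.execute(
        """SELECT Name, City FROM S_Customer
           WHERE H_Customer_SID = ? AND Load_DTS <= ?
           ORDER BY Load_DTS DESC LIMIT 1""",
        (customer_sid, as_of),
    ).fetchone()


# New attribute (assumed approach): add a new Satellite on the same Hub.
# The existing Hub, Link and Satellite tables are left unchanged.
conn.execute("""
CREATE TABLE S_Customer_Credit (
    H_Customer_SID INTEGER NOT NULL REFERENCES H_Customer (H_Customer_SID),
    Load_DTS       TEXT NOT NULL,
    Credit_Limit   INTEGER,
    Record_Source  TEXT NOT NULL,
    PRIMARY KEY (H_Customer_SID, Load_DTS)
)""")
conn.execute("INSERT INTO S_Customer_Credit VALUES (1, '2013-06-01', 5000, 'ERP')")
```

Calling `context_as_of(1, "2013-02-01")` returns the earlier slice and a later date returns the newer one. ISO-formatted date strings are used so that plain string comparison orders the slices correctly.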