Master Aligning Data, Process, and Governance Donna Burbank Global Data Strategy, Ltd. April 23rd , 2020

Follow on Twitter @donnaburbank Twitter Event hashtag: #DAStrategies Copyright Global Data Strategy, Ltd. 2020 Simplifying Advanced Data Workloads with NoSQL Data Management for Modern Data Demands

David Jones-Gilardi Developer Advocate @DataStax NoSQL and Cassandra: Foundational Capability

LEGACY DATA REAL-TIME, DISPARATE, INTEGRATION STREAMING, EVENTS SILO’D DATA

2 Modern Data Diversity and Complexity

LEGACY DATA REAL-TIME, DISPARATE, INTEGRATION STREAMING, EVENTS SILO’D DATA

DATA SECURITY / UNPREDICTABLE HYBRID, MULTI, SOVEREIGNTY SCALE INTER-CLOUD

3 Modern Data Brings Workload Complexity

> Traditional Siloed Today’s Related Data and Data and Workload Complex Management Workloads Data Management Evolution { : } 1980’s 2000’s 2020’s So, how do we solve for mixed workloads?

How complex is your query?

• Simple - Single /Single Index Lookup, Single Iteration • Complex - Full Scan, Large Aggregation, Unknown Iterations • In between - Multiple Partition/Indexes, Aggregations, or Multiple Iterations

How fast do you need it?

• Machine Time < the time it takes to interrupt a user process • Human Time < time a user will wait • Offline Time is Everything else

6 Mixed Workload Coverage – Customer 360 Queries

1. Find me Dave Offline 2. Find me all people with similar fast Analytics 5 names to ‘Dave’ 3 3. Tell me if there are duplicate Daves Human 4. Find how Dave and Jenn are fast DSE 4 connected Graph 5. Find how influential Dave is in Response time CQL Search my application 2 Machine 6. Show Dave what songs are 6 fast 1 trending for anyone with the same preferences while he Stream Processing looks for a song to play in his Simple Complex mobile app New Opportunities

BLENDED WORKLOADS AT SCALE WITH DATASTAX

Recommendation Law Fleet Fraud IoT Systems Enforcement Management

Anomaly detection Act on sensors Your preferences Bad actor identification Vehicle tracking and connected and analyze and your network’s and criminal and path components the network preferences network activity optimization

Manage Seamlessly with one Graph, Analytics, Search, Advanced Security,

Stream Processing, In-memory Engine 8 Simplified Data Complexity SINGLE CLOUD

A Single Data Platform

Mixed Workloads at Scale Scale apps for complex HYBRID ON data & workloads with CLOUD PREMISES ease Real-time Intelligence Access the full value of your ever-changing data Multi-DC, Multi- Platform Deployment and MULTI- / INTER-CLOUD operations wherever

9 you choose to deploy Learn about Cassandra and TinkerPop today! academy.datastax.co m @DataStaxDevs

datastax.com/download s @DataStax

10 Thank you Master Data Management Aligning Data, Process, and Governance Donna Burbank Global Data Strategy, Ltd. April 23rd , 2020

Follow on Twitter @donnaburbank Twitter Event hashtag: #DAStrategies Copyright Global Data Strategy, Ltd. 2020 Donna Burbank

technology. In past roles, she has served in She has worked with dozens of Fortune 500 key brand strategy and product companies worldwide in the Americas, management roles at CA Technologies and Europe, Asia, and Africa and speaks regularly Embarcadero Technologies for several of the at industry conferences. She has co- leading data management products in the authored two books: Data Modeling for the market. Business and Data Modeling Made Simple with ERwin Data Modeler and is a regular As an active contributor to the data contributor to industry publications. She can management community, she is a long time be reached at Donna is a recognised industry expert in DAMA International member, Past President [email protected] information management with over 20 years and Advisor to the DAMA Rocky Mountain Donna is based in Boulder, Colorado, USA. of experience in data strategy, information chapter, and was awarded the Excellence in management, data modeling, Data Management Award from DAMA management, and enterprise architecture. International. Her background is multi-faceted across Donna is also an analyst at the Boulder BI consulting, product development, product Train Trust (BBBT) where she provides advice management, brand strategy, marketing, and gains insight on the latest BI and and business leadership. Analytics software in the market. She was on She is currently the Managing Director at several review committees for the Object Global Data Strategy, Ltd., an international Management Group’s for key information information management consulting management and process modeling company that specializes in the alignment of notations. business drivers with data-centric Follow on Twitter @donnaburbank

Global Data Strategy, Ltd. 2020 Twitter Event hashtag: #DAStrategies 2 DATAVERSITY Data Architecture Strategies This Year’s Lineup

• January 23 Emerging Trends in Data Architecture – What’s the Next Big Thing? • February 27 Building a Data Strategy - Practical Steps for Aligning with Business Goals • March 26 Cloud-Based Data Warehousing – What's New and What Stays the Same • April 23 Master Data Management – Aligning Data, Process, and Governance • May 28 Data Governance and Data Architecture – Alignment and Synergies • June 25 Enterprise Architecture vs. Data Architecture • July 22 Best Practices in Metadata Management • August 27 Best Practices • September 24 – Separating Myth from Reality • October 22 Data Architect vs. Data Engineer vs. Data Modeler • December 1 Graph : Practical Use Cases

Global Data Strategy, Ltd. 2020 3 What We’ll Cover Today

• Master Data Management (MDM) provides organizations with an accurate and comprehensive of business-critical data such as Customers, Products, Vendors, and more. • While mastering these key data areas can be a complex task, the value of doing so can be tremendous – from real-time operational integration to data warehousing & analytic reporting. • This webinar provides practical strategies for gaining value from your MDM initiative, while at the same time assuring a solid architectural and governance foundation that will ensure long- term, enterprise-wide success.

Global Data Strategy, Ltd. 2020 4 MDM is Part of a Wider Data Strategy A Successful Data Strategy links Business Goals with Technology Solutions

“Top-Down” alignment with business priorities

Managing the people, process, policies & culture around data

Leveraging & managing data for strategic advantage

Coordinating & integrating disparate data sources

“Bottom-Up” management & inventory of data sources

Copyright 2020 Global Data Strategy, Ltd

Global Data Strategy, Ltd. 2020 5 www.globaldatastrategy.com What is Master Data? Definition

• Master Data is the consistent and uniform set of identifiers and extended attributes that describes the core entities of the enterprise including customers, prospects, citizens, suppliers, sites, hierarchies and chart of accounts (sic).

• Master data management (MDM) is a technology-enabled discipline in which business and IT work together to ensure the uniformity, accuracy, stewardship, semantic consistency and accountability of the enterprise's official shared master data assets.

- Source Gartner

Global Data Strategy, Ltd. 2020 6 What is Master Data? Real-world examples

The $1M cheese slice The $2M baby bottle The “dead” living organism

Which Dr. Smith is credentialled Which Michael Jones is the How do we define Regions, Markets, for heart surgery? high-net worth customer? Locations, Catchments, Sites, etc.?

Global Data Strategy, Ltd. 2020 7 What is Master Data? What is Reference Data?

Master Data Reference Data

Address Line 1 Address Line 2 One person’s Master Data is City another person’s Reference Data… vs. State

How do we define Regions, Markets, AL Locations, Catchments, Sites, etc.? AK AR AZ CA CO ..etc. Global Data Strategy, Ltd. 2020 8 … but Don’t make this overly Pedantic …

Global Data Strategy, Ltd. 2020 9 Understanding Your Customer A 360 Degree View through Data

Address = Pontresina, Switzerland

Occupation = Ski Instructor Purchased €500 in outdoor gear in 2015

Member of Loyalty Program since 2010 100% of purchases online

Stefan Krauss Age = 31 Top Finisher in Engadin Ski Marathon 2010-2015 Prefers Text Message Global Data Strategy, Ltd. 2020 10 Understanding Your Customer A 360 Degree View through Data

Address = Zurich, Switzerland Occupation = Banker

Purchased €3.500 in outdoor gear in 2019

Member of Loyalty Program since 1990 75% of spending is while Stefan Krauss on holiday Age = 62

Football Fan 100% of spending in store

Global Data Strategy, Ltd. 2020 Prefers Physical Mail 11 Transaction Data vs. Master Data

Consider the following retail transaction data

Customer Date Product Code Price Quantity Location Stefan Kraus 1/2/2017 Scarpa Telemark Ski Boot SC1279 €250 1 St. Moritz, CH Donna Burbank 1/5/2017 Scarpa Telemark Ski Boot SCU1289 $150 1 Boulder, CO Stefan Kraus 1/2/2017 North Face Down Jacket NF8392 €450 1 Zurich, CH Stefan Kraus 1/2/2017 Garmin Sports Watch GM29384 €200 2 Zurich, CH Wendy Hu 3/4/2017 Prana Yoga Pant PN82734 $51 5 New York, NY Joe Smith 4/1/2017 Garmin Sports Watch GM29384 $150 1 Albany, NY

Transaction Data Master Data • Describes an action (verb): E.g. “buy” • Describes the key entities (nouns), e.g. Customer, Product, • May include measurements about the action: (Who, When, Location What, How Many, Where, How Much, etc.) • Provides attributes & context for these nouns • E.g. Stefan Kraus, 1/2/2017/, Scarpa Telemark Ski Boot, St. • e.g. Wendy Hu, age 25, female, resident of New York, NY, Moritz, CH, €250 Customer since 2005, preferred customer card, etc.

Global Data Strategy, Ltd. 2020 12 Transaction Data vs. Master Data Reference Data: Country Codes Reference Data: State Codes

Customer Date Product Code Price Quantity Location Stefan Kraus 1/2/2017 Scarpa Telemark Ski Boot SC1279 €250 1 St. Moritz, CH Donna Burbank 1/5/2017 Scarpa Telemark Ski Boot SCU1289 $150 1 Boulder, CO Transaction Stefan Kraus 1/2/2017 North Face Down Jacket NF8392 €450 1 Zurich, CH Data Stefan Kraus 1/2/2017 Garmin Sports Watch GM29384 €200 2 Zurich, CH Wendy Hu 3/4/2017 Prana Yoga Pant PN82734 $51 5 New York, NY Joe Smith 4/1/2017 Garmin Sports Watch GM29384 $150 1 Albany, NY

Master Data: Master Data: Location Master Data: Product Customer

Global Data Strategy, Ltd. 2020 13 Master Data Overview

Supply CRM In-Store Finance Marketing Online Sales Sales Chain

Publish & Subscribe

Data Stewardship Validation Data Quality & Matching

Data Model MDM Data Lookup ETL Warehouse “Golden Record” BI & Reporting Reference End User Applications Data Sets

Global Data Strategy, Ltd. 2020 14 Master Data Overview Each system has its own unique functionality and associated data model.

First Name First Name Customer ID First Name First Name First Name

Family Name Postal Code First Name Middle Name Family Name Family Name

Address Line 1 Family Name Family Name Address Line 1 Address Line 1

Address Line 2 Account # Address Line 2 Address Line 2 Email Supply City CRM In-Store Credit Balance Finance Twitter ID Marketing City Online City Chain State Sales Gender State Sales State

Phone Credit Card # Country Publish & Subscribe Email Phone Phone

Spouse

Data Stewardship Validation Data Quality & Matching

Data Model MDM Data Lookup ETL Warehouse “Golden Record” BI & Reporting Reference End User Applications Data Sets

Global Data Strategy, Ltd. 2020 15 Master Data Overview

First Name First Name Customer ID First Name First Name First Name

Family Name Postal Code First Name Middle Name Family Name Family Name

Address Line 1 Family Name Family Name Address Line 1 Address Line 1

Address Line 2 Account # Address Line 2 Address Line 2 Email Supply City CRM In-Store Credit Balance Finance Twitter ID Marketing City Online City Chain State Sales Gender State Sales State

Phone Credit Card # Country Publish & Subscribe Email Phone Phone

Spouse

Data Stewardship The MDM data model is a selected Validation Data Quality super/subset of the source system models. & MatchingCustomer ID First Name

Family Name Data Model Address Line 1 Data AddressMDM Line 2 Lookup ETL Warehouse “GoldenCity Record” BI & Reporting State Reference End User Applications Country Data Sets Country Codes Phone State Codes Email

Global Data Strategy, Ltd. 2020 16 Master Data Overview

First Name = John First Name = Jack Customer ID = 123 First Name = J. First Name = Johnn First Name = John

Family Name = Smith Postal Code = 10101 First Name = John Middle Name Family Name Family Name = Smith

Address Line 1 = 101 Main ST Family Name = Smith Family Name = Smith Address Line 1 = 101 Main Street Address Line 1 = 101 Main St

Address Line 2 Account # Address Line 2 = Apt 2 Address Line 2 Email = [email protected] Supply City = Anywhere CRM In-Store Credit Balance Finance Twitter ID = @johnsMarketing City = Plano Online City = Anywhere Chain State = Texas Sales Gender = Male State = TX Sales State = TX

ZIP = 10101 Credit Card # Country = USA Publish & Subscribe Phone = 555 927 1212 Phone = +1 555 927 1212 Phone = x1212 Email = [email protected] Matching Rules help create a Spouse = Mary “Golden Record” Data Stewardship Validation Data Quality & Matching Customer ID = 123 First Name = John Family Name = Smith Data Model Address Line 1 = 101 Main ST Data MDM Address Line 2 = Apt 2 Lookup ETL Warehouse “Golden Record”City = Anywhere BI & Reporting State = TX Reference Postal Code = 10101 End User Applications Data Sets Country = USA

Phone = +1 555 927 1212

Email = [email protected] Global Data Strategy, Ltd. 2020 17 Data Model / Database Keys for Matching Rules

• Candidate attribute combinations for matching are often aligned with the primary and alternate keys from the logical data model. Matching on Primary Key

Customer Ideally, if all systems use the same unique identifier, matching is easier. Customer ID But this isn’t often realistic in “real world” systems.

Matching on Alternate Keys • First, match on Date of Birth + SSN • Then, match on SSN + Last Name • Etc.

Global Data Strategy, Ltd. 2020 18 Fuzzy Matching

• Fuzzy matching logic can also be used, which is particularly helpful in matching string fields such as names and addresses, where human error or different entry standards between systems can cause slight variations in similar values, e.g. • “101 Main St” vs. “101 Main Street” • “John Smith” vs. “J Smith” • In addition synonyms can be created to assist with matching, for example • “St”, “St.”, “Street”, etc. for addresses • “Tim”, “Timothy” for names and nicknames. • When using fuzzing matching, data quality thresholds can be defined for auto approval. • Match scores are created for each fuzzy match, for example .9 would indicate a strong match and .2 a weak one. • Using these scores as a guide, thresholds can be defined for which matches can be auto-approved, which can be auto-rejected, and which need human review from a .

Global Data Strategy, Ltd. 2020 19 Master Data Overview

Supply CRM In-Store Finance Marketing Online Sales Sales Chain

Publish & Subscribe

“Human in the Loop” – Data Stewards can validate match Data Stewardship Validation candidates. Data Quality & Matching

Data Model MDM Data Lookup ETL Warehouse “Golden Record” BI & Reporting Reference End User Applications Data Sets

Global Data Strategy, Ltd. 2020 20 Master Data Overview

Supply CRM In-Store Finance Marketing Online Sales Sales Chain

Publish & Subscribe

Applications can reference Data Stewardship the “Golden Record” for Validation Data Quality lookup. & Matching

Data Model MDM Data Lookup ETL Warehouse “Golden Record” BI & Reporting Reference End User Applications Data Sets

Global Data Strategy, Ltd. 2020 21 Master Data Overview

Supply CRM In-Store Finance Marketing Online Sales Sales Chain

Publish & Subscribe

Data Stewardship Validation Data Quality & Matching

Data Model MDM Data Lookup ETL Warehouse “Golden Record” BI & Reporting Reference MDM can feed the dimensional model for End User Applications Data Sets the (e.g. customer, location, etc.)

Global Data Strategy, Ltd. 2020 22 MDM is not Reporting or Analytics -- It can be a Source

Data Warehouse BI & Reporting e.g. Customer by Region

MDM “Golden Record”

Reference Data Sets

Graph Database Social Network Analysis

Data Lake Social Media Sentiment Analysis Global Data Strategy, Ltd. 2020 23 Governance & Business Process for MDM

• While the implementation of the hub and population strategies is complex, more complex is understanding the business processes and governance processes around the populating and publishing systems.

• In fact, the top two reasons for failure of MDM systems cited by the Gartner analyst group1 are :

Failure of IT to Align With Delaying or Mismanaging Business Process Improvements Information Governance and Document Business Value Implementation

1 Top Four Reasons Your MDM Program Will Fail, and How to Avoid Them, Gartner, 2016, ID: Global Data Strategy, Ltd. 2020 G00223675, by Bill O’Kane. Note: The remaining two reasons are: Failure to Manage Initial Master 24 Data Quality & Defining Transactional (Fact) Data as Master Data Successful MDM Combines Data, Process, and Accountability

• System Architecture & data flow • Data models & hierarchies Data • Match/merge and survivorship rules Architecture • & design

Master Data Management • Accountability & stewardship • Business process models • Business rule validation • Data mapping to process Business Data • Conflict resolution • CRUD and usage matrices Process Governance & • Business Prioritization • Optimizing business process Alignment Stewardship for data improvement

Global Data Strategy, Ltd. 2020 The Importance of Business Process Identifying key data dependencies in core business processes

• Process models are a helpful tool for describing core business processes (e.g. BPMN). • “Swimlanes” outline organizational considerations • Data can be mapped to key business processes to understand creation & usage of information. • Understanding business process is critical to Data Governance • Who is using data? • How is it used in business processes? • Are there redundancies, conflicts, etc.?

Global Data Strategy, Ltd. 2020 26 CRUD Matrix – Understanding Data Usage Create, Read, Update, Delete

• CRUD Matrices shows where data is Created, Read, Updated or Deleted across the various areas of the organization • This can be a helpful tool in data governance & data quality to determine route cause analysis. Users, Departments, and/or Systems

Product Supply Chain Marketing Finance Development Accounting

Product Assembly Instructions C R

Product Components C R Data entities or attributes Product Price C U R

Product Name C U,D

Etc.

Global Data Strategy, Ltd. 2020 27 The Intersection of MDM and Data Governance Successful MDM require governed alignment between business & technical roles

Enterprise-wide Business Prioritization Roles Business Technical

Enterprise Conceptual Map to Customer & Logistics Prioritize Subject Areas / MDM Prioritize Geo Rollout Business Data Owners Enterprise Data Business Data Model Journey Domains (e.g. Product) Architect Analyst

Business & Technical Alignment & Implementation Business-level

o Business Rules Steward Business Process Enterprise Data Business o Match-Merge Criteria Steward Architect Analyst o Survivorship Rules Logical Model o Harmonization Strategy Business Process Data Steward Tasks & Workflow Workflow Regional/Geo MDM Steward Architect Technical-level Data Quality Data Integration MDM Platform Admin o o Data Integration o o DB Admin Data Quality Remediation o CRUD Matrix o o MDM Platform Admin Data Quality Physical Model Augmentation (e.g. address) o Publish & Subscribe System MDM o Data Quality Dashboards Steward Architect Specialist

Data Integration Specialist Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com Optimizing Restaurant Revenue through Menu Data Managing the Data that Runs the Business • An international restaurant chain realized through its digital strategy that: • While menus are the core product that drives their business… • They had little control or visibility over their menu data • Menu data was scattered across multiple systems in the organization from supply chain to kitchen prep to marketing, restaurant operations, etc. • Menu data was consolidated & managed in a central hub: • Master Data Management created a “single view of menu” for business efficiency & quality control • Data Governance created the workflow & policies around managing menu data • Process Models & Data Mappings were critical • Business Process diagrams to identify the flow of information • CRUD Matrixes to understand usage, stewardship & ownership

Product Creation & Menu Display & Supply Chain Point of Sale & Testing Marketing Restaurant Operations

Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com Summary

• Interest in Master Data Management (MDM) is on the rise as more organizations look to gain a common, consistent source for their core data assets (Customer, Product, Supplier, Employee, etc.)

• Successful MDM is part of a wider data strategy and requires integration with:

• Data Architecture

• Business Process Alignment

• Data Governance & Stewardship

• Getting this combination right can have a positive impact on the success of the business.

Global Data Strategy, Ltd. 2020 30 DATAVERSITY Data Architecture Strategies Join us next month

• January 23 Emerging Trends in Data Architecture – What’s the Next Big Thing? • February 27 Building a Data Strategy - Practical Steps for Aligning with Business Goals • March 26 Cloud-Based Data Warehousing – What's New and What Stays the Same • April 23 Master Data Management – Aligning Data, Process, and Governance • May 28 Data Governance and Data Architecture – Alignment and Synergies • June 25 Enterprise Architecture vs. Data Architecture • July 22 Best Practices in Metadata Management • August 27 Data Quality Best Practices • September 24 Data Virtualization – Separating Myth from Reality • October 22 Data Architect vs. Data Engineer vs. Data Modeler • December 1 Graph Databases: Practical Use Cases

Global Data Strategy, Ltd. 2020 31 About Global Data Strategy, Ltd Data-Driven Business Transformation

• Global Data Strategy is an international information management consulting company that specializes in the alignment of business drivers with data-centric technology. • Our passion is data, and helping organizations enrich their business opportunities through data and information. • Our core values center around providing solutions that are: • Business-Driven: We put the needs of your business first, before we look at any technology solution. • Clear & Relevant: We provide clear explanations using real-world examples. • Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s size, corporate culture, and geography. • High Quality & Technically Precise: We pride ourselves in excellence of execution, with years of technical expertise in the industry. Business Strategy Data Strategy Aligned With

Visit www.globaldatastrategy.com for more information

Global Data Strategy, Ltd. 2020 32 Questions? • Thoughts? Ideas?

Global Data Strategy, Ltd. 2020 www.globaldatastrategy.com 33