AGILE METHODS AND DATA WAREHOUSING How to deliver faster

KENT GRAZIANO, CHIEF TECHNICAL EVANGELIST I KentGraziano

© 2019 Snowflake Computing Inc. All Rights Reserved AGENDA

My Bio

Why Agile & DW

Agile Manifesto

12 Agile Principles

Agile Concepts

Agile DevOps

Agile Data Modeling

Becoming Agile

Two Week Iterations?

Conclusion

© 2019 Snowflake Computing Inc. All Rights Reserved 2 MY BIO

Chief Technical Evangelist, Snowflake Computing Blogger: The Data Warrior Certified Data Vault Master and DV 2.0 Practitioner Oracle ACE Director (Alumni) & OakTable Member Member – DAMA Houston & DAMA International Data Modeling, Data Architecture and Specialist • 30+ years in IT • 25+ years of Oracle-related work • 20+ years of data warehousing experience Former-Member: Boulder BI Brain Trust (http://www.boulderbibraintrust.org/) Author & Co-Author of a bunch of books Past-President of ODTUG and Rocky Mountain Oracle User Group 3 years in stealth + 3 years GA

Founded 2012 by industry veterans First customers with over 120 2014, general patents availability 2015

Over $850M in venture funding from leading 800+ employees investors Over 2000 customers today

Fun facts:

Queries processed in Largest single Largest number of Single customer Single customer Snowflake per day: table: tables single DB: most data: most users: 100 million 68 trillion rows 200,000 > 40PB > 10,000

© 2018 Snowflake Computing Inc. All Rights Reserved 4 WHY AGILE & DW?

Perceptions • DW too slow to produce results • DW projects “fail” • DW projects fail to adapt to business changes Goal • Change perceptions • Deliver value • Be more adaptable and flexible

© 2019 Snowflake Computing Inc. All Rights Reserved 5 OBJECTIVES

Understand what is meant by “agile” Try to apply some agile ideas to data warehouse and efforts Answer the question – can we deliver results faster?

© 2019 Snowflake Computing Inc. All Rights Reserved 6 MANIFESTO FOR AGILE SOFTWARE DEVELOPMENT

http://agilemanifesto.org

We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:

Individuals and interactions over processes and tools Working software over comprehensive documentation Customer collaboration over contract negotiation Responding to change over following a plan

That is, while there is value in the items on the right, we value the items on the left more

Kent Beck James Grenning Robert C. Martin Mike Beedle Jim Highsmith Steve Mellor Arie van Bennekum Andrew Hunt Ken Schwaber Alistair Cockburn Ron Jeffries Jeff Sutherland Ward Cunningham Jon Kern Dave Thomas Martin Fowler Brian Marick

© 2019 Snowflake Computing Inc. All Rights Reserved © 2001, the above authors of this declaration may be freely copied in any form, but only in its entirety through this notice. 7 THE PRINCIPLES OF AGILE

© 2019 Snowflake Computing Inc. All Rights Reserved PRINCIPLE #1

Our highest priority is to satisfy the customer through early and continuous delivery of valuable software. • Who is the customer? • What is “valuable software” in data warehousing? • BI reports • interface • Working ETL code? • In the context of the customer! • Once we start – keep going!

© 2019 Snowflake Computing Inc. All Rights Reserved 9 PRINCIPLE #2

Welcome changing requirements, even late in development. Agile processes harness change for the customer's competitive advantage. • Must be flexible and adaptable in thinking and design • User Stories with Prioritized Backlog • Use code generators • Agile data engineering techniques • Flexible and performant data platform

© 2019 Snowflake Computing Inc. All Rights Reserved 10 PRINCIPLE #3

Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale. • Need good scope control! • Prioritized Backlog • One subject area at a time • What is a subject area? • Think Data Vault – more on this later

© 2019 Snowflake Computing Inc. All Rights Reserved 11 PRINCIPLE #4

Business people and developers must work together daily throughout the project. • DW MUST have the business involved • One of the Top 10 reasons for failure • This applies for BI reports • Daily interaction would be great! • But – politics and priorities may interfere! • At HP GBI/EDW – we used “war” room • At MSH – daily standups with Biz

© 2019 Snowflake Computing Inc. All Rights Reserved 12 PRINCIPLE #5

Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done. • Need people who WANT to be on the project • Lose the dead weight • Get training if needed • Keep units of work small to create an atmosphere of success • Don’t try a Big Bang EDW

© 2019 Snowflake Computing Inc. All Rights Reserved 13 PRINCIPLE #6

The most efficient and effective method of conveying information to and within a development team is face-to-face conversation. • Daily team huddles • Co-located work space • While face-to-face is efficient, still need some documentation (or meta-data) for later • Use a tool like JIRA, VersionOne, Rally, LeanKit

© 2019 Snowflake Computing Inc. All Rights Reserved 14 PRINCIPLE #7

Working software is the primary of progress. • Applied to DW: • What is “working software?” ▫ BI reports? ▫ Tables definitions and working ETL code? • Think more broadly – it is not just a data entry screen

© 2019 Snowflake Computing Inc. All Rights Reserved 15 PRINCIPLE #8

Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely. • DW Programs last a long time – don’t burn the team out with unreasonable deadlines • See P#5 – Motivated individuals • Good planning and scope control • No all nighters! • Smallest valuable unit of work possible (MVP) • Keep it moving like a production line • Pick (or develop) a standard, repeatable methodology • Study the Agile methods and adopt what works for your team • Data Vault Modeling Methodology

© 2019 Snowflake Computing Inc. All Rights Reserved 16 PRINCIPLE #9

Continuous attention to technical excellence and good design enhances agility. • Bad design + bad architecture = trouble • Symptom: can’t build a requested • Frequent design reviews a must • Improves team skills – provides cross training • Over time – better designs, shorter review cycles ▫ Faster delivery

© 2019 Snowflake Computing Inc. All Rights Reserved 17 PRINCIPLE #10

Simplicity--the art of maximizing the amount of work not done--is essential. • KISS – Keep it Simple Stupid • Write less code by hand • Use code generators! (No syntax errors – ever) ▫ Oracle SDDM, ERWin, Vertabelo ▫ Oracle Data Integrator, SSIS ▫ Erwin MappingManager, WhereScape RED, Ajilius • Modifications are easier – just regenerate the code • Virtualize initial reporting layers • Agile DevOps!

© 2019 Snowflake Computing Inc. All Rights Reserved 18 PRINCIPLE #11

The best architectures, requirements, and designs emerge from self-organizing teams. • Team of smart, motivated people = success • We succeed (or fail) as a TEAM • Don’t micro manage or pigeon-hole staff • Encourage team work and team thinking • Staff will gravitate to roles based on skills, interest, and personality ▫ Then they have more buy-in to the process • Eliminates delays and bottlenecks by having shared responsibilities (no single point of failure)

© 2019 Snowflake Computing Inc. All Rights Reserved 19 PRINCIPLE #12

At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly. • Retrospectives are a MUST! • Ask 3 questions: • What went right? • What went wrong? • What will we do different next sprint? • Related to self-organizing teams • Make finding the solution to a problem the team’s problem ▫ More buy-in to the solution

© 2019 Snowflake Computing Inc. All Rights Reserved 20 USING AGILE CONCEPTS

© 2019 Snowflake Computing Inc. All Rights Reserved AGILE CONCEPTS FOR DW

Team Huddles (Morning Scrum) Extreme Programming Pair Programming User Stories

© 2019 Snowflake Computing Inc. All Rights Reserved 22 TEAM HUDDLES

Daily Standup Meeting • AKA Scrum • Morning Roll Call (FDD) Short meeting (< 15 minutes) • Every morning, mandatory attendance • Review assignments, accomplishments, backlogs, blockers • 3 Questions • What did I do yesterday? • What will I do today? • What do I need help on?

© 2019 Snowflake Computing Inc. All Rights Reserved 23 TEAM HUDDLES

Immediate feedback and assistance • Keeps team motivated and on track (P #5) • Identifies constraints and bottlenecks early in the process • Eliminates backlogs more quickly via re-assignments Improves team work Supports self-organizing teams (P #11)

© 2019 Snowflake Computing Inc. All Rights Reserved 24 EXTREME PROGRAMMING (XP)

Programmer works directly with the end user • At MSH used conference room or developer’s cube • At HP used Virtual Classroom or NetMeeting • At DFA used WebEx In DW: • Best with developing BI reports • DW or data mart must already be populated

© 2019 Snowflake Computing Inc. All Rights Reserved 25 EXTREME PROGRAMMING (XP)

Reports developed using BI tool • With the user in the room (or virtual) • With constant user reviews and input using a web reporting tool (via email even!) Also applies to developing a dashboard or portal interface Works for ETL as well! • Used war room with business to get near instant validation of ETL changes

© 2019 Snowflake Computing Inc. All Rights Reserved 26 PAIR/MOB PROGRAMMING

Part of XP Programmers work side-by-side • One terminal • One codes, the other reviews • Two terminals, one cube • One programming, one documenting • Could also be done virtually! In DW: • Writing ETL Code • Pair data modeling • Report Development

© 2019 Snowflake Computing Inc. All Rights Reserved 27 USER STORIES VS DEVELOPER STORIES

Concept borrowed from Ralph Hughes User Story • Real business requirement from Product Owner • May be so big it can’t be done in one sprint Developer (Technical) Story • A break down of major technical work needed to support a User Story • Small enough to do in one sprint • Table or schema design, source-target mapping, report development, etc. Promote User Story to Epic • Attach Developer stories to that Epic

© 2019 Snowflake Computing Inc. All Rights Reserved 28 STORY GENERATION

In DW/BI there are repeatable design & build processes Use Templates • Develop a repeatable steps/task list for different type of technical stories • Building a Fact or Dimension table (or Hub, Link, or Sat) • Adding a new table to staging area • Building a report • Modifying a report • Some tools (JIRA, VersionOne) allow use of template

© 2019 Snowflake Computing Inc. All Rights Reserved 29 DESIGN AND LOAD A NEW DIMENSION TEMPLATE EXAMPLE (USING VERSIONONE)

© 2019 Snowflake Computing Inc. All Rights Reserved NEW DIMENSION

S = Story AT = Test TK = Task

One button to turn Tests a template into a story in the backlog.

Can also copy a template Tasks then edit.

© 2019 Snowflake Computing Inc. All Rights Reserved 31 NEW DIMENSION - DETAILS

Generic, fill-in-the-blank technical story

Story Points

© 2019 Snowflake Computing Inc. All Rights Reserved 32 NEW DIMENSION - DETAILS

Custom field Very detailed, Added standardized by admin acceptance criteria. Can be customized as needed in real story

Default story status

Default story type

© 2019 Snowflake Computing Inc. All Rights Reserved 33 NEW DIMENSION – TASK 1

Very detailed, standardized sub-tasks. Can be customized as appropriate or if standards change

Default task type

Estimates and To Do (remaining) are in hours, NOT points.

© 2019 Snowflake Computing Inc. All Rights Reserved 34 NEW DIMENSION – TASK 2

Default task type

© 2019 Snowflake Computing Inc. All Rights Reserved 35 NEW DIMENSION – TASK 3

Default task type

© 2019 Snowflake Computing Inc. All Rights Reserved 36 NEW DIMENSION – TASK 4

Default task type

© 2019 Snowflake Computing Inc. All Rights Reserved 37 NEW DIMENSION – TASK 5

Default task type

© 2019 Snowflake Computing Inc. All Rights Reserved 38 NEW DIMENSION – TEST 1

Example of standardized list of things to check.

Can be customized as appropriate or if standards change

Default task type

© 2019 Snowflake Computing Inc. All Rights Reserved 39 NEW DIMENSION – TEST 2

Example of standardized list of things to check.

Can be customized as appropriate or if standards change

Default task type

© 2019 Snowflake Computing Inc. All Rights Reserved 40 AGILE DEVOPS

© 2019 Snowflake Computing Inc. All Rights Reserved “THE ART OF MAXIMIZING THE AMOUNT OF WORK NOT DONE”

Remember Principle #10! Need a dev setup that let’s you be agile • Painless & fast setup and configuration • Easy to add more space and compute on demand • Minimal resource contention • Automated query optimization • Easy cloning of data sets Automated Regression Testing! • DBFit, iCEDQ, AnalytixDS, WhereScape • Roll your own – basic SQL scripts

© 2019 Snowflake Computing Inc. All Rights Reserved 42 AGILE DATA MODELING

© 2019 Snowflake Computing Inc. All Rights Reserved DATA VAULT – HOW I DID IT

Data modeling technique for enterprise data warehouse design • See Data Vault white papers at kentgraziano.com • Data Vault related books on Amazon (author Linstedt) • My book on Kindle (Amazon) Allows modeling EDW in small chunks • Develop model, build tables, build ETL, populate, repeat (often) • Key: prioritize the data requirements • Think User Stories and Backlog ala SCRUM

© 2019 Snowflake Computing Inc. All Rights Reserved 44 DATA VAULT MODEL FLEXIBILITY (AGILITY)

Goes beyond standard 3NF • Highly normalized • Hubs and Links only hold keys and meta data • Satellites split by rate of change and/or source • Enables Agile data modeling • Easy to add to model without having to change existing structures and load routines ▫ Relationships (links) can be dropped and created on-demand. • No more reloading history because of a missed requirement Based on natural business keys • Not system surrogate keys • Allows for integrating data across functions and source systems more easily • All data relationships are key driven.

© 2019 Snowflake Computing Inc. All Rights Reserved 45 DATA VAULT EXTENSIBILITY

Sprint 3 Adding new components to the EDW has NEAR ZERO impact to: Existing Loading Processes Existing Data Model Existing Reporting & BI Functions Existing Source Systems Existing Star Schemas and Data Marts Sprint 2 Sprint 1

© 2019 Snowflake Computing Inc. All Rights Reserved 46 HOW TO BE AGILE USING DV

Model iteratively • Use Data Vault data modeling technique • Create basic components, then add over time Virtualize the Access Layer • Don’t waste time building facts and dimensions up front (Principle #10) • ETL and testing takes too long • “Project” objects using pattern-based DV model with database views (or BI meta layer) • Users see real reports with real data Can always build out for performance in another iteration

© 2019 Snowflake Computing Inc. All Rights Reserved 47 SHAMELESS PLUG: MY INTRO BOOK

Available on Amazon.com

http://www.amazon.com/Better-Data- Modeling-Introduction-Engineering- ebook/dp/B018BREV1C/

© 2019 Snowflake Computing Inc. All Rights Reserved 48 NEW DV 2.0 BOOK

Available on Amazon.com

http://www.amazon.com/Building- Scalable-Data-Warehouse- Vault/dp/0128025107/

© 2019 Snowflake Computing Inc. All Rights Reserved 49 REAL WORLD METRICS

© 2019 Snowflake Computing Inc. All Rights Reserved HP EDW EXAMPLES

Business found missing report elements Solution: modify 3 tables to add 5 new columns in reporting model () Tasks: • Document requirements and ETL specs • Modify Logical & Physical model (w/peer review) • Rebuild tables in development • Develop and test ETL • MTI (Move To Integration) tables and code • Execute and test ETL • Modify report in UAT environment & test Result: Revised report ready in 18 hours, 44 minutes • Less than 1 business day 2nd case: 6 tables, 16 new columns • Ready for UAT in 72 hours How? War room with Biz & IT Team

© 2019 Snowflake Computing Inc. All Rights Reserved 51 BECOMING AGILE IN THE REAL WORLD

© 2019 Snowflake Computing Inc. All Rights Reserved GETTING TO AGILE

Cannot instantly become “agile” Must develop an agile mindset & culture Implement agile tools, processes, and methods Get an Agile Coach if necessary • Specifically an agile BI/DW or Data Coach

© 2019 Snowflake Computing Inc. All Rights Reserved 53 GETTING TO AGILE

“better than average expertise” • Expert consulting and mentoring • Do the work (OTJ) At Denver Public Schools • Took two years before we could try being more “agile” • Needed experience in Data Warehousing, Oracle Designer, OWB, and the “process” of building, deploying, and maintaining an Oracle DW

© 2019 Snowflake Computing Inc. All Rights Reserved 54 GETTING TO AGILE

At HP GBI/EDW it took about a year • Needed the right team and the right management support • Also the right project with willing business users At McKesson over a year setting standards and training staff • Built templates in JIRA for repeatable tasks • Then 3 week sprints At DFA – Day 1! • Experienced team with agile BI/DW coach • Used modified KanBan

© 2019 Snowflake Computing Inc. All Rights Reserved 55 TWO WEEK ITERATIONS?

Goal is really a few weeks to a few months (see P#3) What is the deliverable? • A for a star schema • Designed or populated or tested? • A dimension table • A complete star (fact and all dimensions) • One piece of ETL code that populates a fact table • A function needed by the ETL code • A new report or query Who is the customer? • BI programmer? • Knowledge worker? • ETL programmer?

© 2019 Snowflake Computing Inc. All Rights Reserved 56 CONCLUSION

Agile concepts can be applied to data warehouse and BI projects • Not a purist definition! • Try to apply the principles – be creative Suggested approaches • Data Vault 2.0! • Use Snowflake Elastic DW! • Use team huddles • Use pair programming to increase quality and cross training • Use code generators • Read about Agile Methods • Read Oracle CASE Method Fast-Track Be flexible and give it a try

© 2019 Snowflake Computing Inc. All Rights Reserved 57 REFERENCES

© 2019 Snowflake Computing Inc. All Rights Reserved OTHER REFERENCES

Agile Management for Software Engineering: Applying the Theory of Constraints for Business Results by David J. Anderson CASE Method Fast-track: A RAD Approach by Richard Barker & Dai Clegg The Goal by Eliyahu M. Goldratt The Business of Data Vault Modeling by Dan Linstedt, Kent Graziano, & Hans Hultgren Super Charge Your Data Warehouse by Dan Linstedt Test Automation by Lynn Winterboer http://www.slideshare.net/AgileDenver/lynn-winterboer-test-automation

© 2019 Snowflake Computing Inc. All Rights Reserved 59 QUESTIONS? DISCUSSIONS?

© 2019 Snowflake Computing Inc. All Rights Reserved Kent Graziano Snowflake Computing [email protected] On Twitter @KentGraziano

More info at http://snowflake.com

Visit my blog at http://kentgraziano.com

© 2019 Snowflake Computing Inc. All Rights Reserved © 2019 Snowflake Computing Inc. All Rights Reserved