MODELING DATA HERITAGE WITH POWERDESIGNER®

WHITE PAPER

Knowing where our data came from and where it is going is one of the most challenging aspects of managing today’s data center. As our systems become more integrated and interdependent through replication, warehousing and system integration efforts, the impact of a change to any data element can be huge. The ability to document, and then report on, data heritage allows users to predict the impact of a change more effectively and efficiently, as well as streamline its implementation. Managing data heritage, the sources and targets of data, requires a 21st century modeling tool like Sybase PowerDesigner.

REQUIREMENTS FOR MANAGING DATA HERITAGE

When managing data heritage, it is important to be able to clearly document the data elements (aliases, format, statistics and other quality indicators), the sourcing and target data system information (including transformation rules), and the stewardship information (who created and modified the data elements, security and access rights, and history). Starting with the processes, users need to be able to trace all data elements (and any interim elements) to all final implementation points. To achieve this, users need to document:

■ All data processes with business definitions and requirements
■ All data items at a conceptual level with business definitions and stewardship details
■ All physical data (Tables/Columns) with version history
■ All dependencies between source and target tables and columns

A TOOLSET AND A SOLUTION: BUILDING BLOCKS FOR DATA HERITAGE

To address these needs, PowerDesigner offers rich process modeling, data analysis and design techniques that capture and manage the dependency metadata needed to document data heritage. PowerDesigner includes a metadata repository that offers secure version control and management, complete with a change history that identifies each change and the user who made it.

For documenting all processes and requirements, PowerDesigner has fully integrated requirements management and business process modeling. PowerDesigner’s requirements management module links any number of analysis and design models to a given set of business requirements. Understanding each data element in the context of the business requirements helps define the purpose and intent of the data being examined. Changes to a business requirement can be easily traced to the relevant data and process elements, iteratively down to the ultimate physical implementation. In reverse, any analysis and design element can report on its linked requirements to ensure that a proposed change does not violate the business intent. PowerDesigner’s Business Process Modeling module provides intuitive, non-technical business process hierarchy and flow diagrams. Data can be linked to processes and flows to produce a CRUD matrix that defines data use throughout the business and business automation, and that provides the process-to-data linkage we need to understand the data in motion.
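As an illustration of what a CRUD matrix records, the following is a minimal sketch in Python. It is not PowerDesigner's API or file format; the process names, entity names and helper functions are all hypothetical, chosen only to show how process-to-data links roll up into a matrix of Create, Read, Update and Delete usage.

```python
from collections import defaultdict

# Hypothetical process-to-data links: (process, entity, operations).
# These names are illustrative and do not come from any PowerDesigner model.
LINKS = [
    ("Take Order", "Customer", "R"),
    ("Take Order", "Order", "C"),
    ("Ship Order", "Order", "RU"),
    ("Close Account", "Customer", "UD"),
]


def build_crud_matrix(links):
    """Return {process: {entity: letters}} with letters kept in C, R, U, D order."""
    matrix = defaultdict(lambda: defaultdict(set))
    for process, entity, ops in links:
        matrix[process][entity].update(ops)
    return {
        process: {
            entity: "".join(op for op in "CRUD" if op in ops)
            for entity, ops in entities.items()
        }
        for process, entities in matrix.items()
    }


def processes_touching(matrix, entity):
    """List the processes that use an entity in any way, sorted by name."""
    return sorted(p for p, entities in matrix.items() if entity in entities)


crud = build_crud_matrix(LINKS)
```

With a structure like this, a question such as "which processes use Customer data?" becomes a simple lookup, for example `processes_touching(crud, "Customer")`.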

For capturing all data elements with business definitions and tracing them to the physical implementation, PowerDesigner offers robust conceptual, logical and physical modeling capabilities. The PowerDesigner Conceptual Data Model provides a detailed set of data item definitions, independent of implementation details. Serving as the central data dictionary for data definitions, data formats, business descriptions and key data stewardship details, the conceptual model links data analysis to multiple levels of logical and physical implementation models managed by the Physical Data Model module. PowerDesigner Physical Models document, generate and reverse engineer structures for over 45 RDBMS vendor/versions including the latest Oracle®, IBM® (including Informix® and Red Brick™ Warehouse), Microsoft® (SQL Server and Access), Sybase (including ASE, ASA and IQ), NCR Teradata, MySQL and many more. Support includes all database objects; Java™, XML and Web Services in the database; users, groups and permissions; and much more. This model serves as the documentation of the final implementation for all the source and target elements.

For version history and stewardship details in a secure environment, PowerDesigner integrates a full-featured enterprise repository. PowerDesigner’s enterprise repository offers role-based security on models and sub-models, version control and configuration management, merge, delta reports between models and versions, and comprehensive full repository search capabilities.

TYING IT ALL TOGETHER: TRACKING DATA HERITAGE

The real issue in understanding the data heritage in our environments is not just having all the tools to document the process, requirements, conceptual, logical and physical data elements, but tying all this together in a meaningful way so that we can answer the questions:

1) Given this data element, where did the information really come from?
2) Changing this data element will impact how many systems?
3) Where is this data element implemented? In how many projects/systems?
4) How is this data used in a given system?
5) What does this data element really represent?

To answer these questions, PowerDesigner provides comprehensive Link and Synchronize features that manage:

■ Linking data elements to processes and process flows to document what data is used by what parts of the business
■ Linking a data entity or data item to one of many logical or physical tables and columns, providing real enterprise-level data definitions without losing continuity to the final implementation
■ Linking data to UML models for aligning data with business logic in code
■ Iteratively managing change from any source

PowerDesigner provides all the tools needed to document the various elements in appropriate detail, and to match the layers of abstraction to their detailed implementations. However, having all this rich technology available is only part of the challenge in building an environment that allows us to manage data heritage. The biggest question is, “How can I get this from an existing environment?”

DOCUMENTING EXISTING SYSTEMS: AN ITERATIVE APPROACH

Consider the case where we have an existing database system and limited documentation. We can reverse engineer the physical data model and then generate the conceptual data model from the physical model. PowerDesigner automatically establishes links between the conceptual and physical models so that the lineage in either direction is preserved. We can now take that new conceptual model and begin to document it as we examine the original implementation database. However, we could also reverse engineer a different system into a different physical model, and “merge” it into the same conceptual model. If we are not satisfied with the results of the merge process, it may be undone using PowerDesigner’s multi-level undo feature. All transactions on the model may be undone, back to the first change made to the model. This is especially useful in what-if scenario testing: users can try things with confidence because they can always get back to the initial state, or any stage in between.

Since we also want to understand how the data in one system “migrates” over to another, we can use the external source mapping features to document the transformation between these systems.
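To make the idea of a source-to-target mapping concrete, here is a minimal sketch in Python of the kind of metadata such a mapping records. It does not reproduce PowerDesigner's mapping editor or any of its storage formats; the `ColumnMapping` class, the table and column names, and the transformation rules are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMapping:
    target: str     # "schema.table.column" in the target system
    sources: tuple  # "schema.table.column" names in the source system
    rule: str       # transformation rule, recorded here as free text


# Hypothetical mappings from an order-entry system (oe) to a warehouse (dw).
MAPPINGS = [
    ColumnMapping(
        "dw.sales_fact.amount",
        ("oe.order_line.qty", "oe.order_line.unit_price"),
        "qty * unit_price",
    ),
    ColumnMapping(
        "dw.customer_dim.name",
        ("oe.customer.first_name", "oe.customer.last_name"),
        "first_name || ' ' || last_name",
    ),
    ColumnMapping(
        "dw.customer_dim.region",
        ("oe.customer.postal_code",),
        "lookup region by postal code",
    ),
]


def sources_of(mappings, target):
    """Answer 'where did this target column come from?'."""
    return sorted(s for m in mappings if m.target == target for s in m.sources)


def targets_of(mappings, source):
    """Answer 'which target columns does this source column feed?'."""
    return sorted(m.target for m in mappings if source in m.sources)
```

Recording the rule alongside the source and target columns is what lets the same metadata answer questions in both directions: where a warehouse column came from, and what a source column feeds.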

Once we have some of this metadata captured, we consolidate it into the enterprise repository. Now we have the first version stored and documented for future research. Capturing the metadata can become a large task for systems that are not already documented. PowerDesigner’s iterative approach to development also carries over well into systems discovery. For each project that requires some data heritage discovery, use PowerDesigner to document the part explored. You may start with only 5% or 10% of the system documented, but over time, more and more of the system will be captured by PowerDesigner until you achieve a highly useful set of definitions, dependencies and links. PowerDesigner’s iterative approach works in both directions to achieve this goal.

[Figure: Defining the Mapping from Source to Warehouse]

IMPACT ANALYSIS

PowerDesigner can capture and manage the data heritage through the various levels of abstraction in data modeling, through links to process and UML modeling, and through comprehensive data mapping features, but how can we effectively answer our questions about data flow and heritage? PowerDesigner provides a unique impact analysis report feature that allows you to select any number of objects in any model and press “Impact Analysis”. PowerDesigner then shows all the upstream and downstream dependencies in a single, easy-to-read report. This report can be saved and printed to share with other members of the project team.
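Conceptually, an impact analysis of this kind amounts to walking a dependency graph in both directions from the selected object. The following Python sketch shows that idea under stated assumptions: the edge list, element names and function names are hypothetical, and the code illustrates the general technique rather than how PowerDesigner computes its report.

```python
from collections import defaultdict, deque

# Hypothetical dependency edges, each read as "source feeds target".
EDGES = [
    ("crm.customer.email", "staging.customer.email"),
    ("staging.customer.email", "dw.customer_dim.email"),
    ("dw.customer_dim.email", "report.campaign_list"),
    ("web.signup.email", "staging.customer.email"),
]


def _reachable(start, neighbors):
    """Breadth-first walk returning every node reachable from start, excluding start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def impact_analysis(edges, element):
    """Return the sorted upstream and downstream dependencies of one element."""
    downstream, upstream = defaultdict(set), defaultdict(set)
    for source, target in edges:
        downstream[source].add(target)
        upstream[target].add(source)
    return {
        "upstream": sorted(_reachable(element, upstream)),
        "downstream": sorted(_reachable(element, downstream)),
    }


report = impact_analysis(EDGES, "staging.customer.email")
```

The upstream list answers "where did this information come from?", and the downstream list answers "what will a change to this element affect?". The walk tracks visited nodes, so it also terminates on circular dependencies.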

[Figure: PowerDesigner Impact Analysis Report]

CONCLUSION

PowerDesigner provides not only the basic building blocks to develop a rich data dictionary, in context with process and requirement, but also rich dependency tracking capabilities to document the interdependencies between data elements. Coupled with impact analysis reporting, secure version control in the enterprise repository, and comprehensive reverse engineering capabilities, these features allow PowerDesigner to ease and automate the management of data heritage. With PowerDesigner, knowing where data came from, and where it is going, becomes part of the data management infrastructure, not a new task that needs to be managed after the systems are built and require maintenance and change.

Sybase Incorporated
Worldwide Headquarters
One Sybase Drive
Dublin, CA 94568 USA
T 1.800.8.SYBASE
www.sybase.com

Copyright © 2005 Sybase, Inc. All rights reserved. Unpublished rights reserved under U.S. copyright laws. Sybase, the Sybase logo and PowerDesigner are registered trademarks of Sybase, Inc. All other trademarks are property of their respective owners. ® indicates registration in the United States. Printed in the U.S.A. 4/05