Oracle Warehouse Builder Concepts Guide
Total Page:16
File Type:pdf, Size:1020Kb
Recommended publications
-
Managing Data in Motion: Data Integration Best Practice Techniques and Technologies
Managing Data in Motion: Data Integration Best Practice Techniques and Technologies. April Reeve. AMSTERDAM • BOSTON • HEIDELBERG • LONDON • NEW YORK • OXFORD • PARIS • SAN DIEGO • SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO. Morgan Kaufmann is an imprint of Elsevier. Acquiring Editor: Andrea Dierna. Development Editor: Heather Scherer. Project Manager: Mohanambal Natarajan. Designer: Russell Purdy. 225 Wyman Street, Waltham, MA 02451, USA. Copyright © 2013 Elsevier Inc. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions. This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein). Notices: Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods or professional practices may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information or methods described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility. -
Oracle Warehouse Builder 11gR2: Feature Groups, Licensing and Feature Usage Management
An Oracle White Paper, March 2011. Oracle Warehouse Builder 11gR2: Feature Groups, Licensing and Feature Usage Management. Contents: Introduction; Warehouse Builder 11gR2: Feature Groups Overview; Enterprise ETL Feature Group Details; Data Integration Features; Data Warehousing and BI Features; Mapping Features; Process Flow Features; Manageability Features; Metadata Management Features; Data Profiling and Quality Feature Group; Data Profile Editor and Data Profiles; Data Rules; Automated Data Cleansing and Correction Mappings; Data Auditors; Data Rules Used in ETL Mappings; Application Adapters for OWB -
Using ETL, EAI, and EII Tools to Create an Integrated Enterprise
Data Integration: Using ETL, EAI, and EII Tools to Create an Integrated Enterprise. Colin White, Founder, BI Research. TDWI Webcast, October 2005. Copyright © BI Research 2005. Slides: TDWI Data Integration Study. Data Integration: Barrier to Application Development. Top Three Data Integration Inhibitors. Staffing and Budget for Data Integration. Data Integration: A Definition: A framework of applications, products, techniques and technologies for providing a unified and consistent view of enterprise-wide business data. Enterprise Business Data. Data Integration Architecture [diagram: source (dispersed internal and external data) to target (integrated data). Data integration applications: master data management (MDM), business domain MDM applications. Data integration techniques: data propagation, data consolidation, data federation, changed data capture (CDC), data transformation (restructure, cleanse, reconcile, aggregate). Data integration technologies: enterprise data replication (EDR), extract transformation load (ETL), enterprise content management (ECM), enterprise application integration (EAI), right-time ETL (RT-ETL), enterprise information integration (EII), Web services (services-oriented architecture, SOA). Data integration management: data quality management, metadata management, systems management]. Data Integration Techniques and Technologies: Data Consolidation centralized data Extract, transformation -
Data Profiling and Data Cleansing Introduction
Data Profiling and Data Cleansing: Introduction. 9.4.2013, Felix Naumann. Felix Naumann | Profiling & Cleansing | Summer 2013. Overview ■ Introduction to research group ■ Lecture organisation ■ (Big) data □ Data sources □ Profiling □ Cleansing ■ Overview of semester. Information Systems Team [diagram: DFG, IBM, HPI Research School. People: Prof. Felix Naumann, Arvid Heise, Katrin Heinrich, Dustin Lange, Johannes Lorey, Christoph Böhm, Anja Jentzsch, Ziawasch Abedjan, Tobias Vogel, Toni Grütze, Dr. Gjergji Kasneci, Zhe Zuo, Maximilian Jenders. Projects: DuDe, Stratosphere, GovWILD. Topics: Duplicate Detection, Data Fusion, Data Profiling, Entity Search, Information Integration, Data Scrubbing, Data as a Service, Data Cleansing, Information Quality, Web Data, Linked Open Data, RDF, Data Mining, Dependency Detection, ETL Management, Service-Oriented Systems, Entity Recognition, Opinion Mining]. Other courses in this semester: Lectures ■ DBS I (Bachelor) ■ Data Profiling and Data Cleansing. Seminars ■ Master: Large Scale Duplicate Detection ■ Master: Advanced Recommendation Techniques. Bachelorproject ■ VIP 2.0: Celebrity Exploration. Seminar: Advanced Recommendation Techniques ■ Goal: Cross-platform recommendation for posts on the Web □ Given a post on a website, find relevant (i.e., similar) posts from other websites □ Analyze post, author, and website features □ Implement and compare different state-of-the-art recommendation techniques … Calculate the similarity between posts … Recommend top-k posts
Dates and exercises ■ Lectures □ Tuesdays 9:15 – 10:45 □ Thursdays 9:15 – 10:45 ■ Exercises □ In parallel ■ First lecture □ 9.4.2013 ■ Exam □ Oral exam, 30 minutes □ Probably first week after lectures ■ Prerequisites □ To participate ◊ Background in databases (e.g. -
Oracle Warehouse Builder 11gR2: OWB ETL Using ODI Knowledge Modules
An Oracle White Paper, February 2010. Oracle Warehouse Builder 11gR2: OWB ETL Using ODI Knowledge Modules. Contents: Introduction; Oracle Warehouse Builder and Data Integration Requirements; OWB 11gR2 and Knowledge Module-Based Data Integration; Understanding Code Template-Based Data Integration; OWB Architecture Extensions for Code Templates; Understanding OWB 11.2 Code Template-Based ETL; OWB Data Types and Extensible Platforms; OWB Code Template Mappings vs. OWB Classic Mappings; OWB 11.2 Connectivity and Data Movement Technologies; Working with OWB Code Templates; Working with OWB Code Template Mappings; Migrating Existing Mappings to Code Template Technology; Conclusion; OWB 11.2 Resources. Introduction: This paper discusses how knowledge module technology from Oracle Data Integrator is used in Oracle Warehouse Builder Enterprise ETL 11gR2 (OWB-EE) code template mappings to add fast, flexible, heterogeneous data integration and changed data capture capabilities. OWB-EE 11gR2 adds -
Data Profiling: Designing the Blueprint for Improved Data Quality. Brett Dorr, DataFlux Corporation, Cary, NC; Pat Herbert, DataFlux Corporation, Cary, NC
SUGI 30 Data Warehousing, Management and Quality Paper 102-30 Data Profiling: Designing the Blueprint for Improved Data Quality Brett Dorr, DataFlux Corporation, Cary, NC Pat Herbert, DataFlux Corporation, Cary, NC ABSTRACT Many business and IT managers face the same problem: the data that serves as the foundation for their business applications (including customer relationship management (CRM) programs, enterprise resource planning (ERP) tools, and data warehouses) is inconsistent, inaccurate, and unreliable. Data profiling is the solution to this problem and, as such, is a fundamental step that should begin every data-driven initiative. This paper explores how data profiling can help determine the structure and completeness of data and, ultimately, improve data quality. The paper also covers the types of analysis that data profiling can provide as well as how data profiling fits into an overall data management strategy. INTRODUCTION Organizations around the world are looking for ways to turn data into a strategic asset. However, before data can be used as the foundation for high-level business intelligence efforts, an organization must address the quality problems that are endemic to the data that’s available on customers, products, inventory, assets, or finances. The most effective way to achieve consistent, accurate, and reliable data is to begin with data profiling. Data profiling involves using a tool that automates the discovery process. Ideally, this automation will help uncover the characteristics of the data and the relationships between data sources before any data-driven initiatives (such as data warehousing or enterprise application implementations) are executed. THE CASE FOR DATA PROFILING Not so long ago, the way to become a market leader was to have the right product at the right time. -
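The abstract above says data profiling helps determine the structure and completeness of data. As a rough illustration only, the Python sketch below computes three simple per-column measures over a list of records. The function name `profile_columns`, the sample rows, and the rule that `None` and blank strings count as missing are all assumptions made for this example; they are not taken from the paper or from any DataFlux product.

```python
from collections import Counter


def profile_columns(rows):
    """Return per-column completeness, distinct count, and observed value types.

    `rows` is a list of dicts, one dict per record. Values are assumed to be
    hashable scalars (strings, numbers), since they are collected into a set.
    """
    columns = sorted({key for row in rows for key in row})
    total = len(rows)
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Assumption of this sketch: None and blank strings count as missing.
        present = [v for v in values if v is not None and str(v).strip() != ""]
        report[col] = {
            # Share of records that carry a usable value for this column.
            "completeness": len(present) / total if total else 0.0,
            # Number of different values seen; hints at keys and code lists.
            "distinct": len(set(present)),
            # Mixed types in one column often point to a structural problem.
            "types": dict(Counter(type(v).__name__ for v in present)),
        }
    return report


# Hypothetical sample data: a duplicated id and a zip column with gaps and mixed types.
sample = [
    {"id": 1, "zip": "27513"},
    {"id": 2, "zip": ""},
    {"id": 3, "zip": 27513},
    {"id": 3},
]
profile = profile_columns(sample)
```

On this sample, the `zip` column would show partial completeness and two value types, and `id` would show fewer distinct values than records, which are the kinds of findings a profiling pass is meant to surface before a data-driven initiative starts.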
Rapid Data Quality Assessment Using Data Profiling
Rapid Data Quality Assessment Using Data Profiling. David Loshin, Knowledge Integrity, Inc. www.knowledge-integrity.com © 2010 Knowledge Integrity, Inc. (301)754-6350. David Loshin, Knowledge Integrity Inc.: David Loshin, president of Knowledge Integrity, Inc. (www.knowledge-integrity.com), is a recognized thought leader and expert consultant in the areas of data governance, data quality methods, tools, and techniques, master data management, and business intelligence. David is a prolific author on BI best practices, writing for the expert channel at www.b-eye-network.com and for “Ask The Expert” at Searchdatamanagement.techtarget.com, and has written numerous books on BI and data quality. His most recent book, “Master Data Management,” has been endorsed by data management industry leaders, and his valuable MDM insights can be reviewed at www.mdmbook.com. David can be reached at [email protected]. MDM Component Model. Business-Driven Information Requirements (Driver | Benefit | Information Requirement): Customer Intelligence | Increased revenue, increased share, cross-sell/up-sell, segmentation, targeting, retention, customer satisfaction, ease of doing business | Unified master customer data, matching/linkage, centralized analytics, quality data, eliminate redundancy. Risk & Compliance | Compliance, privacy, risk management, accurate response to audits, prevent fraud | Data quality, semantic consistency across business processes, consistency, -
Data Quality Fundamentals
Data Quality Fundamentals. David Loshin, Knowledge Integrity, Inc. www.knowledge-integrity.com © 2010 Knowledge Integrity, Inc. (301)754-6350. Agenda: The Data Quality Program; Data Quality Assessment; Using Data Quality Tools; Data Quality Inspection, Monitoring, and Control. THE DATA QUALITY PROGRAM. Data Quality Challenges: Consumer data validation of supplied data provides little value unless the supplier has an incentive to improve its product. Data errors introduced within the enterprise drain resources for scrap and rework, yet the remediation process seldom results in long-term improvements. Reacting to data integrity issues by cleansing the data does not improve productivity or operational efficiency. Ambiguous data definitions and lack of data standards prevent the most effective use of a centralized “source of truth” and limit automation of workflow. Proper data and application techniques must be employed to ensure the ability to respond to business opportunities. Centralization of integrated reference data opens up possibilities for reuse, both of the data and the process. Addressing the Problem: To address data quality effectively, we must be able to manage: identification of customer data quality expectations; definition of contextual metrics; assessment of levels of data quality; tracking of issues for process management; determination of best opportunities for improvement; elimination of the sources of problems; continuous measurement of improvement against baseline. Data Quality Framework: Data quality expectations; Measurement; Policies; Procedures; Governance; Standards; Monitor Performance; Training. -
Creating the Golden Record
Creating the Golden Record: Better Data through Chemistry. Donald J. Soulsby, metaWright.com. Agenda • The Golden Record • Master Data • Discovery • Integration • Quality • Master Data Strategy. DAMA – LinkedIn Group: C. Lwanga Yonke - Information Quality Practitioner: ...Spewak advocated using data dependency to determine the ideal sequence in which applications should be developed and implemented: “Develop the applications that create data before those that need to use that data” (p.10). Architecture Advocates: William Smith – Entity Lifecycle; Clive Finkelstein - Information Engineering – CRUD; Ron Ross - Resource Life Cycle Analysis. CRUD in a Perfect World. Canonical Synthesis. Broadly speaking, materials scientists investigate two types of phenomena. Both are based on the microstructures of materials: … ii. How do these microstructures influence the properties of the material (such as strength, electrical conductivity, or high frequency electromagnetic absorption)? http://www.its.caltech.edu/~matsci/WhatIs2.html Business VS Development Life Cycles. Zachman Framework for Enterprise Architecture: WHERE WHAT WHEN HOW WHO WHY CONTEXTUAL List of List of List of List of List -
The Importance of a Single Platform for Data Integration and Quality Management
helping build the smart and agile business. The Importance of a Single Platform for Data Integration and Quality Management. Colin White, BI Research, March 2008. Sponsored by Business Objects. TABLE OF CONTENTS: DATA INTEGRATION AND QUALITY: UNDERSTANDING THE PROBLEM; The Evolution of Data Integration and Quality Software; Building a Single Data Services Architecture; Applications; Service-Oriented Architecture Layer; Data Services Techniques; Data Services Management and Operations; Choosing Data Services Products; BUSINESS OBJECTS DATA SERVICES PLATFORM; BusinessObjects Data Services XI 3.0; Getting Started: Success Factors. Brand and product names mentioned in this paper may be the trademarks or registered trademarks of their respective owners. DATA INTEGRATION AND QUALITY: UNDERSTANDING THE PROBLEM. Companies are fighting a constant battle to integrate business data and content while managing data quality in their organizations. Compounding this difficulty is the growing use of workgroup computing and Web technologies, the storing of more data and content online, and the need to retain information longer for compliance reasons. These trends are causing data volumes to increase dramatically. Rising volumes are not the only cause of data integration and quality issues, however. The growing numbers of disparate systems that produce and distribute data and content also add to the complexity of the data integration and quality management environment. Business mergers and acquisitions only exacerbate the situation. [Margin note: The growing number of data sources is causing data integration problems.] -
OWB Data Quality
Oracle Warehouse Builder 10gR2 Transforming Data into Quality Information An Oracle Whitepaper January 2006 Note: This document is for informational purposes. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described in this document remains at the sole discretion of Oracle. This document in any form, software or printed matter, contains proprietary information that is the exclusive property of Oracle. This document and information contained herein may not be disclosed, copied, reproduced, or distributed to anyone outside Oracle without prior written consent of Oracle. This document is not part of your license agreement nor can it be incorporated into any contractual agreement with Oracle or its subsidiaries or affiliates. Oracle Warehouse Builder 10gR2 Transforming Data into Quality Information INTRODUCTION Enterprises have always relied on data to be successful. Customers, products, suppliers, and sales transactions all need to be described and tracked in one way or another. Even before computers became commercially available, the data in the form of paper records has been vital to both commercial and non-commercial organizations. With the advent of computing technology, the sophistication of data usage by businesses and governments grew exponentially. The technology industry serving these needs has generated many buzz words that come and go: decision support systems, data warehousing, customer relationship management, business intelligence, etc., but the fact remains the same—organizations need to make the best use of the data they have to increase their efficiency today and improve their planning for tomorrow. -
Unifying the Practices of Data Profiling, Integration, and Quality (dPIQ). By Philip Russom, Senior Manager, TDWI Research, The Data Warehousing Institute
TDWI MONOGRAPH SERIES, OCTOBER 2007. Unifying the Practices of Data Profiling, Integration, and Quality (dPIQ). By Philip Russom, Senior Manager, TDWI Research, The Data Warehousing Institute. Table of Contents: Defining dPIQ; Cycles and Dependencies in Data Profiling, Integration, and Quality; The Unified dPIQ Cycle; Recommendations. About the Author: PHILIP RUSSOM is the senior manager of TDWI Research for TDWI, where he oversees many of TDWI’s research-oriented publications, services, awards, and events. Prior to joining TDWI in 2005, Russom was an industry analyst covering BI at Forrester Research, Giga Information Group, and Hurwitz Group. He’s also run his own business as an independent industry analyst and BI consultant and was contributing editor with Intelligent Enterprise and DM Review magazines. Before that, Russom worked in technical and marketing positions for various database vendors. You can reach him at [email protected]. About Our Sponsor: DataFlux enables organizations to analyze, improve, and control their data through