Oracle Data Integrator Best Practices for a Data Warehouse

Total Page:16

File Type:pdf, Size:1020Kb

Oracle Data Integrator Best Practices for a Data Warehouse Oracle Data Integrator Best Practices for a Data Warehouse Oracle Best Practices March 2008 Oracle Data Integrator Best Practices for a Data Warehouse PREFACE ................................................................................................................ 7 PURPOSE .................................................................................................................. 7 AUDIENCE ................................................................................................................. 7 ADDITIONAL INFORMATION .......................................................................................... 7 INTRODUCTION TO ORACLE DATA INTEGRATOR (ODI) .......................................... 8 OBJECTIVES ............................................................................................................... 8 BUSINESS -RULES DRIVEN APPROACH .............................................................................. 8 Introduction to Business rules ........................................................................... 8 Mappings .......................................................................................................... 9 Joins .................................................................................................................. 9 Filters................................................................................................................. 9 Constraints ........................................................................................................ 9 TRADITIONAL ETL VERSUS E-LT APPROACH ..................................................................... 9 UNDERSTANDING ORACLE DATA INTEGRATOR (ODI) INTERFACES ...................................... 10 A BUSINESS PROBLEM CASE STUDY .............................................................................. 11 IMPLEMENTATION USING MANUAL CODING ................................................................... 13 IMPLEMENTATION USING TRADITIONAL ETL TOOLS ......................................................... 15 IMPLEMENTATION USING ODI’S E-LT AND THE BUSINESS -RULE DRIVEN APPROACH .............. 17 Specifying the Business Rules in the Interface................................................. 17 Business Rules are Converted into a Process .................................................. 18 BENEFITS OF E-LT COMBINED WITH A BUSINESS -RULE DRIVEN APPROACH .......................... 20 ARCHITECTURE OF ORACLE DATA INTEGRATOR (ODI) ......................................... 23 ARCHITECTURE OVERVIEW .......................................................................................... 23 GRAPHICAL USER INTERFACES ..................................................................................... 24 REPOSITORY ............................................................................................................ 24 SCHEDULER AGENT ................................................................................................... 26 METADATA NAVIGATOR ............................................................................................. 27 Oracle Data Integrator Best Practices for a Data Warehouse Page 2 USING ORACLE DATA INTEGRATOR IN YOUR DATA WAREHOUSE PROJECT ......... 28 ODI AND THE DATA WAREHOUSE PROJECT ................................................................... 28 ORGANIZING THE TEAMS ............................................................................................ 28 REVERSE -ENGINEERING , AUDITING AND PROFILING SOURCE APPLICATIONS .......................... 30 DESIGNING AND IMPLEMENTING THE DATA WAREHOUSE ’S SCHEMA .................................. 32 SPECIFYING AND DESIGNING BUSINESS RULES ................................................................ 33 BUILDING A DATA QUALITY FRAMEWORK ...................................................................... 38 DEVELOPING ADDITIONAL COMPONENTS ...................................................................... 39 PACKAGING AND RELEASING DEVELOPMENT .................................................................. 40 VERSIONING DEVELOPMENT ....................................................................................... 40 SCHEDULING AND OPERATING SCENARIOS ..................................................................... 41 MONITORING THE DATA QUALITY OF THE DATA WAREHOUSE ........................................... 41 PUBLISHING METADATA TO BUSINESS USERS ................................................................. 41 PLANNING FOR NEXT RELEASES ................................................................................... 42 DEFINING THE TOPOLOGY IN ORACLE DATA INTEGRATOR .................................. 44 INTRODUCTION TO TOPOLOGY ..................................................................................... 44 DATA SERVERS ......................................................................................................... 45 Understanding Data Servers and Connectivity ............................................... 45 Defining User Accounts or Logins for ODI to Access your Data Servers .......... 47 Defining Work Schemas for the Staging Area ................................................. 47 Defining the Data Server ................................................................................. 48 Examples of Data Servers Definitions ............................................................. 49 Teradata ...................................................................................................................49 Oracle .......................................................................................................................50 Microsoft SQL Server ...............................................................................................50 IBM DB2 UDB (v8 and higher) ..................................................................................50 IBM DB2 UDB (v6, v7) and IBM DB2 MVS ................................................................51 IBM DB2 400 (iSeries) ..............................................................................................51 Flat Files (Using ODI Agent)......................................................................................52 XML ..........................................................................................................................52 Microsoft Excel ........................................................................................................53 PHYSICAL SCHEMAS ................................................................................................... 53 CONTEXTS ............................................................................................................... 54 LOGICAL SCHEMAS .................................................................................................... 56 PHYSICAL AND LOGICAL AGENTS .................................................................................. 56 THE TOPOLOGY MATRIX ............................................................................................ 56 OBJECT NAMING CONVENTIONS .................................................................................. 59 DEFINING ODI MODELS ....................................................................................... 61 INTRODUCTION TO MODELS ........................................................................................ 61 MODEL CONTENTS .................................................................................................... 62 IMPORTING METADATA AND REVERSE -ENGINEERING ....................................................... 63 Introduction to Reverse-engineering............................................................... 63 Reverse-engineering Relational Databases..................................................... 64 Non Relational Models .................................................................................... 64 Flat Files and JMS Queues and Topics ......................................................................64 Fixed Files and Binary Files with COBOL Copy Books ...............................................65 XML ..........................................................................................................................65 LDAP Directories ......................................................................................................66 Other Non Relation Models .....................................................................................66 Troubleshooting Reverse-engineering ............................................................ 67 Oracle Data Integrator Best Practices for a Data Warehouse Page 3 JDBC Reverse-engineering Failure ...........................................................................67 Missing Data Types ..................................................................................................67 Missing Constraints ..................................................................................................68 CREATING USER -DEFINED DATA QUALITY RULES ............................................................. 68 ADDING USER -DEFINED METADATA WITH FLEX FIELDS .................................................... 70 DOCUMENTING MODELS FOR BUSINESS USERS AND DEVELOPERS ...................................... 71 EXPORTING METADATA ............................................................................................. 72 IMPACT ANALYSIS , DATA LINEAGE AND CROSS -REFERENCES ............................................
Recommended publications
  • Delivered with Infosphere Warehouse Cubing Services
    Front cover Multidimensional Analytics: Delivered with InfoSphere Warehouse Cubing Services Getting more information from your data warehousing environment Multidimensional analytics for improved decision making Efficient decisions with no copy analytics Chuck Ballard Silvio Ferrari Robert Frankus Sascha Laudien Andy Perkins Philip Wittann ibm.com/redbooks International Technical Support Organization Multidimensional Analytics: Delivered with InfoSphere Warehouse Cubing Services April 2009 SG24-7679-00 Note: Before using this information and the product it supports, read the information in “Notices” on page vii. First Edition (April 2009) This edition applies to IBM InfoSphere Warehouse Cubing Services, Version 9.5.2 and IBM Cognos Cubing Services 8.4. © Copyright International Business Machines Corporation 2009. All rights reserved. Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. Contents Notices . vii Trademarks . viii Preface . ix The team that wrote this book . x Become a published author . xiii Comments welcome. xiv Chapter 1. Introduction. 1 1.1 Multidimensional Business Intelligence: The Destination . 2 1.1.1 Dimensional model . 3 1.1.2 Providing OLAP data. 5 1.1.3 Consuming OLAP data . 7 1.1.4 Pulling it together . 8 1.2 Conclusion. 9 Chapter 2. A multidimensional infrastructure . 11 2.1 The need for multidimensional analysis . 12 2.1.1 Identifying uses for a cube . 13 2.1.2 Getting answers with no queries . 16 2.1.3 Components of a cube . 17 2.1.4 Selecting dimensions . 17 2.1.5 Why create a star-schema . 18 2.1.6 More help from InfoSphere Warehouse Cubing Services.
    [Show full text]
  • Database Administration Oracle Standards
    CMS DATABASE ADMINISTRATION ORACLE STANDARDS 5/16/2011 Contents 1. Overview ....................................................................................................................................................... 4 2. Oracle Database Development Life Cycle ..................................................................................................... 4 2.1 Development Phase .............................................................................................................................. 4 2.2 Test Validation Phase ............................................................................................................................ 5 2.3 Production Phase .................................................................................................................................. 5 2.4 Maintenance Phase .............................................................................................................................. 6 2.5 Retirement of Development and Test Environments ........................................................................... 6 3. Oracle Database Design Standards ............................................................................................................... 6 3.1 Oracle Design Overview ........................................................................................................................ 6 3.2 Instances ..............................................................................................................................................
    [Show full text]
  • Schema in Database Sql Server
    Schema In Database Sql Server Normie waff her Creon stringendo, she ratten it compunctiously. If Afric or rostrate Jerrie usually files his terrenes shrives wordily or supernaturalized plenarily and quiet, how undistinguished is Sheffy? Warring and Mahdi Morry always roquet impenetrably and barbarizes his boskage. Schema compare tables just how the sys is a table continues to the most out longer function because of the connector will often want to. Roles namely actors in designer slow and target multiple teams together, so forth from sql management. You in sql server, should give you can learn, and execute this is a location of users: a database projects, or more than in. Your sql is that the view to view of my data sources with the correct. Dive into the host, which objects such a set of lock a server database schema in sql server instance of tables under the need? While viewing data in sql server database to use of microseconds past midnight. Is sql server is sql schema database server in normal circumstances but it to use. You effectively structure of the sql database objects have used to it allows our policy via js. Represents table schema in comparing new database. Dml statement as schema in database sql server functions, and so here! More in sql server books online schema of the database operator with sql server connector are not a new york, with that object you will need. This in schemas and history topic names are used to assist reporting from. Sql schema table as views should clarify log reading from synonyms in advance so that is to add this game reports are.
    [Show full text]
  • Referential Integrity in Sqlite
    CS 564: Database Management Systems University of Wisconsin - Madison, Fall 2017 Referential Integrity in SQLite Declaring Referential Integrity (Foreign Key) Constraints Foreign key constraints are used to check referential integrity between tables in a database. Consider, for example, the following two tables: create table Residence ( nameVARCHARPRIMARY KEY, capacityINT ); create table Student ( idINTPRIMARY KEY, firstNameVARCHAR, lastNameVARCHAR, residenceVARCHAR ); We can enforce the constraint that a Student’s residence actually exists by making Student.residence a foreign key that refers to Residence.name. SQLite lets you specify this relationship in several different ways: create table Residence ( nameVARCHARPRIMARY KEY, capacityINT ); create table Student ( idINTPRIMARY KEY, firstNameVARCHAR, lastNameVARCHAR, residenceVARCHAR, FOREIGNKEY(residence) REFERENCES Residence(name) ); or create table Residence ( nameVARCHARPRIMARY KEY, capacityINT ); create table Student ( idINTPRIMARY KEY, firstNameVARCHAR, lastNameVARCHAR, residenceVARCHAR REFERENCES Residence(name) ); or create table Residence ( nameVARCHARPRIMARY KEY, 1 capacityINT ); create table Student ( idINTPRIMARY KEY, firstNameVARCHAR, lastNameVARCHAR, residenceVARCHAR REFERENCES Residence-- Implicitly references the primary key of the Residence table. ); All three forms are valid syntax for specifying the same constraint. Constraint Enforcement There are a number of important things about how referential integrity and foreign keys are handled in SQLite: • The attribute(s) referenced by a foreign key constraint (i.e. Residence.name in the example above) must be declared UNIQUE or as the PRIMARY KEY within their table, but this requirement is checked at run-time, not when constraints are declared. For example, if Residence.name had not been declared as the PRIMARY KEY of its table (or as UNIQUE), the FOREIGN KEY declarations above would still be permitted, but inserting into the Student table would always yield an error.
    [Show full text]
  • Create Table Identity Primary Key Sql Server
    Create Table Identity Primary Key Sql Server Maurits foozle her Novokuznetsk sleeplessly, Johannine and preludial. High-principled and consonantal Keil often stroke triboluminescentsome proletarianization or spotlight nor'-east plop. or volunteer jealously. Foul-spoken Fabio always outstrips his kursaals if Davidson is There arise two ways to create tables in your Microsoft SQL database. Microsoft SQL Server has built-in an identity column fields which. An identity column contains a known numeric input for a row now the table. SOLVED Can select remove Identity from a primary case with. There cannot create table created on every case, primary key creates the server identity column if the current sql? As I today to refute these records into a U-SQL table review I am create a U-SQL database. Clustering option requires separate table created sequence generator always use sql server tables have to the key. This key creates the primary keys per the approach is. We love create Auto increment columns in oracle by using IDENTITY. PostgreSQL Identity Column PostgreSQL Tutorial. Oracle Identity Column A self-by-self Guide with Examples. Constraints that table created several keys means you can promote a primary. Not logged in Talk Contributions Create account already in. Primary keys are created, request was already creates a low due to do not complete this. IDENTITYNOT NULLPRIMARY KEY Identity Sequence. How weak I Reseed a SQL Server identity column TechRepublic. Hi You can use one query eg Hide Copy Code Create table tblEmplooyee Recordid bigint Primary key identity. SQL CREATE TABLE Statement Tutorial Republic. Hcl will assume we need be simplified to get the primary key multiple related two dissimilar objects or adding it separates structure is involved before you create identity? When the identity column is part of physician primary key SQL Server.
    [Show full text]
  • Resource Collection for High Ability Secondary Learners 2011
    Resource Collection for High Ability Secondary Learners Office of Gifted Education Montgomery County Public Schools 2011 - 2012 Table of Contents 2011 – 2012 Materials for High Ability Secondary Students How to Order .................................................................................................................................. 3 Professional Resources for Teachers .............................................................................................. 4 Differentiation ............................................................................................................................. 4 Assessment .................................................................................................................................. 5 Learning Styles and Multiple Intelligences ................................................................................ 6 Curriculum, Strategies and Techniques ...................................................................................... 7 Miscellaneous ............................................................................................................................. 9 English ...................................................................................................................................... 10 Mathematics .............................................................................................................................. 13 History......................................................................................................................................
    [Show full text]
  • Oracle Financials Open Interfaces Manual Release 11 the Part Number for This Manual Is A58482–01
    Oracler Financials Open Interfaces Manual RELEASE 11 March 1998 Oracle Financials Open Interfaces Manual Release 11 The part number for this manual is A58482–01. E Copyright 1995, 1998, Oracle Corporation. All rights reserved. Primary Authors: Christopher Andrews, Louis Bryan, Janet Buchbinder, Frank Colligan, Gail D’Aloisio, Stephen Damiani, Sharon Goetz, Christina Ravaglia Contributing Authors: D. Yitzik Brenman, Steve Carter, Caroline Guenther, Beth Mitchum The Programs (which include both the software and documentation) contain proprietary information of Oracle Corporation; they are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright, patent and other intellectual property law. Reverse engineering of the Programs is prohibited. No part of this document may be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without the express written permission of Oracle Corporation. The information contained in this document is subject to change without notice. If you find any problems in the documentation, please report them to us in writing. Oracle Corporation does not warrant that this document is error free. RESTRICTED RIGHTS LEGEND Programs delivered subject to the DOD FAR Supplement are ’commercial computer software’ and use, duplication and disclosure of the Programs shall be subject to the licensing restrictions set forth in the applicable Oracle license agreement. Otherwise, Programs delivered subject to the Federal Acquisition Regulations are ’restricted computer software’ and use, duplication and disclosure of the Programs shall be subject to the restrictions in FAR 52.227–14, Rights in Data –– General, including Alternate III (June 1987). Oracle Corporation, 500 Oracle Parkway, Redwood City, CA 94065.” The Programs are not intended for use in any nuclear, aviation, mass transit, medical, or other inherently dangerous applications.
    [Show full text]
  • The Unconstrained Primary Key
    IBM Systems Lab Services and Training The Unconstrained Primary Key Dan Cruikshank www.ibm.com/systems/services/labservices © 2009 IBM Corporation In this presentation I build upon the concepts that were presented in my article “The Keys to the Kingdom”. I will discuss how primary and unique keys can be utilized for something other than just RI. In essence, it is about laying the foundation for data centric programming. I hope to convey that by establishing some basic rules the database developer can obtain reasonable performance. The title is an oxymoron, in essence a Primary Key is a constraint, but it is a constraint that gives the database developer more freedom to utilize an extremely powerful relational database management system, what we call DB2 for i. 1 IBM Systems Lab Services and Training Agenda Keys to the Kingdom Exploiting the Primary Key Pagination with ROW_NUMBER Column Ordering Summary 2 www.ibm.com/systems/services/labservices © 2009 IBM Corporation I will review the concepts I introduced in the article “The Keys to the Kingdom” published in the Centerfield. I think this was the inspiration for the picture. I offered a picture of me sitting on the throne, but that was rejected. I will follow this with a discussion on using the primary key as a means for creating peer or subset tables for the purpose of including or excluding rows in a result set. The ROW_NUMBER function is part of the OLAP support functions introduced in 5.4. Here I provide some examples of using ROW_NUMBER with the BETWEEN predicate in order paginate a result set.
    [Show full text]
  • Keys Are, As Their Name Suggests, a Key Part of a Relational Database
    The key is defined as the column or attribute of the database table. For example if a table has id, name and address as the column names then each one is known as the key for that table. We can also say that the table has 3 keys as id, name and address. The keys are also used to identify each record in the database table . Primary Key:- • Every database table should have one or more columns designated as the primary key . The value this key holds should be unique for each record in the database. For example, assume we have a table called Employees (SSN- social security No) that contains personnel information for every employee in our firm. We’ need to select an appropriate primary key that would uniquely identify each employee. Primary Key • The primary key must contain unique values, must never be null and uniquely identify each record in the table. • As an example, a student id might be a primary key in a student table, a department code in a table of all departments in an organisation. Unique Key • The UNIQUE constraint uniquely identifies each record in a database table. • Allows Null value. But only one Null value. • A table can have more than one UNIQUE Key Column[s] • A table can have multiple unique keys Differences between Primary Key and Unique Key: • Primary Key 1. A primary key cannot allow null (a primary key cannot be defined on columns that allow nulls). 2. Each table can have only one primary key. • Unique Key 1. A unique key can allow null (a unique key can be defined on columns that allow nulls.) 2.
    [Show full text]
  • CHAPTER 3 - Relational Database Modeling
    DATABASE SYSTEMS Introduction to Databases and Data Warehouses, Edition 2.0 CHAPTER 3 - Relational Database Modeling Copyright (c) 2020 Nenad Jukic and Prospect Press MAPPING ER DIAGRAMS INTO RELATIONAL SCHEMAS ▪ Once a conceptual ER diagram is constructed, a logical ER diagram is created, and then it is subsequently mapped into a relational schema (collection of relations) Conceptual Model Logical Model Schema Jukić, Vrbsky, Nestorov, Sharma – Database Systems Copyright (c) 2020 Nenad Jukic and Prospect Press Chapter 3 – Slide 2 INTRODUCTION ▪ Relational database model - logical database model that represents a database as a collection of related tables ▪ Relational schema - visual depiction of the relational database model – also called a logical model ▪ Most contemporary commercial DBMS software packages, are relational DBMS (RDBMS) software packages Jukić, Vrbsky, Nestorov, Sharma – Database Systems Copyright (c) 2020 Nenad Jukic and Prospect Press Chapter 3 – Slide 3 INTRODUCTION Terminology Jukić, Vrbsky, Nestorov, Sharma – Database Systems Copyright (c) 2020 Nenad Jukic and Prospect Press Chapter 3 – Slide 4 INTRODUCTION ▪ Relation - table in a relational database • A table containing rows and columns • The main construct in the relational database model • Every relation is a table, not every table is a relation Jukić, Vrbsky, Nestorov, Sharma – Database Systems Copyright (c) 2020 Nenad Jukic and Prospect Press Chapter 3 – Slide 5 INTRODUCTION ▪ Relation - table in a relational database • In order for a table to be a relation the following conditions must hold: o Within one table, each column must have a unique name. o Within one table, each row must be unique. o All values in each column must be from the same (predefined) domain.
    [Show full text]
  • Relational Query Languages
    Relational Query Languages Universidad de Concepcion,´ 2014 (Slides adapted from Loreto Bravo, who adapted from Werner Nutt who adapted them from Thomas Eiter and Leonid Libkin) Bases de Datos II 1 Databases A database is • a collection of structured data • along with a set of access and control mechanisms We deal with them every day: • back end of Web sites • telephone billing • bank account information • e-commerce • airline reservation systems, store inventories, library catalogs, . Relational Query Languages Bases de Datos II 2 Data Models: Ingredients • Formalisms to represent information (schemas and their instances), e.g., – relations containing tuples of values – trees with labeled nodes, where leaves contain values – collections of triples (subject, predicate, object) • Languages to query represented information, e.g., – relational algebra, first-order logic, Datalog, Datalog: – tree patterns – graph pattern expressions – SQL, XPath, SPARQL Bases de Datos II 3 • Languages to describe changes of data (updates) Relational Query Languages Questions About Data Models and Queries Given a schema S (of a fixed data model) • is a given structure (FOL interpretation, tree, triple collection) an instance of the schema S? • does S have an instance at all? Given queries Q, Q0 (over the same schema) • what are the answers of Q over a fixed instance I? • given a potential answer a, is a an answer to Q over I? • is there an instance I where Q has an answer? • do Q and Q0 return the same answers over all instances? Relational Query Languages Bases de Datos II 4 Questions About Query Languages Given query languages L, L0 • how difficult is it for queries in L – to evaluate such queries? – to check satisfiability? – to check equivalence? • for every query Q in L, is there a query Q0 in L0 that is equivalent to Q? Bases de Datos II 5 Research Questions About Databases Relational Query Languages • Incompleteness, uncertainty – How can we represent incomplete and uncertain information? – How can we query it? .
    [Show full text]
  • The Relational Data Model and Relational Database Constraints
    chapter 33 The Relational Data Model and Relational Database Constraints his chapter opens Part 2 of the book, which covers Trelational databases. The relational data model was first introduced by Ted Codd of IBM Research in 1970 in a classic paper (Codd 1970), and it attracted immediate attention due to its simplicity and mathematical foundation. The model uses the concept of a mathematical relation—which looks somewhat like a table of values—as its basic building block, and has its theoretical basis in set theory and first-order predicate logic. In this chapter we discuss the basic characteristics of the model and its constraints. The first commercial implementations of the relational model became available in the early 1980s, such as the SQL/DS system on the MVS operating system by IBM and the Oracle DBMS. Since then, the model has been implemented in a large num- ber of commercial systems. Current popular relational DBMSs (RDBMSs) include DB2 and Informix Dynamic Server (from IBM), Oracle and Rdb (from Oracle), Sybase DBMS (from Sybase) and SQLServer and Access (from Microsoft). In addi- tion, several open source systems, such as MySQL and PostgreSQL, are available. Because of the importance of the relational model, all of Part 2 is devoted to this model and some of the languages associated with it. In Chapters 4 and 5, we describe the SQL query language, which is the standard for commercial relational DBMSs. Chapter 6 covers the operations of the relational algebra and introduces the relational calculus—these are two formal languages associated with the relational model.
    [Show full text]