Data Integration Overview
Phill Rizzo Regional Manager, Data Integration Solutions [email protected]
2UDFOH¶V'DWD,QWHJUDWLRQ6ROXWLRQV Oracle Products for Data Movement Comparing How They Work Oracle Data Integration Solutions Enterprise-Wide Solutions Support For Any Type of Data Integration
Analytical
OLTP Query / Report ODS EDW ODS OLTP OLTP
Operational
OLTP OLTP OLTP OLTP OLTP OLTP Old New OLTP Heterogeneous Heterogeneous Heterogeneous
Maximizing Availability Maximizing Availability Increased Availability
Component System/Database Site Redundancy Redundancy
MAA Foundation Focus on Data Protection Focus on Data Distribution and Availability
In Datacenter In Data Center or Across Data Centers In Data Centers or Across Data Centers
Server Failover Complete one-way physical replication Read-Write or subset logical replication Disk Striping Support for up to 30 standby databases Bi-Directional or Multi-Master Replication, Disk Mirroring Unique corruption protection Conflict Resolution Required. Continuous Point-in-Time recovery Oracle only with limited support for heterogeneous Platforms Robust support for heterogeneous Data Recovery platforms, endians, and databases Backups
Read-Only Access Read-Only Access while Read-Write access while replication is (Continuous transmission replication is Active active while recovery is stopped)
RAC Rolling Maintenance Database Rolling upgrade and Standby-first Patching Zero downtime database and application upgrades, and platform migrations
RAC, ASM, Flashback Data Guard Active Data GoldenGate RMAN, OSB (Included in Oracle EE) Guard Five Steps to Maximize Availability
GoldenGate Online platform and Active application upgrades Bi-directional and Data Guard multi-master replication Database failure Distribute read-only & System failure read-write workload Flashback Site failure An alternative to Zero data loss physical replication Continuous for site protection point-in-time Automatic database Flexible planned recovery failover RAC maintenance and Lost-write protection Instance failure Granular repair heterogeneous of logical Database rolling migrations Server failure corruptions upgrade Zero downtime RAC rolling ASM, Transaction Offload read-only upgrades and maintenance RMAN, Table workload and migrations Performance backups OSB Database scale-out Some migrations Storage failure Consolidation Data recovery Backups GoldenGate Oracle GoldenGate Technology Differentiators
Oracle GoldenGate provides low-impact capture, routing, transformation, and delivery of database transactions across heterogeneous environments in real-time
Key Differentiators
Non-intrusive, Low Impact, Outside DB, Performance Sub-second to Seconds Latency
Extensible & Open, Modular, Asynchronous Architecture - Flexible Heterogeneous Sources & Targets
Maintains Transactional Integrity - Resilient Reliable Against Interruptions and Failures Oracle GoldenGate Runtime Architecture
Capture: Committed changes are captured (and can be filtered) in Real Time by reading the transaction logs.
Trail files: Stages and queues data for routing. Binary file that can be encrypted.
Pump: OGG process to distribute data for routing to one or multiple targets.
Route: Data can be compressed and encrypted for routing over TCP/IP.
Delivery: Applies data with transaction integrity, transforming the data as required.
TCP /IP Target DB Log LAN / WAN / DB Client Trail Files Pump Delivery Capture Internet Library
DB Client DB Log Delivery Bi-directional Library Pump Capture
Manager Process: Start ± Stop, Process Monitoring & Restart, Trail File Housekeeping GoldenGate: Heterogeneity
Databases O/S and Platforms Oracle GoldenGate Capture and Delivery: Linux Oracle Solaris DB2 Windows 2000, 2003, XP Microsoft SQL Server Sybase ASE HP NonStop Teradata HP-UX Enscribe IBM AIX SQL/MP SQL/MX IBM zSeries MySQL IBM iSeries (AS/400) JMS message queues zLinux
Oracle GoldenGate Delivery: All listed above, plus: TimesTen, IBM System i Netezza, Greenplum, HP Neoview ETL products
GoldenGate: Performance / Reliability
Performance Reliability Log-based change data capture Decoupled architecture
± High volume ± Individual processes can be restarted automatically ± Low overhead ± Tolerance to network outages Decoupled architecture (configurable) ± Multiple capture and delivery processes may be used to scale, but Recovery generally not required ± Recovery ensures that no operations are skipped or duplicated after failure ± 3RVVLEOHWRVSOLW³KRW´WDEOHVLQWRD of any kind separate capture and delivery process Filtering and compression Checkpointing Transaction Grouping for Delivery ± In Capture and Delivery Record Batching Transaction Integrity
GoldenGate & Active Data Guard Active Data Guard and GoldenGate Comparison
Active Data Guard Golden Gate Complete one-way physical replication Logical replication: Active-Active HA, one to many, Standby database is an exact copy open read- many to one, subset replication, transformations, etc only Target is a different database with the same data, it Backups are interchangeable may have a different physical structure or indexing Integrated automatic database failover scheme Dual-purpose standby as test system open Extremely flexible, source and target are open read- read-write write Simplest to use, supports all applications and Trade-off flexibility for more deployment workloads considerations: performance, management, some data Transparent to operate ± no data type type restrictions, application support restrictions Unique protection from silent corruption caused Standard corruption protection integrated with Oracle by lost-writes and automatic repair of corrupt Database data blocks Synchronous or asynchronous redo Logical asynchronous replication transmission Standby-first patching Flexible options for rolling maintenance Database rolling maintenance and upgrades Zero downtime migration and application upgrades Limited cross-platform support Extensive cross-platform support Oracle Active Data Guard Real-Time Data Protection and Availability
Exact copy of primary Disaster Recovery - Manual or Automatic Failover
Exact copy of Query & Report Offloading primary Open Read-Only
Snapshot standby Convert to Test Database Open Read-Write Database Data Guard Single Command Refresh Redo Transport New DB Standby First Patching, SYNC or ASYNC Version Database Rolling Upgrade Oracle Database Exact Offload RMAN Backups, copy of primary Source for Data Extract Note: A DR copy may be multi-purposed for any combination Data Integrator of use cases Data described Warehouse Extract Offload Oracle GoldenGate Low-Impact, Real-Time Data Integration & Transactional Replication
New DB/ HW/OS/APP Zero Downtime Upgrade & Migration
Fully Active Active-Active for High Distributed Availability Legacy Systems DB
Exact Copy of Primary Disaster Recovery for Non- Oracle DB
Log-based Reporting Database Query & Report Offloading Changed Data
Data Oracle & Non-Oracle Warehouse Real-time BI, Operational Database(s) Reporting, MDM
Data Integrator ODS Global Data Data Synchronization within the Centers Enterprise Message Bus Message Bus Event Driven Architecture, SOA GoldenGate and Active Data Guard In Concert
Heterogeneous Distributed Oracle Fully-active Oracle Active Data GoldenGate Subset Guard Replicas Oracle DB Primary Oracle DB Standby
For Information Distribution & Consolidation, Application Upgrades & Changes ± Use GoldenGate - heterogeneous, active-active, transformations, subsetting For Disaster Recovery / Data Protection / HA ± Simple Full Oracle Database Protection Use Active Data Guard ± Application desiring flexible HA, active-active, schema changes, platform changes Use GoldenGate Combine the two for full database protection and information distribution Oracle Data Integrator Oracle Data Integrator Improved Performance, Increased Productivity and Lower TCO
Legacy Sources
E-LT Transformation Any Data vs. E-T-L Warehouse Application Sources Declarative Set-based design
Change Data Capture for Any Dynamic Updates Planning System OLTP DB Hot-pluggable Architecture Sources
Pluggable Knowledge Modules Optimized Data Loading through E-LT The key to improved performance and reduced costs
Conventional ETL Architecture E-LT provides flexible Extract Transform Load architecture for optimized performance Benefits: Leverage Set-based transformations Next Generation Architecture Improved performance for ³E-LT´ loading, no network hop Takes advantage of Extract Load existing hardware
Transform Transform Declarative Design Improved Developer Productivity and Lower Maintenance Costs
Specify ETL Data Flow Graph Conventional ETL Design Developer must define every step of Complex ETL Flow Logic Traditional approach requires specialized ETL skills And significant development and maintenance efforts
Declarative Set-based Design Simplifies the number of steps ODI Declarative Design Automatically generates the Data Flow whatever the sources and target DB 1 2 Define Automatically Generate Benefits What Dataflow 9 Significantly reduce the learning curve You Want 9 Shorter implementation times Define How: Built-in Templates 9 Streamline access to non-IT pros ODI-EE Knowledge Modules Pluggable Architecture for Improved Flexibility
Pluggable Knowledge Modules Architecture
Reverse Load from Journalize Check Integrate, Data Engineer Source to (CDC) Constrains Transform Service Metadata Staging
Sample out-of-the-box Knowledge Modules
SQL Server Oracle Check MS TPump/ Oracle Web SAP/R3 Log Miner JMS Queues Oracle Merge Triggers DBLink Excel Multiload Services
Oracle Check Siebel EIM DB2 Web Siebel DB2 Journals DB2 Exp/Imp Type II SCD SQL*Loader Sybase Schema Services
Key Architecture Benefits:
Ease of maintenance, Flexibility, tailored to existing best practices, Reduces cost of ownership Oracle Data Integrator and GoldenGate Complementary and Used Together Data Integrator and GoldenGate
Oracle Data Integrator Enterprise Edition
E-LT Oracle GoldenGate Transformation Real-time Data Heterogeneous Sources Heterogeneous Targets
Bulk Data Movement Real-Time Data Integration and Transformation and Replication
Fastest E-LT Solution Fastest real-time solution Sub-second latency for real-time feeds Optimized SET-based transformation for high volume transformations Guarantee delivery eliminates data loss Data lineage for improved manageability Eliminates down-time for migration and upgrades Integrates to Data Quality Least intrusive to source systems
Leverage ELT/ETL for complex transformation
Oracle Data Integrator Oracle GoldenGate
2UDFOH¶V'DWD,QWHJUDWLRQ-RLQW6ROXWLRQ Best-of-Breed and Proven
Oracle GoldenGate Oracle Data Integrator
Technology Differentiators:
Lowest latency and E-LT architecture for best performance Performance highest throughput; non-invasive, low of high data volume overhead transformations
De-coupled Knowledge Module Extensible & architecture; multiple architecture for Flexible deployment styles; extensibility and open and extensible flexible connectivity
Maintain SOA-native, transactional integrated with Enterprise integrity; resilient Fusion MW to fit against interruptions future enterprise and failures architectures Oracle for Real-Time Business Intelligence Best Practices Architecture
OBI EE Suite Plus
Real-time Analytics Historic Analytics
Batch Feed
Oracle DIM DIM Log-based, Real-time Data Feeds Data Integrator EMP DEPT trans3 trans2 trans1 FACT
27 EMP DEPT DIM DIM Source OLTP System Oracle GoldenGate ODS Schema DW Schema
Oracle Exadata Real-Time CDC Integration for Data Warehouse Best-in-class, integrated solution for Real-Time Data Warehouse
Traditional ETL + CDC ODI + Oracle GoldenGate
Invasive Capture on OLTP systems Continuous feeds from operational using complex Adapters systems Batch window dependency Maximum availability for sources and DW Transformations in ETL engine on Non-invasive data capture expensive middle tier servers Thin middle tier with transformations on Bulk load to the data warehouse with the database platform (target) large nightly/daily batch Mini-batches throughout the day or bulk Extract processing nightly GG+ ODI GG+ ODI Trickle Xform Xform Bulk
Lookup Lookup Data Data Staging Load Heterogeneous