Data Integration Overview

Phill Rizzo Regional Manager, Integration Solutions [email protected]

2UDFOH¶V'DWD,QWHJUDWLRQ6ROXWLRQV Oracle Products for Data Movement Comparing How They Work Oracle Solutions Enterprise-Wide Solutions Support For Any Type of Data Integration

Analytical

OLTP Query / Report ODS EDW ODS OLTP OLTP

Operational

OLTP OLTP OLTP OLTP OLTP OLTP Old New OLTP Heterogeneous Heterogeneous Heterogeneous

Maximizing Availability Maximizing Availability Increased Availability

Component System/Database Site Redundancy Redundancy

MAA Foundation Focus on Data Protection Focus on Data Distribution and Availability

In Datacenter In Data Center or Across Data Centers In Data Centers or Across Data Centers

Server Failover Complete one-way physical replication Read-Write or subset logical replication Disk Striping Support for up to 30 standby databases Bi-Directional or Multi-Master Replication, Disk Mirroring Unique corruption protection Conflict Resolution Required. Continuous Point-in-Time recovery Oracle only with limited support for heterogeneous Platforms Robust support for heterogeneous platforms, endians, and databases

Read-Only Access Read-Only Access while Read-Write access while replication is (Continuous transmission replication is Active active while recovery is stopped)

RAC Rolling Maintenance Database Rolling upgrade and Standby-first Patching Zero downtime database and application upgrades, and platform migrations

RAC, ASM, Flashback Data Guard Active Data GoldenGate RMAN, OSB (Included in Oracle EE) Guard Five Steps to Maximize Availability

GoldenGate ‡ Online platform and Active application upgrades ‡ Bi-directional and Data Guard multi-master replication ‡ Database failure ‡ Distribute read-only & ‡ System failure read-write workload Flashback ‡ Site failure ‡ An alternative to ‡ Zero physical replication ‡ Continuous for site protection point-in-time ‡ Automatic database ‡ Flexible planned recovery failover RAC maintenance and ‡ Lost-write protection ‡ Instance failure ‡ Granular repair heterogeneous of logical ‡ Database rolling migrations ‡ Server failure corruptions upgrade ‡ Zero downtime ‡ RAC rolling ASM, ‡ Transaction ‡ Offload read-only upgrades and maintenance RMAN, ‡ Table workload and migrations ‡ Performance backups OSB ‡ Database scale-out ‡ Some migrations ‡ Storage failure ‡ Consolidation ‡ Data recovery ‡ Backups GoldenGate Oracle GoldenGate Technology Differentiators

Oracle GoldenGate provides low-impact capture, routing, transformation, and delivery of database transactions across heterogeneous environments in real-time

Key Differentiators

Non-intrusive, Low Impact, Outside DB, Performance Sub-second to Seconds Latency

Extensible & Open, Modular, Asynchronous Architecture - Flexible Heterogeneous Sources & Targets

Maintains Transactional Integrity - Resilient Reliable Against Interruptions and Failures Oracle GoldenGate Runtime Architecture

Capture: Committed changes are captured (and can be filtered) in Real Time by reading the transaction logs.

Trail files: Stages and queues data for routing. Binary file that can be encrypted.

Pump: OGG process to distribute data for routing to one or multiple targets.

Route: Data can be compressed and encrypted for routing over TCP/IP.

Delivery: Applies data with transaction integrity, transforming the data as required.

TCP /IP Target DB Log LAN / WAN / DB Client Trail Files Pump Delivery Capture Internet Library

DB Client DB Log Delivery Bi-directional Library Pump Capture

Manager Process: Start ± Stop, Process Monitoring & Restart, Trail File Housekeeping GoldenGate: Heterogeneity

Databases O/S and Platforms Oracle GoldenGate Capture and Delivery: Linux ƒ Oracle Solaris ƒ DB2 Windows 2000, 2003, XP ƒ Microsoft SQL Server ƒ Sybase ASE HP NonStop ƒ Teradata HP-UX ƒ Enscribe IBM AIX ƒ SQL/MP ƒ SQL/MX IBM zSeries ƒ MySQL IBM iSeries (AS/400) ƒ JMS message queues zLinux

Oracle GoldenGate Delivery: ƒ All listed above, plus: ƒTimesTen, IBM System i ƒ Netezza, Greenplum, HP Neoview ƒ ETL products

GoldenGate: Performance / Reliability

Performance Reliability ‡ Log-based change data capture ‡ Decoupled architecture

± High volume ± Individual processes can be restarted automatically ± Low overhead ± Tolerance to network outages ‡ Decoupled architecture (configurable) ± Multiple capture and delivery processes may be used to scale, but ‡ Recovery generally not required ± Recovery ensures that no operations are skipped or duplicated after failure ± 3RVVLEOHWRVSOLW³KRW´WDEOHVLQWRD of any kind separate capture and delivery process ‡ Filtering and compression ‡ Checkpointing ‡ Transaction Grouping for Delivery ± In Capture and Delivery ‡ Record Batching ‡ Transaction Integrity

GoldenGate & Active Data Guard Active Data Guard and GoldenGate Comparison

Active Data Guard Golden Gate ‡ Complete one-way physical replication ‡ Logical replication: Active-Active HA, one to many, ‡ Standby database is an exact copy open read- many to one, subset replication, transformations, etc only ‡ Target is a different database with the same data, it ‡ Backups are interchangeable may have a different physical structure or indexing ‡ Integrated automatic database failover scheme ‡ Dual-purpose standby as test system open ‡ Extremely flexible, source and target are open read- read-write write ‡ Simplest to use, supports all applications and ‡ Trade-off flexibility for more deployment workloads considerations: performance, management, some data ‡ Transparent to operate ± no data type type restrictions, application support restrictions ‡ Unique protection from silent corruption caused ‡ Standard corruption protection integrated with Oracle by lost-writes and automatic repair of corrupt Database data blocks ‡ Synchronous or asynchronous redo ‡ Logical asynchronous replication transmission ‡ Standby-first patching ‡ Flexible options for rolling maintenance ‡ Database rolling maintenance and upgrades ‡ Zero downtime migration and application upgrades ‡ Limited cross-platform support ‡ Extensive cross-platform support Oracle Active Data Guard Real-Time Data Protection and Availability

Exact copy of primary Disaster Recovery - Manual or Automatic Failover

Exact copy of Query & Report Offloading primary Open Read-Only

Snapshot standby Convert to Test Database Open Read-Write Database Data Guard Single Command Refresh Redo Transport New DB Standby First Patching, SYNC or ASYNC Version Database Rolling Upgrade Oracle Database Exact Offload RMAN Backups, copy of primary Source for Data Extract Note: A DR copy may be multi-purposed for any combination Data Integrator of use cases Data described Warehouse Extract Offload Oracle GoldenGate Low-Impact, Real-Time Data Integration & Transactional Replication

New DB/ HW/OS/APP Zero Downtime Upgrade & Migration

Fully Active Active-Active for High Distributed Availability Legacy Systems DB

Exact Copy of Primary Disaster Recovery for Non- Oracle DB

Log-based Reporting Database Query & Report Offloading Changed Data

Data Oracle & Non-Oracle Warehouse Real-time BI, Operational Database(s) Reporting, MDM

Data Integrator ODS Global Data Data Synchronization within the Centers Enterprise Message Bus Message Bus Event Driven Architecture, SOA GoldenGate and Active Data Guard In Concert

Heterogeneous Distributed Oracle Fully-active Oracle Active Data GoldenGate Subset Guard Replicas Oracle DB Primary Oracle DB Standby

‡ For Information Distribution & Consolidation, Application Upgrades & Changes ± Use GoldenGate - heterogeneous, active-active, transformations, subsetting ‡ For Disaster Recovery / Data Protection / HA ± Simple Full Oracle Database Protection ‡ Use Active Data Guard ± Application desiring flexible HA, active-active, schema changes, platform changes ‡ Use GoldenGate ‡ Combine the two for full database protection and information distribution Oracle Data Integrator Oracle Data Integrator Improved Performance, Increased Productivity and Lower TCO

Legacy Sources

E-LT Transformation Any Data vs. E-T-L Warehouse Application Sources Declarative Set-based design

Change Data Capture for Any Dynamic Updates Planning System OLTP DB Hot-pluggable Architecture Sources

Pluggable Knowledge Modules Optimized Data Loading through E-LT The key to improved performance and reduced costs

Conventional ETL Architecture ‡ E-LT provides flexible Extract Transform Load architecture for optimized performance ‡ Benefits: ‡ Leverage Set-based transformations Next Generation Architecture ‡ Improved performance for ³E-LT´ loading, no network hop ‡ Takes advantage of Extract Load existing hardware

Transform Transform Declarative Design Improved Developer Productivity and Lower Maintenance Costs

Specify ETL Data Flow Graph Conventional ETL Design ‡ Developer must define every step of Complex ETL Flow Logic ‡ Traditional approach requires specialized ETL skills ‡ And significant development and maintenance efforts

Declarative Set-based Design ‡ Simplifies the number of steps ODI Declarative Design ‡ Automatically generates the Data Flow whatever the sources and target DB 1 2 Define Automatically Generate Benefits What Dataflow 9 Significantly reduce the learning curve You Want 9 Shorter implementation times Define How: Built-in Templates 9 Streamline access to non-IT pros ODI-EE Knowledge Modules Pluggable Architecture for Improved Flexibility

Pluggable Knowledge Modules Architecture

Reverse Load from Journalize Check Integrate, Data Engineer Source to (CDC) Constrains Transform Service Metadata Staging

Sample out-of-the-box Knowledge Modules

SQL Server Oracle Check MS TPump/ Oracle Web SAP/R3 Log Miner JMS Queues Oracle Merge Triggers DBLink Excel Multiload Services

Oracle Check Siebel EIM DB2 Web Siebel DB2 Journals DB2 Exp/Imp Type II SCD SQL*Loader Sybase Schema Services

‡ Key Architecture Benefits:

‡ Ease of maintenance, ‡ Flexibility, tailored to existing best practices, ‡ Reduces cost of ownership Oracle Data Integrator and GoldenGate Complementary and Used Together Data Integrator and GoldenGate

Oracle Data Integrator Enterprise Edition

E-LT Oracle GoldenGate Transformation Real-time Data Heterogeneous Sources Heterogeneous Targets

Bulk Data Movement Real-Time Data Integration and Transformation and Replication

‡ Fastest E-LT Solution ‡ Fastest real-time solution ‡ Sub-second latency for real-time feeds ‡ Optimized SET-based transformation for high volume transformations ‡ Guarantee delivery eliminates data loss ‡ Data lineage for improved manageability ‡ Eliminates down-time for migration and upgrades ‡ Integrates to ‡ Least intrusive to source systems

‡ Leverage ELT/ETL for complex transformation

Oracle Data Integrator Oracle GoldenGate

2UDFOH¶V'DWD,QWHJUDWLRQ-RLQW6ROXWLRQ Best-of-Breed and Proven

Oracle GoldenGate Oracle Data Integrator

Technology Differentiators:

‡ Lowest latency and ‡ E-LT architecture for best performance Performance highest throughput; non-invasive, low of high data volume overhead transformations

‡ De-coupled ‡ Knowledge Module Extensible & architecture; multiple architecture for Flexible deployment styles; extensibility and open and extensible flexible connectivity

‡ Maintain ‡ SOA-native, transactional integrated with Enterprise integrity; resilient Fusion MW to fit against interruptions future enterprise and failures architectures Oracle for Real-Time Business Intelligence Best Practices Architecture

OBI EE Suite Plus

Real-time Analytics Historic Analytics

Batch Feed

Oracle DIM DIM Log-based, Real-time Data Feeds Data Integrator EMP DEPT trans3 trans2 trans1 FACT

27 EMP DEPT DIM DIM Source OLTP System Oracle GoldenGate ODS Schema DW Schema

Oracle Exadata Real-Time CDC Integration for Best-in-class, integrated solution for Real-Time Data Warehouse

Traditional ETL + CDC ODI + Oracle GoldenGate

‡ Invasive Capture on OLTP systems ‡ Continuous feeds from operational using complex Adapters systems ‡ Batch window dependency ‡ Maximum availability for sources and DW ‡ Transformations in ETL engine on ‡ Non-invasive data capture expensive middle tier servers ‡ Thin middle tier with transformations on ‡ Bulk load to the data warehouse with the database platform (target) large nightly/daily batch ‡ Mini-batches throughout the day or bulk Extract processing nightly GG+ ODI GG+ ODI Trickle Xform Xform Bulk

Lookup Lookup Data Data Staging Load Heterogeneous