EXASOL Dual Data Center

Using a dual data center approach for Disaster Recovery

Agenda

ARZ – Who are we?

Why Exasol with Dual Data Center?

How does it work?
• Prerequisites
• Configuration
• Switchover
• Tooling

Advantages

ARZ Allgemeines Rechenzentrum GmbH

ARZ – Who are we?

Financial solution provider in Tyrol

• Computing centre for banks in Austria
• Owned by its customers
• Mainly Volksbanken and Hypo banks, but also private banks and health institutions
• ~500 employees
• Complete service
  o IT acquisition
  o Workstation management
  o Development and enhancement of the core banking system
  o SaaS
  o ...

Why Exasol with Dual Data Center?

Origins of the Dual Data Center Approach

A History of Data in ARZ

Data Warehousing at ARZ - 2000
• Used for Basel I regulatory reporting
• Core banking system does not keep a history of data
• Monthly snapshots
• Only platform we had: DB2 for z/OS on an IBM mainframe
• Development started in COBOL
• High costs for this platform, as it is primarily used for mission-critical OLTP systems

Start of DB2 for Unix as Reporting Database - 2003
• Creation of Data Marts for Reporting
• Setup of Business Objects (SAP BI) as Reporting Tool

Setup of DataStage as ETL Tool (now IBM Information Server) - 2011

Start of a redesign project for our data platform - 2015

Evaluation Project for new Database Platform

• Started Feb 2015 with 7 vendors
• Ended July 2015

• PoC with Netezza, Exasol and SAP HANA

• Main focus was
  o Ease of development
  o Ease of administration
  o Performance
  o Dual Data Center solution

Why Exasol with Dual Data Center?

• ARZ owns 2 data centers
  o 16 km line length
  o 2x 20 Gb dedicated direct connections between the data centers
  o Network latency 120 µs

• Every system connected to our regulatory reporting process has to be clustered across both data centers
  o Reports to the national bank may be delayed by a few hours at most
  o High penalties for failure to report

• Blackout of one data center once a year as a disaster test
  o Varying scenarios every year
  o This year: simulation of a cooling system failure

What do we do with Exasol?

Planning vs. reality

Processing Chain

Unload Sources
• Unload from systems that we don't have direct access to

Load Files / Direct Access
• Load unloaded files
• Load files from external systems (non-ARZ systems)
• Load data from other systems from within Exasol (using Exasol Connections)

RAW Layer
• Only data type checks (through Exasol import)

CDWH Layer
• Data cleansing
• Reshape into new data model

Compute Cores
• Enrichment
• Analytics
• Master Data Management

Planned Data Processing vs Reality

• After the evaluation project we planned for
  o Complete redesign of our data model
    . 3NF data model enhanced by surrogate keys (SHA-1 over business key)
    . We had a look at Data Vault modeling but didn't go that way
  o 90 GB of RAW data per day
  o 90 days of daily time slices
  o Unlimited number of ultimo time slices
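The surrogate keys mentioned above (SHA-1 over the business key) could be derived roughly as follows. This is a minimal sketch, not ARZ's actual implementation: the separator, the trimming and upper-casing of key parts, and the function name `surrogate_key` are all assumptions made for illustration.

```python
import hashlib


def surrogate_key(*business_key_parts: str, sep: str = "|") -> str:
    """Return a 40-character SHA-1 hex digest over the business key parts.

    Assumption: parts are trimmed and upper-cased before hashing so that
    formatting differences in the source systems do not produce new keys.
    """
    normalised = [part.strip().upper() for part in business_key_parts]
    # The separator keeps ("ab", "c") and ("a", "bc") from colliding.
    joined = sep.join(normalised)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()
```

Because the key depends only on the business key, it can be computed independently in every load job without a central key lookup table.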

• Currently
  o Data model completely redesigned
  o 600 GB of RAW data per day, compressed to 100 GB
  o Time for load, data cleansing and reshape into new data model: 3 hours
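The retention planned after the evaluation (90 days of daily time slices, ultimo slices kept without limit) can be expressed as a simple rule. The sketch below assumes that "ultimo" means the last calendar day of a month; the function names are hypothetical and the real rule at ARZ may differ, for example by using the last business day.

```python
from datetime import date, timedelta

DAILY_RETENTION_DAYS = 90  # daily time slices kept, as planned


def is_ultimo(day: date) -> bool:
    """True if `day` is the last calendar day of its month (assumed meaning)."""
    return (day + timedelta(days=1)).month != day.month


def keep_slice(slice_date: date, today: date) -> bool:
    """Keep ultimo slices forever, all other slices for 90 days."""
    if is_ultimo(slice_date):
        return True
    return (today - slice_date).days < DAILY_RETENTION_DAYS
```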

How does it work?

Prerequisites

“Common” Exasol installation

Stretch Cluster Setup

• 9 active nodes per side
• 1 standby node per side
• License server is a VMware image
  o Can be switched to secondary DC
  o Uses mirrored storage

How does it work?

Switchover

Switchover
• Check if storage is in sync
  o Monitor syslog for segment recovery
  o Network problems prior to the switchover may have left the storage out of sync
• Move the license server VMware image to the secondary DC
• Stop database on primary site
• Start database on secondary site
• Stop nodes on primary site
  o Otherwise the master segments will be accessed remotely

Switchback
• Bring up primary nodes
• Wait for segment recovery
• Move the license server VMware image to the primary DC
• Stop database on secondary site
• Start database on primary site
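The two procedures above are strictly ordered, and a later step should not run if an earlier one fails. A minimal sketch of that idea follows. The step names mirror the bullets, but the `actions` mapping and `run_steps` are hypothetical: the real commands that stop a database or move the license server are not shown in this talk and are left to the caller.

```python
from typing import Callable, Dict, List

SWITCHOVER_STEPS: List[str] = [
    "check_storage_in_sync",
    "move_license_server_to_secondary",
    "stop_database_on_primary",
    "start_database_on_secondary",
    "stop_nodes_on_primary",
]

SWITCHBACK_STEPS: List[str] = [
    "start_nodes_on_primary",
    "wait_for_segment_recovery",
    "move_license_server_to_primary",
    "stop_database_on_secondary",
    "start_database_on_primary",
]


def run_steps(steps: List[str], actions: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run the steps in order and return the names of the completed steps.

    Each action returns True on success. The first failure raises
    RuntimeError, so no later step is ever attempted.
    """
    done: List[str] = []
    for name in steps:
        if not actions[name]():
            raise RuntimeError(f"step '{name}' failed after completing {done}")
        done.append(name)
    return done
```

Keeping the order in data rather than in control flow makes it easy to review the sequence against the written procedure.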

Switchover Problems

• Hard power down (explosion)
  o Storage volumes get locked on abnormal power failure
  o Exasol Support has to unlock the volumes
• When switching to the secondary DC, powering down the primary DC nodes is recommended
  o Severe performance degradation when the secondary database accesses master data in the primary DC
• Connect string for apps contains both the active and the passive site
  o Longer connect time compared to regular installations
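The last point can be illustrated with a client that walks a host list covering both sites until one node answers. This is only a sketch of the general pattern, not of what the Exasol drivers do internally. The host names in the usage note are made up, and port 8563 is assumed to be the database port.

```python
import socket
from typing import Callable, Optional, Sequence


def tcp_probe(host: str, port: int = 8563, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_reachable(hosts: Sequence[str],
                    probe: Callable[[str], bool] = tcp_probe) -> Optional[str]:
    """Return the first host in list order that the probe accepts, else None."""
    for host in hosts:
        if probe(host):
            return host
    return None
```

With a list such as `["dc1-n11", "dc1-n12", "dc2-n11", "dc2-n12"]`, every unreachable host listed before the live site can cost up to `timeout` seconds, which is one plausible explanation for the longer connect time.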

How does it work?

Tooling

• Custom-built Python script to manage the whole process and more
  o Used by the Operations department at ARZ
• Make it foolproof
  o Operations has to manage > 1000 systems, so there is no time for complex checks
  o Errors may lead to corrupted data, ongoing segment recovery, ...
• Automatic shutdown of nodes on takeover to the secondary site
• Automatic startup and wait for segment recovery on takeback from the secondary site
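The "wait for segment recovery" step is essentially a polling loop with a timeout. The sketch below shows that pattern in isolation. It is not the ARZ script: `wait_until` is a hypothetical helper, and the `condition` callable stands in for whatever check the real tool performs (for example, a syslog-based one).

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout_s: float,
               interval_s: float = 5.0,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `condition` until it returns True or `timeout_s` has elapsed.

    Returns True if the condition was met, False on timeout. The clock and
    sleep functions are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Returning a plain boolean keeps the calling script simple: on `False` it can stop and alert an operator instead of continuing with the next step.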

Advantages

Advantages compared to other vendors

• Cheap license
  o No full license for secondary site required
  o Only additional hardware in secondary DC
• Functionality built in
  o No extra components required
• Safe even during network outages
  o Quorum built in through license server
• No impact on query performance
• Low impact on load performance
  o Depends on your network connection (latency)
  o In our installation < 5 %
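The quorum point can be read as a generic majority vote with the license server acting as the tiebreaker between the two sites. The sketch below illustrates only that general idea. How Exasol actually implements it is not described in this talk, so the three-voter model and the name `may_stay_active` are assumptions.

```python
def may_stay_active(sees_peer_site: bool, sees_license_server: bool) -> bool:
    """Generic 2-of-3 majority check, seen from one site.

    Voters (assumed model): this site, the peer site, the license server.
    A site keeps running only if it can see at least one other voter.
    """
    votes = 1 + int(sees_peer_site) + int(sees_license_server)
    return votes >= 2
```

If the link between the data centers fails, only the site that still reaches the license server keeps a majority, so both sites cannot be active at the same time.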

Questions

DBeaver Open Source

• Open-source DB query tool
  o I built an Exasol plugin for DBeaver (shipped with DBeaver)
• Some enhancements compared to EXAplus
  o Column auto-complete
  o Direct table data editor
  o Multiple table import/export
  o Manage database sessions
  o Support for Virtual Schemas (Exasol V6)
  o Display table statistics in details (size, compressed size ...)
  o Security browser (roles, users)
  o "Explain" SQL feature
• Constantly enhanced
• I'm open to feature requests, bug reports ...
• Details: https://goo.gl/w2F3Wk

Contact

Karl Grießer
Data and Information Management
DB2 LUW, Oracle, Exasol, Imperva
[email protected]
+43 50400 91166