Oracle TimesTen In-Memory Database Architecture, Performance Tips, Use Cases

Chris Jenkins ([email protected]) Senior Director, In-Memory Technology TimesTen Product Management

Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Best In-Memory Databases: For Both OLTP and Analytics

In-Memory for OLTP: Oracle TimesTen In-Memory Database
• Lightweight, highly-available IMDB
• Primary use case: Extreme OLTP
• Microsecond response time
• Millions of TPS on commodity hardware

In-Memory for Analytics: In-Memory Option
• Dual-Format In-Memory Database
• Primary use case: Real Time Analytics
• Billions of Rows/Sec scan rate
• Faster mixed-workload enterprise OLTP
  – Fewer indexes needed to support analytics

Agenda

1 What is TimesTen?

2 TimesTen Classic Architecture

3 TimesTen Scaleout Architecture

4 Performance Tips & Tricks

5 When to use TimesTen?

6 Summary

1 What is TimesTen?

Oracle TimesTen In-Memory Database
One product, two deployment modes

TimesTen Classic
• Replicated In-Memory Relational Database
• Highly Available
• Extremely low latency reads and writes
• Read scaling across multiple hosts
• Cache Groups (cache for Oracle DB EE)
[Diagram: applications read and write on the Active database, and read from the Standby and from Subscribers.]

TimesTen Scaleout
• Scale-Out In-Memory Relational Database
• Highly Available
• Extremely high throughput reads and writes
• Scales both reads and writes
[Diagram: applications connect to a Single System Image In-Memory Database.]

2 TimesTen Classic Architecture

TimesTen Classic

In-memory RDBMS
– Pure in-memory
– Entire database in RAM
– ACID compliant
– Standard SQL and APIs

Persistent and Recoverable
– Database persisted on local storage
– Automatic recovery after failure

Extremely Fast
– Low, consistent response time
– Very high throughput
– Excellent scalability

Highly Available
– Active-standby and multi-master replication
– Highly parallel, high throughput
– Async and Sync
– HA and DR

Architecture: Classic Instance

• Installation
  – An unzipped copy of the TimesTen software package
  – Immutable
• Instance
  – Created using ttInstanceCreate
  – A runnable copy of the TimesTen software
  – Linked to an installation
  – Includes configuration files
  – Identified by TIMESTEN_HOME
  – Set of processes
  – Supports one or more databases

[Diagram: client-server applications and tools connect through the TT Client and the network to server proxies; direct mode applications and tools link the TimesTen Data Manager Library. The instance comprises the TimesTen daemon, subdaemon(s), replication agent(s), cache agent(s) and admin/utility programs around the in-memory database(s). Each database holds Data (tables, indexes, views, sequences, system tables, …), Temp (locks, cursors, command cache, temp indexes, …) and the log buffer, backed by checkpoint files and log files. Replication links to a replica DB and the cache agent to an Oracle RDBMS. Access times are labelled "millionths of a second".]

Application Connectivity
Two modes, functionally equivalent, supported for all APIs (ODBC, OCI, JDBC, ODP.NET, …)

Direct mode
• Apps run on same host as database
• Apps directly map database shared memory (via TT engine)
• No context switches, no IPC for database access
• Ultra low latency (in process direct memory access)

Client/server mode
• TCP/IP connections between apps and TT server processes
• TT server process is a multi-threaded direct mode app
• Each interaction involves 1 or more network round trips
• More overhead, higher latency than direct mode

You can mix and match these modes as desired based on your requirements.

[Diagram: on Host 1, App 1 … App N and TT Server 1 … TT Server X attach directly to the database shared memory; on Host 2, App 1 … App N connect to the TT servers over the network.]

Memory Layout
Database is a single shared memory segment (separate, much smaller segment for PL/SQL)

Database Shared Memory Segment (not to scale)
• DbHdr (~20 MB)
• Perm Region (PermSize), persistent
  – Metadata: system catalog tables, system views, sequences, system data
  – Tables: logical tuple pages, physical tuple pages, page directories
  – Indexes: hash buckets, B-Tree nodes
• Temp Region (TempSize)
  – Temp objects: tables, indexes
  – Sort space
  – Locks
  – Connections
  – Commit buffers
  – …
• Log Buffer (LogBufMB)
  – Strand 1, Strand 2, … Strand N

TimesTen is a row store
• At least for now…

Use of huge pages
• Recommended unless database is very small
• Mandatory if segment size >= 256 GB

If not using huge pages
• Lock segment in memory
• MemoryLock=4

Consider NUMA effects…
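The sizing arithmetic implied by this layout can be sketched roughly as follows. This is an illustration only, not a TimesTen utility: the function names are hypothetical, the 20 MB header is the approximate figure from the slide, and the 2 MB huge page size is an assumption that should be checked against the target platform.

```python
import math

HEADER_MB = 20                        # approximate header size shown on the slide
HUGE_PAGE_MB = 2                      # assumption: 2 MB huge pages; check your platform
MANDATORY_THRESHOLD_MB = 256 * 1024   # slide: huge pages mandatory at >= 256 GB


def segment_size_mb(perm_size_mb, temp_size_mb, log_buf_mb):
    """Estimate the main shared memory segment size in MB."""
    return HEADER_MB + perm_size_mb + temp_size_mb + log_buf_mb


def huge_pages_needed(segment_mb, page_mb=HUGE_PAGE_MB):
    """Number of huge pages needed to back the whole segment."""
    return math.ceil(segment_mb / page_mb)


def huge_pages_mandatory(segment_mb):
    """True when the segment reaches the 256 GB threshold from the slide."""
    return segment_mb >= MANDATORY_THRESHOLD_MB


size = segment_size_mb(perm_size_mb=32768, temp_size_mb=4096, log_buf_mb=1024)
print(size, huge_pages_needed(size), huge_pages_mandatory(size))
```

The real segment may carry additional overhead, so an estimate like this is best treated as a lower bound when reserving huge pages.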

Persistence - Checkpointing
Dual checkpoint files, automatic fuzzy checkpoints

Two checkpoint files, dbname.ds0 and dbname.ds1
• Each is a full image of DbHdr + Perm
• Written to alternately by checkpoint operations

Perm region is divided into pages
• Variable size based on contents
• Two bits to track ‘dirty’ state with respect to the checkpoint files

Last checkpoint was to .ds1 at some earlier time T0

At time T1
• Page 0 is dirty wrt .ds0 and .ds1
• Page 2 is dirty wrt .ds0

Checkpoint occurs at time T2
• .ds0 is oldest file (previous checkpoint was to .ds1)
• DbHdr + pages 0 & 2 are written to .ds0 (in place update)
• Pages 0 & 2 dirty bits for .ds0 are cleared

At time T3
• Page 1 is modified (e.g. by a DML operation)
• Both dirty bits are set in the page
• .ds0 dirty: Pg1
• .ds1 dirty: Pg0, Pg1

Checkpoint occurs at time T4
• .ds1 is oldest file
• DbHdr + pages 0 and 1 are written to .ds1
• Dirty bit for .ds1 cleared in both pages

[Diagram: Perm Region pages Pg0 … PgN, each with two dirty bits, alongside DbHdr and the .ds0 and .ds1 files.]
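The T0 to T4 timeline above can be replayed with a small toy model. This is a sketch for intuition only: the class and method names are hypothetical, pages are plain integers, and nothing is written to disk. It tracks, for each checkpoint file, which pages are dirty with respect to it.

```python
class DualCheckpointModel:
    """Toy model: one dirty bit per page for each of the two checkpoint files."""

    FILES = ("ds0", "ds1")

    def __init__(self, last_checkpoint="ds1"):
        self.dirty = {name: set() for name in self.FILES}
        self.last_checkpoint = last_checkpoint

    def modify(self, page):
        """A change to a page makes it dirty with respect to both files."""
        for name in self.FILES:
            self.dirty[name].add(page)

    def checkpoint(self):
        """Write dirty pages to the older file, then clear that file's bits."""
        target = "ds0" if self.last_checkpoint == "ds1" else "ds1"
        written = sorted(self.dirty[target])
        self.dirty[target].clear()
        self.last_checkpoint = target
        return target, written


model = DualCheckpointModel(last_checkpoint="ds0")
model.modify(2)
t0 = model.checkpoint()   # T0: checkpoint goes to .ds1
model.modify(0)           # T1: page 0 dirty for both files, page 2 for .ds0 only
t2 = model.checkpoint()   # T2: pages 0 and 2 go to .ds0
model.modify(1)           # T3: page 1 dirty for both files
t4 = model.checkpoint()   # T4: pages 0 and 1 go to .ds1
print(t0, t2, t4)
```

After T4 the model still shows page 1 as dirty with respect to .ds0, which is why the next checkpoint to .ds0 would pick it up.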

Persistence – Transaction Logging
Parallel log manager

High throughput and concurrency
• Multi-strand buffer (multiple ring buffers)
• Behaves as a single shared logical buffer

Log files
• Configurable maximum size
• Sequence number (64-bit)
• Old files purged by checkpointing

Highly concurrent
• Concurrent record post to each strand AND
• Flush buffer to disk

Asynchronous and synchronous operation
• Asynchronous (DurableCommits=0) is the default
• Synchronous (DurableCommits=1) is an option

Async
• Many records written at once
• Outside transaction path
• Decoupled from commit

Sync
• Fewer records written at once
• Part of commit path
• Write through to media
• Commit blocks until I/O completion
• Group commit optimisation

Async/sync configurable at
• Database level
• Connection level
• Transaction level (application API)

Log usage
• Undo and Redo
• Replication, XLA, AWT caching & incremental backups

Log file purge criteria
• No records in file belong to an open transaction
• Changes for all records written to both checkpoint files
• Not required by replication, XLA, AWT or backups

[Diagram: the in-memory database posts records to Log Buffer strands 1 … N, which are flushed to the current log file.]
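The three purge criteria combine as a simple conjunction, which the following sketch spells out. The `LogFile` fields and function names are hypothetical, and the rule that purging stops at the first file that must be kept is an assumption made for illustration, not a statement about TimesTen internals.

```python
from dataclasses import dataclass
from itertools import takewhile


@dataclass
class LogFile:
    sequence: int
    has_open_txn_records: bool   # some record belongs to an open transaction
    in_both_checkpoints: bool    # all changes written to both checkpoint files
    needed_by_consumers: bool    # replication, XLA, AWT or backups still need it


def can_purge(log_file):
    """A file is purgeable only when all three criteria hold."""
    return (not log_file.has_open_txn_records
            and log_file.in_both_checkpoints
            and not log_file.needed_by_consumers)


def purgeable_prefix(log_files):
    """Sequence numbers of the oldest files that can go (assumed: oldest first)."""
    ordered = sorted(log_files, key=lambda f: f.sequence)
    return [f.sequence for f in takewhile(can_purge, ordered)]


files = [
    LogFile(1, False, True, False),
    LogFile(2, False, True, True),    # still needed, e.g. by replication
    LogFile(3, False, True, False),
]
print(purgeable_prefix(files))
```

In this example only file 1 is released: file 2 is still needed by a consumer, and under the stated assumption it also holds back file 3.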

Persistence – Recovery
Occurs on every ramLoad operation

Most recent checkpoint loaded into memory
• Physical operation
• Can use multiple reader threads (configurable)

Log replay
• From the corresponding checkpoint log record…
• …to end of log
• Mark any indexes that would be modified

Rollback any open transactions
• No commit marker seen

Drop and re-build marked indexes
• Index modifications are not redo logged
• Done in parallel (configurable)

Static checkpoint to oldest checkpoint file

Allow application connections

[Diagram: checkpoint files and log files feed the database shared segment; apps connect after open transactions are rolled back and marked indexes are dropped and rebuilt.]

Concurrency
Single level versioning for READ COMMITTED isolation

READ COMMITTED isolation (default)
• Readers don’t block writers and vice versa
• Readers don’t place any locks

INSERT/UPDATE/DELETE
• Creates a new (uncommitted) version of affected rows
• eXclusively locks the row(s)

Reader
• Reads old (committed) version(s)

Old version(s) deleted on COMMIT or ROLLBACK
• In reclaim phase
• New version(s) becomes the only version

[Diagram: table MYTABLE (pKey, columnX) with rows 0 to 6, "This is row 0" … "This is row 6". An UPDATE creates "New row 4" alongside the committed "This is row 4", which a concurrent reader still sees.]
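The READ COMMITTED behaviour described above can be illustrated with a toy row that keeps at most one uncommitted version next to the committed one. The class is hypothetical and deliberately simplified: a conflicting writer raises an error here, whereas a real engine would normally make it wait for the lock, and rollback is not modelled.

```python
class VersionedRow:
    """Toy single-level versioning: one committed and one uncommitted version."""

    def __init__(self, value):
        self.committed = value
        self.uncommitted = None
        self.locked_by = None

    def read(self):
        """Readers take no locks and always see the committed version."""
        return self.committed

    def update(self, txn, value):
        """Writers take an exclusive row lock and create a new version."""
        if self.locked_by not in (None, txn):
            raise RuntimeError("row is exclusively locked by another transaction")
        self.locked_by = txn
        self.uncommitted = value

    def commit(self, txn):
        """On commit the new version becomes the only version."""
        if self.locked_by != txn:
            raise RuntimeError("transaction does not hold the row lock")
        self.committed = self.uncommitted
        self.uncommitted = None
        self.locked_by = None


row = VersionedRow("This is row 4")
row.update("txn1", "New row 4")
before = row.read()    # reader is not blocked and sees the old version
row.commit("txn1")
after = row.read()     # reader now sees the new version
print(before, "->", after)
```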

SERIALIZABLE isolation
• Readers place Shared row locks
• Writers place eXclusive row locks
• Optimiser may choose table locks instead
  – If many rows would be locked
• Can use a hint to prevent lock escalation

Indexing
Hash Indexes, Lockless B-Trees

Hash Indexes
• Best performance for
  - Full key equality lookups
  - Full key equijoins
• Can’t be used for
  - Range lookups
  - Key prefix lookups
• Must be sized accurately
  - Too small: poor performance and concurrency
  - Too large: wastes memory

CREATE [UNIQUE] HASH INDEX MyHashIndex ON MyTable (somecol) PAGES = CURRENT;

Memory optimized B-Tree Indexes
• Default index type
• Good all-round performance
• Self balancing
• Lockless design for high concurrency
  – No locks or latches for reads
  – Fine grained latching for writes

CREATE [UNIQUE] INDEX MyRangeIndex ON MyTable (somecol);
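Why a hash index cannot serve range lookups while an ordered index can is easy to see in miniature. The sketch below uses a Python dict as a stand-in for a hash index and a sorted key list with binary search as a stand-in for the ordered leaf level of a B-Tree; it illustrates the access patterns only and says nothing about how TimesTen implements either structure.

```python
import bisect

rows = {17: "q", 3: "c", 42: "z", 8: "h", 23: "w"}

# Hash-style index: fast full key equality lookup, but no key ordering.
hash_index = dict(rows)

# Range-style index: keys kept in sorted order so ranges are contiguous.
sorted_keys = sorted(rows)


def range_lookup(low, high):
    """Return keys k with low <= k <= high using binary search."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return sorted_keys[start:end]


print(hash_index[42])        # equality lookup on the full key
print(range_lookup(5, 25))   # range lookup needs the ordered structure
```

Answering the same range query from the hash-style structure would require examining every key, which is the practical meaning of "can't be used for range lookups".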

3 TimesTen Scaleout Architecture

Distributed, Shared Nothing, In-Memory Database
Single-Image Database with High Availability and Elasticity

• Appears to applications as a single database
  - Not as a sharded database
• Online scale-out and scale-in
  - Data automatically redistributed
  - Workload automatically uses new elements
• Built-in HA via multiple fully-active copies
  – Copies automatically kept in sync
  – Automatic client failover
• Parallel SQL execution
• Same features as TimesTen Classic
  - Mostly…

[Diagram: elements A, B, C, D and their copies A’, B’, C’, D’.]

Scaleout Table Data Distribution Options

• DISTRIBUTE large tables by consistent hash
  Distribute CUSTOMER rows across all elements by hash of Customer ID
• COLOCATE child table rows with parent table row to maximize locality
  Place ORDERS rows in same element along with corresponding CUSTOMER row
• DUPLICATE small read-mostly tables on all elements for maximum locality
  Duplicate the PRODUCT list on all elements

[Diagram: Element 1 … Element N, each holding a share of CUST (distribute), the matching ORDERS (colocate) and a full copy of PRODUCTS (duplicate).]
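A minimal sketch of the three placement options follows. It is illustrative only: the function names and row shapes are hypothetical, and a plain hash-modulo placement stands in for the consistent hashing the slide mentions, so it does not reproduce how TimesTen Scaleout actually assigns rows or redistributes them when elements are added.

```python
import hashlib


def element_for(key, num_elements):
    """Map a distribution key to an element index with a stable hash."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_elements


def place_rows(customers, orders, products, num_elements):
    """Place rows on elements using the three distribution options."""
    elements = [
        {"CUSTOMER": [], "ORDERS": [], "PRODUCTS": list(products)}  # DUPLICATE
        for _ in range(num_elements)
    ]
    for cust in customers:    # DISTRIBUTE by hash of the customer ID
        elements[element_for(cust["cust_id"], num_elements)]["CUSTOMER"].append(cust)
    for order in orders:      # COLOCATE with the parent customer row
        elements[element_for(order["cust_id"], num_elements)]["ORDERS"].append(order)
    return elements


customers = [{"cust_id": i} for i in range(1, 9)]
orders = [{"order_id": 100 + i, "cust_id": 1 + i % 8} for i in range(20)]
products = ["widget", "gadget"]
grid = place_rows(customers, orders, products, num_elements=4)
for index, element in enumerate(grid):
    print(index, len(element["CUSTOMER"]), len(element["ORDERS"]), element["PRODUCTS"])
```

Because orders are placed by the parent's key rather than their own, a join between a customer and its orders never has to leave the element, which is the locality benefit the slide refers to.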

TimesTen Scaleout Architecture Overview

[Diagram: applications send SQL over the external network to the data instances on hosts RS1_DSG1 … RS4_DSG2 (one host per replica set and data space group). Data instances use 2PC over the internal network. Management instances MGMT1 and MGMT2 use SSH and SCP; the repository host REPO1 is accessed via MOUNT / SCP; ZooKeeper provides membership. Legend: management instances, data instances, membership, repository storage, hosts.]

Architecture: Scaleout Instances

• Management Instance
  – Stores grid model
  – Processes management functions
  – Very similar to a ‘classic’ instance
  – 1 or 2 per Scaleout grid
• Data Instance
  – Hosts database elements
  – Processes SQL and Txns
  – As many as you want

[Diagram: a data instance. Client-server applications and tools connect through the TT Client and the network to server proxies; direct mode applications and tools link the TimesTen Data Manager Library. The instance comprises the TimesTen daemon, subdaemon(s) and grid workers around the database element(s), and communicates with management instances and other data instance(s). Each element holds Data (tables, indexes, views, sequences, system tables, …), Temp (locks, cursors, command cache, temp indexes, …) and the log buffer, backed by checkpoint files, log files and epoch files. Access times are labelled "millionths of a second".]

High Availability
K-safety, Synchronous, All Replicas Active

• Hosts assigned to Data Space Groups
  – DSG = rack, Cloud AD etc.
• Replica sets created automatically
  – One replica per DSG
• All replicas are active for reads and writes
  – All replicas are ‘equal’
• Always consistent
  – 2 phase commit (synchronous)
  – Reduced 2PC durable writes when K > 1
• Queries and transactions can span all replica sets
  – Grid aware optimizer
  – Grid aware SQL engine
  – Sophisticated transaction manager

[Diagram: Data Space Group 1 and Data Space Group 2, with Replica Sets 1 to 4 each holding one element and its copy (A/A’, B/B’, C/C’, D/D’) across the two groups.]

Elastic Scalability
Expand and shrink the database online, based on business needs

• Adding replica sets
  - Prepare
    - Deploy new hosts/installations/instances
    - Typically one command per host
  - Redistribute data when ready
    - One command
  - Workload uses the new elements
    - Connections start to use new elements
• Removing replica sets
  - Redistribute data
  - Tear down old hosts

[Diagram: Data Space Group 1 and Data Space Group 2 with Replica Sets 1 to 5 (A/A’, B/B’, C/C’, D/D’, E/E’).]

4 Performance Tips & Tricks

Top Performance Tips for TimesTen
General

• Optimise data types
  – Use native integer types where appropriate (TT_TINYINT, TT_SMALLINT, …)
  – Inline versus out of line for variable length columns (VARCHAR2, NVARCHAR2, VARBINARY)
• Indexes and optimiser statistics
  – Hash versus range indexes – use the index advisor!
  – Keep optimizer stats up to date
• Optimise OS configuration
  – Shared memory, semaphores
  – Huge pages, locked memory
• Optimise database configuration
  – Configure database parameters depending on workload, hardware etc.
  – If using HDD storage, separate checkpoint files and transaction logs to avoid I/O contention
  – Use huge pages (or lock database in memory if you can’t use huge pages)

Top Performance Tips for TimesTen
Scaleout

• Use the fastest, lowest latency network you can get
  – 10 GbE is absolute minimum
  – Faster is better!
• Choose hardware wisely
  – Fewer, larger hosts are better than many small hosts
    • Fewer network hops
    • Leverages TimesTen’s excellent vertical scalability
• Optimise table distributions
  – Distribution type, distribution keys (hash)
• Global versus local indexes
  – Faster access to data
  – Slower DML
  – Tradeoffs!

Top Performance Tips for TimesTen
Applications

• Use parameterized SQL
  – Facilitates use of (shared) prepared statements
• Prepare once, execute many times
  – Hard parse >> soft parse >>>> no parse
• Bind once (ODBC and OCI)
• Minimise type conversions and character set conversions
• Leverage batch operations, especially for INSERT
  – Multiple of 256 rows => fast path insert
• Prefer direct mode connectivity where you can
• When using client/server connectivity
  – Use PL/SQL to reduce network round trips
  – Use OCI ‘commit on success’ option where appropriate
• Keep transactions small/short
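Several of these tips (parameterized SQL, one statement text executed many times, batched inserts in multiples of 256 rows, short transactions) fit into one short pattern. The sketch below uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; it is not a TimesTen connection, and the table and helper names are hypothetical. With TimesTen the same shape would be expressed through an ODBC, JDBC or OCI driver.

```python
import sqlite3

BATCH_SIZE = 256  # slide: a multiple of 256 rows enables the fast path insert


def batches(rows, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


conn = sqlite3.connect(":memory:")   # stand-in only; not a TimesTen connection
conn.execute("CREATE TABLE mytable (pkey INTEGER PRIMARY KEY, columnx TEXT)")

# Parameterized statement: the SQL text never changes, only the bound values.
insert_sql = "INSERT INTO mytable (pkey, columnx) VALUES (?, ?)"
rows = [(i, f"This is row {i}") for i in range(1024)]

for batch in batches(rows):
    conn.executemany(insert_sql, batch)   # one statement, many executions
    conn.commit()                         # keep each transaction small

count = conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]
print(count)
```

Building the SQL string with literal values for each row instead would defeat statement reuse and force a fresh parse per row, which is the cost the "hard parse >> soft parse >>>> no parse" line warns about.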

5 When to use TimesTen?

When to use TimesTen? Classic or Scaleout?
Primarily for OLTP, Scaleout useful for Analytics too

TimesTen Classic
• Very low, consistent latency
  – Microsecond response times
• High throughput
  – Millions of TPS on a commodity server
  – 10s of millions of queries per second
• Single server or replicated
  – Optional read-only subscribers
  – Vertical scalability for writes
  – Vertical and horizontal for reads
• Highly available (99.999% possible)
• Cache functionality for Oracle DB EE
  – Read-only and read-write caching

TimesTen Scaleout
• Good latency
  – Low millisecond response times
• Very high throughput
  – 100s of millions of TPS
  – Billions of queries per second
• Vertical and horizontal scalability
  – For both reads and writes
• Easy HA (99.999% possible)
• Elastic scalability
  – Add or remove database elements online

6 Summary

• TimesTen is a sophisticated relational In-Memory Database
• Two deployment modes: Classic and Scaleout
• Focus is primarily on OLTP type workloads
  – Ultra low latency: Classic
  – Massive throughput: Scaleout
• Standard SQL, PL/SQL and APIs
• Fully ACID
• Highly available

More Info…

• TimesTen OTN Portal (http://www.oracle.com/technetwork/database/database-technologies/timesten/overview/index.html)
  – Product Information
    • Presentations, use cases, whitepapers, FAQs, …
  – Software Downloads
  – Product Documentation
  – Scaleout Demo / Learning VM download
• TimesTen GitHub Quickstart and Samples (https://github.com/oracle/oracle-timesten-samples)
• Contact me! ([email protected])
