Oracle Sharding Technical Deep Dive

Y V Ravi Kumar Oracle ACE Director Oracle Certified Master (OCM)

New York Oracle User Group (NYOUG) 04th October 2018

Patterns 1 Infolob Solutions Inc. Infolob Solutions Inc.

About Infolob Infolob is a leading technology consulting organization that caters to companies looking for specialists who can help with their digital transformation agenda. We are headquartered in Irving, TX with offices in Harrisburg, PA., Atlanta, GA., Merida, Mexico and Hyderabad, India. Our Innovation labs are in Irving, TX and Merida, MX.

Infolob Solutions Inc. was founded in 2009 as a certified small business specializing in Oracle technologies. Based on rapid growth and niche competency, Infolob acquired Oracle Platinum Partner status in 2012. We have grown substantially and have established ourselves as a premier provider of Oracle services in the United States.

Patterns 2 Y V Ravi Kumar

 Oracle Certified Master (OCM)  Oracle ACE Director  Oracle ACE Spotlight for the month – June 2016  Oracle Magazine – July 2017

Oracle Speaker . Oracle Open World 2017 (OOW 2017) . Independent Oracle User Group (IOUG)

. New York Oracle User Group (NYOUG) ME ABOUT ORACLE CERTIFICATIONS . Sun Coast Oracle User Group (SOUG) Oracle Certified Master (OCM) . Oracle Technology Network (OTN) Oracle 10g & 11g: RAC Certified Expert . Sangam (Largest Oracle Event in India) Oracle 11g: Performance Tuning Certified Expert . All India Oracle User Group (AIOUG) Oracle Exadata 11g Essentials Oracle Golden Gate 10 Essentials Author of 100+ articles Oracle 11g: SQL Tuning Certified Expert . Oracle Technology Network (OTN) Oracle 9i & 10g: Oracle On Linux Certified Expert . Toad World - Connected-Driven Innovation OCP – Oracle 12c, 11g, 10g, 9i and 8i . All things ORACLE from Redgate SUN Certified – Solaris System Administrator in SUN Solaris 9 . UKOUG Library

Patterns 3 ORACLE SHARDING Oracle Sharding Technical Deep Dive

Patterns 4 What is Oracle Sharding ?

Oracle Sharding is a scalability and availability feature for custom-designed OLTP applications.

Sharding enables distribution and replication of data across a pool of Oracle (shards)

Oracle Sharding will not share no hardware or software

Adding shards The pool of databases is presented to the application as a single logical database. to the pool

Applications elastically scale (data, transactions and users) to any level, on any platform

Sharding currently scaling up to 1,000 shards is supported (12.2)

Patterns 5 with Sharding Architecture

EMPLOYEE

Transformation to Sharding Architecture

EMPLOYEE

EMPLOYEE Shard Database – PROD-A

EMPLOYEE

Traditional Oracle Database Sharded Catalog Database

Shard Database – PROD-B

Patterns 6 Oracle Sharding – Elastic Database Architecture

Sharding Architecture

Horizontal partitioning of data across up to 1000 independent Oracle Databases (shards)

Horizontal partitioning splits a database across shards so that each shard contains the table with the same columns but a different subset of rows.

Shared-nothing hardware architecture

•Each shard runs on commodity server •No shared storage components •No clusterware components

Data is partitioned using a sharding key (i.e. account_id in Account Table).

NoSQL databases made it easy to deploy Sharding, Oracle Database Native Sharding makes it easy for full-featured RDBMS.

Patterns 7 Oracle Sharding – One Giant Database to many small DBs

Table that is partitioned into smaller and more manageable pieces among multiple databases

Server 1 Server 2 Server 3 Server 4

CONTDBA B CONTDBC D CONTDBE F CONTDBG H Partitions PartitionsA B PartitionsA B PartitionsA B

Shard01: Table A Shard02: Table A Shard03: Table A Shard0N: Table A

Patterns 8 Oracle Sharding – Benefits

One mission critical database partitioned into many small databases (shards)

o Extreme scalability by adding shards (independent databases) o Rolling upgrades with independent availability of shards. Customers Americas o Global-Scale OLTP applications prefer to shard massive databases into a farm of smaller databases o Native SQL for sharding tables across up to 1000 Shards Customers Europe o Routing of SQL based on shard key, and cross shard queries Customers o Online addition and reorganization of shards.

Customers Asia Patterns 9 Oracle Sharding – Benefits

Linear Scalability

 Add shards online to increase database … size and throughput. …  Online split and rebalance. Extreme Availability

 Shared-nothing hardware architecture. …  Fault of one shard has no impact on … others. Distribution of Data globally

 User defined data placement for performance, availability, DR or to meet regulatory requirements.

Patterns 10 Oracle Sharding – Benefits

Auto deployment & Replication used for shard-level HA

Multiple sharding methods

Centralized schema maintenance (sharded and duplicated data)

A single global service accesses any shard

Direct routing and proxy routing

Elastic scaling with automatic rebalance

Patterns 11 Oracle Sharding – Advantages

High Availability (HA) features

Compression and Advanced Security

Backup and Recovery at Enterprise-Scale

Database Partitioning applicable

Online schema changes

Guarantees ACID properties and read consistency

Rolling upgrades with independent availability of shards

Patterns 12 Oracle Sharding – Application Profile

Custom OLTP Applications

 Large billing systems  Airline ticketing systems  Online financial services  Media companies  Online information services  Social media companies

Characteristics / Implementation

 Application must specify a sharding key for optimal performance  Sharding is not application transparent method

Patterns 13 Oracle Sharding Methods

System Managed Sharding User Defined Sharding Composite Sharding

Based on partitioning by Based on partitioning by RANGE or Provides two-levels of data CONSISTENT HASH LIST organization

Range of hash values assigned to List or Range of sharding key Partitions are distributed among each chunk values assigned to each chunk by buckets by LIST or RANGE (specified the user by the user)

Data is uniformly sharded / re- Need to manually maintain Within a bucket, a range of hash values sharded automatically balanced data distribution is automatically assigned to each chunk

User has no control on location of Full control on location of data Requires two sharding keys data provides (super_sharding_key and sharding_key)

Patterns 14 Oracle Sharding – Creating sharded tables

System managed Sharding does not require the user to specify mapping of data to Shards.

SQL> CREATE SHARDED TABLE customers( Supported data types for the sharding Customer_No NUMBER NOT NULL, key Customer_Name VARCHAR2(50),  NUMBER Customer_Address VARCHAR2(250),  INTEGER, CONSTRAINT Cust_PK PRIMARY KEY(Customer_No)  SMALLINT, )  RAW, PARTITION BY CONSISTENT HASH (Customer_No)  (N)VARCHAR, PARTITIONS AUTO TABLESPACE SET tbs1;  (N)CHAR,  DATE & TIMESTAMP

Patterns 15 Oracle Sharding – User Defined Sharding

User has control over database means user specifies the mapping of data to individual Shards.

SQL> create tablespace tbs1; SQL> create tablespace tbs2; SQL> CREATE TABLE orders ( id NUMBER, country_code VARCHAR2(5), customer_id NUMBER, order_date DATE, order_total NUMBER(8,2), CONSTRAINT orders_pk PRIMARY KEY (id)) PARTITION BY LIST (country_code) ( PARTITION part_usa VALUES ('USA') tablespsce tbs1, PARTITION part_uk_and_ireland VALUES ('GBR', 'IRL') tablespace tbs2);

Patterns 16 Oracle Sharding – Composite Sharding

Composite Sharding is a combination of system managed and user defined Sharding.

SQL> create tablespace tsp_set_1; SQL> create tablespace tsp_set_; SQL> alter session enable shard ddl;

SQL> CREATE SHARDED TABLE Customers ( CustId VARCHAR2(60) NOT NULL, Set partitionsets and FirstName VARCHAR2(60), tablespace sets LastName VARCHAR2(60), for composite Class VARCHAR2(10), partitioning Geo VARCHAR2(8), CustProfile VARCHAR2(4000), Passwd RAW(60), CONSTRAINT pk_customers PRIMARY KEY (CustId), CONSTRAINT json_customers CHECK (CustProfile IS JSON)) partitionset by list(GEO) partition by consistent hash(CustId) partitions auto (partitionset america values ('AMERICA') tablespace set tsp_set_1, partitionset europe values ('EUROPE') tablespace set tsp_set_2);

Patterns 17 Oracle Sharding License Requirements

Oracle Enterprise Edition is a pre-requisite

 Each shard is a standalone Oracle Database

Each shard must be licensed for either Active Data Guard, GoldenGate or RAC

 Additional Active Data Guard or RAC licenses are required for catalog DB if used for catalog HA

No separate license for Oracle Partitioning

 No separate license for Oracle Partitioning is required for sharded or duplicated tables created using Oracle Sharding

Patterns 18 Oracle Sharding – Compatibility Requirements for Shards

Each shard is an independent Oracle Database.

Oracle Sharding requires a minimum release of Oracle DB 12.2.0.1 and Oracle Client 12.2.0.1.

For the first release of Oracle Sharding, Oracle recommends using the same OS for all the shards.

This recommendation will be lifted in future releases.

The initial release of Oracle Sharding (12.2.0.1) does not support Oracle Multitenant.

Patterns 19 Oracle Database 12c R2 - Choices

ORACLE RAC ORACLE DB ORACLE DB ORACLE DB 1. Single logical Single physical database database 2. Supports OLTP Supports any applications ORACLE GRID INFRASTRUCTURE application designed to shard OS OS OS OS OS 3. Each shard has its own CPU, memory & disk. 4. Presented to the application as a single logical ORACLE ASM database SHARED STORAGE

Patterns 20 Components of Sharding

Sharded Database (SDB)

Shards in Sharded Database (SDB) – (Independent physical Oracle databases that host a subset of the sharded database)

Global Service – (Database services that provide access to data in an SDB)

Shard Directors – (Network listeners that enable high performance connection routing based on a sharding key)

Management Tools – OEM 13C and Global Database Services (GDSCTL)

Connection Pools – (at runtime, act as shard directors by routing DB requests across pooled connections)

Shard Catalog – (Oracle DB that supports automated shard deployment, centralized mgmt. of a sharded database, and multi-shard queries)

Patterns 21 Oracle Sharding Architecture

Application Connection Sharding Shard Catalog Pool  Stores SDB metadata Servers key  Acts as a query coordinator for multi-shard queries  Contains the master copy of all duplicated tables

Routing Tier Shard Directors Shard Director  Global service manager for direct routing of connection requests to shards  Publishes run-time SDB topology map, load balancing advisory, FAN events via ONS  Acting as a regional listener for clients to connect Shard Catalog/ to an SDB. Coordinator DATA TIER Sharded Database  Set of all shards  Independent physical Oracle databases that host a subset of the sharded database

Global Service  Single service to access any shard in the SDB

Patterns 22 Oracle Sharding – Data Modeling Requirements

Hierarchical table family that consists of a root table and many child and grandchild tables

Every table in the table family must contain the sharding _key

Sharding _key must be the leading of the primary key of the root table

The sharding method and sharding_key are determined based upon application and business requirements

Identify tables that need to be duplicated on all shards

In 12.2.0.1, only one table family is supported per sharded database

Patterns 23 Application requirements in Oracle Sharding

Direct Routing (Based on Sharding_key)

 OLTP workloads must specify sharding_key (e.g. customer_id) during connect  JDBC/UCP, OCI, and ODP.NET recognize sharding keys

Proxy Routing via a Coordinator(shard catalog)

 For workloads that cannot specify sharding_key (as part of connection). . Reporting, Batch jobs workloads . Queries spanning one or more or all shards . Applications unable to specify a sharding_key

Database Connections

 Must use Oracle integrated connection pools (UCP, OCI, ODP.NET, JDBC)  Must be able to separate workloads that use Direct Routing from those that use Proxy Routing  Each uses separate connection pools

Patterns 24 Life Cycle Management In Oracle Sharding

Centralized Schema Management & Duplicated Table maintenance

EM supports monitoring and management of SDB

Online Addition and Rebalancing of Shards (When a new shard is added, chunks are auto rebalanced)

DBA can manually move or split a chunk from one shard to another shard

Online Chunk Split – Splits specified chunk into 2 chunks

All shards can be patched with one command via opatchauto.

Patterns 25 Deploying a System Managed SDB

. Clients pass sharding key (e.g. Customer ID) to Connection pool, connection is routed to the right shard . Fast: caching key ranges on client ensures that most accesses go directly to the shard

ORACLE DB ORACLE DB ORACLE DB Shard Directors (GSMs) Shard Catalog Sharddirector1 shardcat Client Terminal

OS OS OS UCP

 Create a database that hosts the shard catalog  Install Oracle Database software on shard nodes  Install shard director – Global Service Manager (GSM) Shardgroup - shgrp1

Patterns 26 Complete Architecture of a System-Managed SDB

Shard Directors (GSMs) Shard Catalog Shard director1,2 shardcat Shardgroup ORACLE DB ORACLE DB Shgrp1

Primary Region A OS OS Connection ConnectionPools Pools Client Terminal

Shard Directors (GSMs) Shard Catalog Shard director 3,4 Shardcat_stdby Shardgroup ORACLE DB ORACLE DB Shgrp2

OS OS HA Standby Region B Connection ConnectionPools Pools

Data Guard Fast-Start Failover

Patterns 27 Oracle Sharding Deployment

Primary Database Standby Database

Host: SHRCAT Shard-1 Shard-1

Database: Database: Sh1 Sh3 Shard Install Oracle Database 12c R2 in all Active Directors three nodes with the following options:  Install Database software only Data & Shard  Single Instance Database Installation Guard Catalog  Enterprise Edition Shard-2 Shard-2

Database: Database: Sh2 Sh4

Patterns 28 Setting up Oracle Sharding

Installing Global Service Manager (GSM/GDS) on SourceDB1 (host the shard director )

[oracle@sourcedb1 gsm]$ ./runInstaller

Accept the defaults and click Next

Patterns 29 Setting up Oracle Sharding

Accept the defaults and click Next

Patterns 30 Setting up Oracle Sharding

Finished Installation of GSM

Patterns 31 Creating the Shard Catalog Database on host sourcedb1

Prepare and run DBCA.

[oracle@sourcedb1 Desktop]$ export ORACLE_BASE=/u01/app/oracle [oracle@sourcedb1 Desktop]$ export ORACLE_HOME=/u01/app/oracle/product/12.2.0/dbhome_1 [oracle@sourcedb1 Desktop]$ mkdir /u01/app/oracle/oradata [oracle@sourcedb1 Desktop]$ mkdir /u01/app/oracle/fast_recovery_area [oracle@sourcedb1 Desktop]$ $ORACLE_HOME/bin/dbca

Patterns 32 Creating the Shard Catalog Database on host sourcedb1

Create a NON-CDB by using DBCA on the shard catalog with OMF (required) with single Instance.

Patterns 33 Creating the Shard Catalog Database on host sourcedb1

Select Fast Recovery and Enable archiving.

Patterns 34 Setting Up - Oracle Sharding Management and Routing Tier

Set catalog database environment and start listener.

[oracle@sourcedb1 ~]$ . oraenv ORACLE_SID = [oracle] ? sorcat The Oracle base remains unchanged with value /u01/app/oracle [oracle@sourcedb1 ~]$ lsnrctl start

open_links and open_links_per_instance are set to 16 (Optional)

[oracle@sourcedb1 ~]$ sqlplus / as sysdba SQL> alter system set open_links=16 scope=spfile; SQL> alter system set open_links_per_instance=16 scope=spfile; SQL> shutdown immediate SQL> startup

Grant roles and privileges on the sorcat database.

SQL> alter user gsmcatuser account unlock; SQL> alter user gsmcatuser identified by welcome1; SQL> create user mysdbadmin identified by welcome1; SQL> grant connect, create session, gsmadmin_role to mysdbadmin; SQL> grant inherit privileges on user SYS to GSMADMIN_INTERNAL;

Patterns 35 Setting Up - Oracle Sharding Management and Routing Tier

Connect to shard director host(sourcedb1) and start GDSCTL.

[oracle@sourcedb1 ~]$ export ORACLE_BASE=/u01/app/oracle [oracle@sourcedb1 ~]$ export ORACLE_HOME=/u01/app/oracle/product/12.2.0/gsmhome_1 [oracle@sourcedb1 ~]$ export PATH=$PATH:$HOME/bin:$ORACLE_HOME/bin [oracle@sourcedb1 ~]$ gdsctl Welcome to GDSCTL, type "help" for information. Current GSM is set to SHARDDIRECTOR1

Create the shard catalog and configure the remote scheduler agent on the shard catalog

GDSCTL>create shardcatalog -database shard_catalog_host:1521:sorcat -chunks 12 -user mysdbadmin/welcome1 -sdb orasdb -region region1, region2

Create and start the shard director, set the operating system credentials

GDSCTL> add gsm -gsm sharddirector1 -listener 1522 -pwd welcome1 -catalog shard_catalog_host:1521:sorcat -region region1 GDSCTL> start gsm -gsm sharddirector1 GDSCTL> add credential -credential region1_cred -osaccount oracle -ospassword welcome1 GDSCTL> exit

Patterns 36 Setting Up - Oracle Sharding Management and Routing Tier

Connect to catalog database sorcat, to set the scheduler port and password.

[oracle@sourcedb1 ~]$ . oraenv [oracle@sourcedb1 ~]$ sqlplus / as sysdba SQL> exec DBMS_XDB.sethttpport(8080); SQL> ; SQL> exec DBMS_SCHEDULER.SET_AGENT_REGISTRATION_PASS('welcome1'); SQL> alter system register;

Connect to each of the shard hosts, register remote scheduler agents on them, and create directories for oradata and fast_recovery_area [oracle@sourcedb1 ~]$ ssh oracle@sh1 [oracle@sourcedb2 ~]$ schagent -start [oracle@sourcedb2 ~]$ echo welcome | schagent -registerdatabase shard_catalog_host 8080 [oracle@sourcedb2 ~]$ mkdir /u01/app/oracle/oradata [oracle@sourcedb2 ~]$ mkdir /u01/app/oracle/fast_recovery_area [oracle@sourcedb2 ~]$ exit [oracle@sourcedb1 ~]$ ssh oracle@sh2 [oracle@sourcedb3 ~]$ schagent -start [oracle@sourcedb3 ~]$ echo welcome1 | schagent -registerdatabase shard_catalog_host 8080 [oracle@sourcedb3 ~]$ mkdir /u01/app/oracle/oradata [oracle@sourcedb3 ~]$ mkdir /u01/app/oracle/fast_recovery_area

Patterns 37 Deploying a System-Managed SDB

Prepare – set ORACLE_BASE, ORACLE_HOME and PATH for GSM

[oracle@sourcedb1 ~]$ gdsctl Current GSM is set to SHARDDIRECTOR1 GDSCTL>set gsm -gsm sharddirector1 GDSCTL>connect mysdbadmin/welcome1 Catalog connection is established GDSCTL>add shardgroup -shardgroup primary_shardgroup -deploy_as primary -region region1 The operation completed successfully GDSCTL>create shard -shardgroup primary_shardgroup -destination sourcedb2 -credential region1_cred -sys_password welcome1 Catalog connection is established The operation completed successfully DB Unique Name: sh1 GDSCTL>create shard -shardgroup primary_shardgroup -destination sourcedb3 -credential region1_cred -sys_password welcome1 The operation completed successfully DB Unique Name: sh2 GDSCTL>config shard Name Shard Group Status State Region Availability ------sh1 primary_shardgroup U none region1 - sh21 primary_shardgroup U none region1 - Patterns 38 Deploying a System-Managed SDB

Deploy. The DEPLOY command takes some time to run, approximately 15 to 30 minutes

GDSCTL>deploy deploy: examining configuration... deploy: deploying primary shard 'sh1' ... deploy: network listener configuration successful at destination 'sourcedb2' deploy: starting DBCA at destination 'sourcedb2' to create primary shard 'sh1' ... deploy: deploying primary shard 'sh21' ... deploy: network listener configuration successful at destination 'sourcedb3' deploy: starting DBCA at destination 'sourcedb3' to create primary shard 'sh2' ... deploy: waiting for 2 DBCA primary creation job(s) to complete... deploy: waiting for 2 DBCA primary creation job(s) to complete... deploy: DBCA primary creation job succeeded at destination 'sourcedb2' for shard 'sh1' deploy: waiting for 1 DBCA primary creation job(s) to complete... deploy: waiting for 1 DBCA primary creation job(s) to complete... deploy: DBCA primary creation job succeeded at destination 'sourcedb3' for shard 'sh2' deploy: requesting Data Guard configuration on shards via GSM deploy: shards configured; background operations in progress The operation completed successfully

Patterns 39 Behind the scenes of Deploy Command

 DBMS_SCHEDULER package (executed on Shard Catalog) communicates with Scheduler Agents on remote hosts 1. Agents run DBCA and NETCA to create shards and listeners

 Oracle Data Guard replication: 1. Primary databases are created first 2. DBCA uses RMAN duplicate to create corresponding standbys 3. Redo transport and Broker are configured

 Oracle GoldenGate replication (OGG 12.3): 1. Replication pipelines are configured and replication is started

Patterns 40 Deploying a System-Managed SDB

Monitor the logfile on the shard databases

[oracle@sourcedb2 dbca]$ ls /u01/app/oracle/cfgtoollogs/dbca/silent.log* [oracle@sourcedb2 dbca]$ tail -f /u01/app/oracle/cfgtoollogs/dbca/silent.log Copying database files DBCA_PROGRESS : 1% DBCA_PROGRESS : 2% DBCA_PROGRESS : 16% DBCA_PROGRESS : 25% DBCA_PROGRESS : 45% DBCA_PROGRESS : 78% DBCA_PROGRESS : 86%

Note: Query the dba_scheduler_running_jobs on catalog database for monitoring or diagnosing during the deploying.

The exact location of a given GSM's log and trace files can be obtained using the status gsm command. GDSCTL> status gsm

Patterns 41 Deploying a System-Managed SDB

Verify the configuration

GDSCTL>config shard

Name Shard Group Status State Region Availability ------sh1 primary_shardgroup Ok Deployed region1 ONLINE Sh2 primary_shardgroup Ok Deployed region1 ONLINE

GDSCTL>databases Conversion = '1'Database: "sh1" Registered: Y State: Ok ONS: N. Role: PRIMARY Instances: 1 Region: region1 Registered instances: orasdb%1 Database: "sh2" Registered: Y State: Ok ONS: N. Role: PRIMARY Instances: 1 Region: region1 Registered instances: orasdb%11

Patterns 42 Oracle Sharding – Backup and Recovery for SDB

Existing MAA best practices for backup apply to SDB

 Best practices for Disk, Tape, or Oracle Secure Backup (OSB)  Determine frequency and retention period  Use Recovery Manager (RMAN) catalog  Enable Block Change Tracking (BCT)  Enable auto backup for control file and server parameter file  Offload backups to physical standby

Oracle Sharding – MOS Notes

 Complete reference - Oracle Sharding 12.2 [2143551.1]  Oracle Sharding - Troubleshooting Tips and Techniques [2180259.1]  Backup and Recovery Consideration - Oracle Sharding Environment [2189659.1]  Master Note for Handling Oracle Sharding - Oracle Database 12.2 Technology [2226341.1]

Patterns 43 Oracle Sharding using Oracle RAC

ORACLE RAC ORACLE RAC Shard Catalog Shard Catalog DG shardcat shardcat

ORACLE GRID INFRASTRUCTURE ORACLE GRID INFRASTRUCTURE OS OS OS OS

DG

Primary Shard Shard Directors Shard Directors Standby Shard ASM SHARED STORAGE (GSMs) (GSMs) ASM SHARED STORAGE

Patterns 44 Thanks for your TIME

[email protected]

@yvrk1973

http://yvrk1973.blogspot.in

https://www.linkedin.com/in/yv-ravikumar-ace-director-oracle-certified-master-book-author-a5001314/

Patterns 45