Succinct Study of Oracle SQL BI API as Evolution
SQL API Evolution: Oracle9i through Oracle19c
Oracle Corporation

Prepared by: ANTHONY D NORIEGA
ADN R & D
Urban Consulting
New York, NY
December 4, 2019

BACKGROUND AND CONCEPTUAL FRAMEWORK

Objectives

1. Understanding the nature of changes and improvements in the Oracle SQL Business Intelligence API as it has advanced within the Oracle12c family.
2. Introducing a visual perspective of the Oracle SQL API for code maintenance, migration, and upgrade purposes.
3. Tuning the modern, improved Oracle SQL API for application performance in relation to existing code and old-fashioned queries, e.g., equivalent correlated subqueries.

Introduction

• Oracle SQL API is the globally recognized leading product in its class.
• Oracle SQL standards comply with the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO).
• The ISO SQL standard consists of nine parts (SQL/Framework, SQL/Foundation, SQL/CLI, SQL/PSM, SQL/MED, SQL/OLB, SQL/Schemata, SQL/JRT, and SQL/XML). The ANSI SQL standard consists of the same nine parts.
• The mandatory portion of SQL is known as Core SQL and is found in SQL:2016 Part 2 (Foundation) and Part 11 (Schemata). The Foundation features are analyzed in Annex F of Part 2 in the table "Feature taxonomy and definition for mandatory features." The Schemata features are analyzed in Annex F of Part 11 in the table "Feature taxonomy and definition for mandatory features."

Justification

• Oracle SQL BI API is also the global leading product in its class.
• Maintaining applications that are based on or use the Oracle SQL BI API across new releases, although apparently seamless, may require significant quality assurance work (regression testing, verification and validation, and conventional upgrade tasks) to ensure reliability when adopting relevant new features.
• Making the most of new features is worth the effort to attain the utmost accuracy and tuning.

TECHNICAL FRAMEWORK

Special Considerations

The Oracle SQL BI API and the use of analytical functions were introduced with the Oracle9i release and have been expanded in every release since.


Oracle12c SQL BI API Characterization

• Finding repeating patterns in SQL BI API statements should be a specialized analytic task in both software and data engineering.
• Tracing SQL BI API patterns is possible, just as with any software engineering code.
• Identifying the best patterns to attain optimal SQL performance tuning is equally possible.
• SQL BI API patterns are characterized by the structure of DML statements, quite often in the SELECT and GROUP BY clauses.


Basic Oracle SQL API Characterization

SELECT Clause
    Analytical Window
        Analytical aggregate function or expression
        OVER Clause (
            PARTITION BY subclauses
            ORDER BY subclause
            other frame clause keywords
        )
    GROUPING, GROUPING_ID, GROUP_ID() aggregate analytic function
FROM Clause * (possible embedded table subclause)
    PIVOT Clause
WHERE
[MODEL Clause]
GROUP BY Clause [Aggregate analytical functions (ROLLUP, CUBE, GROUPING SETS, GROUPING IDs)]
ORDER BY [ Clause ]

WIDTH_BUCKET Function
Linear Algebra
Frequent Itemsets in SQL Analytics
CASE Expressions
START WITH … CONNECT BY

Analytical Function

• Provides the analytical window in the SELECT clause.
• Contains three standard (independent and interdependent) components and one dependent component:

– The analytic clause
– The query partition clause
– The order-by clause
– The windowing clause, an optional component whose existence depends on the presence of the order-by clause.

Key Patterns in SQL BI API


Analytical Functions by Type

The SQL BI API can involve several specific roles and types of calculation, namely:
• Rankings and percentiles
• Moving window calculations
• Lag/lead analysis
• First/last analysis
• Linear regression statistics

Analytical Functions Types and Usage

TYPE: USAGE

RANKING: Calculating ranks, percentiles, and n-tiles of the values in a result set.

WINDOWING: Calculating cumulative and moving aggregates. Works with these functions: SUM, AVG, MIN, MAX, COUNT, VARIANCE, STDDEV, FIRST_VALUE, LAST_VALUE, and new statistical functions. Note that the DISTINCT keyword is not supported in windowing functions except for MAX and MIN.

REPORTING: Calculating shares, for example, market share. Works with these functions: SUM, AVG, MIN, MAX, COUNT (with/without DISTINCT), VARIANCE, STDDEV, RATIO_TO_REPORT, and new statistical functions. Note that the DISTINCT keyword may be used in those reporting functions that support DISTINCT in aggregate mode.

LAG / LEAD ANALYSIS: Finding a value in a row a specified number of rows from a current row.

FIRST / LAST ANALYSIS: First or last value in an ordered group.

LINEAR REGRESSION ANALYSIS: Calculating linear regression and other statistics (slope, intercept, and so on).

INVERSE PERCENTILE ANALYSIS: The value in a data set that corresponds to a specified percentile.

HYPOTHETICAL RANK AND DISTRIBUTION: The rank or percentile that a row would have if inserted into a specified data set.

Oracle Analytic Function SQL BI API

• Analytic Functions, a decade ago and now

AVG *, CLUSTER_DETAILS, CLUSTER_DISTANCE, CLUSTER_ID, CLUSTER_PROBABILITY, CLUSTER_SET, CORR *, COUNT *, COVAR_POP *, COVAR_SAMP *, CUME_DIST, DENSE_RANK
FEATURE_DETAILS, FEATURE_ID, FEATURE_SET, FEATURE_VALUE, FIRST, FIRST_VALUE *, LAG, LAST, LAST_VALUE *, LEAD, LISTAGG, MAX *, MIN *, NTH_VALUE *, NTILE
PERCENT_RANK, PERCENTILE_CONT, PERCENTILE_DISC, PREDICTION, PREDICTION_COST, PREDICTION_DETAILS, PREDICTION_PROBABILITY, PREDICTION_SET, RANK, RATIO_TO_REPORT
REGR_ (Linear Regression) Functions *, ROW_NUMBER, STDDEV *, STDDEV_POP *, STDDEV_SAMP *, SUM *, VAR_POP *, VAR_SAMP *, VARIANCE *

Analytical Function Processing Workflow

• Processing Order of Analytical Functions (Oracle19c)

Sample Evolution: LISTAGG Analytical Function

SELECT deptno,
       LISTAGG(DISTINCT ename, ',') WITHIN GROUP (ORDER BY ename) AS employees
FROM emp
GROUP BY deptno
ORDER BY deptno;

DEPTNO EMPLOYEES
------ ------------------------------------
10     CLARK,KING,MILLER
20     ADAMS,FORD,JONES,SCOTT,SMITH
30     ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

SELECT deptno,
       LISTAGG(ALL ename, ',') WITHIN GROUP (ORDER BY ename) AS employees
FROM emp
GROUP BY deptno
ORDER BY deptno;

DEPTNO EMPLOYEES
------ ------------------------------------
10     CLARK,KING,MILLER,MILLER,MILLER
20     ADAMS,FORD,JONES,SCOTT,SMITH
30     ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD
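The difference between the DISTINCT and ALL forms can be sketched outside the database. The following is a minimal Python emulation, not Oracle's implementation; the function name `listagg` and the sample rows are assumptions chosen so that department 10 contains MILLER three times, as in the ALL output.

```python
from collections import defaultdict

def listagg(rows, distinct=False, sep=","):
    """Emulate LISTAGG(ename, sep) WITHIN GROUP (ORDER BY ename) with GROUP BY deptno."""
    groups = defaultdict(list)
    for deptno, ename in rows:
        groups[deptno].append(ename)
    result = {}
    for deptno in sorted(groups):
        # DISTINCT drops duplicate names before concatenation; ALL keeps them.
        names = sorted(set(groups[deptno])) if distinct else sorted(groups[deptno])
        result[deptno] = sep.join(names)
    return result

# Hypothetical rows (deptno, ename); MILLER is deliberately repeated in department 10.
emp = [(10, "CLARK"), (10, "KING"), (10, "MILLER"), (10, "MILLER"), (10, "MILLER"),
       (20, "ADAMS"), (20, "FORD"), (20, "JONES"), (20, "SCOTT"), (20, "SMITH")]

print(listagg(emp, distinct=True)[10])
print(listagg(emp, distinct=False)[10])
```

Before DISTINCT was accepted inside LISTAGG, the same effect required removing duplicates in an inline view first, which is the pattern the ROW_NUMBER() variant later in this deck follows.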

Sample Evolution: LISTAGG Analytical Function

• Evolution of the LISTAGG Analytical Function

SELECT g.country_region,
       LISTAGG(c.cust_first_name||' '||c.cust_last_name, ','
               ON OVERFLOW TRUNCATE '…' WITH COUNT)
         WITHIN GROUP (ORDER BY c.country_id) AS Customer
FROM customers c, countries g
WHERE g.country_id = c.country_id
GROUP BY country_region
ORDER BY country_region;


Advanced Aggregation in Analytical Functions

• Grouping Analytical Functions
– Introduced in the Oracle10g release
– Provide an extension to the analytical window by generating grouping pseudo-columns in the SELECT clause (the upper portion, above the FROM clause)
– Provide the following functions:
  • GROUPING (specifying tuples from selected attributes)
  • GROUPING_ID (differentiator, specifying selected attributes)
  • GROUP_ID() (differentiator, providing analytic group id)

String-driven Analytical Functions

• SQL String Aggregation
– LISTAGG
– WM_CONCAT
– COLLECT

• String Pattern Matching
– MATCH_RECOGNIZE

SQL String Aggregation using LISTAGG

• LISTAGG Function Primitive Format

• Using LISTAGG

SELECT e2.deptno,
       LISTAGG(e2.ename, ',') WITHIN GROUP (ORDER BY e2.ename) AS employees
FROM (SELECT e.*,
             ROW_NUMBER() OVER (PARTITION BY e.deptno, e.ename ORDER BY e.empno) AS myrank
      FROM emp e) e2
WHERE e2.myrank = 1
GROUP BY e2.deptno
ORDER BY e2.deptno;

DEPTNO EMPLOYEES
------ ------------------------------------
10     CLARK,KING,MILLER
20     ADAMS,FORD,JONES,SCOTT,SMITH
30     ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

Oracle 19c Featured LISTAGG Usage

• Oracle 19c Featured SQL String Aggregation with LISTAGG

SELECT g.country_region,
       LISTAGG(c.cust_first_name||' '||c.cust_last_name, ','
               ON OVERFLOW TRUNCATE '…' WITH COUNT)
         WITHIN GROUP (ORDER BY c.country_id) AS Customer
FROM customers c, countries g
WHERE g.country_id = c.country_id
GROUP BY country_region
ORDER BY country_region;

• ERROR reporting on overflow is also featured in Oracle19c.

SQL String Pattern Matching with MATCH_RECOGNIZE

• Using MATCH_RECOGNIZE

SELECT *
FROM Ticker MATCH_RECOGNIZE (
     PARTITION BY symbol
     ORDER BY tstamp
     MEASURES STRT.tstamp AS start_tstamp,
              LAST(DOWN.tstamp) AS bottom_tstamp,
              LAST(UP.tstamp) AS end_tstamp
     ONE ROW PER MATCH
     AFTER MATCH SKIP TO LAST UP
     PATTERN (STRT DOWN+ UP+)
     DEFINE
        DOWN AS DOWN.price < PREV(DOWN.price),
        UP AS UP.price > PREV(UP.price)
     ) MT
ORDER BY MT.symbol, MT.start_tstamp;

SYMBOL START_TST BOTTOM_TS END_TSTAM
------ --------- --------- ---------
ACME   05-APR-11 06-APR-11 10-APR-11
ACME   10-APR-11 12-APR-11 13-APR-11
ACME   14-APR-11 16-APR-11 18-APR-11

Pivot Table API

• Pivot Table Clause Format

Using Basic Pivot Table API

• Pivot Table Basic Usage

SELECT *
FROM sales2
     PIVOT ( SUM(amount_sold), COUNT(*) AS count_total
             FOR qtr IN ('Q1', 'Q2') );

PROD_ID "Q1" "Q1_COUNT_TOTAL" "Q2"     "Q2_COUNT_TOTAL"
------- ---- ---------------- -------- ----------------
100     20   2                NULL <1> 1
200     50   1                NULL <2> 0

From the result, you know that for prod_id 100, there are 2 sales rows for quarter Q1 and 1 sales row for quarter Q2; for prod_id 200, there is 1 sales row for quarter Q1 and no sales row for quarter Q2. So, using Q2_COUNT_TOTAL, you can identify that NULL<1> comes from a row in the original table whose measure is null, while NULL<2> is due to no row being present in the original table for prod_id 200 in quarter Q2.
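The two kinds of NULL can be reproduced with a small emulation. This is a sketch in Python rather than Oracle's PIVOT machinery; the contents of `sales2` are not shown in the slides, so the rows below are hypothetical values chosen to reproduce the result above.

```python
from collections import defaultdict

# Hypothetical contents of sales2: (prod_id, qtr, amount_sold); None stands for SQL NULL.
sales2 = [
    (100, "Q1", 10), (100, "Q1", 10),
    (100, "Q2", None),   # a row exists, but its measure is NULL  -> NULL <1>
    (200, "Q1", 50),     # no Q2 row at all for prod_id 200       -> NULL <2>
]

def pivot(rows, quarters=("Q1", "Q2")):
    """Return {prod_id: {qtr: (SUM(amount_sold), COUNT(*))}}."""
    cells = defaultdict(lambda: {q: [None, 0] for q in quarters})
    for prod_id, qtr, amount in rows:
        if qtr not in quarters:
            continue
        cell = cells[prod_id][qtr]
        cell[1] += 1                          # COUNT(*) counts every row
        if amount is not None:                # SUM skips NULL measures
            cell[0] = (cell[0] or 0) + amount
    return {p: {q: tuple(c) for q, c in qs.items()} for p, qs in cells.items()}

result = pivot(sales2)
for prod_id in sorted(result):
    print(prod_id, result[prod_id])
```

Both Q2 sums come back as None, and only the accompanying count (1 versus 0) tells the two cases apart, which is why the COUNT(*) measure is pivoted alongside the SUM.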

Using Pivot Table Intermediate API

• Pivot Table Intermediate Usage

SELECT *
FROM (SELECT product, channel, amount_sold
      FROM sales_view) SV
     PIVOT (SUM(amount_sold)
            FOR CHANNEL IN (3 AS DIRECT_SALES, 4 AS INTERNET_SALES,
                            5 AS CATALOG_SALES, 9 AS TELESALES))
ORDER BY product;

PRODUCT DIRECT_SALES INTERNET_SALES CATALOG_SALES TELESALES
------- ------------ -------------- ------------- ---------
...
Internal 6X CD-ROM 229512.97 26249.55
Internal 8X CD-ROM 286291.49 42809.44
Keyboard Wrist Rest 200959.84 38695.36 1522.73

Using XML Pivot Table API

• Pivot Table XML Usage (generating XML output)

SELECT *
FROM (SELECT product, channel, quantity_sold
      FROM sales_view sv)
     PIVOT XML (SUM(quantity_sold)
                FOR channel IN (SELECT DISTINCT channel_id FROM CHANNELS));

SQL BI API for Modeling

• The MODEL Clause
– This SQL clause works conveniently with an analytical workspace regardless of how the multi-dimensional model is implemented, i.e., as a Snowflake or Star Schema model.
– It enhanced the Oracle10g release for the purposes of predictive analytics and time series analysis.

MODEL Clause primitive API format

• The MODEL Clause
The Oracle 19c SQL MODEL clause API is exhibited below.

MODEL Clause in SELECT Statement

• The MODEL Clause Example

MODEL Clause Processing Workflow

• MODEL Flow Processing

Advanced Analytic Aggregation SQL BI API

• Aggregate Analytical Functions (in Data Warehousing)
These functions are essentially used in the GROUP BY clause:

– ROLLUP
– CUBE
– GROUPING SETS
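What ROLLUP and CUBE expand to can be illustrated by enumerating their grouping sets. The helper names `rollup` and `cube` below are hypothetical; this is a sketch of the expansion rule in Python, not of how the database evaluates the aggregation.

```python
from itertools import combinations

def rollup(*cols):
    """Grouping sets of ROLLUP: every leading prefix of the column list, down to ()."""
    return [tuple(cols[:n]) for n in range(len(cols), -1, -1)]

def cube(*cols):
    """Grouping sets of CUBE: every subset of the column list, largest first."""
    return [subset
            for n in range(len(cols), -1, -1)
            for subset in combinations(cols, n)]

print(rollup("x", "y", "z"))   # n + 1 = 4 grouping sets
print(cube("x", "y", "z"))     # 2 ** n = 8 grouping sets
```

A composite column such as (x, y) can be passed as a single tuple element, for example `rollup(("x", "y"), "z")`, in which case it is kept or dropped as one unit. GROUPING SETS corresponds to writing such a list out by hand.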

ROLLUP vs. CUBE Processing Workflow

• ROLLUP vs. CUBE Analytical Aggregation

ROLLUP (x, y, z)    ROLLUP ((x, y), z)
(x, y, z)           (x, y, z)
(x, y)              (x, y)
(x)                 ()
()

CUBE (x, y, z)      CUBE ((x, y), z)
(x, y, z)           (x, y, z)
(x, y)              (x, y)
(x, z)              (z)
(x)                 ()
(y, z)
(y)
(z)
()

WITH Clause SQL BI API

• WITH Clause

WITH V AS
  (SELECT substr(p.prod_name,1,12) prod_name, calendar_month_desc,
          SUM(quantity_sold) units, SUM(amount_sold) sales
   FROM sales s, products p, times t
   WHERE s.prod_id IN (122,136) AND calendar_year = 2000
     AND t.time_id = s.time_id
     AND p.prod_id = s.prod_id
   GROUP BY p.prod_name, calendar_month_desc)
SELECT v.prod_name, calendar_month_desc, units, sales,
       NVL(units, AVG(units) OVER (PARTITION BY v.prod_name)) computed_units,
       NVL(sales, AVG(sales) OVER (PARTITION BY v.prod_name)) computed_sales
FROM
  (SELECT DISTINCT calendar_month_desc
   FROM times
   WHERE calendar_year = 2000) t
  LEFT OUTER JOIN V
  PARTITION BY (prod_name)
  USING (calendar_month_desc);

PROD_NAME    CALENDAR UNITS SALES   COMPUTED_UNITS COMPUTED_SALES
------------ -------- ----- ------- -------------- --------------
64MB Memory  2000-01  112   4129.72 112            4129.72
64MB Memory  2000-02  190   7049    190            7049
64MB Memory  2000-03  47    1724.98 47             1724.98
64MB Memory  2000-04  20    739.4   20             739.4
64MB Memory  2000-05  47    1738.24 47             1738.24
64MB Memory  2000-06  20    739.4   20             739.4
64MB Memory  2000-07                72.6666667     2686.79
64MB Memory  2000-08                72.6666667     2686.79
64MB Memory  2000-09                72.6666667     2686.79
64MB Memory  2000-10                72.6666667     2686.79
64MB Memory  2000-11                72.6666667     2686.79
64MB Memory  2000-12                72.6666667     2686.79
DVD-R Discs, 2000-01  167   3683.5  167            3683.5
DVD-R Discs, 2000-02  152   3362.24 152            3362.24
DVD-R Discs, 2000-03  188   4148.02 188            4148.02
DVD-R Discs, 2000-04  144   3170.09 144            3170.09
DVD-R Discs, 2000-05  189   4164.87 189            4164.87
DVD-R Discs, 2000-06  145   3192.21 145            3192.21
DVD-R Discs, 2000-07                124.25         2737.71
DVD-R Discs, 2000-08                124.25         2737.71
DVD-R Discs, 2000-09  1     18.91   1              18.91
DVD-R Discs, 2000-10                124.25         2737.71
DVD-R Discs, 2000-11                124.25         2737.71
DVD-R Discs, 2000-12  8     161.84  8              161.84

Other Key Patterns in SQL BI API

Other Miscellaneous Analysis and Reporting Capabilities:
• WIDTH_BUCKET Function
• Linear Algebra
• CASE Expressions
• Frequent Itemsets in SQL Analytics
• Hierarchical Queries

WIDTH_BUCKET SQL Function API

• WIDTH_BUCKET Function Format

WIDTH_BUCKET(expression, minval expression, maxval expression, num buckets)

SELECT cust_id, cust_credit_limit,
       WIDTH_BUCKET(cust_credit_limit, 0, 20000, 4) AS WIDTH_BUCKET_UP,
       WIDTH_BUCKET(cust_credit_limit, 20000, 0, 4) AS WIDTH_BUCKET_DOWN
FROM customers
WHERE cust_city = 'London';

CUST_ID CUST_CREDIT_LIMIT WIDTH_BUCKET_UP WIDTH_BUCKET_DOWN
------- ----------------- --------------- -----------------
10346   7000              2               3
35266   7000              2               3
41496   15000             4               2
35225   11000             3               2
3424    9000              2               3
28344   1500              1               4
31112   7000              2               3
8962    1500              1               4
15192   3000              1               4
21380   5000              2               4
36651   1500              1               4
30420   5000              2               4
8270    3000              1               4
17268   11000             3               2
14459   11000             3               2
13808   5000              2               4
32497   1500              1               4
100977  9000              2               3
102077  3000              1               4
103066  10000             3               3
101784  5000              2               4
. . .
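The bucket numbers in the output follow from simple equal-width arithmetic. The function below is a hedged Python emulation for numeric inputs (the name `width_bucket` and its parameter names are assumptions), written to show why reversing the bounds changes the result for boundary values such as 5000, 10000, and 15000.

```python
def width_bucket(value, low, high, num_buckets):
    """Emulate WIDTH_BUCKET(value, low, high, num_buckets) for numeric inputs.

    With low < high, each bucket includes its lower edge; values below low go to
    bucket 0 and values at or above high go to bucket num_buckets + 1.
    With the bounds reversed (low > high), numbering starts at the larger bound
    and each bucket includes its upper edge instead.
    """
    if low == high:
        raise ValueError("low and high must differ")
    if low < high:
        if value < low:
            return 0
        if value >= high:
            return num_buckets + 1
        return int((value - low) * num_buckets // (high - low)) + 1
    # Reversed bounds: count buckets downward from `low`.
    if value > low:
        return 0
    if value <= high:
        return num_buckets + 1
    return int((low - value) * num_buckets // (low - high)) + 1

for limit in (1500, 3000, 5000, 7000, 9000, 10000, 11000, 15000):
    print(limit, width_bucket(limit, 0, 20000, 4), width_bucket(limit, 20000, 0, 4))
```

For a credit limit of 15000, for instance, the ascending call yields bucket 4 and the descending call bucket 2, matching customer 41496 in the output above.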

Linear Algebra and Matrix SQL API

• Linear Algebra

CREATE OR REPLACE VIEW sales_marketing_model (year, ols)
AS SELECT year,
     OLS_Regression(
       /* mean_y => */
       AVG(sales),
       /* variance_y => */
       var_pop(sales),
       /* MV mean vector => */
       UTL_NLA_ARRAY_DBL (AVG(media), AVG(promo),
                          AVG(disct), AVG(dmail)),
       /* VCM variance covariance matrix => */
       UTL_NLA_ARRAY_DBL (var_pop(media), covar_pop(media,promo),
                          covar_pop(media,disct), covar_pop(media,dmail),
                          var_pop(promo), covar_pop(promo,disct),
                          covar_pop(promo,dmail), var_pop(disct),
                          covar_pop(disct,dmail), var_pop(dmail)),
       /* CV covariance vector => */
       UTL_NLA_ARRAY_DBL (covar_pop(sales,media), covar_pop(sales,promo),
                          covar_pop(sales,disct), covar_pop(sales,dmail)))
FROM sales_marketing_data
GROUP BY year;

CASE Expression Simple and Search SQL API

• CASE Expression Primitives

• Simple Case Syntax
CASE expr WHEN comparison_expr THEN return_expr
  [, WHEN comparison_expr THEN return_expr]...
  [ELSE else_expr] END

• Search Case Syntax
CASE WHEN condition THEN return_expr
  [, WHEN condition THEN return_expr] ...
  [ELSE else_expr] END

SELECT (CASE WHEN cust_credit_limit BETWEEN 0 AND 3999 THEN ' 0 - 3999'
             WHEN cust_credit_limit BETWEEN 4000 AND 7999 THEN ' 4000 - 7999'
             WHEN cust_credit_limit BETWEEN 8000 AND 11999 THEN ' 8000 - 11999'
             WHEN cust_credit_limit BETWEEN 12000 AND 16000 THEN '12000 - 16000' END)
         AS BUCKET,
       COUNT(*) AS Count_in_Group
FROM customers
WHERE cust_city = 'London'
GROUP BY
       (CASE WHEN cust_credit_limit BETWEEN 0 AND 3999 THEN ' 0 - 3999'
             WHEN cust_credit_limit BETWEEN 4000 AND 7999 THEN ' 4000 - 7999'
             WHEN cust_credit_limit BETWEEN 8000 AND 11999 THEN ' 8000 - 11999'
             WHEN cust_credit_limit BETWEEN 12000 AND 16000 THEN '12000 - 16000' END);

BUCKET        COUNT_IN_GROUP
------------- --------------
 0 - 3999     8
 4000 - 7999  7
 8000 - 11999 7
12000 - 16000 1

Frequent Itemsets Analytic SQL BI API

• Frequent Itemsets in SQL Analytics
Counting events that frequently occur together, rather than single events, provides the following key benefits:
• Applications that previously relied on frequent itemset operations now benefit from significantly improved performance as well as simpler implementation.
• SQL-based applications that did not previously use frequent itemsets can now be easily extended to take advantage of this functionality.
These benefits are attained using the DBMS_FREQUENT_ITEMSET PL/SQL package. This package is quite useful in marketing, AI, and machine learning applications.

DBMS_FREQUENT_ITEMSET PL/SQL package

DBMS_FREQUENT_ITEMSET.FI_HORIZONTAL(
   tranx_cursor        IN SYSREFCURSOR,
   support_threshold   IN NUMBER,
   itemset_length_min  IN NUMBER,
   itemset_length_max  IN NUMBER,
   including_items     IN SYS_REFCURSOR DEFAULT NULL,
   excluding_items     IN SYS_REFCURSOR DEFAULT NULL)
RETURN TABLE OF ROW (
   itemset [Nested Table of Item Type DERIVED FROM tranx_cursor],
   support      NUMBER,
   length       NUMBER,
   total_tranx  NUMBER);

DBMS_FREQUENT_ITEMSET.FI_TRANSACTIONAL (
   tranx_cursor        IN SYSREFCURSOR,
   support_threshold   IN NUMBER,
   itemset_length_min  IN NUMBER,
   itemset_length_max  IN NUMBER,
   including_items     IN SYS_REFCURSOR DEFAULT NULL,
   excluding_items     IN SYS_REFCURSOR DEFAULT NULL)
RETURN TABLE OF ROW (
   itemset [Nested Table of Item Type DERIVED FROM tranx_cursor],
   support      NUMBER,
   length       NUMBER,
   total_tranx  NUMBER);

DBMS_FREQUENT_ITEMSET Usage

CREATE TYPE fi_varchar_nt AS TABLE OF VARCHAR2(30);
SELECT CAST(itemset as FI_VARCHAR_NT) itemset, support, length, total_tranx
FROM table(DBMS_FREQUENT_ITEMSET.FI_HORIZONTAL(
       CURSOR(SELECT iid1, iid2, iid3, iid4, iid5
              FROM horiz_table_in),
       0.3,
       2,
       5,
       CURSOR(SELECT * FROM table(FI_VARCHAR_NT
              ('apple','banana','orange'))),
       CURSOR(SELECT * FROM
              table(FI_VARCHAR_NT('milk')))));

CREATE TYPE fi_varchar_nt AS TABLE OF VARCHAR2(30);
SELECT CAST(itemset as FI_VARCHAR_NT) itemset, support, length, total_tranx
FROM table(DBMS_FREQUENT_ITEMSET.FI_TRANSACTIONAL(
       cursor(SELECT tid, iid FROM tranx_table_in),
       0.6,
       2,
       5,
       NULL,
       NULL));

SELECT itemset, support, length, rnk
FROM
  (SELECT itemset, support, length,
          RANK() OVER (PARTITION BY length ORDER BY support DESC) rnk
   FROM
     (SELECT CAST(itemset AS fi_char) itemset, support, length, total_tranx
      FROM table(DBMS_FREQUENT_ITEMSET.FI_TRANSACTIONAL
             (CURSOR(SELECT session_id, command FROM web_log
                     WHERE time_stamp BETWEEN '01-APR-2002' AND '01-JUN-2002'),
              (60/2600), 2, 2,
              CURSOR(SELECT 'a' FROM DUAL WHERE 1=0),
              CURSOR(SELECT 'a' FROM DUAL WHERE 1=0)))))
WHERE rnk <= 10;

White paper titles                                                #
----------------------------------------------------------------- ---
Table Compression in Oracle 10g                                   696
Field Experiences with Large Data Warehouses                      439
Key Data Warehouse Features: A Comparative Performance Analysis   181
Materialized Views in Oracle Database 10g                         167
Parallel Execution in Oracle Database 10g                         166

White paper titles                                                #
----------------------------------------------------------------- ---
Table Compression in Oracle Database 10g                          115
  Field Experiences with Large Data Warehouses
Data Warehouse Performance Enhancements with Oracle Database 10g  109
  Oracle Performance and Scalability in DSS Environments
Materialized Views in Oracle Database 10g                         107
  Query Optimization in Oracle Database 10g

Limiting Rows SQL API

• Limiting SQL Rows

[ OFFSET offset { ROW | ROWS } ]
[ FETCH { FIRST | NEXT } [ { rowcount | percent PERCENT } ]
  { ROW | ROWS } { ONLY | WITH TIES } ]

SELECT employee_id, last_name
FROM employees
ORDER BY employee_id
FETCH FIRST 5 ROWS ONLY;

EMPLOYEE_ID LAST_NAME
----------- ---------
100         King
101         Kochhar
102         De Haan
103         Hunold
104         Ernst

SELECT employee_id, last_name
FROM employees
ORDER BY employee_id
OFFSET 5 ROWS FETCH NEXT 5 ROWS ONLY;

EMPLOYEE_ID LAST_NAME
----------- ---------
105         Austin
106         Pataballa
107         Lorentz
108         Greenberg
109         Faviet

SELECT employee_id, last_name
FROM employees
ORDER BY employee_id
OFFSET 10 ROWS FETCH NEXT 5 ROWS ONLY;

EMPLOYEE_ID LAST_NAME
----------- ---------
110         Chen
111         Sciarra
112         Urman
113         Popp
114         Raphaely

Hierarchical Queries

SELECT last_name, employee_id, manager_id, LEVEL
FROM employees
START WITH employee_id = 100
CONNECT BY PRIOR employee_id = manager_id
ORDER SIBLINGS BY last_name;

LAST_NAME EMPLOYEE_ID MANAGER_ID LEVEL
--------- ----------- ---------- -----
King      100                    1
Cambrault 148         100        2
Bates     172         148        3
Bloom     169         148        3
Fox       170         148        3
Kumar     173         148        3
Ozer      168         148        3
Smith     171         148        3
De Haan   102         100        2
Hunold    103         102        3
Austin    105         103        4
Ernst     104         103        4
Lorentz   107         103        4
Pataballa 106         103        4
Errazuriz 147         100        2
Ande      166         147        3
Banda     167         147        3
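The row order produced by START WITH, CONNECT BY PRIOR, and ORDER SIBLINGS BY corresponds to a depth-first walk in which the children of each node are sorted. The sketch below emulates that walk in Python; the function name `connect_by` is an assumption, and the dictionary holds only a subset of the rows from the output above.

```python
def connect_by(employees, root_id):
    """Depth-first walk from root_id, siblings ordered by last name.

    employees maps employee_id -> (last_name, manager_id).
    Returns rows of (last_name, employee_id, manager_id, level).
    """
    children = {}
    for emp_id, (last_name, manager_id) in employees.items():
        children.setdefault(manager_id, []).append(emp_id)

    def visit(emp_id, level):
        last_name, manager_id = employees[emp_id]
        yield last_name, emp_id, manager_id, level
        # ORDER SIBLINGS BY last_name: sort only among rows sharing a parent.
        for child_id in sorted(children.get(emp_id, []), key=lambda e: employees[e][0]):
            yield from visit(child_id, level + 1)

    return list(visit(root_id, 1))

# Subset of the employees shown above.
employees = {
    100: ("King", None), 148: ("Cambrault", 100), 102: ("De Haan", 100),
    147: ("Errazuriz", 100), 172: ("Bates", 148), 169: ("Bloom", 148),
    103: ("Hunold", 102), 105: ("Austin", 103), 104: ("Ernst", 103),
    166: ("Ande", 147),
}
for row in connect_by(employees, 100):
    print(row)
```

Sorting within each sibling group, rather than over the whole result, is what keeps every subtree contiguous beneath its manager.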

Retrospective Oracle SQL BI API

• Analytic Query vs. Correlated Subquery
• In some cases, it is possible to rewrite a correlated subquery as a typical analytical query and vice versa.
• Analytical queries often perform considerably better than the equivalent correlated subqueries. Although an aggregate is computed for each row in both scenarios, the parse, execute, and fetch cycle is generally less costly for analytical queries.

• Analytic Query vs. Correlated Subquery

Analytic query:

SELECT empno,
       ename,
       deptno,
       sal,
       MAX (sal) OVER (
         PARTITION BY deptno
         ORDER BY deptno, sal DESC
       ) AS max_sal
FROM scott.emp;

Correlated subquery:

SELECT e1.empno,
       e1.ename,
       e1.deptno,
       e1.sal,
       (
         SELECT MAX(e2.sal)
         FROM scott.emp e2
         WHERE e1.deptno = e2.deptno
         GROUP BY e2.deptno
       ) AS max_sal
FROM scott.emp e1
ORDER BY e1.deptno, e1.sal DESC;
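That the two formulations return the same rows can be checked on a small data set. The sketch below uses Python's built-in sqlite3 module as a stand-in for Oracle, assuming a SQLite build of version 3.25 or later (the first with window functions). The table holds a subset of the emp rows used elsewhere in this deck, and the analytic form is simplified to a plain PARTITION BY. It demonstrates equivalence of results only and says nothing about relative performance in Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, ename TEXT, deptno INTEGER, sal INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?, ?)", [
    (7782, "CLARK", 10, 2450), (7839, "KING", 10, 5000), (7934, "MILLER", 10, 1300),
    (7566, "JONES", 20, 2975), (7902, "FORD", 20, 3000), (7369, "SMITH", 20, 800),
])

# Analytic form: one pass, the department maximum is attached to every row.
analytic = conn.execute("""
    SELECT empno, deptno, sal,
           MAX(sal) OVER (PARTITION BY deptno) AS max_sal
    FROM emp
    ORDER BY deptno, sal DESC, empno""").fetchall()

# Correlated form: the subquery is logically re-evaluated for each outer row.
correlated = conn.execute("""
    SELECT e1.empno, e1.deptno, e1.sal,
           (SELECT MAX(e2.sal) FROM emp e2 WHERE e2.deptno = e1.deptno) AS max_sal
    FROM emp e1
    ORDER BY e1.deptno, e1.sal DESC, e1.empno""").fetchall()

print(analytic == correlated)
for row in analytic:
    print(row)
```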

Analytic Query vs. Correlated Subquery Demo

Analytic Query vs. Correlated Subquery Demo Explained


Cohesion and Coupling Concepts Revisited

Cohesion (BEST to WORST)
1. Functional Cohesion: exists if the different elements of a module cooperate to achieve a single function.
2. Sequential Cohesion: exists if the elements of a module form the components of a sequence, where the output from one component of the sequence is input to the next.
3. Communicational Cohesion: occurs if all tasks of the module refer to the same data structure, e.g., the set of functions defined on an array or a stack.
4. Procedural Cohesion: takes place when the purposes of the module are all parts of a procedure in which a particular sequence of steps has to be carried out to achieve a goal, e.g., the algorithm for decoding a message.
5. Temporal Cohesion: when a module includes functions that are associated by the fact that all the methods must be executed at the same time, the module is said to exhibit temporal cohesion.
6. Logical Cohesion: occurs when all the elements of the module perform a similar operation, for example, error handling, data input, and data output.
7. Coincidental Cohesion: exists if the module performs a set of tasks that are associated with each other very loosely, if at all.

Coupling (BEST to WORST)
1. No Direct Coupling: there is no direct coupling between modules. In this case, modules are subordinates to different modules.
2. Data Coupling: when data of one module is passed to another module, this is called data coupling.
3. Stamp Coupling: when modules communicate using composite data items such as structures, objects, etc. When a module passes a non-global data structure or an entire structure to another module, they are said to be stamp coupled. For example, passing a structure variable in Python or an object in Java to a module.
4. Control Coupling: when data from one module is used to direct the structure of instruction execution in another module.
5. External Coupling: when modules share an externally imposed data format, communication protocol, or device interface. This is related to communication with external tools and devices.
6. Common Coupling: when modules share information through some global data items.
7. Content Coupling: when modules share code, e.g., a branch from one module into another module.

SQL BI API Cohesion and Coupling

• SQL BI API provides strong cohesion and flexible coupling capabilities, which can be balanced accordingly by software engineering methodologies.
• Cohesion (intra-module binding) and coupling (inter-module binding) in the traditional, object-oriented, and aspect-oriented software engineering paradigms are pervasive in Cloud [software] engineering and applications.
• The Oracle SQL BI API often provides flexible functions that can be overloaded to support different types of actual parameters, and so do the PL/SQL extension functions in supplied packages.
• Java, PL/SQL, and Python can also provide further overloading and overriding capabilities for solving a given coupling problem.
• Coupling can integrate the Oracle SQL BI API with other Oracle technologies, such as machine learning and AI using R Enterprise, and the Oracle Message-Oriented Middleware (MOM) with Streams AQ.

SQL BI API Support for XML and JSON

SELECT columns
FROM tables
PIVOT [XML] (
  pivot_clause,
  pivot_for_clause,
  pivot_in_clause
);

SELECT *
FROM
  (SELECT product, channel, quantity_sold
   FROM sales_report
  )
  PIVOT XML(SUM(quantity_sold)
            FOR channel IN (ANY)
  );

Native JSON support is available and is quite applicable to, and congruent with, the SQL BI API.


Examples

Example 1

SELECT *
FROM stock_trades MATCH_RECOGNIZE (
     PARTITION BY symbol
     ORDER BY tstamp
     MEASURES FIRST (X.tstamp) AS in_hour_of_trade,
              SUM (X.volume) AS sum_of_large_volumes
     ONE ROW PER MATCH
     AFTER MATCH SKIP PAST LAST ROW
     PATTERN (X Y* X Y* X)
     DEFINE
        X AS ((X.volume > 50000) AND
              ((X.tstamp - FIRST (X.tstamp)) < '001:00:00.00' )),
        Y AS ((Y.volume <= 50000) AND
              ((Y.tstamp - FIRST (X.tstamp)) < '01:00:00.00')))

Example 2

SELECT last_name, salary
FROM (SELECT last_name,
             DENSE_RANK() OVER (ORDER BY salary DESC) rank_val,
             salary
      FROM employees)
WHERE rank_val BETWEEN 10 AND 20;


Example 3

CREATE MATERIALIZED VIEW sales_hierarchical_mon_cube_mv
PARTITION BY RANGE (mon)
SUBPARTITION BY LIST (gid)
REFRESH FAST ON DEMAND
ENABLE QUERY REWRITE AS
SELECT calendar_year yr, calendar_quarter_desc qtr, calendar_month_desc mon,
       country_id, cust_state_province, cust_city,
       prod_category, prod_subcategory, prod_name,
       GROUPING_ID(
         calendar_year, calendar_quarter_desc, calendar_month_desc,
         country_id, cust_state_province, cust_city,
         prod_category, prod_subcategory, prod_name
       ) gid,
       SUM(amount_sold) s_sales,
       COUNT(amount_sold) c_sales,
       COUNT(*) c_star
FROM sales s, products p, customers c, times t
WHERE s.cust_id = c.cust_id
  AND s.prod_id = p.prod_id
  AND s.time_id = t.time_id
GROUP BY calendar_year, calendar_quarter_desc, calendar_month_desc,
         ROLLUP(country_id, cust_state_province, cust_city),
         ROLLUP(prod_category, prod_subcategory, prod_name),
...;

Example 4

SELECT empno, ename, deptno, sal,
       COUNT(*) OVER (PARTITION BY deptno) AS amount_by_dept
FROM emp;

EMPNO ENAME  DEPTNO SAL  AMOUNT_BY_DEPT
----- ------ ------ ---- --------------
7782  CLARK  10     2450 3
7839  KING   10     5000 3
7934  MILLER 10     1300 3
7566  JONES  20     2975 5
7902  FORD   20     3000 5
7876  ADAMS  20     1100 5
7369  SMITH  20     800  5
7788  SCOTT  20     3000 5
7521  WARD   30     1250 6
7844  TURNER 30     1500 6
7499  ALLEN  30     1600 6
7900  JAMES  30     950  6
7698  BLAKE  30     2850 6
7654  MARTIN 30     1250 6


Example 5

CREATE TABLE pivotedTable AS
SELECT *
FROM (
  SELECT product, quarter, quantity_sold, amount_sold
  FROM sales_view
)
PIVOT (
  SUM(quantity_sold) AS sumq,
  SUM(amount_sold) AS suma
  FOR quarter IN ('01' AS Q1, '02' AS Q2, '03' AS Q3, '04' AS Q4)
);

SELECT *
FROM pivotedTable
ORDER BY product;

PRODUCT        Q1_SUMQ Q1_SUMA   Q2_SUMQ Q2_SUMA   Q3_SUMQ Q3_SUMA  Q4_SUMQ Q4_SUMA
-------------- ------- --------- ------- --------- ------- -------- ------- --------
1.44MB External 6098   58301.33  5112    49001.56  6050    56974.3  5848    55341.28
128MB Memory    1963   110763.63 2361    132123.12 3069    170710.4 2832    157736.6
. . .

SELECT product,
       DECODE(quarter, 'Q1_SUMQ', 'Q1', 'Q2_SUMQ', 'Q2',
                       'Q3_SUMQ', 'Q3', 'Q4_SUMQ', 'Q4') AS quarter,
       quantity_sold
FROM pivotedTable
UNPIVOT INCLUDE NULLS
  (quantity_sold FOR quarter IN (Q1_SUMQ, Q2_SUMQ, Q3_SUMQ, Q4_SUMQ))
ORDER BY product, quarter;

PRODUCT                       QU QUANTITY_SOLD
----------------------------- -- -------------
1.44MB External 3.5" Diskette Q1 6098
1.44MB External 3.5" Diskette Q2 5112
1.44MB External 3.5" Diskette Q3 6050
1.44MB External 3.5" Diskette Q4 5848
128MB Memory Card             Q1 1963
128MB Memory Card             Q2 2361
128MB Memory Card             Q3 3069
128MB Memory Card             Q4 2832
…

SELECT product, quarter, quantity_sold, amount_sold
FROM pivotedTable
UNPIVOT INCLUDE NULLS
  (
    (quantity_sold, amount_sold)
    FOR quarter IN (
      (Q1_SUMQ, Q1_SUMA) AS 'Q1',
      (Q2_SUMQ, Q2_SUMA) AS 'Q2',
      (Q3_SUMQ, Q3_SUMA) AS 'Q3',
      (Q4_SUMQ, Q4_SUMA) AS 'Q4')
  )
ORDER BY product, quarter;

PRODUCT                       QU QUANTITY_SOLD AMOUNT_SOLD
----------------------------- -- ------------- -----------
1.44MB External 3.5" Diskette Q1 6098          58301.33
1.44MB External 3.5" Diskette Q2 5112          49001.56
1.44MB External 3.5" Diskette Q3 6050          56974.3
1.44MB External 3.5" Diskette Q4 5848          55341.28
128MB Memory Card             Q1 1963          110763.63
128MB Memory Card             Q2 2361          132123.12
128MB Memory Card             Q3 3069          170710.4
128MB Memory Card             Q4 2832          157736.6


Example 6

SELECT deptno, job,
       APPROX_SUM(sal),
       APPROX_RANK(PARTITION BY deptno ORDER BY APPROX_SUM(sal) DESC) rk
FROM emp
GROUP BY deptno, job
HAVING
  APPROX_RANK(
    PARTITION BY deptno ORDER BY APPROX_SUM(sal) DESC
  ) <= 9;

DEPTNO JOB       APPROX_SUM(SAL) RK
------ --------- --------------- --
10     CLERK     1300            3
10     MANAGER   2450            2
10     PRESIDENT 5000            1
20     CLERK     1900            3
20     MANAGER   2975            2
20     ANALYST   6000            1
30     CLERK     950             3
30     MANAGER   2850            2
30     SALESMAN  5600            1


Example 7

SELECT cust_city,
       RANK(6000) WITHIN GROUP (ORDER BY CUST_CREDIT_LIMIT DESC) AS HRANK,
       TO_CHAR(PERCENT_RANK(6000) WITHIN GROUP
         (ORDER BY cust_credit_limit),'9.999') AS HPERC_RANK,
       TO_CHAR(CUME_DIST (6000) WITHIN GROUP
         (ORDER BY cust_credit_limit),'9.999') AS HCUME_DIST
FROM customers
WHERE cust_city LIKE 'Fo%'
GROUP BY cust_city;

CUST_CITY      HRANK HPERC_ HCUME_
-------------- ----- ------ ------
Fondettes      13    .455   .478
Fords Prairie  18    .320   .346
Forest City    47    .370   .378
Forest Heights 38    .456   .464
Forestville    58    .412   .418
Forrestcity    51    .438   .444
Fort Klamath   59    .356   .363
Fort William   30    .500   .508
Foxborough     52    .414   .420

Example 8

SELECT *
FROM (SELECT product, channel, amount_sold, quantity_sold
      FROM sales_view)
PIVOT (SUM(amount_sold) AS sums,
       SUM(quantity_sold) AS sumq
       FOR channel IN (5, 4, 2, 9))
ORDER BY product;

PRODUCT 5_SUMS 5_SUMQ 4_SUMS 4_SUMQ 2_SUMS 2_SUMQ 9_SUMS 9_SUMQ
------
O/S Doc Set English 142780.36 3081 381397.99 8044 6028.66 134
O/S Doc Set French 55503.58 1192 132000.77 2782 ...


Example 9

SELECT *
FROM Ticker MATCH_RECOGNIZE (
  PARTITION BY symbol
  ORDER BY tstamp
  MEASURES
    MATCH_NUMBER() AS match_num,
    CLASSIFIER() AS var_match,
    FINAL COUNT(UP.tstamp) AS up_days,
    FINAL COUNT(tstamp) AS total_days,
    RUNNING COUNT(tstamp) AS cnt_days,
    price - STRT.price AS price_dif
  ALL ROWS PER MATCH
  AFTER MATCH SKIP TO LAST UP
  PATTERN (STRT DOWN+ UP+)
  DEFINE
    DOWN AS DOWN.price < PREV(DOWN.price),
    UP AS UP.price > PREV(UP.price)
) MR
ORDER BY MR.symbol, MR.match_num, MR.tstamp;

SYMBOL TSTAMP    MATCH_NUM VAR_ UP_DAYS TOTAL_DAYS CNT_DAYS PRICE_DIF PRICE
------ --------- --------- ---- ------- ---------- -------- --------- -----
ACME   05-APR-11         1 STRT       4          6        1         0    25
ACME   06-APR-11         1 DOWN       4          6        2       -13    12
ACME   07-APR-11         1 UP         4          6        3       -10    15
ACME   08-APR-11         1 UP         4          6        4        -5    20
ACME   09-APR-11         1 UP         4          6        5        -1    24
ACME   10-APR-11         1 UP         4          6        6         0    25
ACME   10-APR-11         2 STRT       1          4        1         0    25
ACME   11-APR-11         2 DOWN       1          4        2        -6    19
ACME   12-APR-11         2 DOWN       1          4        3       -10    15
ACME   13-APR-11         2 UP         1          4        4         0    25
ACME   14-APR-11         3 STRT       2          5        1         0    25
ACME   15-APR-11         3 DOWN       2          5        2       -11    14
ACME   16-APR-11         3 DOWN       2          5        3       -13    12
ACME   17-APR-11         3 UP         2          5        4       -11    14
ACME   18-APR-11         3 UP         2          5        5        -1    24


Example 10

SELECT empno, ename, deptno, sal,
       SUM(sal) OVER (PARTITION BY deptno ORDER BY sal) AS run_total_sal_by_dept,
       SUM(sal) OVER (PARTITION BY deptno ORDER BY sal
                      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS row_run_total_sal_by_dept
FROM emp;

EMPNO ENAME  DEPTNO  SAL RUN_TOTAL_SAL_BY_DEPT ROW_RUN_TOTAL_SAL_BY_DEPT
----- ------ ------ ---- --------------------- -------------------------
 7934 MILLER     10 1300                  1300                      1300
 7782 CLARK      10 2450                  3750                      3750
 7839 KING       10 5000                  8750                      8750
 7369 SMITH      20  800                   800                       800
 7876 ADAMS      20 1100                  1900                      1900
 7566 JONES      20 2975                  4875                      4875
 7788 SCOTT      20 3000                 10875                      7875
 7902 FORD       20 3000                 10875                     10875
 7900 JAMES      30  950                   950                       950
 7654 MARTIN     30 1250                  3450                      2200
 7521 WARD       30 1250                  3450                      3450
 7844 TURNER     30 1500                  4950                      4950
 7499 ALLEN      30 1600                  6550                      6550
 7698 BLAKE      30 2850                  9400                      9400

Example 11

• Windowing Aggregate Functions With Physical Offsets

SELECT t.time_id,
       TO_CHAR(amount_sold, '9,999,999,999') AS INDIV_SALE,
       TO_CHAR(SUM(amount_sold) OVER (PARTITION BY t.time_id ORDER BY t.time_id
                                      ROWS UNBOUNDED PRECEDING), '9,999,999,999') AS CUM_SALES
FROM sales s, times t, customers c
WHERE s.time_id = t.time_id
  AND s.cust_id = c.cust_id
  AND t.time_id IN (TO_DATE('11-DEC-1999'), TO_DATE('12-DEC-1999'))
  AND c.cust_id BETWEEN 6500 AND 6600
ORDER BY t.time_id;

TIME_ID   INDIV_SALE CUM_SALES
--------- ---------- ---------
12-DEC-99         23        23
12-DEC-99          9        32
12-DEC-99         14        46
12-DEC-99         24        70
12-DEC-99         19        89

Example 12

SELECT channel_desc, country_iso_code,
       SUM(amount_sold) SALES$,
       RANK() OVER (PARTITION BY GROUPING_ID(channel_desc, country_iso_code)
                    ORDER BY SUM(amount_sold) DESC) AS RANK_PER_GROUP
FROM sales, customers, times, channels, countries
WHERE sales.time_id = times.time_id
  AND sales.cust_id = customers.cust_id
  AND countries.country_id = customers.country_id
  AND sales.channel_id = channels.channel_id
  AND channels.channel_desc IN ('Direct Sales', 'Internet')
  AND times.calendar_month_desc = '2000-07'
  AND country_iso_code IN ('GB', 'US', 'JP')
GROUP BY CUBE (channel_desc, country_iso_code);

CHANNEL_DESC CO    SALES$ RANK_PER_GROUP
------------ -- --------- --------------
Direct Sales US 616539.04              1
Direct Sales GB  83869.96              2
Internet     US  82595.71              3
Direct Sales JP  79047.78              4
Internet     JP   7103.39              5
Internet     GB   6477.98              6
Direct Sales    779456.78              1
Internet         96177.08              2
             US 699134.75              1
             GB  90347.94              2
             JP  86151.17              3
                875633.86              1

CONCLUDING REMARKS

• The SQL BI API has a robust software structure that provides both cohesion and coupling for database host languages such as Java, PL/SQL, and Python, and for cloud platforms such as Oracle, Microsoft, and Amazon.
• The evolution of the SQL BI API relies on a solid architecture supported by both US standards (e.g., ANSI) and global standards (e.g., ISO).
• This robustness provides significant flexibility to improve reliability and performance when upgrading or migrating application and system SQL BI code to use new features.
• Quality assurance procedures and testing standards, such as regression testing, SIT, UAT, A/B testing, or simple smoke testing, apply whenever they are mandated by regulatory compliance or required by organizational practices.
• The future SQL BI API opens doors to AI and machine learning functionality.

THANKS!

QUESTIONS AND ANSWERS
