Modern SQL: Evolution of a dinosaur

Markus Winand

Kraków, 9-11 May 2018 Still using Windows 3.1?

So why stick with SQL-92?

@ModernSQL - https://modern-sql.com/ @MarkusWinand SQL:1999 WITH (Common Expressions) WITH (non-recursive) The Problem

Nested queries are hard to read:

SELECT … FROM (SELECT … FROM t1 Understand JOIN (SELECT … FROM … this first ) a ON (…) ) b JOIN (SELECT … FROM … ) c ON (…) WITH (non-recursive) The Problem

Nested queries are hard to read:

SELECT … FROM (SELECT … FROM t1 Then this... JOIN (SELECT … FROM … ) a ON (…) ) b JOIN (SELECT … FROM … ) c ON (…) WITH (non-recursive) The Problem

Nested queries are hard to read:

SELECT … FROM (SELECT … FROM t1 JOIN (SELECT … FROM … ) a ON (…) ) b JOIN (SELECT … FROM … Then this... ) c ON (…) WITH (non-recursive) The Problem

Nested queries are hard to read:

SELECT … Finally the first line makes sense FROM (SELECT … FROM t1 JOIN (SELECT … FROM … ) a ON (…) ) b JOIN (SELECT … FROM … ) c ON (…) WITH (non-recursive) Since SQL:1999

CTEs are statement-scoped views: Keyword WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) WITH (non-recursive) Since SQL:1999

CTEs are statement-scoped views: Name of CTE and (here WITH optional) column names a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) WITH (non-recursive) Since SQL:1999

CTEs are statement-scoped views:

WITH a (c1, c2, c3) Definition AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) WITH (non-recursive) Since SQL:1999

CTEs are statement-scoped views:

WITH Introduces a (c1, c2, c3) another CTE AS (SELECT c1, c2, c3 FROM …), b (c4, …) Don't repeat AS (SELECT c4, … WITH FROM t1 JOIN a ON (…) ), c (…) WITH (non-recursive) Since SQL:1999

CTEs are statement-scoped views:

WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 May refer to JOIN a previous CTEs ON (…) ), c (…) WITH a (c1, c2, c3) WITH (non-recursive)AS (SELECT c1, c2, c3 FROM …), Since SQL:1999 b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), Third CTE c (…) AS (SELECT … FROM …) SELECT … FROM b JOIN c ON (…) WITH a (c1, c2, c3) WITH (non-recursive)AS (SELECT c1, c2, c3 FROM …), Since SQL:1999 b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) AS (SELECT … FROM …) No comma! SELECT … FROM b JOIN c ON (…) WITH a (c1, c2, c3) WITH (non-recursive)AS (SELECT c1, c2, c3 FROM …), Since SQL:1999 b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) AS (SELECT … FROM …) Main query SELECT … FROM b JOIN c ON (…) WITH (non-recursive) CTEs are statement-scoped views: Since SQL:1999

WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a Read ON (…) top down ), c (…) AS (SELECT … FROM …) SELECT … FROM b JOIN c ON (…) WITH (non-recursive) Use-Cases

‣ Literate SQL https://modern-sql.com/use-case/literate-sql

Organize SQL code to improve maintainability

‣ Assign column names https://modern-sql.com/use-case/naming-unnamed-columns

to tables produced by values or unnest. ‣ Overload tables (for testing)

with queries hide tables https://modern-sql.com/use-case/unit-tests-on-transient-data of the same name. WITH (non-recursive) In a Nutshell

WITH are the "private methods" of SQL WITH is a prefix to SELECT WITH queries are only visible in the SELECT they precede WITH in detail: https://modern-sql.com/feature/with WITH (non-recursive) Availability

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 5.1 10.2 MariaDB 8.0 MySQL 8.4 PostgreSQL 3.8.3[0] SQLite

7.0 DB2 LUW 9iR2 Oracle 2005[1] SQL Server [0]Only for top-level SELECT statements [1]Only allowed at the very begin of a statement. E.g. WITH...INSERT...SELECT. WITH RECURSIVE (Common Table Expressions) WITH RECURSIVE The Problem

(This page is intentionally left blank) WITH RECURSIVE The Problem

Coping with hierarchies in the Adjacency List Model[0]

CREATE TABLE t ( id NUMERIC NOT NULL, parent_id NUMERIC, … PRIMARY KEY (id) )

[0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties” WITH RECURSIVE The Problem

Coping with hierarchies in the Adjacency List Model[0]

SELECT * FROM t AS d0 LEFT JOIN t AS d1 WHERE d0.id = ? ON (d1.parent_id=d0.id) LEFT JOIN t AS d2 ON (d2.parent_id=d1.id)

[0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties” WITH RECURSIVE The Problem

Coping with hierarchies in the Adjacency List Model[0]

SELECT * FROM t AS d0 LEFT JOIN t AS d1 ON (d1.parent_id=d0.id) LEFT JOIN t AS d2 ON (d2.parent_id=d1.id) WHERE d0.id = ?

[0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties” WITH RECURSIVE Since SQL:1999

WITH RECURSIVE d (id, parent, …) AS (SELECT id, parent, … SELECT * FROM tbl FROM t AS d0 WHERE id = ? LEFT JOIN t AS d1 UNION ALL ON (d1.parent_id=d0.id) SELECT id, parent, … LEFT JOIN t AS d2 FROM d ON (d2.parent_id=d1.id) JOIN tbl WHERE d0.id = ? ON (tbl.parent=d.id) ) SELECT * FROM subtree WITH RECURSIVE Since SQL:1999

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]: Keyword WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte WITH RECURSIVE Since SQL:1999

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:

WITH RECURSIVE cte (n) Column list AS (SELECT 1 mandatory here UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte WITH RECURSIVE Since SQL:1999

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:

WITH RECURSIVE cte (n) AS (SELECT 1 Executed first UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte WITH RECURSIVE Since SQL:1999

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:

WITH RECURSIVE cte (n) Result sent there AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte WITH RECURSIVE Since SQL:1999

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:

WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 Result FROM cte visible WHERE n < 3) twice SELECT * FROM cte WITH RECURSIVE Since SQL:1999

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:

WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL Once it becomes n SELECT n+1 part of --- FROM cte the final 1 WHERE n < 3) result 2 SELECT * FROM cte 3 (3 rows) WITH RECURSIVE Since SQL:1999

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:

WITH RECURSIVE cte (n) AS (SELECT 1

UNION ALL Second n SELECT n+1 leg of --- FROM cte UNION 1 WHERE n < 3) is 2 SELECT * FROM cte executed 3 (3 rows) WITH RECURSIVE Since SQL:1999

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:

WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL It's a n SELECT n+1 loop! --- FROM cte 1 WHERE n < 3) 2 SELECT * FROM cte 3 (3 rows) WITH RECURSIVE Since SQL:1999

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:

WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL It's a n SELECT n+1 loop! --- FROM cte 1 WHERE n < 3) 2 SELECT * FROM cte 3 (3 rows) WITH RECURSIVE Since SQL:1999

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:

WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL It's a n SELECT n+1 loop! --- FROM cte 1 WHERE n < 3) 2 SELECT * FROM cte 3 (3 rows) WITH RECURSIVE Since SQL:1999

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:

WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL n SELECT n+1 n=3 --- FROM cte doesn't 1 WHERE n < 3) match 2 SELECT * FROM cte 3 (3 rows) WITH RECURSIVE Since SQL:1999

Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:

WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL n SELECT n+1 n=3 --- FROM cte doesn't 1 WHERE n < 3) match 2 SELECT * FROM cte Loop 3 terminates (3 rows) WITH RECURSIVE Availability

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 5.1 10.2 MariaDB 8.0 MySQL 8.4 PostgreSQL 3.8.3[0] SQLite

7.0 DB2 LUW 11gR2 Oracle 2005 SQL Server [0]Only for top-level SELECT statements SQL:2003 OVER and PARTITION BY OVER (PARTITION BY) The Problem

Two distinct concepts could not be used independently: ‣ Merge rows with the same key properties ‣ GROUP BY to specify key properties ‣ DISTINCT to use full row as key ‣ Aggregate data related rows ‣ Requires GROUP BY to segregate the rows ‣ COUNT, SUM, AVG, MIN, MAX to aggregate grouped rows OVER (PARTITION BY) The Problem

No ⇠ Aggregate ⇢ Yes

SELECT c1

No , c2 ⇢

FROM t

SELECT DISTINCT SELECT c1 Merge rows Merge c1 , SUM(c2) tot ⇠ , c2 FROM t

Yes Yes FROM t GROUP BY c1 OVER (PARTITION BY) The Problem

No ⇠ Aggregate ⇢ Yes

SELECT c1 SELECT c1

No , c2 , c2 , tot ⇢

FROM t FROM t SELECT c1 , SUM(c2) tot FROM t JOIN ( ) ta GROUP BY c1 ON (t.c1=ta.c1)

SELECT DISTINCT SELECT c1 Merge rows Merge c1 , SUM(c2) tot ⇠ , c2 FROM t

Yes Yes FROM t GROUP BY c1 OVER (PARTITION BY) The Problem

No ⇠ Aggregate ⇢ Yes

SELECT c1 SELECT c1

No , c2 , c2 , tot ⇢

FROM t FROM t SELECT c1 , SUM(c2) tot FROM t JOIN ( ) ta GROUP BY c1 ON (t.c1=ta.c1)

SELECT DISTINCT SELECT c1 Merge rows Merge c1 , SUM(c2) tot ⇠ , c2 FROM t

Yes Yes FROM t GROUP BY c1 OVER (PARTITION BY) Since SQL:2003

No ⇠ Aggregate ⇢ Yes

SELECT c1 SELECT c1

No , c2 , c2 ⇢

FROM t FROM t, SUM(c2) OVER (PARTITION BY c1) FROM t

SELECT DISTINCT SELECT c1 Merge rows Merge c1 , SUM(c2) tot ⇠ , c2 FROM t

Yes Yes FROM t GROUP BY c1 OVER (PARTITION BY) How it works

dep salary ts SELECT dep, 1 1000 6000 salary, 22 1000 6000 SUM(salary) 22 1000 6000 OVER() FROM emp 333 1000 6000 333 1000 6000 333 1000 6000 OVER (PARTITION BY) How it works

dep salary ts SELECT dep, 1 1000 6000 salary, 22 1000 6000 SUM(salary) 22 1000 6000 OVER() FROM emp 333 1000 6000 333 1000 6000 333 1000 6000 OVER (PARTITION BY) How it works

dep salary ts SELECT dep, 1 1000 6000 salary, 22 1000 6000 SUM(salary) 22 1000 6000 OVER() FROM emp 333 1000 6000 333 1000 6000 333 1000 6000 OVER (PARTITION BY) How it works

dep salary ts SELECT dep, 1 1000 6000 salary, 22 1000 6000 SUM(salary) 22 1000 6000 OVER() FROM emp 333 1000 6000 333 1000 6000 333 1000 6000 OVER (PARTITION BY) How it works

dep salary ts SELECT dep, 1 1000 6000 salary, 22 1000 6000 SUM(salary) 22 1000 6000 OVER() FROM emp 333 1000 6000 333 1000 6000 333 1000 6000 OVER (PARTITION BY) How it works

dep salary ts SELECT dep, 1 1000 1000 salary, 22 1000 2000 SUM(salary) 22 1000 2000 OVER() PARTITION BY dep) FROM emp 333 1000 3000 333 1000 3000 333 1000 3000 OVER and (Framing & Ranking) OVER (ORDER BY) The Problem

acnt id value balance SELECT id, value, 1 1 +10 +10 FROM transactions t 22 2 +20 +30 22 3 -10 +20 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 OVER (ORDER BY) The Problem

acnt id value balance SELECT id, value, 1 1 +10 +10 (SELECT SUM(value) 22 2 +20 +30 FROM transactions t2 22 3 -10 +20 WHERE t2.id <= t.id) FROM transactions t 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 OVER (ORDER BY) The Problem

acnt id value balance SELECT id, value, 1 1 +10 +10 (SELECT SUM(value) 22 2 +20 +30 FROM transactions t2 22 3 -10 +20 WHERE t2.id <= t.id) FROM transactions t 333 4 +50 +70 333 5 -30 +40 Range segregation (<=) not possible with 333 6 -20 +20 GROUP BY or PARTITION BY OVER (ORDER BY) Since SQL:2003

acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003

acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 ORDER BY id 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003

acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 ORDER BY id ROWS BETWEEN 333 4 +50 +70 UNBOUNDED PRECEDING 333 5 -30 +40 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003

acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 ORDER BY id ROWS BETWEEN 333 4 +50 +70 UNBOUNDED PRECEDING 333 5 -30 +40 AND CURRENT ROW 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003

acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 ORDER BY id ROWS BETWEEN 333 4 +50 +70 UNBOUNDED PRECEDING 333 5 -30 +40 AND CURRENT ROW 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003

acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 ORDER BY id ROWS BETWEEN 333 4 +50 +70 UNBOUNDED PRECEDING 333 5 -30 +40 AND CURRENT ROW 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003

acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 ORDER BY id ROWS BETWEEN 333 4 +50 +70 UNBOUNDED PRECEDING 333 5 -30 +40 AND CURRENT ROW 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003

acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 ORDER BY id ROWS BETWEEN 333 4 +50 +70 UNBOUNDED PRECEDING 333 5 -30 +40 AND CURRENT ROW 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003

acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 ORDER BY id ROWS BETWEEN 333 4 +50 +70 UNBOUNDED PRECEDING 333 5 -30 +40 AND CURRENT ROW 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003

acnt id value balance SELECT id, value, 1 1 +10 SUM(value) 22 2 +20 OVER ( 22 3 -10 ORDER BY id ROWS BETWEEN 333 4 +50 UNBOUNDED PRECEDING 333 5 -30 AND CURRENT ROW 333 6 -20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003

acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +20 OVER ( PARTITION BY acnt 22 3 -10 +10 ORDER BY id ROWS BETWEEN 333 4 +50 +50 UNBOUNDED PRECEDING 333 5 -30 +20 AND CURRENT ROW

333 6 -20 . 0 ) FROM transactions t OVER (ORDER BY) Since SQL:2003

With OVER (ORDER BY n) a new type of functions make sense:

n ROW_NUMBER RANK DENSE_RANK PERCENT_RANK CUME_DIST 1 1 1 1 0 0.25 2 2 2 2 0.33… 0.75 3 3 2 2 0.33… 0.75 4 4 4 3 1 1 OVER (SQL:2003) Use Cases

‣ Aggregates without GROUP BY

‣ Running totals, AVG(…) OVER(ORDER BY … ROWS BETWEEN 3 PRECEDING moving averages AND 3 FOLLOWING) moving_avg

‣ Ranking SELECT * FROM (SELECT ROW_NUMBER() ‣ Top-N per Group OVER(PARTITION BY … ORDER BY …) rn , t.* Avoiding self-joins FROM t) numbered_t ‣ WHERE rn <= 3 [… many more …] OVER (SQL:2003) In a Nutshell

OVER may follow any aggregate function

OVER defines which rows are visible at each row

OVER() makes all rows visible at every row

OVER(PARTITION BY …) segregates like GROUP BY

OVER(ORDER BY … BETWEEN) segregates using <, > Impala OVER (SQL:2003) Availability Hive Spark

NuoDB 1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 5.1 10.2 MariaDB 8.0 MySQL 8.4 PostgreSQL SQLite

7.0 DB2 LUW 8i Oracle 2005 SQL Server SQL:2006 XMLTABLE XMLTABLE Since SQL:2006

SELECT id Stored in tbl.x: , c1 * , n XPath expression to identify rows FROM tbl , XMLTABLE( '/d/e' PASSING x COLUMNS id INT PATH '@id' , c1 VARCHAR(255) PATH 'c1' , n FOR ORDINALITY ) r *Standard SQL allows XQuery XMLTABLE Since SQL:2006

SELECT id Stored in tbl.x: , c1 , n FROM tbl , XMLTABLE( '/d/e' PASSING x COLUMNS id INT PATH '@id' , c1 VARCHAR(255) PATH 'c1' , n FOR ORDINALITY ) r *Standard SQL allows XQuery XMLTABLE Since SQL:2006

SELECT id Stored in tbl.x: , c1 , n FROM tbl , XMLTABLE( XPath* expressions '/d/e' to extract data PASSING x COLUMNS id INT PATH '@id' , c1 VARCHAR(255) PATH 'c1' , n FOR ORDINALITY ) r *Standard SQL allows XQuery XMLTABLE Since SQL:2006

SELECT id Stored in tbl.x: , c1 , n FROM tbl , XMLTABLE( '/d/e' PASSING x Row number COLUMNS id INT PATH '@id' (like for unnest) , c1 VARCHAR(255) PATH 'c1' , n FOR ORDINALITY ) r *Standard SQL allows XQuery XMLTABLE Since SQL:2006

SELECT id Result Stored in tbl.x: , c1 id | c1 | n , n ----+----+--- FROM tbl 42 | … | 1 , XMLTABLE( '/d/e' PASSING x COLUMNS id INT PATH '@id' , c1 VARCHAR(255) PATH 'c1' , n FOR ORDINALITY ) r *Standard SQL allows XQuery XMLTABLE Availability

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 MariaDB MySQL 10[0] PostgreSQL SQLite

9.7 DB2 LUW 11gR1 Oracle SQL Server [0]No XQuery (only XPath). No default namespace declaration. SQL:2008 FETCH FIRST FETCH FIRST The Problem

Limit the result to a number of rows. (LIMIT, TOP and ROWNUM are all proprietary)

SELECT * FROM (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn FROM data) numbered_data WHERE rn <=10 SQL:2003 introduced ROW_NUMBER() to number rows. But this still requires wrapping to limit the result. And how about not supporting ROW_NUMBER()? FETCH FIRST The Problem

Limit the result to a number of rows. (LIMIT, TOP and ROWNUM are all proprietary)

SELECT * Dammit! FROM (SELECT * Let's take , LIMITROW_NUMBER() OVER(ORDER BY x) rn FROM data) numbered_data WHERE rn <=10 SQL:2003 introduced ROW_NUMBER() to number rows. But this still requires wrapping to limit the result. And how about databases not supporting ROW_NUMBER()? FETCH FIRST Since SQL:2008

SQL:2008 introduced the FETCH FIRST … ROWS ONLY clause:

SELECT * FROM data ORDER BY x FETCH FIRST 10 ROWS ONLY FETCH FIRST Availability

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 5.1 MariaDB 3.19.3[0] MySQL 6.5[1] 8.4 PostgreSQL 2.1.0[1] SQLite

7.0 DB2 LUW 12cR1 Oracle 7.0[2] 2012 SQL Server [0]Earliest mention of LIMIT. Probably inherited from mSQL [1]Functionality available using LIMIT [2]SELECT TOP n ... SQL Server 2000 also supports expressions and bind parameters SQL:2011 OFFSET OFFSET The Problem

How to fetch the rows after a limit? (pagination anybody?)

SELECT * FROM (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn FROM data) numbered_data WHERE rn > 10 and rn <= 20 OFFSET Since SQL:2011

SQL:2011 introduced OFFSET, unfortunately!

SELECT * FROM data ORDER BY x OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY OFFSET Since SQL:2011

SQL:2011 introduced OFFSET, unfortunately!

SELECT * FROM data OFFSET ORDER BY x OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY Grab coasters & stickers! https://use-the-index-luke.com/no-offset OFFSET Since SQL:2011

1999 2001 2003 2005 2007 2009 2011 2013 2015 5.1 MariaDB 3.20.3[0] 4.0.6[1] MySQL 6.5 PostgreSQL 2.1.0 SQLite

9.7[2] 11.1 DB2 LUW 12c Oracle 2012 SQL Server [0]LIMIT [offset,] limit: "With this it's easy to do a poor man's next page/previous page WWW application." [1]The release notes say "Added PostgreSQL compatible LIMIT syntax" [2]Requires enabling the MySQL compatibility vector: db2set DB2_COMPATIBILITY_VECTOR=MYS OVER OVER (SQL:2011) The Problem

Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) WITH numbered_t AS (SELECT * FROM t

) SELECT curr.* curr , curr.balance balance … rn - COALESCE(prev.balance,0) 50 … 1 FROM numbered_t curr 90 … 2 LEFT JOIN numbered_t prev 70 … 3 ON (curr.rn = prev.rn+1) 30 … 4 OVER (SQL:2011) The Problem

Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) WITH numbered_t AS (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn

)FROM t SELECT curr.* curr , curr.balance balance … rn - COALESCE(prev.balance,0) 50 … 1 FROM numbered_t curr 90 … 2 LEFT JOIN numbered_t prev 70 … 3 ON (curr.rn = prev.rn+1) 30 … 4 OVER (SQL:2011) The Problem

Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) WITH numbered_t AS (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn

)FROM t SELECT curr.* curr , curr.balance balance … rn - COALESCE(prev.balance,0) 50 … 1 FROM numbered_t curr 90 … 2 LEFT JOIN numbered_t prev 70 … 3 ON (curr.rn = prev.rn+1) 30 … 4 OVER (SQL:2011) The Problem

Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) WITH numbered_t AS (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn

)FROM t SELECT curr.* curr prev , curr.balance balance … rn balance … rn - COALESCE(prev.balance,0) 50 … 1 50 … 1 FROM numbered_t curr 90 … 2 90 … 2 LEFT JOIN numbered_t prev 70 … 3 70 … 3 ON (curr.rn = prev.rn+1) 30 … 4 30 … 4 OVER (SQL:2011) The Problem

Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) WITH numbered_t AS (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn

)FROM t SELECT curr.* curr prev , curr.balance balance … rn balance … rn - COALESCE(prev.balance,0) 50 … 1 FROM numbered_t curr 90 … 2 50 … 1 LEFT JOIN numbered_t prev 70 … 3 90 … 2 ON (curr.rn = prev.rn+1) 30 … 4 70 … 3 30 … 4 OVER (SQL:2011) The Problem

Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) WITH numbered_t AS (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn

)FROM t SELECT curr.* curr prev , curr.balance balance … rn balance … rn - COALESCE(prev.balance,0) 50 … 1 +50 FROM numbered_t curr 90 … 2 50 … 1 +40 LEFT JOIN numbered_t prev 70 … 3 90 … 2 -20 ON (curr.rn = prev.rn+1) 30 … 4 70 … 3 -40 30 … 4 OVER (SQL:2011) Since SQL:2011

SQL:2011 introduced LEAD, LAG, NTH_VALUE, … for that: SELECT *, balance - COALESCE( LAG(balance) OVER(ORDER BY x) , 0) FROM t Available functions: LEAD / LAG FIRST_VALUE / LAST_VALUE NTH_VALUE(col, n) FROM FIRST/LAST RESPECT/IGNORE NULLS OVER (LEAD, LAG, …) Since SQL:2011

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 5.1 10.2[0] MariaDB 8.0[0] MySQL 8.4[0] PostgreSQL SQLite

9.5[1] 11.1 DB2 LUW 8i[1] 11gR2 Oracle 2012[1] SQL Server [0]No IGNORE NULLS and FROM LAST [1]No NTH_VALUE System Versioning (Time Traveling) System Versioning The Problem INSERT UPDATE DELETE are DESTRUCTIVE System Versioning Since SQL:2011

Table can be system versioned, application versioned or both.

CREATE TABLE t (..., start_ts TIMESTAMP(9) GENERATED ALWAYS AS ROW START, end_ts TIMESTAMP(9) GENERATED ALWAYS AS ROW END,

PERIOD FOR SYSTEM_TIME (start_ts, end_ts) ) WITH SYSTEM VERSIONING System Versioning Since SQL:2011

INSERT ... (ID, DATA) VALUES (1, 'X')

ID Data start_ts end_ts 1 X 10:00:00

UPDATE ... SET DATA = 'Y' ...

ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00

DELETE ... WHERE ID = 1 INSERT ... (ID, DATA) VALUES (1, 'X')

ID Data start_ts end_ts System Versioning1 X 10:00:00 Since SQL:2011

UPDATE ... SET DATA = 'Y' ...

ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00

DELETE ... WHERE ID = 1

ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00 12:00:00 System Versioning Since SQL:2011

ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00 12:00:00 Although multiple versions exist, only the “current” one is visible per default.

After 12:00:00, SELECT * FROM t doesn’t return anything anymore. System Versioning Since SQL:2011

ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00 12:00:00 With FOR … AS OF you can query anything you like: SELECT * FROM t FOR SYSTEM_TIME AS OF TIMESTAMP '2015-04-02 10:30:00'

ID Data start_ts end_ts 1 X 10:00:00 11:00:00 System Versioning Since SQL:2011

1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 5.1 MariaDB[0] MySQL PostgreSQL SQLite

10.1[1] DB2 LUW 10gR1[2] Oracle 2016 SQL Server [0]Available in MariaDB 10.3 beta. [1]Third column required (tx id), history table required. [2]Functionality available using Flashback SQL:2016 (released: 2016-12-15) LISTAGG LISTAGG Since SQL:2016

grp val SELECT grp grp val 1 B , LISTAGG(val, ', ') 1 A WITHIN GROUP (ORDER BY val) 1 A, B, C 1 C FROM t 2 X GROUP BY grp 2 X Default

LISTAGG(val, ', ' ON OVERFLOW ERROR) Default LISTAGG(val, ', ' ON OVERFLOW TRUNCATE '...' WITHOUT COUNT) ➔ 'A, B, ...' LISTAGG(val, ', ' ON OVERFLOW TRUNCATE '...' WITH COUNT) ➔ 'A, B, ...(1)' LISTAGG Availability

1999 2001 2003 2005 2007 2009 2011 2013 2015 5.1[0] MariaDB 4.1[0] MySQL 7.4[1] 8.4[2]9.0[3] PostgreSQL 3.5.4[4] SQLite

10.5[5] DB2 LUW 11gR1 12cR2 Oracle SQL Server[6] [0] [4] [0] group_concat group_concat w/o ORDER BY [1] group_concat [5] [1] array_to_string No ON OVERFLOW clause [2] array_to_string [6] [2] array_agg string_agg announced for vNext [3] array_agg [3] string_aggstring_agg [4]group_concat without ORDER BY [5]No ON OVERFLOW clause [6]string_agg announced for vNext New in SQL:2016 JSON LISTAGG https://modern-sql.com/feature/listagg ROW PATTERN MATCHING https://www.slideshare.net/MarkusWinand/row-pattern-matching-in-sql2016 DATE FORMAT POLYMORPHIC TABLE FUNCTIONS

➔ https://modern-sql.com/blog/2017-06/whats-new-in-sql-2016 Modern SQL? @MarkusWinand

SQL has evolved beyond the relational idea. Modern SQL? @MarkusWinand

SQL has evolved beyond the relational idea.

If you are using SQL like 25 years ago, you are doing it wrong! Modern SQL? @MarkusWinand

SQL has evolved beyond the relational idea.

If you are using SQL like 25 years ago, you are doing it wrong!

A lot has happened since SQL-92. https://www.flickr.com/photos/mfoubister/25367243054/ I have shown you a few features today

https://www.flickr.com/photos/mfoubister/25367243054/ I have shown you a few features today

There are hundreds more to discover

https://www.flickr.com/photos/mfoubister/25367243054/ @ModernSQL modern-sql.com

My other website: https://use-the-index-luke.com

Training & co: https://winand.at/