Modern SQL: Evolution of a dinosaur
Markus Winand
Kraków, 9-11 May 2018 Still using Windows 3.1?
So why stick with SQL-92?
@ModernSQL - https://modern-sql.com/ @MarkusWinand SQL:1999 WITH (Common Table Expressions) WITH (non-recursive) The Problem
Nested queries are hard to read:
SELECT … FROM (SELECT … FROM t1 Understand JOIN (SELECT … FROM … this first ) a ON (…) ) b JOIN (SELECT … FROM … ) c ON (…) WITH (non-recursive) The Problem
Nested queries are hard to read:
SELECT … FROM (SELECT … FROM t1 Then this... JOIN (SELECT … FROM … ) a ON (…) ) b JOIN (SELECT … FROM … ) c ON (…) WITH (non-recursive) The Problem
Nested queries are hard to read:
SELECT … FROM (SELECT … FROM t1 JOIN (SELECT … FROM … ) a ON (…) ) b JOIN (SELECT … FROM … Then this... ) c ON (…) WITH (non-recursive) The Problem
Nested queries are hard to read:
SELECT … Finally the first line makes sense FROM (SELECT … FROM t1 JOIN (SELECT … FROM … ) a ON (…) ) b JOIN (SELECT … FROM … ) c ON (…) WITH (non-recursive) Since SQL:1999
CTEs are statement-scoped views: Keyword WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) WITH (non-recursive) Since SQL:1999
CTEs are statement-scoped views: Name of CTE and (here WITH optional) column names a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) WITH (non-recursive) Since SQL:1999
CTEs are statement-scoped views:
WITH a (c1, c2, c3) Definition AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) WITH (non-recursive) Since SQL:1999
CTEs are statement-scoped views:
WITH Introduces a (c1, c2, c3) another CTE AS (SELECT c1, c2, c3 FROM …), b (c4, …) Don't repeat AS (SELECT c4, … WITH FROM t1 JOIN a ON (…) ), c (…) WITH (non-recursive) Since SQL:1999
CTEs are statement-scoped views:
WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 May refer to JOIN a previous CTEs ON (…) ), c (…) WITH a (c1, c2, c3) WITH (non-recursive)AS (SELECT c1, c2, c3 FROM …), Since SQL:1999 b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), Third CTE c (…) AS (SELECT … FROM …) SELECT … FROM b JOIN c ON (…) WITH a (c1, c2, c3) WITH (non-recursive)AS (SELECT c1, c2, c3 FROM …), Since SQL:1999 b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) AS (SELECT … FROM …) No comma! SELECT … FROM b JOIN c ON (…) WITH a (c1, c2, c3) WITH (non-recursive)AS (SELECT c1, c2, c3 FROM …), Since SQL:1999 b (c4, …) AS (SELECT c4, … FROM t1 JOIN a ON (…) ), c (…) AS (SELECT … FROM …) Main query SELECT … FROM b JOIN c ON (…) WITH (non-recursive) CTEs are statement-scoped views: Since SQL:1999
WITH a (c1, c2, c3) AS (SELECT c1, c2, c3 FROM …), b (c4, …) AS (SELECT c4, … FROM t1 JOIN a Read ON (…) top down ), c (…) AS (SELECT … FROM …) SELECT … FROM b JOIN c ON (…) WITH (non-recursive) Use-Cases
‣ Literate SQL https://modern-sql.com/use-case/literate-sql
Organize SQL code to improve maintainability
‣ Assign column names https://modern-sql.com/use-case/naming-unnamed-columns
to tables produced by values or unnest. ‣ Overload tables (for testing)
with queries hide tables https://modern-sql.com/use-case/unit-tests-on-transient-data of the same name. WITH (non-recursive) In a Nutshell
WITH are the "private methods" of SQL WITH is a prefix to SELECT WITH queries are only visible in the SELECT they precede WITH in detail: https://modern-sql.com/feature/with WITH (non-recursive) Availability
1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 5.1 10.2 MariaDB 8.0 MySQL 8.4 PostgreSQL 3.8.3[0] SQLite
7.0 DB2 LUW 9iR2 Oracle 2005[1] SQL Server [0]Only for top-level SELECT statements [1]Only allowed at the very begin of a statement. E.g. WITH...INSERT...SELECT. WITH RECURSIVE (Common Table Expressions) WITH RECURSIVE The Problem
(This page is intentionally left blank) WITH RECURSIVE The Problem
Coping with hierarchies in the Adjacency List Model[0]
CREATE TABLE t ( id NUMERIC NOT NULL, parent_id NUMERIC, … PRIMARY KEY (id) )
[0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties” WITH RECURSIVE The Problem
Coping with hierarchies in the Adjacency List Model[0]
SELECT * FROM t AS d0 LEFT JOIN t AS d1 WHERE d0.id = ? ON (d1.parent_id=d0.id) LEFT JOIN t AS d2 ON (d2.parent_id=d1.id)
[0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties” WITH RECURSIVE The Problem
Coping with hierarchies in the Adjacency List Model[0]
SELECT * FROM t AS d0 LEFT JOIN t AS d1 ON (d1.parent_id=d0.id) LEFT JOIN t AS d2 ON (d2.parent_id=d1.id) WHERE d0.id = ?
[0] Hierarchies implemented using a “parent id” — see “Joe Celko’s Trees and Hierarchies in SQL for Smarties” WITH RECURSIVE Since SQL:1999
WITH RECURSIVE d (id, parent, …) AS (SELECT id, parent, … SELECT * FROM tbl FROM t AS d0 WHERE id = ? LEFT JOIN t AS d1 UNION ALL ON (d1.parent_id=d0.id) SELECT id, parent, … LEFT JOIN t AS d2 FROM d ON (d2.parent_id=d1.id) JOIN tbl WHERE d0.id = ? ON (tbl.parent=d.id) ) SELECT * FROM subtree WITH RECURSIVE Since SQL:1999
Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]: Keyword WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte WITH RECURSIVE Since SQL:1999
Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:
WITH RECURSIVE cte (n) Column list AS (SELECT 1 mandatory here UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte WITH RECURSIVE Since SQL:1999
Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:
WITH RECURSIVE cte (n) AS (SELECT 1 Executed first UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte WITH RECURSIVE Since SQL:1999
Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:
WITH RECURSIVE cte (n) Result sent there AS (SELECT 1 UNION ALL SELECT n+1 FROM cte WHERE n < 3) SELECT * FROM cte WITH RECURSIVE Since SQL:1999
Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:
WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL SELECT n+1 Result FROM cte visible WHERE n < 3) twice SELECT * FROM cte WITH RECURSIVE Since SQL:1999
Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:
WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL Once it becomes n SELECT n+1 part of --- FROM cte the final 1 WHERE n < 3) result 2 SELECT * FROM cte 3 (3 rows) WITH RECURSIVE Since SQL:1999
Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:
WITH RECURSIVE cte (n) AS (SELECT 1
UNION ALL Second n SELECT n+1 leg of --- FROM cte UNION 1 WHERE n < 3) is 2 SELECT * FROM cte executed 3 (3 rows) WITH RECURSIVE Since SQL:1999
Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:
WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL It's a n SELECT n+1 loop! --- FROM cte 1 WHERE n < 3) 2 SELECT * FROM cte 3 (3 rows) WITH RECURSIVE Since SQL:1999
Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:
WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL It's a n SELECT n+1 loop! --- FROM cte 1 WHERE n < 3) 2 SELECT * FROM cte 3 (3 rows) WITH RECURSIVE Since SQL:1999
Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:
WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL It's a n SELECT n+1 loop! --- FROM cte 1 WHERE n < 3) 2 SELECT * FROM cte 3 (3 rows) WITH RECURSIVE Since SQL:1999
Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:
WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL n SELECT n+1 n=3 --- FROM cte doesn't 1 WHERE n < 3) match 2 SELECT * FROM cte 3 (3 rows) WITH RECURSIVE Since SQL:1999
Recursive common table expressions may refer to themselves in a leg of a UNION [ALL]:
WITH RECURSIVE cte (n) AS (SELECT 1 UNION ALL n SELECT n+1 n=3 --- FROM cte doesn't 1 WHERE n < 3) match 2 SELECT * FROM cte Loop 3 terminates (3 rows) WITH RECURSIVE Availability
1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 5.1 10.2 MariaDB 8.0 MySQL 8.4 PostgreSQL 3.8.3[0] SQLite
7.0 DB2 LUW 11gR2 Oracle 2005 SQL Server [0]Only for top-level SELECT statements SQL:2003 OVER and PARTITION BY OVER (PARTITION BY) The Problem
Two distinct concepts could not be used independently: ‣ Merge rows with the same key properties ‣ GROUP BY to specify key properties ‣ DISTINCT to use full row as key ‣ Aggregate data from related rows ‣ Requires GROUP BY to segregate the rows ‣ COUNT, SUM, AVG, MIN, MAX to aggregate grouped rows OVER (PARTITION BY) The Problem
No ⇠ Aggregate ⇢ Yes
SELECT c1
No , c2 ⇢
FROM t
SELECT DISTINCT SELECT c1 Merge rows Merge c1 , SUM(c2) tot ⇠ , c2 FROM t
Yes Yes FROM t GROUP BY c1 OVER (PARTITION BY) The Problem
No ⇠ Aggregate ⇢ Yes
SELECT c1 SELECT c1
No , c2 , c2 , tot ⇢
FROM t FROM t SELECT c1 , SUM(c2) tot FROM t JOIN ( ) ta GROUP BY c1 ON (t.c1=ta.c1)
SELECT DISTINCT SELECT c1 Merge rows Merge c1 , SUM(c2) tot ⇠ , c2 FROM t
Yes Yes FROM t GROUP BY c1 OVER (PARTITION BY) The Problem
No ⇠ Aggregate ⇢ Yes
SELECT c1 SELECT c1
No , c2 , c2 , tot ⇢
FROM t FROM t SELECT c1 , SUM(c2) tot FROM t JOIN ( ) ta GROUP BY c1 ON (t.c1=ta.c1)
SELECT DISTINCT SELECT c1 Merge rows Merge c1 , SUM(c2) tot ⇠ , c2 FROM t
Yes Yes FROM t GROUP BY c1 OVER (PARTITION BY) Since SQL:2003
No ⇠ Aggregate ⇢ Yes
SELECT c1 SELECT c1
No , c2 , c2 ⇢
FROM t FROM t, SUM(c2) OVER (PARTITION BY c1) FROM t
SELECT DISTINCT SELECT c1 Merge rows Merge c1 , SUM(c2) tot ⇠ , c2 FROM t
Yes Yes FROM t GROUP BY c1 OVER (PARTITION BY) How it works
dep salary ts SELECT dep, 1 1000 6000 salary, 22 1000 6000 SUM(salary) 22 1000 6000 OVER() FROM emp 333 1000 6000 333 1000 6000 333 1000 6000 OVER (PARTITION BY) How it works
dep salary ts SELECT dep, 1 1000 6000 salary, 22 1000 6000 SUM(salary) 22 1000 6000 OVER() FROM emp 333 1000 6000 333 1000 6000 333 1000 6000 OVER (PARTITION BY) How it works
dep salary ts SELECT dep, 1 1000 6000 salary, 22 1000 6000 SUM(salary) 22 1000 6000 OVER() FROM emp 333 1000 6000 333 1000 6000 333 1000 6000 OVER (PARTITION BY) How it works
dep salary ts SELECT dep, 1 1000 6000 salary, 22 1000 6000 SUM(salary) 22 1000 6000 OVER() FROM emp 333 1000 6000 333 1000 6000 333 1000 6000 OVER (PARTITION BY) How it works
dep salary ts SELECT dep, 1 1000 6000 salary, 22 1000 6000 SUM(salary) 22 1000 6000 OVER() FROM emp 333 1000 6000 333 1000 6000 333 1000 6000 OVER (PARTITION BY) How it works
dep salary ts SELECT dep, 1 1000 1000 salary, 22 1000 2000 SUM(salary) 22 1000 2000 OVER() PARTITION BY dep) FROM emp 333 1000 3000 333 1000 3000 333 1000 3000 OVER and ORDER BY (Framing & Ranking) OVER (ORDER BY) The Problem
acnt id value balance SELECT id, value, 1 1 +10 +10 FROM transactions t 22 2 +20 +30 22 3 -10 +20 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 OVER (ORDER BY) The Problem
acnt id value balance SELECT id, value, 1 1 +10 +10 (SELECT SUM(value) 22 2 +20 +30 FROM transactions t2 22 3 -10 +20 WHERE t2.id <= t.id) FROM transactions t 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 OVER (ORDER BY) The Problem
acnt id value balance SELECT id, value, 1 1 +10 +10 (SELECT SUM(value) 22 2 +20 +30 FROM transactions t2 22 3 -10 +20 WHERE t2.id <= t.id) FROM transactions t 333 4 +50 +70 333 5 -30 +40 Range segregation (<=) not possible with 333 6 -20 +20 GROUP BY or PARTITION BY OVER (ORDER BY) Since SQL:2003
acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003
acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 ORDER BY id 333 4 +50 +70 333 5 -30 +40 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003
acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 ORDER BY id ROWS BETWEEN 333 4 +50 +70 UNBOUNDED PRECEDING 333 5 -30 +40 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003
acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 ORDER BY id ROWS BETWEEN 333 4 +50 +70 UNBOUNDED PRECEDING 333 5 -30 +40 AND CURRENT ROW 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003
acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 ORDER BY id ROWS BETWEEN 333 4 +50 +70 UNBOUNDED PRECEDING 333 5 -30 +40 AND CURRENT ROW 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003
acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 ORDER BY id ROWS BETWEEN 333 4 +50 +70 UNBOUNDED PRECEDING 333 5 -30 +40 AND CURRENT ROW 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003
acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 ORDER BY id ROWS BETWEEN 333 4 +50 +70 UNBOUNDED PRECEDING 333 5 -30 +40 AND CURRENT ROW 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003
acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 ORDER BY id ROWS BETWEEN 333 4 +50 +70 UNBOUNDED PRECEDING 333 5 -30 +40 AND CURRENT ROW 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003
acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +30 OVER ( 22 3 -10 +20 ORDER BY id ROWS BETWEEN 333 4 +50 +70 UNBOUNDED PRECEDING 333 5 -30 +40 AND CURRENT ROW 333 6 -20 +20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003
acnt id value balance SELECT id, value, 1 1 +10 SUM(value) 22 2 +20 OVER ( 22 3 -10 ORDER BY id ROWS BETWEEN 333 4 +50 UNBOUNDED PRECEDING 333 5 -30 AND CURRENT ROW 333 6 -20 ) FROM transactions t OVER (ORDER BY) Since SQL:2003
acnt id value balance SELECT id, value, 1 1 +10 +10 SUM(value) 22 2 +20 +20 OVER ( PARTITION BY acnt 22 3 -10 +10 ORDER BY id ROWS BETWEEN 333 4 +50 +50 UNBOUNDED PRECEDING 333 5 -30 +20 AND CURRENT ROW
333 6 -20 . 0 ) FROM transactions t OVER (ORDER BY) Since SQL:2003
With OVER (ORDER BY n) a new type of functions make sense:
n ROW_NUMBER RANK DENSE_RANK PERCENT_RANK CUME_DIST 1 1 1 1 0 0.25 2 2 2 2 0.33… 0.75 3 3 2 2 0.33… 0.75 4 4 4 3 1 1 OVER (SQL:2003) Use Cases
‣ Aggregates without GROUP BY
‣ Running totals, AVG(…) OVER(ORDER BY … ROWS BETWEEN 3 PRECEDING moving averages AND 3 FOLLOWING) moving_avg
‣ Ranking SELECT * FROM (SELECT ROW_NUMBER() ‣ Top-N per Group OVER(PARTITION BY … ORDER BY …) rn , t.* Avoiding self-joins FROM t) numbered_t ‣ WHERE rn <= 3 [… many more …] OVER (SQL:2003) In a Nutshell
OVER may follow any aggregate function
OVER defines which rows are visible at each row
OVER() makes all rows visible at every row
OVER(PARTITION BY …) segregates like GROUP BY
OVER(ORDER BY … BETWEEN) segregates using <, > Impala OVER (SQL:2003) Availability Hive Spark
NuoDB 1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 5.1 10.2 MariaDB 8.0 MySQL 8.4 PostgreSQL SQLite
7.0 DB2 LUW 8i Oracle 2005 SQL Server SQL:2006 XMLTABLE XMLTABLE Since SQL:2006
SELECT id Stored in tbl.x: , c1 *
SELECT id Stored in tbl.x: , c1
SELECT id Stored in tbl.x: , c1
SELECT id Stored in tbl.x: , c1
SELECT id Result Stored in tbl.x: , c1 id | c1 | n
1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 MariaDB MySQL 10[0] PostgreSQL SQLite
9.7 DB2 LUW 11gR1 Oracle SQL Server [0]No XQuery (only XPath). No default namespace declaration. SQL:2008 FETCH FIRST FETCH FIRST The Problem
Limit the result to a number of rows. (LIMIT, TOP and ROWNUM are all proprietary)
SELECT * FROM (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn FROM data) numbered_data WHERE rn <=10 SQL:2003 introduced ROW_NUMBER() to number rows. But this still requires wrapping to limit the result. And how about databases not supporting ROW_NUMBER()? FETCH FIRST The Problem
Limit the result to a number of rows. (LIMIT, TOP and ROWNUM are all proprietary)
SELECT * Dammit! FROM (SELECT * Let's take , LIMITROW_NUMBER() OVER(ORDER BY x) rn FROM data) numbered_data WHERE rn <=10 SQL:2003 introduced ROW_NUMBER() to number rows. But this still requires wrapping to limit the result. And how about databases not supporting ROW_NUMBER()? FETCH FIRST Since SQL:2008
SQL:2008 introduced the FETCH FIRST … ROWS ONLY clause:
SELECT * FROM data ORDER BY x FETCH FIRST 10 ROWS ONLY FETCH FIRST Availability
1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 5.1 MariaDB 3.19.3[0] MySQL 6.5[1] 8.4 PostgreSQL 2.1.0[1] SQLite
7.0 DB2 LUW 12cR1 Oracle 7.0[2] 2012 SQL Server [0]Earliest mention of LIMIT. Probably inherited from mSQL [1]Functionality available using LIMIT [2]SELECT TOP n ... SQL Server 2000 also supports expressions and bind parameters SQL:2011 OFFSET OFFSET The Problem
How to fetch the rows after a limit? (pagination anybody?)
SELECT * FROM (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn FROM data) numbered_data WHERE rn > 10 and rn <= 20 OFFSET Since SQL:2011
SQL:2011 introduced OFFSET, unfortunately!
SELECT * FROM data ORDER BY x OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY OFFSET Since SQL:2011
SQL:2011 introduced OFFSET, unfortunately!
SELECT * FROM data OFFSET ORDER BY x OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY Grab coasters & stickers! https://use-the-index-luke.com/no-offset OFFSET Since SQL:2011
1999 2001 2003 2005 2007 2009 2011 2013 2015 5.1 MariaDB 3.20.3[0] 4.0.6[1] MySQL 6.5 PostgreSQL 2.1.0 SQLite
9.7[2] 11.1 DB2 LUW 12c Oracle 2012 SQL Server [0]LIMIT [offset,] limit: "With this it's easy to do a poor man's next page/previous page WWW application." [1]The release notes say "Added PostgreSQL compatible LIMIT syntax" [2]Requires enabling the MySQL compatibility vector: db2set DB2_COMPATIBILITY_VECTOR=MYS OVER OVER (SQL:2011) The Problem
Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) WITH numbered_t AS (SELECT * FROM t
) SELECT curr.* curr , curr.balance balance … rn - COALESCE(prev.balance,0) 50 … 1 FROM numbered_t curr 90 … 2 LEFT JOIN numbered_t prev 70 … 3 ON (curr.rn = prev.rn+1) 30 … 4 OVER (SQL:2011) The Problem
Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) WITH numbered_t AS (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn
)FROM t SELECT curr.* curr , curr.balance balance … rn - COALESCE(prev.balance,0) 50 … 1 FROM numbered_t curr 90 … 2 LEFT JOIN numbered_t prev 70 … 3 ON (curr.rn = prev.rn+1) 30 … 4 OVER (SQL:2011) The Problem
Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) WITH numbered_t AS (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn
)FROM t SELECT curr.* curr , curr.balance balance … rn - COALESCE(prev.balance,0) 50 … 1 FROM numbered_t curr 90 … 2 LEFT JOIN numbered_t prev 70 … 3 ON (curr.rn = prev.rn+1) 30 … 4 OVER (SQL:2011) The Problem
Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) WITH numbered_t AS (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn
)FROM t SELECT curr.* curr prev , curr.balance balance … rn balance … rn - COALESCE(prev.balance,0) 50 … 1 50 … 1 FROM numbered_t curr 90 … 2 90 … 2 LEFT JOIN numbered_t prev 70 … 3 70 … 3 ON (curr.rn = prev.rn+1) 30 … 4 30 … 4 OVER (SQL:2011) The Problem
Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) WITH numbered_t AS (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn
)FROM t SELECT curr.* curr prev , curr.balance balance … rn balance … rn - COALESCE(prev.balance,0) 50 … 1 FROM numbered_t curr 90 … 2 50 … 1 LEFT JOIN numbered_t prev 70 … 3 90 … 2 ON (curr.rn = prev.rn+1) 30 … 4 70 … 3 30 … 4 OVER (SQL:2011) The Problem
Direct access of other rows of the same window is not possible. (E.g., calculate the difference to the previous rows) WITH numbered_t AS (SELECT * , ROW_NUMBER() OVER(ORDER BY x) rn
)FROM t SELECT curr.* curr prev , curr.balance balance … rn balance … rn - COALESCE(prev.balance,0) 50 … 1 +50 FROM numbered_t curr 90 … 2 50 … 1 +40 LEFT JOIN numbered_t prev 70 … 3 90 … 2 -20 ON (curr.rn = prev.rn+1) 30 … 4 70 … 3 -40 30 … 4 OVER (SQL:2011) Since SQL:2011
SQL:2011 introduced LEAD, LAG, NTH_VALUE, … for that: SELECT *, balance - COALESCE( LAG(balance) OVER(ORDER BY x) , 0) FROM t Available functions: LEAD / LAG FIRST_VALUE / LAST_VALUE NTH_VALUE(col, n) FROM FIRST/LAST RESPECT/IGNORE NULLS OVER (LEAD, LAG, …) Since SQL:2011
1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 5.1 10.2[0] MariaDB 8.0[0] MySQL 8.4[0] PostgreSQL SQLite
9.5[1] 11.1 DB2 LUW 8i[1] 11gR2 Oracle 2012[1] SQL Server [0]No IGNORE NULLS and FROM LAST [1]No NTH_VALUE System Versioning (Time Traveling) System Versioning The Problem INSERT UPDATE DELETE are DESTRUCTIVE System Versioning Since SQL:2011
Table can be system versioned, application versioned or both.
CREATE TABLE t (..., start_ts TIMESTAMP(9) GENERATED ALWAYS AS ROW START, end_ts TIMESTAMP(9) GENERATED ALWAYS AS ROW END,
PERIOD FOR SYSTEM_TIME (start_ts, end_ts) ) WITH SYSTEM VERSIONING System Versioning Since SQL:2011
INSERT ... (ID, DATA) VALUES (1, 'X')
ID Data start_ts end_ts 1 X 10:00:00
UPDATE ... SET DATA = 'Y' ...
ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00
DELETE ... WHERE ID = 1 INSERT ... (ID, DATA) VALUES (1, 'X')
ID Data start_ts end_ts System Versioning1 X 10:00:00 Since SQL:2011
UPDATE ... SET DATA = 'Y' ...
ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00
DELETE ... WHERE ID = 1
ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00 12:00:00 System Versioning Since SQL:2011
ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00 12:00:00 Although multiple versions exist, only the “current” one is visible per default.
After 12:00:00, SELECT * FROM t doesn’t return anything anymore. System Versioning Since SQL:2011
ID Data start_ts end_ts 1 X 10:00:00 11:00:00 1 Y 11:00:00 12:00:00 With FOR … AS OF you can query anything you like: SELECT * FROM t FOR SYSTEM_TIME AS OF TIMESTAMP '2015-04-02 10:30:00'
ID Data start_ts end_ts 1 X 10:00:00 11:00:00 System Versioning Since SQL:2011
1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 5.1 MariaDB[0] MySQL PostgreSQL SQLite
10.1[1] DB2 LUW 10gR1[2] Oracle 2016 SQL Server [0]Available in MariaDB 10.3 beta. [1]Third column required (tx id), history table required. [2]Functionality available using Flashback SQL:2016 (released: 2016-12-15) LISTAGG LISTAGG Since SQL:2016
grp val SELECT grp grp val 1 B , LISTAGG(val, ', ') 1 A WITHIN GROUP (ORDER BY val) 1 A, B, C 1 C FROM t 2 X GROUP BY grp 2 X Default
LISTAGG(val, ', ' ON OVERFLOW ERROR) Default LISTAGG(val, ', ' ON OVERFLOW TRUNCATE '...' WITHOUT COUNT) ➔ 'A, B, ...' LISTAGG(val, ', ' ON OVERFLOW TRUNCATE '...' WITH COUNT) ➔ 'A, B, ...(1)' LISTAGG Availability
1999 2001 2003 2005 2007 2009 2011 2013 2015 5.1[0] MariaDB 4.1[0] MySQL 7.4[1] 8.4[2]9.0[3] PostgreSQL 3.5.4[4] SQLite
10.5[5] DB2 LUW 11gR1 12cR2 Oracle SQL Server[6] [0] [4] [0] group_concat group_concat w/o ORDER BY [1] group_concat [5] [1] array_to_string No ON OVERFLOW clause [2] array_to_string [6] [2] array_agg string_agg announced for vNext [3] array_agg [3] string_aggstring_agg [4]group_concat without ORDER BY [5]No ON OVERFLOW clause [6]string_agg announced for vNext New in SQL:2016 JSON LISTAGG https://modern-sql.com/feature/listagg ROW PATTERN MATCHING https://www.slideshare.net/MarkusWinand/row-pattern-matching-in-sql2016 DATE FORMAT POLYMORPHIC TABLE FUNCTIONS
➔ https://modern-sql.com/blog/2017-06/whats-new-in-sql-2016 Modern SQL? @MarkusWinand
SQL has evolved beyond the relational idea. Modern SQL? @MarkusWinand
SQL has evolved beyond the relational idea.
If you are using SQL like 25 years ago, you are doing it wrong! Modern SQL? @MarkusWinand
SQL has evolved beyond the relational idea.
If you are using SQL like 25 years ago, you are doing it wrong!
A lot has happened since SQL-92. https://www.flickr.com/photos/mfoubister/25367243054/ I have shown you a few features today
https://www.flickr.com/photos/mfoubister/25367243054/ I have shown you a few features today
There are hundreds more to discover
https://www.flickr.com/photos/mfoubister/25367243054/ @ModernSQL modern-sql.com
My other website: https://use-the-index-luke.com
Training & co: https://winand.at/