LATERAL LATERAL Before SQL:1999
Total Page:16
File Type:pdf, Size:1020Kb
Still using Windows 3.1? So why stick with SQL-92? @ModernSQL - https://modern-sql.com/ @MarkusWinand SQL:1999 LATERAL LATERAL Before SQL:1999 Select-list sub-queries must be scalar[0]: (an atomic quantity that can hold only one value at a time[1]) SELECT … , (SELECT column_1 FROM t1 WHERE t1.x = t2.y ) AS c FROM t2 … [0] Neglecting row values and other workarounds here; [1] https://en.wikipedia.org/wiki/Scalar LATERAL Before SQL:1999 Select-list sub-queries must be scalar[0]: (an atomic quantity that can hold only one value at a time[1]) SELECT … , (SELECT column_1 , column_2 FROM t1 ✗ WHERE t1.x = t2.y ) AS c More than FROM t2 one column? … ⇒Syntax error [0] Neglecting row values and other workarounds here; [1] https://en.wikipedia.org/wiki/Scalar LATERAL Before SQL:1999 Select-list sub-queries must be scalar[0]: (an atomic quantity that can hold only one value at a time[1]) SELECT … More than , (SELECT column_1 , column_2 one row? ⇒Runtime error! FROM t1 ✗ WHERE t1.x = t2.y } ) AS c More than FROM t2 one column? … ⇒Syntax error [0] Neglecting row values and other workarounds here; [1] https://en.wikipedia.org/wiki/Scalar LATERAL Since SQL:1999 Lateral derived queries can see table names defined before: SELECT * FROM t1 CROSS JOIN LATERAL (SELECT * FROM t2 WHERE t2.x = t1.x ) derived_table ON (true) LATERAL Since SQL:1999 Lateral derived queries can see table names defined before: SELECT * FROM t1 Valid due to CROSS JOIN LATERAL (SELECT * LATERAL FROM t2 keyword WHERE t2.x = t1.x ) derived_table ON (true) LATERAL Since SQL:1999 Lateral derived queries can see table names defined before: SELECT * FROM t1 Valid due to CROSS JOIN LATERAL (SELECT * LATERAL FROM t2 keyword WHERE t2.x = t1.x Useless, but ) derived_table still required ON (true) LATERAL Since SQL:1999 Lateral derived queries can see table names defined before: SELECT * FROM t1 Valid due to CROSS JOIN LATERAL (SELECT * LATERAL keyword Use FROM t2 CROSS JOIN to WHERE t2.x = t1.x omit the ON clause ) derived_table ON (true) But WHY? LATERAL Use-Cases LATERAL Use-Cases ‣ Top-N per group inside a lateral derived table FETCH FIRST (or LIMIT, TOP) applies per row from left tables. LATERAL Use-Cases ‣ Top-N per group FROM t inside a lateral derived table CROSS JOIN LATERAL ( SELECT … (or , ) FETCH FIRST LIMIT TOP FROM … applies per row from left tables. WHERE t.c=… ORDER BY … LIMIT 10 ) derived_table LATERAL Use-Cases ‣ Top-N per group FROM t inside a lateral derived table CROSS JOIN LATERAL ( SELECT … (or , ) FETCH FIRST LIMIT TOP FROM … applies per row from left tables. WHERE t.c=… ORDER BY … Add proper index LIMIT 10 for Top-N query ) derived_table https://use-the-index-luke.com/sql/partial-results/top-n-queries LATERAL Use-Cases ‣ Top-N per group FROM t inside a lateral derived table CROSS JOIN LATERAL ( SELECT … (or , ) FETCH FIRST LIMIT TOP FROM … applies per row from left tables. WHERE t.c=… ORDER BY … ‣ Also useful to find most recent LIMIT 10 news from several subscribed ) derived_table topics (“multi-source top-N”). LATERAL Use-Cases ‣ Top-N per group FROM t inside a lateral derived table CROSS JOIN LATERAL ( SELECT … (or , ) FETCH FIRST LIMIT TOP FROM … applies per row from left tables. WHERE t.c=… ORDER BY … ‣ Also useful to find most recent LIMIT 10 news from several subscribed ) derived_table topics (“multi-source top-N”). ‣ Table function arguments FROM t (TABLE often implies LATERAL) JOIN TABLE (your_func(t.c)) LATERAL Availability 1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 5.1 MariaDB 8.0.14 MySQL 9.3 PostgreSQL SQLite 9.1 DB2 LUW 11gR1[0] 12cR1 Oracle 2005[1] SQL Server [0]Undocumented. Requires setting trace event 22829. [1]LATERAL is not supported as of SQL Server 2016 but [CROSS|OUTER] APPLY can be used for the same effect. WITH RECURSIVE (Common Table Expressions) WITH RECURSIVE CREATE TABLE t ( id INTEGER, parent INTEGER) WITH RECURSIVE CREATE TABLE t ( id INTEGER, parent INTEGER) WITH RECURSIVE CREATE TABLE t ( id INTEGER, parent INTEGER) WITH RECURSIVE Since SQL:1999 WITH RECURSIVE Since SQL:1999 SELECT t.id, t.parent FROM t WHERE t.id = ? WITH RECURSIVE Since SQL:1999 SELECT t.id, t.parent FROM t WHERE t.id = ? UNION ALL SELECT t.id, t.parent FROM t WHERE t.parent = ? WITH RECURSIVE Since SQL:1999 SELECT t.id, t.parent FROM t WHERE t.id = ? UNION ALL SELECT t.id, t.parent FROM t WHERE t.parent = ? WITH RECURSIVE Since SQL:1999 SELECT t.id, t.parent FROM t WHERE t.id = ? UNION ALL SELECT t.id, t.parent FROM t WHERE t.parent = ? WITH RECURSIVE Since SQL:1999 WITH RECURSIVE prev (id, parent) AS ( SELECT t.id, t.parent FROM t WHERE t.id = ? UNION ALL SELECT t.id, t.parent FROM t WHERE t.parent = ? ) SELECT * FROM prev WITH RECURSIVE Since SQL:1999 WITH RECURSIVE prev (id, parent) AS ( SELECT t.id, t.parent FROM t WHERE t.id = ? UNION ALL SELECT t.id, t.parent FROM t JOIN prev ON t.parent = prev.id ) SELECT * FROM prev WITH RECURSIVE Since SQL:1999 WITH RECURSIVE prev (id, parent) AS ( SELECT t.id, t.parent FROM t WHERE t.id = ? UNION ALL SELECT t.id, t.parent FROM t JOIN prev ON t.parent = prev.id ) SELECT * FROM prev WITH RECURSIVE Since SQL:1999 WITH RECURSIVE prev (id, parent) AS ( SELECT t.id, t.parent FROM t WHERE t.id = ? UNION ALL SELECT t.id, t.parent FROM t JOIN prev ON t.parent = prev.id ) SELECT * FROM prev WITH RECURSIVE Use Cases WITH RECURSIVE Use Cases ‣ Processing graphs “[…] for certain classes of graphs, solutions utilizing relational database technology […] can offer Shortest route from person A performance superior to that of the dedicated graph to B in LinkedIn/Facebook/… databases.” event.cwi.nl/grades2013/07-welc.pdf WITH RECURSIVE Use Cases ‣ Processing graphs “[…] for certain classes of graphs, solutions utilizing relational database technology […] can offer Shortest route from person A performance superior to that of the dedicated graph to B in LinkedIn/Facebook/… databases.” event.cwi.nl/grades2013/07-welc.pdf https://wiki.postgresql.org/wiki/Loose_indexscan ‣ Finding distinct values in n*log(N)† time complexity † n … # distinct values, N … # of table rows. Suitable index required WITH RECURSIVE Use Cases ‣ Processing graphs “[…] for certain classes of graphs, solutions utilizing relational database technology […] can offer Shortest route from person A performance superior to that of the dedicated graph to B in LinkedIn/Facebook/… databases.” event.cwi.nl/grades2013/07-welc.pdf https://wiki.postgresql.org/wiki/Loose_indexscan ‣ Finding distinct values † WITH RECURSIVE gen (n) AS ( in n*log(N) time complexity SELECT 1 UNION ALL SELECT n+1 ‣ Row generators FROM gen WHERE n < ? To fill gaps (e.g., in time ) series), generate test data SELECT * FROM gen † n … # distinct values, N … # of table rows. Suitable index required WITH RECURSIVE Availability 1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 5.1 10.2 MariaDB 8.0 MySQL 8.4 PostgreSQL 3.8.3[0] SQLite 7.0 DB2 LUW 11gR2 Oracle 2005 SQL Server [0]Only for top-level SELECT statements GROUPING SETS GROUPING SETS Before SQL:1999 Only one GROUP BY operation at a time: GROUPING SETS Before SQL:1999 Only one GROUP BY operation at a time: Monthly revenue SELECT year , month , sum(revenue) FROM tbl GROUP BY year, month GROUPING SETS Before SQL:1999 Only one GROUP BY operation at a time: Monthly revenue Yearly revenue SELECT year SELECT year , month , sum(revenue) , sum(revenue) FROM tbl FROM tbl GROUP BY year, month GROUP BY year GROUPING SETS Before SQL:1999 SELECT year , month , sum(revenue) FROM tbl GROUP BY year, month SELECT year , sum(revenue) FROM tbl GROUP BY year GROUPING SETS Before SQL:1999 SELECT year , month , sum(revenue) FROM tbl GROUP BY year, month UNION ALL SELECT year , null , sum(revenue) FROM tbl GROUP BY year GROUPING SETS Since SQL:1999 SELECT year SELECT year , month , month , sum(revenue) , sum(revenue) FROM tbl FROM tbl GROUP BY year, month GROUP BY UNION ALL GROUPING SETS ( SELECT year (year, month) , null , (year) , sum(revenue) ) FROM tbl GROUP BY year GROUPING SETS Availability 1999 2001 2003 2005 2007 2009 2011 2013 2015 2017 5.1[0] MariaDB 5.0[1] MySQL 9.5 PostgreSQL SQLite 5 DB2 LUW 9iR1 Oracle 2008 SQL Server [0]Only ROLLUP (properitery syntax). [1]Only ROLLUP (properitery syntax). GROUPING function since MySQL 8.0. SQL:2003 OVER and PARTITION BY OVER (PARTITION BY) The Problem Two distinct concepts could not be used independently: OVER (PARTITION BY) The Problem Two distinct concepts could not be used independently: ‣ Merge rows with the same key properties ‣ GROUP BY to specify key properties ‣ DISTINCT to use full row as key OVER (PARTITION BY) The Problem Two distinct concepts could not be used independently: ‣ Merge rows with the same key properties ‣ GROUP BY to specify key properties ‣ DISTINCT to use full row as key ‣ Aggregate data from related rows ‣ Requires GROUP BY to segregate the rows ‣ COUNT, SUM, AVG, MIN, MAX to aggregate grouped rows OVER (PARTITION BY) The Problem OVER (PARTITION BY) The Problem SELECT c1 , c2 FROM t OVER (PARTITION BY) The Problem SELECT c1 No , c2 ⇢ FROM t Merge rows Merge ⇠ Yes Yes OVER (PARTITION BY) The Problem SELECT c1 No , c2 ⇢ FROM t SELECT DISTINCT Merge rows Merge c1 ⇠ , c2 Yes Yes FROM t OVER (PARTITION BY) The Problem No ⇠ Aggregate ⇢ Yes SELECT c1 No , c2 ⇢ FROM t SELECT DISTINCT Merge rows Merge c1 ⇠ , c2 Yes Yes FROM t OVER (PARTITION BY) The Problem No ⇠ Aggregate ⇢ Yes SELECT c1 No , c2 ⇢ FROM t SELECT DISTINCT SELECT c1 Merge rows Merge c1 , SUM(c2) tot ⇠ , c2 FROM t Yes Yes FROM t GROUP BY c1 OVER (PARTITION BY) The Problem No ⇠ Aggregate ⇢ Yes SELECT c1 SELECT c1 No , c2 , c2 ⇢ FROM t