how postgresql’s sql dialect stays ahead of its competitors · grouping sets are multiple group...

97
How PostgreSQL’s SQL dialect stays ahead of its competitors @MarkusWinand

Upload: others

Post on 23-Jan-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

How PostgreSQL’sSQL dialect stays aheadof its competitors

@MarkusWinand

Page 2: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

I’m sorry!

Page 3: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

I couldn’t make it to PgConf.EU in 2014!

Page 4: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

Don’t repeat my mistake.

Page 5: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

Come toPgConf.EU 2018

in Lisbon!Oct 23-26.

Page 6: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

How PostgreSQL’sSQL dialect stays aheadof its competitors

@MarkusWinand

Page 7: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

PostgreSQL 9.5 2016-01-07

Page 8: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

GROUPINGSETS

Page 9: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

Only one GROUPBY operation at a time:

GROUPINGSETS Before SQL:1999

SELECTyear,month,sum(revenue)FROMtblGROUPBYyear,month

Monthly revenue Yearly revenue

SELECTyear,sum(revenue)FROMtblGROUPBYyear

Page 10: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

GROUPINGSETS Before SQL:1999SELECTyear,month,sum(revenue)FROMtblGROUPBYyear,month

SELECTyear,sum(revenue)FROMtblGROUPBYyear

UNIONALL

,null

Page 11: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

GROUPINGSETS Since SQL:1999SELECTyear,month,sum(revenue)FROMtblGROUPBYyear,month

SELECTyear,sum(revenue)FROMtblGROUPBYyear

UNIONALL

,null

SELECTyear,month,sum(revenue)FROMtblGROUPBYGROUPINGSETS((year,month),(year))

Page 12: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

GROUPINGSETS are multiple GROUPBYs in one go

() (empty parenthesis) build a group over all rows

GROUPING (function) disambiguates the meaning of NULL(was the grouped data NULL or is this column not currently grouped?)

Permutations can be created using ROLLUP and CUBE(ROLLUP(a,b,c) = GROUPINGSETS((a,b,c),(a,b),(a),())

GROUPINGSETS In a Nutshell

Page 13: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

GROUPINGSETS Availability19

9920

0120

0320

0520

0720

0920

1120

1320

1520

17

5.1[0] MariaDB5.0[1] MySQL

9.5 PostgreSQLSQLite

5 DB2 LUW9iR1 Oracle

2008 SQL Server[0]Only ROLLUP (properitery syntax).[1]Only ROLLUP (properitery syntax). GROUPING function since MySQL 8.0.

Page 14: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

TABLESAMPLE

Page 15: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

TABLESAMPLE Since SQL:2003

SELECT*FROMtblTABLESAMPLEsystem(10)REPEATABLE(0)

Or:

bernoulli(10)better sampling,

way slower.

User provided seed

value

Page 16: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

TABLESAMPLE Since SQL:2003

SELECT*FROMtblTABLESAMPLEsystem(10)REPEATABLE(0)

Or:

bernoulli(10)better sampling,

way slower.

User provided seed

value

Page 17: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

TABLESAMPLE Availability19

9920

0120

0320

0520

0720

0920

1120

1320

1520

17

5.1 MariaDBMySQL

9.5[0] PostgreSQLSQLite

8.2[0] DB2 LUW8i[0] Oracle

2005[0] SQL Server[0]Not for derived tables

Page 18: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

PostgreSQL 10 2017-10-05

Page 19: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

XMLTABLE

Page 20: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

SELECTid,c1,nFROMtbl,XMLTABLE('/d/e'PASSINGxCOLUMNSidINTPATH'@id',c1VARCHAR(255)PATH'c1',nFORORDINALITY)r

XMLTABLE Since SQL:2006Stored in tbl.x:

<d><eid="42"><c1>…</c1></e></d>

XPath* expression to identify rows

*Standard SQL allows XQuery

Page 21: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

SELECTid,c1,nFROMtbl,XMLTABLE('/d/e'PASSINGxCOLUMNSidINTPATH'@id',c1VARCHAR(255)PATH'c1',nFORORDINALITY)r

XMLTABLE Since SQL:2006Stored in tbl.x:

<d><eid="42"><c1>…</c1></e></d>

*Standard SQL allows XQuery

Page 22: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

SELECTid,c1,nFROMtbl,XMLTABLE('/d/e'PASSINGxCOLUMNSidINTPATH'@id',c1VARCHAR(255)PATH'c1',nFORORDINALITY)r

XMLTABLE Since SQL:2006Stored in tbl.x:

<d><eid="42"><c1>…</c1></e></d>

*Standard SQL allows XQuery

XPath* expressions to extract data

Page 23: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

SELECTid,c1,nFROMtbl,XMLTABLE('/d/e'PASSINGxCOLUMNSidINTPATH'@id',c1VARCHAR(255)PATH'c1',nFORORDINALITY)r

XMLTABLE Since SQL:2006Stored in tbl.x:

<d><eid="42"><c1>…</c1></e></d>

*Standard SQL allows XQuery

Row number (like for unnest)

Page 24: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

SELECTid,c1,nFROMtbl,XMLTABLE('/d/e'PASSINGxCOLUMNSidINTPATH'@id',c1VARCHAR(255)PATH'c1',nFORORDINALITY)r

XMLTABLE Since SQL:2006Stored in tbl.x:

<d><eid="42"><c1>…</c1></e></d>

*Standard SQL allows XQuery

Result id|c1|n----+----+---42|…|1

Page 25: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

XMLTABLE Availability19

99

2001

2003

2005

2007

2009

2011

2013

2015

2017

MariaDBMySQL

10[0] PostgreSQLSQLite

9.7 DB2 LUW11gR1 Oracle

SQL Server[0]No XQuery (only XPath). No default namespace declaration.

Page 26: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

IDENTITYCOLUMNS

Page 27: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

IDENTITY column Since SQL:2003Similar to serial columns, but standard and more powerful.

CREATETABLEtbl(idINTEGERGENERATEDBYDEFAULTASIDENTITY(STARTWITH10INCREMENTBY10)

,PRIMARYKEY(id))

Page 28: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

IDENTITY column Since SQL:2003Similar to serial columns, but standard and more powerful.

CREATETABLEtbl(idINTEGERGENERATEDBYDEFAULTASIDENTITY(STARTWITH10INCREMENTBY10)

,PRIMARYKEY(id))

More (esp. vs. serial):https://blog.2ndquadrant.com/postgresql-10-identity-columns/

Page 29: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

IDENTITY column Availability19

9920

0120

0320

0520

0720

0920

1120

1320

1520

17

5.1 MariaDBMySQL

10 PostgreSQLSQLite

7.0 DB2 LUW12cR1 Oracle

SQL Server

Page 30: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

PostgreSQL 11 201?-??-??

Page 31: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER and

PARTITIONBY

Not new, butbear with me

Page 32: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER (PARTITION BY) The ProblemTwo distinct concepts could not be used independently:

‣Merge rows with the same key properties

‣ GROUPBY to specify key properties

‣ DISTINCT to use full row as key

‣ Aggregate data from related rows ‣ Requires GROUPBY to segregate the rows

‣ COUNT, SUM, AVG, MIN, MAX to aggregate grouped rows

Page 33: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

SELECTc1,SUM(c2)totFROMtGROUPBYc1

OVER (PARTITION BY) The Problem

Yes ⇠

Mer

ge ro

ws ⇢

No

No ⇠ Aggregate ⇢ Yes

SELECTc1,c2FROMt

SELECTDISTINCTc1,c2FROMt

SELECTc1,c2FROMt

SELECTc1,SUM(c2)totFROMtGROUPBYc1

Page 34: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

SELECTc1,SUM(c2)totFROMtGROUPBYc1

OVER (PARTITION BY) The Problem

Yes ⇠

Mer

ge ro

ws ⇢

No

No ⇠ Aggregate ⇢ Yes

SELECTc1,c2FROMt

SELECTDISTINCTc1,c2FROMt

SELECTc1,c2FROMtJOIN()taON(t.c1=ta.c1)

SELECTc1,SUM(c2)totFROMtGROUPBYc1

,tot

Page 35: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

SELECTc1,SUM(c2)totFROMtGROUPBYc1

OVER (PARTITION BY) The Problem

Yes ⇠

Mer

ge ro

ws ⇢

No

No ⇠ Aggregate ⇢ Yes

SELECTc1,c2FROMt

SELECTDISTINCTc1,c2FROMt

SELECTc1,c2FROMtJOIN()taON(t.c1=ta.c1)

SELECTc1,SUM(c2)totFROMtGROUPBYc1

,tot

Page 36: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

SELECTc1,SUM(c2)totFROMtGROUPBYc1

OVER (PARTITION BY) Since SQL:2003

Yes ⇠

Mer

ge ro

ws ⇢

No

No ⇠ Aggregate ⇢ Yes

SELECTc1,c2FROMt

SELECTDISTINCTc1,c2FROMt

SELECTc1,c2FROMt

FROMt

,SUM(c2)OVER(PARTITIONBYc1)

Page 37: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

SELECTdep,salary,SUM(salary)OVER()FROMemp

dep salary ts1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000

OVER (PARTITION BY) How it works

Page 38: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

SELECTdep,salary,SUM(salary)OVER()FROMemp

dep salary ts1 1000 600022 1000 600022 1000 6000333 1000 6000333 1000 6000333 1000 6000

OVER (PARTITION BY) How it works

Page 39: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

SELECTdep,salary,SUM(salary)OVER()FROMemp

dep salary ts1 1000 100022 1000 200022 1000 2000333 1000 3000333 1000 3000333 1000 3000

OVER (PARTITION BY) How it works

)PARTITIONBYdep

Page 40: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER and

ORDERBY(Framing & Ranking)

Still not new, butbear with me

Page 41: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

OVER (ORDER BY) The Problem

SELECTid,value,

(SELECTSUM(value)FROMtransactionst2WHEREt2.id<=t.id)

FROMtransactionst

Page 42: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

OVER (ORDER BY) The Problem

SELECTid,value,

(SELECTSUM(value)FROMtransactionst2WHEREt2.id<=t.id)

FROMtransactionst

Range segregation (<=)not possible with

GROUP BY orPARTITION BY

Page 43: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYid

Page 44: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW

Page 45: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW

Page 46: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW

Page 47: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW

Page 48: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW

Page 49: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10 +10

22 2 +20 +30

22 3 -10 +20

333 4 +50 +70

333 5 -30 +40

333 6 -20 +20

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW

Page 50: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10

22 2 +20

22 3 -10

333 4 +50

333 5 -30

333 6 -20

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW

Page 51: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER (ORDER BY) Since SQL:2003

SELECTid,value,

FROMtransactionst

SUM(value)OVER(

)

acnt id value balance

1 1 +10 +10

22 2 +20 +20

22 3 -10 +10

333 4 +50 +50

333 5 -30 +20

333 6 -20 .0

ORDERBYidROWSBETWEENUNBOUNDEDPRECEDINGANDCURRENTROW

PARTITIONBYacnt

Page 52: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER (ORDER BY) Since SQL:2003With OVER(ORDERBYn) a new type of functions make sense:

n ROW_NUMBER RANK DENSE_RANK PERCENT_RANK CUME_DIST1 1 1 1 0 0.252 2 2 2 0.33… 0.753 3 2 2 0.33… 0.754 4 4 3 1 1

Page 53: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

‣ Aggregates without GROUPBY

‣ Running totals, moving averages

‣ Ranking‣ Top-N per Group

‣ Avoiding self-joins

[… many more …]

Use Cases

SELECT*FROM(SELECTROW_NUMBER()OVER(PARTITIONBY…ORDERBY…)rn,t.*FROMt)numbered_tWHERErn<=3

AVG(…)OVER(ORDERBY…ROWSBETWEEN3PRECEDINGAND3FOLLOWING)moving_avg

OVER (SQL:2003)

Page 54: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER may follow any aggregate function

OVER defines which rows are visible at each row

OVER() makes all rows visible at every row

OVER(PARTITIONBY …) segregates like GROUPBY

OVER(ORDERBY…BETWEEN) segregates using <, >

In a NutshellOVER (SQL:2003)

Page 55: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER (SQL:2003) Availability19

9920

0120

0320

0520

0720

0920

1120

1320

1520

17

5.1 10.2 MariaDB8.0 MySQL

8.4 PostgreSQLSQLite

7.0 DB2 LUW8i Oracle

2005 SQL Server

Hive

ImpalaSpark

NuoDB

8 years

Page 56: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVERSQL:2011 functions

Page 57: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

WITHnumbered_tAS(SELECT*)

SELECTcurr.*,curr.balance-COALESCE(prev.balance,0)FROMnumbered_tcurrLEFTJOINnumbered_tprevON(curr.rn=prev.rn+1)

OVER (SQL:2011) The ProblemDirect access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

currbalance … rn

50 … 190 … 270 … 330 … 4

FROMt

Page 58: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

WITHnumbered_tAS(SELECT*)

SELECTcurr.*,curr.balance-COALESCE(prev.balance,0)FROMnumbered_tcurrLEFTJOINnumbered_tprevON(curr.rn=prev.rn+1)

OVER (SQL:2011) The ProblemDirect access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

currbalance … rn

50 … 190 … 270 … 330 … 4

FROMt,ROW_NUMBER()OVER(ORDERBYx)rn

Page 59: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

WITHnumbered_tAS(SELECT*)

SELECTcurr.*,curr.balance-COALESCE(prev.balance,0)FROMnumbered_tcurrLEFTJOINnumbered_tprevON(curr.rn=prev.rn+1)

OVER (SQL:2011) The ProblemDirect access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

currbalance … rn

50 … 190 … 270 … 330 … 4

FROMt,ROW_NUMBER()OVER(ORDERBYx)rn

Page 60: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

WITHnumbered_tAS(SELECT*)

SELECTcurr.*,curr.balance-COALESCE(prev.balance,0)FROMnumbered_tcurrLEFTJOINnumbered_tprevON(curr.rn=prev.rn+1)

OVER (SQL:2011) The ProblemDirect access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

currbalance … rn

50 … 190 … 270 … 330 … 4

FROMt,ROW_NUMBER()OVER(ORDERBYx)rn

Page 61: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

WITHnumbered_tAS(SELECT*)

SELECTcurr.*,curr.balance-COALESCE(prev.balance,0)FROMnumbered_tcurrLEFTJOINnumbered_tprevON(curr.rn=prev.rn+1)

OVER (SQL:2011) The ProblemDirect access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

currbalance … rn

50 … 190 … 270 … 330 … 4

FROMt,ROW_NUMBER()OVER(ORDERBYx)rn

prevbalance … rn

50 … 190 … 270 … 330 … 4

Page 62: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

WITHnumbered_tAS(SELECT*)

SELECTcurr.*,curr.balance-COALESCE(prev.balance,0)FROMnumbered_tcurrLEFTJOINnumbered_tprevON(curr.rn=prev.rn+1)

OVER (SQL:2011) The ProblemDirect access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

currbalance … rn

50 … 190 … 270 … 330 … 4

FROMt,ROW_NUMBER()OVER(ORDERBYx)rn

prevbalance … rn

50 … 190 … 270 … 330 … 4

Page 63: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

WITHnumbered_tAS(SELECT*)

SELECTcurr.*,curr.balance-COALESCE(prev.balance,0)FROMnumbered_tcurrLEFTJOINnumbered_tprevON(curr.rn=prev.rn+1)

OVER (SQL:2011) The ProblemDirect access of other rows of the same window is not possible.

(E.g., calculate the difference to the previous rows)

currbalance … rn

50 … 190 … 270 … 330 … 4

FROMt,ROW_NUMBER()OVER(ORDERBYx)rn

prevbalance … rn

50 … 190 … 270 … 330 … 4

+50+40-20-40

Page 64: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

SQL:2011 can access other rows directly:SELECT*,balance-LAG(balance,1,0)OVER(ORDERBYx)FROMdata

OVER Since SQL:2011

Page 65: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

SQL:2011 can access other rows directly:SELECT*,balance-LAG(balance,1,0)OVER(ORDERBYx)FROMdata

Available functions: LEAD/LAGFIRST_VALUE/LAST_VALUENTH_VALUE(col,n)FROMFIRST/LASTRESPECT/IGNORENULLS

OVER Since SQL:2011

Page 66: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

SQL:2011 can access other rows directly:SELECT*,balance-LAG(balance,1,0)OVER(ORDERBYx)FROMdata

Available functions: LEAD/LAGFIRST_VALUE/LAST_VALUENTH_VALUE(col,n)FROMFIRST/LASTRESPECT/IGNORENULLS

OVER Since SQL:2011

Not supported by PostgreSQL

(as of 11)

Page 67: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER (LEAD, LAG, …) Since SQL:201119

9920

0120

0320

0520

0720

0920

1120

1320

1520

17

5.1 10.2[0] MariaDB8.0[0] MySQL

8.4[0] PostgreSQLSQLite

9.5[1] 11.1 DB2 LUW8i[1] 11gR2 Oracle

2012[1] SQL Server[0]No IGNORENULLS and FROMLAST[1]No NTH_VALUE

Page 68: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVERSQL:2011 groups option

Page 69: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER (groups option) Since SQL:2011ORDERBYx<frameunit>between1precedingand1following

rows,range

Page 70: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER (groups option) Since SQL:2011ORDERBYx<frameunit>between1precedingand1following

rows,range

rangexbetweencurrent_row-1

andcurrent_row+1

rowscount(*)

x13

3.53.54

CURRENT ROW

Page 71: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

groupscount(distinctx)

OVER (groups option) Since SQL:2011ORDERBYx<frameunit>between1precedingand1following

rows,rangeNew in

SQL:2011

andPostgreSQL 11

groups

rangexbetweencurrent_row-1

andcurrent_row+1

rowscount(*)

x13

3.53.54

CURRENT ROW

Page 72: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

Since SQL:201119

9920

0120

0320

0520

0720

0920

1120

1320

1520

17

5.1 MariaDBMySQL

11 PostgreSQLSQLite

DB2 LUWOracleSQL Server

OVER (groups option)

Page 73: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVERSQL:2003 frame exclusion

Page 74: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

Since SQL:2003OVER(ORDERBY…BETWEEN…exclude[noothers|currentrow|group|ties])

default

New inPostgreSQL 11

OVER (frame exclusion)

noothers

Page 75: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

currentrow

Since SQL:2003

x12223

OVER(ORDERBY…BETWEEN…exclude[noothers|currentrow|group|ties])

default

New inPostgreSQL 11

OVER (frame exclusion)

currentrow

Page 76: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

groupx=current_row

currentrow

Since SQL:2003

x12223

OVER(ORDERBY…BETWEEN…exclude[noothers|currentrow|group|ties])

default

New inPostgreSQL 11

OVER (frame exclusion)

group

Page 77: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

tiesgroupx=current_row

currentrow

Since SQL:2003

x12223

OVER(ORDERBY…BETWEEN…exclude[noothers|currentrow|group|ties])

default

New inPostgreSQL 11

OVER (frame exclusion)

ties

Page 78: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

OVER (frame exclusion) Since SQL:201119

9920

0120

0320

0520

0720

0920

1120

1320

1520

17

5.1 MariaDBMySQL

11 PostgreSQLSQLite

DB2 LUWOracleSQL Server

Page 79: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

PostgreSQL 12+ 20??-??-??

Page 80: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

MERGE

Page 81: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

Copying rows from another table is easy:

INSERTINTO<target>SELECT…FROM<source>WHERENOTEXISTS(SELECT*FROM<target>WHERE…)

MERGE The Problem

Both, <target> and <source>

are in scope here.

Page 82: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

Deleting rows that exist in another table is also possible:

DELETEFROM<target>WHEREEXISTS(SELECT*FROM<source>WHERE…)

MERGE The Problem

Both, <target> and <source>

are in scope here.

Page 83: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

Updating rows from another table is awkward:

UPDATE<target>SET…WHERE…

MERGE The Problem

Subqueries are a more common choice

Sometimes, updatable views can help.

Bringing another tables rows into scope of the SET clause is tricky

Page 84: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

Updating rows from another table is awkward:

UPDATE<target>SET…WHERE…SET(col1,col2)=(SELECTcol1,col2FROM<source>WHERE…)WHERE…

MERGE The Problem

Both, <target> and <source>

are in scope here.

Page 85: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

SQL:2008 introduced merge to improve this situation two-fold:

‣It has always two tables in scope,the source can be a derived table (subquery). ‣It can do insert, update, or delete in one go.

MERGEINTO<target>USING<source>ON<joincondition>WHENMATCHED[AND<cond>]THEN[UPDATE…|DELETE…]WHENNOTMATCHED[AND<cond>]THENINSERT…

MERGE Since SQL:2008

WHEN/THENcan appear many times

Page 86: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

MERGE Since SQL:200819

99

2001

2003

2005

2007

2009

2011

2013

2015

2017

5.1 MariaDBMySQLPostgreSQLSQLite

9.1 DB2 LUW9iR1[0] Oracle

2008 SQL Server[0]No AND condition.

Page 87: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

(standard) JSON

Page 88: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

JSONPostgreSQL got the JSON data type and

the first JSON related functions with 9.2 (2012)

ISO added SQL JSON functions (but no type)with SQL:2016

Both approaches follow a different mantra. The following slides only reflects the standards perspective.

Page 89: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

JSON

Page 90: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

JSON

Page 91: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

JSON

Page 92: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

JSON — Preliminary testing of patchesUsed a06e56b24 as basis, applied those patches on top:

0001-strict-do_to_timestamp-v16.patch 0002-pass-cstring-to-do_to_timestamp-v16.patch 0003-add-to_datetime-v16.patch 0004-jsonpath-v16.patch 0005-jsonpath-gin-v16.patch 0006-jsonpath-json-v16.patch 0007-remove-PG_TRY-in-jsonpath-arithmetics-v16.patch 0008-jsonpath-extensions-v16.patch 0009-jsonpath-extensions-tests-for-json-v16.patch 0010-add-invisible-coercion-form-v16.patch 0011-add-function-formats-v16.patch 0012-sqljson-v16.patch 0013-sqljson-json-v16.patch 0014-optimize-sqljson-subtransactions-v16.patch 0015-json_table-v16.patch 0016-json_table-json-v16.patch }SQL/JSON: JSON_TABLE

}}SQL/JSON: jsonpath

SQL/JSON: functions

Page 93: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

JSON — Preliminary testing of patches

Page 94: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

JSON — Preliminary testing of patches

Page 95: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

JSON — Preliminary testing of patches

Page 96: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

@ModernSQL modern-sql.comMy other website:

https://use-the-index-luke.com

Training & co: https://winand.at/

Page 97: How PostgreSQL’s SQL dialect stays ahead of its competitors · GROUPING SETS are multiple GROUP BYs in one go () (empty parenthesis) build a group over all rows GROUPING (function)

5-day intensive training — Sept 17-21Visit Vienna!SQL Performance Kick-Start • Indexing, Indexing, Indexing… • Joining • Ordering, grouping, • …

Analysis and Aggregation • OVER, GROUPING SETS • Avoiding self-joins • Grouping consecutive events • …

SQL Reloaded • Type safety, NULL an 3VL • Modern interpretation of

the relational model • Keeping historic data • Design guidelines

Recursive Queries • WITH • WITH RECURSIVE • Use cases and examples

https://winand.at/sql-training/5-day-training/aug-sept-2018