Teaching PostgreSQL to new people

PostgreSQL – Tomasz Borek Teaching PostgreSQL to new people @LAFK_pl Consultant @

Upload: tomek-borek

Post on 15-Jan-2017


Category:

Education



TRANSCRIPT

Page 1: Teaching PostgreSQL to new people

PostgreSQL – Tomasz Borek

Teaching PostgreSQL to new people

@LAFK_pl

Consultant @

Page 2: Teaching PostgreSQL to new people

About me

@LAFK_pl

Consultant @

Tomasz Borek

Page 3: Teaching PostgreSQL to new people
Page 4: Teaching PostgreSQL to new people

What will I tell you?

● About me (done)
● Show of hands
● Who "new people" might be
  – And, in my case, usually are
● About teaching
  – Comfort zone, learners, stepping back
● Chosen approaches, features, gotchas and the like
● Why, why, why
● And yes, this will be about Postgres, but in an unusual way

Page 11: Teaching PostgreSQL to new people

Show of hands

● Developers
● Developers (PL/SQL ones)
● DBA (Admin, Architect)
● DevOps
● SysAdmin
● Trainers / consultants
● Other?

Page 12: Teaching PostgreSQL to new people

„New” people

Page 14: Teaching PostgreSQL to new people

Surprisingly

● Often your colleagues
● Sometimes older
● Sometimes more senior
● Experienced
● With success under their belts
● Basically: FORMED already
  – Or MADE, if you will

Page 15: Teaching PostgreSQL to new people

Developers are problem solvers

● Your colleagues have certain problems
● Is Postgres the solution?
  – Or at least "a solution"?
● And what is the learning curve like?
  – Including the time it takes

Page 16: Teaching PostgreSQL to new people

Developers are not SQL people!

● Not many know JOINs very well
● Not many know how indexes work
● Not many know the weaknesses of indexes
● CTEs, window functions, procedures, cursors…
● They "omit" this
● Comfort zone is nice

Page 17: Teaching PostgreSQL to new people

Do not abandon them

Or they’ll abandon you

Page 18: Teaching PostgreSQL to new people

Do not abandon them

● Docs
● Materials
● Tools
● Links to good content
● Pictures, pictures, pictures
● They can edit / comment (Wiki)
● Your (colleagues’) time

Page 19: Teaching PostgreSQL to new people

Teaching

Page 20: Teaching PostgreSQL to new people

What is YOUR problem?

● DBA wanting respite for your DB?
● Malpractice in SQL queries?
● Why don’t they use XYZ feature?
● From tomorrow on, teach them some SQL
● Migration from X to Postgres
● Guidelines creation

Page 22: Teaching PostgreSQL to new people
Page 23: Teaching PostgreSQL to new people

Xun Kuang once said

不闻不若闻之，闻之不若见之，见之不若知之，知之不若行之

“Not having heard something is not as good as having heard it; having heard it is not as good as having seen it; having seen it is not as good as knowing it; knowing it is not as good as putting it into practice.”

Xunzi book 8: Ruxiao, chapter 11

Page 24: Teaching PostgreSQL to new people

A paraphrase of Xun Kuang would be

不闻不若闻之，闻之不若见之，见之不若知之，知之不若行之

“Not having heard something < having heard it; having heard it < having seen it; having seen it < knowing it; knowing it < putting it into practice.”

Xunzi book 8: Ruxiao, chapter 11

Page 25: Teaching PostgreSQL to new people

How do they learn?

● "Practice makes perfect"
  – Except it doesn’t
● Learning styles
● Docs still relevant
  – If well-placed, accessible and easy to get into

Page 26: Teaching PostgreSQL to new people

Repetitio est mater studiorum

● Crash course
● Workshop
● Problem solving on their own
● Docs to help
● Code reviews

Page 27: Teaching PostgreSQL to new people

Comfort zone

Page 28: Teaching PostgreSQL to new people

Comfort zone

● Setup / install
● Moving around
● Logs, timing queries
● EXPLAIN + ANALYZE
● Indexes
● PgSQL and variants
● NoSQL + XML

Page 29: Teaching PostgreSQL to new people

Chosen features, gotchas etc.

so

How to teach Postgres?

Page 30: Teaching PostgreSQL to new people

In short

● History – battle-tested, feature-rich, used
● Basics – moving around, commands, etc.
● Prepare your bait accordingly
  – My faves
  – Advanced features
  – NoSQL angle
  – …
● Don’t just drink the Kool-Aid!

Page 31: Teaching PostgreSQL to new people

Battle-tested

● Maturing since 1987
● Comes in many flavours (forks)
● Largest cluster – 2 PB at Yahoo
● Skype, NASA, Instagram
● Stable:
  – Many years on one version
  – Good version support
  – Every year something new
  – Follows ANSI SQL standards

https://www.postgresql.org/about/users/

Page 32: Teaching PostgreSQL to new people

In-/Postgres forks

Page 33: Teaching PostgreSQL to new people
Page 34: Teaching PostgreSQL to new people

Support?

Page 35: Teaching PostgreSQL to new people

Great angles

● Procedures: Java, Perl, Python, CTEs...
● Enterprise / NoSQL – handles XML and JSON
● Index power – spatial or geo or your own
● CTEs and FDWs => great ETL or µservice
● Pure dev: error reporting / logging, MVCC (dirty reads gone), own index, plenty of data types, Java/Perl/… inside
● Solid internals: processes, security built in

Page 36: Teaching PostgreSQL to new people

Basics

● Setup
● psql
  – Moving around
  – What’s in
● Indexes
● Joins
● Query path
● EXPLAIN, EXPLAIN ANALYZE

Page 37: Teaching PostgreSQL to new people

Query Path

http://www.slideshare.net/SFScon/sfscon15-peter-moser-the-path-of-a-query-postgresql-internals

Page 38: Teaching PostgreSQL to new people

Parser

● Syntax checks, e.g. FRIM is not a keyword
  – SELECT * FRIM myTable;
● Catalog lookup
  – myTable may not exist
● In the end a query tree is built
  – Query tokenization: SELECT (keyword) employeeName (field id) count (function call)...
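
The tokenization step above can be illustrated with a toy sketch in Python. This is not how PostgreSQL does it (the real scanner and grammar are generated with flex and bison). The keyword subset, the token kinds and the `check_select` rule are all assumptions made up for illustration, and the toy grammar only accepts queries of the form SELECT ... FROM ...

```python
import re

# Tiny keyword subset, for illustration only; the real list is far longer.
KEYWORDS = {"SELECT", "FROM", "WHERE"}
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z_0-9]*|[*(),;]")


def tokenize(sql):
    """Split a query into (token, kind) pairs."""
    tokens = TOKEN_RE.findall(sql)
    labelled = []
    for i, tok in enumerate(tokens):
        next_tok = tokens[i + 1] if i + 1 < len(tokens) else ""
        if tok.upper() in KEYWORDS:
            kind = "keyword"
        elif tok[0].isalpha() or tok[0] == "_":
            # A name directly followed by "(" is treated as a function call.
            kind = "function call" if next_tok == "(" else "identifier"
        else:
            kind = "symbol"
        labelled.append((tok, kind))
    return labelled


def check_select(sql):
    """Toy grammar (hypothetical): the query must contain the FROM keyword."""
    seen = [(tok.upper(), kind) for tok, kind in tokenize(sql)]
    return "ok" if ("FROM", "keyword") in seen else "syntax error"


print(tokenize("SELECT employeeName, count(*) FROM myTable;"))
print(check_select("SELECT * FRIM myTable;"))  # syntax error
```

In the second call FRIM is labelled as an ordinary identifier, so the toy grammar never sees the FROM it expects, which is the kind of mistake a syntax check catches before any catalog lookup happens.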

Page 39: Teaching PostgreSQL to new people

Grammar and a query tree

Page 40: Teaching PostgreSQL to new people

Planner

● Where the plan tree is built
● Where the best execution is decided upon
  – Seq or index scan? Index or bitmap index?
  – Which join order?
  – Which join strategy (nested loop, hash, merge)?
  – Inner or outer?
  – Aggregation: plain, hashed, sorted…
● Heuristic, if finding all plans is too costly

Page 41: Teaching PostgreSQL to new people

Full query path

Page 43: Teaching PostgreSQL to new people

Explaining EXPLAIN - what

EXPLAIN SELECT * FROM tenk1;

QUERY PLAN

------------------------------------------------------------

Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244)

● Startup cost – estimated cost spent before the output phase begins
● Total cost – in arbitrary units (by convention, sequential page fetches); may change; assumes the node runs to completion
● Rows – estimated number of rows the node outputs (a LIMIT etc. may stop it earlier)
● Width – estimated average width of output rows from that node (in bytes)

Page 44: Teaching PostgreSQL to new people

Explaining EXPLAIN - how

EXPLAIN SELECT * FROM tenk1;

QUERY PLAN

------------------------------------------------------------

Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244)

SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1'; -- 358 | 10000

● No WHERE, no index
● Cost = disk pages read * seq_page_cost + rows scanned * cpu_tuple_cost
● 358 * 1.0 + 10000 * 0.01 = 458 (default values)
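
The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch of the formula as stated above, for a sequential scan with no WHERE clause; the function name is made up for illustration, and the two constants are the default `seq_page_cost` and `cpu_tuple_cost` settings quoted on the slide.

```python
# Default planner constants, as quoted on the slide.
SEQ_PAGE_COST = 1.0
CPU_TUPLE_COST = 0.01


def seq_scan_cost(relpages, reltuples,
                  seq_page_cost=SEQ_PAGE_COST,
                  cpu_tuple_cost=CPU_TUPLE_COST):
    """Estimated total cost of a sequential scan with no WHERE clause.

    Hypothetical helper: pages read * page cost + rows scanned * tuple cost.
    """
    return relpages * seq_page_cost + reltuples * cpu_tuple_cost


# Values read from pg_class for tenk1: 358 pages, 10000 rows.
print(round(seq_scan_cost(358, 10000), 2))  # 458.0, the total cost in the plan
```

Changing either constant in the configuration shifts every estimate built on it, which is why the slide notes that the cost "may change".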

Page 45: Teaching PostgreSQL to new people

Analyzing EXPLAIN ANALYZE

EXPLAIN ANALYZE SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;

                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=4.65..118.62 rows=10 width=488) (actual time=0.128..0.377 rows=10 loops=1)
   ->  Bitmap Heap Scan on tenk1 t1  (cost=4.36..39.47 rows=10 width=244) (actual time=0.057..0.121 rows=10 loops=1)
         Recheck Cond: (unique1 < 10)
         ->  Bitmap Index Scan on tenk1_unique1  (cost=0.00..4.36 rows=10 width=0) (actual time=0.024..0.024 rows=10 loops=1)
               Index Cond: (unique1 < 10)
   ->  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.29..7.91 rows=1 width=244) (actual time=0.021..0.022 rows=1 loops=10)
         Index Cond: (unique2 = t1.unique2)
 Planning time: 0.181 ms
 Execution time: 0.501 ms

● Actually runs the query
● More info: actual times, rows removed by filter, sort method used, disk/memory used...
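
One detail that trips people up when reading this plan: for a node with loops greater than 1, the actual time and rows are averages per loop, so the total is the average multiplied by loops. The Python sketch below redoes that arithmetic for the inner index scan, and also shows how the nested loop's estimated cost is roughly the outer scan plus one inner scan per outer row. The helper name is made up for illustration; the numbers are copied from the plan above.

```python
def total_node_time_ms(per_loop_ms, loops):
    """EXPLAIN ANALYZE reports per-loop averages; total = average * loops."""
    return per_loop_ms * loops


# Inner index scan on tenk2: actual time=0.021..0.022 rows=1 loops=10
inner_total = total_node_time_ms(0.022, 10)

# Estimated cost of the nested loop: outer scan once, inner scan once per
# outer row. The planner adds a little CPU cost for join processing on top.
estimated_without_join_cpu = 39.47 + 10 * 7.91

print(round(inner_total, 3))                 # 0.22
print(round(estimated_without_join_cpu, 2))  # 118.57, vs. 118.62 in the plan
```

So the inner scan accounts for about 0.22 ms of the nested loop's 0.377 ms, not 0.022 ms as a quick glance might suggest.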

Page 49: Teaching PostgreSQL to new people

My Faves

● Error reporting
● PL/xSQL – feel free to use Perl, Python, Ruby, Java, LISP...
● Data types
  – XML and JSON handling
● Foreign Data Wrappers (FDW)
● Window functions
● Common table expressions (CTE) and recursive queries
● Power of Indexes

Page 50: Teaching PostgreSQL to new people

Will DB eat your cake?

● Thanks @anandology

Page 53: Teaching PostgreSQL to new people

The cake is a lie!

Page 56: Teaching PostgreSQL to new people

Will DB eat your cake?

● Thanks @anandology

Consider password VARCHAR(8)

Page 57: Teaching PostgreSQL to new people

Logging, ‘gotchas’

● Default is to stderr only
● Set on CLI or in config, not through SET
● Where is it?
● How to log queries… or turning logging_collector on

Page 58: Teaching PostgreSQL to new people

Where is it?

● Default
  – data/pg_log
● Launchers can set it (Mac Homebrew/plist)
● Version and config dependent

Page 59: Teaching PostgreSQL to new people

Ask DB

Page 60: Teaching PostgreSQL to new people

Logging, turn it on

● Default is to stderr only
● In PG:

logging_collector = on
log_filename = strftime-patterned filename
[log_destination = [stderr|syslog|csvlog] ]
log_statement = [none|ddl|mod|all] # e.g. all
log_min_error_statement = ERROR
log_line_prefix = '%t %c %u ' # time sessionid user

Page 61: Teaching PostgreSQL to new people

Log line prefix

Page 65: Teaching PostgreSQL to new people

PL/pgSQL

● Stored procedure dilemma
  – Where to keep your logic?
  – How your logic is NOT in your SCM
● Over a dozen options:
  – Perl, Python, Ruby,
  – pgSQL, Java,
  – TCL, LISP…
● DevOps, SysAdmins, DBAs… ETLs etc.

Page 66: Teaching PostgreSQL to new people

Perl function example

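CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$
    my ($x, $y) = @_;
    # SQL NULL arrives as undef: return the other argument (NULL if both are NULL)
    if (not defined $x) {
        return undef if not defined $y;
        return $y;
    }
    return $x if not defined $y;
    # Both arguments are defined: return the larger one
    return $x if $x > $y;
    return $y;
$$ LANGUAGE plperl;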

Page 67: Teaching PostgreSQL to new people

XML or JSON support

● Parsing and retrieving XML (functions)
● Valid JSON checks (type)
● Careful with encoding!
  – PG allows only one server encoding per database
  – Set it to UTF-8 or weep
● Document database instead of OO or relational
  – JSON, JSONB, HSTORE – NoSQL fun welcome!

Page 70: Teaching PostgreSQL to new people

HSTORE?

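-- hstore ships as an extension and must be enabled once per database
CREATE EXTENSION IF NOT EXISTS hstore;

CREATE TABLE example (
  id serial PRIMARY KEY,
  data hstore);

INSERT INTO example (data) VALUES
  ('name => "John Smith", age => 28, gender => "M"'),
  ('name => "Jane Smith", age => 24');

SELECT id,
       data->'name'
FROM example;

-- data->'age' yields text, so cast it before comparing numerically
SELECT id, data->'age' FROM example
WHERE (data->'age')::int >= 25;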
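
A gotcha worth showing in a workshop: hstore values are text, so comparing `data->'age'` against a quoted '25' compares strings rather than numbers. With ages 28 and 24 the result happens to look right, which hides the problem. The Python sketch below mimics the effect with made-up sample ages; Python's string ordering only stands in for the database's text comparison, whose details depend on the collation.

```python
# Sample ages stored as text, the way hstore stores every value (made-up data).
ages_as_text = ["28", "24", "100", "9"]

# Text comparison goes character by character: "100" sorts before "25",
# while "9" sorts after it.
text_matches = [a for a in ages_as_text if a >= "25"]

# Numeric comparison after an explicit conversion
# (the SQL analogue is a cast such as (data->'age')::int).
int_matches = [a for a in ages_as_text if int(a) >= 25]

print(text_matches)  # ['28', '9']
print(int_matches)   # ['28', '100']
```

The two lists differ for exactly the values that a two-row demo table never exercises.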

Page 71: Teaching PostgreSQL to new people

XML and JSON datatype

CREATE TABLE test (
  ...,
  xml_file xml,
  json_file json,
  ...
);

Page 72: Teaching PostgreSQL to new people

XML functions example

SELECT XMLROOT (
  XMLELEMENT (
    NAME gazonk,
    XMLATTRIBUTES (
      'val' AS name,
      1 + 1 AS num
    ),
    XMLELEMENT (
      NAME qux,
      'foo'
    )
  ),
  VERSION '1.0',
  STANDALONE YES
);

<?xml version='1.0' standalone='yes' ?>
<gazonk name='val' num='2'>
  <qux>foo</qux>
</gazonk>

-- Two ways to write an xml literal:
xml '<foo>bar</foo>'
'<foo>bar</foo>'::xml

Page 73: Teaching PostgreSQL to new people

Architecture and internals

Page 74: Teaching PostgreSQL to new people
Page 75: Teaching PostgreSQL to new people
Page 76: Teaching PostgreSQL to new people

Check out processes

● pgrep -l postgres
● htop > filter: postgres
● Whatever you like / usually use
● Careful with kill -9 on connections
  – kill -15 better
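
The difference between the two signals can be demonstrated without touching a database. The Python sketch below sends SIGTERM (signal 15) to a throwaway child process, which gives the target a chance to shut down cleanly; SIGKILL (signal 9) cannot be caught, and when it hits a PostgreSQL backend the server treats it as a crash and resets the other sessions too. The helper name and the sleeping child are stand-ins for illustration, and the example assumes a POSIX system.

```python
import os
import signal
import subprocess
import sys


def polite_stop(pid):
    """Send SIGTERM (what `kill -15 <pid>` does) so the process can clean up."""
    os.kill(pid, signal.SIGTERM)


if __name__ == "__main__":
    # Stand-in child process; a real target would be a PID found via pgrep.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    polite_stop(child.pid)
    # On POSIX the return code is the negative signal number.
    print(child.wait(timeout=10))
```

Inside the database, the built-in function pg_terminate_backend(pid) serves the same purpose for a single connection.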

Page 77: Teaching PostgreSQL to new people
Page 78: Teaching PostgreSQL to new people

Summary

Page 79: Teaching PostgreSQL to new people

Before

● Who are they?
● What is your problem?
● How large is their comfort zone, and how will you push them out of it?
● Materials, docs, workshop preparation
● How much time for training?
● How much time after?
● How many people will there be?
● What indicates that the problem is solved?

Page 80: Teaching PostgreSQL to new people

During

● Establish the goal
  – And, if possible, learning styles
● Promise support (and tell how!)
  – Push out from comfort zone!
● Ask for hard work and stupid questions
● Show documentation, do live tour
● Do the workshop
● Involve, find best ones
  – You will have them help you later
● Expect questions, make them ask
  – Again, push out from comfort zone!

Page 81: Teaching PostgreSQL to new people

After

● Where are the docs?
  – Are they using them?
● Answer the questions
  – Again, and again
● Code reviews
  – Deliver on support promise!
  – Involve promising students
● Is the problem gone / better?

Page 82: Teaching PostgreSQL to new people

Don’t omit the basics

● Joins
● Indexes – how they work
● Query path (EXPLAIN, EXPLAIN ANALYZE)
● Moving around (psql)
● Setup and getting to DB

Page 83: Teaching PostgreSQL to new people

Postgres is cool

● Goodies like error reporting or log line prefix
● Processes thought out
● Good for µservices and enterprise
● Not only SQL (XML, JSON, Perl, Python...)
● Ask DB
● Indexes
● Powerful: CTEs, recursive queries, FDWs...
● Battle tested and always high

Page 84: Teaching PostgreSQL to new people

Teaching Postgres – Tomasz Borek

Teaching Postgres to new people

@LAFK_pl

Consultant @