SQL, Scaling, and What’s Unique About PostgreSQL
Ozgun Erdogan, Citus Data
XLDB | May 2018
Punch Line
1. What is unique about PostgreSQL?
• The extension APIs
2. PostgreSQL extensions are a game changer for relational databases
Talk Outline
1. What is an extension?
2. Why are extensions a game changer for databases?
3. Postgres can’t do “this”
• Semi-structured or unstructured data
• Geospatial database
• S3 or columnar storage
• Scale out
4. Conclusion
What is an Extension
• An extension is a piece of software that adds functionality to Postgres. Each extension bundles related objects together.
• Postgres 9.1 started providing official APIs to override or extend any database module’s behavior.
• “CREATE EXTENSION citus;” dynamically loads these objects into Postgres’ address space.
What can you Extend in Postgres?
• You can override, cooperate with, or extend any combination of the following database modules:
• Type system and operators
• User defined functions and aggregates
• Storage system and indexes
• Write ahead logging and replication
• Transaction engine
• Background worker processes
• Query planner and query executor
• Configuration and database metadata
Why are Extensions a game changer
• Every decade brings new workloads for databases.
• The last decade was about capturing more data, in more shapes and forms.
• Postgres has been forked by dozens of commercial databases for new workloads. When you fork, your database diverges from the community.
• What if you could leverage the database ecosystem and grow with it?
Extending a relational database: Really?
Extending a relational database is a relatively new idea. Over the years, we received questions on this new idea.
1. Forking vs extensions: Can you really extend any database module?
2. Building from scratch vs extensions: Postgres is a relational database from an old era. It can’t do “this”.
Relational databases can’t do “this”
Postgres isn’t designed for “this”:
1. Process semi-structured data
2. Run geospatial workloads
3. Non-relational data storage
4. Scale out for large datasets
Postgres can’t do semi-structured data
• NoSQL popularized the use of semi-structured data as an alternative to data models used in relational databases. In practice, each model has benefits.
• Postgres has an extensible type system. It already supports semi-structured data types:
1. XML
2. Full-text search
3. Hstore: precursor to JSONB
4. JSON / JSONB
JSONB data type – store and query
from compose.com
JSONB data type – aggregate and index
Postgres can do semi-structured data
• PostgreSQL stores and processes semi-structured data just as efficiently as NoSQL databases. You also get rich features that come with a relational database.
• http://goo.gl/NuoLgP (Mongo vs Postgres jsonb benchmarks)
• If your semi-structured or unstructured data can’t be served by existing data types, you can always create your own type. You can even add operators, aggregate functions, or indexes.
Postgres can’t be a spatial database
• A spatial database stores and queries data that represents objects defined in a geometric space.
• Spatial databases represent geometric objects such as lines and polygons. Some databases handle complex structures such as 3D objects and topological coverages.
from boundlessgeo.com
PostGIS – Geographic objects
PostGIS – Geospatial joins
Postgres can become a spatial database
• The PostGIS extension turns PostgreSQL into one of the most popular geospatial databases in the world.
• Thousands of companies use PostGIS for spatial workloads – from projects such as OpenStreetMap to start-ups like Hotel Tonight.
• If you need more from your spatial database, you can easily extend Postgres. In fact, PostGIS comes with six other extensions for specific use cases.
Postgres can only do row storage
• Postgres 9.1+ comes with foreign data wrapper APIs. With these APIs, you can read from or write to any data source.
• Postgres already has 106 wrappers. With these, you can run SQL commands on diverse data sources:
1. S3 (read-only)
2. MongoDB
3. Oracle
4. Cstore_fdw
CStore – Columnar storage
• CStore is under development. For example, cstore doesn’t yet support Update / Delete commands.
• Cstore’s primary benefit today is compression. People use it to reduce in-memory and storage footprint.
Figure: ORC file format. The table is split into row groups of 150K rows (configurable); column data is divided into blocks (Block 1 to Block 7 shown) of 10K column values (configurable) per block.
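To make the figure's numbers concrete, here is a minimal sketch of how a row number could map onto such a layout, assuming the defaults shown (150K rows per row group, 10K column values per block). The names `ROWS_PER_GROUP`, `VALUES_PER_BLOCK`, `locate_row`, and `blocks_per_group` are illustrative and are not taken from cstore_fdw.

```python
from typing import Tuple

# Illustrative only: constants mirror the defaults shown in the figure.
# All names are hypothetical and are not part of cstore_fdw.
ROWS_PER_GROUP = 150_000    # rows per row group (configurable)
VALUES_PER_BLOCK = 10_000   # column values per block (configurable)


def locate_row(row_number: int) -> Tuple[int, int, int]:
    """Return (row group, block within group, offset within block)
    for a 0-based row number."""
    if row_number < 0:
        raise ValueError("row_number must be non-negative")
    group, offset_in_group = divmod(row_number, ROWS_PER_GROUP)
    block, offset_in_block = divmod(offset_in_group, VALUES_PER_BLOCK)
    return group, block, offset_in_block


def blocks_per_group() -> int:
    """Number of blocks each column occupies in a full row group."""
    return -(-ROWS_PER_GROUP // VALUES_PER_BLOCK)  # ceiling division


if __name__ == "__main__":
    print(locate_row(163_500))
    print(blocks_per_group())
```

Because each block holds values from a single column, a query that touches only a few columns would need to read only those columns' blocks, which is also why values compress well.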
CStore – Data Load and Query
Postgres can do more than row stores
• The default storage engine for relational databases is row-oriented. But Postgres can do far more than row stores.
• You can extend Postgres to store data in a columnar format or interact with other databases – such as DynamoDB or Oracle.
• Postgres provides extension APIs to (1) scan foreign tables, (2) scan foreign joins, (3) update foreign tables, (4) lock rows, (5) sample data, (6) override planner and executor, and more.
Postgres doesn’t scale
• “SQL doesn’t scale” answers a complex problem by making a simple statement.
• SQL means different things to different people. Depending on the context, it could mean multi-tenant (B2B) databases, short read/writes, real-time analytics, or data warehousing.
• Scaling each one of these workloads requires extending the relational database in a different way.
Citus – Distributed database
1. Citus scales out PostgreSQL
• Uses sharding and replication
• Query engine parallelizes SQL queries across machines
2. Citus extends PostgreSQL
• Uses Postgres extension APIs to cooperate with or extend all database modules
3. Available in 3 ways
• Open source, enterprise software, and managed database as a service on AWS
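As a rough illustration of what "sharding and replication" can mean, the sketch below hashes a distribution key to a shard and places each shard on more than one worker. It is an assumption-laden toy, not Citus's actual hashing or placement logic: `WORKERS`, `SHARD_COUNT`, `REPLICATION_FACTOR`, `shard_for`, and `placements` are all hypothetical names.

```python
import hashlib
from typing import List

# Hypothetical cluster description; none of these names come from Citus.
WORKERS = ["worker-1", "worker-2", "worker-3"]
SHARD_COUNT = 8
REPLICATION_FACTOR = 2


def shard_for(key: str) -> int:
    """Map a distribution-column value to a shard using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % SHARD_COUNT


def placements(shard_id: int) -> List[str]:
    """Place a shard's replicas on consecutive workers, round-robin."""
    return [
        WORKERS[(shard_id + i) % len(WORKERS)]
        for i in range(REPLICATION_FACTOR)
    ]


if __name__ == "__main__":
    shard = shard_for("tenant-42")
    print(shard, placements(shard))
```

A stable hash means the same key always lands on the same shard, so a query that filters on the distribution key can be routed to a single shard, while queries without such a filter fan out to all shards.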
Citus – Scaling out PostgreSQL
Citus – Architecture diagram (simplified)
Figure: a coordinator holding table metadata receives SELECT avg(..) FROM teams; and sends per-shard queries to the worker nodes.
• Worker node 1 (Table_1001, Table_1003): SELECT sum(…), count(…) FROM teams_1001 and SELECT sum … FROM teams_1003
• Worker node 2 (Table_1002, Table_1004): SELECT sum … FROM teams_1002 and SELECT sum … FROM teams_1004
• … Worker node N
• Each node is Postgres with Citus installed
• 1 shard = 1 Postgres table
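The query flow in the diagram, where SELECT avg(..) FROM teams becomes per-shard sum and count queries, can be sketched as follows. This is a minimal illustration rather than Citus code: the in-memory `SHARDS` dictionary stands in for the shard tables, its values are made up, and `partial_aggregate` and `distributed_avg` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional, Tuple

# Hypothetical in-memory stand-ins for the shard tables in the diagram.
SHARDS = {
    "teams_1001": [10, 20, 30],
    "teams_1002": [40],
    "teams_1003": [],
    "teams_1004": [50, 60],
}


def partial_aggregate(shard_name: str) -> Tuple[int, int]:
    """What a worker would compute for one shard: (sum, count)."""
    rows = SHARDS[shard_name]
    return sum(rows), len(rows)


def distributed_avg(shard_names: List[str]) -> Optional[float]:
    """Coordinator side: fan out per-shard queries, then merge partials."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partial_aggregate, shard_names))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    # An average over zero rows is NULL in SQL; None plays that role here.
    return total / count if count else None


if __name__ == "__main__":
    print(distributed_avg(sorted(SHARDS)))
```

Merging sums and counts, rather than averaging the per-shard averages, keeps the result correct when shards hold different numbers of rows.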
Postgres can scale
• “SQL doesn’t scale” is a simple answer to a complex problem. It’s easy to dismiss a complex problem by making a statement that trivializes it.
• SQL is hard, not impossible, to scale.
Summary
• Postgres Extension APIs provide a unique way to build new databases.
• Postgres can be extended to many different workloads:
1. jsonb: Semi-structured data
2. PostGIS: Geospatial database
3. cstore_fdw: Columnar storage (in the works)
4. Citus: Scale out your database
Conclusion
• Postgres 10 enables you to extend any database module’s behavior. This way, you can use functionality built into Postgres over decades. You can also grow with the rich ecosystem of tools and libraries.
• Extensions are a game changer for databases.
• The monolithic relational database could be dying. If so, long live Postgres!
© 2017 Citus Data. All rights reserved.
Questions?
@citusdata
Ozgun Erdogan
www.citusdata.com