Architecting Your PostgreSQL Application for the Cloud

Jim Mlodgenski
Chief Architect, EnterpriseDB

Agenda

• What is the Cloud?
• Amazon EC2
• PostgreSQL Applications in the Cloud
• Maintenance
• Security
• High Availability
• Scaling the Data Tier

What is the Cloud?

• Dynamically scalable cluster of resources
  – It is really hosted virtualized instances

• Frequently referred to as Utility or Elastic Computing

• There is no standard across providers

• It will be a key part of IT in the future

Who are the players?

Amazon EC2

• Elastic Compute Cloud (EC2)
  – Uses Xen virtualization to run images
• Amazon Machine Images (AMI)
  – Hundreds of pre-configured images available to the public
  – Tools available to bundle customized images

Amazon EC2

• Instances
  – A range of virtual environments
    • Small – 1 core, 1.7GB RAM, 160GB Storage
    • Xlarge – 4 cores, 15GB RAM, 1.6TB Storage
    • Xlarge High CPU – 8 cores, 7GB RAM, 1.6TB Storage

Amazon EC2 - Storage

• Instance Storage
  – Tied to the active instance
  – When the instance shuts down it all goes away
• Elastic Block Store
  – Persistent storage
  – Acts like a raw block device
  – Tied to a physical zone
  – Software RAID across multiple volumes is possible
• Simple Storage Service (S3)
  – Reliable and Redundant
  – Not a block device
  – Only a web service interface

Amazon EC2

• Storage Performance
  ● The EBS Storage is about the same speed as the internal instance storage
  ● The IO speed of the Xlarge instances is extremely good

[Figure: Bonnie++ Results. Sequential writes and sequential reads (K/sec) for MacBook, Small, Small EBS, Xlarge, and Xlarge High CPU.]

Amazon EC2

• Memory Performance
  ● The speed of the memory of even the largest instances is slow compared to physical machines
  ● This affects the performance of queries that are entirely returned out of cache

[Figure: Bandwidth Memory Results. Read and write (MB/sec) for MacBook, Small, Xlarge, and Xlarge High CPU.]

PostgreSQL Applications in the Cloud

• Elastic Databases
  ● They do not exist for transactional systems
  ● Possible to set up an AMI with PostgreSQL
    ● The “data” directory is ever changing
  ● Need to scale the database via traditional methods
    ● Horizontally
    ● Vertically

PostgreSQL Applications in the Cloud (cont.)

• Prepare for Elastic Application Tier
  – The Application Tier is easily scaled
  – Connection Pooling
    • A connection pooling layer can be made elastic
      – PgBouncer
      – pgpool-II
  – Minimize Server Side Code
    • Put business logic in the Application Tier
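To make the pooling point concrete, here is a minimal sketch of what a pooling layer does, written in plain Python. It is not how PgBouncer or pgpool-II are implemented; `SimplePool` and the `connect` factory are hypothetical names used only for illustration. The idea is that application instances can come and go while the number of open database connections stays capped.

```python
import queue
import threading
from contextlib import contextmanager


class SimplePool:
    """Hypothetical capped pool; a real deployment would use PgBouncer or pgpool-II."""

    def __init__(self, connect, max_size):
        # `connect` is any zero-argument callable returning a connection-like object.
        self._connect = connect
        self._max_size = max_size
        self._idle = queue.Queue()
        self._created = 0
        self._lock = threading.Lock()

    @property
    def created(self):
        return self._created

    def acquire(self, timeout=None):
        # Prefer an idle connection if one is available.
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            pass
        # Otherwise open a new one, but only while under the cap.
        with self._lock:
            can_create = self._created < self._max_size
            if can_create:
                self._created += 1
        if can_create:
            try:
                return self._connect()
            except Exception:
                with self._lock:
                    self._created -= 1
                raise
        # At the cap: wait until another caller releases a connection.
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        self._idle.put(conn)

    @contextmanager
    def connection(self, timeout=None):
        conn = self.acquire(timeout)
        try:
            yield conn
        finally:
            self.release(conn)
```

With `max_size=2`, any number of application workers share at most two backend connections; a third concurrent caller waits (or times out with `queue.Empty`) instead of opening another session on the database server.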

Maintenance

• Backups
  – Online backups with PITR are always best
  – Nothing wrong with pg_dump
• Storing the backups
  – S3
    • Extremely reliable
    • Difficult to automate with scripts
      – Web service interface
  – EBS Snapshots
    • Backup to S3
    • Command line tools available
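As an illustration of the pg_dump option above, the following sketch builds a timestamped `pg_dump` command line and runs it. The database name, output directory, host, and user are hypothetical placeholders; only the `pg_dump` options themselves (`--host`, `--port`, `--username`, `--format`, `--file`) are standard. Copying the resulting file to S3 or onto an EBS volume is deliberately left out.

```python
import subprocess
from datetime import datetime, timezone


def build_pg_dump_command(dbname, out_dir, host="localhost", port=5432,
                          user="postgres", now=None):
    """Return (argv, output_path) for a custom-format pg_dump run."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    out_file = f"{out_dir}/{dbname}-{stamp}.dump"
    cmd = [
        "pg_dump",
        "--host", host,
        "--port", str(port),
        "--username", user,
        "--format", "custom",   # compressed archive, restorable with pg_restore
        "--file", out_file,
        dbname,
    ]
    return cmd, out_file


def run_backup(dbname, out_dir, **kwargs):
    """Run pg_dump and return the path of the dump file; raises on failure."""
    cmd, out_file = build_pg_dump_command(dbname, out_dir, **kwargs)
    subprocess.run(cmd, check=True)
    return out_file
```

Calling `run_backup("appdb", "/backups")` from a scheduled job would leave one dump file per run. This is a logical backup only; point-in-time recovery needs WAL archiving configured on the server.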

Maintenance (cont.)

• Performance Tuning

– Understand the ratio of the speed of IO vs memory

– Remember to adjust the configuration file when scaling vertically

[Figure: Ratio of Memory vs. IO Speed. Memory/IO ratio for writes and for reads on MacBook, Small, Small EBS, Xlarge, and Xlarge High CPU.]
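The reminder about adjusting the configuration file when scaling vertically can be sketched as a small helper that recomputes memory settings from the instance's RAM. The 25% and 75% fractions are a common community rule of thumb and an assumption here, not figures from this talk; treat the output as a starting point to benchmark, not a recommendation.

```python
def suggest_memory_settings(ram_gb):
    """Rule-of-thumb starting values for postgresql.conf (assumed fractions)."""
    ram_mb = int(ram_gb * 1024)
    return {
        # Assumption: about one quarter of RAM for PostgreSQL's own buffer cache.
        "shared_buffers": f"{ram_mb // 4}MB",
        # Assumption: about three quarters of RAM as the planner's cache estimate.
        "effective_cache_size": f"{ram_mb * 3 // 4}MB",
    }


# RAM sizes taken from the instance list earlier in the deck.
INSTANCE_RAM_GB = {"Small": 1.7, "Xlarge High CPU": 7, "Xlarge": 15}

if __name__ == "__main__":
    for name, ram_gb in INSTANCE_RAM_GB.items():
        print(f"# {name}")
        for key, value in suggest_memory_settings(ram_gb).items():
            print(f"{key} = {value}")
```

Moving from a Small to an Xlarge instance without rerunning something like this leaves PostgreSQL sized for 1.7GB on a 15GB machine.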

Security

• The Cloud is on the public internet
  – Use good security practices
    • Firewall the server
  – Lock down PostgreSQL
    • Change the default port
    • Use pg_hba.conf
    • Authentication
    • SSL communication
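One way to combine the pg_hba.conf, authentication, and SSL points is to allow only SSL connections from a known address range. The sketch below generates such a `hostssl` rule and validates the network first. The database name, user name, and address range are hypothetical, and the set of accepted methods is a deliberately small assumption that leaves out `trust`.

```python
import ipaddress

# Assumption: only these pg_hba.conf auth methods are acceptable here;
# "trust" is excluded on purpose for a server reachable from the internet.
ALLOWED_METHODS = {"md5", "cert", "reject"}


def hba_ssl_rule(database, user, cidr, method="md5"):
    """Return a pg_hba.conf line that accepts only SSL connections from `cidr`."""
    # strict=True rejects inputs like 10.0.0.5/24 that have host bits set.
    network = ipaddress.ip_network(cidr, strict=True)
    if method not in ALLOWED_METHODS:
        raise ValueError(f"auth method not allowed: {method!r}")
    return f"hostssl {database} {user} {network} {method}"


if __name__ == "__main__":
    # Hypothetical application database, role, and private subnet.
    print(hba_ssl_rule("appdb", "appuser", "10.0.0.0/24"))
```

The non-default port and `ssl = on` are set in postgresql.conf rather than pg_hba.conf, so this rule covers only part of the lock-down list.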

Security (cont.)

• Storage is semi-public
  – Maintain good security practices
    • Encryption on the data directory
  – S3 for backups
    • Use SSL when communicating with S3

High Availability

• Replication
  – Use the traditional replication methods
    • Slony
    • Log Shipping
    • Bucardo
    • pgpool-II

High Availability (cont.)

• Clustering
  – Not as simple to use traditional methods
    • No shared disk
  – Active/Passive Clustering
    • iSCSI for the shared disk

Scaling the Data Tier

• Replication
• Massively Parallel Processing
  – GridSQL
  – pgpool-II

Scaling the Data Tier (cont.)

• Federation and Sharding

– Horizontally partition the data across many smaller databases

• PL/Proxy

• Hibernate Shards
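Horizontal partitioning comes down to a stable function from a key to a database. The sketch below shows that routing step on the application side. It is not PL/Proxy or Hibernate Shards, and the connection strings are hypothetical. A cryptographic hash is used only because it is stable across processes and restarts, unlike Python's built-in `hash()`.

```python
import hashlib


def shard_for(key, shard_count):
    """Map a key to a shard index in [0, shard_count) using a stable hash."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def dsn_for(key, shards):
    """Pick the connection string of the shard that owns `key`."""
    return shards[shard_for(key, len(shards))]


# Hypothetical shard databases; each holds a disjoint slice of the rows.
SHARDS = [
    "host=shard0.example.internal dbname=app",
    "host=shard1.example.internal dbname=app",
    "host=shard2.example.internal dbname=app",
    "host=shard3.example.internal dbname=app",
]

if __name__ == "__main__":
    for customer_id in (1, 2, 3, 1001):
        print(customer_id, "->", dsn_for(customer_id, SHARDS))
```

Plain modulo routing keeps the example short, but changing the number of shards moves most keys to a different shard; real deployments usually plan the shard count up front or add a lookup layer.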

Scaling the Data Tier (cont.)

• Pgpool-II and Slony can be a very scalable architecture

– But be sure the data is read consistently across nodes
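The consistency caveat exists because Slony replicates asynchronously, so a replica can briefly lag behind the primary. A minimal sketch of one mitigation, assuming the application can tag requests with a session id: send writes to the primary, and keep a session's reads on the primary for a short window after it writes. `ReadWriteRouter` and `max_lag_seconds` are hypothetical names, and the window is an assumed upper bound on replication lag, not something Slony or pgpool-II guarantees.

```python
import itertools
import time


class ReadWriteRouter:
    """Hypothetical router: writes to the primary, reads round-robin to replicas."""

    def __init__(self, primary, replicas, max_lag_seconds, clock=time.monotonic):
        if not replicas:
            raise ValueError("at least one replica is required")
        self._primary = primary
        self._replicas = itertools.cycle(replicas)
        self._max_lag = max_lag_seconds  # assumed bound on replication delay
        self._clock = clock
        self._last_write = {}  # session id -> time of that session's last write

    def route_write(self, session_id):
        self._last_write[session_id] = self._clock()
        return self._primary

    def route_read(self, session_id):
        last = self._last_write.get(session_id)
        # A session that wrote recently reads from the primary so that it
        # sees its own changes even if the replicas have not caught up.
        if last is not None and self._clock() - last < self._max_lag:
            return self._primary
        return next(self._replicas)
```

Sessions that have not written recently still spread their reads across the replicas, so the read scaling benefit is kept for most traffic.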

Summary

• PostgreSQL can excel on the Cloud
  – But it is not always nirvana
• Utility model is interesting
  – But watch your costs closely!!!

• Manage PostgreSQL much like it is done in a traditional environment

Contact Information

Jim Mlodgenski
Chief Architect, EnterpriseDB

[email protected]

Thank you. Questions?