the breakup - logically sharding a growing postgresql database


Upload: fred-moyer

Post on 13-Apr-2017


Page 1: The Breakup - Logically Sharding a Growing PostgreSQL Database

Logically Sharding a Growing PostgreSQL Database

The Breakup

Page 2: The Breakup - Logically Sharding a Growing PostgreSQL Database

Introductions

Students hate us.

Page 3: The Breakup - Logically Sharding a Growing PostgreSQL Database

Introductions

Turnitin.com

Samantha: database, @mzsamantha

Fred: code, @phredmoyer

Page 4: The Breakup - Logically Sharding a Growing PostgreSQL Database

The Seven Stages Of Grief Scaling

1. Shock and Denial
2. Pain and Guilt
3. Anger and Bargaining
4. Depression & Reflection
5. The Upward Turn
6. Reconstruction
7. Acceptance and Hope

Page 5: The Breakup - Logically Sharding a Growing PostgreSQL Database

The Seven Stages Of Grief Scaling

1. Shock and Denial
2. Pain and Guilt
3. Anger and Bargaining
4. Depression & Reflection
5. The Upward Turn
6. Reconstruction
7. Acceptance and Hope

1. Monolithic Scaling
2. Hardware is Expensive
3. If We Do It This Way...
4. We Are So *%@#!&
5. Down To 150 Bugs!
6. Release Day
7. Beer & Therapy (beerapy?)

Page 6: The Breakup - Logically Sharding a Growing PostgreSQL Database

The Problem

● The ability to efficiently back up and restore
● The amount of RAM required to keep indexes in memory
● Resource contention causing the query planner to make sub-optimal choices
● Aged data extending query resources and execution time
● Overlap in existing ID spaces
● No account crossover between shards, i.e. Tii-UK and Tii require separate accounts

Page 7: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 2: Options

● Account based sharding
  o Difficult to split account usage evenly across shards.

● Geographical based sharding
  o Currently have one geographical shard (UK).
  o Added deployment, poor resource utilization.

● Oracle RAC ($$$)
  o Oracle OpenWorld is Sunday in SF. No bacon there.

● Horizontal sharding
  o Move fast growing tables to separate physical hosts.
  o Break relational constraints.
  o Good path to a service oriented architecture node.

Page 8: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 2: Options

Why Did We Discuss All That Before Phase 1?

Page 9: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 2: Options

Objective Expertise.
Please Step Away From the Application.

Page 10: The Breakup - Logically Sharding a Growing PostgreSQL Database

Triage

What is going to kill us first?

Page 11: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 1: Diagnose

[Chart: table size in gigs]

Page 12: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 1: Diagnose

Database size: 507 GB
m_object_paper: 94 GB
gm3_mark: 71 GB
m_object: 53 GB
m_report_stats: 35 GB

Four tables account for half the bulk of the entire database.
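The "half the bulk" claim can be checked with a few lines of arithmetic. This is only a sketch that reuses the sizes quoted on the slide; the variable names are illustrative and not from the talk.

```python
# Table sizes in GB, copied from the slide.
DATABASE_GB = 507
largest_tables_gb = {
    "m_object_paper": 94,
    "gm3_mark": 71,
    "m_object": 53,
    "m_report_stats": 35,
}

# Sum the four largest tables and express them as a share of the whole database.
top_four_gb = sum(largest_tables_gb.values())
share = top_four_gb / DATABASE_GB

print(f"top four tables: {top_four_gb} GB of {DATABASE_GB} GB ({share:.1%})")
```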

Page 13: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 1: Diagnose

What About Table Sharding?

Page 14: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 2: Options

Three Part Two Year Proposal: Short, Mid, and Long term Goals.

Short: 3 Months
Query Partition and Refactor
Removal of ‘Leaf Service’: Marks

Page 15: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 2: Options

Three Part Two Year Proposal: Short, Mid, and Long term Goals.

Mid: 9 Months
ID Reconciliation Between Shards
Table Partitioning

Page 16: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 2: Options

Three Part Two Year Proposal: Short, Mid, and Long term Goals.

Long: 12 Months
Create DAL
Removal of Large Tables
Global Statistics and Reporting

Page 17: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 2: Options

Short Term: 12 Months Later
I do not think it means what you think it means.

Page 18: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 3: Scoping The Solution - Database

Main

Marks

Page 19: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 3: Scoping The Solution - Database

Data Up Approach:

Start with the schema
Isolate direct links
Slow, Tedious, and Painful

Page 20: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 3: Scoping The Solution - Database

Foreign-key constraints:
    "$1" FOREIGN KEY (source) REFERENCES m_object(id)
    "$2" FOREIGN KEY (reader) REFERENCES m_user(id)
    "m_dg_read_pm_review_set_fkey" FOREIGN KEY (pm_review_set) REFERENCES pm_review_set(id)
Referenced by:
    TABLE "gm_mark" CONSTRAINT "$1" FOREIGN KEY (read) REFERENCES m_dg_read(id)
    TABLE "erater_read_filter" CONSTRAINT "erater_read_filter_read_fkey" FOREIGN KEY (read) REFERENCES m_dg_read(id) ON DELETE CASCADE
    TABLE "gm3_mark" CONSTRAINT "gm3_mark_read_fkey" FOREIGN KEY (read) REFERENCES m_dg_read(id) ON DELETE CASCADE
    TABLE "gm3_rubric_scoring" CONSTRAINT "gm3_rubric_scoring_read_fkey" FOREIGN KEY (read) REFERENCES m_dg_read(id)
    TABLE "r_mark_criterion" CONSTRAINT "mark_criterion_read_fkey" FOREIGN KEY (read) REFERENCES m_dg_read(id) ON DELETE CASCADE
    TABLE "pm_review" CONSTRAINT "pm_review_id_fkey" FOREIGN KEY (id) REFERENCES m_dg_read(id)
    TABLE "r_read_audio" CONSTRAINT "r_read_audio_read_id_fkey" FOREIGN KEY (read_id) REFERENCES m_dg_read(id)
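One way to picture "isolate direct links" is to list every foreign key whose two ends would land in different databases. The sketch below transcribes the constraints shown for m_dg_read. The assignment of tables to the marks database is a hypothetical guess made only for illustration, and crossing_links is not a name from the talk.

```python
# Foreign keys from the slide, as (child_table, child_column, parent_table).
FOREIGN_KEYS = [
    ("m_dg_read", "source", "m_object"),
    ("m_dg_read", "reader", "m_user"),
    ("m_dg_read", "pm_review_set", "pm_review_set"),
    ("gm_mark", "read", "m_dg_read"),
    ("erater_read_filter", "read", "m_dg_read"),
    ("gm3_mark", "read", "m_dg_read"),
    ("gm3_rubric_scoring", "read", "m_dg_read"),
    ("r_mark_criterion", "read", "m_dg_read"),
    ("pm_review", "id", "m_dg_read"),
    ("r_read_audio", "read_id", "m_dg_read"),
]

# ASSUMPTION: a hypothetical set of tables moving to the marks database.
MARKS_TABLES = {
    "m_dg_read", "gm_mark", "gm3_mark", "gm3_rubric_scoring", "r_mark_criterion",
}


def crossing_links(foreign_keys, marks_tables):
    """Return the foreign keys whose child and parent end up in different databases."""
    return [
        (child, column, parent)
        for child, column, parent in foreign_keys
        if (child in marks_tables) != (parent in marks_tables)
    ]


for child, column, parent in crossing_links(FOREIGN_KEYS, MARKS_TABLES):
    print(f"{child}.{column} -> {parent}.id crosses the split")
```

Every link this reports is a constraint the database can no longer enforce after the split, so the application has to take over that check.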

Page 21: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 3: Scoping The Solution - Database

Original: 236 tables
New main database (192 tables)
New marks database (40 tables)

Page 22: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 3: Scoping The Solution - Code

Option 1 - Data Access Layer (DAL)
  o Separate codebase encapsulating new set of tables
  o Written in Golang, an HTTP based REST service
  o Avoids carrying forward existing technical debt
  o Requires detailed knowledge of existing product features
  o Unit tests are very helpful, but coverage is never 100%
  o 14 years of business logic (dark matter)
  o In long lived web apps, tribal knowledge is authoritative

Page 23: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 3: Scoping The Solution - Code

Option 2 - Add additional database handles to new db
  o Perceived as a safer approach (deciding factor, known risks).
  o Requires paying interest on existing technical debt.
  o Refactoring is less risky than rewriting.
  o Take advantage of existing business logic and tribal knowledge.
  o Preserve sacred cows.
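As a rough picture of what "additional database handles" can look like in application code, the sketch below keeps one connection per database and picks one by table name. It uses in-memory SQLite connections as stand-ins for the two PostgreSQL databases. The Handles class and the table set are hypothetical and not taken from the talk.

```python
import sqlite3

# ASSUMPTION: a hypothetical set of tables that moved to the marks database.
MARKS_TABLES = frozenset({"gm3_mark", "gm3_qm_template", "m_dg_read"})


class Handles:
    """Holds one connection per database and picks one by table name."""

    def __init__(self, main, marks, marks_tables=MARKS_TABLES):
        self.main = main
        self.marks = marks
        self._marks_tables = marks_tables

    def for_table(self, table):
        # Anything not explicitly moved stays on the main database.
        return self.marks if table in self._marks_tables else self.main


# In-memory SQLite stands in for the two real databases.
handles = Handles(sqlite3.connect(":memory:"), sqlite3.connect(":memory:"))
```

Keeping the lookup in one place means an existing call site changes from "the handle" to "the handle for this table" without restating where each table lives.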

Page 24: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 3: Scoping The Solution - Hardware

"We can use smaller hardware because we are splitting off part of the database"

➢ This is somewhat of a fallacy
➢ You might need smaller storage
➢ You might need slightly less CPU
➢ Stick with close to the same amount of RAM

Page 25: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Rollback

S: “What if this fails?”
F: “We roll back the code, restore the database, and look for new jobs.”

Page 26: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Rollback

Q: How do you bifurcate a database and rollback without data loss?

A: Slony.

Page 27: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Rollback

Timelines matter. Prepare in advance.
Split Replication Well In Advance.
Test Process, Then Test It Again.

Page 28: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Archaeology

● What is this table? That service doesn’t exist anymore?
    ○ Let’s Drop it!
● What’s that table? It’s an old version still in use?
    ○ Let’s Drop it!
● What’s that one over there?
    ○ Let’s Drop it!

Page 29: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Archaeology

Wait… old version still in use?

Page 30: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Archaeology

● Fourteen years of application development.
● Five major codebases, dozens of support utilities.
● Hundreds of codepoints for database connections.
● A dozen different ORMs.
● Dynamically generated SQL joining tables.
● Technical debt (code with high maintenance costs).
● Best practices of 10 years ago are now liabilities.

Page 31: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Archaeology

How do you change all of the electrical sockets in an (old) office building?

Page 32: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Archaeology

EMPATHY

Page 33: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Archaeology

EMPATHY
put yourself in the mind of the author

Page 34: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Archaeology

James left 8 years ago. The elevator is in the old building.
They tore down the old building to build a Target.

    # this code is critical to our workflow, don’t remove it!!
    # for details talk to jamesb <> who sits near the elevator
    # $foo = $object->flocculate( key => $cfg->secret_key );
    # return $foo;
    return;

Page 35: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Archaeology

Bob is still here, though. Bob is a little particular about his code (we all are, to some degree).

Now you’re in there meddling with Bob’s code. How would you feel if you were Bob?

A little empathy goes a long way towards getting Bob to help you get his code ported to the new dual database schema.

Page 36: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Queries

main database - marks database

    SELECT count(m.*)
    FROM gm3_mark m, gm3_qm_template qmt
    WHERE m.read IN (
            SELECT dgr.id
            FROM m_dg_read dgr
            JOIN m_object_paper mop ON (mop.id = dgr.source AND mop.owner = ?)
            JOIN m_assignment ma ON (ma.id = mop.assignment AND ma.class = ?)
            WHERE reader = ?
        )
      AND m.qm_template = qmt.id
      AND qmt.id = ?

Page 37: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Queries

Main database - grab ids to pass to marks database.

    SELECT p.id
    FROM m_object_paper p
    JOIN m_assignment a ON a.id = p.assignment
    WHERE a.class = ? AND p.owner = ?

Page 38: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Queries

Marks database - pass former FK ids to an IN clause.

    SELECT count(m.*)
    FROM gm3_mark m
    JOIN gm3_qm_template qmt ON qmt.id = m.qm_template
    JOIN m_dg_read dgr ON dgr.id = m.read
    WHERE dgr.source IN (?, ?, ?)
      AND qmt.id = ?
      AND dgr.reader = ?
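The two queries can be chained in application code roughly as follows. This is a sketch only: it uses in-memory SQLite connections in place of the two PostgreSQL databases, count_marks is a hypothetical name, and the table and column names are the ones shown on the slides.

```python
import sqlite3


def count_marks(main_db, marks_db, class_id, owner_id, reader_id, template_id):
    """Two-step replacement for the former single-database join."""
    # Step 1 (main database): collect the paper ids the old query joined on.
    paper_ids = [row[0] for row in main_db.execute(
        "SELECT p.id FROM m_object_paper p "
        "JOIN m_assignment a ON a.id = p.assignment "
        "WHERE a.class = ? AND p.owner = ?",
        (class_id, owner_id),
    )]
    if not paper_ids:
        return 0  # an empty IN () list is a syntax error, so stop early

    # Step 2 (marks database): one placeholder per id; values are never
    # interpolated into the SQL string.
    placeholders = ", ".join("?" for _ in paper_ids)
    sql = (
        "SELECT count(*) FROM gm3_mark m "
        "JOIN gm3_qm_template qmt ON qmt.id = m.qm_template "
        "JOIN m_dg_read dgr ON dgr.id = m.read "
        f"WHERE dgr.source IN ({placeholders}) AND qmt.id = ? AND dgr.reader = ?"
    )
    return marks_db.execute(sql, (*paper_ids, template_id, reader_id)).fetchone()[0]
```

The cost of this pattern is one extra round trip plus an IN list that grows with the number of matching papers, so very large id sets may need to be sent in batches.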

Page 39: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Transactions

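Single database transactions are easy.

    # Assumes $db was opened with AutoCommit => 0 and RaiseError => 1.
    eval {
        $db->do("INSERT INTO foo (name) VALUES (?)", undef, 'bar');
        # currval() takes a sequence name; foo_id_seq assumes a serial "id" column.
        my ($id) = $db->selectrow_array("SELECT currval('foo_id_seq')");
        $db->do("INSERT INTO fee (foo_id) VALUES (?)", undef, $id);
    };
    if ($@) {              # catch exception
        $db->rollback;     # roll transaction back
    } else {
        $db->commit;       # commit transaction
    }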

Page 40: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Transactions

Dual database transactions are harder.

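    # Sequence names assume serial "id" columns on foo and fee.
    my ($foo_id, $fee_id);
    eval {
        # insert into foo in main db, grab last value
        $main_db->do("INSERT INTO foo (name) VALUES (?)", undef, 'bar');
        ($foo_id) = $main_db->selectrow_array("SELECT currval('foo_id_seq')");

        # insert foo id into marks db, grab last value
        $marks_db->do("INSERT INTO fee (foo_id) VALUES (?)", undef, $foo_id);
        ($fee_id) = $marks_db->selectrow_array("SELECT currval('fee_id_seq')");
    };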

Page 41: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Transactions

Roll back both handles on exception, commit both on success.

    if ($@) {                    # catch exception
        $main_db->rollback;      # roll main_db back
        $marks_db->rollback;     # roll marks_db back
    } else {
        $main_db->commit;        # commit main_db
        $marks_db->commit;       # commit marks_db
    }

Page 42: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Transactions

What if the commit fails?

    if ($@) {                    # catch exception
        $main_db->rollback;      # roll main_db back
        $marks_db->rollback;     # roll marks_db back
    } else {
        eval { $main_db->commit };
        if ($@) {
            $main_db->rollback;
            $marks_db->rollback;
        }
        eval { $marks_db->commit };
        ...
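The awkward case is the second commit failing after the first has already succeeded, because the first can no longer be rolled back. The sketch below shows one possible way to order the commits and report that gap. It is an illustration under assumptions, not the code from the talk: commit_both and the on_inconsistency callback are hypothetical names, and the handles are assumed to expose commit() and rollback() in the style of Python's sqlite3 connections.

```python
import sqlite3


def commit_both(main_db, marks_db, on_inconsistency):
    """Commit main first, then marks; report the gap if only one commit lands.

    Returns True only when both commits succeed.
    """
    try:
        main_db.commit()
    except sqlite3.Error:
        # Nothing is durable yet, so both sides can still be undone.
        main_db.rollback()
        marks_db.rollback()
        return False

    try:
        marks_db.commit()
    except sqlite3.Error as exc:
        # main is already committed and cannot be rolled back from here.
        marks_db.rollback()
        on_inconsistency(exc)  # hypothetical hook: record the gap for later repair
        return False
    return True
```

This does not make the pair of commits atomic. It only narrows the window and leaves a record so the inconsistent rows can be repaired afterwards.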

Page 43: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Transactions

CAP (Brewer’s Theorem)

Page 44: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Transactions

Consistency or Availability?

Page 45: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Transactions

9 out of 10 users prefer availability

So does customer support.
You can fix consistency.

Page 46: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - ORMs

ORMs are full of pain

● They hide away db connection details.

● They make it hard to break models apart.

● They make writing code easy…

● But debugging is much more difficult.

Page 47: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - ORMs

ORMs are full of pain
Back in my day we used SQL, and we liked it.

    $classes = $c->classes->search(
        $select_hash,
        {
            '+select' => 'source.id',
            '+as'     => 'src_id',
            'join'    => [
                { 'user_rights_class' => { 'user_role' => 'owner' } },
                'source',
            ],
            'rows' => 200,
            'page' => 1,
        }
    );

Page 48: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Juggling

Talking to two databases is easy, right?

Page 49: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Juggling

Talking to two databases is easy, right?

Not as easy as it seems.

Page 50: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Juggling

Main database - Marks database

Are you talking to me?

Page 51: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Juggling

Main database - Marks database

I think he was talking to me.

Page 52: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Config

● Main Database: One master, two slaves (2)

● Marks Database: One master, two slaves (2)

● ASP application: write user, read only user (2)

● Catalyst Application: write user, read only user (2)

● REST Application: write user, read only user (2)

● dev, qa, staging, production, sandbox, uk (6)

Page 53: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Config

● Database hosts and users: 2*5 = 10

● Stages: 10 * 6 = 60

● Config managed in version control, no discovery.

● Config deployed via RPM with application.

● Get one wrong? Start all over again.

● Configuration is full of pain and suffering.
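The 2*5 = 10 and 10 * 6 = 60 counts can be reproduced by enumerating the combinations. The sketch below assumes the "(2)" next to each database means its master and slave roles and the "(2)" next to each application means its write and read only users. That reading, and all the names in the code, are an interpretation of the slides.

```python
from itertools import product

# ASSUMPTION: each of the five components has exactly two variants.
COMPONENTS = {
    "main database": ("master", "slave"),
    "marks database": ("master", "slave"),
    "ASP application": ("write user", "read only user"),
    "Catalyst application": ("write user", "read only user"),
    "REST application": ("write user", "read only user"),
}
STAGES = ("dev", "qa", "staging", "production", "sandbox", "uk")

# One config entry per (component, variant) pair within a single stage.
per_stage = [
    (name, variant)
    for name, variants in COMPONENTS.items()
    for variant in variants
]

# Every stage needs its own copy of every entry.
all_entries = list(product(STAGES, per_stage))

print(len(per_stage), len(all_entries))
```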

Page 54: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Config

Yes, we are moving to Chef.

Page 55: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Tech Debt

How much tech debt do you have?

Page 56: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Tech Debt

How much tech debt do you have?

More than you think.

Page 57: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Tech Debt

How much of it will you have to deal with?

Page 58: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Tech Debt

How much of it will you have to deal with?

More than you think.

Page 59: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Tech Debt

Our legacy app:

● 5 ORMs

● No unit tests (many integration tests)

● Two template frameworks

● 9 different log files

● Code is generally pretty readable!

Page 60: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 4: Implementation - Tech Debt

Page 61: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 5: Release

Planned 8 hour Maintenance Window
15 People + support

2.5 Hours Main Service
1.5 Hours UK
2 Hours Sandbox + Cat Videos

Page 62: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 5: Release

Page 63: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 6: Cleanup

Patch Flavors:
How Did That Get There?
That’s a bug.
It worked fine in dev.

Page 64: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 6: Cleanup

“Sometimes the query planner does dumb things”

o People forget why you embarked on this effort.

o People forget the successes and risk mitigation.

o People won’t forget the visceral reactions to service degradations.

Page 65: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 6: Cleanup

How to bring your site to a halt:

1. Start transaction to database 1
2. Start transaction to database 2
3. Wait for database 1 to finish

Page 66: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 6: Cleanup

PANIC

Page 67: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 6: Cleanup

Gone in 60 seconds

Page 68: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 6: Cleanup

Page 69: The Breakup - Logically Sharding a Growing PostgreSQL Database

Stage 6: Cleanup

Where Do We Golang From here?

Back To Plan A.

Most of the heavy lifting is done.

“The first split is the hardest” - Some Guy Here

Page 70: The Breakup - Logically Sharding a Growing PostgreSQL Database

The End

So long SurgeCon!

And thanks for the bacon.