Web scale MySQL at Facebook (Domas Mituzas)


Pages 1–2: Web scale MySQL at Facebook (Domas Mituzas)

Web scale MySQL @ Facebook

Domas Mituzas, 2011-10-03

Page 3: Web scale MySQL at Facebook (Domas Mituzas)

Agenda

1 Intro

2 Current

3 Future

Page 4: Web scale MySQL at Facebook (Domas Mituzas)

Facebook

• 800M active monthly users

• 500M active daily users

• 350M mobile users

• 7M apps and websites integrated via platform

Page 5: Web scale MySQL at Facebook (Domas Mituzas)

Current

1 Setup

2 Performance Overview

3 Stalls

4 Efficiency

5 Projects

Page 6: Web scale MySQL at Facebook (Domas Mituzas)

Setup

▪ Software

▪ MySQL 5.1

▪ Custom Facebook patch

▪ Launchpad - mysqlatfacebook

▪ Extra resiliency

▪ Reduced operations effort

▪ Hardware

▪ Variety of generations

▪ Many core

▪ Local storage

▪ Some flash storage

Page 7: Web scale MySQL at Facebook (Domas Mituzas)

UDB performance numbers (from Sep. 2011)

▪ Query response time

▪ 4ms reads, 5ms writes

▪ Network bytes sent per second

▪ 90GB peak

▪ Queries per second

▪ 60M peak

▪ Rows read per second

▪ 1450M peak

▪ Rows changed per second

▪ 3.5M peak

▪ InnoDB page IO per second

▪ 8.1M peak

Page 8: Web scale MySQL at Facebook (Domas Mituzas)

Performance focus

▪ Focus on reliable throughput in production

▪ Avoid performance stalls

▪ Make sure hardware is used

▪ 99th percentile rather than average or median

▪ Worst-offender analysis – top-N & histograms instead of tier averages (see the sketch below)

Page 9: Web scale MySQL at Facebook (Domas Mituzas)

Stalls

▪ “Dogpiles”

▪ Temporary slowdown – even 0.1s is huge

Page 10: Web scale MySQL at Facebook (Domas Mituzas)

Stall tools

▪ Dogpiled (in-house)

▪ Snapshot aggregation of server state at distress

▪ “time machine” view into logs before the event too

▪ Aspersa (stalk, collect)

▪ Poor man’s profiler (poormansprofiler.org) – aggregation idea sketched below

▪ Later iterations – apmp, hpmp, tpmp

▪ GDB

Page 11: Web scale MySQL at Facebook (Domas Mituzas)

Stalls found

▪ Tables extending – global I/O mutex held

▪ Drop table – both SQL layer and InnoDB global mutexes held

▪ Purge contention – unnecessary dictionary lock held

▪ Binlog reads – no commits can happen while old events are being read

▪ Kernel mutex – O(N) and O(N^2) operations

▪ Transaction creation

▪ Lock creation/removal, deadlock detection

▪ Background page flushing not really background

▪ Many more

Page 12: Web scale MySQL at Facebook (Domas Mituzas)

Efficiency

▪ Increasing utilization of hardware

▪ Memory to Disk ratio

▪ Finding bottlenecks

▪ Disk bound normally

▪ Sometimes network

▪ Application or server software chokepoints

▪ Rarely CPU/memory bandwidth

▪ Application design

▪ Biggest wins are in optimizing the workload

Page 13: Web scale MySQL at Facebook (Domas Mituzas)

Disk efficiency

▪ Normally disk IOPS bound

▪ Allowing higher queue lengths

▪ Can operate at more than 8 pending operations per disk

▪ InnoDB page size

▪ Needs to be adjustable per table or index for real gain

▪ XFS/deadline

▪ Parallelism at MySQL layer

▪ >300 IOPS on 166 rps (10K RPM) disks – see the toy model below

Page 14: Web scale MySQL at Facebook (Domas Mituzas)

Memory efficiency

▪ Compact records – Thrift compaction for objects, etc.

▪ Clustered and covering index planning

▪ FORCE INDEX – avoid unnecessary I/O and cached pages

▪ Historical data access costly

▪ Full table scans

▪ ETL-type queries, mysqldump, …

▪ Tune midpoint-insertion LRU for InnoDB (sketched below)

▪ Incremental updating, incremental binary backups

▪ O_DIRECT data and logs access

Page 15: Web scale MySQL at Facebook (Domas Mituzas)

Pure flash (cheating)

▪ Data stored directly on flash

▪ Limited data size

▪ Not utilizing flash card fully

▪ Still used in some cases

Page 16: Web scale MySQL at Facebook (Domas Mituzas)

Flashcache

▪ Flash in front of disks

▪ Can use slower disks

▪ Write-back cache

▪ Much more data storage

▪ Able to utilize much more of flash card

▪ Very long warmup time

▪ Open source (github/facebook/flashcache)

Page 17: Web scale MySQL at Facebook (Domas Mituzas)

MySQL 2x

▪ Flash allows for large loads

▪ Large performance difference from pure disk servers

▪ Many older servers still being used

▪ Solution?

▪ Run multiple MySQL instances per server

▪ Use ports 3307, 3308, 3309, etc…

▪ Replication prevents direct consolidation

▪ Redo a lot of port assumptions in code

Page 18: Web scale MySQL at Facebook (Domas Mituzas)

Application caching

▪ Old: memcached

▪ Cache invalidation stampedes, refetching full dataset on refresh, many copies

▪ New: write-through caching (sketched below)

▪ Incremental cache updates

▪ Cache hierarchies for datacenter local copies

▪ Efficient operations for association set

▪ Common API for all use cases

Page 19: Web scale MySQL at Facebook (Domas Mituzas)

Group commit

▪ Some OLTP workloads too busy even for modern RAID cards

▪ High I/O pressure increases response times

▪ Durability compromises increase operational overhead

▪ Dead batteries are extremely painful otherwise

▪ Now in 5.1.52-fb – the batching idea is sketched below

Page 20: Web scale MySQL at Facebook (Domas Mituzas)

Admission control

▪ Server resources are limited

▪ Per-account thread concurrency (sketched below)

▪ Reduces O(N^2) blowup chance

▪ max_connections no longer impacts server load

▪ Per-application resource throttling

▪ Now in 5.1.52-fb

Page 21: Web scale MySQL at Facebook (Domas Mituzas)

Online Schema Change

▪ External PHP script, open source (flow sketched below)

▪ Utilizes triggers for change tracking

▪ Used on 100GB+ tables

▪ Dump/reload + fast index creation

▪ Extendable class, may allow:

▪ PK composition changes with conflict resolution

▪ Indexing previously unindexed datasets

Page 22: Web scale MySQL at Facebook (Domas Mituzas)

Tools

▪ Table and user statistics

▪ Shadows

▪ Slocket

▪ pmysql

▪ Replication sampling

▪ Client log aggregation

▪ Query comments

▪ Indigo (Query monitor)

Page 23: Web scale MySQL at Facebook (Domas Mituzas)

Future

1 Visibility

2 Replication

3 Compression

Page 24: Web scale MySQL at Facebook (Domas Mituzas)

Future

▪ MySQL is never a solved problem

▪ Always investigating better/new solutions

▪ New hardware types

▪ New datacenters and topologies

▪ New use cases and clients

▪ New neighbors to share data with

Page 25: Web scale MySQL at Facebook (Domas Mituzas)

Visibility

▪ Never assume

▪ Use metrics to measure

▪ When metrics aren’t available, add them

▪ Full stack

▪ More InnoDB info

▪ More application info

Page 26: Web scale MySQL at Facebook (Domas Mituzas)

Replication

▪ Lag used to be a big problem and is still a bottleneck

▪ Possible solutions:

▪ “Better” slave prefetch (idea sketched below)

▪ Maatkit version has problems

▪ Our own version being used on some tiers successfully

▪ May be possible with InnoDB cooperation

▪ Continuent parallel slave

▪ Oracle parallel slave in 5.6

Page 27: Web scale MySQL at Facebook (Domas Mituzas)

InnoDB Compression

▪ Originally planned during the 5.1 upgrade

▪ Problems

▪ Replication stream cost

▪ Increased log writes

▪ Performance in some cases

▪ Stability, monitoring, etc

Page 28: Web scale MySQL at Facebook (Domas Mituzas)

(c) 2009 Facebook, Inc. or its licensors. "Facebook" is a registered trademark of Facebook, Inc. All rights reserved.