Getting InnoDB Compression Ready for Facebook Scale

Post on 27-Jun-2015


InnoDB Compression: Getting it ready for Facebook scale

Nizam Ordulu (nizam.ordulu@fb.com), Software Engineer, Database Engineering @ Facebook, 4/11/12

Why use compression


▪ Save disk space.

▪ Buy fewer servers.

▪ Buy better disks (SSD) without too much increase in cost.

▪ Reduce IOPS.

[Charts: Database Size, IOPS]

Sysbench Benchmarks

Sysbench: Default table schema for sysbench

CREATE TABLE `sbtest` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `k` int(10) unsigned NOT NULL default '0',
  `c` char(120) NOT NULL default '',
  `pad` char(60) NOT NULL default '',
  PRIMARY KEY (`id`),
  KEY `k` (`k`)
);

In-memory benchmark: Configuration

▪ Buffer pool size = 1G.

▪ 16 tables.

▪ 250K rows on each table.

▪ Uncompressed db size = 1.1G.

▪ Compressed db size = 600M.

▪ In-memory benchmark.

▪ 16 threads.

In-memory benchmark: Load Time

[Chart: load time in seconds for mysql-uncompressed, mysql-compressed, fb-mysql-uncompressed, and fb-mysql-compressed]

In-memory benchmark: Database size after load

[Chart: database size in MB for mysql-uncompressed, mysql-compressed, fb-mysql-uncompressed, and fb-mysql-compressed]

In-memory benchmark: Transactions per second for reads (oltp.lua, read-only)

[Chart: read-only transactions per second for mysql-uncompressed, mysql-compressed, fb-mysql-uncompressed, and fb-mysql-compressed]

In-memory benchmark: Inserts per second (insert.lua)

[Chart: inserts per second for mysql-uncompressed, mysql-compressed, fb-mysql-uncompressed, and fb-mysql-compressed; fb-mysql-compressed is 4X]

IO-bound benchmark for inserts: Inserts per second (insert.lua)

[Chart: inserts per second for mysql-uncompressed, mysql-compressed, fb-mysql-uncompressed, and fb-mysql-compressed; fb-mysql-compressed is 3.8X]

InnoDB Compression

InnoDB Compression: Basics

▪ 16K pages are compressed to 1K, 2K, 4K, 8K blocks.

▪ Block size is specified during table creation.

▪ 8K is safest if data is not too compressible.

▪ BLOBs and VARCHARs increase compressibility.

▪ In-memory workloads may require larger buffer pool.
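To see why block size and data compressibility interact, here is a minimal sketch using Python's zlib; the 128-byte per-page overhead constant is an assumption for illustration, not InnoDB's actual accounting:

```python
import os
import zlib

# Toy sketch (not InnoDB's actual code): a 16K page "fits" a compressed
# block only if the zlib output, plus some assumed per-page overhead,
# is no larger than the chosen block size.
PAGE_SIZE = 16 * 1024

def fits(page_bytes, key_block_size, overhead=128):
    """Return True if the page compresses into the given block size."""
    return len(zlib.compress(page_bytes, 6)) + overhead <= key_block_size

repetitive = (b"abcdef" * 2731)[:PAGE_SIZE]   # highly compressible
random_ish = os.urandom(PAGE_SIZE)            # essentially incompressible
print(fits(repetitive, 8 * 1024))   # True
print(fits(random_ish, 8 * 1024))   # False: a compression failure
```

This is why 8K is the safest block size when data is not very compressible: the smaller the block, the more often the compressed output will not fit.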

InnoDB Compression: Example

CREATE TABLE `sbtest1` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `k` int(10) unsigned NOT NULL DEFAULT '0',
  `c` char(120) NOT NULL DEFAULT '',
  `pad` char(60) NOT NULL DEFAULT '',
  PRIMARY KEY (`id`),
  KEY `k_1` (`k`)
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;

InnoDB Compression: Page Modification Log (mlog)

▪ InnoDB does not recompress a page on every update.

▪ Updates are appended to the modification log.

▪ The mlog is located at the bottom of the compressed page.

▪ When mlog is full, page is recompressed.
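The mlog behavior above can be sketched as a toy model; the capacity and structure here are simplified stand-ins, not InnoDB's on-page format:

```python
class CompressedPage:
    """Toy model (not InnoDB internals): updates are appended to an
    in-page modification log; only when the log fills up is the page
    actually recompressed, amortizing the CPU cost over many updates."""

    def __init__(self, mlog_capacity=4):
        self.mlog = []
        self.mlog_capacity = mlog_capacity
        self.recompress_count = 0

    def update(self, record):
        self.mlog.append(record)
        if len(self.mlog) >= self.mlog_capacity:
            self.recompress()

    def recompress(self):
        # Apply the logged updates and rebuild the compressed image.
        self.recompress_count += 1
        self.mlog.clear()

page = CompressedPage()
for i in range(10):
    page.update(("update", i))
print(page.recompress_count)  # 2: recompressed only when the mlog filled
```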

InnoDB Compression: Page Modification Log Example


InnoDB Compression: Compression failures are bad

▪ Compression failures:

▪ waste CPU cycles,

▪ cause mutex contention.

InnoDB Compression: Unzip LRU

▪ A compressed block is decompressed when it is read.

▪ The compressed and uncompressed copies are both kept in memory.

▪ Any update on the page is applied to both of the copies.

▪ When it is time to evict a page:

▪ Evict the uncompressed copy if the system is IO-bound.

▪ Evict a page from the normal LRU if the system is CPU-bound.
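A toy decision function capturing the eviction rule above (the function name and return strings are mine, not InnoDB's):

```python
def eviction_choice(io_bound):
    """Sketch of the Unzip LRU policy (simplified): under IO pressure,
    free memory by dropping only the uncompressed copy, so the page can
    be rebuilt by decompressing rather than re-reading from disk; under
    CPU pressure, evict the page entirely via the normal LRU."""
    if io_bound:
        return "uncompressed copy only"
    return "whole page (normal LRU)"

print(eviction_choice(io_bound=True))   # uncompressed copy only
print(eviction_choice(io_bound=False))  # whole page (normal LRU)
```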

InnoDB Compression: Compressed pages written to redo log

▪ Compressed pages are written to redo log.

▪ Reasons for doing this:

▪ Reuse redo logs even if the zlib version changes.

▪ Protect against nondeterminism in compression.

▪ Increase in redo log writes.

▪ Increase in checkpoint frequency.

InnoDB Compression: Official advice on tuning compression

If the number of “successful” compression operations (COMPRESS_OPS_OK) is a high percentage of the total number of compression operations (COMPRESS_OPS), then the system is likely performing well. If the ratio is low, then InnoDB is reorganizing, recompressing, and splitting B-tree nodes more often than is desirable. In this case, avoid compressing some tables, or increase KEY_BLOCK_SIZE for some of the compressed tables. You might turn off compression for tables that cause the number of “compression failures” in your application to be more than 1% or 2% of the total. (Such a failure ratio might be acceptable during a temporary operation such as a data load).
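The ratio described above can be computed directly from the COMPRESS_OPS and COMPRESS_OPS_OK counters; a minimal sketch, where the example numbers are illustrative rather than measurements:

```python
def compression_failure_rate(compress_ops, compress_ops_ok):
    """Fraction of page compression operations that failed. Per the
    advice quoted above, this should stay under roughly 1-2% outside
    of temporary operations such as bulk loads."""
    if compress_ops == 0:
        return 0.0
    return (compress_ops - compress_ops_ok) / compress_ops

# Illustrative numbers only:
print(round(compression_failure_rate(1_000_000, 590_000), 2))  # 0.41
print(round(compression_failure_rate(1_000_000, 950_000), 2))  # 0.05
```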

Facebook Improvements

Facebook Improvements: Finding bugs and testing new features

▪ Expanded mtr test suite with crash-recovery and stress tests.

▪ Simulate compression failures.

▪ Fixed the bugs revealed by the tests and production servers.

Facebook Improvements: Table-level compression statistics

▪ Added the following columns to table_statistics:

▪ COMPRESS_OPS,

▪ COMPRESS_OPS_OK,

▪ COMPRESS_USECS,

▪ UNCOMPRESS_OPS,

▪ UNCOMPRESS_USECS.

Facebook Improvements: Removal of compressed pages from redo log

▪ Removed compressed page images from redo log.

▪ Introduced a new log record for compression.

Facebook Improvements: Adaptive padding

▪ Put less data on each page to prevent compression failures.

▪ pad = 16K – (maximum data size allowed on the uncompressed copy)


Facebook Improvements: Adaptive padding

▪ Algorithm to determine the pad per table:

▪ Increase the pad until the compression failure rate reaches the specified level.

▪ Decrease padding if the failure rate is too low.

▪ Adapts to the compressibility of data over time.
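The feedback loop above can be sketched as follows; the step size, target failure rate, and cap are hypothetical values for illustration, not the actual fb-mysql tuning:

```python
def adapt_pad(pad, failure_rate, target=0.05, step=128, max_pad=4096):
    """One step of a hypothetical adaptive-padding controller: pad more
    when compressions fail too often, pad less when failures are rare
    (padding wastes space, so back off when it is not needed)."""
    if failure_rate > target:
        return min(pad + step, max_pad)
    if failure_rate < target / 2:
        return max(pad - step, 0)
    return pad

pad = 0
# Simulated feedback: high failure rates push the pad up...
for rate in [0.4, 0.3, 0.2]:
    pad = adapt_pad(pad, rate)
print(pad)  # 384
# ...and consistently low failure rates pull it back down.
for rate in [0.01, 0.01]:
    pad = adapt_pad(pad, rate)
print(pad)  # 128
```

Over time the pad settles wherever the data's compressibility puts the failure rate near the target, which is the adaptation the slide describes.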

Facebook Improvements: Adaptive padding on insert benchmark

▪ Padding value for sbtable is 2432.

▪ Compression failure rate:

▪ mysql: 41%.

▪ fb-mysql: 5%.

[Chart: inserts per second for mysql-compressed and fb-mysql-compressed]

Facebook Improvements: Compression ops in insert benchmark

[Chart: compress_ops_ok and compress_ops_fail counts for mysql-compressed and fb-mysql-compressed]

Facebook Improvements: Time spent for compression ops in insert benchmark

[Chart: compress_time(s) and decompress_time(s) for mysql-compressed and fb-mysql-compressed]

Facebook Improvements: Other improvements

▪ Reduced the amount of empty allocated pages from 10-15% to 2-5%.

▪ Cache memory allocations for:

▪ compression buffers,

▪ decompression buffers,

▪ buffer page descriptors.

▪ Hardware accelerated checksum for compressed pages.

▪ Removed adler32 calls from zlib functions.

Facebook ImprovementsFuture work

▪ Make page_zip_compress() more efficient.

▪ Test larger page sizes: 32K, 64K.

▪ Prefix compression.

▪ Other compression algorithms: snappy, quicklz, etc.

▪ 3X compression in production.

Questions

nizam.ordulu@fb.com
