The Hive Think Tank: Rocking the Database World with RocksDB

Upload: pashu-dewailly-christensen

Post on 16-Apr-2017


TRANSCRIPT

Page 1: The Hive Think Tank: Rocking the Database World with RocksDB

Caching a Billion Items: A Case Study of Rakuten Programmatic DSP

Qian Zhu, Principal Engineer, Rakuten Marketing

About Me

• Principal engineer, Rakuten
• Worked in Accenture Technology Labs for 4 years
  – Infrastructure and system software
• Research focus on big data infrastructure and analytics
• Ph.D. in Computer Science and Engineering, specializing in distributed systems and data mining

About Rakuten

• 3rd largest e-commerce retailer in the world
• Largest consumer services company in Japan
• Rakuten Affiliate Network (formerly Rakuten LinkShare) is a leading provider of full-service online marketing solutions, voted #1 for 4 years in a row

Retargeting Platform

• The platform delivers real-time segment loading, scalability, frequency capping, AdX inventory and AdX automatic creative tagging

• The architecture lays the foundation for extensive data flow validations, business alerts and performance optimization

Challenges

• Low latency: respond to bid requests in 10s of ms
  – 100 ms on bidding
  – Generating retargeting data within a few seconds
• High I/O: handle intensive (random) query workload
  – Up to 100,000 QPS per bidder
  – Tens of bidders across the world
• Parallelism: write, read and data lifetime management in parallel
• Easy migration, eventual consistency, etc.
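
To put the throughput figure in perspective, here is a rough back-of-the-envelope calculation. At 100,000 QPS, a bidder that served lookups strictly one at a time would have only 10 microseconds per query on average, which is one reason the parallelism requirement matters. The function name and the worker counts below are illustrative assumptions, not figures from the talk.

```python
def per_query_budget_us(qps, workers=1):
    """Average time available per query, in microseconds.

    Assumes `workers` query threads share the load evenly (hypothetical model).
    """
    return workers * 1_000_000 / qps


# 100,000 QPS is the per-bidder peak quoted on the slide;
# the worker counts are made-up examples.
for workers in (1, 8, 32):
    print(workers, "worker(s):", per_query_budget_us(100_000, workers), "us per query")
```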

How RocksDB is Used

• Write thread
  – Multiple column families for segment data, segment index*, Kafka offsets and expire index
• Query thread
  – Range scan and random single record retrieval
• Expiry thread
  – Triggered at a demanded frequency
  – Delete data from both RocksDB and bidder cache
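
The three roles above can be sketched with a small in-memory stand-in. This is not the RocksDB API and not the speaker's implementation: `SortedStore`, the column-family names, and the `user|segment` key layout are all hypothetical, chosen only to show how a segment column family, a time-ordered expire index, and a range scan could fit together. The segment index and Kafka offsets column families are left out for brevity.

```python
import bisect


class SortedStore:
    """In-memory stand-in for a key-ordered store with column families.

    Hypothetical illustration only; this is NOT the RocksDB API.
    """

    def __init__(self, families):
        # Each column family keeps its keys sorted, like an ordered KV store.
        self._keys = {name: [] for name in families}
        self._vals = {name: {} for name in families}

    def put(self, cf, key, value):
        if key not in self._vals[cf]:
            bisect.insort(self._keys[cf], key)
        self._vals[cf][key] = value

    def get(self, cf, key):
        return self._vals[cf].get(key)

    def delete(self, cf, key):
        if key in self._vals[cf]:
            del self._vals[cf][key]
            del self._keys[cf][bisect.bisect_left(self._keys[cf], key)]

    def scan(self, cf, start, end):
        """Return (key, value) pairs with start <= key < end, in key order."""
        keys = self._keys[cf]
        lo = bisect.bisect_left(keys, start)
        hi = bisect.bisect_left(keys, end)
        return [(k, self._vals[cf][k]) for k in keys[lo:hi]]


def write_segment(store, user, segment, expires_at):
    """Write role: store the record and index it by expiry time."""
    key = f"{user}|{segment}"
    store.put("segment", key, expires_at)
    # Zero-padded timestamp keeps the expire index in chronological order.
    store.put("expire_index", f"{expires_at:012d}|{key}", key)


def segments_for(store, user):
    """Query role: range scan over every key that starts with 'user|'."""
    prefix = f"{user}|"
    # Assumes segment names never start with '~', which sorts after
    # letters and digits in ASCII.
    rows = store.scan("segment", prefix, prefix + "~")
    return [key.split("|", 1)[1] for key, _ in rows]


def expire(store, now, bidder_cache):
    """Expiry role: remove records due before `now` from store and cache."""
    removed = 0
    for index_key, key in store.scan("expire_index", "", f"{now:012d}"):
        current = store.get("segment", key)
        # Skip stale index entries left behind when a record was rewritten
        # with a later expiry.
        if current is not None and current < now:
            store.delete("segment", key)
            bidder_cache.pop(key, None)
            removed += 1
        store.delete("expire_index", index_key)
    return removed
```

In this sketch a periodic timer would call `expire(store, now, cache)`, while `segments_for(store, user)` serves the range-scan path and `store.get("segment", key)` serves single-record lookups. Keeping the expire index in its own column family lets the expiry pass touch only entries that are actually due instead of scanning all segment data.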

Moving Forward

• Apply to other Rakuten services
• We are hiring!!!
  – System programmers
  – Data engineers
  – Machine learning experts
