The Hive Think Tank: Rocking the Database World with RocksDB
TRANSCRIPT
Caching a Billion Items: A Case Study of Rakuten Programmatic DSP
Qian Zhu
Principal Engineer
Rakuten Marketing
About Me
• Principal engineer, Rakuten
• Worked in Accenture Technology Labs for 4 years
  – Infrastructure and system software
• Research focus on big data infrastructure and analytics
• Ph.D. in Computer Science and Engineering, specializing in distributed systems and data mining
About Rakuten
• 3rd largest e-commerce retailer in the world
• Largest consumer services company in Japan
• Rakuten Affiliate Network (formerly Rakuten LinkShare) is a leading provider of full-service online marketing solutions, voted #1 for 4 years in a row
Retargeting Platform
• The platform delivers real-time segment loading, scalability, frequency capping, AdX inventory and AdX automatic creative tagging
• The architecture lays the foundation for extensive data flow validations, business alerts and performance optimization
Challenges
• Low latency: respond to bid requests in tens of ms
  – 100 ms on bidding
  – Generating retargeting data within a few seconds
• High I/O: handle intensive (random) query workload
  – Up to 100,000 QPS per bidder
  – Tens of bidders across the world
• Parallelism: write, read and data lifetime management in parallel
• Easy migration, eventual consistency, etc.
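To put the latency and throughput figures above side by side, here is a small back-of-the-envelope calculation. Only the 100,000 QPS and 100 ms numbers come from the slide; the worker count of 8 is a hypothetical value chosen for illustration.

```python
QPS_PER_BIDDER = 100_000   # from the slide: up to 100,000 QPS per bidder
BID_DEADLINE_MS = 100      # from the slide: 100 ms on bidding


def per_query_budget_us(qps, workers=1):
    """Average time one query may occupy a worker, in microseconds."""
    return workers * 1_000_000 / qps


# Queries that arrive at one bidder during a single bidding deadline window.
queries_per_window = QPS_PER_BIDDER * BID_DEADLINE_MS // 1000

single_thread_budget = per_query_budget_us(QPS_PER_BIDDER)
eight_worker_budget = per_query_budget_us(QPS_PER_BIDDER, workers=8)  # 8 is an assumption
```

At peak load a single query thread would have about 10 microseconds per lookup on average, which is one way to see why reads, writes and expiry are run in parallel rather than serialized.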
How RocksDB is Used
• Write thread
  – Multiple column families for segment data, segment index*, Kafka offsets and expire index
• Query thread
  – Range scan and random single record retrieval
• Expiry thread
  – Triggered at a demanded frequency
  – Delete data from both RocksDB and bidder cache
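The three roles above can be sketched with a small in-memory model. This is not the RocksDB API and not Rakuten's implementation: `ColumnFamily` and `SegmentStore` are hypothetical stand-ins, the `user|segment` key layout and the zero-padded expiry prefix are assumptions, and the sketch is single-threaded where the slide describes separate threads. The segment index column family is omitted.

```python
import bisect


class ColumnFamily:
    """Sorted key-value map standing in for one RocksDB column family."""

    def __init__(self):
        self._keys = []   # kept sorted so range scans are a slice
        self._vals = {}

    def put(self, key, value):
        if key not in self._vals:
            bisect.insort(self._keys, key)
        self._vals[key] = value

    def get(self, key):
        return self._vals.get(key)

    def delete(self, key):
        if key in self._vals:
            del self._vals[key]
            self._keys.pop(bisect.bisect_left(self._keys, key))

    def scan(self, start, end):
        """Return (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._vals[k]) for k in self._keys[lo:hi]]


class SegmentStore:
    """Assumes user and segment ids never contain the '|' separator."""

    def __init__(self):
        self.segments = ColumnFamily()       # "user|segment" -> (payload, expires_at)
        self.expire_index = ColumnFamily()   # "expiry|user|segment" -> ""
        self.kafka_offsets = ColumnFamily()  # partition -> next offset to consume
        self.bidder_cache = {}               # hypothetical in-process bidder cache

    def write(self, user, segment, payload, expires_at, partition, offset):
        key = f"{user}|{segment}"
        old = self.segments.get(key)
        if old is not None:
            # Drop the stale index entry so an old expiry cannot delete new data.
            self.expire_index.delete(f"{old[1]:012d}|{key}")
        self.segments.put(key, (payload, expires_at))
        # Zero padding makes lexicographic key order match time order.
        self.expire_index.put(f"{expires_at:012d}|{key}", "")
        self.kafka_offsets.put(str(partition), offset + 1)
        self.bidder_cache[key] = payload

    def segments_for_user(self, user):
        # Range scan: '}' is the character right after '|', so the range
        # [user|, user}) covers exactly the keys that start with "user|".
        rows = self.segments.scan(f"{user}|", f"{user}}}")
        return [(k.split("|", 1)[1], v[0]) for k, v in rows]

    def lookup(self, user, segment):
        # Random single-record retrieval.
        row = self.segments.get(f"{user}|{segment}")
        return None if row is None else row[0]

    def expire(self, now):
        # Every index key below now + 1 expires at or before `now`.
        due = self.expire_index.scan("", f"{now + 1:012d}")
        for index_key, _ in due:
            data_key = index_key.split("|", 1)[1]
            self.segments.delete(data_key)
            self.bidder_cache.pop(data_key, None)
            self.expire_index.delete(index_key)
        return len(due)


store = SegmentStore()
store.write("u1", "s1", "a", expires_at=100, partition=0, offset=41)
store.write("u1", "s2", "b", expires_at=200, partition=0, offset=42)
store.write("u10", "s1", "c", expires_at=100, partition=0, offset=43)
```

Keeping the expire index ordered by expiry time lets the expiry pass find due records with one range scan instead of visiting every record. In a real RocksDB deployment the puts to several column families would typically go into one write batch so that data, index and offset are applied atomically.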
Moving Forward
• Apply to other Rakuten services
• We are hiring!!!
  – System programmers
  – Data engineers
  – Machine learning experts
Questions?
• https://www.linkedin.com/in/qian-zhu-40a3917
• mailto: [email protected]