Building a Scalable and Modern Infrastructure at CARFAX
DESCRIPTION
The CARFAX vehicle history database contains over twelve billion documents in a twelve-shard cluster that replicates to multiple data centers. This is a step-by-step walkthrough of how we deploy our servers, manage high-volume reads and writes, and configure for high availability. By automating everything from the operating system install up, we are able to deploy complete replica clusters quickly and efficiently. Using distributed processing and message queuing, we load millions of new documents each day, with projected growth of over a billion records per year. Through the use of tagging, server configuration, and read settings, we deliver content with high consistency and availability.
TRANSCRIPT
![Page 1: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/1.jpg)
A Scalable and Modern Infrastructure at CARFAX
![Page 2: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/2.jpg)
About Me
• Jai Hirsch – Senior Systems Architect, Data Technologies at CARFAX
• Long-time Java and Database Developer
• Data and Distributed Processing Enthusiast
• GitHub: https://github.com/JaiHirsch
• Twitter: @JaiHirsch
• Blog: http://jaihirsch.github.io/straw-in-a-haystack/
![Page 3: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/3.jpg)
“CARFAX helps millions of people buy and sell used cars with more confidence”
![Page 4: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/4.jpg)
CARFAX Vehicle History Report
![Page 5: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/5.jpg)
Documents on the Report
![Page 6: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/6.jpg)
NoSQL Before it Was Cool
Proprietary Key Value Store on OpenVMS Developed by CARFAX in 1984
![Page 7: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/7.jpg)
Never mind that sh*t! Here comes Mongo!
![Page 8: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/8.jpg)
Why MongoDB?
• Legacy structures mapped to documents
• High availability using replica sets
• Platform independence
• Support
![Page 9: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/9.jpg)
MongoDB at CARFAX
• Our Production Environment
• The Legacy Database and High Volume Loads
• High Availability Reads
![Page 10: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/10.jpg)
Our Production Environment
![Page 11: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/11.jpg)
Server Deployment
AUTOMATE
![Page 12: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/12.jpg)
Server Configuration
12 shards with two spare servers racked for failover
• OS: Linux
• MongoDB 2.4.9
• 128 GB of RAM
• 1.8 TB of drive space
• 10K RPM SAS drives
![Page 13: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/13.jpg)
![Page 14: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/14.jpg)
The Future
![Page 15: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/15.jpg)
![Page 16: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/16.jpg)
Extract, Transform, Load
![Page 17: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/17.jpg)
Loading Millions to Billions of Records per Day
AUTOMATE
![Page 18: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/18.jpg)
First Attempt To Load Was Completely CPU Bound
![Page 19: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/19.jpg)
Not Acceptable! 45 Days to Backload the Legacy Database
![Page 20: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/20.jpg)
Distributed Processing
![Page 21: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/21.jpg)
Acceptable! Billion+ Inserts per Day! 9 Days to Backload
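The talk does not show the loader itself, so the following is only a minimal sketch of the general pattern the slides describe: records go onto a queue, and several workers transform and insert them in batches. Everything named here (`transform`, `BATCH_SIZE`, `NUM_WORKERS`, the in-memory `sink` standing in for a bulk insert into MongoDB) is hypothetical and chosen for illustration, not taken from the CARFAX implementation.

```python
import queue
import threading

BATCH_SIZE = 1000   # hypothetical batch size for each bulk insert
NUM_WORKERS = 4     # hypothetical number of workers
_STOP = object()    # sentinel telling a worker to shut down


def transform(record):
    """Hypothetical transform: map a legacy key/value pair to a document."""
    key, value = record
    return {"_id": key, "payload": value}


def worker(work_queue, sink, lock):
    """Pull records off the queue, transform them, and flush in batches."""
    batch = []
    while True:
        item = work_queue.get()
        if item is _STOP:
            break
        batch.append(transform(item))
        if len(batch) >= BATCH_SIZE:
            with lock:
                sink.extend(batch)  # stand-in for a bulk insert
            batch = []
    if batch:  # flush whatever is left when the sentinel arrives
        with lock:
            sink.extend(batch)


def load(records, num_workers=NUM_WORKERS):
    """Fan records out to the workers and return everything they inserted."""
    work_queue = queue.Queue(maxsize=10000)
    sink, lock = [], threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(work_queue, sink, lock))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for record in records:
        work_queue.put(record)
    for _ in threads:
        work_queue.put(_STOP)  # one sentinel per worker
    for t in threads:
        t.join()
    return sink
```

Threads keep the sketch self-contained, but they would not relieve a CPU-bound transform in Python. In a real deployment the workers would more likely be separate processes or machines consuming from a message broker, which is what lets the transform step scale out.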
![Page 22: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/22.jpg)
The MongoDB Implementation
• 13.6 billion+ documents
• 1.5 billion+ new documents per year
• Document size: ~800 bytes
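As a rough back-of-the-envelope check, the figures above can be combined with the 12-shard, 1.8 TB-per-server configuration from the earlier slide. This sketch assumes documents are spread evenly across shards and counts raw document bytes only; it ignores indexes, padding, and the oplog, so real disk usage would be higher.

```python
DOCS_TOTAL = 13.6e9     # documents, from the slide
DOCS_PER_YEAR = 1.5e9   # new documents per year, from the slide
DOC_BYTES = 800         # approximate document size, from the slide
SHARDS = 12             # from the server configuration slide
TB = 1e12               # decimal terabyte


def sizing():
    """Estimate raw data volume, assuming an even spread across shards."""
    raw_tb = DOCS_TOTAL * DOC_BYTES / TB
    growth_tb = DOCS_PER_YEAR * DOC_BYTES / TB
    return {
        "raw_tb": raw_tb,
        "per_shard_tb": raw_tb / SHARDS,
        "growth_tb_per_year": growth_tb,
        "growth_per_shard_tb_per_year": growth_tb / SHARDS,
        "new_docs_per_day": DOCS_PER_YEAR / 365,
    }


if __name__ == "__main__":
    for name, value in sizing().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the raw data comes to about 10.9 TB in total, or roughly 0.9 TB per shard against 1.8 TB of drive space, and 1.5 billion documents per year works out to about 4.1 million new documents per day.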
![Page 23: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/23.jpg)
A Vehicle History Report (VHR) Uses 200+ Documents With Embedded Keys
![Page 24: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/24.jpg)
High Availability Reads
![Page 25: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/25.jpg)
Millions of Reports per Day
AUTOMATE
![Page 26: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/26.jpg)
Read Scalability With Tagging
![Page 27: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/27.jpg)
Each Data Center is Tagged
Each Replica Set is Tagged
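The slides do not show the actual tags or read settings, so here is a small illustrative sketch of how tag-based read routing generally works in MongoDB: each replica set member carries a set of tags, a read preference supplies an ordered list of tag sets, and the first tag set that matches at least one member decides which members are eligible. The host names and the `dc` and `use` tag keys below are hypothetical.

```python
# Hypothetical members of one replica set, tagged by data center and purpose.
MEMBERS = [
    {"host": "east-1:27017", "tags": {"dc": "east", "use": "report"}},
    {"host": "east-2:27017", "tags": {"dc": "east", "use": "load"}},
    {"host": "west-1:27017", "tags": {"dc": "west", "use": "report"}},
]


def matches(member_tags, tag_set):
    """A member matches if it has every key/value pair in the tag set.

    An empty tag set therefore matches any member.
    """
    return all(member_tags.get(k) == v for k, v in tag_set.items())


def select_members(members, tag_sets):
    """Return the members eligible under the first tag set that matches any."""
    for tag_set in tag_sets:
        eligible = [m for m in members if matches(m["tags"], tag_set)]
        if eligible:
            return eligible
    return []


if __name__ == "__main__":
    # Prefer report nodes in the local data center, then any report node,
    # then fall back to any member at all.
    preference = [{"dc": "east", "use": "report"}, {"use": "report"}, {}]
    print([m["host"] for m in select_members(MEMBERS, preference)])
```

Ordering the tag sets from most to least specific keeps reads in the local data center when possible while still allowing a fallback if those members are unavailable.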
![Page 28: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/28.jpg)
5X More Reports per Second
![Page 29: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/29.jpg)
But we can do More!
![Page 30: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/30.jpg)
Let's Wrap It Up
• Don’t buy a used car without a CARFAX report
• Grok your data and working set
• Architect for your load volume
• Scale your reads to meet demand
![Page 31: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/31.jpg)
Keys To Success
• AUTOMATE EVERYTHING
• Test Many Configurations
• Grid Computing is Awesome
• Shard Early, Shard Often
![Page 32: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/32.jpg)
And Remember
![Page 33: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/33.jpg)
Friends Don’t Let Friends Use Default Ulimits!
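One way to act on this is to check the limits a process actually runs with before starting the database. The sketch below uses Python's standard `resource` module (Unix only) to read the soft and hard limits for open files and processes. The threshold of 64000 is an assumption based on commonly cited MongoDB guidance, not a number given in the talk, and `too_low` and `check_limits` are hypothetical helper names.

```python
import resource

RECOMMENDED = 64000  # assumed threshold; not stated in the talk


def too_low(soft, recommended=RECOMMENDED):
    """Return True if a soft limit is below the recommended value."""
    if soft == resource.RLIM_INFINITY:  # unlimited is never too low
        return False
    return soft < recommended


def check_limits():
    """Report soft/hard limits for open files and processes."""
    findings = {}
    for name in ("RLIMIT_NOFILE", "RLIMIT_NPROC"):
        soft, hard = resource.getrlimit(getattr(resource, name))
        findings[name] = {"soft": soft, "hard": hard, "too_low": too_low(soft)}
    return findings


if __name__ == "__main__":
    for name, info in check_limits().items():
        status = "TOO LOW" if info["too_low"] else "ok"
        print(f"{name}: soft={info['soft']} hard={info['hard']} [{status}]")
```

Run this as the same user, and through the same init mechanism, that starts `mongod`, since limits are inherited per process and can differ from those of an interactive shell.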
![Page 34: Building a Scalable and Modern Infrastructure at CARFAX](https://reader033.vdocuments.net/reader033/viewer/2022061206/5482a7afb4af9f54508b46c3/html5/thumbnails/34.jpg)
Thank You!
The migration was a success due to the incredible teams at CARFAX and MongoDB.
We are always looking for great people to join us.
www.carfax.com/careers