Scylla Summit 2017: Learn How to Build a Time Series Database
TRANSCRIPT
![Page 1: Scylla Summit 2017: Learn How to Build a Time Series Database](https://reader038.vdocuments.net/reader038/viewer/2022100803/5a6512817f8b9aa2548b6a4b/html5/thumbnails/1.jpg)
Learn How to Build a Time Series Database
Brian Hawkins
Staff Engineer, Proofpoint
![Page 2: Scylla Summit 2017: Learn How to Build a Time Series Database](https://reader038.vdocuments.net/reader038/viewer/2022100803/5a6512817f8b9aa2548b6a4b/html5/thumbnails/2.jpg)
Brian Hawkins
Brian is a Staff Engineer at Proofpoint. He created
KairosDB, a time series solution that runs on top of
Cassandra or ScyllaDB.
![Page 3: Scylla Summit 2017: Learn How to Build a Time Series Database](https://reader038.vdocuments.net/reader038/viewer/2022100803/5a6512817f8b9aa2548b6a4b/html5/thumbnails/3.jpg)
KairosDB
▪ Fast open source time series database
▪ Cassandra is the primary backend (GridDB and H2 are also supported)
▪ Extremely pluggable
▪ Scales really well (10,000,000,000/day)
▪ Saves you $$$$
![Page 4: Scylla Summit 2017: Learn How to Build a Time Series Database](https://reader038.vdocuments.net/reader038/viewer/2022100803/5a6512817f8b9aa2548b6a4b/html5/thumbnails/4.jpg)
Build Your Own TS Application
▪ Why Build Your Own?
o Specialized Query Patterns
o Indexing Requirements
▪ Use KairosDB
o Pluggable Storage Model for Custom Data Point Types
o Easy to Extend
o Easy to Embed
![Page 5: Scylla Summit 2017: Learn How to Build a Time Series Database](https://reader038.vdocuments.net/reader038/viewer/2022100803/5a6512817f8b9aa2548b6a4b/html5/thumbnails/5.jpg)
Storage
HashMap<partition_key, TreeMap<cluster_key, value>>
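The nested-map picture above can be made concrete with a small in-memory analogy. This is only a sketch of the mental model, not KairosDB or Cassandra code; the class name `WideRowStore` and its methods are hypothetical. It shows why a lookup needs the full partition key, while a range scan over the clustering key inside one partition is cheap and ordered.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory analogy for the storage layout on the slide.
class WideRowStore {
    // Partition key -> cells sorted by clustering key.
    private final Map<String, TreeMap<Integer, byte[]>> partitions = new HashMap<>();

    // Insert or overwrite one cell; the partition is created on first write.
    void put(String partitionKey, int clusterKey, byte[] value) {
        partitions.computeIfAbsent(partitionKey, k -> new TreeMap<>()).put(clusterKey, value);
    }

    // Range scan inside a single partition, inclusive on both ends,
    // returned in clustering-key order.
    NavigableMap<Integer, byte[]> slice(String partitionKey, int from, int to) {
        TreeMap<Integer, byte[]> rows = partitions.get(partitionKey);
        if (rows == null) {
            return Collections.emptyNavigableMap();
        }
        return rows.subMap(from, true, to, true);
    }

    public static void main(String[] args) {
        WideRowStore store = new WideRowStore();
        store.put("cpu.load", 30, new byte[] {3});
        store.put("cpu.load", 10, new byte[] {1});
        store.put("cpu.load", 20, new byte[] {2});
        // Keys come back sorted regardless of insertion order.
        System.out.println(store.slice("cpu.load", 10, 20).keySet());
    }
}
```

The outer `HashMap` has no ordering, so there is no efficient way to scan across partitions; the inner `TreeMap` is what makes time-range reads within one partition fast.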
![Page 8: Scylla Summit 2017: Learn How to Build a Time Series Database](https://reader038.vdocuments.net/reader038/viewer/2022100803/5a6512817f8b9aa2548b6a4b/html5/thumbnails/8.jpg)
Schema
public static final String DATA_POINTS = "" +
    "CREATE TABLE IF NOT EXISTS data_points (" +
    "  metric text, " +
    "  row_time timestamp, " +
    "  data_type text, " +
    "  tags frozen<map<text, text>>, " +
    "  offset int, " +
    "  value blob, " +
    "  PRIMARY KEY ((metric, row_time, data_type, tags), offset)" +
    ") WITH COMPACT STORAGE";
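In this schema `row_time` is part of the partition key and `offset` is the clustering key, so each data point's timestamp has to be split into a bucket start and a position inside that bucket. The sketch below shows one way that split could work. The three-week bucket width and the names `RowKeyMath`, `rowTime`, and `offset` are assumptions for illustration; the slide does not state the width used.

```java
// Hypothetical helper: splits a millisecond timestamp into (row_time, offset).
class RowKeyMath {
    // Assumed bucket width: three weeks in milliseconds (1,814,400,000).
    // It is below Integer.MAX_VALUE, so every offset fits the schema's int column.
    static final long ROW_WIDTH_MS = 3L * 7 * 24 * 60 * 60 * 1000;

    // Start of the bucket containing the timestamp; this is the partition's row_time.
    static long rowTime(long timestampMs) {
        return timestampMs - Math.floorMod(timestampMs, ROW_WIDTH_MS);
    }

    // Milliseconds since the bucket start; this is the clustering key.
    static int offset(long timestampMs) {
        return (int) Math.floorMod(timestampMs, ROW_WIDTH_MS);
    }

    public static void main(String[] args) {
        long ts = 1_500_000_000_000L;
        // The two parts always recombine to the original timestamp.
        System.out.println(rowTime(ts) + offset(ts) == ts);
    }
}
```

A query for a time range would then compute the `row_time` values that cover the range, read each of those partitions, and slice by `offset` within them.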
![Page 9: Scylla Summit 2017: Learn How to Build a Time Series Database](https://reader038.vdocuments.net/reader038/viewer/2022100803/5a6512817f8b9aa2548b6a4b/html5/thumbnails/9.jpg)
General Tips for Ingest
▪ Batch (10k is the sweet spot)
o Need to adjust batch_size_fail_threshold on the C* server
▪ Add LZ4 Compression
▪ Set native_transport_max_threads: 2000
▪ Don’t cannonball your batch
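The batching tip can be sketched as a small buffer that hands off a full batch once it reaches a size limit. This is a minimal illustration only: the `Batcher` class is hypothetical, and the `sink` callback stands in for whatever actually writes to the cluster, which is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical size-bounded batcher; not part of KairosDB or any driver.
class Batcher<T> {
    private final int maxSize;
    private final Consumer<List<T>> sink;
    private List<T> buffer = new ArrayList<>();

    Batcher(int maxSize, Consumer<List<T>> sink) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        this.maxSize = maxSize;
        this.sink = sink;
    }

    // Buffer one item and flush as soon as the batch is full.
    void add(T item) {
        buffer.add(item);
        if (buffer.size() >= maxSize) {
            flush();
        }
    }

    // Hand the current batch to the sink and start a fresh buffer.
    // Call once more at shutdown to send the final partial batch.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<T> full = buffer;
        buffer = new ArrayList<>();
        sink.accept(full);
    }

    public static void main(String[] args) {
        // 10,000 is the batch size suggested on the slide.
        Batcher<Integer> batcher = new Batcher<>(10_000, batch -> System.out.println(batch.size()));
        for (int i = 0; i < 25_000; i++) {
            batcher.add(i);
        }
        batcher.flush();
    }
}
```

This version is not thread-safe; an ingest path with multiple writer threads would need synchronization or one batcher per thread.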
![Page 10: Scylla Summit 2017: Learn How to Build a Time Series Database](https://reader038.vdocuments.net/reader038/viewer/2022100803/5a6512817f8b9aa2548b6a4b/html5/thumbnails/10.jpg)
Is Cassandra/Scylla the Right Platform for TS?
▪ The Good
o Incredibly fast ingest
o Excellent scaling
o Moderately compact storage
o Write-once data
▪ The Bad
o No multidimensional lookup
o No in-place processing of data
![Page 11: Scylla Summit 2017: Learn How to Build a Time Series Database](https://reader038.vdocuments.net/reader038/viewer/2022100803/5a6512817f8b9aa2548b6a4b/html5/thumbnails/11.jpg)
THANK YOU
Please stay in touch
Any questions?