Scylla Summit 2017: Learn How to Build a Time Series Database


Upload: scylladb

Post on 22-Jan-2018


TRANSCRIPT

Page 1: Scylla Summit 2017: Learn How to Build a Time Series Database


Learn How to Build a Time Series Database

Brian Hawkins

Staff Engineer, Proofpoint

Page 2: Scylla Summit 2017: Learn How to Build a Time Series Database


Brian Hawkins


Brian is a Staff Engineer at Proofpoint. He created KairosDB, a time series solution that runs on top of Cassandra or ScyllaDB.

Page 3: Scylla Summit 2017: Learn How to Build a Time Series Database


KairosDB

▪ Fast open source time series database

▪ Cassandra is the primary backend (GridDB and H2 are also supported)

▪ Extremely pluggable

▪ Scales really well (10,000,000,000/day)

▪ Saves you $$$$


Page 4: Scylla Summit 2017: Learn How to Build a Time Series Database


Build Your Own TS Application

▪ Why Build Your Own?

o Specialized Query Patterns

o Indexing Requirements

▪ Use KairosDB

o Pluggable Storage Model for Custom Data Point Types

o Easy to Extend

o Easy to Embed


Page 5: Scylla Summit 2017: Learn How to Build a Time Series Database


Storage

HashMap<partition_key, TreeMap<cluster_key, value>>
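The nested-map notation above can be read literally: a hash lookup finds the partition, and a sorted map keeps the rows inside it ordered by clustering key. A minimal in-memory sketch of that model follows. It is a toy illustration, not KairosDB or Cassandra code, and the class and method names (PartitionedStore, put, range) are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Toy in-memory model of the storage layout on the slide; not KairosDB code.
class PartitionedStore {
    // Partition key -> rows kept sorted by clustering key.
    private final Map<String, TreeMap<Long, String>> partitions = new HashMap<>();

    // A write is a hash lookup for the partition followed by a sorted insert.
    void put(String partitionKey, long clusterKey, String value) {
        partitions.computeIfAbsent(partitionKey, k -> new TreeMap<>())
                  .put(clusterKey, value);
    }

    // Range scan inside one partition: from is inclusive, to is exclusive.
    SortedMap<Long, String> range(String partitionKey, long from, long to) {
        TreeMap<Long, String> rows = partitions.get(partitionKey);
        if (rows == null) {
            return Collections.emptySortedMap();
        }
        return rows.subMap(from, to);
    }

    public static void main(String[] args) {
        PartitionedStore store = new PartitionedStore();
        store.put("cpu.load", 30L, "c");
        store.put("cpu.load", 10L, "a");
        store.put("cpu.load", 20L, "b");
        // Rows come back in clustering-key order regardless of insert order.
        System.out.println(store.range("cpu.load", 10L, 30L));
    }
}
```

In this model, a range scan within one partition is cheap because the rows are already sorted, while anything that spans partitions needs one lookup per partition key.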


Page 8: Scylla Summit 2017: Learn How to Build a Time Series Database


Schema

public static final String DATA_POINTS = "" +
        "CREATE TABLE IF NOT EXISTS data_points (" +
        "  metric text, " +
        "  row_time timestamp, " +
        "  data_type text, " +
        "  tags frozen<map<text, text>>, " +
        "  offset int, " +
        "  value blob, " +
        "  PRIMARY KEY ((metric, row_time, data_type, tags), offset)" +
        ") WITH COMPACT STORAGE";
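The row_time and offset columns suggest that each data point's timestamp is split in two: the start of the row (part of the partition key) and the position within that row (the clustering key). A minimal sketch of that split follows. The three-week row width is an assumption, since the slide does not give a value, and the names RowKeyMath and ROW_WIDTH_MS are hypothetical.

```java
// Sketch: split an absolute timestamp into the schema's (row_time, offset) pair.
class RowKeyMath {
    // ASSUMPTION: three-week row width in milliseconds; the slide does not state it.
    static final long ROW_WIDTH_MS = 3L * 7 * 24 * 60 * 60 * 1000;

    // Start of the row (partition) that contains the timestamp.
    static long rowTime(long timestampMs) {
        return timestampMs - Math.floorMod(timestampMs, ROW_WIDTH_MS);
    }

    // Position inside the row; fits in an int because ROW_WIDTH_MS < Integer.MAX_VALUE.
    static int offset(long timestampMs) {
        return (int) Math.floorMod(timestampMs, ROW_WIDTH_MS);
    }

    // Inverse mapping, used when rows are read back.
    static long timestamp(long rowTimeMs, int offset) {
        return rowTimeMs + offset;
    }
}
```

Under this assumption, all points for one metric, data type, and tag set that fall in the same three-week window share a partition and are stored in time order by offset.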


Page 9: Scylla Summit 2017: Learn How to Build a Time Series Database


General Tips for Ingest

▪ Batch (10k is sweet spot)

o Need to adjust batch_size_fail_threshold on C* server

▪ Add LZ4 Compression

▪ Set native_transport_max_threads: 2000

▪ Don’t cannonball your batch
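The batching tip above can be sketched as a helper that splits a stream of pending writes into fixed-size batches before they are sent. This is a rough illustration only. The slide does not say what "10k" measures, so the sketch counts items, and the names Batcher, chunk, and DEFAULT_BATCH_SIZE are hypothetical. Sending each batch to the cluster is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: split pending writes into bounded batches; sending is out of scope here.
class Batcher {
    // 10k comes from the slide; treat it as a starting point to tune, not a rule.
    static final int DEFAULT_BATCH_SIZE = 10_000;

    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> chunk(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> pending = new ArrayList<>();
        for (int i = 0; i < 25_000; i++) {
            pending.add(i);
        }
        for (List<Integer> batch : chunk(pending, DEFAULT_BATCH_SIZE)) {
            System.out.println("batch of " + batch.size());
        }
    }
}
```

Larger batches may be rejected by the server unless the batch size failure threshold mentioned above is raised to match.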


Page 10: Scylla Summit 2017: Learn How to Build a Time Series Database


Is Cassandra/Scylla the Right Platform for TS?

▪ The Good

o Incredibly fast ingest

o Excellent scaling

o Moderately compact storage

o Write-once data

▪ The Bad

o No multidimensional lookup

o No in-place processing of data


Page 11: Scylla Summit 2017: Learn How to Build a Time Series Database


THANK YOU

[email protected]

Please stay in touch

Any questions?