prometheus - usenix · pdf file$ ./prometheus (not like opentsdb) soundcloud scalable data...

25
Prometheus: A Next-Generation Monitoring System Björn Rabenstein, Julius Volz SoundCloud SREcon Dublin May 14, 2015

Upload: phunghanh

Post on 07-Mar-2018

248 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

Prometheus:A Next-Generation Monitoring System

Björn Rabenstein, Julius VolzSoundCloud

SREcon DublinMay 14, 2015

Page 2: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

History

Page 3: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

From a monolith...

Page 4: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

...to microservices

Page 5: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

Existing monitoring setup

in 2012...

Page 6: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

Service monitoring

Page 7: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

Service monitoring

Page 8: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

Host monitoring

Page 9: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

Alerting

Page 10: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

Not fun

Page 11: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

Prometheus

Page 12: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

SOUNDCLOUD

Multi-dimensional data model

api_http_requests_total{method="GET", endpoint="/api/tracks", status="200"} 2034834

(like OpenTSDB)

Page 13: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

SOUNDCLOUD

Operational simplicity

$ go build$ ./prometheus

(not like OpenTSDB)

Page 14: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

SOUNDCLOUD

Scalable data collection

Thousands of targets. Hundreds of thousands of samples per second.

Millions of time series.On a single monitoring server.

Running many servers is easy, too…Pull, not push.

Page 15: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

SOUNDCLOUD

Powerful query language

topk(3, sum(rate(bazooka_instance_cpu_time_ns[5m])) by (app, proc))

sort_desc(sum(bazooka_instance_memory_limit_bytes - bazooka_instance_memory_usage_bytes) by (app, proc))

Page 16: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

SOUNDCLOUD

Architecture

Page 17: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

SOUNDCLOUD

Expression browser

Page 18: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

SOUNDCLOUD

Built-in graphing

Page 19: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

SOUNDCLOUD

PromDash

Page 20: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

Alertmanager

Page 21: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

Exporters

vs.

direct instrumentation

Page 22: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

Client libraries

Official

GoJava (JVM)RubyPython

Unofficial

.NET / C#Node.jsBash

Page 23: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

Integrations

Official exporters

Node/system metrics exporterGraphite exporterCollectd exporterJMX exporterHAProxy exporterStatsD bridgeAWS CloudWatch exporterHystrix metrics publisherMesos task exporterConsul exporter

Unofficial exporters

RethinkDB exporterRedis exporterscollector exporterMongoDB exporterDjango exporterGoogle's mtail log data extractorMinecraft exporter moduleMeteor JS web framework exporter

Direct instrumentation

cAdvisorKubernetesKubernetes-MesosEtcdgokitgo-metrics instrumentation library...

Page 24: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

SOUNDCLOUD

Further reading

prometheus.io

developers.soundcloud.com/blog

boxever.com/tech-blog

Page 25: Prometheus - USENIX · PDF file$ ./prometheus (not like OpenTSDB) SOUNDCLOUD Scalable data collection Thousands of targets. Hundreds of thousands of samples per

SOUNDCLOUD

AcknowledgementsMany contributors at SoundCloud…

...and by now so many more from everywhere.

Special thanks to:

Matt T. Proud (founding father)Brian Brazil (Boxever)

Johannes “fish” Ziemke (Docker)