Post on 11-Jan-2017
Prometheus Overview
The Promethean Ideal of Monitoring
Know when things go wrong: to call in a human to prevent a business-level issue, or to prevent an issue in advance
Be able to debug and gain insight
Trending to see changes over time, and drive technical/business decisions
To feed into other systems/processes (e.g. QA, security, automation)
Common Monitoring Challenges
Themes common among companies I've talked to:
Monitoring tools are limited, both technically and conceptually
Tools don't scale well and are unwieldy to manage
Operational practices don't align with the business
Your customers care about increased latency and it's in your SLAs. You can only alert on individual machine CPU usage.
Result: Engineers are continuously woken up for non-issues and get fatigued
Fundamental Challenge is Limited Visibility
Prometheus
Inspired by Google's Borgmon monitoring system.
Started in 2012 by ex-Googlers working at SoundCloud as an open source project, mainly written in Go. Publicly launched in early 2015, and continues to be independent of any one company.
Over 100 companies have started relying on it since then.
What does Prometheus offer?
Inclusive Monitoring
Powerful data model
Powerful query language
Manageable and Reliable
Efficient
Scalable
Easy to integrate with
Dashboards
Services have Internals
Monitor the Internals
Monitor as a Service, not as Machines
Inclusive Monitoring
Don't monitor just at the edges:
Instrument client libraries
Instrument server libraries (e.g. HTTP/RPC)
Instrument business logic
Library authors get information about usage.
Application developers get monitoring of common components for free.
Dashboards and alerting can be provided out of the box, customised for your organisation!
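A minimal sketch of instrumenting business logic, in plain Python rather than any real Prometheus client library. The `METRICS` registry, the `instrumented` decorator and the `checkout` function are all hypothetical names chosen for illustration.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-process registry: metric name -> running total.
METRICS = defaultdict(float)


def instrumented(name):
    """Count calls, failures and total seconds spent in the wrapped function."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            METRICS[name + "_calls_total"] += 1
            try:
                return func(*args, **kwargs)
            except Exception:
                METRICS[name + "_failures_total"] += 1
                raise
            finally:
                # Runs on both success and failure.
                METRICS[name + "_seconds_total"] += time.perf_counter() - start
        return wrapper
    return decorator


@instrumented("checkout")  # hypothetical piece of business logic
def checkout(cart):
    if not cart:
        raise ValueError("empty cart")
    return sum(cart)
```

A library author could ship a decorator like this once, and every application calling the library would get call, failure and latency totals without extra work.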
Powerful Data Model
All metrics have arbitrary multi-dimensional labels.
No need to force your model into dotted strings.
Can aggregate, cut, and slice along them.
Values can be any double; labels support full Unicode.
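To make the label idea concrete, here is a toy sketch in plain Python. It is not how Prometheus stores or queries data; the metric name, label names and values are made up, and `select` and `sum_by` are hypothetical helpers.

```python
from collections import defaultdict

# Each sample is (metric name, labels, value). All of it is illustrative.
SAMPLES = [
    ("http_requests_total", {"job": "api", "datacenter": "eu", "code": "200"}, 1200.0),
    ("http_requests_total", {"job": "api", "datacenter": "eu", "code": "500"}, 30.0),
    ("http_requests_total", {"job": "api", "datacenter": "us", "code": "200"}, 900.0),
    ("http_requests_total", {"job": "web", "datacenter": "us", "code": "200"}, 400.0),
]


def select(samples, name, **matchers):
    """Keep samples of one metric whose labels equal every given matcher."""
    return [
        s for s in samples
        if s[0] == name and all(s[1].get(k) == v for k, v in matchers.items())
    ]


def sum_by(samples, *label_names):
    """Sum sample values, grouped by the chosen label names."""
    totals = defaultdict(float)
    for _, labels, value in samples:
        key = tuple(labels.get(n, "") for n in label_names)
        totals[key] += value
    return dict(totals)


per_datacenter = sum_by(SAMPLES, "datacenter")
api_by_code = sum_by(select(SAMPLES, "http_requests_total", job="api"), "code")
```

With a dotted string such as `api.eu.200.requests`, each of these cuts would need its own naming scheme or pattern matching; with labels, the same data can be grouped along any dimension.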
Powerful Query Language
Can multiply, add, aggregate, join, predict, take quantiles across many metrics in the same query. Can evaluate right now, and graph back in time.
Answer questions like:
What's the 95th percentile latency in the European datacenter?
How full will the disks be in 4 hours?
Which services are the top 5 users of CPU?
Can alert based on any query.
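The "how full will the disks be in 4 hours" question comes down to fitting a trend and extrapolating it. The sketch below shows that idea with a least-squares line in plain Python. It is an illustration only, not the query language's implementation; the `extrapolate` helper and the disk usage numbers are made up.

```python
def extrapolate(points, seconds_ahead):
    """Fit a least-squares line through (timestamp, value) points and
    evaluate it seconds_ahead past the last timestamp."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points)
    if var_t == 0:
        raise ValueError("timestamps must differ")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in points)
    slope = cov / var_t
    intercept = mean_v - slope * mean_t
    return intercept + slope * (points[-1][0] + seconds_ahead)


# Made-up disk usage in bytes: one sample a minute for an hour,
# starting at 50 GB and growing by 1 MB per minute.
usage = [(60 * i, 50e9 + 1e6 * i) for i in range(60)]
in_four_hours = extrapolate(usage, 4 * 3600)
```

An alert can then compare the predicted value with the disk's capacity, so a human is called in before the disk fills rather than after.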
Manageable and Reliable
Core Prometheus server is a single binary.
Doesn't depend on Zookeeper, Consul, Cassandra, Hadoop or the Internet.
Only requires local disk (SSD recommended). No potential for cascading failure.
Pull based, so it is easy to run on a workstation for testing, and rogue servers can't push bad metrics.
Advanced service discovery finds what to monitor.
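Pull based means each monitored process exposes its current metrics over HTTP and the server fetches them. A minimal sketch of the exposing side, using only the Python standard library rather than a real client library: the metric values, the port and the `render`, `MetricsHandler` and `serve` names are assumptions for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Made-up metric values; a real application would update these as it runs.
METRICS = {
    ("http_requests_total", (("code", "200"),)): 1027.0,
    ("http_requests_total", (("code", "500"),)): 3.0,
}


def render(metrics):
    """Render metrics as `name{label="value"} number` lines.

    Label values are not escaped here, which a real exporter must do.
    """
    lines = []
    for (name, labels), value in sorted(metrics.items()):
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        if label_str:
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrapes on an arbitrarily chosen port."""
    HTTPServer(("127.0.0.1", port), MetricsHandler).serve_forever()
```

Calling `serve()` on a workstation and opening `/metrics` in a browser is enough to see what a scrape would collect, which is what makes local testing easy.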
Efficient
Instrumenting everything means a lot of data.
Prometheus is best in class for lossless storage efficiency, 3.5 bytes per datapoint.
A single server can handle:
millions of metrics
hundreds of thousands of datapoints per second
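A back-of-the-envelope disk estimate from the 3.5 bytes per datapoint figure. The ingestion rate and retention period below are assumptions picked for the example, not recommendations.

```python
BYTES_PER_SAMPLE = 3.5        # figure quoted above
SAMPLES_PER_SECOND = 100_000  # assumed ingestion rate
RETENTION_DAYS = 15           # assumed retention period
SECONDS_PER_DAY = 86_400


def storage_bytes(samples_per_second, retention_days,
                  bytes_per_sample=BYTES_PER_SAMPLE):
    """Bytes of sample data held on disk for the given rate and retention."""
    return samples_per_second * bytes_per_sample * SECONDS_PER_DAY * retention_days


per_day_gb = storage_bytes(SAMPLES_PER_SECOND, 1) / 1e9
total_gb = storage_bytes(SAMPLES_PER_SECOND, RETENTION_DAYS) / 1e9
print(f"{per_day_gb:.2f} GB per day, {total_gb:.1f} GB for {RETENTION_DAYS} days")
```

Under these assumptions that is roughly 30 GB of samples a day, which fits comfortably on a single local SSD.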
Scalable
Prometheus is easy to run, so you can give one to each team in each datacenter.
Federation allows pulling key metrics from other Prometheus servers.
When one job is too big for a single Prometheus server, you can use sharding plus federation to scale out. This is needed with thousands of machines.
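Sharding a large job can be pictured as hashing each target to one of N servers, so that every target is scraped by exactly one of them. A minimal sketch in plain Python, assuming a SHA-256 based hash and made-up target addresses; it does not reproduce how Prometheus itself assigns targets.

```python
import hashlib


def shard_for(target, num_shards):
    """Map a target address to a shard index with a stable hash."""
    digest = hashlib.sha256(target.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def assign(targets, num_shards):
    """Split targets into num_shards lists, one per server."""
    shards = [[] for _ in range(num_shards)]
    for target in targets:
        shards[shard_for(target, num_shards)].append(target)
    return shards


# 3000 hypothetical machines split across 4 servers.
targets = [f"10.0.{i // 256}.{i % 256}:9100" for i in range(3000)]
shards = assign(targets, 4)
```

Each shard server scrapes only its own list, and a further server federates the aggregated key metrics from all of them to give a job-wide view.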
Easy to integrate with
Many existing integrations: Java, JMX, Python, Go, Ruby, .Net, Machine, Cloudwatch, EC2, MySQL, PostgreSQL, Haskell, Bash, Node.js, SNMP, Consul, HAProxy, Mesos, Bind, CouchDB, Django, Mtail, Heka, Memcached, RabbitMQ, Redis, RethinkDB, Rsyslog, Meteor.js, Minecraft...
Graphite, Statsd, Collectd, Scollector, Munin, Nagios integrations aid transition.
It's so easy, most of the above were written without the core team even knowing about them!
What do we do?
Robust Perception is an independent provider of Prometheus-related services.
We can help you:
Decide if Prometheus is for you
Manage your transition to Prometheus
Resolve issues that arise
Use Prometheus to run and scale your production systems efficiently
We are proud to be among the core contributors to the Prometheus project.
Resources
Official Project Website: prometheus.io
Official Mailing List: email@example.com
Robust Perception Website: www.robustperception.io