Page 1: DC/OS Metrics - Schedschd.ws/hosted_files/mesosconeu2016/e7/Metrics on... · DC/OS Metrics (formerly known as Project Ambrose) Application and Resource Metrics in DC/OS Enterprise

© 2016 Mesosphere, Inc. All Rights Reserved. 1

DC/OS Metrics (formerly known as Project Ambrose)

Application and Resource Metrics in DC/OS Enterprise

Nick Parker at..

Page 2:

Introduction

Nick Parker
● DC/OS Slack: chat.dcos.io
● DC/OS Mailing List: [email protected]
● GitHub: nickbp@

Data Agility Team
● Frameworks for Cassandra/DSE, HDFS, Kafka/Confluent, Spark, ...
● Service SDK (in progress...)

Page 3:

The Importance of Metrics

How do you know if...
● Things are running fine, or falling over
● Containers have plenty of quota, or are on the edge of OOM
● You’re optimizing for what people use, or what nobody sees
● The new release is good, or should be rolled back

Page 4:

Sources of Metrics in DC/OS

Container Metrics
Measure things like:
● RAM, Disk, IOPS, CPU, Network, …
To determine:
● Resource utilization/basic health

Application Metrics
Measure things like:
● QPS, query latency, number/types of exceptions hit, number of active users, …
To determine:
● Changes in performance/behavior across rollouts
● Debugging active issues (e.g. on-call pages)
● Tracing historical behavior
● ...

Page 5:

Solving Metrics on DC/OS

Easy integration by applications
● Little effort/thought to emit metrics from any application
● Support custom metric metadata

Inject container metadata
● Container, Framework, Agent, ...

Flexible, configurable output
● Widely accessible format/schema
● Send metrics to any storage
● Easy filtering and routing

Installed as a containerized application
● Easy reconfiguration/upgrades/fixes

Page 6:

What DC/OS Metrics Provides

Easy input
● Container resource metrics: retrieved automatically
● Custom application metrics: StatsD endpoint, advertised with env vars

Automatic source tagging
● Application, Framework, Host Agent, Container, ...

Flexible outputs
● Kafka cluster: scale as needed, attach arbitrary consumers
● Others?

Page 7:

Application Input: StatsD (with tag support)

StatsD Format
● Text records: either one-per-packet or newline separated.
● Optional tagging (Datadog extension) - Consumed by DC/OS Metrics!

memory.usage_mb:5|g
frontend.query.latency_ms:46|g|#shard_id:6,section:frontpage

Pseudocode
if (env["STATSD_UDP_HOST"] and env["STATSD_UDP_PORT"]) {
    // 1. Open UDP socket to the endpoint
    // 2. Send StatsD-formatted metrics
}
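
The pseudocode above could look roughly like this in Python. This is a minimal sketch, not code from the talk: the helper names format_metric and send_metrics are made up for illustration, and only the two environment variables and the record format come from the slide.

```python
import os
import socket


def format_metric(name, value, metric_type="g", tags=None):
    """Render one StatsD record, with optional Datadog-style tags."""
    record = f"{name}:{value}|{metric_type}"
    if tags:
        # Tags are appended as |#key:value,key:value
        tag_text = ",".join(f"{key}:{val}" for key, val in tags.items())
        record += f"|#{tag_text}"
    return record


def send_metrics(records, env=os.environ):
    """Send newline-separated records in one UDP packet.

    Returns True if a packet was sent, False if no endpoint is advertised.
    """
    host = env.get("STATSD_UDP_HOST")
    port = env.get("STATSD_UDP_PORT")
    if not (host and port):
        return False  # no endpoint advertised, so emit nothing
    payload = "\n".join(records).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, int(port)))
    return True


if __name__ == "__main__":
    send_metrics([
        format_metric("memory.usage_mb", 5),
        format_metric("frontend.query.latency_ms", 46,
                      tags={"shard_id": 6, "section": "frontpage"}),
    ])
```

Checking the environment variables first means the same application can run unchanged where no StatsD endpoint is advertised.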

Page 8:

Output Format: Apache Avro

repeated MetricList {
    repeated Tag {
        string key,
        string value,
    }
    repeated Datapoint {
        string name,
        double value,
        int64 epoch_time_ms,
    }
}
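
To make the record shape concrete, here is a sketch that mirrors it with Python dataclasses and fills one MetricList from a tagged StatsD line. The type names and the Tag and Datapoint fields follow the slide; the container field names tags and datapoints, the from_statsd helper, and the container_id tag are assumptions for illustration, and no actual Avro encoding is performed.

```python
import time
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Tag:
    key: str
    value: str


@dataclass
class Datapoint:
    name: str
    value: float
    epoch_time_ms: int


@dataclass
class MetricList:
    # Field names "tags" and "datapoints" are assumed; the slide shows only types.
    tags: List[Tag] = field(default_factory=list)
    datapoints: List[Datapoint] = field(default_factory=list)


def from_statsd(line, source_tags, now_ms=None):
    """Convert one StatsD record into a MetricList.

    source_tags stands in for container metadata attached upstream.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    head, *sections = line.split("|")
    name, _, raw_value = head.partition(":")
    tags = [Tag(key, value) for key, value in source_tags.items()]
    for section in sections:
        if section.startswith("#"):  # Datadog-style tag section
            for pair in section[1:].split(","):
                key, _, value = pair.partition(":")
                tags.append(Tag(key, value))
    return MetricList(tags, [Datapoint(name, float(raw_value), now_ms)])


if __name__ == "__main__":
    line = "frontend.query.latency_ms:46|g|#shard_id:6,section:frontpage"
    print(asdict(from_statsd(line, {"container_id": "example"})))
```

Keeping tags as a flat list of key/value pairs lets application-supplied tags and injected source tags travel in the same field.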

Page 9:

Architecture: Per-Node

Per-host components:
1. Mesos Metrics Module
2. Metrics Collector

Page 10:

Architecture: Per-Cluster

Per-cluster components:
1. Kafka
2. Consumer(s)

Page 11:

Architecture: Overall

Per-host components:
1. Mesos Metrics Module
2. Metrics Collector

Per-cluster components:
3. Kafka
4. Consumer(s)
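
The flow from the per-host side through Kafka to independent consumers can be illustrated with a toy, in-memory model. Everything here is hypothetical: Topic is a plain list standing in for a Kafka topic, and Consumer, collect, and the framework tag are invented names. It is meant only to show why consumers can be attached and filtered independently, not how the real components are built.

```python
class Topic:
    """In-memory stand-in for a Kafka topic (an assumption for illustration)."""

    def __init__(self):
        self.records = []

    def publish(self, record):
        self.records.append(record)


class Consumer:
    """Tracks its own read offset, so consumers never affect each other."""

    def __init__(self, topic, handler, wanted=lambda record: True):
        self.topic = topic
        self.handler = handler
        self.wanted = wanted
        self.offset = 0

    def poll(self):
        # Read everything published since this consumer's last poll.
        batch = self.topic.records[self.offset:]
        self.offset += len(batch)
        for record in batch:
            if self.wanted(record):
                self.handler(record)


def collect(topic, source_tags, name, value):
    """Per-host side: attach source tags, then publish to the cluster topic."""
    topic.publish({"tags": dict(source_tags), "name": name, "value": value})


if __name__ == "__main__":
    topic = Topic()
    printer = Consumer(topic, print)  # receives every record
    kafka_only = []
    filtered = Consumer(topic, kafka_only.append,
                        wanted=lambda r: r["tags"].get("framework") == "kafka")
    collect(topic, {"framework": "kafka"}, "memory.usage_mb", 5)
    collect(topic, {"framework": "cassandra"}, "memory.usage_mb", 7)
    printer.poll()
    filtered.poll()
```

Because each consumer keeps its own offset, adding a filtered consumer does not change what any other consumer receives.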

Page 12:

Demo!!!

● Service config examples
● Consumer examples
● Show and tell

Page 14:

Executor logs from...
● Kafka
● Cassandra

Page 15:

Consumers for...
● InfluxDB
● KairosDB
● (Dog)StatsD

Page 17:

Contact/Q&A

Nick Parker
● DC/OS Slack: chat.dcos.io
● DC/OS Mailing List: [email protected]
● GitHub: nickbp@

Any Questions?

Page 18:

Appendix: Metrics on DC/OS Enterprise

DC/OS Enterprise 1.7
● Application metrics only
● Tagged with some container IDs
● Sent to metrics.marathon.mesos:8125
● Tied to DC/OS release cycle

DC/OS Enterprise 1.8
● Adds resource usage metrics
● Adds more tags
● Sent to local Collector process
● Collector is detached from DC/OS release cycle

Page 19:

Appendix: Mesos Agent

Page 20:

Appendix: Print Consumer
