anatomy of an action

Upload: gordon-chung

Post on 16-Aug-2015

Page 1: Anatomy of an action

Mining the event storm

Vladik Romanovsky, Engineer

The Anatomy of an Action

Gordon Chung, Engineer

Page 2: Anatomy of an action

OpenStack is a wonderful place

Page 3: Anatomy of an action

when you use OpenStack you might see this

Page 4: Anatomy of an action
Page 5: Anatomy of an action
Page 6: Anatomy of an action
Page 7: Anatomy of an action
Page 8: Anatomy of an action

WTF???

Page 9: Anatomy of an action

if you’re lucky, you might find the real error!

[instance: e7933ceb-d1e7-42fe-9f37-d275ebd375bd] Instance failed to spawn
Traceback (most recent call last):
......
ProcessExecutionError: Unexpected error while running command.
Command: qemu-img convert -O raw /opt/stack/data/nova/instances/_base/7434c85f2968d2cfb05b07d8c769d7d938cec5e8.part /opt/stack/data/nova/instances/_base/7434c85f2968d2cfb05b07d8c769d7d938cec5e8.converted
Exit code: 1
Stdout: u''
Stderr: u'qemu-img: error while reading sector 0: Input/output error\n'

Page 10: Anatomy of an action

Debugging be Hard

• actions consist of multiple steps
• asynchronous calls that can cause timing issues
• distributed nature of OpenStack can make it difficult to debug
• parsing log files is easy -- if you’re a robot

Page 11: Anatomy of an action

Use Case: Creating an Instance

Page 12: Anatomy of an action

Creating an Instance

[Diagram: api, conductor, scheduler, compute manager, build network, build storage, start guest]

Page 13: Anatomy of an action

Creating an Instance

[Diagram: api, conductor, scheduler, compute manager, build network, build storage, start guest; marker: FAIL HERE]

Page 14: Anatomy of an action

Creating an Instance

[Diagram: api, conductor, scheduler, compute manager, build network, build storage, start guest; marker: FAIL HERE]

Page 15: Anatomy of an action

Creating an Instance

[Diagram: api, conductor, scheduler, compute manager, build network, build storage, start guest; marker: FAIL HERE]

Page 16: Anatomy of an action

Creating an Instance

[Diagram: api, conductor, scheduler, compute manager, build network, build storage, start guest; notification bus]

Page 17: Anatomy of an action

Creating an Instance

[Diagram: api, conductor, scheduler, compute manager, build network, build storage, start guest; marker: FAIL HERE; notification bus]

Page 18: Anatomy of an action

OpenStack Events

• most services emit notifications for some discrete events
• the content of a notification represents the state of the environment, resource, etc. at that point in time
• notifications are defined by a type that describes their content
  • nova: compute.instance.create.*, scheduler.create_volume
  • neutron: port.create.*, network.create.*
  • cinder: volume.detach.*, volume.create.*
  • keystone: identity.user.*, identity.project.*
  • and a lot more...
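As a rough illustration of how a consumer might select notifications by type, the sketch below matches event types against wildcard patterns like the ones on this slide. The `matches` helper and the sample notification are hypothetical; a real notification carries many more fields.

```python
from fnmatch import fnmatchcase

# Wildcard patterns taken from the slide.
PATTERNS = ["compute.instance.create.*", "port.create.*", "volume.create.*"]


def matches(event_type, patterns=PATTERNS):
    """Return True if the event type matches any of the wildcard patterns."""
    return any(fnmatchcase(event_type, pattern) for pattern in patterns)


# Hypothetical notification, reduced to the fields used here.
notification = {
    "event_type": "compute.instance.create.end",
    "priority": "INFO",
    "payload": {"instance_id": "e7933ceb-d1e7-42fe-9f37-d275ebd375bd"},
}

print(matches(notification["event_type"]))  # True
print(matches("identity.user.created"))     # False
```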

Page 19: Anatomy of an action

Creating an Instance

[Diagram: api, conductor, scheduler, host manager, build network, build storage, start guest; notification bus; consumer?]

Page 20: Anatomy of an action

Ceilometer

• telemetry project in OpenStack
• notification agent which consumes messages
• listens to the queues of each OpenStack service
• picks specific measurement values from notifications and builds meters

Page 21: Anatomy of an action

but wait, there’s more!

Page 22: Anatomy of an action

every notification is also captured as an Event

Page 23: Anatomy of an action

Creating an Instance

[Diagram: api, conductor, scheduler, host manager, build network, build storage, start guest; notification bus; ceilometer notification agent; Meters; Events]

Page 24: Anatomy of an action

Ceilometer Events

• initially implemented in Icehouse (part of StackTach integration)

• an Event represents the state of an object in an OpenStack service at a point in time.

• built from INFO and ERROR level notifications emitted by all services

• ability to normalise messages by mapping key attributes from notification messages to a common name

Page 25: Anatomy of an action

Ceilometer Event Model

• message id
• event type
• timestamp
• traits
  • queryable, indexed attributes
  • i.e. payload.x.y.z => attr1
• raw
  • full notification
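The model above can be sketched as a small data class plus a trait mapping. This is a minimal sketch and not Ceilometer's implementation: the `Event` class, the `TRAIT_PATHS` mapping, the helper names, and the sample notification are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from functools import reduce

# Hypothetical mapping: trait name -> dotted path inside the notification.
TRAIT_PATHS = {
    "instance_id": "payload.instance_id",
    "state": "payload.state",
}


def lookup(message, dotted_path):
    """Follow a dotted path through nested dicts; return None if missing."""
    def step(node, key):
        return node.get(key) if isinstance(node, dict) else None
    return reduce(step, dotted_path.split("."), message)


@dataclass
class Event:
    message_id: str
    event_type: str
    timestamp: str
    traits: dict = field(default_factory=dict)  # queryable attributes
    raw: dict = field(default_factory=dict)     # full notification


def to_event(notification):
    """Build an Event, keeping only the traits present in the message."""
    found = {name: lookup(notification, path)
             for name, path in TRAIT_PATHS.items()}
    traits = {name: value for name, value in found.items()
              if value is not None}
    return Event(
        message_id=notification["message_id"],
        event_type=notification["event_type"],
        timestamp=notification["timestamp"],
        traits=traits,
        raw=notification,
    )


# Hypothetical notification used only to exercise the sketch.
sample = {
    "message_id": "msg-1",
    "event_type": "compute.instance.create.error",
    "timestamp": "2015-01-01T10:00:00",
    "payload": {"instance_id": "abc", "state": "error"},
}
event = to_event(sample)
print(event.traits)  # {'instance_id': 'abc', 'state': 'error'}
```

Keeping the full notification in `raw` next to the flattened traits mirrors the slide: traits are what gets queried, while the raw message stays available for drilling down.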

Page 26: Anatomy of an action

Ceilometer Event Processing

• all events are forced through pipelines

• events can be published to multiple targets
  • database
  • file
  • queue
  • http
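The fan-out to several targets can be pictured with a small sketch. The class names below are hypothetical and do not correspond to Ceilometer's pipeline or publisher classes; an in-memory stream and a list stand in for the file and database targets.

```python
import io
import json


class StreamPublisher:
    """Writes each event as one JSON line to a text stream (file-like target)."""

    def __init__(self, stream):
        self.stream = stream

    def publish(self, event):
        self.stream.write(json.dumps(event, sort_keys=True) + "\n")


class ListPublisher:
    """Stand-in for a database or queue target; keeps events in memory."""

    def __init__(self):
        self.events = []

    def publish(self, event):
        self.events.append(event)


class Pipeline:
    """Hands every event to each configured publisher, in order."""

    def __init__(self, publishers):
        self.publishers = list(publishers)

    def process(self, event):
        for publisher in self.publishers:
            publisher.publish(event)


out = io.StringIO()
store = ListPublisher()
pipeline = Pipeline([StreamPublisher(out), store])
pipeline.process({"event_type": "volume.create.end", "message_id": "msg-1"})

print(out.getvalue(), end="")
print(len(store.events))
```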

Page 27: Anatomy of an action

Benefits of Centralised Events

• potential loss of data if logging locally
• normalisation of data
• event flow across services gives context
  • individual events mean nothing
  • end-to-end flow means something

Page 28: Anatomy of an action

connecting the dots…

Page 29: Anatomy of an action

Debugging be Easier

• we wanted a view to show all the events of a given action by a resource

• be able to see any errors
• temporally aware -- order of events
• show the flow and context of events
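A minimal sketch of such a view, assuming each event has already been reduced to a dict with `request_id`, `timestamp` and `event_type` keys (hypothetical field names) and that timestamps share one ISO 8601 format so they sort correctly as strings. The sample events are invented for illustration.

```python
from collections import defaultdict


def timelines(events):
    """Group events by request id and order each group by timestamp."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["request_id"]].append(event)
    return {
        request_id: sorted(group, key=lambda e: e["timestamp"])
        for request_id, group in grouped.items()
    }


def first_error(timeline):
    """Return the first event whose type ends in '.error', or None."""
    return next(
        (e for e in timeline if e["event_type"].endswith(".error")), None
    )


# Invented events, deliberately out of order.
events = [
    {"request_id": "req-1", "timestamp": "2015-01-01T10:00:03",
     "event_type": "compute.instance.create.error"},
    {"request_id": "req-1", "timestamp": "2015-01-01T10:00:01",
     "event_type": "compute.instance.create.start"},
    {"request_id": "req-2", "timestamp": "2015-01-01T10:00:02",
     "event_type": "port.create.end"},
]

by_request = timelines(events)
for event in by_request["req-1"]:
    print(event["timestamp"], event["event_type"])
print(first_error(by_request["req-2"]))  # None
```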

Page 30: Anatomy of an action

postmortem analysis using Elasticsearch

Page 31: Anatomy of an action

Elasticsearch

• document-oriented, schema-free database
• built on top of Apache Lucene
  • focused on providing full-text search capabilities
• distributed, highly available, real-time db
• kibana - gui interface to database
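For a postmortem, the query of interest is "all events for one request, oldest first". The sketch below only builds a search body in the Elasticsearch query DSL; the field names `traits.request_id` and `timestamp` are assumptions, since they depend on how the events were indexed, and the term query expects the request id to be stored as an exact, unanalysed value.

```python
import json


def events_for_request(request_id):
    """Build a search body: events for one request, oldest first."""
    return {
        "query": {"term": {"traits.request_id": request_id}},
        "sort": [{"timestamp": {"order": "asc"}}],
    }


body = events_for_request("req-1")
print(json.dumps(body, indent=2))
```

The resulting body can be sent to an index's `_search` endpoint or pasted into a Kibana query console.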

Page 32: Anatomy of an action
Page 33: Anatomy of an action

KIBANA!!!

Page 34: Anatomy of an action

KIBANA!!!

Page 35: Anatomy of an action

HORIZON!!!

Page 36: Anatomy of an action

HORIZON!!!

Page 37: Anatomy of an action

Extending Events

• there is a lot of data that isn’t published
• the data that is published is disorganised
• extending support in horizon
  • drilling down into event to view full raw data
  • filter options - time range, events for a specific request
• ceilometer
  • alarm on events
  • build metrics from events

Page 38: Anatomy of an action

thank you

Page 39: Anatomy of an action

BACKUP

Page 40: Anatomy of an action

Horizon Events Prototype, by George Peristerakis

https://github.com/enovance/horizon/tree/event-prototype