Monitoring Your AWS Cloud Infrastructure

AWS Usergroup Greece: Monitoring your cloud-hosted app, 18/07/2012, Andreas Chatzakis (@achatzakis on Twitter)

Upload: newvewm

Post on 30-Oct-2014


Category:

Technology



DESCRIPTION

The presentation includes a great overview of why and how to track and monitor your cloud infrastructure. It lists the different types of cloud monitoring, from the underlying infrastructure all the way up the application stack. Here you can find the names of relevant tools that can support monitoring of online cloud applications.

TRANSCRIPT

Page 1: Monitoring Your AWS Cloud Infrastructure

AWS Usergroup Greece

Monitoring your cloud-hosted app

18/07/2012

Andreas Chatzakis
@achatzakis on twitter

Page 2: Monitoring Your AWS Cloud Infrastructure

whoami

Andreas Chatzakis CTO & co-founder /

High traffic Greek Real Estate portal
Software delivery team management
IT Operations

co-founder of AWS Usergroup Greece

@achatzakis

Page 3: Monitoring Your AWS Cloud Infrastructure

Why monitoring

You need monitoring to be proactive about, or react to, availability & performance risks and issues:

Detect problems before (many) users are aware
Alerts and notifications at 3 AM
Be informed of issues you wouldn't be able to recreate
Collect data to discover the root cause of an incident
...and automate the response for next time
Statistics and KPIs to track service quality trends
Visibility to prioritize optimization efforts
Make sense out of a large quantity of logs and data

Page 4: Monitoring Your AWS Cloud Infrastructure

Monitoring in the cloud

Principles are not that different from traditional infrastructure but...

Cloud allows us to build highly dynamic setups
More data
Our tools need to adapt
Ephemeral resources require a centralized approach
Need aggregation based on server role
Cloud promises agility
Only possible when the cost of failure is low
Being able to spot issues in a more automated manner is key
The rise of DevOps
Developers need visibility to understand how their code affects costs and impacts availability
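The "aggregation based on server role" point can be illustrated with a short sketch. It assumes each sample is a dict with hypothetical `role` and `cpu_pct` keys; the slides do not prescribe a data format, so these names are illustrative only. Individual instances come and go, so the summary is keyed by role.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_role(samples):
    """Summarise a metric per server role instead of per (short-lived) instance."""
    by_role = defaultdict(list)
    for sample in samples:
        # "role" and "cpu_pct" are assumed field names, not a standard schema.
        by_role[sample["role"]].append(sample["cpu_pct"])
    return {
        role: {
            "instances": len(values),
            "avg_cpu_pct": mean(values),
            "max_cpu_pct": max(values),
        }
        for role, values in by_role.items()
    }


# Hypothetical samples reported by three instances.
samples = [
    {"instance": "i-aaa", "role": "web", "cpu_pct": 40.0},
    {"instance": "i-bbb", "role": "web", "cpu_pct": 60.0},
    {"instance": "i-ccc", "role": "db", "cpu_pct": 90.0},
]
summary = aggregate_by_role(samples)
```

A dashboard or alert built on `summary["web"]` keeps working when an instance is replaced, which is the point of aggregating by role.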

Page 5: Monitoring Your AWS Cloud Infrastructure

Types of monitoring

There is a variety of monitoring tools that complement each other

External checks (is my app still up?)
Server monitoring (CPU, RAM, IO...)
Systems monitoring (MySQL, Apache etc. metrics)
Process monitoring (restart crashed services)
Application monitoring (bottlenecks in the code)
End user monitoring (client side performance)
Log aggregation & analysis (centralized storage)
Cloud Analytics (do I make the most out of AWS?)

Page 6: Monitoring Your AWS Cloud Infrastructure

Deployment models

Consider the deployment model of each monitoring solution

Agent vs Agent-less
SaaS vs DIY on own computing instances
Consider a different AZ or provider
Least privilege principle (e.g. read-only access for the agent)
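As an illustration of the least privilege principle for a monitoring agent, here is a sketch of a read-only IAM policy expressed as a Python dict, with a small check that every allowed action is a read-style call. The three actions shown are one possible selection rather than a recommendation from the slides, and `is_read_only` is a hypothetical helper that assumes `Action` is always a list.

```python
import json

# Illustrative read-only policy for a monitoring agent (the action list is an assumption).
READ_ONLY_MONITORING_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "cloudwatch:GetMetricStatistics",
                "cloudwatch:ListMetrics",
                "ec2:DescribeInstances",
            ],
            "Resource": "*",
        }
    ],
}


def is_read_only(policy, read_prefixes=("Get", "List", "Describe")):
    """Return True if every allowed action name starts with a read-style verb."""
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        for action in statement["Action"]:
            verb = action.split(":", 1)[1]
            if not verb.startswith(read_prefixes):
                return False
    return True


policy_json = json.dumps(READ_ONLY_MONITORING_POLICY, indent=2)
```

A check like this could run before credentials are handed to a third-party monitoring service, so that a write or terminate permission is caught early.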

Page 7: Monitoring Your AWS Cloud Infrastructure

Pricing models

Different pricing models offered by the various solutions

Freeware
Per host
Per host-hour
Per user
Per alert
Per stored Gbyte
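The difference between per-host and per-host-hour pricing matters most for setups that scale up and down. The sketch below works through one example with entirely hypothetical prices and fleet sizes, and it assumes the per-host plan bills on the peak number of hosts seen in the month; real vendors may count hosts differently.

```python
def monthly_cost_per_host(peak_hosts, price_per_host):
    """Per-host plan, assumed here to bill on the peak host count of the month."""
    return peak_hosts * price_per_host


def monthly_cost_per_host_hour(host_hours, price_per_host_hour):
    """Per-host-hour plan: pay only for the hours each host was actually running."""
    return host_hours * price_per_host_hour


# Hypothetical fleet: 4 hosts running all month (720 hours each),
# plus 6 extra hosts that scale out for 3 hours a day over 30 days.
host_hours = 4 * 720 + 6 * 3 * 30  # 3420 host-hours
peak_hosts = 4 + 6

# Hypothetical prices, for illustration only.
flat_plan = monthly_cost_per_host(peak_hosts, price_per_host=15.0)
hourly_plan = monthly_cost_per_host_hour(host_hours, price_per_host_hour=0.03)
```

With these made-up numbers the hourly plan comes to roughly 102.60 against 150.00 for the per-host plan, because the short-lived hosts are only billed while they exist. A fleet that never scales could tip the comparison the other way.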

Page 8: Monitoring Your AWS Cloud Infrastructure

External tests

External tests detect failure & alert you so that you react

Treats your app as a black box
Periodic check from a bot
Define expected response (specific string)
Tests from different geographies
Report on average response time, latency etc.
Alert via email, SMS, phone
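A minimal sketch of such a black-box check, using only the Python standard library. The function names, the 2-second latency limit and the idea of returning a list of problems are assumptions made for illustration; the slides do not describe a specific tool. Scheduling the check periodically, running it from several geographies and sending the alert are left out.

```python
import time
import urllib.error
import urllib.request


def evaluate_check(status, body, elapsed, expected, max_seconds):
    """Return the problems found in one check; an empty list means healthy."""
    problems = []
    if status != 200:
        problems.append(f"unexpected status {status}")
    if expected not in body:
        problems.append(f"expected string {expected!r} not found")
    if elapsed > max_seconds:
        problems.append(f"slow response: {elapsed:.2f}s > {max_seconds:.2f}s")
    return problems


def run_check(url, expected, max_seconds=2.0, timeout=10.0):
    """Fetch the URL once, as a bot would, and evaluate the response."""
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
            body = response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        # Must be caught first: HTTPError is a subclass of URLError.
        status, body = exc.code, ""
    except (urllib.error.URLError, OSError) as exc:
        return [f"request failed: {exc}"]
    elapsed = time.monotonic() - started
    return evaluate_check(status, body, elapsed, expected, max_seconds)
```

Calling `run_check("https://example.com/", expected="Example Domain")` from a scheduler would perform one check; a non-empty result is what would be forwarded by email, SMS or phone.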

Page 9: Monitoring Your AWS Cloud Infrastructure

Server & Systems monitoring

Server monitoring collects data from OS and Systems

Server metrics (CPU, Load Average, RAM, IO activity)
System metrics (Apache status, MySQL connections...)
Typically works via an agent or remote access
Can point towards root cause
But can't trace issues to specific parts of your code
Helps with capacity planning and scaling decisions
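To make the server metrics concrete, here is a sketch of what a very small agent might collect with the Python standard library alone. Load average is only available on Unix-like systems, and RAM or IO figures usually need an OS-specific source or a library such as psutil, so they are omitted. The thresholds in `capacity_warnings` are arbitrary assumptions.

```python
import os
import shutil
import time


def collect_server_metrics(path="/"):
    """Take one snapshot of a few OS-level metrics."""
    metrics = {"timestamp": time.time(), "cpu_count": os.cpu_count()}
    try:
        load1, load5, load15 = os.getloadavg()  # Unix only
        metrics.update(load_1m=load1, load_5m=load5, load_15m=load15)
    except (AttributeError, OSError):
        pass  # load average is not available on this platform
    usage = shutil.disk_usage(path)
    if usage.total:
        metrics["disk_used_pct"] = 100.0 * usage.used / usage.total
    return metrics


def capacity_warnings(metrics, max_load_per_cpu=1.0, max_disk_pct=90.0):
    """Turn a snapshot into simple capacity hints (thresholds are illustrative)."""
    warnings = []
    cpus = metrics.get("cpu_count") or 1
    load = metrics.get("load_5m")
    if load is not None and load / cpus > max_load_per_cpu:
        warnings.append("sustained load exceeds CPU capacity")
    if metrics.get("disk_used_pct", 0.0) > max_disk_pct:
        warnings.append("disk almost full")
    return warnings
```

A snapshot like this says that a server is overloaded, not which part of the code is responsible, which matches the limitation noted on the slide.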

Page 10: Monitoring Your AWS Cloud Infrastructure

Process monitoring

Processes die or misbehave... Monitor their health and automate response

Tools that check critical processes
Restart crashed processes
...or those using too many resources
Can configure complex scenarios
Beware of false positives
Beware of recurring restarts
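The warning about recurring restarts can be sketched as a restart budget: restart a dead process automatically, but stop and escalate once it has been restarted too often within a time window. The class and function names and the default limits are hypothetical; the liveness check and the restart action are passed in as callables so the logic stays independent of any specific tool.

```python
from collections import deque


class RestartBudget:
    """Allow at most `max_restarts` restarts within a sliding `window` of seconds."""

    def __init__(self, max_restarts=3, window=300.0):
        self.max_restarts = max_restarts
        self.window = window
        self._restarts = deque()

    def allow_restart(self, now):
        # Forget restarts that have fallen out of the sliding window.
        while self._restarts and now - self._restarts[0] >= self.window:
            self._restarts.popleft()
        if len(self._restarts) >= self.max_restarts:
            return False  # restart loop suspected: escalate instead
        self._restarts.append(now)
        return True


def supervise_once(is_alive, restart, budget, now):
    """Run one supervision pass; returns 'ok', 'restarted' or 'escalate'."""
    if is_alive():
        return "ok"
    if budget.allow_restart(now):
        restart()
        return "restarted"
    return "escalate"
```

In a real supervisor `is_alive` might wrap `subprocess.Popen.poll() is None` and `now` would come from `time.monotonic()`. Passing the clock in keeps the behaviour easy to test.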

Page 11: Monitoring Your AWS Cloud Infrastructure

Application monitoring

A 'Flight recorder' for your code helps you fix real issues.

It is often hard to recreate a production issue.
Plugs into your app servers & tracks execution
Code tracing
Captures errors, input variables and debugging info
Records performance metrics
Time spent on DB, Cache, external services
Overhead of specific classes or methods
Slow queries
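A toy version of the 'flight recorder' idea, to show the kind of data such tools capture. Real application monitoring products hook into the app server automatically; this sketch instead uses an explicit context manager, and the `FlightRecorder` name, its fields and the slow-call threshold are assumptions made for illustration.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class FlightRecorder:
    """Record time spent per category, slow calls, and errors with their inputs."""

    def __init__(self, slow_threshold=0.5):
        self.slow_threshold = slow_threshold
        self.time_by_category = defaultdict(float)
        self.slow_calls = []
        self.errors = []

    @contextmanager
    def track(self, category, label, **context):
        started = time.perf_counter()
        try:
            yield
        except Exception as exc:
            # Keep the input variables next to the error, then let it propagate.
            self.errors.append({"label": label, "error": repr(exc), "context": context})
            raise
        finally:
            elapsed = time.perf_counter() - started
            self.time_by_category[category] += elapsed
            if elapsed >= self.slow_threshold:
                self.slow_calls.append((label, elapsed))
```

Wrapping a database call as `with recorder.track("db", "load_listing", listing_id=42): ...` adds its duration to the `db` total and, if it raises, stores the error together with `listing_id`, so the failure can be examined without recreating it.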

Page 12: Monitoring Your AWS Cloud Infrastructure

End user monitoring

Get real data about the experience of your app's users

It works for you. Does it work for them?
Servers running ok. What about that 3rd party widget?
Typically collects actual end user data via JS
Capture performance issues faced by user segments
OS / browser / addons
Network connection speed
Geographical location
First time visit vs warm browser cache

Page 13: Monitoring Your AWS Cloud Infrastructure

Log aggregators

Centralized storage of logs for cloud setups with ephemeral instances

Logs are sent over to a centralized repository
Persists after the server has been decommissioned
Logs are captured, stored, archived & recycled
Logs are indexed and analyzed
Preconfigured analyzers for known apps
Free text analyzers for less known apps
Alerts based on specific patterns, frequencies
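The last point, alerts based on patterns and frequencies, can be sketched as a sliding-window counter over the centralized log stream. The `PatternAlert` class, the regular expression and the thresholds are hypothetical examples; actual log aggregators expose this through their own configuration.

```python
import re
from collections import deque


class PatternAlert:
    """Fire when `pattern` matches at least `threshold` lines within `window` seconds."""

    def __init__(self, pattern, threshold, window):
        self.regex = re.compile(pattern)
        self.threshold = threshold
        self.window = window
        self._hits = deque()

    def feed(self, timestamp, line):
        """Process one log line; return True when the alert condition is met."""
        if not self.regex.search(line):
            return False
        self._hits.append(timestamp)
        # Drop matches that are older than the window.
        while self._hits and timestamp - self._hits[0] > self.window:
            self._hits.popleft()
        return len(self._hits) >= self.threshold


# Hypothetical lines arriving from several instances: (seconds, text).
alert = PatternAlert(r"\bERROR\b", threshold=3, window=60.0)
stream = [
    (0.0, "web-1 ERROR db timeout"),
    (5.0, "web-2 INFO request served"),
    (10.0, "web-2 ERROR db timeout"),
    (30.0, "web-1 ERROR db timeout"),
    (200.0, "web-3 ERROR disk full"),
]
fired = [alert.feed(ts, line) for ts, line in stream]
```

Because the lines come from a central repository, errors spread thinly across several instances still add up to one alert, which a per-server check would miss.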

Page 14: Monitoring Your AWS Cloud Infrastructure

Swiss Army knives

The future might belong to holistic monitoring solutions

Monitoring at multiple levels
Correlating data can be a godsend for DevOps
Cloud management tools might move to integrate or provide such functionality

Page 15: Monitoring Your AWS Cloud Infrastructure

A common pitfall

While it does have its uses, you should not rely on custom application logging

Typically inconsistent logging that is added reactively
Developer bias and lack of understanding of operational issues
Logging what you anticipate to go wrong
Increased code maintenance costs and risks
Can hurt performance if you are not careful
Instead use a proper monitoring toolset
Let developers focus on building new functionality

Page 16: Monitoring Your AWS Cloud Infrastructure

Cloud Analytics

Combine traditional monitoring with Newvem's Analytics and make the most of the cloud

Powerful analytics of cloud usage data
Reveal security & availability issues in your cloud infra
Get actionable insights
Identify opportunities for cost reductions
Spot overloaded resources requiring vertical or horizontal scaling
Visibility and confidence that you are making the most of the cloud

Page 17: Monitoring Your AWS Cloud Infrastructure


Page 18: Monitoring Your AWS Cloud Infrastructure

Questions?