appsphere 15 - smoke jumping with appdynamics

Smoke Jumping With AppDynamics Jim Waldron, IHS Principal Global Application Support Engineer

Upload: appdynamics

Post on 22-Mar-2017


TRANSCRIPT

Page 1: AppSphere 15 - Smoke Jumping with AppDynamics

Smoke Jumping With AppDynamics Jim Waldron, IHS Principal Global Application Support Engineer

Page 2: AppSphere 15 - Smoke Jumping with AppDynamics

INTRODUCTION

Page 3: AppSphere 15 - Smoke Jumping with AppDynamics

My Career

-  25 years in the industry working with .coms and Fortune 500 companies.

-  I have done everything from C, C++, and Java programming to systems administration work.

-  My career found me; I don’t believe I found it.

-  Degree is in History, not IT.

-  Always have found myself in a Smoke Jumper role: drop in, fix it, get the heck out!

-  No better way to learn than Baptism by Fire!

-  There is nothing in life I would rather be doing than what I am doing now.

Page 4: AppSphere 15 - Smoke Jumping with AppDynamics

ABOUT IHS


Page 5: AppSphere 15 - Smoke Jumping with AppDynamics

About IHS

IHS offers a unique combination of information, analytics, and expertise. Our solutions and capabilities are augmented by a professional team of subject matter experts, analysts, and consultants. These thought leaders provide you with actionable intelligence for expedited and improved decision making. Customers, prospects, and media outlets alike rely upon IHS thought leaders for analysis, forecasts, and perspectives on topics, events, and issues that impact the global business landscape.

Page 6: AppSphere 15 - Smoke Jumping with AppDynamics

Industries IHS Provides Services To

-  Academic and Education

-  Aerospace and Defense

-  Agriculture

-  Automotive

-  Chemicals

-  Electronics

-  Energy Oil and Gas

-  Power and Utility

-  Maritime

-  As well as many others

Page 7: AppSphere 15 - Smoke Jumping with AppDynamics

IT TAKES A BRIGADE NOT AN INDIVIDUAL TO PUT OUT A FIRE

Page 8: AppSphere 15 - Smoke Jumping with AppDynamics

Initial Jump Into the Fire

-  First day I realized I had dropped into an OH S*&T moment

-  I guess I missed, in the interview, the sheer number of applications that I was going to support.

-  The function of the AppSupport team was pretty primitive.

-  History had dictated trust issues between the teams.

-  As a proponent for change, I had to build trust between the Development and AppSupport teams.

-  Root cause analysis took hours, sometimes days.

Page 9: AppSphere 15 - Smoke Jumping with AppDynamics

The main function of the AppSupport Team was to jump in and fix the fire, nothing more, nothing less!

Calls would come from the International Operations Center and be routed to our team.

Issues could take hours to days to figure out and resolve. Not only was this time consuming, it was not always received well.

Multi-environment issues made it difficult to determine which underlying application was causing the issue.

It used to take days, sometimes weeks, to dig through log files in the hope of determining whether the incident was code, application, or infrastructure related, all at our customers’ expense.

Copyright © 2015 AppDynamics. All rights reserved. 9

Page 10: AppSphere 15 - Smoke Jumping with AppDynamics

FROM SMOKE JUMPER TO FIRE BRIGADE

Page 11: AppSphere 15 - Smoke Jumping with AppDynamics

Pulling together a Brigade

-  Battle to gain trust.

-  Dev Team = Deployments.

-  Become a proponent for change.

-  Went from the fire being 10% contained to 75% in a relatively short amount of time.

-  Can tools help?

Page 12: AppSphere 15 - Smoke Jumping with AppDynamics

Core Requirements for Choosing an APM

-  Time to Value – Deep visibility out of the box.

-  Ease of use.

-  App to App correlation for multi-application interaction.

-  We needed a partner, not just a vendor.

-  We needed a tool that delivers data, not emotion.

Page 13: AppSphere 15 - Smoke Jumping with AppDynamics

What AppDynamics Provided that Dynatrace Didn’t

-  Immediate results right out of the box.

-  Detailed Information for All.

-  Finally, the good fight was starting to pay off.

-  DevOps was starting to become a reality, not just a word being thrown around.

-  DevOps only works if all teams buy into it, not just individuals.

-  Functionality to see the application as a whole.

-  Collaboration was set in motion.

-  Resolution was finally on track.

Page 14: AppSphere 15 - Smoke Jumping with AppDynamics

From Lone Jumper to a Brigade

After a few months of gaining trust, we decided to look at a few APMs to help us gain better insight into our outages. We pitted Dynatrace against AppDynamics over a period of a few months. We were looking for a partner in this battle, not just a solution. We achieved this with AppDynamics and not so much with Dynatrace. We could determine root cause, whether Dev or Ops, and have visible proof that was definitive, not just speculative. With visible proof and data, we could go to the RIGHT team, and it was easy to pull together the brigade and put out the fire.


Page 15: AppSphere 15 - Smoke Jumping with AppDynamics

FINALLY A BRIGADE TO FIGHT THE FIRES

Page 16: AppSphere 15 - Smoke Jumping with AppDynamics

What we Learned In Order to Become a Brigade

-  Better insight and communication were key.

-  Finger pointing stopped and collaboration began.

-  Solid Bridges were built.

-  Time to resolve was reduced by 72%. What took days now only takes minutes or hours.

-  With valid results and effective communication, application, infrastructure, or database issues are resolved with releases.

-  Complete turnaround from 2 years ago.

-  The word DevOps now has meaning.

Page 17: AppSphere 15 - Smoke Jumping with AppDynamics

“AppDynamics is showing issues unfold before synthetics!”

“Is helping us understand our peak times and how it impacts our customers”

“Could be the difference between an outage or not”

“Much more detail on issues, less confusion”

“The ability to understand where calls are coming from, and which applications are being affected”

“We are actually working as a brigade and not just smoke jumpers!”

Page 18: AppSphere 15 - Smoke Jumping with AppDynamics

AppDynamics Application Impact

Oil and Gas Discovery Tools

Baseline (Pre-AppDynamics)

•  2-3 complex performance issues per App per month

•  Avg MTTR = 5 days

•  Root cause discovered 10% of time

•  Avg 5 FTEs to troubleshoot

BVA Projected Benefit

•  Reduce Root Cause Analysis by 65%

•  Reduce MTTR by 65%

BVR Realized Benefit

•  75% MTTR reduction

•  Avg 1 FTE rather than 5

•  AND, additional benefit of improving Root Cause to ensure ‘less repeat’ issues
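
The figures on this slide can be checked with a little arithmetic. The sketch below is illustrative only. It takes the 5-day baseline MTTR, the 65% projected and 75% realized reductions, and the change from 5 FTEs to 1 from the slide. It also assumes, which the talk does not state, that every FTE is engaged for the full duration of an incident. All function and variable names are hypothetical.

```python
def reduced(value: float, reduction_pct: float) -> float:
    """Return `value` after reducing it by `reduction_pct` percent."""
    return value * (1 - reduction_pct / 100)


# Figures taken from the slide.
BASELINE_MTTR_DAYS = 5.0
PROJECTED_REDUCTION_PCT = 65.0  # BVA projected benefit
REALIZED_REDUCTION_PCT = 75.0   # BVR realized benefit
BASELINE_FTES = 5
REALIZED_FTES = 1

projected_mttr_days = reduced(BASELINE_MTTR_DAYS, PROJECTED_REDUCTION_PCT)
realized_mttr_days = reduced(BASELINE_MTTR_DAYS, REALIZED_REDUCTION_PCT)

# Assumption (not from the talk): each FTE works the whole incident,
# so effort per incident is simply MTTR multiplied by head count.
baseline_effort = BASELINE_MTTR_DAYS * BASELINE_FTES
realized_effort = realized_mttr_days * REALIZED_FTES
effort_reduction_pct = (1 - realized_effort / baseline_effort) * 100

print(f"Projected MTTR: {projected_mttr_days:.2f} days")
print(f"Realized MTTR:  {realized_mttr_days:.2f} days")
print(f"Effort per incident: {baseline_effort:.2f} -> {realized_effort:.2f} person-days")
print(f"Effort reduction: {effort_reduction_pct:.0f}%")
```

Under that assumption, the drop in head count compounds with the shorter resolution time, so the effort saved per incident is larger than the MTTR reduction alone suggests.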

Page 19: AppSphere 15 - Smoke Jumping with AppDynamics

AFTER THE FIRE WHAT WAS LEARNED

Page 20: AppSphere 15 - Smoke Jumping with AppDynamics

“We now have a view of all integration points. Much easier to pinpoint issues and understand complexity. We can also see opportunities for consolidation.” – Dan Hauser, Principal Software Engineer, US Land Team


Page 21: AppSphere 15 - Smoke Jumping with AppDynamics


Cross Application View (Impossible before AppDynamics)

Page 22: AppSphere 15 - Smoke Jumping with AppDynamics

Multi-Environment Testing

We have several Applications that use our Access Control Environment. All Applications have the potential to impact our front end. Before we implemented AppDynamics, determining which Application was causing an issue was difficult, and multiple teams had to be engaged. With AppDynamics we are able to determine which Application is having the issue and engage the proper team to resolve it quickly. Our Development Teams are now using AppDynamics to test their code in their Stage Environments to research and identify coding issues before releasing to production.


Page 23: AppSphere 15 - Smoke Jumping with AppDynamics

Looking Forward

-  By mid-January 2016 we are looking to mature our usage of AppDynamics.

-  We will be implementing deeper and more thorough use of AppDynamics by the PDD Team for upstream, pre-production tuning and optimization of code and applications.

-  This will allow us to flip the balance of post-production operational troubleshooting from 80/20 reactive vs. proactive to 80/20 proactive vs. reactive.

-  This will ensure tighter collaboration between PDD and Application Support during load/stress testing and pre-production testing to ensure applications are optimal and consistent between pre and post-production releases.

-  Proactive analysis and trending against baselines to detect and alert on deviations with sufficient time to collaboratively engage and analyze the cause prior to any customer impact.
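
As a rough illustration of what "trending against baselines to detect and alert on deviations" can mean, the sketch below flags a measurement that sits well above the recent average. It is a minimal sketch under stated assumptions, not AppDynamics' baseline algorithm: the function name, the three-standard-deviation threshold, and the sample response times are all made up for the example.

```python
from statistics import mean, stdev


def deviates_from_baseline(history, current, sigmas=3.0):
    """Return True if `current` is more than `sigmas` sample standard
    deviations above the mean of `history`.

    Hypothetical helper for illustration; real APM tools use their own
    baselining logic (time-of-day and seasonality aware, for example).
    """
    if len(history) < 2:
        # Not enough data to estimate a spread; do not alert.
        return False
    baseline = mean(history)
    spread = stdev(history)
    return current > baseline + sigmas * spread


# Made-up response times in milliseconds for one business transaction.
response_times_ms = [120, 130, 125, 118, 135, 128, 122, 131]

for sample in (130, 200):
    status = "ALERT" if deviates_from_baseline(response_times_ms, sample) else "ok"
    print(f"{sample} ms: {status}")
```

A check like this only provides "sufficient time to collaboratively engage" if it runs continuously against fresh data, which is why the threshold and the history window are the two knobs worth tuning.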

Page 24: AppSphere 15 - Smoke Jumping with AppDynamics

Thank You Jim Waldron Principal Global Applications Engineer [email protected] www.ihs.com

Page 25: AppSphere 15 - Smoke Jumping with AppDynamics

COLLABORATION IS THE KEY

Page 26: AppSphere 15 - Smoke Jumping with AppDynamics

Pulling together a Brigade

-  At first it seemed like an uphill battle to gain trust.

-  Started with a simple task regarding the Dev Team: Deployments.

-  Never gave up and kept fighting the good fight. Became a proponent for change.

-  Went from the fire being 10% contained to 75% in a relatively short amount of time.

-  Started to gain trust and build a team of jumpers to fight the fire.

-  After a while we started to look into tools to help fight the good fight.

Page 27: AppSphere 15 - Smoke Jumping with AppDynamics

Collaborate, Collaborate, Collaborate!

Prior to AppDynamics, multiple excuses were given for the root cause of an issue:

We heard, “That is a known issue and has never really caused any impact before, so why now?”

We heard, “We will fix that issue in the next release,” and then the issue would arise again.

My favorite was, “How do you know that is an issue?”, after we had searched logs for hours and sometimes days to determine root cause.

Now the data provided to the Dev Teams is indisputable. It has allowed us to collaborate with them to detect, investigate, and reveal known and potential issues with their Applications, and to work towards better coding, which leads to less downtime and shorter time to resolve.


Page 28: AppSphere 15 - Smoke Jumping with AppDynamics

Production – Stage – Test – Dev All In One Place

Page 29: AppSphere 15 - Smoke Jumping with AppDynamics

Initial Dashboard View Into the Application

Page 30: AppSphere 15 - Smoke Jumping with AppDynamics

Drilling Down to Root Cause

Page 31: AppSphere 15 - Smoke Jumping with AppDynamics

Determining Root Cause

Page 32: AppSphere 15 - Smoke Jumping with AppDynamics

Root Cause

Page 33: AppSphere 15 - Smoke Jumping with AppDynamics

Custom Dashboards

Key Dashboards

•  Utilizing the Application Support Dashboard

•  Providing a dashboard for Development

•  Merging the two for full coverage

•  SysOps dashboard

Key Metrics

•  Node CPU and Memory usage

•  Call load and slow calls per node

•  Node response time

•  Disk activity per node

•  Slow, very slow, and stalled calls

•  CPM (calls per minute) and EPM (errors per minute)


Page 34: AppSphere 15 - Smoke Jumping with AppDynamics

Why These Metrics?

•  Application versus Infrastructure insight

•  Determine responsibility

•  Visible proof removes any doubt about responsibility


Page 35: AppSphere 15 - Smoke Jumping with AppDynamics

Give Dev Their Own Dashboard!

•  Development metrics != Ops metrics

•  Business Transactions are most important to Developers

•  Lightning-fast resolution to root cause in code


Page 36: AppSphere 15 - Smoke Jumping with AppDynamics

Development Dashboard