Managing Quality of Service for Containerized Microservice Applications
Speaker Bios
• Michael Krumm, Product Manager
  ‣ 1.5 yrs, Sales Engineer, AppDynamics
  ‣ 1 yr, Software Consultant, BMC Software
  ‣ 2.5 yrs, Department Manager and Head of IT, Hospital
• Pete Abrams, Founder & COO
  ‣ 2 yrs, VP Innovation, AppDynamics
  ‣ 4 yrs, VP Channel Sales, AppDynamics
  ‣ 10 yrs, sales and marketing at Sun Microsystems
  ‣ 5 yrs, VP Marketing, Netcontinuum
The Challenge of MicroServices for APM
Instana is a Gartner Cool Vendor 2016: Availability and Performance
“Microservice architectures bring new complexity, in terms of scale and dynamism, to assessing the status of the application environment.”
Cameron Haight, Chief of Research, Infrastructure and Operations at Gartner, Inc.
From: How Microservices Have a Macro Affect on APM, June, 2016
What are MicroService Applications?
Applications are:
• written in different languages
• maintained independently
• deployed automatically
• terminated after use
• invoked on demand
• scaled dynamically
No longer rigid, hard-wired blocks of functionality, but rather business processes made from the interactions of a multitude of (micro)services.
Modern systems are built with resilience.
The new QoS challenge is the dynamism and interactions,
not so much the piece parts.
The Microservice Technology Stack
[Figure: the stack layers Host, Container, Middleware, Cluster, and Services]
Traditional Monitoring Creates Too Many Alarms
[Figure: the layers Host, Container, Middleware, Cluster, and Services, each raising its own alerts: CPU high, load too high, GC overhead (JVM), re-balancing, code exceptions/errors]
An issue with a component probably does not affect the Quality of (micro)Service
The Challenge of Monitoring MicroServices-based Applications
[Figure: the stack from Cluster, Host, Container, and Middleware up to the Service and the USER]
• deep, diverse technology stacks
• complex, unpredictable service interactions
• constantly changing everything
• scale: even small systems have 100s of parts
GOAL:
Quality of (micro)Service Management: In Production,
With Minimal Impact, and Zero Configuration
A modern application is the usage patterns of microservices
Monitoring those services is required to manage the application
Monitoring With Instana
Management by Incident
Incidents report all correlated changes and issues
Quality of the (micro) Services
‣ Incidents are raised when quality is impacted
‣ Quality is defined by KPIs:
  - Throughput
  - Latency
  - Error Rate
  - Saturation
‣ KPI health is determined by machine learning
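As a rough illustration of the four KPIs named above, the sketch below computes throughput, latency, error rate, and saturation for one service over one time window. The `Request` record, the `service_kpis` function, and the definition of saturation as the share of busy workers are assumptions made for this example; they are not Instana's implementation, which the slides say judges KPI health with machine learning.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Request:
    """One observed call to a service (hypothetical record layout)."""
    duration_ms: float
    failed: bool


def service_kpis(requests, window_s, busy_workers, total_workers):
    """Compute the four KPIs for one service over one time window."""
    count = len(requests)
    return {
        # Requests handled per second in the window.
        "throughput_rps": count / window_s,
        # Mean request duration; 0.0 when the window is empty.
        "latency_ms": mean(r.duration_ms for r in requests) if count else 0.0,
        # Fraction of requests that failed.
        "error_rate": sum(r.failed for r in requests) / count if count else 0.0,
        # Assumed definition: share of worker capacity currently in use.
        "saturation": busy_workers / total_workers,
    }


window = [
    Request(20.0, False),
    Request(40.0, False),
    Request(30.0, True),
    Request(30.0, False),
]
kpis = service_kpis(window, window_s=2, busy_workers=3, total_workers=4)
print(kpis)
```

A fixed threshold on any of these values would bring back the alert noise described earlier, which is presumably why the slides pair the KPIs with learned health baselines.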
Curated Expert Knowledge = component health understanding
Component Health Reported within Incidents
The Dynamic Graph
[Figure: example graph linking a Search Product trace, App A, Index A, the ES Cluster and its ES Nodes, and Spring Boot to the JVM, Process, Container, Host, and Zone each one runs on]
A model to correlate relationships and interaction
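To make the idea of a correlation model concrete, here is a minimal sketch of a dependency graph that answers "which services can an issue on this component reach?". The node names, the edge list, and the `build_graph` and `impacted_by` helpers are hypothetical, loosely modeled on the labels in the figure; the slides do not describe how the Dynamic Graph is actually stored or queried.

```python
from collections import defaultdict

# Hypothetical edges, each read as (component, thing that depends on it).
EDGES = [
    ("host-1", "container-1"),
    ("container-1", "process-1"),
    ("process-1", "jvm-1"),
    ("jvm-1", "es-node-1"),
    ("es-node-1", "es-cluster"),
    ("es-cluster", "index-a"),
    ("index-a", "search-product"),
    ("host-2", "container-2"),
    ("container-2", "process-2"),
    ("process-2", "jvm-2"),
    ("jvm-2", "spring-boot-app"),
    ("spring-boot-app", "search-product"),
]


def build_graph(edges):
    """Map each component to the set of things that directly depend on it."""
    dependents = defaultdict(set)
    for component, dependent in edges:
        dependents[component].add(dependent)
    return dependents


def impacted_by(graph, component):
    """Return everything that directly or transitively depends on component."""
    seen, stack = set(), [component]
    while stack:
        for dependent in graph.get(stack.pop(), ()):
            if dependent not in seen:
                seen.add(dependent)
                stack.append(dependent)
    return seen


graph = build_graph(EDGES)
print(sorted(impacted_by(graph, "jvm-1")))
```

With a model like this, an issue on one JVM can be attached to the service it may affect instead of being raised as an isolated component alert.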
One Agent per Host
One Sensor per active component
Trace messages between microservices
[Figure: Agent with Elasticsearch, Tomcat, JVM, and Linux sensors; Sensor Repository (Auto Discovery / Auto Update); Knowledge Engine; Communication; Local Sensor Memory & Contextual Compression]
Immediate, Automatic and Continuous Discovery of Components and Dependencies
1 SECOND RESOLUTION
Others
Instana collects 1 second resolution data. Data viewed as 1 minute running average.
Aggregation = loss of information | Dynamic applications demand high resolution data
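The point about aggregation can be shown with a small worked example. The latency samples below are invented for illustration: one value per second for a minute, steady at 50 ms with a three-second spike to 2000 ms.

```python
from statistics import mean

# Hypothetical latency samples in milliseconds, one per second for one minute.
samples = [50.0] * 60
samples[30:33] = [2000.0, 2000.0, 2000.0]  # a three-second spike

# What a 1-minute aggregate reports versus what 1-second data still shows.
one_minute_average = mean(samples)
worst_second = max(samples)
seconds_over_limit = sum(1 for s in samples if s > 1000.0)

print(f"1-minute average: {one_minute_average:.1f} ms")
print(f"worst 1-second sample: {worst_second:.1f} ms")
print(f"seconds above 1000 ms: {seconds_over_limit}")
```

The one-minute average comes out at 147.5 ms, which looks only mildly elevated, while the per-second data shows three seconds at 2000 ms.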
Demo Application “The Shop”
• Online Shop with simulated traffic
• total of 22 Services
• Languages
  ‣ Java, PHP, Node.js
• Components
  ‣ Docker
  ‣ Marathon
  ‣ Spring Boot
  ‣ Cassandra, Elasticsearch, MySQL, MongoDB
  ‣ RabbitMQ, Kafka, Redis, Memcached
  ‣ nginx, HAProxy
  ‣ and more…
A day with Instana
• Ops is notified about an Incident
• Identify and understand the issue
• Work on remediation
Demo
Value to Developers
• Runtime behavior and architecture in production
• Identify code improvement opportunities
• Troubleshoot performance and errors
• Understand deployment impact in seconds
Value to Operations
• Full Stack visibility and navigation: infrastructure to application to trace and back
• Automatic and intelligent Incident management
• Real time insights and comparison
Value to Product Owner
• Understand service usage
• Manage service performance
• Identify improvements
• Prioritize based on impact
Q & A
Instana Processing Pipeline
[Figure: pipeline components: Sensor Data, Data Ingestion & Health Calculation, Realtime Stream Processing, Incident Detection, Alerting, Quality of Service, Dependency, Health, Metrics, 3D Map, Dynamic Knowledge Graph, API & CLI, Configuration]
3 seconds from sensing to alerting
Sensor Availability
[Figure: Supported Technologies, grouped under Sensors and Tracing]
Data Retention
• Metrics Data Retention
  ‣ 1 second data granularity is stored for 10 minutes
  ‣ 5 seconds for 24 hours
  ‣ 60 seconds for 1 month
  ‣ 300 seconds/5 minutes for 3 months
  ‣ 3600 seconds/1 hour forever
• Graph/Configuration Data Retention
  ‣ each change of the Graph is kept forever
• Events Data Retention
  ‣ each event is kept forever
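The metric retention tiers listed above can be read as a lookup from data age to the finest granularity still available. The sketch below encodes them that way; the `TIERS` table, the helper names, and the approximation of "1 month" and "3 months" as 30 and 90 days are assumptions for illustration, not Instana configuration.

```python
MINUTE, HOUR, DAY = 60, 3600, 86400

# (granularity in seconds, retention in seconds or None for "forever"),
# ordered from finest to coarsest as on the slide.
# Assumption: 1 month = 30 days, 3 months = 90 days.
TIERS = [
    (1, 10 * MINUTE),
    (5, 24 * HOUR),
    (60, 30 * DAY),
    (300, 90 * DAY),
    (3600, None),
]


def finest_granularity(age_s):
    """Return the finest granularity (seconds) still stored for data this old."""
    for granularity, retention in TIERS:
        if retention is None or age_s <= retention:
            return granularity
    raise ValueError("unreachable: the last tier keeps data forever")


def points_in_tier(granularity, retention):
    """Number of data points one metric holds in a bounded tier."""
    return retention // granularity


print(finest_granularity(5 * MINUTE))
print(finest_granularity(2 * DAY))
print(finest_granularity(365 * DAY))
print([points_in_tier(g, r) for g, r in TIERS if r is not None])
```

Under these assumptions a single metric holds at most 600, 17,280, 43,200, and 25,920 points in the four bounded tiers, which shows how the coarser tiers keep long-term storage per metric small.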
On-Prem Deployment
[Figure: Customer’s Data Center (Sensor Data, Instana Knowledge Engine, On-Prem Instana Service, 3D Map) connected via Authentication + HTTPS to the Instana Cloud (Usage Billing Data, Instana Monitoring, User Management, Updates)]