Post on 22-Jan-2018

Page 1: Using Machine Learning to Optimize DevOps Practices

Using Machine Learning to Optimize DevOps Practices

Building Learning into Monitoring and Feedback

Peter Varhol

Page 2: Using Machine Learning to Optimize DevOps Practices

About me

• International speaker and writer

• Degrees in Math, CS, Psychology

• Technology communicator

• Former university professor, tech journalist

• Cat owner and distance runner

• [email protected]

Page 3: Using Machine Learning to Optimize DevOps Practices

Agenda

• What is machine learning?

• How is machine learning applied to DevOps?

• Challenges in training these systems

• What constitutes an issue?

• Summary and conclusions

Page 4: Using Machine Learning to Optimize DevOps Practices

What is Machine Learning?

• Layered algorithms that change parameters based on feedback from known data

  • Can be linear or nonlinear

• Algorithms can be fixed in production or adaptive

  • Fixed – algorithms do not adjust once deployed

  • Adaptive – algorithms continually adjust to new data

• Usually part of a larger system
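
The fixed versus adaptive distinction can be sketched in a few lines. This is a minimal illustration, not a production detector: the class names, the 1.5x factor, and the 0.2 adjustment rate are all assumptions made up for the example.

```python
class FixedDetector:
    """Learns a baseline once from known data, then never changes it."""

    def __init__(self, known_values, factor=1.5):
        self.baseline = sum(known_values) / len(known_values)
        self.factor = factor  # hypothetical tolerance above the baseline

    def is_unusual(self, value):
        return value > self.baseline * self.factor


class AdaptiveDetector(FixedDetector):
    """Same rule, but the baseline keeps moving toward each new observation."""

    def __init__(self, known_values, factor=1.5, rate=0.2):
        super().__init__(known_values, factor)
        self.rate = rate  # hypothetical adjustment speed

    def is_unusual(self, value):
        unusual = super().is_unusual(value)
        # Adjust after judging, so the new value cannot excuse itself.
        self.baseline += self.rate * (value - self.baseline)
        return unusual
```

Both detectors start from the same known data. After a run of gradually higher values, the fixed one still judges against the original baseline, while the adaptive one has shifted and may accept a value the fixed one flags.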

Page 5: Using Machine Learning to Optimize DevOps Practices

Adaptive Systems

• Airline pricing

  • Ticket prices change three times a day based on demand

  • It can cost less to go farther

  • It can cost less later

• Ecommerce systems

  • Recommendations try to discern what else you might want

  • Can I incentivize you to fill up the plane?

Page 6: Using Machine Learning to Optimize DevOps Practices

Why Use Adaptive?

• The “right” result will vary over time

• Trying to optimize a particular result

  • Revenue

• The problem domain is not static

Confidential, Dynatrace LLC

Page 7: Using Machine Learning to Optimize DevOps Practices

How Are Fixed Systems Used?

• Transportation

  • Self-driving cars

  • Aircraft/Drones

• Ecommerce

  • Recommendation engines

• Medical

  • Diagnosis systems

Page 8: Using Machine Learning to Optimize DevOps Practices

Why Use Fixed Machine Learning Systems?

• The problem domain is static

• The expectations remain constant

• The right answer is known under most conditions

• The original algorithms remain valid over a long period of time

Page 9: Using Machine Learning to Optimize DevOps Practices

DevOps Practices Generate Data

• During development

  • Agile metrics, JIRA issues, test case metrics

• During continuous integration

  • System test metrics

• During continuous deployment

  • Quality metrics for deployments

• After deployment and into production

  • Application availability and performance

  • Usage log files

Page 10: Using Machine Learning to Optimize DevOps Practices

Focus on Monitoring

• Ongoing data on availability and performance

  • RUM (real user monitoring)

  • Synthetic tests

  • Application monitoring

• Monitoring tackles the back end of DevOps

  • Identifies unhealthy trends

  • Diagnoses failures and poor performance

  • Recommends action

• Fixed or adaptive depends on your goals

Page 11: Using Machine Learning to Optimize DevOps Practices

Where Do Predictive Analytics Come In?

• Big data makes it possible to predict future events

  • Are we going to fail?

  • How will we perform with traffic surges?

• As well as explain past events

  • What went wrong and how do we fix it?

• We can rely on past data

  • Adaptive systems may not perform as well

• Clear goals needed

Page 12: Using Machine Learning to Optimize DevOps Practices

What Technologies Are Involved?

• Neural networks

• Genetic algorithms

• Rules engines

Page 13: Using Machine Learning to Optimize DevOps Practices

Neural Networks

• Set of layered algorithms whose variables can be adjusted via a learning process

• The learning process involves training with known inputs and outputs

• The algorithms adjust coefficients to converge on the correct answer (or not)

• You freeze the algorithms and coefficients, and deploy

  • Or you optimize on a particular set of characteristics
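
The train-then-freeze cycle can be shown with the smallest possible case: a single artificial neuron (a perceptron) rather than a layered network. This is a sketch under stated assumptions. The two inputs stand for hypothetical flags such as "CPU high" and "latency high", and the known output is "raise an alert only when both are set".

```python
def step(weights, bias, inputs):
    """Fire (1) when the weighted sum of the inputs exceeds zero."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train(samples, epochs=20, rate=1.0):
    """Adjust coefficients until outputs match the known answers."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - step(weights, bias, inputs)
            # Nudge each coefficient in the direction that reduces the error.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Known inputs and outputs (hypothetical): alert only if both flags are set.
KNOWN = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

# After training, the coefficients are frozen and only step() runs in production.
weights, bias = train(KNOWN)
```

A real network stacks many such units in layers and uses a gradient-based update, but the cycle is the same: known inputs and outputs in, adjusted coefficients out, then deploy with the coefficients held constant.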

Page 14: Using Machine Learning to Optimize DevOps Practices

A Sample Neural Network

Page 15: Using Machine Learning to Optimize DevOps Practices

Genetic Algorithms

• Use the principle of natural selection

• Create a range of possible solutions

• Try out each of them

• Choose and combine two of the better alternatives

• Rinse and repeat as necessary
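
The steps above can be sketched as a short loop. Everything specific here is an assumption for illustration: the candidates are integers (say, a hypothetical worker pool size), the fitness function is supplied by the caller, and combining two parents by averaging with a small random mutation is just one simple choice among many.

```python
import random


def evolve(fitness, low, high, population_size=8, generations=30, seed=0):
    """Search the integers in [low, high] for a value with high fitness."""
    rng = random.Random(seed)
    # Create a range of possible solutions.
    population = [rng.randint(low, high) for _ in range(population_size)]
    for _ in range(generations):
        # Try out each of them and rank by fitness.
        population.sort(key=fitness, reverse=True)
        # Choose two of the better alternatives.
        parent_a, parent_b = population[0], population[1]
        children = []
        while len(children) < population_size - 2:
            # Combine the parents, then mutate slightly.
            child = (parent_a + parent_b) // 2 + rng.randint(-3, 3)
            children.append(min(high, max(low, child)))
        # Keep the parents so the best candidate is never lost.
        population = [parent_a, parent_b] + children
    return max(population, key=fitness)
```

In practice the fitness function would be a measurement, such as throughput under a test load, rather than a formula. Keeping both parents in each generation means the best candidate found so far can only stay the same or improve.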

Page 16: Using Machine Learning to Optimize DevOps Practices

Bringing in DevOps

• DevOps has data that can be used to train neural networks

  • Health of the application

  • Trends in application traffic and responsiveness

  • Application failure

Page 17: Using Machine Learning to Optimize DevOps Practices

Machine Learning Helps DevOps

• Decisions are complex

  • Why is the CPU maxed?

  • What is causing disk thrashing?

  • Why did the network slow?

  • Why did the application fail?

• Data is massive

  • Potentially thousands of data points a day

Page 18: Using Machine Learning to Optimize DevOps Practices

How Good Are Decisions?

• Expert versus machine

• Given the same data

  • In many domains they tie

• With additional data, the human can be better

• But machine learning will get better

• But only as good as the data

Page 19: Using Machine Learning to Optimize DevOps Practices

We Want to Do Two Things

• Identify trends that may indicate future problems

  • Increasing response times

  • More page errors

• Diagnose faults once they have happened

  • Why did the application fail?

  • How can we fix it as quickly as possible?
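
Spotting a trend such as increasing response times can be sketched as a least-squares slope over recent samples. This is a minimal illustration that assumes evenly spaced measurements. The function names and the alert threshold are hypothetical.

```python
def trend_slope(values):
    """Least-squares slope of values against their sample index."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to measure a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def is_degrading(response_times_ms, max_slope_ms=5.0):
    """Flag a series whose response time grows faster than the allowed slope."""
    return trend_slope(response_times_ms) > max_slope_ms
```

For example, a series of 100, 110, 120, 130 ms has a slope of 10 ms per sample and would be flagged with the default threshold, while a flat series would not.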

Page 20: Using Machine Learning to Optimize DevOps Practices

Fixed Algorithms Work for Some Problems

• Immediate performance and failure identification

• Diagnosis of failures and performance issues

• These are readily identifiable from known data

Page 21: Using Machine Learning to Optimize DevOps Practices

Adaptive Systems Supplement These Tools

• Predictions of future events

  • Performance

  • Availability

• The target is moving

  • So we need current data to adjust the algorithms

Page 22: Using Machine Learning to Optimize DevOps Practices

The Machine Helps the DevOps Expert

• The machine learning app provides:

  • Early warning on possible performance issues and failures

  • Immediate notification of failure or impending failure

  • Trend analysis of data to predict unhealthy outcomes

• The machine learning is an assistant

  • It can’t fix anything

  • It can’t necessarily identify the root cause

Page 23: Using Machine Learning to Optimize DevOps Practices

What is the Goal?

• We have many ways of monitoring

  • Many of them are represented at this conference

• Each measures something a little different

  • Latency, response time, availability, network, DNS . . .

• Too much data can be no better than no data at all

• Machine learning can correlate across measurements

  • Focus to eliminate false positives
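
One very simple way to picture correlating across measurements is to alert only when several independent monitors agree in the same interval. The sketch below uses a plain vote as a stand-in for what a learned model would do. The monitor names and the agreement threshold are assumptions for the example.

```python
def corroborated_alerts(signals, min_agreeing=2):
    """Return one flag per interval, set when enough monitors agree.

    signals maps a monitor name to a list of booleans, one per interval,
    where True means that monitor saw a problem in that interval.
    """
    per_interval = zip(*signals.values())
    return [sum(flags) >= min_agreeing for flags in per_interval]


# Hypothetical readings from three monitors over three intervals.
signals = {
    "latency": [False, True, True],
    "availability": [False, False, True],
    "dns": [False, False, False],
}
alerts = corroborated_alerts(signals)
```

Here the lone latency flag in the second interval is suppressed as a likely false positive, and only the third interval, where two monitors agree, produces an alert.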

Page 24: Using Machine Learning to Optimize DevOps Practices

Intelligent Systems Are Sometimes Wrong

• The problem domain is ambiguous

• There is no single “right” answer

  • “Close enough” is good

• We don’t know quite why the software responds as it does

  • We can’t easily trace code paths

Page 25: Using Machine Learning to Optimize DevOps Practices

Testing Machine Learning Systems

• Have objective acceptance criteria

• Test with new data

• Don’t count on all results being accurate

• Understand the architecture of the network as a part of the testing process

• Communicate the level of confidence you have in the results to management and users
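
The first two points can be sketched as a check that scores a model on data it has not seen and compares the score against a threshold agreed in advance. This is an illustration only. The model, the holdout samples, and the thresholds are all hypothetical.

```python
def accuracy(model, labelled_samples):
    """Fraction of samples where the model's answer matches the label."""
    correct = sum(1 for value, label in labelled_samples if model(value) == label)
    return correct / len(labelled_samples)


def meets_acceptance(model, holdout, min_accuracy=0.9):
    """Objective criterion: accuracy on new data must reach the agreed minimum."""
    return accuracy(model, holdout) >= min_accuracy


# Hypothetical model: call anything slower than 600 ms a problem.
def slow_response_model(response_ms):
    return response_ms > 600


# Hypothetical holdout data the model was not trained on: (response ms, was a problem).
HOLDOUT = [(120, False), (480, False), (650, True), (900, True), (510, True)]
```

On this holdout set the model gets four of five right, so it passes a 0.75 threshold and fails a 0.9 one. The measured accuracy is also the confidence figure to report to management and users.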

Page 26: Using Machine Learning to Optimize DevOps Practices

A Cautionary Tale

• All events are not created equal

• AI systems treat events equally

  • A failure of a system during busy season is the same as any other

• DevOps pros know otherwise

  • And can exert additional effort in response

  • And actually fix the problem

• We can’t automate what we don’t understand

• You need the human in the loop

Page 27: Using Machine Learning to Optimize DevOps Practices

Conclusions

• DevOps is a natural environment for machine learning systems

  • Any activity that generates data and requires a decision is fair game

• Monitoring is low-hanging fruit

• Fixed systems for failure and diagnosis, adaptive for trend analysis

Page 28: Using Machine Learning to Optimize DevOps Practices

References

• https://qz.com/989137/when-a-robot-ai-doctor-misdiagnoses-you-whos-to-blame/

• https://pvarhol.wordpress.com/2017/07/22/what-brought-about-our-ai-revolution/

• https://pvarhol.wordpress.com/2017/06/21/analytics-dont-apply-in-the-clutch/
