
Introduction to DevOps

Len Bass

copyright 2015 Len Bass

Overview

• DevOps: What and why

• Architectural impact of different categories of DevOps practices


DevOps: What and why


Over the wall development


Board has idea

Developers implement

Operators place in production

Time

Where Does the Time Go?

• As Software Engineers our view is that there are the following activities in software development

– Requirements

– Design

– Implementation

– Test

• Code Complete

• Different methodologies will organize these activities in different ways.

• Agile focuses on getting to Code Complete faster than with other methods.


What is wrong?

• Code Complete is not the same as Code in Production

• Between the completion of the code and the placing of the code into production is a step called: Deployment

• Deploying completed code can be very time consuming because of concern about errors that could occur.


What is the workflow for code from a multi-team development effort?

• You develop and test your code in isolation

• Your code is integrated with code developed by other teams to see if an executable can be constructed.

• The built system is tested for correctness

• The built system is tested for performance and other qualities (staging)

• The built system is placed into production
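The workflow above can be read as a sequence of gates: code moves to the next stage only if the current one passes. The sketch below is a minimal illustration under that reading. The stage names follow the bullets, while `run_pipeline` and the lambda checks are hypothetical stand-ins for real builds and test suites.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a check that reports success or failure.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run the stages in order; stop at the first stage whose check fails."""
    passed: List[str] = []
    for name, check in stages:
        if not check():
            return False, passed  # later stages never run
        passed.append(name)
    return True, passed


# The lambdas are hypothetical stand-ins for real builds and test suites.
workflow: List[Stage] = [
    ("develop and test in isolation", lambda: True),
    ("integrate and build", lambda: True),
    ("test for correctness", lambda: True),
    ("staging: performance and other qualities", lambda: False),  # simulated failure
    ("place into production", lambda: True),
]

reached_production, passed_stages = run_pipeline(workflow)
```

Here the simulated staging failure stops the run, so the production stage is never attempted.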


What can go wrong – Integration

• Not all portions of the system are available

– Portions developed by other teams

– Portions developed by 3rd party

– Names and signatures of methods from other software are inconsistent

• Sequencing errors

– Other teams do not follow the contract with your code in terms of sequence of method calls

• Version incompatibility

– Your team assumed version A of 3rd party software but the build downloads version B
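One way to catch the version problem early is to compare the versions a team assumed with the versions the build actually resolved. This is only a sketch: the package names, version strings, and the `find_version_mismatches` helper are invented for illustration, and a real build would read both sides from its dependency manifest and lock file.

```python
from typing import Dict, Optional, Tuple


def find_version_mismatches(
    assumed: Dict[str, str], resolved: Dict[str, str]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return {package: (assumed_version, resolved_version)} for every disagreement."""
    mismatches: Dict[str, Tuple[str, Optional[str]]] = {}
    for package, wanted in assumed.items():
        got = resolved.get(package)  # None means the build did not provide it
        if got != wanted:
            mismatches[package] = (wanted, got)
    return mismatches


# Hypothetical data: the team assumed version A, the build downloaded version B.
assumed_by_team = {"payments-sdk": "1.4.2", "json-lib": "2.0.0"}
downloaded_by_build = {"payments-sdk": "1.5.0", "json-lib": "2.0.0"}

problems = find_version_mismatches(assumed_by_team, downloaded_by_build)
```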


What can go wrong – integration 2

• Data problems

– Database data is not refreshed for each test

– Data does not flow correctly to your code

• Configuration problems

– Configuration parameter settings for code developed by different teams are incompatible

– Configuration parameters are not specified

– External services are not reachable for security or configuration reasons

• Etc


What can go wrong – staging

• Configuration problems

– External services not available because of lack of permissions

– Inconsistent configuration settings

– Leaking into production environment

• Data problems

– Database is not representative of production database

– Stale database after tests

• Etc


What can go wrong – production

• Configuration problems

– Requires authentication and authorization

– Keys must be kept securely

– Inconsistent configurations

• Performance problems

– Under actual load, system may not have adequate performance

• Logical problems

– May require new version to be rolled back

– Database may have been corrupted

• Etc


Time is passing

• Every error must either be corrected or prevented.

• Preventing errors can be done through some combination of

– Process

– Architecture

– Tooling

– Coordination among teams.

• Coordination takes time.

• Correcting errors takes time


How much time?

• Historically, releases were scheduled once a quarter or once a year to give time to coordinate and adequately test.

• This means there may be a delay of months before a new concept or feature is added to a system.

• This delay has become more and more unacceptable.

• Weekly or daily releases are becoming the norm.


Goal of DevOps

• The goal of DevOps is to reduce the time to market without compromising quality by

– Reducing the number of errors that occur during the workflow of placing your code into production

– Reducing the time for correcting errors that occur

– Minimizing the necessity for coordination among teams


What is DevOps?

• DevOps is a set of practices intended to reduce the time between committing a change to a system and the change being placed into normal production, while ensuring high quality.*

• DevOps practices involve developers’ and operators’ processes, architectures, and tools.

• DevOps is also a movement – like agile.

*DevOps: A Software Architect’s Perspective

TEAR DOWN THAT WALL!!

5 Categories of DevOps Practices

1. Treat operators as first class citizens

2. Make Dev more responsible for incident handling

3. Enforce deployment practices uniformly across both dev and ops

4. Use continuous deployment

5. Develop infrastructure code using same processes as application code


Overview

• DevOps: What and why

• Architectural impact of different categories of DevOps practices


Treat Operators as First Class Citizens


Operators will add requirements

• Type and characteristics of error messages

• Type and characteristics of logs

• Expose performance information


Incident handling


Goal of incident handling

• An incident is something out of the ordinary.

– Failure

– Performance problem

– Abnormal activity

– Erroneous output of a system

• Goal is

– Get system back on track as soon as possible (mitigate)

– Understand root cause to prevent repetition.


Normal Incident handling process

• Incident is reported to operations by

– Developer (client of one of the software elements)

– Customer

– Internal user of software

– Monitoring software

• Operator may be paged if high priority

• Operations personnel have analysis tools that help them determine probable cause and diagnosis tests

• If a problem is related to developers’ code, then it is escalated to the developer

• Process is managed by a ticketing system.


DevOps incident handling

• Incident is reported to operations or developers by

– Developer (client of one of the software elements) – reported to development team

– Customer – reported to operations

– Internal user of software – reported to operations

– Monitoring software – reported to developer or operations depending on type of alert

• Developers wear pagers to ensure fast response.
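The routing rules above can be sketched as a small lookup. The reporter categories come from the bullets. The split of monitoring alerts into "application" and "infrastructure" types is an assumption made for illustration, since the slide only says the destination depends on the type of alert.

```python
from typing import Optional

# Reporter -> team, as listed in the bullets above.
ROUTES = {
    "developer": "development",
    "customer": "operations",
    "internal_user": "operations",
}

# Assumed split of monitoring alerts; the slide does not name the alert types.
ALERT_ROUTES = {
    "application": "development",
    "infrastructure": "operations",
}


def route_incident(reporter: str, alert_type: Optional[str] = None) -> str:
    """Return the team that first receives an incident report."""
    if reporter == "monitoring":
        if alert_type not in ALERT_ROUTES:
            raise ValueError(f"unknown alert type: {alert_type!r}")
        return ALERT_ROUTES[alert_type]
    if reporter not in ROUTES:
        raise ValueError(f"unknown reporter: {reporter!r}")
    return ROUTES[reporter]
```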


Architectural Implications

• Make application level data available to a monitoring system

• Collect performance information and make it available to a monitoring system

• Have test or diagnostic mode in code so that reliability engineer can run tests specific to the incident

• Ensure error messages and logs contain context information.
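As a rough illustration of the performance and context bullets, the sketch below emits one JSON object per log record, carrying a request identifier and the elapsed time. It uses Python's standard `logging` module. The `context` field, the `catalog` service name, and `handle_request` are hypothetical names chosen for this example.

```python
import json
import logging
import time


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object that a monitoring system can parse."""

    def format(self, record):
        entry = {
            "level": record.levelname,
            "message": record.getMessage(),
            # 'context' is a hypothetical field attached through the `extra` argument.
            "context": getattr(record, "context", {}),
        }
        return json.dumps(entry, sort_keys=True)


def handle_request(logger, request_id, work):
    """Run `work`, logging failures and elapsed time together with request context."""
    context = {"request_id": request_id, "service": "catalog"}
    start = time.perf_counter()
    try:
        return work()
    except Exception as exc:
        logger.error("request failed: %s", exc, extra={"context": context})
        raise
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        timing = dict(context, elapsed_ms=round(elapsed_ms, 3))
        logger.info("request finished", extra={"context": timing})


logger = logging.getLogger("catalog")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)
handle_request(logger, "req-1", lambda: "ok")
```

Because every record carries the same request identifier, an operator can group the error and timing entries for one incident.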


Developing infrastructure code


Goal

• Reduce error rate in infrastructure code.

• A large number of errors are created during operational activities

– Upgrade

– Reconfiguration

– Race conditions


Techniques

• Use software engineering principles in the development of infrastructure code

– Modularization

– Test driven development

– Version control/configuration management

• Difficult to test infrastructure code

– Hard to create environment that mimics real environment

– Many errors are caused by the cloud and not by code.

• No impact from this set of practices on application architecture


Continuous Deployment


Goal

• Allow developers to deploy to production without the necessity for coordination


Technique

• Base your system on the “microservice architecture” style.

• Organization of material

– What is a microservice architecture?

– How does it cut down on coordination?

– What are its properties?


Definition

• A microservice architecture is

– A collection of independently deployable processes

– Packaged as services

– Communicating only via messages

• It is a stripped down version of Service Oriented Architecture (SOA)


~2002 Amazon instituted the following design rules - 1

• All teams will henceforth expose their data and functionality through service interfaces.

• Teams must communicate with each other through these interfaces.

• There will be no other form of inter-process communication allowed: no direct linking, no direct reads of another team’s data store, no shared-memory model, no back-doors whatsoever. The only communication allowed is via service interface calls over the network.


Amazon design rules - 2

• It doesn’t matter what technology they [services] use.

• All service interfaces, without exception, must be designed from the ground up to be externalizable.

• Amazon is providing the specifications for the “Microservice Architecture”.


In Addition

• Amazon has a “two pizza” rule.

• No team should be larger than can be fed with two pizzas (~7 members).

• Each (micro) service is the responsibility of one team

• This means that microservices are small and intra team bandwidth is high

• Large systems are made up of many microservices.

• There may be as many as 140 in a typical Amazon page.


Services can have multiple instances

• The elasticity of the cloud will adjust the number of instances of each service to reflect the workload.

• Requests are routed through a load balancer for each service

• This leads to

– Lots of load balancers

– Overhead for each request.
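A per-service load balancer can be sketched as a round-robin selector over the current instances. This is one simple policy assumed for illustration, not necessarily what any particular cloud uses. The instance names and the `RoundRobinBalancer` class are hypothetical.

```python
import itertools
from typing import Iterable, List


class RoundRobinBalancer:
    """Hand out the instances of one service in turn."""

    def __init__(self, instances: Iterable[str]) -> None:
        self.set_instances(instances)

    def set_instances(self, instances: Iterable[str]) -> None:
        """Replace the instance list, e.g. after elasticity adds or removes instances."""
        self._instances: List[str] = list(instances)
        if not self._instances:
            raise ValueError("a service needs at least one instance")
        self._cycle = itertools.cycle(self._instances)

    def route(self) -> str:
        """Return the instance that should receive the next request."""
        return next(self._cycle)


balancer = RoundRobinBalancer(["catalog-1", "catalog-2"])
first_four = [balancer.route() for _ in range(4)]

# Workload grows, so a third instance is added.
balancer.set_instances(["catalog-1", "catalog-2", "catalog-3"])
after_scaling = [balancer.route() for _ in range(3)]
```

Every request passes through `route` before reaching an instance, which is the per-request overhead the slide mentions.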


Microservice architecture

• Each user request is satisfied by some sequence of services.

• Most services are not externally available.

• Each service communicates with other services through service interfaces.

• Service depth may be

– Shallow (large fan out)

– Deep (small fan out, more dependent services)

How does microservice architecture reduce requirements for coordination?

• Coordination decisions can be made

– incrementally as system evolves or

– be built into the architecture.

• Microservice architecture builds most coordination decisions into architecture

• Consequently they only need to be made once for a system, not once per release.


Seven Decision Categories

• Architectures can be categorized by means of seven categories

1. Allocation of functionality

2. Coordination model

3. Data model

4. Management of resources

5. Mapping among architectural elements

6. Binding time decisions

7. Technology choices


Design decisions made or delegated by choice of microservice architecture

• Microservice architecture either specifies or delegates to the development team five out of the seven categories of design decisions.

1. Allocation of responsibilities

2. Coordination model

3. Data model

4. Management of resources

5. Mapping among architectural elements

6. Binding time decisions

7. Choice of technology

Roadmap for next several slides

• The microservice architectural style will either specify or allow delegation of five different categories of design decisions.

• Each decision category will be discussed separately.


Decision 1 – allocation of responsibilities

• This decision is not delegated to the team or specified.

• Development teams must coordinate to divide responsibilities for features that are to be added.

• Typically this happens at the beginning of each iteration cycle.


Decision 2 - coordination model

• Elements of service interaction

– Services communicate asynchronously through message passing

– Each service could (in principle) be deployed anywhere on the net.

• Latency requirements will probably force particular deployment location choices.

• Services must discover location of dependent services.

– State must be managed
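Discovery of dependent services is often handled by a registry that maps a service name to the addresses of its instances. The sketch below keeps that mapping in memory to show the idea. `ServiceRegistry` and the addresses are hypothetical, and a production registry would be a separate, replicated service.

```python
from typing import Dict, List


class ServiceRegistry:
    """Map a service name to the addresses of its running instances."""

    def __init__(self) -> None:
        self._addresses: Dict[str, List[str]] = {}

    def register(self, service: str, address: str) -> None:
        """Called by an instance when it starts."""
        addresses = self._addresses.setdefault(service, [])
        if address not in addresses:
            addresses.append(address)

    def deregister(self, service: str, address: str) -> None:
        """Called when an instance is taken down."""
        addresses = self._addresses.get(service, [])
        if address in addresses:
            addresses.remove(address)

    def lookup(self, service: str) -> List[str]:
        """Return the known addresses, or fail loudly if there are none."""
        addresses = self._addresses.get(service, [])
        if not addresses:
            raise LookupError(f"no known instance of {service!r}")
        return list(addresses)


registry = ServiceRegistry()
registry.register("pricing", "10.0.0.5:8080")
registry.register("pricing", "10.0.0.6:8080")
pricing_addresses = registry.lookup("pricing")
```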


State management

• Services can be stateless or stateful

– Stateless services

• Allow arbitrary creation of new instances for performance and availability

• Allow messages to be routed to any instance

• State must be provided to stateless services

– Stateful services

• Require clients to communicate with same instance

• Reduces overhead necessary to acquire state


Where to keep the state?

• Persistent state is kept in a database

– Modern database management systems (relational) provide replication functionality

– Some NoSQL systems may be replicated. Others will require manual replication.

• Transient small amounts of state can be kept consistent across instances by using tools such as Memcached or Zookeeper. This is a mechanism for making a stateful service stateless.

• Instances may cache state for performance reasons. It may be necessary to purge the cache before bringing down an instance.
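The idea of making a stateful service stateless can be sketched by moving the state out of the instance into a shared store. In the example below `SharedStore` is an in-memory stand-in for a tool such as Memcached, not a real client for it, and `CartService` is a hypothetical service used only for illustration.

```python
from typing import Any, Dict, List


class SharedStore:
    """In-memory stand-in for an external store shared by all instances."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def get(self, key: str, default: Any = None) -> Any:
        return self._data.get(key, default)

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value


class CartService:
    """An instance keeps no session state of its own; it reads and writes the store."""

    def __init__(self, name: str, store: SharedStore) -> None:
        self.name = name
        self.store = store

    def add_item(self, session_id: str, item: str) -> List[str]:
        cart = list(self.store.get(session_id, []))
        cart.append(item)
        self.store.set(session_id, cart)
        return cart


store = SharedStore()
instance_a = CartService("cart-a", store)
instance_b = CartService("cart-b", store)

instance_a.add_item("session-1", "book")
cart_seen_by_b = instance_b.add_item("session-1", "pen")
```

Because the cart lives in the store, the second request can be routed to a different instance and still see the first item.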


Decision 3 – Data model

• Schema based database system (relational). Requires coordination.

– Development teams must coordinate when schema is defined or modified.

– Schema definition happens once when the architecture is defined. Schema modification should be a rare occurrence. Schema extensions (new fields or tables) do not cause problems.

• NoSQL systems. Will still require coordination over semantics of data.

– Data written by one service is typically read by others, so they must agree on semantics.
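The claim that schema extensions do not cause problems can be illustrated with SQLite from Python's standard library. The `orders` table and its columns are invented for this sketch. It holds as long as existing readers name the columns they use; a reader relying on `SELECT *` or on column positions could still be affected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL NOT NULL)")
conn.execute("INSERT INTO orders (id, total) VALUES (1, 19.5)")


def read_totals(connection):
    """Existing reader: it names its columns, so added columns do not affect it."""
    return connection.execute("SELECT id, total FROM orders ORDER BY id").fetchall()


before = read_totals(conn)

# Another team extends the schema with a new field.
conn.execute("ALTER TABLE orders ADD COLUMN gift_note TEXT")
conn.execute(
    "INSERT INTO orders (id, total, gift_note) VALUES (2, 5.0, 'happy birthday')"
)

after = read_totals(conn)  # the unchanged reader still works
```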


Decision 4 – Resource Management

• Each instance of a service can process a certain workload.

– Could be expressed in terms of requests

– Could be expressed in terms of resource requirements, e.g. CPU

• Each client instance will require resources from the service to process its requests.

• Service Level Agreements (SLAs) are a means for automating the resource assumptions of the clients and the resource requirements of the service.
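When workload is expressed in requests, the arithmetic behind resource management is a simple ceiling division: total client demand divided by what one instance can process. The sketch below assumes demand and capacity are both given in requests per second. The client names and numbers are made up.

```python
import math
from typing import Dict


def instances_needed(client_demand_rps: Dict[str, float], capacity_rps: float) -> int:
    """Smallest number of instances whose combined capacity covers total demand."""
    if capacity_rps <= 0:
        raise ValueError("capacity per instance must be positive")
    total_demand = sum(client_demand_rps.values())
    # Keep at least one instance running even when demand is zero.
    return max(1, math.ceil(total_demand / capacity_rps))


# Hypothetical clients of one service, in requests per second.
demand = {"web-frontend": 120.0, "mobile-api": 80.0, "batch-report": 50.0}
needed = instances_needed(demand, capacity_rps=100.0)  # 250 / 100, rounded up
```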


Decision 5 – Mapping among architectural elements

• Decisions about packaging modules into processes and processes into a service are delegated to the service development team.

• Decisions about deployment of a service will be discussed later.


Decision 6 – Binding time

• Configuration information binding time is decided during the development of architecture and the deployment pipeline.

• Other binding time decisions are delegated to the service development team.


Decision 7 – Technology choices

• All technology choices are delegated to the service development team.


Quality Analysis of Microservice Architecture

• Deployability

• Availability

• Reusability

• Security

• Modifiability

• Performance


Deployability

• The microservice architecture style is designed to make it easy to deploy by reducing the requirement for coordination.

• There may be dependencies among the services or their versions.


Availability

• If an instance fails, another instance will be created through elasticity.

– Stateless instances need no additional mechanisms

– Stateful instances can keep copy of state in Memcached or Zookeeper. Need to ensure that failure of a single instance does not delete state maintained in Memcached or Zookeeper

• Clients must have rapid timeout and reissue requests that fail.
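A client-side sketch of rapid timeout and reissue is shown below. The `call` argument is a hypothetical function that performs one request with the given timeout and raises `TimeoutError` when no reply arrives in time. The default timeout and attempt count are arbitrary. Reissuing is only safe when the request can be repeated without side effects.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    call: Callable[[float], T], timeout_s: float = 0.2, max_attempts: int = 3
) -> T:
    """Issue the request, reissuing it after a timeout, up to max_attempts times."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return call(timeout_s)
        except TimeoutError:
            if attempt == max_attempts:
                raise  # give up and let the caller see the failure
    raise AssertionError("unreachable")
```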


Reusability

• Small grained reuse

– Teams are independent and do not coordinate or share code. Small grained reuse does not happen using a microservice architecture.

• Large grained reuse

– Large grained reuse is embodied in the architecture and is treated as a service.


Security

• Security tokens (a la Kerberos) can be passed from client to service.

• Tokens contain information about access privileges.
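To make the token idea concrete, the sketch below signs a small payload of access privileges with HMAC so that a service can check it without contacting the issuer. This is a simplified stand-in and not the Kerberos protocol. The secret, the token layout, and the function names are all assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical shared secret; a real system would load keys from secure storage.
SECRET = b"demo-shared-secret"


def issue_token(user, privileges):
    """Return '<base64 payload>.<hex signature>' describing the user's privileges."""
    payload = json.dumps(
        {"user": user, "privileges": sorted(privileges)}, sort_keys=True
    ).encode()
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + signature


def check_access(token, needed):
    """True only if the signature is valid and the token grants `needed`."""
    encoded, _, signature = token.rpartition(".")
    try:
        payload = base64.urlsafe_b64decode(encoded.encode())
    except ValueError:
        return False  # malformed base64
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return False  # token was altered or signed with another key
    return needed in json.loads(payload)["privileges"]


token = issue_token("alice", ["orders:read"])
can_read = check_access(token, "orders:read")
can_write = check_access(token, "orders:write")
```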


Modifiability

• Microservice architecture is modular, and its coordination mechanisms keep a change to one service from causing side effects in another

• Special provisions affect evolution of services.

• Managing all of the services and understanding what each service does is complicated because of the proliferation of services.


Performance

• The main performance issue is message traffic.

• Each user request may involve many messages.

• Monitoring of services will also add to message traffic.

• Microservice architecture is not designed for high transaction volume because of the amount of message traffic.


Enforce Deployment Process


What problem is being attacked?

• When application code is deployed, it goes through several verification steps with gates at each step

• This is not necessarily so with code deployed by operators.

• Security patches, for example, may be deployed directly.

• Operators may SSH into a VM to perform some action.

• The goal of enforcing the deployment process on operators is to reduce errors caused by incorrect operator actions on VMs.


Traceability

• For every portion of an executing system, it should be possible to know

– What version of what components are included in the executing code

– What version of the configuration parameters was used in invoking the system

– What version of what script was used to create the system

– What version of what script was used to create the environment in which the system is executing

• Without this knowledge it becomes difficult to determine root causes of errors


Architectural implications

• As a portion of initialization, a system should

– verify that it was deployed using a deployment tool

– Record in a log file its pedigree and the pedigree of the configuration parameters, creation script, and environment.
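The two initialization steps above might look like the following sketch. It assumes the deployment tool passes the pedigree to the service through environment-style variables. The variable names and the `DEPLOYED_BY_TOOL` marker are hypothetical conventions invented for this example.

```python
import json
import logging
from typing import Dict, Mapping

# Hypothetical variable names that the deployment tool is assumed to set.
PEDIGREE_KEYS = (
    "APP_VERSION",
    "CONFIG_VERSION",
    "CREATION_SCRIPT_VERSION",
    "ENVIRONMENT_SCRIPT_VERSION",
)
DEPLOY_MARKER = "DEPLOYED_BY_TOOL"


def record_pedigree(env: Mapping[str, str], logger: logging.Logger) -> Dict[str, str]:
    """Refuse to start outside the deployment tool, then log the pedigree."""
    if env.get(DEPLOY_MARKER) != "true":
        raise RuntimeError("not deployed through the deployment tool")
    missing = [key for key in PEDIGREE_KEYS if key not in env]
    if missing:
        raise RuntimeError("missing pedigree values: " + ", ".join(missing))
    pedigree = {key: env[key] for key in PEDIGREE_KEYS}
    logger.info("pedigree %s", json.dumps(pedigree, sort_keys=True))
    return pedigree
```

A service would call this once during startup, for example with `os.environ` as `env`, so that every log file begins with the versions needed to trace an error back to its cause.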


Summary

• DevOps is a movement driven by the need to reduce time to market

• It involves a variety of different practices each of which has its own architectural implications

• Continuous deployment can be done using a microservice architecture but the movement to a microservice architecture has multiple dimensions


More Information

Contact [email protected]

DevOps: A Software Architect’s Perspective is available from your favorite bookseller

