Architecting for the Cloud: Map Reduce and Creating an Architecture


DESCRIPTION

Day 5 of Architecting for the Cloud. Topics are Map Reduce and Creating an architecture

TRANSCRIPT

Page 1: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Architecting for the Cloud

Len and Matt Bass

Map Reduce

Page 2: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Recall …

Data should be modeled to support primary use

Orders A - F

Orders G - M

Orders N - Z

Page 3: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Queries Across Nodes

• Sometimes you’ll need information from more than one node

– For example: “what was the biggest selling item in 2011”?

• You need a mechanism for efficiently aggregating data across nodes

– Recall the issues with relational databases

• The issue is that activities across physical nodes can be expensive (if they are dependent)

Page 4: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Example

• If the result is dependent on information across nodes this is expensive

• Imagine looking for the biggest selling product of 2011 for example

[Diagram: Product Information, Order Information (on three separate nodes), and Customer Information, each stored on a different node]

Page 5: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Parallelizing the Work

• If it’s possible to split the work into independent processes, it’s much more efficient

• In the case below it wouldn’t take any longer to count an arbitrarily large number of nodes than it would to count one

[Diagram: Purchase Orders on three nodes are counted independently; Results + Results + Results = Total]

Page 6: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

What is Map Reduce?

• Map Reduce is an infrastructure for parallelizing the processing of large amounts of data (terabytes).

• It assumes that it is being run on a cluster of hundreds or thousands of computers

• It manages the division of data and recovers from the failure of any individual computer in the cluster.

• A Map Reduce application computes a “natural join”

Page 7: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Serial vs. Parallel Programming

• In the old days, programs were designed to execute instructions sequentially

• This limited the amount of data that could be processed

• In parallel programming the idea is that you break the data set down into units that can be processed in parallel

– What does this imply?

Page 8: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Data Units

Units of data can be independently processed

1

2

3

4

Page 9: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Implementation Technique

• A common implementation technique is to use a master/worker pattern

• The Master
– Initializes an array and splits it according to the number of workers
– Sends each Worker its sub-array
– Gets the results from each Worker

• The Worker
– Receives the sub-array from the Master
– Performs processing on the sub-array
– Returns results to the Master
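
A minimal sketch of the master/worker split described above, using a local process pool as a stand-in for worker nodes (the summing task and all names are illustrative assumptions, not part of the slides):

from multiprocessing import Pool

def worker(sub_array):
    # Worker: receives its sub-array and performs the processing (here, a sum)
    return sum(sub_array)

def master(data, num_workers=3):
    # Master: splits the array according to the number of workers
    chunk = (len(data) + num_workers - 1) // num_workers
    sub_arrays = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with Pool(num_workers) as pool:
        results = pool.map(worker, sub_arrays)   # send sub-arrays, collect results
    return sum(results)                          # combine the Workers' results

if __name__ == "__main__":
    print(master(list(range(100))))              # 4950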

Page 10: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Example Application

map(String key, String value):
  // key: document name
  // value: document contents
  for each word w in value:
    EmitIntermediate(w, "1");

reduce(String key, Iterator values):
  // key: a word
  // values: a list of counts
  int result = 0;
  for each v in values:
    result += ParseInt(v);
  Emit(AsString(result));

The assumption is that the input file is on the order of Gigabytes. Executes on a cluster of hundreds or thousands of computers. Scheduling, failure recovery, and synchronization are all managed by the map reduce infrastructure.

Page 11: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

General Map Reduce Statement

Map instance:

• Input consists of a collection of <key1, value1> pairs.

• Output consists of a collection of <key2, value2> pairs

Reduce instance:

• Input consists of <key2, list(value2)>

• Output consists of a list(value2)

Infrastructure sorts the output of the map functions based on key2 and provides each reduce function with all of the outputs of the map instances with the same key2
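
A minimal single-process Python sketch of this flow for the word-count example above; the grouping step stands in for the infrastructure's sort of the map output by key2 (in a real framework, map, shuffle, and reduce run across many machines):

from collections import defaultdict

def map_fn(doc_name, contents):
    # <key1, value1> -> list of intermediate <key2, value2> pairs
    return [(word, 1) for word in contents.split()]

def reduce_fn(word, counts):
    # <key2, list(value2)> -> reduced value
    return sum(counts)

def run_mapreduce(documents):
    intermediate = []
    for name, text in documents.items():          # map phase
        intermediate.extend(map_fn(name, text))
    groups = defaultdict(list)
    for key2, value2 in intermediate:             # "shuffle": group by key2
        groups[key2].append(value2)
    return {k: reduce_fn(k, vs) for k, vs in groups.items()}   # reduce phase

print(run_mapreduce({"doc1": "to be or not to be"}))   # {'to': 2, 'be': 2, 'or': 1, 'not': 1}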

Page 12: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Distributed Grep

Distributed Grep: Find the occurrences of a particular string in a data set

Map: output a line if it contains the supplied pattern. It does not output anything if there is no match

Reduce: copy its input to the output
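
In the style of the word-count sketch above, the two functions for distributed grep might look like this (the pattern is an illustrative assumption; the reduce is the identity):

import re

PATTERN = re.compile("error")                     # the supplied pattern (illustrative)

def map_fn(file_name, contents):
    # emit matching lines only; emit nothing when there is no match
    return [(file_name, line) for line in contents.splitlines() if PATTERN.search(line)]

def reduce_fn(file_name, lines):
    return lines                                  # copy the input to the output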

Page 13: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Count URL Access Frequency

Count of URL Access Frequency: Count the number of times a URL occurs in a log

Map: the map function processes logs of web page requests and outputs (URL, 1)

Reduce: add together all values for each URL and output the total count.

(this is the same as the word counter from before)

Page 14: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Reverse Web-Link Graph

For a list of <source URL, target URL>, output the list of source URLs that contain a link to each target

Map: the input is a pair <source, target>, the output is <target, source>

Reduce: concatenate the list of source URLs associated with a particular target URL. Emit (target, list(source))
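
A sketch of the same pair of functions for the reverse web-link graph (again reusing the toy driver from the word-count sketch; names are illustrative):

def map_fn(source_url, target_url):
    # invert the edge: <source, target> becomes <target, source>
    return [(target_url, source_url)]

def reduce_fn(target_url, source_urls):
    # concatenate the sources that link to this target
    return (target_url, list(source_urls))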

Page 15: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Term-Vector per Host

Output a list that contains the most important words that occur in a document as a list of (word, frequency) pairs per document.

Map: input <URL, document>, output <URL, term vector>

Reduce: merge the term vectors for each URL and output final <URL, term vector>

Page 16: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Application areas for Map-Reduce*

Ads & E-commerce

Astronomy

Social Networks

Bioinformatics/Medical Informatics

Machine Translation

Spatial Data Processing

Information Extraction and Text Processing

Artificial Intelligence/Machine Learning/Data Mining

*http://atbrox.com/2011/05/16/mapreduce-hadoop-algorithms-in-academic-papers-4th-update-may-2011/

Page 17: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

How Does This Work?

• A Master will assign jobs to a Slave node

– These jobs consist of two processes: Map and Reduce

• The Slave node typically contains the data to be processed (when possible)

– The cost of transferring the data is too high

Page 18: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Job Execution

• The Slave node will execute the Map Job producing intermediate output

• The Map job will transfer this intermediate result to the Reduce process

• This is a synchronization phase

– The mapper nodes transfer the intermediate results to the reducers

– They then schedule the reduce activity

Page 19: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Reduce Activity

• The reduce phase sorts the intermediate results

• This is called the shuffle phase
– This can sometimes be a labor-intensive activity

• It then merges the results
– Producing the final results

Page 20: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Issues with Map Reduce

• Map Reduce can be very fast and scalable

• There are issues, however

• The performance can be adversely impacted by

– Stragglers that occur during the map phase

– Labor intensive shuffle phase

Page 21: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Straggler Problem

• The Reduce job won’t execute until all of the mapper jobs are complete

• This means that you can have one slow mapper that can slow down the entire job

• This is known as the straggler problem

• There are many reasons that a straggler can occur

Page 22: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Synchronization Issues

• There are a number of reasons for stragglers

– Heterogeneity amongst nodes executing mapping functions

– Network issues

– Node failures

– Data distribution issues

Page 23: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Data Distribution Issues

• It’s possible for the data to be distributed unevenly across nodes

• This doesn’t have to mean that the volume of data differs

• It could also mean that the density of data differs
– With respect to the Map function

• This would cause the Map function to require increased execution times on the densely populated node

Page 24: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Node Heterogeneity

• Differences in the capability of the nodes executing the map function can cause stragglers

• It could be that the nodes are different in terms of CPU or memory capacity

• It could also be due to the loading of the nodes

– Given that we are in a multitenant environment it’s possible that others are consuming significant resources

– Other jobs could be running at the same time

Page 25: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Network Issues

• Significant network load can slow down the job as well

• This again can be due to overall network traffic

• It will frequently occur if the data and job are not collocated

• If it’s not possible to collocate on the same node, collocation at least on the same rack is wise

Page 26: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Node Failure

• Node failure can also slow down the overall map reduce job

• Map Reduce does have fault tolerant mechanisms built in to deal with this

• We’ll look at these in a minute

Page 27: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Shuffle Phase

• In some cases the shuffle phase can cause delay due to

– Network bandwidth consumption

– I/O overhead

• Some shuffle activities are iterative (e.g. PageRank) and the I/O costs can be higher than the computational costs

Page 28: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Architecture of Map Reduce

• Let’s look at the architecture of a common Map Reduce framework

– Hadoop

• There are several entities in this architecture

– Client

– Job Tracker

– Task Tracker

– Task

Page 29: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Entities in Map Reduce

• Client: the client application that requests the map reduce job

• Job Tracker: schedules jobs, monitors execution of tasks, works to complete job

• Task Tracker: a node that accepts tasks (map, reduce, shuffle) from the job tracker. Monitors the execution of the task

Page 30: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

View of Map-Reduce

Page 31: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Client Job Tracker

Client bundles information necessary to execute the Map-Reduce Job

– Map code

– Reduce code

– Input files

– Output files

– Other information such as splitting function, hash function.

Client also reserves a number of computers in the cluster for this job. The reservations do not preclude the sharing of these computers.

– One computer is the Job Tracker

– The others are task trackers.

Client submits job to Job Tracker

Page 32: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Job Tracker Task Tracker (map phase)

Job Tracker divides input file into fixed size segments – typically 16-64MB

Job Tracker instantiates a Task Tracker instance on the allocated computers.

Each instance has

• Segment of the input to process

• Code to implement the Map function

• Text Formatter to turn input into records with key1 and value1

• R which is the number of reduce instances

• Partitioning function – e.g. hash

• Code to Implement the Reduce function

Page 33: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Task Tracker (map phase)

Instantiates map function in a separate JVM (to enable tracing of activity)

Processes one logical record at a time as defined by the Text Formatter

Opens one output file on its local computer partitioned into R portions.

Writes output from processing into partition [hash(key2) modulo R]. The individual records are buffered in memory until a significantly large block has been collected.

Reports completion back to Job Tracker
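
A small sketch of the partitioning and buffering step described above (hash of key2 modulo R chooses one of the R local partitions; R and all names are illustrative):

R = 4                                        # number of reduce instances

def partition(key2):
    # pick which of the R local output partitions this record belongs to
    return hash(key2) % R

buffers = [[] for _ in range(R)]             # records buffered in memory per partition

def write_intermediate(key2, value2):
    buffers[partition(key2)].append((key2, value2))
    # once a buffer grows large enough it would be flushed to the partitioned
    # output file on the local disk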

Page 34: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Picture so far

Page 35: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Job Tracker (reduce phase)

Wait until all Map instances complete (I will talk about failure and optimizations later).

Invoke the Reduce functions passing them their particular partitions. I.e. Reduce function 3 gets all of the partition 3s from the various mapping functions.

Because all of the Map instances have completed, there is a complete data set for the reduce instances to process.

Page 36: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Task Tracker (reduce phase)

A task tracker instance is provided a set of partitions.

The task tracker sorts its input data. This may involve an external sort, a pre-processing pass over the input to combine entries, or both.

All of the entries with the same key2 are provided to the reduce function at once. This plus the fact that the Job Tracker waited for all map functions to complete allows the reduce function to be sure that all of the data with that key2 value are being processed at the same time by that single reduce instance.

The reduce function writes its output to an output file.

When it is complete, it informs the Job Tracker.

Page 37: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Picture w/ Reduce Function

Page 38: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Completing

If there are R reduce functions, then R output files are produced. These files

• Can be returned as R files to the client

• Can be passed to another reduce function

• Can be combined into a single file by Job Tracker (name provided by client as a portion of invocation)

Job Tracker waits until all of the reduce functions have completed and then informs client of completion. It also informs Task Trackers to clean up their files.

Page 39: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Reliability

• There are 3 basic failure scenarios

– Task tracker failure

– Job tracker failure

– Client failure

• We’ll look at these in turn

Page 40: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Task Tracker Failure

Job tracker keeps track of state for each map and reduce task. The state may be idle, in-progress, or completed.

For each in-progress task, the Job Tracker pings the computer on which it is executing periodically.

If the computer fails, all map tasks on that worker are set back to idle. Furthermore, all in-progress reduce tasks are set back to idle

• In-progress map and reduce tasks must be restarted for obvious reasons

• Completed map tasks must be restarted because their intermediate output is on the computer on which the map task was executing.

Any output created by a failed reduce task is discarded.
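
A sketch of that bookkeeping (the reset rule applied when a worker stops answering pings; the task representation is an illustrative assumption):

MAP, REDUCE = "map", "reduce"
IDLE, IN_PROGRESS, COMPLETED = "idle", "in-progress", "completed"

def handle_worker_failure(tasks, failed_worker):
    # tasks: list of dicts such as {"kind": MAP, "state": COMPLETED, "worker": "w7"}
    for task in tasks:
        if task["worker"] != failed_worker:
            continue
        if task["kind"] == MAP:
            # in-progress AND completed map tasks go back to idle: their
            # intermediate output lived on the failed computer's local disk
            task["state"] = IDLE
        elif task["state"] == IN_PROGRESS:
            # only in-progress reduce tasks restart; completed reduce output
            # is already in the final output files
            task["state"] = IDLE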

Page 41: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Job Tracker Failure

Recall one Job Tracker instance per job (no central Job Tracker).

Since execution time for the job is relatively small compared to the mean time to failure for the host (even a commodity host), nothing special is done for Job Tracker failure.

Client must check on Job Tracker. If Job Tracker fails, client restarts another Job Tracker.

Existing Task Trackers must clean up their files. They know the Job Tracker has failed when they do not get communications from the Job Tracker.

Page 42: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Client Failure

If the client fails, the Job Tracker and Task Trackers continue to execute.

The only connection between the Job Tracker and the client is in the output file.

If output file is on client machine, the Job Tracker will detect that through failed writes and will terminate itself.

If output file is not on client machine, then Job Tracker will create output file. It is the responsibility of an application higher in the stack to clean up the output file.

Page 43: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Optimizations

• Several optimizations exist for the issues discussed

– Restart slow task trackers

– Asynchronous map and reduce phases

– Placement of task trackers

– Various scheduling algorithms

Page 44: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Task Tracker Restarts

• If the system detects slow task trackers it can restart them

– Hadoop is set up to restart task trackers that are 1.5 times slower than the average

• This works in some cases

• But doesn’t help if the data density or capacity of the node is the issue

– Hadoop assumes homogeneity amongst nodes

Page 45: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Asynchronous Phases

• Typically the reduce phase waits until the map phase is complete

• An alternative is to begin execution of the reduce phase once intermediate results are available

• This can be done in two ways

– Hierarchical reduction

– Incremental reduction

Page 46: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Scheduling Options

• By default Hadoop implements a FIFO scheduling algorithm

Page 47: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Fair Scheduling

• Fair scheduling, on the other hand, allocates resources so that each job gets a fair share over time (developed at Facebook)

Page 48: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Capacity Scheduling

• Developed by Yahoo!

• Jobs are separated into queues

• Each queue is guaranteed some percentage of the total capacity

• If there are additional resources available they will be divided equally across the queues
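
A toy sketch of that allocation rule (guaranteed percentages per queue, with spare capacity divided equally across the queues; the queue names and numbers are illustrative):

def allocate(total_slots, guarantees):
    # guarantees: fraction of total capacity promised to each queue
    allocation = {q: int(total_slots * frac) for q, frac in guarantees.items()}
    spare = total_slots - sum(allocation.values())
    share, remainder = divmod(spare, len(allocation))
    for i, q in enumerate(allocation):
        allocation[q] += share + (1 if i < remainder else 0)
    return allocation

# 60% and 30% guaranteed; the unclaimed 10% is split equally
print(allocate(100, {"production": 0.6, "research": 0.3}))   # {'production': 65, 'research': 35}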

Page 49: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Summary

• Relational databases are difficult to distribute efficiently

– Scalability can be problematic

• NoSQL databases offer an alternative

– Data is typically schema-less

• Aggregates of data that mirror primary use cases are considered a unit of data

• Queries across nodes require an efficient mechanism for aggregation

Page 50: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Questions??

Page 51: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Architecting for the Cloud

Creating an architecture

Page 52: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Outline

• What is different about architecting for the cloud?

• Team Coordination Requirements

– Service Oriented Architecture

– Micro Service Oriented Architecture

Page 53: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

General Design Guidance

• The general design approach is the same as for non-cloud-based systems, although there are special considerations

• The decisions you make are not going to impact functionality

• They are going to impact the systemic properties supported or inhibited by your system

• You thus want to use these properties as the evaluation criteria for your decisions

• This means they need to be well articulated

• We are going to focus on special considerations caused by the cloud

Page 54: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Special considerations for the cloud

• Scalability

• Distribution

• Failure likelihood

• Data (in)consistency

• Team coordination requirements (discussed in its own section)

Page 55: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Scalability

• Making a system scalable is a matter of managing state.

• Components that are stateless are easier to instantiate

• When designing a system to be scalable

– Identify different types of state
• Client
• Session
• Persistent

– Persistent state should be managed in a database and that should be in a separate tier

– When identifying components in your design, consider how they will scale as demand grows.

– Make the ones that need to scale stateless

– This may involve storing state in a database or in a Memcached-type system
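
A minimal sketch of the stateless-component idea: the request handler keeps no state of its own, so any instance can serve any request; a plain dict stands in here for a Memcached-type store or a database tier (all names are illustrative):

session_store = {}                      # stand-in for Memcached / the database tier

def handle_request(session_id, item):
    # fetch state from the store, update it, write it back: the handler
    # itself holds nothing between requests, so it scales by adding instances
    cart = session_store.get(session_id, [])
    cart.append(item)
    session_store[session_id] = cart
    return cart

print(handle_request("s1", "book"))     # ['book']
print(handle_request("s1", "lamp"))     # ['book', 'lamp'] - any instance could serve this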

Page 56: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Migrating legacy system

• Identify state within existing components

• For those components that will scale when demand grows, factor state management out

• Make state management separate components and decide whether state is to be

– Persistent – store state in the database

– Exist only for the run time of the system – use a Memcached-type system

Page 57: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Distribution

• Assume each component is deployed on a different virtual machine

• Determine

– Communication needs between components

• This affects performance

• Two components with high communication needs should be deployed “close together” in the network.

– Coordination needs among components
• This affects performance and availability

• Use Zookeeper or other coordination system to manage coordination.

Page 58: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Failure

• Assume any component can fail at any time

• Two perspectives

– Component that fails

– Clients of component that fails

Page 59: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Failing component

• When a new instance of a failed component is instantiated it must be prepared to begin receiving requests

– If the component is stateless, then nothing special needs to be done

– If the component is stateful, then it must regain the state of the failed component
• Logs

• Memcached

• Coordination with other components

Page 60: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Client of failed component

• It must recognize that a component has failed

• Could be done through
– Time out
– Error return from failed component (failure may be due to a dependent component, not the immediately invoked one)

• Client then
– May inform other components of the failed component
– Must find alternative method of service

• If failed component is replicated and stateless then a resent request will be routed by the load balancer to another instance

• Client may have fallback set of actions if request cannot be satisfied.
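
A hedged sketch of that client-side behavior: time out, resend (the load balancer would route the retry to another instance of a replicated stateless service), then fall back. call_service and the fallback value are illustrative assumptions:

import time

def call_with_fallback(call_service, request, retries=2, timeout=1.0, fallback=None):
    # call_service(request, timeout=...) is assumed to raise on error or time out
    for attempt in range(retries + 1):
        try:
            return call_service(request, timeout=timeout)
        except Exception:
            # brief back-off, then resend; a replicated stateless service lets
            # the load balancer route the retry to a healthy instance
            time.sleep(0.1 * (attempt + 1))
    return fallback    # fallback set of actions if the request cannot be satisfied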

Page 61: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Consistency and Data Model

• Which data items need to be consistent?

• Which data items can be eventually consistent?

• What data model is most appropriate?
– Use expected operations to evaluate the data model

– Think about the performance and scalability requirements when doing so

– Do the scalability needs imply there will need to be a partitioning of data?

– Does the model allow for a partitioning that will meet the desired properties?

Page 62: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Outline

• What is different about architecting for the cloud?

• Team Coordination Requirements

– Service Oriented Architecture

• What problem does it solve?

• What is it?

• How does it solve the problem?

– Micro Service Oriented Architecture

Page 63: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Recall Release Plan

1. Define and agree release and deployment plans with customers/stakeholders.

2. Ensure that each release package consists of a set of related assets and service components that are compatible with each other.

3. Ensure that integrity of a release package and its constituent components is maintained throughout the transition activities and recorded accurately in the configuration management system.

4. Ensure that all release and deployment packages can be tracked, installed, tested, verified, and/or uninstalled or backed out, if appropriate.

5. Ensure that change is managed during the release and deployment activities.

6. Record and manage deviations, risks, issues related to the new or changed service, and take necessary corrective action.

7. Ensure that there is knowledge transfer to enable the customers and users to optimise their use of the service to support their business activities.

8. Ensure that skills and knowledge are transferred to operations and support staff to enable them to effectively and efficiently deliver, support and maintain the service, according to required warranties and service levels

*http://en.wikipedia.org/wiki/Deployment_Plan

63

Page 64: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Why are we discussing SOA ?

• To make sure that everyone is on the same page

• SOA is still widely used

• SOA introduces some concepts used in Micro SOA.

Page 65: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Example

• Let’s look at an online retailer

– Something like Amazon that sells a variety of products available from a variety of suppliers

• Requirements for overall system are:

– Take orders: currently customers can call, fax orders, or order online

– Process orders: check inventory, ship goods, invoice customers

– Check status: check order status

– CRUD account information: customers have accounts

– Ad campaigns: subscribe/unsubscribe

Page 66: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Interactions with suppliers

• Amazon must check with their suppliers to

– Ensure it is in stock

– Notify the supplier to ship the item

– Determine the status of the order in case customer checks

– Deal with billing and pay supplier.

• This is the kind of problem that service orientation was designed to solve

Page 67: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

SOA context

• Customer is inside or outside of the cloud

• Service is inside of the cloud

• Customer and service are managed by different organizations

• Accessed through normal internet http(s)

• Internal structure of the service can be anything.

• Release planning coordination is not addressed

[Diagram: a Customer outside the cloud accesses a Service running on servers inside the cloud]

Page 68: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

SOA focus

• The focus of the SOA discussion is

– How do customers find the service

– How do customers interact with the service

• The discussion revolves around

– Discovery

– SOAP vs REST (standards vs flexibility)

Page 69: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Discovery

• Known URL
– Applicable when customer has a business arrangement with the service provider,
– e.g. the Amazon example

• UDDI (Universal Description Discovery and Integration)
– Registry where businesses can register the services they provide

– Applicable when customer is looking for any provider, e.g. travel services, weather services

Page 70: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Simple Object Access Protocol

• SOAP is an XML based message protocol

• A SOAP message consists of:
– Envelope with
• Header
• Body with
– Message data
– Fault (optional)

• Can be used with multiple transport protocols (typically HTTP(S))

• Intended to be self-defining – the header contains the format of the body.

Page 71: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

SOAP Messages

[Diagram: an HTTP request whose HTTP body is XML syntax – a SOAP Envelope containing a SOAP Body containing a SOAP Body Block (e.g. the textual integer 0x0b66)]

Page 72: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Issues

• Significant overhead

– XML processing takes time

– Messages are heavyweight

• Semantic dependencies continue to exist

• Runtime infrastructure required

– Technologies introduce potential for incompatibilities

Page 73: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

REST

• REpresentational State Transfer

• In the REST world you have clients and servers

• The state of the client is changed as the result of a resource request

– Think about what happens to your browser when you request a web page

• REST is not a standard but a set of principles

Page 74: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

REST + XML

• REST uses typical HTTP requests

– GET, PUT, POST, DELETE

• Typically no XML request sent

• The result could be an XML document

– This could be for example an HTML page

– But it could also be an XML file that is not HTML

Page 75: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

REST + JSON

• JavaScript Object Notation is a data exchange format based on JavaScript

• REST + JSON is the same as REST + XML except the data is transferred using JSON

• As JSON is a subset of JavaScript it is able to be parsed directly by the browser

– Used in AJAX
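
A minimal sketch of a REST + JSON interaction using only the Python standard library; the URL and the response shape are illustrative assumptions:

import json
import urllib.request

# GET a resource; the server returns its representation as a JSON document
with urllib.request.urlopen("https://api.example.com/orders/42") as resp:
    order = json.load(resp)             # e.g. {"id": 42, "status": "shipped"}

print(order["status"])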

Page 76: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

REST vs SOAP - SOAP

• SOAP optimizes on flexibility without much concern about scalability, performance, and so forth

• SOAP has a collection of standards to specify properties of interaction
– WS-Addressing
– WS-Discovery
– WS-Reliable Messaging
– WS-Transaction
– WS-Federation
– WS-Policy
– WS-Security
– WS-Trust
– WS-Routing
– WS-Referral
– WS-Inspection

• You can see why it is considered heavyweight and high overhead

Page 77: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

REST vs SOAP - REST

• REST is designed for higher performance than SOAP but is not in and of itself a standard

• A REST interface has HTTP requests but no additional semantics

– Semantics must be defined externally to use

– Interoperability can thus be a problem

– REST does not require a specific runtime environment

Page 78: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Outline

• What is different about architecting for the cloud?

• Team Coordination Requirements

– Service Oriented Architecture

– Micro Service Oriented Architecture

• What problem does it solve?

• What is it?

• How does it solve the problem?

Page 79: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Time Line to Production

Development → Integration and testing → Deployment

Goal is to reduce release planning coordination required in these phases

Page 80: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Architecting to shorten release planning

• Micro SOA is designed to shorten the release phase.

• It does this by allowing development teams to operate without inter team coordination.

• Secondary assumptions are

– High workload

– Failure recovery

Page 81: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Amazon design rules - 1

• All teams will henceforth expose their data and functionality through service interfaces.

• Teams must communicate with each other through these interfaces.

• There will be no other form of inter-process communication allowed: no direct linking, no direct reads of another team’s data store, no shared-memory model, no back-doors whatsoever. The only communication allowed is via service interface calls over the network.

81

Page 82: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Amazon design rules - 2

• It doesn’t matter what technology they[services] use.

• All service interfaces, without exception, must be designed from the ground up to be externalizable.

• Amazon is optimizing for its workload with these requirements
– Mainly searching and browsing and web page delivery

– Some transactions but not the dominant portion of the workload

82

Page 83: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Micro SOA context

• Customer is inside or outside of the cloud

• Service is inside of the cloud

• Micro SOA describes the internal structure of the service.

[Diagram: a Customer accesses a Service on servers, as in the SOA context; Micro SOA describes the internal structure of the service]

Page 84: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Micro service oriented architecture

84

• Each user request is satisfied by some sequence of services.

• Most services are not externally available.

• Each service communicates with other services through service interfaces.

• Service depth may be 70, e.g. LinkedIn

Page 85: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Relation of teams and services

• Each service is the responsibility of a single development team

• Individual developers can deploy a new version without coordination with other developers.

• It is possible that a single development team is responsible for multiple services

• Team size
• Coordination among team members must be high bandwidth and low overhead.
• Typically this is done with small teams – as in agile.

85

Page 86: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Design decisions

• Seven categories of design decisions*
1. Allocation of responsibilities
2. Coordination model
3. Data model
4. Management of resources
5. Mapping among architectural elements
6. Binding time decisions
7. Choice of technology

*Software Architecture in Practice 3rd edition, Chap 4

86

Page 87: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Design decisions made or delegated by choice of Micro SOA

• Micro service oriented architecture either specifies or delegates to the development team five out of the seven categories of design decisions.
1. Allocation of responsibilities
2. Coordination model
3. Data model
4. Management of resources
5. Mapping among architectural elements
6. Binding time decisions
7. Choice of technology

87

Page 88: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Roadmap for next several slides

• Micro service oriented architectural style will either specify or allow delegation of five different categories of design decisions.

• Each decision category will be discussed separately.

88

Page 89: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Decision 1 – allocation of responsibilities

• This decision is neither specified by the style nor delegated to a single team.

• Development teams must coordinate to divide responsibilities for features that are to be added.

• Typically this happens at the beginning of each iteration cycle.

89

Page 90: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Decision 2 - coordination model

• Elements of service interaction

– Services communicate asynchronously through message passing

– Each service could (in principle) be deployed anywhere on the net.

• Latency requirements will probably force particular deployment location choices.

• Services must discover location of dependent services.

– State must be managed

90

Page 91: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Service discovery

91

• When an instance of a service is launched, it registers with a registry/load balancer

• When a client wishes to utilize a service, it gets the location of an instance from the registry/load balancer.

• Eureka is an open source registry/load balancer

[Diagram: an instance of a service registers with the registry/load balancer; a client queries the registry and then invokes the service instance]

Page 92: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Subtleties of registry/load balancer

• When multiple instances of the same service have registered, the load balancer can rotate through them to equalize number of requests to each instance.

• Each instance must renew its registration periodically (~90 seconds) so that load balancer does not schedule message to failed instance.

• Registry can keep other information as well as address of instance. For example, version number of service instance.

92
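
A toy in-memory sketch of such a registry/load balancer: registration doubles as lease renewal, lookups skip instances whose lease (about 90 seconds in the slides) has lapsed, and selection rotates to equalize requests. All names and numbers are illustrative:

import time

LEASE_SECONDS = 90                          # renewal window (illustrative)

class Registry:
    def __init__(self):
        self.leases = {}                    # service -> {address: lease expiry}
        self.next_index = {}                # service -> round-robin position

    def register(self, service, address):
        # also used for periodic renewal, which simply refreshes the lease
        self.leases.setdefault(service, {})[address] = time.time() + LEASE_SECONDS

    def lookup(self, service):
        now = time.time()
        live = sorted(a for a, exp in self.leases.get(service, {}).items() if exp > now)
        if not live:
            raise LookupError("no live instance of " + service)
        i = self.next_index.get(service, 0) % len(live)   # rotate through instances
        self.next_index[service] = i + 1
        return live[i]

registry = Registry()
registry.register("orders", "10.0.0.5:8080")
registry.register("orders", "10.0.0.6:8080")
print(registry.lookup("orders"), registry.lookup("orders"))   # alternates between the two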

Page 93: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

State management

• Services can be stateless or stateful

– Stateless services

• Allow arbitrary creation of new instances for performance and availability

• Allow messages to be routed to any instance

• State must be provided to stateless services

– Stateful services

• Require clients to communicate with same instance

• Reduces overhead necessary to acquire state

93

Page 94: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Where to keep the state?

• Persistent state is kept in a database
– Modern database management systems (relational) provide replication functionality
– Some NoSQL systems may be replicated. Others will require manual replication.

• Transient small amounts of state can be kept consistent across instances by using tools such as Memcached or Zookeeper.

• Instances may cache state for performance reasons. It may be necessary to purge the cache before bringing down an instance.

94

Page 95: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Decision 3 – Data model

• Schema based database system (relational). Requires coordination.

– Development teams must coordinate when schema is defined or modified.

– Schema definition happens once when the architecture is defined. Schema modification should be rare occurrence. Schema extensions (new fields or tables) do not cause problems.

• NoSQL systems. Will still require coordination over semantics of data.

– Data written by one service is typically read by others, so they must agree on semantics.

95

Page 96: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Decision 4 – Resource Management

• Each instance of a service can process a certain workload.
– Could be expressed in terms of requests
– Could be expressed in terms of resource requirements, e.g. CPU

• Each client instance will require resources from the service to process its requests.

• Service Level Agreements (SLAs) are a means for automating the resource assumptions of the clients and the resource requirements of the service.

96

Page 97: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Managing SLAs

• A requirement for each service is to provide an SLA for its response time in terms of the workload asked of it.

– E.g. For a workload of Y requests per second, I will provide a response within X seconds.

• A requirement for each client is to provide an estimate of the requests it will make of each dependent service.

– E.g. for each request I receive, I will make Z requests for your service per second.

• This combination will enable a run time determination of the number of instances required for each service to meet its SLA.

97
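
A back-of-the-envelope sketch of how the two declarations combine into an instance count at run time (every number and name here is an illustrative assumption):

import math

def required_instances(client_rates, calls_per_request, capacity_per_instance):
    # client_rates: requests/second arriving at each client of this service
    # calls_per_request: Z, calls made to this service per client request
    # capacity_per_instance: Y, requests/second one instance serves within its SLA
    total_load = sum(rate * calls_per_request for rate in client_rates)
    return math.ceil(total_load / capacity_per_instance)

# Two clients at 100 and 50 req/s, each making 3 calls here per request,
# against an SLA of 120 req/s per instance: 450 / 120 -> 4 instances.
print(required_instances([100, 50], 3, 120))   # 4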

Page 98: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Provisioning new instances

• When the desired workload of a service is greater than can be provided by the existing number of instances of that service, new instances can be instantiated (at runtime).

• Four possibilities for initiating a new instance of a service:

1. Client. Client determines whether the service is adequately provisioned for its needs based on the service SLA and the service's current workload.

2. Service. Service determines whether it is adequately provisioned based on the number of requests it expects from clients.

3. Registry/load balancer determines the appropriate number of instances of a service based on the SLA and client instance requests.

4. External entity can initiate creation of new instances

98

Page 99: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Responsibilities of development teams.

• SLA determination of a service is done by the service development team prior to deployment augmented by run time discovery.

• Determination of a client's requirements for a service is done by the client’s development team.

• Choice of which component has responsibility for instantiating/deinstantiating instances of a service is done as a portion of the architecture definition.

99

Page 100: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Decision 5 – Mapping among architectural elements

• Decisions about packaging modules into processes and processes into a service are delegated to the service development team.

• Decisions about deployment of a service will be discussed later.

100

Page 101: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Decision 6 – Binding time

• Configuration information binding time is decided during the development of architecture and the deployment pipeline.

• Other binding time decisions are delegated to the service development team.

101

Page 102: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Decision 7 – Technology choices

• All technology choices are delegated to the service development team.

102

Page 103: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Questions about Micro SOA

• /Q/ Isn’t it possible that different teams will implement the same functionality, likely differently?

• /A/ Yes, but so what? Major duplications are avoided through assignment of responsibilities to services. Minor duplications are the price to be paid to avoid necessity for synchronous coordination.

• /Q/ What about transactions?

• /A/ Micro SOA privileges flexibility above reliability and performance. Transactions are recoverable through logging of service interactions. This may introduce some delays if failures occur.

103

Page 104: Architecting for the cloud   map reduce creating

© Matthew Bass 2013

Summary

• Special considerations when architecting for the cloud are
– Scalability
– Distribution
– Failure likelihood
– Data (in)consistency
– Team coordination requirements

• SOA provides a means to access services from outside of the cloud

• Micro SOA provides a structure that minimizes need for team coordination within a single externally visible service