On Demand and Autonomic Computing

IBM Research, © 2003 IBM Corporation. On Demand and Autonomic Computing. Steve R. White, Senior Manager, Autonomic Computing, Thomas J. Watson Research Laboratory.

Page 1: On Demand and Autonomic Computing

IBM Research

© 2003 IBM Corporation

On Demand and Autonomic Computing

Steve R. White

Senior Manager, Autonomic Computing
Thomas J. Watson Research Laboratory

Page 2: On Demand and Autonomic Computing


IBM Research

© 2003 IBM Corporation | On Demand and Autonomic Computing | August 1, 2003

Outline

Background and motivation

Research in autonomic components and systems

Autonomic computing architecture

Research in structured autonomic systems

Page 3: On Demand and Autonomic Computing


On Demand Era

Responsive in real-time

Variable cost structures

Focused on what’s core and differentiating

Resilient around the world, around the clock

Integrated, Open, Virtualized, Autonomic

Page 4: On Demand and Autonomic Computing


Complex heterogeneous infrastructures are a reality!

[Diagram: Directory and Security Services; Existing Applications and Data; Business Data; Data Server; Web Application Server; Storage Area Network; BPs and External Services; Web Server; DNS Server; Data]

Dozens of systems and applications

Hundreds of components

Thousands of tuning parameters

Page 5: On Demand and Autonomic Computing


Motivation

Administration of individual systems is increasingly difficult
  – 100s of configuration and tuning parameters for databases, Web application servers, storage, …

Heterogeneous systems are becoming increasingly connected
  – Integration becoming ever more difficult

Architects can't intricately plan interactions among components
  – Increasingly dynamic; more frequently with unanticipated components

More of the burden must be assumed at run time
  – But human system administrators can't assume the burden
  – 6:1 cost ratio between storage administration and storage
  – 40% of outages due to operator error

We need self-managing computing systems
  – Behavior specified by system administrators via high-level policies
  – System and its components figure out how to carry out policies

Page 6: On Demand and Autonomic Computing


Autonomic Self-Management

Increase Responsiveness
  – Adapt to dynamically changing environments

Business Resiliency
  – Discover, diagnose, act to prevent disruptions

Operational Efficiency
  – Tune resources, balance workloads to best use IT resources

Secure Information & Resources
  – Anticipate, detect, identify, deter attacks

Page 7: On Demand and Autonomic Computing


Evolving to Autonomic Computing

Five levels, from Manual to Autonomic:

Level 1: Basic
  – Characteristics: Multiple sources of system generated data
  – Skills: Extensive, highly skilled IT staff
  – Benefits: Basic requirements met

Level 2: Managed
  – Characteristics: Data & actions consolidated through management tools
  – Skills: IT staff analyzes & takes actions
  – Benefits: Greater system awareness; improved productivity

Level 3: Predictive
  – Characteristics: System monitors, correlates & recommends actions
  – Skills: IT staff approves & initiates actions
  – Benefits: Less need for deep skills; faster/better decision making

Level 4: Adaptive
  – Characteristics: System monitors, correlates & takes action
  – Skills: IT staff manages performance against SLAs
  – Benefits: Human/system interaction; IT agility & resiliency

Level 5: Autonomic
  – Characteristics: Components dynamically respond to business policies
  – Skills: IT staff focuses on enabling business needs
  – Benefits: Business policy drives IT management; business agility and resiliency

Page 8: On Demand and Autonomic Computing


Human Interaction with Autonomic Systems
P. Maglio, Almaden

Basic questions
  – What do middleware administrators do?
  – How can we better support the problems and practices they have?

Learn answers to these questions via ethnographic studies

Use insights to design new ways to interact with complex computing systems

Callouts on the slide:
  – "… but we thought that was the return port!"
  – "We had it wrong. Our assumption of how it worked was incorrect."
  – "We start with looking at the proxy server log files, then the web server log files, then the application server admin log files, then the application log files."

Page 9: On Demand and Autonomic Computing


Dynamic Surge Protection
J. Hellerstein, Watson

Systems can go from steady state …

Few minutes later …

… to overloaded without warning

[Diagram: Internet]

Page 10: On Demand and Autonomic Computing


Surge Protection Demo: Monitor & remove servers

[Plots: Response Time; # Active Servers vs. # Requested Servers; Actual BOPS vs. Predicted BOPS]
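The plot labels on this slide (actual vs. predicted BOPS, active vs. requested servers) suggest a loop that forecasts load and sizes the server pool ahead of demand. The following is a minimal sketch of that idea only. The slides do not say which predictor or sizing rule the demo uses, so the linear extrapolation, the fixed per-server capacity, and the load trace are all assumptions made for illustration.

```python
import math

def predict_bops(history, horizon=1):
    """Forecast load by linear extrapolation of the last two samples.

    Assumption: the slides do not name the predictor used in the demo.
    """
    if len(history) < 2:
        return float(history[-1])
    slope = history[-1] - history[-2]
    return max(0.0, history[-1] + slope * horizon)

def requested_servers(predicted_bops, capacity_per_server, min_servers=1):
    """Number of servers needed to carry the predicted load."""
    return max(min_servers, math.ceil(predicted_bops / capacity_per_server))

# Hypothetical load trace (BOPS, presumably business operations per second)
# containing a short surge.
trace = [100, 110, 105, 180, 320, 300, 150]
CAPACITY = 50  # hypothetical load one server can sustain

# Re-plan after every new sample, using only the samples seen so far.
plan = []
for t in range(1, len(trace) + 1):
    forecast = predict_bops(trace[:t])
    plan.append(requested_servers(forecast, CAPACITY))
print(plan)
```

Because the forecast follows the slope of the load, the requested pool grows while the surge is still building and shrinks again once load falls, which is the "monitor & remove servers" behavior named in the slide title.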

Page 11: On Demand and Autonomic Computing


Enterprise Workload Management
D. Dillenberger, Watson

[Diagram: Internet; Appliance Servers; Web Application Servers; Data and Transaction Servers; Internet/Extranet; Business Partners]

Large, distributed, heterogeneous system

Achieves end-to-end performance via adaptive algorithms

Administrator defines policy
  – Desired response times for various classes of users, apps

eWLM managers on each resource cooperate to adaptively tune parameters
  – OS, network, storage, virtual server knobs
  – JVM heap size, # garbage collection threads
  – Workload balancing, routing parameters
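As a rough illustration of goal-driven tuning, the sketch below pairs an administrator-defined response-time goal per service class with one adaptive step that moves routing weight toward the class missing its goal. The ratio of measured to desired response time, the `rebalance` rule, and the class names are assumptions for this sketch, not a description of eWLM's actual algorithms.

```python
from dataclasses import dataclass

@dataclass
class ServiceClass:
    name: str
    goal_ms: float      # desired response time from the administrator's policy
    measured_ms: float  # latest end-to-end measurement

    @property
    def performance_index(self) -> float:
        # Above 1 the class is missing its goal; below 1 it has slack.
        return self.measured_ms / self.goal_ms

def rebalance(weights, classes, step=0.05):
    """Shift a small share of routing weight from the class with the most
    slack to the class missing its goal by the widest margin."""
    worst = max(classes, key=lambda c: c.performance_index)
    best = min(classes, key=lambda c: c.performance_index)
    if worst.performance_index <= 1.0 or worst is best:
        return dict(weights)  # all goals met, leave the knobs alone
    moved = min(step, weights[best.name])
    new = dict(weights)
    new[best.name] -= moved
    new[worst.name] += moved
    return new

# Hypothetical classes: "gold" misses its goal, "bronze" has slack.
classes = [
    ServiceClass("gold", goal_ms=200, measured_ms=300),
    ServiceClass("bronze", goal_ms=2000, measured_ms=800),
]
weights = rebalance({"gold": 0.5, "bronze": 0.5}, classes)
print(weights)
```

Small steps repeated on each measurement interval keep the adjustment gradual; the same pattern could drive any of the knobs listed above, such as heap size or thread counts.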

Page 12: On Demand and Autonomic Computing


Policies and Autonomic Computing
D. Verma and D. Kandlur, Watson

Policy: Set of guidelines or directives provided to an autonomic element to influence its behavior.

Key Challenge:
  – Move away from low level controls
  – Move towards high level directives (policies) over autonomic decisions

Developing scenarios, standards and technologies to support policies for autonomic computing

[Diagram: autonomic element with Monitor (M), Analyze (A), Plan (P), Execute (E), Knowledge (K), sensors (S) and effectors (E), annotated as follows]

1. External policies are delivered through effectors.

2. Policies are stored as knowledge.

3. Analyze
  – Analyzes system operation w.r.t. policies
  – Creates reports as dictated by policy

4. Plan
  – Assigns tasks based on policies
  – Assigns resources based on policies
  – Enables sensors
  – Adds/modifies/deletes policies

5. Enabled/disabled based on policies

6. Enabled/disabled based on external policies
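To make the annotations concrete, here is a minimal sketch of policies held as knowledge and consulted by the analyze and plan steps. The `Policy` shape (a condition over monitored metrics plus an action name), the metric names, and the two example policies are hypothetical; the slides do not define a policy format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Policy:
    name: str
    condition: Callable[[Dict[str, float]], bool]  # evaluated on monitored metrics
    action: str                                    # task handed to the plan step

class Knowledge:
    """Annotation 2: policies are stored as knowledge."""
    def __init__(self) -> None:
        self.policies: List[Policy] = []

    def add(self, policy: Policy) -> None:
        self.policies.append(policy)

def analyze(knowledge: Knowledge, metrics: Dict[str, float]) -> List[str]:
    """Annotation 3: check system operation against the stored policies."""
    return [p.name for p in knowledge.policies if p.condition(metrics)]

def plan(knowledge: Knowledge, triggered: List[str]) -> List[str]:
    """Annotation 4: assign a task for every policy that fired."""
    by_name = {p.name: p for p in knowledge.policies}
    return [by_name[name].action for name in triggered]

kb = Knowledge()
kb.add(Policy("disk-nearly-full", lambda m: m["disk_used"] > 0.9, "add_storage"))
kb.add(Policy("slow-response", lambda m: m["response_ms"] > 500, "add_server"))

tasks = plan(kb, analyze(kb, {"disk_used": 0.95, "response_ms": 120}))
print(tasks)
```

Keeping the policies as data rather than code is what lets an administrator add, modify, or delete them at run time without touching the manager itself.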

Page 13: On Demand and Autonomic Computing


Utility Functions and Autonomic Computing
W. Walsh, Watson

Utility functions can guide autonomic decision making
  – Self-optimization: natural way to express optimization criteria
  – Declarative: preferable to implicitly hard-coded in special purpose algorithms

Derivable from business objectives (e.g. optimize total profits)
  – Can translate to computing metrics at different levels

Exploring applications in eWLM, eUtility, SLEDS

[Plot: utility function V(RT) versus response time RT]
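A small sketch of how a utility function V(RT) can drive a resource decision: each class of work declares the value of meeting its response-time goal, and the allocator picks the server split with the highest total utility. The sigmoid shape of V, the performance model, and all numbers below are assumptions; the slide only shows V as a decreasing function of response time.

```python
import math

def utility(rt_ms, goal_ms, value):
    """Smooth step: close to `value` when RT is well under the goal, near 0 above it.

    Assumption: the sigmoid shape is illustrative, not taken from the slides.
    """
    return value / (1.0 + math.exp((rt_ms - goal_ms) / (0.1 * goal_ms)))

def response_time_ms(servers, load):
    """Hypothetical performance model: RT grows with load per server."""
    return 100.0 * load / servers

def best_allocation(total_servers, classes):
    """Try every split of servers between two classes and keep the split
    with the highest total utility. Returns (total_utility, a, b)."""
    (load_a, goal_a, val_a), (load_b, goal_b, val_b) = classes
    best = None
    for a in range(1, total_servers):
        b = total_servers - a
        total = (utility(response_time_ms(a, load_a), goal_a, val_a)
                 + utility(response_time_ms(b, load_b), goal_b, val_b))
        if best is None or total > best[0]:
            best = (total, a, b)
    return best

# Each class is (load, response-time goal in ms, value of meeting the goal).
classes = [(8, 200, 10.0), (4, 400, 1.0)]
total, servers_a, servers_b = best_allocation(10, classes)
print(servers_a, servers_b)
```

The allocator never mentions either class by name; changing the business objective only means changing the declared values, which is the declarative property the slide contrasts with special purpose algorithms.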

Page 14: On Demand and Autonomic Computing


Autonomic Computing Architecture: The Autonomic Element

AE is the fundamental abstraction
  – Defines an important boundary

An AE contains
  – Exactly one autonomic manager
  – Zero or more managed element(s)
      – Could be basic resource like database, storage system, server, software app
      – Higher level elements may have no managed element; they manage other autonomic elements via messages

AE is responsible for
  – Providing/consuming computational services
  – Interacting with other autonomic elements
  – Managing own behavior in accordance with policies

[Diagram: An Autonomic Element. An Autonomic Manager (Monitor, Analyze, Plan, Execute, with shared Knowledge) is connected to a Managed Element through sensors (S) and effectors (E).]

E.g. Database, storage, server, software app, workload mgr, sentinel, arbiter, OGSA infrastructure elements
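The structure in the diagram can be sketched as one autonomic manager running a monitor, analyze, plan, execute cycle over a managed element through its sensor and effector. The managed element here (a work queue with a worker pool) and the sizing rule are hypothetical and only serve to make the loop runnable.

```python
class ManagedElement:
    """Hypothetical resource exposing a sensor and an effector."""
    def __init__(self):
        self.queue_length = 12
        self.workers = 2

    def sense(self):            # sensor (S)
        return {"queue_length": self.queue_length, "workers": self.workers}

    def effect(self, workers):  # effector (E)
        self.workers = workers

class AutonomicManager:
    """Exactly one manager per autonomic element, built around shared knowledge."""
    def __init__(self, element, max_queue_per_worker):
        self.element = element
        self.knowledge = {"max_queue_per_worker": max_queue_per_worker,
                          "history": []}

    def monitor(self):
        reading = self.element.sense()
        self.knowledge["history"].append(reading)
        return reading

    def analyze(self, reading):
        # Is the element operating outside what the policy allows?
        limit = self.knowledge["max_queue_per_worker"] * reading["workers"]
        return reading["queue_length"] > limit

    def plan(self, reading):
        per_worker = self.knowledge["max_queue_per_worker"]
        return -(-reading["queue_length"] // per_worker)  # ceiling division

    def execute(self, workers):
        self.element.effect(workers)

    def run_once(self):
        reading = self.monitor()
        if self.analyze(reading):
            self.execute(self.plan(reading))

element = ManagedElement()
manager = AutonomicManager(element, max_queue_per_worker=4)
manager.run_once()
print(element.workers)
```

A higher level element with no managed element of its own would keep the same loop but sense and effect by exchanging messages with other autonomic elements.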

Page 15: On Demand and Autonomic Computing


Autonomic Computing Architecture: Element interactions

Based on OGSA; extensions as necessary
  – Service-oriented architecture
  – Messages defined by WSDL: portTypes, operations
  – Services defined by constellations of portTypes

AC architecture defines:
  – Required messages
  – Optional but standard messages

For advanced interactions: conversation support
  – “Choreography” defines structure of multi-step interactions
  – Runtime enforces conversational protocols for app logic
  – Underlies robust interactions
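One way to picture a runtime that enforces a choreography is a small state machine that rejects any message arriving out of order, so application logic never sees an invalid step. The request/offer/accept protocol and its message names below are hypothetical; the slides do not list the architecture's actual conversations.

```python
class ConversationError(Exception):
    """Raised when a message violates the choreography."""

class Conversation:
    """Enforces the allowed ordering of messages in a multi-step interaction."""

    # Hypothetical protocol: state -> {message: next state}
    CHOREOGRAPHY = {
        "start":     {"Request": "requested"},
        "requested": {"Offer": "offered", "Refuse": "done"},
        "offered":   {"Accept": "done", "Reject": "done"},
        "done":      {},
    }

    def __init__(self):
        self.state = "start"

    def receive(self, message):
        allowed = self.CHOREOGRAPHY[self.state]
        if message not in allowed:
            raise ConversationError(
                f"{message!r} not allowed in state {self.state!r}")
        self.state = allowed[message]
        return self.state

conv = Conversation()
for msg in ["Request", "Offer", "Accept"]:
    conv.receive(msg)
print(conv.state)
```

Because the table is data, two elements can agree on a conversation by exchanging its definition, and the runtime on each side can check every step independently.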

Page 16: On Demand and Autonomic Computing


Autonomic Manager Toolset
W. Arnold et al., Watson

Facilitates autonomic manager construction
  – In accordance with AC architecture

Catcher for generic AM technologies
  – OGSA messaging
  – Policy tools
  – Monitoring technologies
  – AI tools for knowledge representation, reasoning
  – Math libraries for modeling, analysis, planning
  – Feedback control

V1.0 now available on alphaWorks
  – Part of the Exploratory Technology Toolkit
  – www.alphaworks.ibm.com

[Diagram: An Autonomic Element (Autonomic Manager with Monitor, Analyze, Plan, Execute, Knowledge; Managed Element; sensors and effectors)]

Page 17: On Demand and Autonomic Computing


Autonomic Computing Systems: A small-scale system prototype

[Diagram: Policy Repository, Database, Storage, OGSA Registry, and User Interface. Messages shown: Register (twice).]

Page 18: On Demand and Autonomic Computing


Autonomic Computing Systems: A small-scale system prototype

[Diagram: Policy Repository, Database, Storage, OGSA Registry, and User Interface. Messages shown: FindServiceData(PolicyRepository); FetchPolicy, Subscribe(Policy); ReportPolicy.]

Page 19: On Demand and Autonomic Computing


Autonomic Computing Systems: A small-scale system prototype

[Diagram: Policy Repository, Database, Storage, OGSA Registry, and User Interface. Messages shown: Service Class Definition; Alert Policy; ReportPolicy; SetPolicy; Publish(Policy).]

Page 20: On Demand and Autonomic Computing


Autonomic Computing Systems: A small-scale system prototype

[Diagram: Policy Repository, Database, Storage, OGSA Registry, and User Interface. Messages shown: CreateTableSpace; AddResource(LV, Parms); Alert Policies, Svc Class Defs; FindServiceData(Storage); QueryResponse(List(Storage)); DeliverResource(LV Name).]
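The four prototype slides can be read as one scenario: elements register, the database finds the policy repository and subscribes, a policy is set and published, and the database obtains a logical volume from storage when a table space is created. The sketch below follows that reading. Which element sends which message is an assumption, since the transcript preserves only the message labels, and every class and method here is a hypothetical in-process stand-in rather than an OGSA API.

```python
class Registry:
    """Stand-in for the OGSA registry: maps a service type to registered elements."""
    def __init__(self):
        self._services = {}

    def register(self, kind, element):             # Register
        self._services.setdefault(kind, []).append(element)

    def find_service_data(self, kind):             # FindServiceData / QueryResponse
        return list(self._services.get(kind, []))

class PolicyRepository:
    def __init__(self):
        self._policies = {}
        self._subscribers = []

    def subscribe(self, callback):                 # Subscribe(Policy)
        self._subscribers.append(callback)

    def fetch_policy(self):                        # FetchPolicy
        return dict(self._policies)

    def set_policy(self, name, value):             # SetPolicy
        self._policies[name] = value
        for notify in self._subscribers:           # Publish(Policy)
            notify(name, value)

class Storage:
    def __init__(self):
        self.volumes = []

    def add_resource(self, name, size_gb):         # AddResource(LV, Parms)
        self.volumes.append((name, size_gb))
        return name                                # DeliverResource(LV Name)

class Database:
    def __init__(self, registry):
        self.registry = registry
        self.policies = {}
        self.table_spaces = []

    def on_policy(self, name, value):
        self.policies[name] = value

    def create_table_space(self, name, size_gb):   # CreateTableSpace
        storage = self.registry.find_service_data("Storage")[0]
        volume = storage.add_resource(f"lv_{name}", size_gb)
        self.table_spaces.append((name, volume))
        return volume

registry = Registry()
repo, storage = PolicyRepository(), Storage()
registry.register("PolicyRepository", repo)
registry.register("Storage", storage)
db = Database(registry)
registry.register("Database", db)

# The database discovers the repository through the registry, then subscribes.
registry.find_service_data("PolicyRepository")[0].subscribe(db.on_policy)
repo.set_policy("alert", {"free_space_below_pct": 10})  # hypothetical alert policy
volume = db.create_table_space("orders", 20)
print(volume)
```

No element holds a hard-wired reference to another; each finds its partners through the registry, which is what allows the larger systems on the next slide to be composed flexibly.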

Page 21: On Demand and Autonomic Computing


Autonomic Computing Systems: Flexibly composed from autonomic elements

[Diagram: Large Autonomic System. Elements shown: Resource Arbiter; Workload Manager (twice); Application Environment 1 and Application Environment 2, each with an Application Manager and Network, Server, Storage, and Database elements; Predictor; Registry; Resource Managers (e.g. Storage, DB, Servers); Policy Repository; Sentinel; eUtility Manager.]

Page 22: On Demand and Autonomic Computing


Workshops

First Workshop on Algorithms and Architectures for Self-Managing Systems (at FCRC ’03)
  – June 11, 2003 in San Diego, CA

5th Annual International Conference on Active Middleware Services: Autonomic Computing Workshop
  – June 25, 2003 in Seattle, WA

IJCAI-03 AI and Autonomic Computing: Developing a Research Agenda for Self Managing Computer Systems
  – August 10, 2003 in Acapulco, Mexico

First International Workshop on Autonomic Computing Systems at 14th International Conference on Database and Expert Systems Applications (DEXA'2003)
  – 1-5 September, 2003 in Prague, Czech Republic

14th IFIP/IEEE International Workshop on Distributed Systems: Operations & Management (DSOM-03)
  – October 20-22, 2003 in Heidelberg, Germany

Page 23: On Demand and Autonomic Computing


References

The Vision of Autonomic Computing, IEEE Computer, January 2003
  – http://computer.org/computer/homepage/0103/Kephart/

IBM Systems Journal special issue on Autonomic Computing
  – http://www.research.ibm.com/journal/sj42-1.html

Page 24: On Demand and Autonomic Computing


Interesting Research Problems

Architecture
  – What is the right architecture?
  – Should we be working on architecture at all?

Policies
  – Can we really run large IT systems by specifying high-level policies?

Centralized vs. Decentralized Control
  – Will decentralized control play an important role?

Human Interaction
  – How will humans interact with large autonomic systems?
  – How can we express the behavior of a large, dynamic system to humans?

Systems With a Billion Components
  – Are they even possible?