Grids and High Performance Distributed Computing

Andrew A. Chien

March 29, 2005

CSE225, Spring 2005

CSE225 – Lecture #1

Course Information

• Course Instructor: Andrew Chien, [email protected]
• Course Meetings: TuThu 12:30-1:50pm, HSS 2152
• Course web site: http://www-csag.ucsd.edu/teaching/cse225s05/
• Handouts (see the web)
  » Course Information
  » Reading List
  » Schedule
  » Topical Report
  » Course Project

CSE225 Course Work

• Read and Discuss Assigned Papers
  » Will be on the course web site (limited release)
  » Attend lectures and contribute to the discussion
• Topical Report (~20 pages)
• Course Project
  » Plan a great course project
  » Propose it
  » Do a great course project
  » Present it to the class
  » Write it up well

Reading List

• I. Grid Computing: Vision and Realities
  » Vision, Real Grids, Applications
• II. Dynamic Applications are Resource Aware
  » Resource Description and Selection, Dynamic Monitoring, Adaptive, Rescheduling
• III. Open Resource Sharing
  » Resource Sharing Models: Asymmetric, Batch, Slice-based
  » Allocation Techniques
• IV. Configurable Networks: Lambda Grids
  » Technology Drivers, Usage Models, Management, and Signalling
• V. Advanced Applications

Topical Reports

• Four major Course Sections: Grid Vision, Dynamic Applications, Resource Sharing, Configurable Networks

• Two teams of three students each will be identified to write a summary of the area. Each student in the class will participate in one of these teams.

• Topical Report should include
  » Concise definition of the important problems in the area
  » Survey of the major approaches to the problem
  » Assessment of the state of the art in the area
  » Identification of at least four major research problems or challenges to be explored

• ~20 pages in length. Reports should not only reiterate the material in the papers, but also provide significant analysis and understanding translated into clear exposition.

• Due one week after coverage of the topic has been completed in the course lecture.

Course Projects

• Project Teams are 2-4 students.
• Course projects will be evaluated on two documents and a presentation.
• Project Proposal (due end of 3rd week of class)
  » Clear objectives (questions to be answered, research apparatus to be built, code to be understood)
  » The team (who’s on it and contact information: email, phones, etc.)
  » Clear responsibilities (who’s going to take the lead for what), which will evolve over time
  » Technical Definition of the project
    – Project focus: What are the questions?
    – Infrastructure and Strategy: How are the questions going to be addressed? (i.e. the software, systems, resources and experiments)
    – Expected Outcomes: What will we know that is new at the end of the project?
  » Related Work Section (summarize background material)
  » Detailed Project Plan

Course Projects (continued)

• Project Presentation (last week of class)
  » A 15-minute, Succinct Presentation of the Material in Project Final Report
• Project Final Report (due at end of classes)
  » Elements from the Project Proposal
  » Description of What was Accomplished
  » Experimental Results
  » Analysis and Summary of Results

Project Planning

• Week 1 and 2: Formation of Groups and Initial Project Definition
• End of Week 3: Project Proposals
• End of Week 4: Feedback on Proposals, 1 week to revise
• Week 7: Checkup on Project Progress
• Week 10: Project Presentations and Final Reports

• => Begin discussions today!

Topics I

• Resource Selection and Binding #1: Using realistic resource data and the three basic models for resource sharing, perform experiments to evaluate how well different selection and binding strategies work. How well can we do if system utilization is the primary goal? Application performance? Turnaround (completion latency)? How well do they work as the level of competition for resources increases in the system?

• Resource Selection and Binding #2: Using realistic resource data and one of the three models for resource sharing (cycle stealing, batch scheduling, and slicing), perform experiments to evaluate how well different selection and binding strategies work. Use several synthetic distributed application models (which involve a set of resources and a set of computations, data movement, etc.). How well can we do if system utilization is the primary goal? Application performance? Turnaround (completion latency)? How well do they work as the level of competition for resources increases in the system?
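
To make the metrics named in these topics concrete, here is a minimal sketch of how utilization and turnaround could be measured for two selection strategies. It is an illustration only, not a prescribed methodology. It assumes a toy model that does not come from the course material: each job is bound to exactly one dedicated resource, jobs are served in arrival order, and the names `Job`, `simulate`, `earliest_free`, and `earliest_finish` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    arrival: float  # time the job enters the system (seconds)
    work: float     # abstract work units


def simulate(jobs, speeds, pick):
    """Bind each job, in arrival order, to the resource chosen by `pick`.

    speeds[i] is the rate of resource i in work units per second.
    pick(free_at, speeds, job) returns the index of the chosen resource.
    Returns (system utilization, mean turnaround).
    """
    free_at = [0.0] * len(speeds)  # time at which each resource becomes idle
    busy = [0.0] * len(speeds)     # accumulated busy time per resource
    turnarounds = []
    for job in sorted(jobs, key=lambda j: j.arrival):
        i = pick(free_at, speeds, job)
        start = max(job.arrival, free_at[i])
        runtime = job.work / speeds[i]
        free_at[i] = start + runtime
        busy[i] += runtime
        turnarounds.append(free_at[i] - job.arrival)  # completion latency
    makespan = max(free_at)
    utilization = sum(busy) / (makespan * len(speeds))
    return utilization, sum(turnarounds) / len(turnarounds)


def earliest_free(free_at, speeds, job):
    """Strategy 1: pick the resource that becomes idle first (ignores speed)."""
    return min(range(len(speeds)), key=lambda i: free_at[i])


def earliest_finish(free_at, speeds, job):
    """Strategy 2: pick the resource on which this job would finish first."""
    return min(
        range(len(speeds)),
        key=lambda i: max(job.arrival, free_at[i]) + job.work / speeds[i],
    )


# Hypothetical workload: two jobs arriving together, two resources of unequal speed.
jobs = [Job(arrival=0.0, work=4.0), Job(arrival=0.0, work=2.0)]
speeds = [1.0, 2.0]
for strategy in (earliest_free, earliest_finish):
    print(strategy.__name__, simulate(jobs, speeds, strategy))
```

On this tiny workload the speed-aware strategy gives both higher utilization and lower mean turnaround. A real experiment would replace the synthetic jobs with realistic resource traces and add competing load, as the topics above ask.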

Topics II

• Dynamic Applications: Using the Virtual Grid infrastructure (vgES), construct an application which has a specific set of resource requirements (uses a specification to select), and then uses monitoring and adaptation to improve some aspect of its performance. Quantify the dependence of behavior on the dynamic information (e.g. Network Weather Service) and its quality. Explore ranges of resource environments and properties for which stable (and unstable) behavior is realized. Are there any general principles that can be derived from the adaptation methods or resource behaviors explored?

Topics III

• Open Resource Sharing #1: Using the statistical characterization of resources we have studied, compare the efficacy of the three models for resource sharing (cycle stealing, batch scheduling, and slicing) for three classes of applications: compute intensive, memory intensive, and data intensive. These terms are defined by the resource which limits the performance of a process in the overall computation. Explore quantitatively how the properties of the applications affect the utility of each resource sharing model.

• Open Resource Sharing #2: Using the statistical characterization of resources we have studied, compare the efficacy of the three models for resource sharing (cycle stealing, batch scheduling, and slicing) for applications with a range of coupling between processes: embarrassingly parallel, master-worker, workflow, master-worker and workflow with tightly-coupled subjobs, and tightly-coupled parallel computations. Explore quantitatively how the coupling properties of the applications affect the utility of each resource sharing model.

Topics IV

• Configurable Networks #1: Compare the three models of use (intelligent network, asynchronous file transfer, and distributed virtual computer) for an application example where we know all of the computation times and communication quantities. For example, you could use the NAS Parallel Benchmarks and Grid Parallel Benchmarks. Other example applications are also of interest. Vary the parameters of time to detect a connection, connection setup delay, the cost to set up and tear down a connection, and the cost of “having a connection” per unit time. For what range of application properties does each model make sense? For what range of costs does each model make sense?

• Configurable Networks #2: Using public estimates of the topologies of major ISPs such as Qwest, AT&T, MCI, Sprint, and others (derived from their publicly available maps), explore the following questions. Using a backdrop of the top 50 population centers in the United States, explore the number of topologies and competitive providers available to 5-city and 10-city subsets of these centers. Using the top 50 population centers in the world, explore the number of topologies and competitive providers. What is the spectrum of providers available for each subset? If one wanted to expand the subset of possible topologies, what is the impact on connection latency (the increase due to speed-of-light flight time)? Explore how these realities might affect application performance and competition in future dynamic configurable network environments.
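
The speed-of-light latency question can be estimated with a short calculation. The sketch below is a rough model under stated assumptions, none of which come from the course material: paths follow great circles between cities (real fiber routes are longer), light travels at about 200,000 km/s in fiber (roughly two thirds of c), and the city coordinates are approximate. The function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0          # mean Earth radius
FIBER_SPEED_KM_PER_S = 200_000.0  # assumption: roughly 2/3 of c in optical fiber


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def one_way_delay_ms(path_km):
    """Propagation delay only; ignores switching, queueing, and setup time."""
    return path_km / FIBER_SPEED_KM_PER_S * 1000.0


def path_delay_ms(hops):
    """hops: ordered list of (lat, lon) pairs the connection passes through."""
    total_km = sum(great_circle_km(*a, *b) for a, b in zip(hops, hops[1:]))
    return one_way_delay_ms(total_km)


# Approximate coordinates, for illustration only.
san_diego = (32.72, -117.16)
dallas = (32.78, -96.80)
chicago = (41.88, -87.63)

direct = path_delay_ms([san_diego, chicago])
detour = path_delay_ms([san_diego, dallas, chicago])
print(f"direct: {direct:.1f} ms, via Dallas: {detour:.1f} ms")
```

Comparing `direct` and `detour` for each candidate topology gives a lower bound on the latency cost of choosing a less direct provider path.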

Topics V

• Application-driven Evaluation of Grid Infrastructures: From the perspective of an important computational science application (e.g. Climate modeling, Protein Folding, Toxic Chemical diffusion, etc.), analyze the capabilities of current and future grid hardware infrastructures and technologies. Working with application experts who are well versed in the computational issues (we have several such volunteers), develop a performance model and simulation which includes a distributed application architecture, a resource description used to acquire resources, performance models for each element, and scaling characteristics. Use this simulation infrastructure to evaluate achievable performance of Grid deployments of these applications.

• Other Topics: Many other topics are possible, and should be discussed with Professor Chien as early as possible.

Beyond the Technology: On Demand Computing

Irving Wladawsky-Berger

Vice President, Technology & Strategy

IBM Server Group

Integration of Technology Into Society

[Figure: Evolution of Technology, illustrated with Electricity. Adoption curve labels: Lab, Early Adopters, Public Recognition, Mass Adoption.]

Integration of Technology Into Society

[Figure: Evolution of Technology for Information Technology. Adoption curve labels: Lab, Early Adopters, Public Recognition, Mass Adoption. Phases: Technology Development Phase, “Post Technology” Phase. Eras shown: Mainframes ("The Glass House"), Client/Server (PC's/LAN's), Network Computing (The Internet), ...]

Key IT Requirements

Technology Advances: Low Costs; High Performance

Standards & Integration

Hiding Complexity

Organizational Productivity

Quality of Service

Flexibility of Deployment

Technology Issues Still Dominate

Technology Continues to Advance

[Figure: Integrated Circuit Performance Trends. Logic density (# transistors per chip, 10^3 to 10^11) and a frequency scale (1MHz to 10GHz) plotted against year, 1980 to 2010.]

e-business Infrastructure

[Figure: e-business infrastructure diagram. Users: Customers, Business Partners, Suppliers, Employees. Components: Web Presentation Servers, Web Application Servers, Data Servers, Transaction Servers, Directory and Security Servers, Storage, Middleware, Network. Additional label: Quality of Service.]

Timely, Reliable, Sophisticated, Technologies

Huge Talent Pool

Developing Standards

Driving Innovation

WSDL

XML

Linux

SOAP

Culture of Standards

Globus

Web Services: Standards and Integration

• Interface (WSDL): Defines how to use the service
• Directory (UDDI): "Yellow pages" for service location
• Transport (SOAP): Connecting with applications and data

Hiding Complexity: Grid Computing

Accessing and Sharing Resources over the Internet, or Private Intranets, based on Open Protocols

Productivity on an Internet Scale

Virtual Organizations: Accessing and Sharing Information, Applications and Expertise

IT Delivered as a Utility On Demand Grids

Network-Delivered Applications

Business Process Outsourcing

"Intelligent" Services

Middleware

e-commerce and B2B Transaction Services

Storage Utility Services

Hosting/Bandwidth

[Figure: Evolution of Technology. Eras shown: Mainframes ("The Glass House"), Client/Server (PC's/LAN's), Network Computing (The Internet), On Demand Computing. Other labels: IT in a “Post-Technology” World, Mass Adoption.]

Beyond the Technology: On Demand Computing

What are Grids?

• Flexible shared infrastructures that can be automatically configured and adapted to use
  » “Utility”, “Shared”, “Plug in and Use”, “Dependable”
  » Efficient, flexible, low-cost use of resources
• Open infrastructures that enable federation at high levels of access and functionality
  » Computation Sharing
  » Data Sharing
  » Standards, Self-describing presentations, Security
  » Enable composition of resources, services, semantics, all the way up!
    – Things that weren’t designed to work together
    – An evolving, emergent organic infrastructure

Discussion

• Sounds good, why haven’t we always done this?

• What are the costs and pitfalls of this approach?

• Do all things wind up being “on the grid”? And what does that mean?

• Are grids just for high-end resources? (e.g. IBM and HP)

• What are the key research challenges?

Summary

• Course Organization
  » Two key elements of work
  » Topical Report: Signup today!
  » Course Project (get started today!)
• Grid Vision
  » Resources become invisible and seamlessly accessible
  » Sharing of data, computation, and other capabilities thru a service-oriented model
  » Seamless federation and sharing (beyond the Internet)
  » On-demand, self managing, dynamic configuring, etc.

• This vision is very hard to achieve!
