Distributed Systems - Carnegie Mellon University 95-702 Distributed Systems


Upload: voduong

Post on 03-May-2018


Distributed Systems

Introduction

Overview of Distributed Systems

MISM 95-702 Distributed Systems 1

Any Guess What This Is?

Source: http://www.cheswick.com/ches/map/movie.mpeg


Why Distributed Systems?

•  Long ago…

•  Anyone want to guess who the two people are?

•  What are they most known for?

Long ago, there were stand-alone systems

•  E.g. Email
•  Initially, email existed on a single machine
•  All users in a community had to dial into that machine
•  Once computers started talking to each other across dial-up lines, you could get mail from one person to another
   –  E.g. uucp: bigsystem!myschool!mycomputer!joemertz
   –  You would have to tell people how to hop to your computer
      •  Similar to not having addresses, but telling people how to find you by a set of intersections.
      •  Imagine doing that from the US to India!
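
As a rough illustration of the uucp-style path, the sketch below splits a bang path into the machines to hop through and the final user. The names `BangPath` and `parse` are made up for this example, and real uucp mail routing involved more than string splitting.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits a bang path into relay hops and the final user.
class BangPath {
    final List<String> hops; // machines to pass through, in order
    final String user;       // mailbox on the last machine

    BangPath(List<String> hops, String user) {
        this.hops = hops;
        this.user = user;
    }

    static BangPath parse(String path) {
        String[] parts = path.split("!");
        if (parts.length < 2) {
            throw new IllegalArgumentException("need at least one machine and a user: " + path);
        }
        // Everything except the last element is a hop; the last element is the user.
        List<String> hops = Arrays.asList(parts).subList(0, parts.length - 1);
        return new BangPath(hops, parts[parts.length - 1]);
    }

    public static void main(String[] args) {
        BangPath p = parse("bigsystem!myschool!mycomputer!joemertz");
        System.out.println("Relay through: " + p.hops);
        System.out.println("Deliver to user: " + p.user);
    }
}
```

The sender has to know every hop in advance, which is the burden a single email address removes.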


Why Distributed Systems?

Today…

And a single email address: [email protected]

Evolution of applications

• This is a rapidly growing and changing field.

• Building solutions has gotten quite powerful, but also quite complex.

• E.g. History of what has become Customer Relationship Management (CRM) systems.


Contact manager application

Stand alone applications on a single computer.

•  Single programming language
•  Single computing platform
•  Ad-hoc information storage
•  No networking protocols

Shared database

Shared databases available via a LAN to all salespeople

•  Single programming language
•  Single computing platform
•  Single DBMS
•  LAN based networking

Web-based system

•  CRM uses common web protocols for users to access customer information via a browser
•  Multiple programming languages
   –  Middleware (software on the server)
      •  Java and J2EE, C# and .NET, PHP, Ruby, or others
      •  HTML (and perhaps JavaScript and CSS)
•  Multiple computing platforms
   –  Server
   –  User web browser
      •  Potentially multiple web browsers with varying capabilities
•  Server-side DBMS
•  Web-based networking protocols: HTTP

My.company.com

Customer portal into your business
•  Multiple languages
   –  Middleware (J2EE, .NET, etc.)
   –  Languages of legacy systems
   –  HTML, CSS, JavaScript
   –  Frameworks: AJAX, jQuery, Struts, Spring, Hibernate, JPA, Flash, Silverlight
•  Multiple computing platforms
   –  Server, legacy systems, multiple web browsers
   –  Cloud-based platforms (IaaS, PaaS, SaaS)
•  Multiple DBMS, some legacy
•  Networking
   –  HTTP, web services protocols (e.g. SOAP, REST), enterprise-internal protocols to tie to legacy systems.

Mobile.my.company.com

•  Customer can access all their account information for interacting with the business via a mobile phone
•  E.g. Amazon.com on my iPhone.
•  Multiple languages
   –  Middleware
   –  Languages of legacy systems
   –  HTML, CSS, JavaScript
   –  Objective-C for iPhone, Java for Android and Blackberry, C# for Windows Phone 7.
   –  Frameworks: AJAX, jQuery, Struts, Spring, Hibernate, JPA, Flash, Silverlight
•  Multiple computing platforms
   –  Server, legacy systems, multiple web browsers
   –  Cloud-based platforms (IaaS, PaaS, SaaS)
   –  iPhone, Android, etc. mobile phone platforms.
•  Multiple DBMS, some legacy
•  Networking
   –  HTTP, web services protocols (e.g. SOAP, REST), enterprise-internal protocols to tie to legacy systems,
   –  XML or JSON to pass objects to/from mobile.

Course Objectives

1.  Understand the principles underlying distributed computing and the design of distributed systems.

2.  Instantiate these principles in the context of real applications, using technologies such as XML, SOAP, Web services, and JEE.

3.  Analyze, design, evaluate and recommend distributed computing solutions in response to business problems.


Relationship to Other Courses

95-702 Distributed Systems

95-843 Service Oriented Architecture

95-774 Business Process Modeling

95-831 EA

95-712 Java


Structure of the Course

•  Lectures / Discussion
   –  Your class participation is critical!
•  Demonstrations
   –  With your active involvement!
   –  Sometimes with labs you do in class
•  Project assignments
   –  Programming & Writing
   –  The secret is to start early!
•  Two Midterms
•  Final examination

Class Preparation

•  Readings are assigned for most classes
   –  You must complete them before class, and be prepared to discuss them.
   –  You will be called upon

Course Technologies

•  IDE (NetBeans, Eclipse)
•  Java Web Applications (GlassFish)
•  Android Platform
•  Message Oriented Middleware (Sun’s Message Queue)
•  Web Services (JDK 6)
•  Distributed Objects (Java RMI, EJBs and CORBA)

Course Web Site

•  http://andrew.cmu.edu/course/95-702

•  Review
   –  Syllabus
   –  Schedule

Overview of Projects

• A Projects Summary is on the schedule


Getting Started Guide

•  See the schedule for:
   –  Glassfish Getting Started
   –  Download and Install Glassfish (choose the "All" column)
   –  Android Getting Started
   –  Download and Install Android
•  Contains instructions to get started with some development tools.
•  The installation includes:
   –  Netbeans and Glassfish.
   –  Eclipse and Android SDK
•  Do the exercise in these guides before next class.
   –  Not to be turned in, but necessary for activities in class.

What is a Distributed System?

Coulouris et al: System in which hardware and software components located on networked computers communicate and coordinate their actions only by passing messages.

Is your dual core laptop a distributed system?

Consequences of Being Distributed

Concurrency - Because components are located on independent networked computers, they execute concurrently.
   –  Advantage: can handle my queries as well as others
   –  Disadvantage: “Hey, who took the airplane seat I wanted!”

Consequences of Being Distributed

No Global Clock - Coordination needs to be done without the benefit of a single clock.
   –  Try synchronizing two watches, within a second, on opposite sides of the room, only passing notes through potentially unreliable couriers.
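
One common way to approach the two-watches exercise is a round-trip estimate: ask the other side for its time, note how long the reply took, and assume the delay was split evenly in each direction. The sketch below is only an illustration of that idea; the class and method names are invented here, and the even-split assumption is exactly what an unreliable courier can violate.

```java
// Sketch of a round-trip clock estimate. All names here are hypothetical.
class ClockEstimate {
    // sentAt and receivedAt are readings of the local clock (milliseconds);
    // remoteReading is the time the other side wrote on its reply.
    static long estimateRemoteNow(long sentAt, long receivedAt, long remoteReading) {
        long roundTrip = receivedAt - sentAt;
        // Assume the reply spent half of the round trip in transit.
        return remoteReading + roundTrip / 2;
    }

    // The estimate can be off by at most half the round trip,
    // because the reply took somewhere between 0 and roundTrip to arrive.
    static long maxError(long sentAt, long receivedAt) {
        return (receivedAt - sentAt) / 2;
    }

    public static void main(String[] args) {
        long estimate = estimateRemoteNow(1000, 1200, 5000);
        System.out.println("Remote clock is about " + estimate
                + " ms, give or take " + maxError(1000, 1200) + " ms");
    }
}
```

The slower or less predictable the couriers, the larger the round trip and the wider the uncertainty, which is why agreeing "within a second" is hard.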

Consequences of Being Distributed

Independent Failure - Each component, independently, can fail. Therefore components must account for any other component failing.
   –  Give me an example…
      •  Something failing
      •  How it must be accounted for

Why Choose to Construct a Distributed System?

•  Name 3 applications you would not implement as a distributed system
•  Name 3 applications you would
•  Therefore generalize:
   –  What motivates constructing distributed systems?

Why distributed?

•  Resource sharing
   –  BitTorrent
•  Communication
   –  Skype
•  Coordination
   –  Google apps
•  The world is distributed
   –  Sensor nets
•  Use existing resources
   –  Mashup Google Maps and spatial data
•  Other?

Building vs. Arising

•  Distributed systems can be specifically built
   –  PNC Bank’s ATM Network
   –  Amazon’s server farms
•  Or arise out of a set of open standards
   –  WWW
   –  BitTorrent

The World Wide Web

•  Created by Sir Tim Berners-Lee at the European Organization for Nuclear Research (CERN) in Switzerland in 1989 (Knighted 2003)

•  Provides a hypertext structure allowing documents to contain links to other documents

•  Is an open system (can be extended and implemented in new ways, standards are public and widely implemented)

The World Wide Web

Based on three standards:

1.  HTML - presentation of content & links
2.  URLs - point to a resource and specify a protocol
3.  HTTP - the request and reply protocol

Crafting distributed systems sometimes means only defining appropriate standards and protocols.
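
To make "point to a resource and specify a protocol" concrete, the sketch below takes the course URL apart with the standard `java.net.URI` class. The wrapper class `UrlParts` and its `describe` method are illustrative names only.

```java
import java.net.URI;

// Illustrative wrapper around java.net.URI; the class name is made up.
class UrlParts {
    static String describe(String url) {
        URI uri = URI.create(url);
        // The scheme names the protocol; host and path identify the resource.
        return "scheme=" + uri.getScheme()
                + " host=" + uri.getHost()
                + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        System.out.println(describe("http://andrew.cmu.edu/course/95-702"));
        // prints: scheme=http host=andrew.cmu.edu path=/course/95-702
    }
}
```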


Components

•  What hardware components can be part of a distributed system?

•  Do they all have the
   –  Same processor?
   –  Same instruction set?
   –  Same speed?
   –  Same memory?
   –  Same operating system?
   –  Support the same languages?


Challenges Constructing Distributed Systems

1.  Heterogeneity of components
    •  Middleware provides a level of abstraction to mask heterogeneity.
    •  Much of this class is about writing middleware.
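
One small instance of masking heterogeneity: machines may store multi-byte integers in different byte orders, so both sides agree on a single wire format and convert at the edges. The sketch below assumes a 4-byte big-endian encoding; `WireFormat` and its methods are hypothetical names, not an API from the course.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical wire format: every int travels as 4 bytes, most significant byte first.
class WireFormat {
    static byte[] encode(int value) {
        return ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN).putInt(value).array();
    }

    static int decode(byte[] wire) {
        return ByteBuffer.wrap(wire).order(ByteOrder.BIG_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        byte[] wire = encode(1);
        System.out.println(Arrays.toString(wire)); // prints: [0, 0, 0, 1]
        System.out.println(decode(wire));          // prints: 1

        // A receiver that wrongly assumed little-endian order would misread the value.
        int misread = ByteBuffer.wrap(wire).order(ByteOrder.LITTLE_ENDIAN).getInt();
        System.out.println(misread);               // prints: 16777216
    }
}
```

Application code calls `encode` and `decode` and never needs to know the byte order of the machine on the other end.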

Challenges Constructing Distributed Systems

1.  Heterogeneity of components
2.  Openness
    •  What is openness?
    •  Why is it a challenge?

What is Openness?

•  The degree to which:
   –  Key interfaces are published
   –  System can be built-upon or re-implemented
   –  Licensing and support encourage using the system
   –  Interface and communication standards are negotiated by stakeholders vs. being controlled by a single vendor
      •  Who is included in the stakeholders?
      •  Fairness vs. Secret APIs
      •  Stability across updates

Security

• From a Distributed Systems standpoint, how could you harm Salesforce.com?


Security Attacks

•  Get access to their users’ information

•  Compromise their users’ information

•  Make Salesforce unavailable to their users


Security Challenges

•  Confidentiality
   –  Protection against disclosure to unauthorized individuals
•  Integrity
   –  Protection against alteration or corruption
•  Availability
   –  Protection against interference in accessing the resources
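
As one hedged example of the integrity goal, a sender can attach a keyed hash (HMAC) to a message so the receiver can detect alteration in transit. This is a sketch of one possible mechanism, not something the slides prescribe; the class name, the demo key, and the sample messages are all invented, and it assumes both parties already share the key.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Illustrative integrity check using the standard HmacSHA256 algorithm.
class IntegrityCheck {
    // Computes the tag the sender attaches to the message.
    static byte[] tag(byte[] key, String message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC not available", e);
        }
    }

    // The receiver recomputes the tag and compares it with the one received.
    static boolean verify(byte[] key, String message, byte[] receivedTag) {
        return MessageDigest.isEqual(tag(key, message), receivedTag);
    }

    public static void main(String[] args) {
        byte[] key = "demo-key-not-for-real-use".getBytes(StandardCharsets.UTF_8);
        byte[] sent = tag(key, "pay 10");
        System.out.println(verify(key, "pay 10", sent));   // prints: true
        System.out.println(verify(key, "pay 1000", sent)); // prints: false
    }
}
```

This addresses alteration only; confidentiality and availability need other measures.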

Challenges Constructing Distributed Systems

1.  Heterogeneity of components
2.  Openness
3.  Security

Twitter

Interviewer: How has Ruby on Rails been holding up to the increased load?

Twitter Developer Alex Payne: By various metrics Twitter is the biggest Rails site on the net right now. Running on Rails has forced us to deal with scaling issues - issues that any growing site eventually contends with – far sooner than I think we would on another framework. The common wisdom in the Rails community at this time is that scaling Rails is a matter of cost: just throw more CPUs at it. The problem is that more instances of Rails (running as part of a Mongrel cluster, in our case) means more requests to your database. At this point in time there’s no facility in Rails to talk to more than one database at a time. … All the convenience methods and syntactical sugar that makes Rails such a pleasure for coders ends up being absolutely punishing, performance-wise. Once you hit a certain threshold of traffic, either you need to strip out all the costly neat stuff that Rails does for you (RJS, ActiveRecord, ActiveSupport, etc.) or move the slow parts of your application out of Rails, or both.

March 29th, 2007 http://www.radicalbehavior.com/5-question-interview-with-twitter-developer-alex-payne/

Challenges Constructing Distributed Systems

1.  Heterogeneity of components
2.  Openness
3.  Security
4.  Scalability
    •  Will the system remain effective when there is a significant increase in users or resources?

Scalability

•  Can you:
   –  Control the cost of physical resources?
   –  Control performance loss?
   –  Prevent software resources from running out?
      •  E.g. Y2K - date field size
      •  E.g. IPv4 addresses
   –  Avoid performance bottlenecks?
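
The two "running out" examples come down to fixed field sizes. The arithmetic can be sketched as below; the class and method names are made up for illustration.

```java
// Illustrative arithmetic for fixed-size fields. Names are hypothetical.
class CapacityLimits {
    // An IPv4 address is 32 bits wide, so there are 2^32 possible values.
    static long ipv4AddressCount() {
        return 1L << 32;
    }

    // A two-digit year field cannot tell centuries apart:
    // the same stored value matches a year in the 1900s and one in the 2000s.
    static int[] candidateYears(int twoDigitYear) {
        return new int[] {1900 + twoDigitYear, 2000 + twoDigitYear};
    }

    public static void main(String[] args) {
        System.out.println(ipv4AddressCount()); // prints: 4294967296
        int[] years = candidateYears(5);
        System.out.println(years[0] + " or " + years[1]); // prints: 1905 or 2005
    }
}
```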

Challenges Constructing Distributed Systems

1.  Heterogeneity of components
2.  Openness
3.  Security
4.  Scalability
5.  Failure Handling

Failure Handling

•  Detect failures
   –  E.g. Did you really get that plane reservation?
•  Masking failures
   –  E.g. If no response, retransmit
•  Tolerating failures
   –  E.g. When browsing, server (temporarily) down; you just move on.
•  Recovery from failure
   –  E.g. A bad message from a bank clearing a credit card transaction should not crash Amazon.com, nor have the merchandise sent without payment.
•  Redundancy
   –  E.g. Replicate data, replicate routes to data
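
The "if no response, retransmit" idea can be sketched as a bounded retry loop. This is a minimal illustration: `Retry` and `withRetries` are invented names, a failed request is modeled as a `RuntimeException`, and a real client would usually also wait between attempts.

```java
import java.util.function.Supplier;

// Illustrative retry helper; not taken from any particular library.
class Retry {
    static <T> T withRetries(int maxAttempts, Supplier<T> request) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.get(); // success: the caller never sees the earlier failures
            } catch (RuntimeException e) {
                last = e;             // treat as "no response" and send again
            }
        }
        // The failure could not be masked, so report it.
        throw new IllegalStateException("no reply after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String reply = withRetries(3, () -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("request lost");
            }
            return "reserved";
        });
        System.out.println(reply + " after " + calls[0] + " attempts");
    }
}
```

Retransmitting is only safe when repeating the request is harmless; otherwise a lost reply (rather than a lost request) could book the plane seat twice.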


Registering for class

When you request a class:

if (numStudents < roomSize) {
    numStudents = numStudents + 1;
    return enrolled;
} else {
    return class_full;
}

When Bob requests a class:

if (numStudents < roomSize) {
    numStudents = numStudents + 1;
    return enrolled;
} else {
    return class_full;
}

If these two execute concurrently, what will the outcome be?

Challenges Constructing Distributed Systems

1.  Heterogeneity of components
2.  Openness
3.  Security
4.  Scalability
5.  Failure Handling
6.  Concurrency
    •  System must be designed to allow for concurrent access, but in a way that maintains integrity of data.
    •  Java “synchronized”
    •  Semaphores
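
One way the class-registration check could be protected with Java's `synchronized` keyword is sketched below. The `Registrar` class and its `simulate` helper are illustrative names, and the sketch assumes all requests reach a single object in one JVM, which is a simplification of a real registration system.

```java
// Illustrative registrar: the check and the increment happen as one atomic step.
class Registrar {
    private final int roomSize;
    private int numStudents = 0;

    Registrar(int roomSize) {
        this.roomSize = roomSize;
    }

    // Only one thread at a time can run this method on a given Registrar,
    // so two requests cannot both see the last free seat.
    synchronized boolean enroll() {
        if (numStudents < roomSize) {
            numStudents = numStudents + 1;
            return true;  // enrolled
        } else {
            return false; // class full
        }
    }

    synchronized int enrolled() {
        return numStudents;
    }

    // Fires many concurrent requests and reports how many were enrolled.
    static int simulate(int roomSize, int requests) {
        Registrar registrar = new Registrar(roomSize);
        Thread[] threads = new Thread[requests];
        for (int i = 0; i < requests; i++) {
            threads[i] = new Thread(registrar::enroll);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return registrar.enrolled();
    }

    public static void main(String[] args) {
        System.out.println("Enrolled: " + simulate(30, 200) + " of 30 seats");
    }
}
```

Without `synchronized`, two threads could both pass the size check before either increments the count, and the class could end up over capacity.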

Final Challenge: Transparency

•  Hiding from a user or developer the details below the level they need to know.

• Maintaining a consistent abstraction without worrying about the underlying details.

•  (Who thinks this should be called opaqueness, not transparency?)


Types of Transparency

Access transparency: enables local and remote resources to be accessed using identical operations.

Location transparency: enables resources to be accessed without knowledge of their location.

Concurrency transparency: enables several processes to operate concurrently using shared resources without interference between them.

Replication transparency: enables multiple instances of resources to be used to increase reliability and performance without knowledge of the replicas by users or application programmers.
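
Access transparency can be sketched in code as one interface with a local and a remote implementation, so the caller uses identical operations either way. Everything below is hypothetical: the `Directory` interface, both implementations, and the sample entry are invented for illustration, and the "remote" stub simply forwards to another object where a real one would send a message over the network.

```java
import java.util.HashMap;
import java.util.Map;

// The single interface that callers program against.
interface Directory {
    String lookup(String name);
}

// A resource held in this process.
class LocalDirectory implements Directory {
    private final Map<String, String> entries = new HashMap<>();

    void put(String name, String value) {
        entries.put(name, value);
    }

    public String lookup(String name) {
        return entries.get(name);
    }
}

// Stand-in for a remote proxy. A real stub would marshal the request and
// send it to a server; here the "server" is just another object.
class RemoteDirectoryStub implements Directory {
    private final Directory server;

    RemoteDirectoryStub(Directory server) {
        this.server = server;
    }

    public String lookup(String name) {
        return server.lookup(name);
    }
}

class TransparencyDemo {
    // The caller cannot tell, and does not need to know, which kind it was given.
    static String find(Directory directory, String name) {
        return directory.lookup(name);
    }

    public static void main(String[] args) {
        LocalDirectory local = new LocalDirectory();
        local.put("printer", "room-101");
        Directory remote = new RemoteDirectoryStub(local);
        System.out.println(find(local, "printer"));
        System.out.println(find(remote, "printer"));
    }
}
```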


Types of Transparency

Failure transparency: enables the concealment of faults, allowing users and application programs to complete their tasks despite the failure of hardware or software components.

Mobility transparency: allows the movement of resources and clients within a system without affecting the operation of users or programs.

Performance transparency: allows the system to be reconfigured to improve performance as loads vary.

Scaling transparency: allows the system and applications to expand in scale without change to the system structure or the application algorithms.


Challenges Constructing Distributed Systems

1.  Heterogeneity of components
2.  Openness
3.  Security
4.  Scalability
5.  Failure Handling
6.  Concurrency
7.  Transparency

Pitfalls when Developing Distributed Systems

•  Watch for false assumptions that may be made by designers:
   –  The network is reliable.
   –  The network is secure.
   –  The network environment is homogeneous.
   –  Latency is zero.
   –  Bandwidth is infinite.
   –  There is one administrator.
   –  And many more…

Distributed System Models

In this class, we will be considering both:

•  Architectural Models
•  Fundamental Models

E.g. In the Physical World

•  Architecture Models
   –  House
   –  Office Building
   –  Factory
•  Fundamental Models
   –  How to heat it
   –  How to light it
   –  How to provide security

Distributed System Models

•  Architectural Models
   –  Describe the placement of components and the relationships between them
      •  Software architecture
      •  Hardware architecture
      •  Network architecture
•  Fundamental Models
   –  Describe fundamental properties that cut across any choice of architecture
      •  Interaction model
      •  Failure model
      •  Security model

Architectures and Models

Client / Server
   –  Interaction Model: E.g. Client always initiates communication
   –  Failure Model: E.g. If server is down, the client must retry later.
   –  Security Model: E.g. Server maintains set of user identities

Peer to Peer
   –  Interaction Model: E.g. Any peer can initiate communication
   –  Failure Model: E.g. If one peer is down, another can be used.
   –  Security Model: E.g. Replicated set of user identities are maintained by 4 master peers
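
The client/server interaction model, where the client always initiates, can be sketched with plain Java sockets. This is a minimal single-request illustration under several assumptions: `EchoDemo` and `roundTrip` are invented names, both ends run in one process over the loopback interface, and the "protocol" is one line of text each way.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Illustrative one-shot echo exchange; not production code.
class EchoDemo {
    static String roundTrip(String message) {
        // Port 0 asks the operating system for any free port.
        try (ServerSocket server = new ServerSocket(0)) {
            // Server side: waits passively until a client connects.
            Thread serverThread = new Thread(() -> {
                try (Socket s = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
                     PrintWriter out = new PrintWriter(
                             new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
                    out.println("echo: " + in.readLine());
                } catch (IOException e) {
                    // The server gives up on this request; the client will see no reply.
                }
            });
            serverThread.start();

            // Client side: initiates the connection, sends a request, reads the reply.
            try (Socket c = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
                 PrintWriter out = new PrintWriter(
                         new OutputStreamWriter(c.getOutputStream(), StandardCharsets.UTF_8), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(c.getInputStream(), StandardCharsets.UTF_8))) {
                out.println(message);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

If the server were down, the client's connection attempt would fail with an `IOException`, which is the point where the failure model's "retry later" applies.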

Hardware & Software Architecture

Layers, from top to bottom:
   –  Applications, services
   –  Middleware
   –  Operating system
   –  Computer and network hardware

Platform: the operating system together with the computer and network hardware

Middleware
•  Masks heterogeneity
•  Convenient programming model to application developers
•  Services such as
   –  communications
   –  data sharing
   –  naming
   –  security
   –  transactions
   –  persistent storage
   –  event notification

Models Impact Decision Making

•  Building distributed systems requires making architectural and functional decisions
•  Architectural Models
   –  Placement of components and relationship among them
•  Fundamental Models
   –  Design choices that cut across choice of architectures
      •  Interaction
      •  Failure
      •  Security