
Post on 20-Dec-2015


CMPT 401 2008

Dr. Alexandra Fedorova

Distributed Systems

CMPT 401 © A. Fedorova

What is a Distributed System?

• Coulouris, et al.:
  – A system in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages


What is a Distributed System?

• Andrew Tanenbaum:
  – A collection of independent computers that appear to the users as a single coherent system
  – Autonomous computers connected by a network
  – Software specifically designed to provide an integrated computing facility


What is a Distributed System?

• Leslie Lamport:
  – “You know you have a distributed system when the crash of a computer you’ve never heard of stops you from getting any work done.”


What is a Distributed System?

• A broader definition:
  – A collection of processors executing independent instruction streams that communicate and synchronize their actions
  – Communication may be done via messages or shared memory
  – Includes multi-process and multithreaded programs running on monolithic multiprocessor hardware


Distributed System: the Internet

©Pearson Education 2001


Distributed System: an Intranet

©Pearson Education 2001


Distributed System: Mobile Devices

©Pearson Education 2001


Other Examples of Distributed Systems

• Distributed Multimedia Systems
  – Teleconferencing
  – Distance learning
• Cellular phone systems
• IP Telephony
• Flight management system in an aircraft
• Automotive control systems (50+ embedded processors in a Mercedes S-class)
• Distributed file systems (NFS, Samba)
• P2P file sharing
• The World Wide Web


Reasons for Distributing Systems I

• The need to share data across remote geographies
  – Online Encyclopedia Britannica is accessed by users all over the world
  – Computer users in different geographies send messages to each other
• Replication of processing power
  – Independent processors working on the same task
  – Distributed systems consisting of collections of microcomputers may have the processing power of a large supercomputer
• Use of heterogeneous components
  – Compute-intensive sub-tasks of a problem are run on powerful computers
  – Less resource-demanding sub-tasks run on less powerful computers
  – More efficient use of resources


Reasons for Distributing Systems II

• Cost of hardware and management
  – A collection of cheap computers may be less expensive than one large supercomputer
  – Small simple computers may be easier to manage than one large one
• Administrative/functional issues
  – Payroll database is separate from registrar’s database
  – Each is managed according to the needs of the organization
  – Each is equipped with hardware that meets the needs of the organization
• Resilience to failures
  – If one component fails, others can proceed with work on the task
• Scalability
  – The system can be extended by adding more components (e.g., the WWW)


Properties of Distributed Systems

• Heterogeneity
  – Systems consist of heterogeneous hardware and software components
• Concurrency
  – Multiple programs run together
• Shared data
  – Data is accessed simultaneously by multiple entities
• No global clock
  – Each component has a local notion of time
• Interdependencies
  – Independent components depend on each other


Challenges of DS: Heterogeneity

• Different network infrastructures (Ethernet, 802.11 wireless)
• Hardware and software (e.g., operating systems, processors): how can an Intel/Windows system understand messages sent by a Macintosh OS X system?
• Programming languages: how can a Java program and a C program communicate?
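One common answer to the questions above is for both sides to agree on an explicit wire format rather than exchanging in-memory representations. The following is a minimal Python sketch, not taken from the course material. The message layout (a 4-byte unsigned id followed by an 8-byte double) and the names `WIRE_FORMAT`, `encode`, and `decode` are assumptions made for illustration.

```python
import struct

# Hypothetical wire format (an assumption for illustration):
# "!" = network (big-endian) byte order with standard sizes and no padding,
# "I" = unsigned 32-bit integer, "d" = 64-bit IEEE 754 double.
WIRE_FORMAT = "!Id"

def encode(msg_id, value):
    """Serialize to the agreed byte layout, independent of the host CPU."""
    return struct.pack(WIRE_FORMAT, msg_id, value)

def decode(data):
    """Parse bytes produced by any sender that follows the same layout."""
    msg_id, value = struct.unpack(WIRE_FORMAT, data)
    return msg_id, value

payload = encode(7, 2.5)
print(len(payload))     # 12 bytes: 4 for the id, 8 for the double
print(decode(payload))  # (7, 2.5)
```

Because the byte order and field sizes are fixed by the format string, a C or Java program that reads the same 12 bytes in big-endian order would recover the same two values regardless of its processor or operating system.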


Challenges of DS: Security

• Shared data must be protected
  – Privacy: avoid unintentional disclosure of private data
  – Security: data is not revealed to unauthorized parties
  – Integrity: protect data and system state from corruption
• Denial of service attacks: put significant load on the system, prevent users from accessing it
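As an illustration of the integrity goal only, one widely used technique is to attach a keyed message authentication code (HMAC) to each message so the receiver can detect tampering. This is a minimal sketch, not the course's method. The key `SECRET_KEY` and the message contents are hypothetical, and a real system would also need secure key distribution.

```python
import hashlib
import hmac

# Hypothetical shared secret (assumption); real keys must be provisioned securely.
SECRET_KEY = b"example-shared-secret"

def sign(message):
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(SECRET_KEY, message, hashlib.sha256).digest()

def verify(message, tag):
    """Check the tag using a constant-time comparison."""
    return hmac.compare_digest(sign(message), tag)

msg = b"deposit 100 to account 42"
tag = sign(msg)
print(verify(msg, tag))                           # True
print(verify(b"deposit 900 to account 42", tag))  # False: message was altered
```

Note that an HMAC detects modification but does not hide the message, so it addresses integrity, not privacy.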


Challenges of DS: Synchronization

• Concurrent cooperating tasks need to synchronize
  – When accessing shared data
  – When performing a common task
• Synchronization must be done correctly to prevent data corruption:
  – Example: two account owners; one deposits the money, the other one withdraws; they act concurrently
  – How to ensure that the bank account is in “correct” state after these actions?
• Synchronization implies communication
• Communication can take a long time
• Excessive synchronization can limit effectiveness and scalability of a distributed system
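The bank account example can be sketched on a single machine with threads and a mutex. This is a minimal illustration in Python, not course code; the `Account` class, the `repeat` helper, and the amounts are assumptions. In a truly distributed setting the lock would itself require message exchange, which is where the communication cost mentioned above comes from.

```python
import threading

class Account:
    """Shared bank account; the lock makes each update atomic."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount):
        with self._lock:
            self.balance += amount

    def withdraw(self, amount):
        with self._lock:
            if amount > self.balance:
                return False  # refuse to overdraw
            self.balance -= amount
            return True

def repeat(operation, amount, times):
    """Apply the same account operation many times."""
    for _ in range(times):
        operation(amount)

account = Account(10_000)
depositor = threading.Thread(target=repeat, args=(account.deposit, 1, 10_000))
withdrawer = threading.Thread(target=repeat, args=(account.withdraw, 1, 10_000))
depositor.start()
withdrawer.start()
depositor.join()
withdrawer.join()
print(account.balance)  # 10000
```

The starting balance covers every withdrawal, so no withdrawal is refused and the final balance equals the starting balance. Without the lock, the read-modify-write on `balance` could interleave between the two threads and lose updates.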


Challenges of DS: Absence of Global Clock

• Cooperating tasks need to agree on the order of events
• Each task has its own notion of time
• Clocks cannot be perfectly synchronized
• How to determine which event occurred first?
• Example: Bank account, starting balance = $100
  – Client at bank machine A makes a deposit of $100
  – Client at bank machine B makes a withdrawal of $150
  – Which event happened first?
  – Should the bank charge the overdraft fee?
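One well-known way to get all parties to agree on an order without a global clock is a logical (Lamport) clock, with ties broken by a node identifier. The sketch below is an illustration under that assumption, not the course's solution. The class `LamportClock` and the node names are hypothetical. For concurrent events such as the two above, the resulting order is an agreed convention and need not match real time.

```python
class LamportClock:
    """Logical clock: a counter that only moves forward."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.time = 0

    def stamp(self):
        """Local event or send: advance the counter and return a timestamp."""
        self.time += 1
        return (self.time, self.node_id)

    def receive(self, timestamp):
        """Message receipt: move past the sender's counter value."""
        self.time = max(self.time, timestamp[0]) + 1

machine_a = LamportClock("A")
machine_b = LamportClock("B")
bank = LamportClock("bank")

# The two requests are concurrent and may arrive in either order.
events = [
    (machine_b.stamp(), -150),  # withdrawal at machine B
    (machine_a.stamp(), +100),  # deposit at machine A
]

balance = 100
overdrawn = False
# Every replica sorts by (counter, node id), so all agree on one order.
for timestamp, amount in sorted(events):
    bank.receive(timestamp)
    balance += amount
    overdrawn = overdrawn or balance < 0

print(balance, overdrawn)  # 50 False
```

Here both events carry counter value 1, so the tie-breaker places machine A's deposit first and no overdraft occurs. A different tie-breaking rule would give the opposite answer, which is exactly why all participants must apply the same rule.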


Challenges of DS: Partial Failures

• Detection of failures – may be impossible
  – Has a component crashed? Or is it just slow?
  – Is the network down? Or is it just slow?
  – If it’s slow – how long should we wait?
• Handling of failures
  – Retransmission
  – Tolerance for failures
  – Roll back partially completed task
• Redundancy against failures
  – Duplicate network routes
  – Replicated databases
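Timeouts and retransmission are often combined: the caller waits a bounded time, then resends with a growing delay. The sketch below illustrates that pattern in Python. It is an assumption-laden example, not course code; `call_with_retries` and `FlakyServer` are hypothetical names, and the server is a local stand-in for a remote component.

```python
import time

def call_with_retries(request, attempts=3, backoff_seconds=0.01):
    """Retransmit until a reply arrives or the attempts run out."""
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except TimeoutError:
            if attempt == attempts:
                raise  # give up and report the failure to the caller
            time.sleep(backoff_seconds * 2 ** attempt)  # wait longer each time

class FlakyServer:
    """Hypothetical remote component: the first `failures` calls time out."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def handle(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise TimeoutError("no reply within the deadline")
        return "ok"

server = FlakyServer(failures=2)
print(call_with_retries(server.handle))  # ok
print(server.calls)                      # 3
```

Since the caller cannot tell a lost request from a slow reply, a retransmitted request may be executed more than once, so this pattern is safe only when the operation is idempotent or the receiver filters duplicates.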


Challenges of DS: Scalability

• Does the system remain effective as it grows?
• As you add more components:
  – More synchronization
  – More communication -> the system runs slowly
• Avoiding performance bottlenecks:
  – Everyone is waiting for a single shared resource
  – In a centrally coordinated system, everyone waits for the coordinator


Challenges of DS: Transparency I

• Concealing the heterogeneous and distributed nature of the system so that it appears to the user like one system
• Transparency categories
  – Access: access local and remote resources using identical operations (NFS or Samba-mounted file systems)
  – Location: access without knowledge of location of a resource (URLs, e-mail)
  – Concurrency: allow several processes to operate concurrently using shared resources in a consistent fashion (two users simultaneously accessing the bank account)


Challenges of DS: Transparency II

• Transparency categories (continued)
  – Replication: use replicated resource as if there were just one instance
  – Failure: allow programs to complete their task despite failures
  – Mobility: allow resources to move around
  – Performance: adaptation of the system to varying load situations without the user noticing it
  – Scaling: allow system and applications to expand without need to change structure of applications or algorithms


Course Objective

• Comprehensive introduction to distributed systems
• Theoretical aspects: models and architectures of distributed systems
• Understand challenges in building distributed systems
• Practical aspect: implement advanced distributed systems
• Read latest research papers addressing distributed systems


Topics Studied I

• Architecture models of distributed systems (client-server, P2P, etc.)

• Operating system support: processes and threads, synchronization and mutual exclusion

• Inter-process communication
• Distributed objects and remote invocation
• Distributed file systems
• Time and Global Clocks
• Coordination and agreement


Topics Studied II

• Transactions and concurrency control
• Replication
• Distributed multimedia systems
• Peer-to-Peer systems
• Mobile and ubiquitous computing
• Biologically inspired distributed systems


Course Structure

• Reading a textbook
  – Reading is assigned for every class
  – To be done before the lecture
  – Midterm and final will test reading
• Programming assignments and a project
  – Challenging assignments
  – Require strong programming skills
  – First assignment is in C; the rest are either C or Java
• Reading research papers
  – Takes time
  – Submit summaries of assigned articles


Programming Assignments

• Assignment #1 (already posted) – due January 28
  – Solve a synchronization problem
  – Multithreaded programming in C, using pthreads library and mutexes
• Assignment #2 – due February 25
  – Implement a distributed file system with a transactional interface
  – Use C, C++ or Java


Grading

• One midterm exam: 20%
• Two homework assignments: 30%
• Project assignment: 20%
• Paper summaries: 10%
• Final exam: 20%


Course Web Site

• Linked at http://www.cs.sfu.ca/CC/index-by-course.html
• Visit often
• Contains:
  – Syllabus
  – Assignments
  – Deadlines
  – Instructor office hours and office location