erkay savas distributed systems cs403/534...

Introduction

CS403/534Distributed Systems

Erkay SavasSabanci University

Issues

1. Communication2. Processes3. Naming4. Synchronization5. Consistency & Replication6. Fault Tolerance7. Security

7 important principals

Distributed Systems: Paradigms

1. Distributed Object-Based Systems• CORBA, DCOM, Globe

2. Distributed File Systems• Sun NFS, Coda• Plan 9, xFS, SFS

3. Distributed Document-Based Systems• WWW, Lotus Notes

4. Distributed Coordination-Based Systems• TIB/Rendezvous, Sun JINI

Distributed System: DefinitionA distributed system is a piece of software ensuring that:

A collection of independent computers that appears to its users as a single coherent system.

Two aspects:1. Independent computers and2. Software that makes everything look like a

single system (middleware)

Machine A Machine B Machine C

Distributed System: Organization

Local OS Local OS Local OS

Middleware Service

Distributed Applications

Network

Goals of Distributed Systems

• Connecting resources and users• Distribution transparency• Openness• Scalability

Transparency in a Distributed System

Different forms of transparency in a distributed system.

Hide whether a (software) resource is in memory or on diskPersistence

Hide the failure and recovery of a resource (masking failures)Failure

Hide that a resource may be shared by several competitive users while keeping it at consistent state.Concurrency

Hide that a resource is replicatedReplication

Hide that a resource may be moved to another location while in useRelocation

Hide that a resource may move to another locationMigration Hide where a resource is locatedLocation

Hide differences in data representation and how a resource is accessedAccess

DescriptionTransparency

Degree of TransparencyObservation: aiming at full distribution transparency may be too much:

– Users may be located in different continents; distribution is apparent and not something you want to hide

– Completely hiding failures of networks and nodes is(theoretically and practically) impossible.

– You cannot distinguish a slow computer from a failing one.– You can never be sure that a server actually performed an

operation before a crash– Full transparency will cost performance, exposing the distributed

nature of the system• Keeping Web caches exactly up-to-date with the master copy• Immediately flushing write operations to disk for fault tolerance

Openness of Distributed Systems Open distributed system: Extending Services

– services given according to standard rules that describes the syntax and semantics of those services,

– standardization of interfaces– interoperability– Systems should support portability of applications.

Achieving openness: At least make the distributed system independent from the heterogeneity of the underlying environment

– Hardware– Platforms– Languages

ScalabilityAt least three types of scalability:

– Size scalability: Number of users and resources – Geographical scalability: Distance between users and

resources– Administrative scalability: Number of administrative

domains

Most systems emphasize only, to a certain extent, size scalability

Today, the challenge lies in geographical and administrative scalability

Obstacles for Scalability

Examples of scalability limitations.

routing algorithms based on complete information (load on all machines and lines, run a graph theory algorithm to compute all the optimal routes on a single machine)

Centralized algorithms

A single on-line telephone book; DNS implemented as a single table.Centralized data

A single server for all usersCentralized services

ExampleConcept

Characteristics of Distributed Algorithms

1. No machine has complete information about the system state

2. Machine make decisions based only on local information

3. Failure of one machine does not ruin the algorithm

4. There is no assumption that a global clockexists

Techniques for ScalingThree techniques:1. Hiding communication latencies

• Hide communication latency in the case of geographical scalability

• use, for example, asynchronous communication2. Distribution

• Partition components into smaller parts and spread them across multiple machines

• Move computations to clients (Java applets)• Decentralized naming services (DNS, tree of domains)• Decentralized information services (WWW)

3. Replication

Hiding Communication Latencies

Reducing the communication by checking the form for syntactic errors at client side through the use of a Java applet.

Distribution of DNS

Resolving, for example, flits.cs.vu.nl

An example of dividing the DNS name space into non-overlapping zones.

Techniques for Scaling (2)

Replication: Make copies of the same data available at different machines

– Replicated file servers (mainly for fault tolerance)– Replicated databases– Mirrored Web sites– Large-scale distributed shared memory systems

Caching: allow client processes to access local copies

– Web caches (browser/Web proxy)– File caching (at server and client)

Scaling – The ProblemsObservation: Applying replication as a scaling technique is easy, however

– Having multiple copies leads to inconsistencies;– Always keeping copies consistent requires global

synchronization on each modification– Global synchronization precludes large-scale solutions

Observation: If we can tolerate inconsistencies, we may reduce the need for global synchronization.Observation: Tolerating inconsistencies is application dependent

Hardware Concepts

• Multiprocessors• Multicomputers

– homogeneous– heterogeneous (network of computers)

Multiprocessors and Multicomputers

Different basic organizations and memories in distributed computer systems

Homogeneous Multicomputer Systems

a) Gridb) Hypercube

Networks of ComputersHigh degree of node heterogeneity

– High-performance parallel systems – High-end PCs and workstations– Simple network computers – Mobile computers (palmtops, laptops)

High degree of network heterogeneity– Local-area gigabit networks– Wireless connections– Long-haul, high latency connections

Ideally, a distributed system hides these differences

Software Concepts

An overview between • DOS (Distributed Operating Systems)• NOS (Network Operating Systems)• Middleware

Provide distribution transparency

Additional layer atop of NOS implementing general-purpose servicesMiddleware

Offer local services to remote clients

Loosely-coupled operating system for heterogeneous multicomputers (LAN and WAN)

Hide and manage hardware resources

Tightly-coupled operating system for multi-processors and homogeneous multicomputers

Main GoalDescriptionSystem

Multiprocessor Operating Systems

• Distinguishing feature of multiprocessors operating systems is the support for multiple processors having access to a shared memory.

• The communication is achieved through shared memory

• Data in shared memory must be protected against concurrent access

• Protection is done through synchronization primitives– Semaphores– Monitors

Semaphores• An integer with two operations: down and up• If the value of semaphore is greater than 0, the down operation decrements the value of semaphore and continues

• If the value is 0, the calling process is blocked• The up operation checks if there are any blocked

processes that were unable to complete an earlier down operation. If so, it unblocks one of them

• Otherwise, it simply increments the semaphore value

• An unblocked process can simply continue by returning from the down operation.

Monitors (1)• A monitor is a programming language construct similar to

an object– data and methods

• However, only one process can call the monitor at a timemonitor Counter {

private:

int count = 0;

public:

int value() { return count;}

void incr () { count = count + 1; }

void decr() { count = count – 1; }

Monitors (2)A monitor that conditionally blocks a processExample: a process blocks when count is 0

monitor Counter {

private:

int count = 0;

int blocked_procs = 0;

condition unblocked;

public:

int value () { return count;}

void incr () {

if (blocked_procs == 0)

count = count + 1;

signal (unblocked);

void decr() {

if (count == 0) {

blocked_procs = blocked_procs + 1;

wait (unblocked);

blocked_procs = blocked_procs – 1;

count = count – 1;

Condition variables

Multicomputer Operating Systems (1)

General structure of a multicomputer operating system

kernel kernel kernel

Distributed operating system services

Network

Multicomputer Operating Systems (2)

Some characteristics:– kernel on different computers is generally the same– Services are generally (transparently) distributed

across computersHarder than multiprocessor OS

– Virtual (distributed) shared memory requires OS to maintain global memory map in software.

– When there is no shared (virtual) memory, processor communication through message passing

– No simple systemwide synchronization mechanisms

Network Operating System (1)

General structure of a network operating system

Network

Network OSservices

Network Operating System (2)Some characteristics:

– Each computer has its own OS with networking facilities– Computers work independently (they may even have different OS)– Services are tied to individual nodes (ftp, telnet, WWW)– Highly file oriented (basically, processors share only files)

A service that is commonly provided by network OS is to allow a user to log into another machine remotely by using a command such as

rlogin machine_name

Two clients and a server in a network operating system

Different clients may mount the servers in different places

Positioning Middleware

General structure of a distributed system as middleware

Network

Network OSservices

Middleware services

Middleware

Some characteristics:– OS on different computers need not be generally the

same– Services are generally (transparently) distributed

across computersNeed for middleware: Best of both worlds

– Scalability and openness of NOS– Transparency and related ease of use of DOS.

Middleware Services (1)Communication Services: Abandon primitive socket-based messages in favor of:

– Procedure calls across networks– Remote-object method invocation– Message-queuing systems– Advanced communication streams– Event notification service

Information system services:– Large-scale, system-wide naming services– Advanced directory services (search engines)– Location services for tracking mobile objects

Middleware Services (2)Control services: Services giving applications control over when, where, and how they access data

– Distributed transaction processing– Code migration

Security services: Services for secure processing and communication:

– Authentication and authorization services– Simple encryption services– Auditing services– Access Control

Others:– Persistent storage facilities– Data caching & replication

Middleware and Openness

In an open middleware-based distributed system, the protocols used by each middleware layer should be the same, as well as the interfaces they offer to applications.

Network OS

Middleware

Application

Network OS

Middleware

Application

Commonprotocol

sameprogramming

interface

Comparison between Systems

A comparison between multiprocessor operating systems, multicomputer operating systems, network operating systems, and middleware based distributed systems

OpenOpenClosedClosedOpenness

VariesYesModeratelyNoScalability

Per nodePer nodeGlobal, distributed

Global, centralResource management

Model specificFilesMessagesShared

memoryBasis for communication

NNN1Number of copies of OS

NoNoYesYesSame OS on all nodes

HighLowHighVery HighDegree of transparency

Multicomp.Multiproc.Middleware-based DS

Network OS

Distributed OSItem

The Client-Server Model

Characteristics: – There are processes offering services (servers)– There are processes using services (clients)– Clients and servers can be distributed across different

machines– Clients follow request/reply model with respect to

using services

Request/Reply model

General interaction between a client and a server.

Client

Server

Request

Provideservice

wait for result

Clients and ServersServers: Generally provide services related to shared resources

– Servers for file systems, databases, implementation repositories

– Servers for shared, linked documents (WWW, Lotus Notes)

– Servers for shared applications– Servers for shared distributed objects

Clients: Allow remote service access– Interface transforming client’s local service calls to

request/reply messages

Application Layering

Traditional three-layered view:– User-interface layer contains units for an

application’s user interface– Processing layer contains the functions of an

application, i.e. without specific data– Data layer contains the data that a client wants to

manipulate through processing layer(important property is that the data is persistent)

Example: Layering

The general organization of an Internet search engine into three different layers

Client-Server Architectures

• Two alternatives for distributing layers of an application:

1. Two-tiered: client/single server configuration2. Three-tiered: each layer on separate machine

Traditional Two-Tiered Configurations

Alternative client-server organizations (a) – (e).

Three-Tiered Architectures

An example of a server acting as a client.

Alternative C/S Architectures (1)Cooperating servers: Service is physically distributed across a a collection of servers:

– Traditional multi-tiered architectures– Replicated file systems– Network news services– Large-scale naming systems (DNS, X.500)– Workflow systems

Cooperating clients: Distributed applications exists by virtue of client collaboration (peer-to-peer distribution)

– Teleconferencing where each client owns a workstation– Publish/subscribe architectures in which role of client

and server is blurred

Modern Architectures

An example of horizontal distribution of a Web service.

erkay savas distributed systems cs403/534...

Documents

distributed systems fall 2011 introduction. 2 distributed...

cs403- database management systems midterm...

ntroductionto distributed systems distributed coordination

1 distributed systems distributed file systems chapter 11

1 distributed systems. 2 overview definitions advantages...

cs431 distributed systems chapter 1 characterization of...

distributed systems course distributed transactions

lecture wise file -...

database management systems cs403 power point slides...

database management systems cs403 power point slides...

cs403 - database management systems solved subjective from...

introduction to distributed systems - distributed systems...

cs403- database management systems solved subjective...

database management systems cs403 power point slides...

database management systems cs403 power point slides...

cs403- database management systems solved mcqs …€¦ ·...

debugging distributed systems - usenix · debugging...

cs-550: distributed file systems [sis]1 resource management...

november 2005distributed systems: distributed algorithms 1...

cs403- database management systems - wordpress.com ·...