distributed computing distributed computing introduction1 chapter - 1

47
Distributed computing Distributed Computing Introduction 1 Chapter - 1

Upload: muriel-gaines

Post on 18-Jan-2018

264 views

Category:

Documents


4 download

DESCRIPTION

Distributed Systems Distributed Computing Introduction3

TRANSCRIPT

Page 1: Distributed computing Distributed Computing Introduction1 Chapter - 1

Distributed computing

Distributed Computing Introduction1

Chapter - 1

Page 2: Distributed computing Distributed Computing Introduction1 Chapter - 1

Distributed system, distributed computing

Distributed Computing Introduction2

Early computing was performed on a single processor. Uni-processor or monolithical computing can be called centralized computing.

A distributed system is a collection of independent computers, interconnected via a network, capable of collaborating on a task.

Distributed computing is computing performed in a distributed system.

Page 3: Distributed computing Distributed Computing Introduction1 Chapter - 1

Distributed Systems

Distributed Computing Introduction3

T h e I n ter n et

a n etw o r k h o s t

w o r ks ta tio n s a lo c a l n etw o r k

Page 4: Distributed computing Distributed Computing Introduction1 Chapter - 1

Examples of Distributed systems

Distributed Computing Introduction4

Network of workstations (NOW): a group of networked personal workstations connected to one or more server machines.

The Internet An intranet: a network of computers and

workstations within an organization, segregated from the Internet via a protective device (a firewall).

Page 5: Distributed computing Distributed Computing Introduction1 Chapter - 1

Computers in a Distributed System

Distributed Computing Introduction5

Workstations: computers used by end-users to perform computing

Server machines: computers which provide resources and services

Personal Assistance Devices: handheld computers connected to the system via a wireless communication link.

Page 6: Distributed computing Distributed Computing Introduction1 Chapter - 1

The Weaknesses and Strengths of Distributed Computing

Distributed Computing Introduction6

In any form of computing, there is always a tradeoff in advantages and disadvantages

Some of the reasons for the popularity of distributed computing :

The affordability of computers and availability of network access

Resource sharing Scalability Fault Tolerance

Page 7: Distributed computing Distributed Computing Introduction1 Chapter - 1

The Weaknesses and Strengths of Distributed Computing

Distributed Computing Introduction7

The disadvantages of distributed computing: Multiple Points of Failures: the failure of

one or more participating computers, or one or more network links, can spell trouble.

Security Concerns: In a distributed system, there are more opportunities for unauthorized attack.

Page 8: Distributed computing Distributed Computing Introduction1 Chapter - 1

Introductory Basics

M. L. Liu

Distributed Computing Introduction8

Page 9: Distributed computing Distributed Computing Introduction1 Chapter - 1

Basics in three areas

Distributed Computing Introduction9

Some of the notations and concepts from these areas will be employed from time to time in the presentations for this course: Software engineering Operating systems Networks.

Page 10: Distributed computing Distributed Computing Introduction1 Chapter - 1

Operating Systems Basics

Distributed Computing Introduction10

Page 11: Distributed computing Distributed Computing Introduction1 Chapter - 1

Operating systems basics

Distributed Computing Introduction11

A process consists of an executing program, its current values, state information, and the resources used by the operating system to manage its execution.

A program is an artifact constructed by a software developer; a process is a dynamic entity which exists only when a program is run.

Page 12: Distributed computing Distributed Computing Introduction1 Chapter - 1

Process State Transition Diagram

Distributed Computing Introduction12

S im plif e d f in ite s ta te dia g ra m fo r a pro ce s s 's l if e t im e

s ta rt

re a dyru n n in g

blo ck e d

te rm in a te d

d is p atc h

q u eu ed

ev en t c o m p letio n w aitin gfo r ev en t

ex it

Page 13: Distributed computing Distributed Computing Introduction1 Chapter - 1

Java processes

Distributed Computing Introduction13

There are three types of Java program: applications, applets, and servlets, all are written as a class. A Java application program has a main method, and

is run as an independent(standalone) process. An applet does not have a main method, and is run

using a browser or the appletviewer. A servlet does not have a main method, and is run

in the context of a web server. A Java program is compiled into bytecode, a

universal object code. When run, the bytecode is interpreted by the Java Virtual Machine (JVM).

Page 14: Distributed computing Distributed Computing Introduction1 Chapter - 1

Three Types of Java programs

Distributed Computing Introduction14

Applicationsa program whose byte code can be run on any system which has a Java Virtual Machine. An application may be standalone (monolithic) or distributed (if it interacts with another process).

AppletsA program whose byte code is downloaded from a remote machine and is run in the browser’s Java Virtual Machine.

ServletsA program whose byte code resides on a remote machine and is run at the request of an HTTP client (a browser).

Page 15: Distributed computing Distributed Computing Introduction1 Chapter - 1

Concurrent Processing

Distributed Computing Introduction15

On modern day operating systems, multiple processes appear to be executing concurrently on a machine by timesharing resources.

Pro ce s s e s

t im e

P 1P 2

P 3P 4

T im es h ar in g o f a r es o u rc e

Page 16: Distributed computing Distributed Computing Introduction1 Chapter - 1

Concurrent processing within a process

Distributed Computing Introduction16

It is often useful for a process to have parallel threads of execution, each of which timeshare the system resources in much the same way as concurrent processes.

p a r en t p r o c es s

c h ild p r o c es s es

A pa re n t pro ce s s m a y s pa wn ch ild pro ce s s e s .

a p ro c es s

m ain th r ead

c h ild th r ead 1

c h ild th r ead 2

A pro ce s s m a y s pa wn ch ild th re a ds

C o n cu rre n t pro ce s s in g with in a pro ce s s

Page 17: Distributed computing Distributed Computing Introduction1 Chapter - 1

Thread-safe Programming

Distributed Computing Introduction17

When two threads independently access and update the same data object, such as a counter, as part of their code, the updating needs to be synchronized. (See next slide.)

Because the threads are executed concurrently, it is possible for one of the updates to be overwritten by the other due to the sequencing of the two sets of machine instructions executed in behalf of the two threads.

To protect against the possibility, a synchronized method can be used to provide mutual exclusion.

Page 18: Distributed computing Distributed Computing Introduction1 Chapter - 1

Race Condition

Distributed Computing Introduction18

fe tch va lue in cou n te r a nd lo ad in to a reg iste r

in cre me n t va lu e in re g is te r

s to re va lue in reg is te r to cou n te r

t im e

fe tch va lue in cou n te r a nd lo ad in to a reg iste r

in cre m e n t va lu e in re g is te r

s to re va lu e in reg is te r to cou n te r

in s t r u c t io n e x ec u ted in c o n c u r r en t p r o c es s o r th r ead 1

in s tr u c tio n ex ec u ted in c o n c u r ren t p r o c es s o r th r ead 2

This e xe c utio n re s ul ts in the value 2 in the c o unte r

fe tch va lu e in cou n te r an d lo ad in to a re g is te r

fe tch va lu e in cou n te r an d lo ad in to a re g is te r

in cre me n t va lue in reg iste r

in cre m e n t va lu e in re g is te r

sto re va lue in reg is te r to cou n te r

s to re va lu e in reg is te r to co un te r

This e xe c utio n re s ults in the value 1 in the c o unte r

Page 19: Distributed computing Distributed Computing Introduction1 Chapter - 1

Distributed Computing Introduction19

Network Basics

Page 20: Distributed computing Distributed Computing Introduction1 Chapter - 1

Network standards and protocols

Distributed Computing Introduction20

On public networks such as the Internet, it is necessary for a common set of rules to be specified for the exchange of data.

Such rules, called protocols, specify such matters as the formatting and semantics of data, flow control, error correction.

Software can share data over the network using network software which supports a common set of protocols.

Page 21: Distributed computing Distributed Computing Introduction1 Chapter - 1

Protocols

Distributed Computing Introduction21

In the context of communications, a protocol is a set of rules that must be observed by the participants.

In communications involving computers, protocols must be formally defined and precisely implemented. For each protocol, there must be rules that specify the followings: How is the data exchanged encoded? How are events (sending , receiving)

synchronized so that the participants can send and receive in a coordinated order?

The specification of a protocol does not dictate how the rules are to be implemented.

Page 22: Distributed computing Distributed Computing Introduction1 Chapter - 1

The network architecture

Distributed Computing Introduction22

Network hardware transfers electronic signals,which represent a bit stream, between two devices.

Modern day network applications require an application programming interface (API) which masks the underlying complexities of data transmission.

A layered network architecture allows the functionalities needed to mask the complexities to be provided incrementally, layer by layer.

Actual implementation of the functionalities may not be clearly divided by layer.

Page 23: Distributed computing Distributed Computing Introduction1 Chapter - 1

The OSI seven-layer network architecture

Distributed Computing Introduction23

ap p lic a t io n lay er

p r es en ta tio n lay er

s es s io n lay e r

tr an s p o r t lay er

n e tw o rk lay er

d a ta lin k lay er

p h y s ic a l lay e r

ap p lic a t io n lay er

p r es en ta t io n lay er

s es s io n lay er

tr an s p o r t lay er

n e tw o r k lay er

d a ta lin k lay er

p h y s ic a l lay er

Page 24: Distributed computing Distributed Computing Introduction1 Chapter - 1

Network Architecture

Distributed Computing Introduction24

The division of the layers is conceptual: the implementation of the functionalities need not be clearly divided as such in the hardware and software that implements the architecture. The conceptual division serves at least two useful purposes :1. Systematic specification of protocols

it allows protocols to be specified systematically 2. Conceptual Data Flow: it allows programs to

be written in terms of logical data flow.

Page 25: Distributed computing Distributed Computing Introduction1 Chapter - 1

The TCP/IP Protocol Suite

Distributed Computing Introduction25

The Transmission Control Protocol/Internet Protocol suite is a set of network protocols which supports a four-layer network architecture.

It is currently the protocol suite employed on the Internet.

Ap p lic a tio n lay er

T r an s p o r t lay er

I n ter n et lay er

P h y s ic a l lay er

Ap p lic a tio n lay er

T r an s p o r t lay er

In ter n e t lay er

P h y s ic a l lay er

Th e I n te rn e t n e t wo rk a rch ite ctu re

Page 26: Distributed computing Distributed Computing Introduction1 Chapter - 1

Network Resources

Distributed Computing Introduction26

Network resources are resources available to the participants of a distributed computing community.

Network resources include hardware such as computers and equipment, and software such as processes, email mailboxes, files, web documents.

An important class of network resources is network services such as the World Wide Web and file transfer (FTP), which are provided by specific processes running on computers.

Page 27: Distributed computing Distributed Computing Introduction1 Chapter - 1

Identification of Network Resources

Distributed Computing Introduction27

One of the key challenges in distributed computing is the unique identification of resources available on the network, such as e-mail mailboxes, and web documents. Addressing an Internet Host Addressing a process running on a host Email Addresses Addressing web contents: URL

Page 28: Distributed computing Distributed Computing Introduction1 Chapter - 1

Distributed Computing Introduction28

Addressing an Internet Host

Page 29: Distributed computing Distributed Computing Introduction1 Chapter - 1

The Internet Topology

Distributed Computing Introduction29

s u b n e ts

T h e I n te r n e t b ac k b o n e

an I n te r n e t h o s t

Th e I n te rn e t Topo log y M o de l

Page 30: Distributed computing Distributed Computing Introduction1 Chapter - 1

The Internet Topology

Distributed Computing Introduction30

The internet consists of an hierarchy of networks, interconnected via a network backbone.

Each network has a unique network address. Computers, or hosts, are connected to a

network. Each host has a unique ID within its network.

Each process running on a host is associated with zero or more ports. A port is a logical entity for data transmission.

Page 31: Distributed computing Distributed Computing Introduction1 Chapter - 1

The Internet addressing scheme

Distributed Computing Introduction31

In IP version 4, each address is 32 bit long. The address space accommodates 232 (4.3 billion) addresses in

total. Addresses are divided into 5 classes (A through E)

0

1 0

1 1

11

0

01 1 1

111 1 1 01

n etw o r k ad d r es s

ho s t p o r tionm ul t i c as t g r o up

r e s e r ve d

b y te 0 b y te 1 b y te 2 by te 3

c l as s A addr e ss

c l as s B addr e s s

c l as s C addr e s s

m ul tc as t addr e s s

r e s e r ve d addr e ss reserved

Page 32: Distributed computing Distributed Computing Introduction1 Chapter - 1

The Internet addressing scheme - 2

Distributed Computing Introduction32

1 0 n e tw o r k ad d res s h o s t p o r t io n

b y te 0 b y te 1 b y te 2 b y te 3

c l as s B addr e s s

s u b n e t ad d r es s lo c a l h o s t ad d r es s

S u b d iv id in g th e h o s t p o rtio n o f a n I n te rn e t a d d re s s :

A c las s A/C addre s s s pac e c anals o be s im ilar ly s ubdivide d..

W hic h po r tio n o f the ho s t addre s sis us e d fo r the s ubne t ide nti f ic at io nis de te rm ine d by a s ubne t m as k.

Page 33: Distributed computing Distributed Computing Introduction1 Chapter - 1

The Domain Name System (DNS)

Distributed Computing Introduction33

For user friendliness, each Internet address is mapped to a symbolic name, using the DNS, in the format of:

<computer-name>.<subdomain hierarchy>.<organization>.<sector name>{.<country code>} e.g., www.csc.calpoly.edu.us

ro o t

co me du go v n e t org m il

o rg a n iza t io n

...

...

h o s t n a m e

to p- le v e l do m a in

s u bdo m a in

in th e U.S .

To p- le v e l do m ain n am e h a s to be a pplie d fo r.S u bdom a in h ie ra ch y a n d n a m e s a re a s s ig n e d by th e o rg a n iza t io n .

cou n try co de

Page 34: Distributed computing Distributed Computing Introduction1 Chapter - 1

The Domain Name System

Distributed Computing Introduction34

For network applications, a domain name must be mapped to its corresponding Internet address.

Processes known as domain name system servers provide the mapping service, based on a distributed database of the mapping scheme.

The mapping service is offered by thousands of DNS servers on the Internet, each responsible for a portion of the name space, called a zone. The servers that have access to the DNS information (zone file) for a zone is said to have authority for that zone.

Page 35: Distributed computing Distributed Computing Introduction1 Chapter - 1

Top-level Domain Names

Distributed Computing Introduction35

.com: For commercial entities, which anyone, anywhere in the world, can register.

.net : Originally designated for organizations directly involved in Internet operations. It is increasingly being used by businesses when the desired name under "com" is already registered by another organization. Today anyone can register a name in the Net domain.

.org: For miscellaneous organizations, including non-profits.

.edu: For four-year accredited institutions of higher learning.

.gov: For US Federal Government entities .mil: For US military Country Codes :For individual countries based on the

International Standards Organization. For example, ca for Canada, and jp for Japan.

Page 36: Distributed computing Distributed Computing Introduction1 Chapter - 1

Domain Name Hierarchy

Distributed Computing Introduction36

. au . c a .u s . zw .c o m .g o v . ed u .m il . n e t. . .. . . . . . . o r g

c o u n try c o d e

u c s b .ed u c a lp o ly .ed u

c s c ee. . . en g lis h w ir ele s s. . .c s ec e. . . . . .

. . . . . .

. ( r o o t d o m ain )

Page 37: Distributed computing Distributed Computing Introduction1 Chapter - 1

Name lookup and resolution

Distributed Computing Introduction37

If a domain name is used to address a host, its corresponding IP address must be obtained for the lower-layer network software.

The mapping, or name resolution, must be maintained in some registry.

For runtime name resolution, a network service is needed; a protocol must be defined for the naming scheme and for the service. Example: The DNS service supports the DNS; the Java RMI registry supports RMI object lookup; JNDI is a network service lookup protocol.

Page 38: Distributed computing Distributed Computing Introduction1 Chapter - 1

Addressing a process running on a host

Distributed Computing Introduction38

Page 39: Distributed computing Distributed Computing Introduction1 Chapter - 1

Logical Ports

Distributed Computing Introduction39

...

p r o c es s

p o r t

...h o s t A

h o s t B

Th e I n te rn e t

Ea ch h o s t h a s 6 55 3 6 po rt s .

Page 40: Distributed computing Distributed Computing Introduction1 Chapter - 1

Well Known Ports

Distributed Computing Introduction40

Each Internet host has 216 (65,535) logical ports. Each port is identified by a number between 1 and 65535, and can be allocated to a particular process.

Port numbers beween 1 and 1023 are reserved for processes which provide well-known services such as finger, FTP, HTTP, and email.

Page 41: Distributed computing Distributed Computing Introduction1 Chapter - 1

Well-known ports

Distributed Computing Introduction41

Pro t o co l Po rt S e rv ice

echo 7 IPC te sting

da ytime 13 pro vid es th e cu rre n t d a te an d time

ftp 21 file tra ns fe r p ro toco l

te lne t 23 remote, c ommand- line termina l s es s ion

smtp 25 simp le ma il tra ns fe r p ro to co l

time 37 pro vid es a sta nd a rd time

fin ge r 79 pro v id es in fo rma tio n a b ou t a use r

h ttp 80 we b se rve r

R MI R e g is try 1 09 9 re g is try fo r R e m o te Me tho d In voca tion

sp ec ia l we b se rve r 8 08 0 web se rve r wh ich sup p o rtsse rvle ts , JSP, o r ASP

A ssign m en t o f so m e w ell-kn ow n p o rts

Page 42: Distributed computing Distributed Computing Introduction1 Chapter - 1

Choosing a port to run your program

Distributed Computing Introduction42

For our programming exercises: when a port is needed, choose a random number above the well known ports: 1,024- 65,535.

If you are providing a network service for the community, then arrange to have a port assigned to and reserved for your service.

Page 43: Distributed computing Distributed Computing Introduction1 Chapter - 1

Distributed Computing Introduction43

Addressing a Web Document

Page 44: Distributed computing Distributed Computing Introduction1 Chapter - 1

The Uniform Resource Identifier (URI)

Distributed Computing Introduction44

Resources to be shared on a network need to be uniquely identifiable.

On the Internet, a URI is a character string which allows a resource to be located.

There are two types of URIs: URL (Uniform Resource Locator) points to a

specific resource at a specific location URN (Uniform Resource Name) points to a

specific resource at a nonspecific location.

Page 45: Distributed computing Distributed Computing Introduction1 Chapter - 1

URL

Distributed Computing Introduction45

A URL has the format of: protocol://host address[:port]/directory path/file

name#section

A s am pl e U R L :http ://w w w .c s c .c alp o ly.ed u :8080/~ m liu /C S C 369/hw .htm l # hw 1

p r o to c o l o f s e rv er

h o s t n am e

p o r t n u m b e r o f s e rv er p r o c es s

d ir ec to r y p a th

f ile n am e

s ec tio n n am e

O th e r p ro to co ls th a t ca n a p p e a r in a U R L a re : fi le

ftp g o p h e r n e w s te ln e t W AIS

Page 46: Distributed computing Distributed Computing Introduction1 Chapter - 1

Software Engineering Basics

Distributed Computing Introduction46

Page 47: Distributed computing Distributed Computing Introduction1 Chapter - 1

The Architecture of Distributed Applications

Distributed Computing Introduction47

Pre s e n ta t io n

A pplica t io n (B u s in e s s ) lo g ic

S e rv ice s