communication in distributed systems. communication in distributed systems based on low level...

Communication in Distributed Systems

Upload: tamsyn-fowler

Post on 26-Dec-2015


Communication in Distributed Systems


Communication in distributed systems is based on the low-level message passing offered by the underlying network.

Three popular models of communication:
• Remote Procedure Calls (RPC)
• Message-oriented Middleware (MOM)
• Data Streaming, with a discussion of SCTP and 3GPP

Sending data to multiple receivers (multicasting)


Interprocess Communication is part of the Client/Server Computing Model

A client is an application or a process that requests a service from some other application or process.

A server is an application or a process that responds to a client request.

Both client/server applications perform specialized functions in a distributed computing environment.

Communication between processes on the same machine (homogeneous systems) can use local mechanisms such as named pipes, named queues, and shared memory.

Focus of this presentation is Interprocess Communication


In between the end users and large pool of computing resources, many applications act as both a client and a server, depending on the situation.

Interprocess communication is necessary to communicate between heterogeneous systems

Protocols govern the format, contents and meaning of messages


Two main types of Protocols:

Connection-oriented: Sender and receiver establish a connection and negotiate the protocol to use before exchanging data. At the end of communication, they terminate or release the connection. An example: TCP

Connectionless: No connection is set up in advance. Sender transmits message when ready. An example: UDP


Main differences between connection-oriented and connectionless protocols:

1. A connection-oriented protocol such as TCP is reliable; a connectionless protocol such as UDP is not.
2. A connection-oriented protocol is full duplex.
3. A connection-oriented protocol offers a byte-stream service with no structure.
4. Error checking is an essential component when using an unreliable connectionless protocol.
5. A connectionless protocol transfers packets of data and uses a protocol port address to specify the receiver process.
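The connectionless case can be illustrated with a minimal sketch using Python's standard socket module. The function name `udp_round_trip` and the use of the loopback address are assumptions made for the example; no connection is set up, and the datagram is simply addressed to an (IP address, port) pair.

```python
import socket

def udp_round_trip(payload: bytes) -> bytes:
    """Send one datagram to a receiver on the loopback interface and return what arrived."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    receiver.settimeout(2.0)             # UDP may drop data, so never wait forever
    address = receiver.getsockname()     # (IP address, protocol port) of the receiver

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No connection setup: the datagram is addressed by (IP, port) and sent at once.
        sender.sendto(payload, address)
        data, _source = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()
```

The timeout stands in for the error handling that an application must supply itself when the protocol gives no delivery guarantee.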


Lower-Level Protocols
• Implemented in the physical layer and data link layer of the stack.
• The data link layer groups bits into frames, marks the start and end of each frame with a special bit pattern, and adds a checksum for error detection.
• The network layer chooses the best path from sender to receiver by routing.

Transport Protocols
• TCP
• UDP

Higher-Level Protocols
• FTP
• HTTP


Middleware Protocols

Examples are authentication and authorization protocols, and commit protocols in transaction databases.

Middleware protocols support high-level communication services:
• Protocols that allow a process to call a procedure or invoke an object on a remote machine transparently. Examples: RPC and RMI
• Protocols that support the setting up and synchronizing of streams for transferring real-time data, such as in multimedia applications. An example: SCTP
• Protocols that support reliable multicast services across a WAN


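Figure: Protocol Stack. Peer layers on two hosts communicate across the network using the protocol of their own layer.

Layer  Name         Protocol
6      Application  Application Protocol
5      Middleware   Middleware Protocol
4      Transport    Transport Protocol
3      Network      Network Protocol
2      Data Link    Data Link Protocol
1      Physical     Physical Protocol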


Types of Communication

• Persistent communication: Message submitted for transmission stored by communication middleware as long as it takes to deliver it to the receiver. Neither sending application nor receiving application need to be executing.

• Transient communication: Message stored by communication system only as long as sending and receiving application are executing.

• Asynchronous communication: Sender continues immediately after submitting message for transmission. Message temporarily stored by middleware on submission.

• Synchronous communication: Sender blocks until message received and processed and receiver returns acknowledgement.


Remote Procedure Calls

• Remote procedure calls can simplify the way IPC is conducted through the network.
• Client applications use the client-side network API to call RPCs.
• The server side of the network API turns the RPCs into local function calls.
• The return value is transmitted back to the client application, again through the network.
• The OSI layers are transparent to the client application, as if it were making a local function call.
• The RPC provided by Windows enables applications that use RPC to communicate with applications running on other operating systems that support DCE (Distributed Computing Environment). RPC automatically supports data conversion to account for different hardware architectures and for byte ordering between dissimilar environments.


Synchronous RPC Operation:

When a process on machine A calls a procedure on machine B, the calling process on A is suspended and execution of the called procedure takes place on B.

Information is transported from caller to callee in the parameters and comes back in the procedure result.

• No message passing is visible to the programmer.
• Client and server stubs are used.
• The client stub takes its parameters, packs them into a message (parameter marshalling), and sends the message to the server stub.


Parameter Passing: Passing Value Parameters

• Client stub takes parameters and puts them in the message. It also puts the name or number of the procedure to be called in the message

• When message arrives at server, server stub examines the message to see which procedure is needed and then makes appropriate call. Stub takes the result and packs it into a message. Message is sent back to client stub.

• Client stub unpacks the message to extract the result and returns it to waiting client procedure
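The three steps above can be sketched in a few lines. This is only an illustration under stated assumptions: JSON is used as the message format, the procedure table `PROCEDURES` and its `add` entry are hypothetical, and a direct function call stands in for the network transfer between the stubs.

```python
import json

# Hypothetical server-side procedures, registered by name.
PROCEDURES = {"add": lambda a, b: a + b}

def server_stub(request: bytes) -> bytes:
    """Unpack the request, call the named procedure, and pack the result."""
    message = json.loads(request.decode("utf-8"))
    result = PROCEDURES[message["procedure"]](*message["params"])
    return json.dumps({"result": result}).encode("utf-8")

def client_stub(procedure: str, *params):
    """Marshal the call into a message, 'send' it, and unmarshal the reply."""
    request = json.dumps({"procedure": procedure, "params": list(params)}).encode("utf-8")
    reply = server_stub(request)  # stands in for the network transfer
    return json.loads(reply.decode("utf-8"))["result"]
```

To the caller, `client_stub("add", 2, 3)` looks like an ordinary local call; the message passing stays hidden inside the stubs.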


Passing Reference Parameters

• Pointers and references are passed by copying the data structure they refer to, such as an array, into the message that is sent to the server

• Server stub calls server with a pointer to this array

• Server makes changes using this pointer, which also affect the message buffer inside the server stub

• When the server finishes its work, the original message is sent back to the client stub, which copies it back to the client. This is similar to copy/restore.
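A minimal sketch of this copy/restore behaviour follows. The names `server_procedure`, `server_stub` and `client_stub` are hypothetical, JSON is assumed as the message format, and a direct call replaces the network.

```python
import json

def server_procedure(values: list) -> None:
    """Hypothetical remote procedure: modifies the array it is pointed at, in place."""
    for i in range(len(values)):
        values[i] *= 2

def server_stub(request: bytes) -> bytes:
    buffer = json.loads(request.decode("utf-8"))   # array copied out of the message
    server_procedure(buffer)                       # server works on the stub's buffer
    return json.dumps(buffer).encode("utf-8")      # modified buffer is sent back

def client_stub(values: list) -> None:
    request = json.dumps(values).encode("utf-8")   # copy: array goes into the message
    reply = server_stub(request)                   # stands in for the network
    values[:] = json.loads(reply.decode("utf-8"))  # restore: overwrite the caller's array
```

The caller's list changes only when the reply is copied back, which is the observable difference from true call-by-reference.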


Asynchronous RPC:

Client continues immediately after issuing the RPC request and receiving an acknowledgement from the server; it is not blocked while the procedure executes.

Server sends a reply or acknowledgement to the client the moment the RPC request is received.

Server then calls the requested procedure.

One-way RPC: Client does not wait even for an acknowledgement from the server. Reliability is not guaranteed, as the client has no acknowledgement from the server.

Deferred synchronous RPC is a combination of two asynchronous RPCs: one carries the request and the other returns the result. In a variant, the client polls the server periodically to see whether results are available yet, rather than the server calling back the client.
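The polling variant can be imitated locally with Python's `concurrent.futures`. This is a sketch, not an RPC system: a worker thread plays the server, the returned `Future` plays the role of the acknowledgement, and `slow_remote_procedure` is a hypothetical stand-in for the remote work.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def slow_remote_procedure(x: int) -> int:
    """Hypothetical remote procedure that takes a while to finish."""
    time.sleep(0.05)
    return x * x

def deferred_call(executor: ThreadPoolExecutor, x: int):
    """Issue the request and return at once; the Future acts as the acknowledgement."""
    return executor.submit(slow_remote_procedure, x)

def poll_for_result(future, interval: float = 0.01) -> int:
    """Poll until the result is available, as in the polling variant."""
    while not future.done():
        time.sleep(interval)   # the client could do other useful work here
    return future.result()
```

A callback-style variant would instead register a handler with `future.add_done_callback`, which corresponds to the server calling the client back.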


Message-Oriented Communication

RPC assumes that receiver process is executing at the time a request is issued

Message-oriented communication such as message-queuing systems is therefore needed to allow processes to exchange information even if one party is not executing at the time the communication is initiated.

A simple message-oriented model is offered by the transport layer and can be used as part of a middleware solution


Berkeley Sockets

• The sockets interface was introduced in the early 1980s in Berkeley UNIX (4.2BSD).
• It standardizes the interface of the transport layer so that programmers can use messaging protocols through a simple set of primitives.
• Another interface is XTI, the X/Open Transport Interface, formerly called the Transport Layer Interface (TLI) and developed by AT&T.
• Sockets and XTI are similar in their model of network programming but differ in their set of primitives.
• A socket forms an abstraction over a communication end point to which an application can write data to be sent over the underlying network, and from which incoming data can be read.
• Servers generally execute the first four primitives in the table of socket primitives (socket, bind, listen, accept).
• When calling the socket primitive, the caller creates a new communication end point for a specific transport protocol. Internally, the OS reserves resources to accommodate sending and receiving messages for the specified protocol.


• Bind primitive associates a local address with a newly-created socket. For example, server should bind the IP address of its machine together with a known port # to a socket

• Binding tells the OS that the server wants to receive messages only on the specified address and port

• Listen primitive is called only in the case of connection-oriented communication. It is a nonblocking call that allows the local OS to reserve enough buffers for a specified # of connections that the caller is willing to accept

• A call to accept primitive blocks the caller until a connection request arrives. When the request arrives, the local OS creates a new socket with same properties as the original one and returns it to caller. Server can wait for another connection request on the original socket

• Connect primitive on client requires that the caller specifies the transport level address to which a connection request is to be sent

• Client is blocked until a connection is set up, after which info can be exchanged using send/receive

• Closing the connection is symmetric as both server and client call the close primitive


Primitive  Meaning
Socket     Create a new communication end point
Bind       Attach a local address to a socket
Listen     Announce willingness to accept connections
Accept     Block caller until connection request arrives
Connect    Actively attempt to establish a connection
Send       Send some data over the connection
Receive    Receive some data over the connection
Close      Release the connection
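The socket primitives map closely onto Python's standard socket module. The sketch below runs a one-shot server and a client in the same process over the loopback interface; the function names and the upper-casing reply are assumptions made for the example, and each call is annotated with the primitive it corresponds to.

```python
import socket
import threading

def serve_once(listener: socket.socket) -> None:
    """Accept one connection and reply with the upper-cased request."""
    connection, _peer = listener.accept()      # accept: blocks until a request arrives
    with connection:
        data = connection.recv(1024)           # receive
        connection.sendall(data.upper())       # send
    # leaving the 'with' block closes the per-connection socket (close)

def echo_via_tcp(message: bytes) -> bytes:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket
    listener.bind(("127.0.0.1", 0))                               # bind (port 0 = any free port)
    listener.listen(1)                                            # listen
    listener.settimeout(2.0)                                      # do not wait forever in accept
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)    # socket
    client.settimeout(2.0)
    try:
        client.connect(listener.getsockname())                    # connect
        client.sendall(message)                                   # send
        reply = client.recv(1024)                                 # receive
    finally:
        client.close()                                            # close
        server.join()
        listener.close()
    return reply
```

Note that `accept` returns a new socket for the connection, so the original listening socket stays free to wait for further requests.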


Message-Passing Interface

A standard defined for message passing

Hardware and platform independent

Designed for parallel applications and transient communication.

Makes use of underlying network


• Assumes that serious failures such as process crashes or network partitions are fatal and do not require automatic recovery

• MPI assumes communication takes place within a known group of processes

• Each group is assigned an identifier

• Each process within a group is also assigned a local identifier

• A (groupID, processID) pair uniquely identifies the source or destination of a message, and is used instead of a transport-level address

• Several possibly overlapping groups of processes involved in a computation, executing at the same time

• MPI has messaging primitives to support transient communication shown in next table


Primitive      Meaning
MPI_bsend      Append outgoing message to a local send buffer
MPI_send       Send a message and wait until copied to local or remote buffer
MPI_ssend      Send a message and wait until receipt starts
MPI_sendrecv   Send a message and wait for reply
MPI_isend      Pass reference to outgoing message, and continue
MPI_issend     Pass reference to outgoing message, and wait until receipt starts
MPI_recv       Receive a message; block if there is none
MPI_irecv      Check if there is an incoming message, but do not block


• Transient asynchronous communication is supported by MPI_bsend primitive

• Sender submits a message for transmission which is copied to a local buffer in MPI runtime system. Sender continues after message is copied. Local MPI runtime system will remove the message from its local buffer and transmit as soon as a receiver has called a receive primitive

• MPI_send is a blocking send operation with implementation dependent semantics

• It may either block the caller until the specified message has been copied to the MPI runtime system at the sender’s side, or until the receiver has initiated a receive operation

• MPI_ssend implements synchronous communication by which the sender blocks until its request is accepted for further processing

• MPI_sendrecv when called by sender, sends a request to the receiver and blocks until the latter returns a reply. This corresponds to normal RPC.


• MPI_isend allows the sender to pass a pointer to the message and the MPI runtime system takes care of the communication. Sender continues.

• MPI_issend allows the sender to pass a pointer to MPI runtime system. When runtime system indicates it has processed the message, sender knows that receiver has accepted the message.

• Caller is blocked until message arrives when MPI_recv is called to receive a message

• Asynchronous variant MPI_irecv called by receiver indicates it is prepared to accept message
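The difference between a buffered send and a synchronous send can be imitated with threads. The sketch below is a toy simulation and does not use a real MPI library; the `Channel` class and its `bsend`, `ssend` and `recv` methods are hypothetical names that only mirror the behaviour described for the corresponding primitives.

```python
import queue
import threading

class Channel:
    """Toy stand-in for an MPI runtime connecting one sender to one receiver."""

    def __init__(self) -> None:
        self._buffer: queue.Queue = queue.Queue()

    def bsend(self, message) -> None:
        """Buffered send: copy into the local buffer and return immediately."""
        self._buffer.put((message, None))

    def ssend(self, message, timeout: float = 2.0) -> bool:
        """Synchronous send: block until the receiver has taken the message."""
        received = threading.Event()
        self._buffer.put((message, received))
        return received.wait(timeout)

    def recv(self, timeout: float = 2.0):
        """Blocking receive: wait until a message is available."""
        message, received = self._buffer.get(timeout=timeout)
        if received is not None:
            received.set()   # tell a waiting ssend that receipt has started
        return message
```

With this model, `bsend` returns even when no receiver is running, whereas `ssend` returns `True` only once some thread has called `recv`.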


Message-Oriented Persistent Communication

Message Oriented Middleware(MOM) or message queuing systems provide support for persistent asynchronous communication

Intermediate-term storage capacity for messages. Does not require sender/receiver to be active during message transmission

Slower than Berkeley sockets and MPI


Message-Queuing Model

Applications communicate by inserting messages in specific queues

Messages are forwarded over a series of communication servers and delivered to the destination, even if it was down when message was sent

Each application has its own private queue to which other applications can send messages


Sender is informed that the message will be eventually inserted in the receiver’s queue. No time is specified.

Neither sender nor receiver need to be executing when message is placed in the queue

Loosely coupled communication with sender and receiver executing independent of each other.


Primitive  Meaning
Put        Append a message to a specified queue. A nonblocking call by sender.
Get        Block until specified queue is non-empty and remove the first message
Poll       Check a specified queue for messages and remove the first. Never block.
Notify     Install a handler as a callback function to be called when a message is put into a specified queue
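The four primitives can be sketched with an in-memory queue. This is an assumption-laden toy: the `MessageQueue` class is hypothetical, it lives in a single process, and it does not provide the persistent storage that a real message-queuing system offers.

```python
import queue
from typing import Callable, Optional

class MessageQueue:
    """Toy in-memory queue exposing the put, get, poll and notify primitives."""

    def __init__(self) -> None:
        self._messages: queue.Queue = queue.Queue()
        self._handler: Optional[Callable[["MessageQueue"], None]] = None

    def put(self, message) -> None:
        """Append a message; never blocks the sender."""
        self._messages.put(message)
        if self._handler is not None:
            self._handler(self)        # callback installed by notify

    def get(self, timeout: Optional[float] = None):
        """Block until the queue is non-empty, then remove and return the first message."""
        return self._messages.get(timeout=timeout)

    def poll(self):
        """Remove and return the first message, or None if the queue is empty."""
        try:
            return self._messages.get_nowait()
        except queue.Empty:
            return None

    def notify(self, handler: Callable[["MessageQueue"], None]) -> None:
        """Install a handler that is called whenever a message is put into the queue."""
        self._handler = handler
```

The handler receives the queue itself, so it can decide whether to remove the message with `poll` or leave it for a later `get`.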


Architecture of Message Queuing System

Source queue to put message is local to sender

Message can be read only from local queue

Message put into a queue will contain the specification of a destination queue to which it should be transferred

Message queuing system responsible to provide the queues to sender/receiver and transfer messages from source to destination queue.


Message queuing system maintains mapping of queues distributed across multiple machines to network locations – a distributed database of queue names to network locations, similar to DNS.

Queues managed by Queue Managers, who interact with the application that is sending or receiving a message. Special queue managers operate as routers/relays that forward incoming messages to other queue managers


Message-queuing system may grow into a complete application level overlay network on top of existing computer network

Relays/routers help build scalable messaging systems

Relays allow secondary processing of messages, for example, a message may need to be logged for reasons of fault tolerance or security

Relays can be used for multicasting purposes. Incoming message put into each send queue.


E-mail Systems

• E-mail systems use underlying transport services. For example, in SMTP, the mail protocol for the Internet, a message is transferred by setting up a direct TCP connection to the destination mail server. Generally no routing is used.

• Provide direct support for end users when compared to message-queuing system

• E-mail systems have specific requirements such as automatic message filtering, support for advanced messaging databases to retrieve old messages etc

• Message queuing system enables persistent communication between processes. Wide range of applications including e-mail.


Stream-oriented Communication

The communication discussed so far dealt with independent, complete units of information, and time had no effect on the correctness of the communication.

Stream communication, such as an audio or video stream, is by contrast time-dependent.

Information is represented in different formats: for example, images use formats such as GIF or JPEG, and audio streams are encoded by taking 16-bit samples using PCM.


In continuous representation media, the temporal relationship between data items is retained in order to interpret the data correctly. For example, motion can be represented by a series of images, with successive images displayed at a uniform spacing T in time (about 30-40 msec per image).

In discrete representation media, temporal relationships between data items are not fundamental to interpreting the data correctly. Examples are text, still images, etc.


A data stream is a sequence of data units and can be applied to discrete as well as continuous media. UNIX pipes and TCP/IP connections are examples of byte-oriented discrete data streams. Playing an audio file requires setting up a continuous data stream between the file and the audio device.

Time is crucial to continuous data streams


Three types of transmission modes exist for data streams

Asynchronous transmission mode: data items are transmitted one after the other, but there are no time constraints on when the transmission of each item takes place. Example for discrete data streams: file transmission.

Synchronous transmission mode: a maximum end-to-end delay is defined for each data unit in the stream. Example: temperature sampled by sensors and passed over a network.

Isochronous transmission mode: the time constraint is rigid, and data units are transferred subject to both a maximum and a minimum end-to-end delay (bounded delay jitter). Examples are distributed multimedia systems such as audio/video.
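The three modes differ only in which delay bounds apply, which a small check makes concrete. The function name `meets_constraints` and the millisecond values in the usage note are assumptions chosen for illustration.

```python
from typing import Optional, Sequence

def meets_constraints(delays_ms: Sequence[float],
                      max_delay_ms: Optional[float] = None,
                      min_delay_ms: Optional[float] = None) -> bool:
    """Return True if every end-to-end delay respects the given bounds.

    Asynchronous: no bounds. Synchronous: maximum only. Isochronous: maximum and minimum.
    """
    for delay in delays_ms:
        if max_delay_ms is not None and delay > max_delay_ms:
            return False
        if min_delay_ms is not None and delay < min_delay_ms:
            return False
    return True
```

For measured delays of 20, 35 and 48 ms, the stream is acceptable as asynchronous, acceptable as synchronous with a 50 ms maximum, but not acceptable as isochronous with bounds of 30 to 50 ms, because the 20 ms unit arrives too early.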


Focus in the presentation is on continuous data streams (streams) using isochronous transmission

Simple stream consists of single sequence of data

Complex stream consists of several related simple streams, called substreams, that depend on each other in time. An example is a video stream such as a movie, where two continuously synchronized substreams carry the movie's audio, alongside a video substream and a substream containing subtitles for the deaf or a translation into a different language. All substreams are synchronized.


The architecture in the figure reveals some issues, such as the compression needed to reduce required storage and network capacity, which matters more for video than for audio.

Quality of transmission and synchronization need to be controlled.

Timing requirements are expressed by Quality of Service (QoS). QoS for continuous data streams concerns the timeliness, volume and reliability of transmission.


QoS Specification

The required bit rate at which data is transported

Maximum delay until a session is set up (i.e., when an application can start sending the data).

Maximum end-to-end delay

Maximum delay variance or jitter


Enforcing QoS

• Use buffers to reduce jitter

• Use forward error correction to compensate for lost packets: encode the outgoing packets such that any k out of n received packets are enough to reconstruct k correct packets

• Many distributed systems for stream-oriented communication are built on top of Internet protocol stack. Internet provides differentiating classes for data using differentiated services. Sending host can mark outgoing packets as belonging to one of several classes. Expedited forwarding class specifies that the packet should be forwarded by the router with absolute priority. With assured forwarding class, traffic is divided into 4 subclasses along with three ways to drop packets if the network gets congested. This means a range of priorities can be assigned to the packets to differentiate time-critical packets from non-critical ones.
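The "any k out of n" idea behind forward error correction can be shown with its simplest instance, a single XOR parity packet, where n = k + 1. This is a sketch under stated assumptions: all packets have equal length, at most one packet is lost, and the function names are hypothetical. Practical systems use stronger codes that tolerate more losses.

```python
from functools import reduce
from typing import List, Optional

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(packets: List[bytes]) -> List[bytes]:
    """Append one parity packet: k equal-length data packets become n = k + 1."""
    parity = reduce(xor_bytes, packets)
    return packets + [parity]

def decode(received: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild the k data packets when at most one of the n packets is lost (None)."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("single-parity code can repair only one lost packet")
    if missing:
        present = [p for p in received if p is not None]
        repaired = list(received)
        repaired[missing[0]] = reduce(xor_bytes, present)  # XOR of the rest restores it
        received = repaired
    return list(received[:-1])   # drop the parity packet
```

Because the parity is the XOR of all data packets, XOR-ing the k packets that did arrive yields the one that did not, so no retransmission is needed.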


Stream Synchronization

Maintains temporal relationships between streams/substreams, for example between a discrete data stream and a continuous data stream, or between continuous data streams

Synchronization takes place at the level of data units

Synchronization mechanisms concerned with synchronizing data streams and distribution of the mechanism in a networked environment


Multimedia middleware offers a collection of interfaces for controlling audio and video streams including interfaces for controlling devices such as monitors, cameras, microphones etc.

Each device and stream has its own high level interface including interfaces for notifying an application when some event occurred. Latter used for writing handlers for synchronizing streams.

Distribution of synchronization mechanism - receiving side has to have complete synchronization specification locally available


This approach is followed by MPEG streams.

• MPEG standards form a collection of algorithms for compressing video and audio.
• MPEG-2 was designed for compressing broadcast-quality video into 4 to 6 Mbps.
• An unlimited number of continuous and discrete streams can be merged into a single stream.
• Each input stream is turned into a stream of packets that carry a timestamp based on a 90 kHz clock.
• These streams are multiplexed into a program stream consisting of variable-length packets with a common time base.
• The receiving side demultiplexes the stream, using the timestamps for interstream synchronization.
• It is better to do synchronization at the sender rather than at the receiver.
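The role of the common time base can be sketched as follows. This is not the MPEG-2 packet format: packets are modelled as hypothetical (timestamp, payload) tuples, and only the 90 kHz clock rate is taken from the text.

```python
import heapq
from typing import Iterable, List, Tuple

CLOCK_HZ = 90_000  # timestamp clock rate given in the text

def to_ticks(seconds: float) -> int:
    """Convert a presentation time in seconds to ticks of the 90 kHz clock."""
    return round(seconds * CLOCK_HZ)

def multiplex(*streams: Iterable[Tuple[int, str]]) -> List[Tuple[int, str]]:
    """Merge per-stream (timestamp, payload) packets into one stream ordered by timestamp.

    Each input stream must already be sorted by timestamp.
    """
    return list(heapq.merge(*streams, key=lambda packet: packet[0]))
```

Because every packet carries a timestamp from the same clock, the receiver can separate the substreams again and still know which audio and video units belong together.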


Stream Control Transmission Protocol(SCTP)

• SCTP is a reliable transport protocol operating on top of a connectionless packet network such as IP. It is described in RFC 4960, RFC 3286. It offers the following services to its users:

-- acknowledged error-free non-duplicated transfer of user data

-- data fragmentation to conform to discovered path MTU size

-- sequenced delivery of user messages within multiple streams, with an option for order-of-arrival delivery of individual user messages

-- optional bundling of multiple user messages into a single SCTP packet

-- network-level fault tolerance through supporting of multi-homing at either or both ends of an association

• The design of SCTP includes appropriate congestion avoidance behavior and resistance to flooding and masquerade attacks.


The Stream Control Transmission Protocol (SCTP) is a new IP transport protocol, existing at an equivalent level with UDP (User Datagram Protocol) and TCP (Transmission Control Protocol), which provide transport layer functions to many Internet applications.

SCTP has been approved by the IETF as a Proposed Standard.

Like TCP, SCTP provides a reliable transport service, ensuring that data is transported across the network without error and in sequence.

Like TCP, SCTP is a session-oriented mechanism, meaning that a relationship is created between the endpoints of an SCTP association prior to data being transmitted, and this relationship is maintained until all data transmission has been successfully completed.

Unlike TCP, SCTP provides a number of functions that are critical for telephony signaling transport, and at the same time can potentially benefit other applications needing transport with additional performance and reliability. The original framework for the SCTP definition is described in [3].


Figure: An SCTP Association. SCTP Node A and SCTP Node B each consist of an SCTP User Application on top of an SCTP Transport Service on top of an IP Network Service. Each node has one or more IP address appearances, and the two nodes are connected through the network transport.


Message Format

An SCTP packet consists of a Common Header followed by Chunk 1 through Chunk n.


Chunk ID: 0 through 255. Each ID has a Chunk Type defined as follows:

0 - Payload Data (DATA)
1 - Initiation (INIT)
2 - Initiation Acknowledgement (INIT ACK)
3 - Selective Acknowledgement (SACK)
4 - Heartbeat Request (HEARTBEAT)
5 - Heartbeat Acknowledgement (HEARTBEAT ACK)
6 - Abort (ABORT)
7 - Shutdown (SHUTDOWN)
8 - Shutdown Acknowledgement (SHUTDOWN ACK)
9 - Operation Error (ERROR)
Etc.


SCTP Common Header Format

Source Port Number | Destination Port Number
Verification Tag
Checksum


Source Port Number: 16 bits (unsigned integer). This is the SCTP sender’s port number. It can be used by the receiver in combination with the source IP address, the SCTP destination port, and possibly the destination IP address to identify the association to which this packet belongs. The port number 0 MUST NOT be used.

Destination Port Number: 16 bits (unsigned integer). This is the SCTP port number to which this packet is destined. The receiving host will use this port number to de-multiplex the SCTP packet to the correct receiving endpoint/application. The port number 0 MUST NOT be used.

Page 53

• Verification Tag: 32 bits (unsigned integer). The receiver of this packet uses the Verification Tag to validate the sender of this SCTP packet. On transmit, the value of this Verification Tag must be set to the value of the Initiate Tag received from the peer endpoint during the association initialization, with the following exceptions:

- A packet containing an INIT chunk MUST have a zero Verification Tag.

- A packet containing a SHUTDOWN COMPLETE chunk with the T bit set MUST have the Verification Tag copied from the packet with the SHUTDOWN ACK chunk.

- A packet containing an ABORT chunk may have the verification tag copied from the packet that caused the ABORT to be sent.

An INIT chunk MUST be the only chunk in the SCTP packet carrying it.

• Checksum: 32 bits (unsigned integer). This field contains the checksum of this SCTP packet.
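As a minimal sketch of the four header fields described above, the following Python code unpacks the 12-byte common header and checks two of the stated rules: port number 0 must not be used, and a packet carrying an INIT chunk must carry only that chunk and a zero Verification Tag. The names `CommonHeader`, `parse_common_header` and `init_rules_ok` are hypothetical, the checksum is not verified, and the caller is assumed to supply the list of chunk types found in the packet.

```python
import struct
from typing import List, NamedTuple

HEADER_FORMAT = "!HHII"                       # network byte order
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 2 + 2 + 4 + 4 = 12 bytes
INIT = 1                                      # chunk ID of INIT


class CommonHeader(NamedTuple):
    source_port: int
    destination_port: int
    verification_tag: int
    checksum: int


def parse_common_header(packet: bytes) -> CommonHeader:
    """Unpack the common header from the start of an SCTP packet."""
    if len(packet) < HEADER_SIZE:
        raise ValueError("packet shorter than the common header")
    header = CommonHeader(*struct.unpack_from(HEADER_FORMAT, packet))
    if header.source_port == 0 or header.destination_port == 0:
        raise ValueError("port number 0 MUST NOT be used")
    return header


def init_rules_ok(header: CommonHeader, chunk_types: List[int]) -> bool:
    """Check the INIT rules: sole chunk in the packet, zero tag."""
    if INIT not in chunk_types:
        return True
    return chunk_types == [INIT] and header.verification_tag == 0


packet = struct.pack(HEADER_FORMAT, 5000, 38412, 0, 0xDEADBEEF)
header = parse_common_header(packet)
print(header, init_rules_ok(header, [INIT]))
```

The port numbers and checksum value in the usage lines are arbitrary illustration values.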

Page 54

Basic SCTP Features

• SCTP is a unicast protocol, and supports data exchange between exactly 2 endpoints, although these may be represented by multiple IP addresses.

• SCTP provides reliable transmission, detecting when data is discarded, reordered, duplicated or corrupted, and retransmitting damaged data as necessary. SCTP transmission is full duplex.

• SCTP is message oriented and supports framing of individual message boundaries. In comparison, TCP is byte oriented and does not preserve any implicit structure within a transmitted byte stream without enhancement.

• SCTP is rate adaptive similar to TCP, and will scale back data transfer to the prevailing load conditions in the network. It is designed to behave cooperatively with TCP sessions attempting to use the same bandwidth.

SCTP Multi-Streaming Feature

• The name Stream Control Transmission Protocol is derived from the multi-streaming function provided by SCTP. This feature allows data to be partitioned into multiple streams that have the property of independently sequenced delivery, so that message loss in any one stream will only initially affect delivery within that stream, and not delivery in other streams.

Page 55

SCTP accomplishes multi-streaming by creating independence between data transmission and data delivery. In particular, each payload DATA "chunk" in the protocol uses two sets of sequence numbers, a Transmission Sequence Number that governs the transmission of messages and the detection of message loss, and the Stream ID/Stream Sequence Number pair, which is used to determine the sequence of delivery of received data.

This independence of mechanisms allows the receiver to determine immediately when a gap in the transmission sequence occurs (e.g., due to message loss), and also whether or not messages received following the gap are within an affected stream.
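The separation between transmission and delivery described above can be sketched as a toy receiver in Python. It tracks Transmission Sequence Numbers to detect gaps, and uses the Stream ID / Stream Sequence Number pair to decide what may be delivered. The class `MultiStreamReceiver` and its methods are hypothetical, and sequence-number wraparound and acknowledgements are deliberately ignored.

```python
from collections import defaultdict


class MultiStreamReceiver:
    """Toy receiver: TSNs detect loss, (stream, SSN) orders delivery."""

    def __init__(self, first_tsn: int = 0):
        self.next_tsn = first_tsn           # lowest TSN not yet received
        self.received = set()               # every TSN seen so far
        self.next_ssn = defaultdict(int)    # stream id -> next SSN to deliver
        self.pending = defaultdict(dict)    # stream id -> {ssn: message}
        self.delivered = []                 # (stream id, message) in order

    def receive(self, tsn, stream_id, ssn, message):
        if tsn < self.next_tsn or tsn in self.received:
            return                          # duplicate, ignore
        self.received.add(tsn)
        while self.next_tsn in self.received:
            self.next_tsn += 1
        # Delivery depends only on the sequence within this stream.
        self.pending[stream_id][ssn] = message
        while self.next_ssn[stream_id] in self.pending[stream_id]:
            ready = self.pending[stream_id].pop(self.next_ssn[stream_id])
            self.delivered.append((stream_id, ready))
            self.next_ssn[stream_id] += 1

    def missing_tsns(self):
        """TSNs in the gap between the cumulative point and the highest seen."""
        highest = max(self.received, default=self.next_tsn - 1)
        return [t for t in range(self.next_tsn, highest)
                if t not in self.received]


receiver = MultiStreamReceiver()
receiver.receive(0, 0, 0, "a0")
# TSN 1 (stream 0, SSN 1) is lost in the network.
receiver.receive(2, 1, 0, "b0")
receiver.receive(3, 0, 2, "a2")   # held back: stream 0 is waiting for SSN 1
receiver.receive(4, 1, 1, "b1")   # stream 1 is unaffected by the gap
print(receiver.delivered, receiver.missing_tsns())
receiver.receive(1, 0, 1, "a1")   # retransmission fills the gap
print(receiver.delivered, receiver.missing_tsns())
```

In this run the loss of one message in stream 0 delays only stream 0, while both messages of stream 1 are delivered immediately.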

SCTP Multi-Homing Feature

Another core feature of SCTP is multi-homing, or the ability for a single SCTP endpoint to support multiple IP addresses. The benefit of multi-homing is potentially greater survivability of the session in the presence of network failures. To support multi-homing, SCTP endpoints exchange lists of addresses during initiation of the association. Each endpoint must be able to receive messages from any of the addresses associated with the remote endpoint; in practice, certain operating systems may utilize available source addresses in round robin fashion, in which case receipt of messages from different source addresses will be the normal case. A single port number is used across the entire address list at an endpoint for a specific session.

Page 56

Security Objectives

As a common transport protocol designed to reliably carry time-sensitive user messages, such as billing or signaling messages for telephony services, between two networked endpoints, SCTP has the following security objectives:

- availability of reliable and timely data transport services
- integrity of the user-to-user information carried by SCTP

SCTP Responses to Potential Threats

• SCTP may potentially be used in a wide variety of risk situations. It is important for operators of systems running SCTP to analyze their particular situations and decide on the appropriate countermeasures.

• Operators of systems running SCTP should consult [RFC2196] for guidance in securing their site.

Countering Insider Attacks

• The principles of [RFC2196] should be applied to minimize the risk of theft of information or sabotage by insiders. Such procedures include publication of security policies, control of access at the physical, software, and network levels, and separation of services.

Page 57

Protecting against Data Corruption in the Network

Protecting Confidentiality

As with the supplementary checksum service, user data encryption MAY be performed by the SCTP user application.

Alternately, the user application may use an implementation-specific API to request that the IP Encapsulating Security Payload (ESP) [RFC4303] be used to provide confidentiality and integrity.

Protecting against Blind Denial-of-Service Attacks

A blind attack is one where the attacker is unable to intercept or otherwise see the content of data flows passing to and from the target SCTP node. Blind denial-of-service attacks may take the form of flooding, masquerade, or improper monopolization of services.

• Flooding

– The objective of flooding is to cause loss of service and incorrect behavior at target systems through resource exhaustion, interference with legitimate transactions, and exploitation of buffer-related software bugs. Flooding may be directed either at the SCTP node or at resources in the intervening IP Access Links or the Internet. Where the latter entities are the target, flooding will manifest itself as loss of network services, including potentially the breach of any firewalls in place.

Page 58

– In general, protection against flooding begins at the equipment design level, where it includes measures such as:

- avoiding commitment of limited resources before determining that the request for service is legitimate.

- giving priority to completion of processing in progress over the acceptance of new work.

- identification and removal of duplicate or stale queued requests for service.

- not responding to unexpected packets sent to non-unicast addresses.

• Network equipment should be capable of generating an alarm and log if a suspicious increase in traffic occurs.

• Blind Masquerade

Masquerade can be used to deny service in several ways:

- by tying up resources at the target SCTP node to which the impersonated node has limited access. For example, the target node may by policy permit a maximum of one SCTP association with the impersonated SCTP node. The masquerading attacker may attempt to establish an association purporting to come from the impersonated node so that the latter cannot do so when it requires it.

Page 59

- by deliberately allowing the impersonation to be detected, thereby provoking counter-measures that cause the impersonated node to be locked out of the target SCTP node.

- by interfering with an established association by inserting extraneous content such as a SHUTDOWN request.

SCTP reduces the risk of blind masquerade attacks through IP spoofing by use of the four-way startup handshake. Because the initial exchange is memory-less, no lockout mechanism is triggered by blind masquerade attacks. In addition, the INIT ACK containing the State Cookie is transmitted back to the IP address from which it received the INIT. Thus, the attacker would not receive the INIT ACK containing the State Cookie. SCTP protects against insertion of extraneous packets into the flow of an established association by use of the Verification Tag.
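The memory-less initial exchange can be illustrated with a small Python sketch, under several assumptions. The server answers an INIT by computing a signed State Cookie and stores nothing; it commits resources only when the cookie comes back (in SCTP this is the COOKIE ECHO step of the four-way handshake, per RFC 4960). The cookie layout, the use of HMAC-SHA256, the 60-second lifetime, and the names `CookieServer`, `handle_init` and `handle_cookie_echo` are illustrative choices, not the protocol's actual encoding.

```python
import hashlib
import hmac
import os
import struct
import time


class CookieServer:
    """Toy responder that keeps no state until a valid cookie returns."""

    def __init__(self, lifetime: float = 60.0):
        self._secret = os.urandom(32)     # known only to this server
        self._lifetime = lifetime
        self.associations = set()         # state exists only after validation

    def _mac(self, body: bytes) -> bytes:
        return hmac.new(self._secret, body, hashlib.sha256).digest()

    def handle_init(self, peer_addr: str, initiate_tag: int, now=None) -> bytes:
        """Return the State Cookie to place in the INIT ACK; store nothing."""
        now = time.time() if now is None else now
        body = struct.pack("!dI", now, initiate_tag) + peer_addr.encode()
        return body + self._mac(body)

    def handle_cookie_echo(self, peer_addr: str, cookie: bytes, now=None) -> bool:
        """Create the association only if the echoed cookie is genuine."""
        now = time.time() if now is None else now
        if len(cookie) < 12 + 32:
            return False
        body, mac = cookie[:-32], cookie[-32:]
        if not hmac.compare_digest(mac, self._mac(body)):
            return False                  # forged or modified cookie
        created, initiate_tag = struct.unpack_from("!dI", body)
        if body[12:] != peer_addr.encode():
            return False                  # cookie was issued to another address
        if now - created > self._lifetime:
            return False                  # stale cookie
        self.associations.add((peer_addr, initiate_tag))
        return True


server = CookieServer()
cookie = server.handle_init("192.0.2.10", 1234, now=100.0)
print(len(server.associations))           # still no state after INIT
print(server.handle_cookie_echo("192.0.2.10", cookie, now=101.0))
```

Because the INIT ACK carrying the cookie is sent to the claimed source address, a blind attacker who spoofs that address never sees the cookie and cannot complete the exchange.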

Logging of received INIT requests and abnormalities such as unexpected INIT ACKs might be considered as a way to detect patterns of hostile activity.

Page 60

Improper Monopolization of Services

Attacks under this heading are performed openly and legitimately by the attacker. They are directed against fellow users of the target SCTP node or of the shared resources between the attacker and the target node. Possible attacks include the opening of a large number of associations between the attacker’s node and the target, or transfer of large volumes of information within a legitimately established association.

Policy limits should be placed on the number of associations per adjoining SCTP node. SCTP user applications should be capable of detecting large volumes of illegitimate or "no-op" messages within a given association and either logging or terminating the association as a result, based on local policy.
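A minimal sketch of such a local policy in Python, assuming a fixed limit on associations per peer and a fixed threshold of "no-op" messages per association. The class `AssociationPolicy`, its method names, and the string results are hypothetical; how a message is classified as "no-op" is left to the user application.

```python
from collections import Counter


class AssociationPolicy:
    """Toy policy: cap associations per peer and no-op traffic per association."""

    def __init__(self, max_associations_per_peer: int, max_noop_messages: int):
        self.max_associations_per_peer = max_associations_per_peer
        self.max_noop_messages = max_noop_messages
        self.open_associations = Counter()   # peer -> open association count
        self.noop_counts = Counter()         # (peer, association id) -> no-ops

    def admit(self, peer: str) -> bool:
        """Accept a new association only while the peer is under its limit."""
        if self.open_associations[peer] >= self.max_associations_per_peer:
            return False
        self.open_associations[peer] += 1
        return True

    def release(self, peer: str) -> None:
        """Record that one association with the peer has been closed."""
        if self.open_associations[peer] > 0:
            self.open_associations[peer] -= 1

    def record_message(self, peer: str, association_id: int, is_noop: bool) -> str:
        """Return "terminate" once an association exceeds the no-op threshold."""
        if is_noop:
            self.noop_counts[(peer, association_id)] += 1
        if self.noop_counts[(peer, association_id)] > self.max_noop_messages:
            return "terminate"
        return "ok"


policy = AssociationPolicy(max_associations_per_peer=2, max_noop_messages=3)
print(policy.admit("node-a"), policy.admit("node-a"), policy.admit("node-a"))
```

Whether the application logs or terminates the association on a "terminate" result is a matter of local policy, as stated above.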

SCTP Interactions with Firewalls

It is helpful for some firewalls if they can inspect just the first fragment of a fragmented SCTP packet and unambiguously determine whether it corresponds to an INIT chunk (for further information, refer to [RFC1858]).

Accordingly, SCTP imposes two requirements: (1) an INIT chunk MUST NOT be bundled with any other chunk in a packet, and (2) a packet containing an INIT chunk MUST have a zero Verification Tag.

Page 61

3GPP

The 3rd Generation Partnership Project (3GPP) is a collaboration between groups of telecommunications associations to produce a globally applicable third-generation (3G) mobile phone system specification within the scope of the International Mobile Telecommunications-2000 project of the International Telecommunication Union (ITU). 3GPP specifications are based on evolved Global System for Mobile Communications (GSM) specifications. 3GPP standardization encompasses Radio, Core Network and Service architecture. Some details of 3GPP can be found in RFC 3314.

Page 62

Much of the standard addresses upgrading 3G UMTS to 4G mobile communications technology, which is essentially a mobile broadband system with enhanced multimedia services built on top.

The standard includes:

• Peak download rates of 326.4 Mbit/s for 4x4 antennas, and 172.8 Mbit/s for 2x2 antennas (utilizing 20 MHz of spectrum).

• Peak upload rates of 86.4 Mbit/s for every 20 MHz of spectrum using a single antenna.

• Five different terminal classes have been defined, from a voice-centric class up to a high-end terminal that supports the peak data rates. All terminals will be able to process 20 MHz bandwidth.

• At least 200 active users in every 5 MHz cell (specifically, 200 active data clients).

Page 63

Security Issues

Security documents can be located at ftp://ftp.3gpp.org

GSM was the first public telephone system to use integrated cryptographic mechanisms.

GSM security features

• Secure user access to telecommunications services
– Identity of user authenticated by network operator
• User and signaling traffic confidentiality
– Protects user voice and data traffic, and signaling data, from eavesdropping on radio path
• User anonymity
– Attacker who knows user’s IMSI can be prevented from tracking location of user and eavesdropping on radio path

Page 64

GSM security mechanisms

• Cryptographic authentication verifies the subscription with the home network when service is requested
– Challenge / response authentication protocol based on a subscriber specific secret authentication key
• Radio interface encryption prevents eavesdropping and authenticates the use of the radio channel
– The encryption mechanism is based on a symmetric stream cipher
– The key for encryption is established as part of the authentication protocol
• The allocation and use of temporary identities helps to provide user anonymity
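The challenge / response idea above can be sketched in Python. This is an illustration only: HMAC-SHA256 is swapped in for GSM's own authentication and key-generation algorithms, and the names `Sim`, `HomeNetwork`, `provision` and `authenticate` are hypothetical. The point it shows is that the subscriber-specific secret key never travels over the radio path; only a random challenge and the derived response do, and the encryption key falls out of the same exchange.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple


def _derive(key: bytes, label: bytes, challenge: bytes) -> bytes:
    """Stand-in key derivation (HMAC-SHA256), not the GSM algorithms."""
    return hmac.new(key, label + challenge, hashlib.sha256).digest()


class Sim:
    """Subscriber side: holds the secret authentication key."""

    def __init__(self, secret_key: bytes):
        self._key = secret_key

    def answer(self, challenge: bytes) -> Tuple[bytes, bytes]:
        """Return (response, cipher key) for a network challenge."""
        return (_derive(self._key, b"response", challenge),
                _derive(self._key, b"cipher-key", challenge))


class HomeNetwork:
    """Network side: stores one secret key per subscription."""

    def __init__(self):
        self._keys = {}

    def provision(self, subscriber_id: str) -> Sim:
        key = os.urandom(16)
        self._keys[subscriber_id] = key
        return Sim(key)

    def authenticate(self, subscriber_id: str, sim: Sim) -> Optional[bytes]:
        """Run one challenge/response; return the cipher key on success."""
        key = self._keys[subscriber_id]
        challenge = os.urandom(16)                  # sent in the clear
        expected = _derive(key, b"response", challenge)
        response, _ = sim.answer(challenge)         # returned in the clear
        if hmac.compare_digest(response, expected):
            return _derive(key, b"cipher-key", challenge)
        return None


network = HomeNetwork()
sim = network.provision("subscriber-1")
print(network.authenticate("subscriber-1", sim) is not None)
```

A device holding a different key produces a different response and is rejected, which is how the subscription is verified when service is requested.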

Page 65

Multicast Communication

Sending data to multiple receivers

Explicit communication paths can be set up at the application level, as in peer-to-peer solutions.

Without explicit communication paths, gossip-based information dissemination provides a simple but less efficient way to implement multicasting.

Page 66

Application level multicasting

Nodes organize into an overlay network that is used to disseminate information to its members

Network routers not involved in group membership

Overlay network could be organized into a tree or a mesh network. Former provides a unique overlay path between every pair of nodes, while latter has each node connected to multiple neighbors – higher robustness in the event a connection breaks

Page 67

Multicast Session

A node generates a multicast identifier mid (a randomly chosen 160 bit key). It then looks up succ(mid), that is, the node responsible for this key, and promotes it to become the root of the multicast tree that is used to send data to interested nodes.

To join the tree, a node P executes operation LOOKUP(mid) that allows a lookup message with request to join the multicast group mid to be routed from P to succ(mid).

On the way up to the root, join request will add forwarder nodes or helpers for the group

Multicasting implemented by a node sending a multicast message towards the root by executing LOOKUP(mid) after which message can be sent along the tree.
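The session setup, join, and multicast steps above can be sketched in Python under simplifying assumptions. Node identifiers are small integers rather than 160 bit keys, and a lookup is routed hop by hop through plain successor pointers, where a real DHT would use its routing table to take far fewer hops. The names `Ring`, `MulticastTree`, `join` and `multicast` are hypothetical.

```python
import bisect


class Ring:
    """Toy identifier ring with successor-only routing."""

    def __init__(self, node_ids):
        self.nodes = sorted(node_ids)

    def succ(self, key: int) -> int:
        """Node responsible for key: first node id >= key, wrapping around."""
        index = bisect.bisect_left(self.nodes, key)
        return self.nodes[index % len(self.nodes)]

    def route(self, start: int, key: int):
        """Nodes visited by LOOKUP(key) issued at start, ending at succ(key)."""
        target = self.succ(key)
        index = self.nodes.index(start)
        path = [start]
        while path[-1] != target:
            index = (index + 1) % len(self.nodes)
            path.append(self.nodes[index])
        return path


class MulticastTree:
    def __init__(self, ring: Ring, mid: int):
        self.ring = ring
        self.mid = mid
        self.root = ring.succ(mid)           # succ(mid) becomes the root
        self.children = {self.root: set()}   # tree node -> its children
        self.members = set()                 # nodes that want the data

    def join(self, node: int) -> None:
        self.members.add(node)
        if node in self.children:            # already part of the tree
            return
        self.children[node] = set()
        path = self.ring.route(node, self.mid)
        for child, parent in zip(path, path[1:]):
            already_in_tree = parent in self.children
            # Each node on the way up becomes a forwarder for the group.
            self.children.setdefault(parent, set()).add(child)
            if already_in_tree:
                break                        # rest of the path already exists

    def multicast(self, message):
        """Send from the root down the tree; return what members receive."""
        delivered, stack = {}, [self.root]
        while stack:
            node = stack.pop()
            if node in self.members:
                delivered[node] = message
            stack.extend(self.children[node])
        return delivered


ring = Ring([1, 4, 9, 11, 14, 18, 20, 21, 28])
tree = MulticastTree(ring, mid=17)
for joining_node in (4, 11, 21):
    tree.join(joining_node)
print(tree.root, sorted(tree.multicast("hello")))
```

Because of the successor-only routing the resulting tree is a chain; with a real lookup structure the join paths differ and the tree branches.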

Page 68

Gossip based data dissemination

Spreading information without explicit communication paths in large distributed systems using epidemic protocols.

These protocols rapidly propagate information among a large collection of nodes using only local information, without any central component to coordinate information dissemination.

Avoid write conflicts by allowing only a single node to initiate updates for a specific data item

Page 69

Node is infected if it holds data that it is willing to spread to other nodes.

Node that has not seen the data is called susceptible.

Updated node that is not willing or able to spread the data is said to be removed.

Data is timestamped.

The anti-entropy model of propagation has three approaches for updates when node P picks a random node Q:

• P only pushes its own updates to Q (push approach)
• P only pulls in new updates from Q (pull approach)
• P and Q send updates to each other (push-pull approach)

– Push-pull is the best of the three
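A small Python simulation can illustrate the three anti-entropy approaches. It assumes one data item, represented as a (timestamp, value) pair so that the higher timestamp wins, and rounds in which every node P contacts one random node Q. The function names and the round structure are hypothetical simplifications.

```python
import random


def anti_entropy_round(states, mode, rng):
    """Every node P exchanges updates with one randomly chosen node Q."""
    count = len(states)
    for p in range(count):
        q = rng.choice([i for i in range(count) if i != p])
        # Tuples compare by timestamp first, so "newer" means "greater".
        if mode in ("push", "push-pull") and states[p] > states[q]:
            states[q] = states[p]       # P pushes its newer update to Q
        if mode in ("pull", "push-pull") and states[q] > states[p]:
            states[p] = states[q]       # P pulls the newer update from Q


def rounds_until_consistent(count, mode, seed=0, max_rounds=1000):
    """Number of rounds until every node holds the newest update."""
    rng = random.Random(seed)
    states = [(0, "old")] * count
    states[0] = (1, "new")              # a single node initiates the update
    for round_number in range(1, max_rounds + 1):
        anti_entropy_round(states, mode, rng)
        if all(state == (1, "new") for state in states):
            return round_number
    raise RuntimeError("did not converge within max_rounds")


for mode in ("push", "pull", "push-pull"):
    print(mode, rounds_until_consistent(50, mode, seed=1))
```

Running it with different seeds and node counts gives a feel for how quickly each approach spreads an update; the exact round counts depend on the random choices.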

Page 70

Rumor spreading, or gossiping, allows node P to push an update to an arbitrary node Q if Q has not yet been updated. If Q has already been updated by another node, P may lose interest in spreading the update further, in which case it becomes removed.
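Rumor spreading can be sketched as a simulation over the three node states: susceptible, infected, and removed. The parameter `stop_probability` is an assumption introduced for illustration: with the default of 1.0 a node loses interest the first time it contacts an already updated node, and lower values model a node that only sometimes loses interest.

```python
import random
from collections import Counter


def spread_rumor(count, stop_probability=1.0, seed=0):
    """Simulate gossiping until no infected node remains.

    Returns a Counter of the final node states.
    """
    rng = random.Random(seed)
    state = ["susceptible"] * count
    state[0] = "infected"               # node 0 holds the new update
    while "infected" in state:
        infected_now = [i for i, s in enumerate(state) if s == "infected"]
        for p in infected_now:
            q = rng.choice([i for i in range(count) if i != p])
            if state[q] == "susceptible":
                state[q] = "infected"   # Q receives the update from P
            elif rng.random() < stop_probability:
                state[p] = "removed"    # Q already knew: P loses interest
    return Counter(state)


result = spread_rumor(100, seed=3)
print(result["removed"], result["susceptible"])
```

Any nodes still counted as susceptible at the end never received the update, which is why gossiping alone does not guarantee that every node is reached.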

Deletion of data item needs the use of death certificates to be recorded and spread through all the nodes.

Death certificates are time-stamped and they are removed after a maximum propagation time has elapsed

Page 71

Conclusion

In this presentation, the following topics were discussed:

Three popular models of communication in distributed systems:

• Remote Procedure Calls (RPC)
• Message-oriented Middleware (MOM)
• Data Streaming, discussion of SCTP, 3GPP

Sending data to multiple receivers or Multicasting

Page 72

References:

[1] Stewart, R., Xie, Q., Morneault, K., Sharp, C., Schwarzbauer, H., Taylor, T., Rytina, I., Kalla, M., Zhang, L. and V. Paxson, "Stream Control Transmission Protocol", RFC 2960, October 2000.

[2] Stewart, Sharp, et al., "SCTP Checksum Change", Work in Progress.

[3] Ong, L., Rytina, I., Garcia, M., Schwarzbauer, H., Coene, L., Lin, H., Juhasz, I., Holdrege, M. and C. Sharp, "Framework Architecture for Signaling Transport", RFC 2719, October 1999.

[4] Jungmaier, Rescorla and Tuexen, "TLS Over SCTP", Work in Progress.

[5] www.ietf.org

[6] RFC4960

[7] RFC3286

[8] RFC2196

[9] RFC1858

[10] RFC3314

[11] ftp://ftp.3gpp.org

Page 73

Bibliography

1. A. Tanenbaum, M. van Steen, Distributed Systems: Principles and Paradigms, 2nd ed., Pearson, 2007.