
CSCI 3421 DATA COMMUNICATIONS AND NETWORKING

Chapter 2: Application Layer

Tami Meredith

Layered Protocol Model

A protocol defines:

1. Message Format (Syntax)

2. Rules of Communication (Semantics)

3. Synchronisation and other application details (Implementation)

TCP/IP Protocol Suite

Architecture

Network topology: graph structure
Host networking system environment: layered protocol stack
Application system communication structure:
  Client-server ("pull")
    May have multiple clients and/or servers
    Client initiates, server responds
  Peer-to-peer (P2P)
    E.g., Skype, BitTorrent
    Highly scalable, distributed, no central server overhead

P2P Concerns

Networks designed predominantly for download, not upload – P2P can stress ISP upload capacity

Easily hacked because there is no central control

User pushback – Do you want someone downloading from you and slowing down your connection while you’re doing online gaming?

Inefficiency – lack of control can lead to redundancy, version control issues, etc.

Interface Model

API (Application Programming Interface)
Based on "sockets"
Sockets are bound to "ports", each with a port number
Every host has an IP address
Applications need a port and a host to be identified: ip-address:port-number
  E.g., 192.168.1.1:80, 140.184.133.99:8080
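A minimal sketch of splitting an ip-address:port-number identifier into its two parts. The function name parse_endpoint is an illustrative assumption, and only IPv4 addresses are handled.

```python
import ipaddress

def parse_endpoint(endpoint: str) -> tuple[ipaddress.IPv4Address, int]:
    """Split 'ip-address:port-number' into a validated (address, port) pair."""
    host, _, port_text = endpoint.rpartition(":")
    address = ipaddress.IPv4Address(host)   # raises ValueError if malformed
    port = int(port_text)
    if not 0 <= port <= 65535:              # port numbers are 16-bit values
        raise ValueError(f"port out of range: {port}")
    return address, port

for example in ("192.168.1.1:80", "140.184.133.99:8080"):
    address, port = parse_endpoint(example)
    print(address, port)
```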

Communication Dimensions

1. Reliability
   Loss tolerant (UDP) – multimedia/video
   Reliable data transfer (TCP)
2. Throughput
   Bandwidth sensitive – voice
   Elastic – file transfer
3. Timing
   Impact of latency, delay, on the application
4. Security
   Encryption, authentication, access

Network Layer Support

TCP                        UDP
Connection-oriented        Connectionless
Reliable                   Unreliable
Packet ordering ensured    No ordering guarantee
Congestion control         No congestion control

Neither TCP nor UDP provides:
  Timing (delay) control
  Throughput (QoS, BPS) guarantees
  Security support

Application Layer Protocols

Hide many of the communication details for us
Tailored to specific application domains
May be public (e.g., HTTP) or proprietary (e.g., Skype)
Only concerned with data communication and communication format
  E.g., HTTP controls the transfer of a web page but not its content format (HTML) – it can transfer an invalid page
Is a critical part of the application but does not "create" the application – considerable additional support is needed (e.g., display and rendering)

The WWW: HTTP

HyperText Transfer Protocol
RFC 1945, 2616
Demand (pull) oriented
Client (browser) – server architecture
Developed by Tim Berners-Lee at CERN
Is NOT the Internet, is just a single application running over the Internet
Content may use HTML (derived from SGML)
Communicates documents (web pages) composed of objects (files)

URIs

Consists of: scheme (colon) scheme-specific-part
More than just web addresses
Web URLs use http:
  http://host:port/path?query-string#fragment-id
Browsers
  implement extensions (e.g. user@URL)
  fill in missing elements with defaults (e.g., port = 80, scheme = http)
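A rough sketch of how a URL breaks into the parts listed above and how missing elements can be defaulted, using Python's standard urllib.parse. The function name describe_url and the path, query, and fragment in the first example are illustrative assumptions; real browsers apply more rules than this.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def describe_url(url: str) -> dict:
    """Split a URL into its components, filling in browser-style defaults."""
    if "://" not in url:
        url = "http://" + url                  # default scheme
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),  # default port
        "path": parts.path or "/",
        "query": parts.query,
        "fragment": parts.fragment,
    }

print(describe_url("http://cs.smu.ca:8080/courses/index.html?term=fall#week2"))
print(describe_url("cs.smu.ca"))
```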

HTTP

Uses TCP
Stateless: server stores no information between requests (but client may, e.g., cookies)
Pull-based: updates not propagated to users
May use both:
  persistent connections: one TCP connection for all request-response pairs
  non-persistent connections: unique TCP connection for each request-response pair

Non-Persistent Connections

Client initiates TCP connection
Server processes connection request and connection is built
Client sends request
Server responds to request
Client receives data and ends TCP connection
Server ends TCP connection
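A back-of-the-envelope comparison of the two connection styles, counted in round-trip times (RTTs). The sketch assumes one RTT to build a TCP connection and one RTT per request-response pair, and it ignores transmission time, parallel connections, and pipelining. The function name and the page size are illustrative.

```python
def round_trips(num_objects: int, persistent: bool) -> int:
    """Estimate the RTTs needed to fetch num_objects objects one after another."""
    handshake, request = 1, 1            # assumed cost of each step, in RTTs
    if persistent:
        # One handshake, then one request-response pair per object.
        return handshake + request * num_objects
    # A fresh connection (handshake) for every request-response pair.
    return (handshake + request) * num_objects

# Hypothetical page: 1 HTML document plus 10 embedded objects.
print(round_trips(11, persistent=False))
print(round_trips(11, persistent=True))
```

Under these assumptions the non-persistent case costs roughly twice as many round trips, which is the motivation for persistent connections.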

Efficiency

HTTP Request Format

Requests

    GET /public_html/index.html HTTP/1.1
    Host: cs.smu.ca
    Connection: close
    User-agent: Mozilla/5.0
    Accept-language: fr

Request line: method URL version
Header lines: name: value (many exist)
Blank line
Entity body (empty in this example – used for posts, form data)
NOTE that a TCP connection exists when this is sent
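The request format can be sketched as a small function that assembles the request line, header lines, and terminating blank line. The function name build_get_request is an illustrative assumption; the sketch only builds the bytes and does not open the TCP connection.

```python
def build_get_request(host, path, extra_headers=None):
    """Return the bytes of a minimal HTTP/1.1 GET request with an empty body."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "Connection: close"]
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    # Every line ends with CRLF; an empty line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

request = build_get_request(
    "cs.smu.ca",
    "/public_html/index.html",
    {"User-agent": "Mozilla/5.0", "Accept-language": "fr"},
)
print(request.decode("ascii"))
```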

Request Types

GET: most common, requests an object
POST: like a GET, but with information (e.g., form data) sent in the entity body
HEAD: respond with everything but the entity contents
PUT: upload a file to the web server
DELETE: delete a file on the web server

HTTP Response Format

Responses

    HTTP/1.1 200 OK
    Connection: close
    Date: Tue, 09 Aug 2011 15:44:04 GMT
    Server: Apache/2.2.3 (CentOS)
    Last-Modified: Tue, 09 Aug 2011 15:10:06 GMT
    Content-Length: 6821
    Content-Type: text/html

… 6821 bytes of data …
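A minimal sketch of reading the status line and header lines of the response above. The function name parse_response_head is an illustrative assumption, and the sketch skips details a real client must handle (CRLF line endings on the wire, repeated headers, the entity body).

```python
def parse_response_head(head: str):
    """Split a response head into (version, status code, reason, headers)."""
    status_line, *header_lines = head.strip().splitlines()
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")             # split at the first colon
        headers[name.strip().lower()] = value.strip()    # names are case-insensitive
    return version, int(code), reason, headers

HEAD = """\
HTTP/1.1 200 OK
Connection: close
Date: Tue, 09 Aug 2011 15:44:04 GMT
Server: Apache/2.2.3 (CentOS)
Last-Modified: Tue, 09 Aug 2011 15:10:06 GMT
Content-Length: 6821
Content-Type: text/html
"""

version, code, reason, headers = parse_response_head(HEAD)
print(code, reason, headers["content-length"])
```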

Status Codes (RFC1945, pp 32-37)

1xx Informational
2xx Successful
  200 OK
3xx Redirection
  301 Moved permanently
4xx Client errors
  400 Bad request
  404 Not found
5xx Server errors
  505 HTTP version not supported
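The first digit of a status code gives its class. A small sketch using the reason phrases from Python's standard http.HTTPStatus; the CLASSES table and the function name are illustrative, and a code unknown to the standard library raises an error here.

```python
from http import HTTPStatus

CLASSES = {1: "Informational", 2: "Successful", 3: "Redirection",
           4: "Client error", 5: "Server error"}

def describe_status(code: int) -> str:
    """Return the code, its standard reason phrase, and its class."""
    category = CLASSES[code // 100]       # first digit selects the class
    phrase = HTTPStatus(code).phrase      # standard reason phrase
    return f"{code} {phrase} ({category})"

for code in (200, 301, 400, 404, 505):
    print(describe_status(code))
```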

Cookies

Servers are stateless
  no information held between requests except connections
Cookies (RFC 6265) provide state to servers
Cookies (on clients) store an identifier that can be used by the server as a DB key to access information on a server-side DB

1) Cookie header line in HTTP requests (Cookie:)
2) Cookie header line in HTTP responses (Set-Cookie:)
3) Cookie file kept by browser on user's end system
4) Backend server-side database
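A toy sketch of the four cookie components working together. The dictionary SESSIONS stands in for the backend database, the cookie name id and the visits counter are illustrative assumptions, and a real server would also set attributes such as expiry and path.

```python
from http.cookies import SimpleCookie
import uuid

SESSIONS = {}   # stand-in for the backend server-side database

def handle_request(cookie_header):
    """Return (Set-Cookie value or None, session record) for one request."""
    cookie = SimpleCookie(cookie_header or "")
    if "id" in cookie and cookie["id"].value in SESSIONS:
        record = SESSIONS[cookie["id"].value]   # returning client: identifier is the DB key
        set_cookie = None
    else:
        session_id = uuid.uuid4().hex           # new client: create an identifier
        record = SESSIONS[session_id] = {"visits": 0}
        set_cookie = f"id={session_id}"
    record["visits"] += 1
    return set_cookie, record

set_cookie, _ = handle_request(None)        # first response carries Set-Cookie
_, record = handle_request(set_cookie)      # browser echoes it back in Cookie
print(record["visits"])
```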

Web Caches/Proxies

A network entity that intercepts HTTP requests and satisfies them using a cached version of the response data

Entity contacts the actual server if data not in cache
Decreases net traffic
Improves response times for commonly accessed data
Can be used to control web access and bypass firewalls
May provide stale data

Proxy Server

Conditional GET

Uses header line in request:
  If-modified-since: Wed, 2 Jan 2013 09:23:44 GMT
Server responds normally if the page has been modified
Server responds with 304 Not Modified if it has not been modified
Reduces bandwidth usage
Used by proxy servers to check for stale data
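The server-side decision can be sketched as a date comparison. The function name and the Last-Modified dates below are illustrative assumptions; both dates are given in the HTTP date format with GMT.

```python
from email.utils import parsedate_to_datetime

def conditional_get_status(if_modified_since: str, last_modified: str) -> int:
    """Return 304 if the object is unchanged since the client's copy, else 200."""
    client_copy = parsedate_to_datetime(if_modified_since)
    server_copy = parsedate_to_datetime(last_modified)
    return 200 if server_copy > client_copy else 304

# Object last changed before the client's copy was fetched: not modified.
print(conditional_get_status("Wed, 02 Jan 2013 09:23:44 GMT",
                             "Tue, 01 Jan 2013 18:00:00 GMT"))
# Object changed afterwards: send the full response.
print(conditional_get_status("Wed, 02 Jan 2013 09:23:44 GMT",
                             "Thu, 03 Jan 2013 08:00:00 GMT"))
```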

Part 2

FTP

File Transfer Protocol
Uses two parallel TCP connections
Control information is sent "out-of-band" (on its own connection), as compared to HTTP which is "in-band"
Server maintains "state" information for each control connection
  e.g., current working directory
  cf. HTTP which is designed as stateless

FTP Connections

1. Control Connection
   Communicates control (header information)
   Kept open for the duration of the session
2. Data Connection
   For transmission of the file
   Non-persistent, closed after transfer complete
   New connection opened for each file

FTP Operation

Generally insecure, use SFTP, FTPS, or FTP over SSH for secure transfer

Transfer representations of ASCII, EBCDIC, binary, local

Active mode: client initiates control connection (port 21) and then server initiates data connection

Passive mode: (client behind firewall) client initiates control connection and then sends a "PASV" command, server returns connection info via control to allow client to initiate data connections

Normally requires login, but anonymous FTP can be performed

Normally a CLI due to its age, generally being replaced by SCP
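In passive mode the connection info comes back on the control connection as a 227 reply holding six numbers: four for the IP address and two for the port. A minimal sketch of decoding it; the function name and the sample reply are illustrative assumptions.

```python
import re

def parse_pasv_reply(reply: str) -> tuple[str, int]:
    """Extract (host, port) from a '227 Entering Passive Mode (...)' reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if not reply.startswith("227") or match is None:
        raise ValueError(f"not a PASV reply: {reply!r}")
    h1, h2, h3, h4, p1, p2 = (int(n) for n in match.groups())
    # The port is sent as two bytes: high byte first, then low byte.
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2

host, port = parse_pasv_reply("227 Entering Passive Mode (140,184,133,99,195,80)")
print(host, port)
```

The client then opens the data connection to that host and port itself, which is why passive mode works from behind a firewall.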

SMTP (e-mail)

Hybrid architecture
Mail servers communicate P2P using SMTP (RFC 5321)
User agents communicate to mail server with SMTP (send) or POP3/IMAP (receive)

SMTP Operation

User Agents: E.g., Outlook, pine, elm, Mail

Mail servers
  Outgoing message queue
  Incoming mailboxes (one per user)
  Communicate using SMTP
  Function both as clients (sending) and servers (receiving)
  Persistent connections for more than one message

Webmail

SMTP cf. HTTP

SMTP = Push: Servers are active in pushing data to the destinations
  Really? What about the receiving server handling incoming mail messages?
HTTP = Pull: Servers are passive and wait for clients to pull the data from them
SMTP requires 7-bit ASCII while HTTP does not
HTTP permits multiple objects whereas a mail message must be a single object (including attachments)
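A small sketch of the last two points, using Python's standard email package: a binary attachment is encoded so that the whole message is one object made only of 7-bit ASCII. The addresses, subject, and file name are hypothetical, and nothing is sent over the network.

```python
from email.message import EmailMessage

message = EmailMessage()
message["From"] = "alice@example.com"      # hypothetical addresses
message["To"] = "bob@example.com"
message["Subject"] = "Lecture notes"
message.set_content("Slides attached.")
# Raw bytes outside the 7-bit range, standing in for a binary attachment.
message.add_attachment(bytes(range(200, 256)), maintype="application",
                       subtype="octet-stream", filename="notes.bin")

wire_bytes = message.as_bytes()
print(message.is_multipart())                    # body and attachment in one message
print(all(byte < 128 for byte in wire_bytes))    # only 7-bit ASCII on the wire
```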

Access Protocols

SMTP is designed to push mail to another server
  Permits retries
  Always there if user has dynamic DNS

Need a way for user mail agent to pull mail from the mailbox at the local mail server

1. POP3 – Post Office Protocol
2. IMAP – Internet Message Access Protocol

POP3

RFC 1939
Authorisation – passwords in the clear
Transaction(s)
Update on closing connection
POP3 servers do not maintain state information between connections
Basically a program on the same system as the mailbox for manipulating a mailbox

IMAP

IMAP is considerably more feature-rich than POP3

Maintains persistent state information for each mailbox (e.g., directory structure)

Allows users to obtain components (e.g., attachments) of messages

Also implemented using a basic client-server model

DNS

Domain Name System (RFC 1034, 1035)
Uses UDP on port 53
Translates registered hostnames to IP addresses
  Hostname: cs.smu.ca
  IP address: 140.184.133.99
  Not 1:1 but n:m for host:address
Permits aliasing (multiple names for an IP address)
Manages complex multi-machine servers (e.g., google.com, cnn.com)
Originally based on BIND (Berkeley Internet Name Domain)

DNS API

    #include <netdb.h>
    extern int h_errno;

    struct hostent *gethostbyname(const char *name);

SIMPLE – DNS is encapsulated into a single library call
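Python's standard socket module wraps the same resolver interface. A minimal sketch, where the wrapper name resolve is an illustrative assumption; socket.gethostbyname_ex returns the canonical name, the alias list, and the address list, much like the fields of struct hostent.

```python
import socket

def resolve(name: str) -> list[str]:
    """Return the IPv4 addresses the resolver reports for name."""
    canonical_name, aliases, addresses = socket.gethostbyname_ex(name)
    return addresses

# A dotted-quad literal is handed back without a DNS query, so this line
# works offline; a real hostname such as cs.smu.ca needs network access.
print(resolve("127.0.0.1"))
```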

Architecture

Want everyone to have the same updated information which suggests a single global database
  Single point of failure
  Ridiculously high traffic volume
  Potentially long latency (distance etc.)
  Extreme maintenance and update demands
Uses a distributed hierarchical DB instead
  http://www.root-servers.org

Participants

ICANN: Internet Corporation for Assigned Names and Numbers (http://www.icann.org)
  U.S. Corporation reporting to U.S. government
IANA: Internet Assigned Numbers Authority (http://www.iana.org)
  Department of ICANN
Internet Society (http://www.internetsociety.org)
  Parent Corporation of the IETF
IETF: Internet Engineering Task Force (http://www.ietf.org)
W3C: World Wide Web Consortium (http://www.w3.org)

Servers

Root DNS Servers
  13 Root servers
TLD Servers
  Control specific domains, e.g., .ca
Authoritative servers
  Hostnames for specific machines
  May be provided by an organization such as smu.ca
  May be provided by an ISP

Server Interaction

http://en.wikipedia.org/wiki/List_of_Internet_top-level_domains

Query Chaining

Simplified by caching at every level
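A toy simulation of a chained lookup (root, then TLD, then authoritative server) with a cache in front. The three dictionaries and their server names are hypothetical stand-ins for real servers, and the sketch caches only the final answer rather than caching at every level.

```python
# Hypothetical miniature hierarchy; the server names are illustrative only.
ROOT = {"ca": "ca-tld-server"}
TLD = {"ca-tld-server": {"smu.ca": "smu-authoritative-server"}}
AUTHORITATIVE = {"smu-authoritative-server": {"cs.smu.ca": "140.184.133.99"}}

cache = {}
queries_sent = 0

def resolve_chained(hostname: str) -> str:
    """Walk root -> TLD -> authoritative, caching the final answer."""
    global queries_sent
    if hostname in cache:
        return cache[hostname]                   # answered locally, no queries
    labels = hostname.split(".")
    tld_server = ROOT[labels[-1]]                # query 1: root names the TLD server
    domain = ".".join(labels[-2:])
    auth_server = TLD[tld_server][domain]        # query 2: TLD names the authoritative server
    address = AUTHORITATIVE[auth_server][hostname]   # query 3: authoritative server answers
    queries_sent += 3
    cache[hostname] = address
    return address

print(resolve_chained("cs.smu.ca"), queries_sent)   # first lookup walks the chain
print(resolve_chained("cs.smu.ca"), queries_sent)   # second lookup hits the cache
```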

DNS Records

(Name, Value, Type, TTL)

1. A: Standard Hostname:IP address pair

2. NS: value is an authoritative name server for a domain (used for DNS chaining)

3. CNAME: value is a canonical hostname for aliased hosts

4. MX: value is the canonical hostname of a mailserver
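The (Name, Value, Type, TTL) tuple maps directly onto a small data structure. In this sketch the zone data is hypothetical except for cs.smu.ca and its address, which come from the earlier slide; the lookup follows CNAME aliases until an A record is found.

```python
from collections import namedtuple

Record = namedtuple("Record", "name value type ttl")

# Hypothetical zone data; only cs.smu.ca -> 140.184.133.99 comes from the notes.
RECORDS = [
    Record("smu.ca", "ns.smu.ca", "NS", 86400),
    Record("www.cs.smu.ca", "cs.smu.ca", "CNAME", 3600),
    Record("cs.smu.ca", "140.184.133.99", "A", 3600),
    Record("smu.ca", "mail.smu.ca", "MX", 3600),
]

def lookup_address(name: str) -> str:
    """Follow CNAME aliases until an A record supplies the IP address."""
    for _ in range(len(RECORDS)):            # bound the chain to avoid alias loops
        for record in RECORDS:
            if record.name == name and record.type == "A":
                return record.value
            if record.name == name and record.type == "CNAME":
                name = record.value          # restart with the canonical name
                break
        else:
            break                            # no A or CNAME record for this name
    raise LookupError(f"no A record for {name}")

print(lookup_address("www.cs.smu.ca"))
```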

DNS Messages

The End
