CSCI 3421 DATA COMMUNICATIONS AND NETWORKING
Chapter 2 Application Layer
Tami Meredith
Layered Protocol Model
A protocol defines:
1. Message Format (Syntax)
2. Rules of Communication (Semantics)
3. Synchronisation and other application details (Implementation)
TCP/IP Protocol Suite
Architecture
Network topology: Graph structure
Host networking system environment: Layered protocol stack
Application system communication structure:
  Client-server (“pull”)
    May have multiple clients and/or servers
    Client initiates, host responds
  Peer-to-peer (P2P)
    E.g., Skype, BitTorrent
    Highly scalable, distributed, no host overhead
P2P Concerns
Networks designed predominantly for download, not upload – P2P can stress ISP upload capacity
Easily hacked because there is no central control
User pushback – Do you want someone downloading from you and slowing down your connection while you’re doing online gaming?
Inefficiency – lack of control can lead to redundancy, version control issues, etc.
Interface Model
API (Application Programming Interface)
  Based on “sockets”
  Sockets are bound to “ports”, each with a port number
Every host has an IP address
Applications need a port and a host to be identified:
  ip-address:port-number
  E.g., 192.168.1.1:80, 140.184.133.99:8080
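As a rough illustration of the ip-address:port-number notation, the sketch below splits such a string into the (host, port) pair that socket APIs typically expect. The function name parse_endpoint is hypothetical, and the 0 to 65535 range check assumes 16-bit port numbers.

```python
def parse_endpoint(text):
    """Split 'ip-address:port-number' into a (host, port) tuple."""
    # Split on the last colon so the host part is kept whole.
    host, _, port = text.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"not a host:port pair: {text!r}")
    number = int(port)
    # Port numbers are 16-bit values.
    if not 0 <= number <= 65535:
        raise ValueError(f"port out of range: {number}")
    return host, number


if __name__ == "__main__":
    print(parse_endpoint("192.168.1.1:80"))
    print(parse_endpoint("140.184.133.99:8080"))
```

The resulting tuple has the same shape as the address argument passed to socket.connect() in Python.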
Communication Dimensions
1. Reliability
   Loss tolerant (UDP) – multimedia/video
   Reliable data transfer (TCP)
2. Throughput
   Bandwidth sensitive – voice
   Elastic – file transfer
3. Timing
   Impact of latency and delay on the application
4. Security
   Encryption, authentication, access
Network Layer Support
TCP                        UDP
Connection-oriented        Connectionless
Reliable                   Unreliable
Packet ordering ensured    Packet ordering not guaranteed
Congestion control         No congestion control

Neither TCP nor UDP provides:
  Timing (delay) control
  Throughput (QoS, BPS) guarantees
  Security support
Application Layer Protocols
Hide many of the communication details for us
Tailored to specific application domains
May be public (e.g., HTTP) or proprietary (e.g., Skype)
Only concerned with data communication and communication format
  E.g., HTTP controls the transfer of a web page but not its content format (HTML) – it can transfer an invalid page
Is a critical part of the application but does not “create” the application – considerable additional support is needed (e.g., display and rendering)
The WWW: HTTP
HyperText Transfer Protocol
RFC 1945, 2616
Demand (pull) oriented
Client (browser) – server architecture
Developed by Tim Berners-Lee at CERN
Is NOT the Internet, is just a single application running over the Internet
Content may use HTML (derived from SGML)
Communicates documents (web pages) composed of objects (files)
URIs
Consists of: scheme (colon) scheme-specific-part
More than just web addresses
Web URLs use http:
  http://host:port/path?query-string#fragment-id
Browsers
  implement extensions (e.g., user@URL)
  fill in missing elements with defaults (e.g., port = 80, scheme = http)
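The URL layout and the browser defaults can be sketched with Python's standard urllib.parse module. The helper name describe_url and the example URL are assumptions made for illustration; only the http scheme and port 80 defaults come from the slide.

```python
from urllib.parse import urlsplit

# Default port per scheme; only the http default is taken from the slide.
DEFAULT_PORTS = {"http": 80}


def describe_url(url):
    """Break a URL into its parts, filling in defaults as a browser might."""
    if "://" not in url:
        url = "http://" + url  # missing scheme defaults to http
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",
        "query": parts.query,
        "fragment": parts.fragment,
    }


if __name__ == "__main__":
    print(describe_url("cs.smu.ca/index.html?x=1#top"))
```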
HTTP
Uses TCP
Stateless: server stores no information between requests (but client may, e.g., cookies)
Pull-based: updates not propagated to users
May use both:
  persistent connections: one TCP connection for all request-response pairs
  non-persistent connections: unique TCP connection for each request-response pair
Non-Persistent Connections
1. Client initiates TCP connection
2. Server processes connection request and connection is built
3. Client sends request
4. Server responds to request
5. Client receives data and ends TCP connection
6. Server ends TCP connection
Efficiency
HTTP Request Format
Requests
GET /public_html/index.html HTTP/1.1
Host: cs.smu.ca
Connection: close
User-agent: Mozilla/5.0
Accept-language: fr

Request line: method URL version
Header lines: name value (many exist)
Blank line
Entity body (empty in this example – used for posts, form data)
NOTE that a TCP connection exists when this is sent
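A minimal sketch of how the request format might be assembled as bytes before being written to the TCP connection. The function name build_request is hypothetical; the CRLF line endings follow the HTTP/1.1 wire format.

```python
def build_request(method, path, headers, body=b""):
    """Assemble request line, header lines, blank line, and entity body."""
    lines = [f"{method} {path} HTTP/1.1"]                  # request line
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"                 # blank line ends the headers
    return head.encode("ascii") + body


if __name__ == "__main__":
    request = build_request(
        "GET",
        "/public_html/index.html",
        {"Host": "cs.smu.ca", "Connection": "close", "Accept-language": "fr"},
    )
    print(request.decode("ascii"))
```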
Request Types
GET: most common, requests an object POST: a GET with information in the body HEAD: respond with everything but the
entity contents PUT: upload a file to the web server DELETE: delete a file on the web server
HTTP Response Format
Responses
HTTP/1.1 200 OK
Connection: close
Date: Tue, 09 Aug 2011 15:44:04 GMT
Server: Apache/2.2.3 (CentOS)
Last-Modified: Tue, 09 Aug 2011 15:10:06 GMT
Content-Length: 6821
Content-Type: text/html

… 6821 bytes of data …
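The response layout can be read back with a few string operations. This is only a sketch: parse_response_head is a hypothetical helper, it assumes the head has already been received as text with CRLF line endings, and it ignores details such as repeated or folded headers.

```python
def parse_response_head(head):
    """Return (version, status code, reason phrase, headers) from a response head."""
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        if not line:          # blank line marks the end of the headers
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers


if __name__ == "__main__":
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Connection: close\r\n"
        "Date: Tue, 09 Aug 2011 15:44:04 GMT\r\n"
        "Content-Length: 6821\r\n"
        "Content-Type: text/html\r\n"
        "\r\n"
    )
    print(parse_response_head(head))
```

The Content-Length value tells the client how many body bytes to read after the blank line.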
Status Codes (RFC1945, pp 32-37)
1xx Informational
2xx Successful
  200 OK
3xx Redirection
  301 Moved permanently
4xx Client errors
  400 Bad request
  404 Not found
5xx Server errors
  505 HTTP version not supported
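Since the first digit of a status code selects its class, a client can branch on the class without knowing every individual code. A small sketch (status_class is a hypothetical name):

```python
# Class names follow the list above, keyed by the first digit of the code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_class(code):
    """Map a three-digit status code to its class name."""
    return STATUS_CLASSES.get(code // 100, "Unknown")


if __name__ == "__main__":
    for code in (200, 301, 404, 505):
        print(code, status_class(code))
```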
Cookies
Servers are stateless
  no information held between requests except connections
Cookies (RFC 6265) provide state to servers
Cookies (on clients) store an identifier that can be used by the server as a DB key to access information on a server-side DB
1) Cookie header line in HTTP requests
2) Cookie header line in HTTP responses
3) Cookie file kept by browser on user’s end system
4) Backend server-side database
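The four pieces can be sketched together as a toy server: the cookie carries only an identifier, and the identifier is the key into a server-side store. Everything here is an assumption for illustration: the class name CookieServer, the cookie name id, the in-memory dict standing in for the backend database, and the visit counter it stores.

```python
import uuid
from http.cookies import SimpleCookie


class CookieServer:
    """Toy server that keeps per-client state keyed by a cookie identifier."""

    def __init__(self):
        self.db = {}  # stands in for the backend server-side database

    def handle(self, cookie_header=None):
        """Return (value for a Set-Cookie response header or None, visit count)."""
        jar = SimpleCookie(cookie_header or "")
        if "id" in jar and jar["id"].value in self.db:
            key = jar["id"].value          # known client: look up its record
            self.db[key]["visits"] += 1
            return None, self.db[key]["visits"]
        key = uuid.uuid4().hex             # new client: issue a fresh identifier
        self.db[key] = {"visits": 1}
        return f"id={key}", 1


if __name__ == "__main__":
    server = CookieServer()
    set_cookie, visits = server.handle()            # first request, no cookie yet
    print(set_cookie, visits)
    print(server.handle(cookie_header=set_cookie))  # browser sends the cookie back
```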
Web Caches/Proxies
A network entity that intercepts HTTP requests and satisfies them using a cached version of the response data
Entity contacts the actual server if data not in cache
Decreases net traffic
Improves response times for commonly accessed data
Can be used to control web access and bypass firewalls
May provide stale data
Proxy Server
Conditional Get
Uses header line in request:
  If-modified-since: Wed, 2 Jan 2013 09:23:44
Server responds normally if the page has been modified
Server responds with 304 Not Modified if it has not been modified
Reduces bandwidth usage
Used by proxy servers to check for stale data
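The server-side decision for a conditional GET can be sketched as a single comparison. The function name conditional_get is hypothetical, and the sketch assumes both timestamps have already been parsed into datetime objects.

```python
from datetime import datetime


def conditional_get(last_modified, if_modified_since=None, body=b""):
    """Return (status code, body) for a possibly conditional GET."""
    if if_modified_since is not None and last_modified <= if_modified_since:
        return 304, b""   # Not Modified: no body, the cached copy is still valid
    return 200, body      # modified, or an unconditional request: send the page


if __name__ == "__main__":
    cached_at = datetime(2013, 1, 2, 9, 23, 44)
    print(conditional_get(datetime(2013, 1, 1), cached_at, b"<html>...</html>"))
    print(conditional_get(datetime(2013, 1, 3), cached_at, b"<html>...</html>"))
```

A proxy that receives 304 serves its cached copy; a 200 response replaces it.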
Part 2
FTP
File transfer protocol
Uses two parallel TCP connections
Communication is “out-of-band” as compared to HTTP which is “in-band”
Server maintains “state” information for each control connection
  e.g., current working directory
  cf. HTTP which is designed as stateless
FTP Connections
1. Control Connection
   Communicates control (header information)
   Kept open for the duration of the session
2. Data Connection
   For transmission of the file
   Non-persistent, closed after transfer complete
   New connection opened for each file
FTP Operation
Generally insecure; use SFTP, FTPS, or FTP over SSH for secure transfer
Transfer representations of ASCII, EBCDIC, binary, local
Active mode: client initiates control connection (port 21) and then server initiates data connection
Passive mode: (client behind firewall) client initiates control connection and then sends a “PASV” command, server returns connection info via control to allow client to initiate data connections
Normally requires login, but anonymous FTP can be performed
Normally a CLI due to its age, generally being replaced by SCP
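In passive mode the "connection info" returned over the control connection is, per RFC 959, a 227 reply carrying six numbers (h1,h2,h3,h4,p1,p2): four address bytes and two port bytes. The sketch below decodes such a reply. The function name and the sample reply text are assumptions for illustration.

```python
import re


def parse_pasv_reply(reply):
    """Extract (host, port) for the data connection from a 227 reply."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError(f"no address in reply: {reply!r}")
    numbers = [int(group) for group in match.groups()]
    host = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]   # high byte, then low byte
    return host, port


if __name__ == "__main__":
    print(parse_pasv_reply("227 Entering Passive Mode (192,168,1,1,19,137)"))
```

The client then opens the data connection to that address itself, which is why passive mode tends to work from behind a firewall.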
SMTP (e-mail)
Hybrid architecture
Mail servers communicate P2P using SMTP (RFC 5321)
User agents communicate to mail server with SMTP (send) or POP3/IMAP (receive)
SMTP Operation
User Agents: E.g., Outlook, pine, elm, Mail
Mail servers
  Outgoing message queue
  Incoming mailboxes (one per user)
  Communicate using SMTP
  Function both as clients (sending) and servers (receiving)
  Persistent connections for more than one message
Webmail
SMTP cf. HTTP
SMTP = Push: Servers are active in pushing data to the destinations
  Really? What about the receiving server handling incoming mail messages?
HTTP = Pull: Servers are passive and wait for clients to pull the data from them
SMTP requires 7-bit ASCII while HTTP does not
HTTP permits multiple objects whereas a mail message must be a single object (including attachments)
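One way to see the "single object" and "7-bit ASCII" points together is to build a message with a binary attachment using Python's standard email package. This is a sketch under assumptions: the addresses, subject, and file name are made up, and the base64 encoding of the attachment is the library's default choice rather than something the slides specify.

```python
from email.message import EmailMessage


def build_message(text, attachment, filename):
    """Return one message, text plus binary attachment, as wire bytes."""
    message = EmailMessage()
    message["From"] = "alice@example.org"   # hypothetical addresses
    message["To"] = "bob@example.org"
    message["Subject"] = "Report"
    message.set_content(text)
    # Binary data is encoded (base64 by default) so the result stays ASCII.
    message.add_attachment(
        attachment,
        maintype="application",
        subtype="octet-stream",
        filename=filename,
    )
    return message.as_bytes()


if __name__ == "__main__":
    wire = build_message("See attached.", bytes(range(256)), "data.bin")
    print(wire.isascii(), len(wire))
```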
Access Protocols
SMTP is designed to push mail to another server
  Permits retries
  Always there if user has dynamic DNS
Need a way for user mail agent to pull mail from the mailbox at the local mail server
1. POP3 – Post Office Protocol
2. IMAP – Internet Mail Access Protocol
POP3
RFC 1939
Authorisation – passwords in the clear
Transaction(s)
Update on closing connection
POP3 servers do not maintain state information between connections
Basically a program on the same system as the mailbox for manipulating a mailbox
IMAP
IMAP is considerably more feature-rich than POP3
Maintains persistent state information for each mailbox (e.g., directory structure)
Allows users to obtain components (e.g., attachments) of messages
Also implemented using a basic client-server model
DNS
Domain Name System (RFC 1034, 1035)
Uses UDP on port 53
Translates registered hostnames to IP addresses
  Hostname: cs.smu.ca
  IP address: 140.184.133.99
  Not 1:1 but n:m for host:address
Permits aliasing (multiple names for an IP address)
Manages complex multi-machine servers (e.g., google.com, cnn.com)
Originally based on BIND (Berkeley Internet Name Domain)
DNS API
#include <netdb.h>
extern int h_errno;
struct hostent *gethostbyname(const char *name);
SIMPLE – DNS is encapsulated into a single library call
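Python exposes the same call through its socket module, which makes for a quick experiment. The wrapper name resolve is an assumption, and the address returned for a real hostname depends on the resolver in use at run time.

```python
import socket


def resolve(hostname):
    """Return one IPv4 address for hostname, or None if the lookup fails."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:   # resolver errors are subclasses of OSError
        return None


if __name__ == "__main__":
    # An IPv4 literal is returned unchanged, without a DNS query.
    print(resolve("127.0.0.1"))
    # A real lookup; the result depends on the network and local resolver.
    print(resolve("cs.smu.ca"))
```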
Architecture
Want everyone to have the same updated information which suggests a single global database
  Single point of failure
  Ridiculously high traffic volume
  Potentially long latency (distance etc.)
  Extreme maintenance and update demands
Uses a distributed hierarchical DB instead
  http://www.root-servers.org
Participants
ICANN: Internet Corporation for Assigned Names and Numbers (http://www.icann.org)
  U.S. Corporation reporting to U.S. government
IANA: Internet Assigned Numbers Authority (http://www.iana.org)
  Department of ICANN
Internet Society (http://www.internetsociety.org)
  Parent Corporation of the IETF
IETF: Internet Engineering Task Force (http://www.ietf.org)
W3C: World Wide Web Consortium (http://www.w3.org)
Servers
Root DNS Servers
  13 Root servers
TLD Servers
  Control specific domains, e.g., .ca
Authoritative servers
  Hostnames for specific machines
  May be provided by an organization such as smu.ca
  May be provided by an ISP
Server Interaction
http://en.wikipedia.org/wiki/List_of_Internet_top-level_domains
Query Chaining
Simplified by caching at every level
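A toy model of chaining from the root down to an authoritative server, with a cache in front. The server names and the table layout are hypothetical; only cs.smu.ca and its address come from the slides. For brevity the sketch caches only at the querying resolver, not at every level.

```python
# Hypothetical hierarchy: each server maps a name suffix to either the next
# server to ask ("NS") or a final address ("A").
SERVERS = {
    "root": {"ca": ("NS", "tld-ca")},
    "tld-ca": {"smu.ca": ("NS", "dns.smu.ca")},
    "dns.smu.ca": {"cs.smu.ca": ("A", "140.184.133.99")},
}


def resolve(hostname, cache):
    """Return (address, number of servers queried), consulting the cache first."""
    if hostname in cache:
        return cache[hostname], 0
    server, queries = "root", 0
    while True:
        queries += 1
        record = next(
            (rec for suffix, rec in SERVERS[server].items()
             if hostname == suffix or hostname.endswith("." + suffix)),
            None,
        )
        if record is None:
            raise LookupError(hostname)
        kind, value = record
        if kind == "A":
            cache[hostname] = value   # remember the answer for next time
            return value, queries
        server = value                # follow the referral one level down


if __name__ == "__main__":
    cache = {}
    print(resolve("cs.smu.ca", cache))   # walks root, TLD, authoritative
    print(resolve("cs.smu.ca", cache))   # answered from the cache
```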
DNS Records
(Name, Value, Type, TTL)
1. A: Standard Hostname:IP address pair
2. NS: value is an authoritative nameserver for a domain (used for DNS chaining)
3. CNAME: value is a canonical hostname for aliased hosts
4. MX: value is the canonical hostname of a mailserver
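The (Name, Value, Type, TTL) tuple maps naturally onto a small record type. In the sketch below the record data is invented (example.org names and documentation-range addresses), and the helper names lookup and address_of are hypothetical.

```python
from collections import namedtuple

Record = namedtuple("Record", "name value type ttl")

# Made-up records for illustration only.
RECORDS = [
    Record("www.example.org", "host1.example.org", "CNAME", 3600),
    Record("host1.example.org", "192.0.2.10", "A", 3600),
    Record("example.org", "mail.example.org", "MX", 3600),
    Record("mail.example.org", "192.0.2.25", "A", 3600),
]


def lookup(name, rtype):
    """Return the values of all records matching name and type."""
    return [r.value for r in RECORDS if r.name == name and r.type == rtype]


def address_of(name, max_aliases=8):
    """Follow CNAME aliases to the canonical name, then return its A value."""
    for _ in range(max_aliases):
        aliases = lookup(name, "CNAME")
        if not aliases:
            break
        name = aliases[0]
    addresses = lookup(name, "A")
    return addresses[0] if addresses else None


if __name__ == "__main__":
    print(address_of("www.example.org"))            # alias resolved via CNAME
    mail_host = lookup("example.org", "MX")[0]      # canonical mail server name
    print(mail_host, address_of(mail_host))
```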
DNS Messages
The End