4/19/05 CS1181
What we covered last week
How to calculate the delay in packet delivery
Queueing delay and congestion losses
Transmission delay
Propagation delay
Round Trip Time
Bandwidth-delay product
Delay in delivering multiple packets across multiple hops
A few application protocols
  HTTP
    HTTP 1.0
    HTTP 1.1
      • Pipelining
      • Without pipelining
  FTP
  Email and SMTP
4/19/05 CS1182
Why we need a naming database for the Internet
People have many identifiers: SSN, name, passport #
Internet hosts and routers have two kinds of identifiers:
  IP address (32 bit): used for addressing datagrams
  “name”, e.g., www.yahoo.com: used by humans
Domain Name System
  Hostname to IP address translation, and vice versa
  Has also been used for other purposes, e.g., load distribution among replicated Web servers: one website name maps to a set of IP addresses
4/19/05 CS1183
DNS: Domain Name System
DNS consists of
  A hierarchical name space
  A distributed database implemented in a hierarchy of many name servers
  An application-layer protocol used by hosts, routers, and name servers to communicate to resolve names (address/name translation)
A core Internet function, implemented as an application-layer protocol
Why not a centralized DNS?
  single point of failure
  traffic volume
  distant centralized database
  maintenance
  The most important factor: the need for distributed management
4/19/05 CS1184
A hierarchical name space
each non-leaf node in the tree is a domain
  Each domain belongs to an administrative authority
  any domain can assign sub-domains below it
  no limit on the depth along any branch
DNS name hierarchy is completely independent from the Internet's topological structure
[Figure: DNS name tree. The root sits above the TLDs (top-level domains) edu, com, gov, org, us, uk, fr; the next level shows mit, ucla, xerox, dec, nasa, nsf, acm, ieee; below that cs, seas, cad; example nodes are labeled Foo and Bar.]
4/19/05 CS1185
DNS: Implemented as a distributed database
[Figure: hierarchy of DNS servers. Root DNS servers at the top; .com, .org, and .edu DNS servers below them; then the yahoo.com, amazon.com, pbs.org, ucla.edu, and umass.edu DNS servers.]
The entire DNS name space is divided into a hierarchy of zones
  a zone: a contiguous sub-space in the DNS name tree
  may contain domains at different levels
[Figure: example names under ucla.edu: art (art.ucla.edu), CS (cs.ucla.edu), netsec.cs.ucla.edu]
4/19/05 CS1186
What makes a zone
each zone is controlled by its own administrator, served by its own name server(s)
  One master server keeps a master zone file, distributes it to multiple secondary servers
  Both are called authoritative servers for the zone
Each server must be able to
  Resolve all the names in its own zone
  Know where to direct queries for names belonging to its sub-zones
[Figure: example names CS (cs.ucla.edu) and netsec.cs.ucla.edu]
4/19/05 CS1187
What's in the zone's master file:
data that defines the top node of the zone
  including a list of all the servers for the zone
authoritative data for all nodes in the zone
  for all of the nodes from the top node to leaf nodes (that are outside of any sub-zone)
data that describes delegated sub-zones
  Domain name, owner, etc.
  “glue data”: IP address(es) for each sub-zone's name server(s)
[Figure: example names CS (cs.ucla.edu) and netsec.cs.ucla.edu]
4/19/05 CS1188
How to resolve a DNS name?
EX: your browser needs the IP address for www.amazon.com:
  Your host sends a query to a local DNS server
  The local server either finds the answer in its cache, or otherwise sends a query to a root server
  The root server replies with pointers to .com DNS servers
  The local server queries a .com DNS server, which replies with a pointer to the amazon.com DNS server
  The local server queries the amazon.com DNS server to get the IP address for www.amazon.com, and sends the answer back to your host
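The resolution steps above can be sketched as a toy simulation. This is only an illustration under stated assumptions: the server labels ("root", "com-tld", "amazon-auth"), the zone data, and the address 192.0.2.10 are made up, and no real DNS traffic is sent.

```python
# Toy model of the local server walking down from the root.
# All names and data here are hypothetical.
SERVERS = {
    "root": {"referrals": {"com": "com-tld"}, "answers": {}},
    "com-tld": {"referrals": {"amazon.com": "amazon-auth"}, "answers": {}},
    "amazon-auth": {"referrals": {}, "answers": {"www.amazon.com": "192.0.2.10"}},
}


def query(server, name):
    """Ask one server; return ("answer", ip) or ("referral", next_server)."""
    data = SERVERS[server]
    if name in data["answers"]:
        return "answer", data["answers"][name]
    labels = name.split(".")
    # Try the longest matching suffix first (www.amazon.com, amazon.com, com).
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in data["referrals"]:
            return "referral", data["referrals"][suffix]
    raise LookupError(f"{server} cannot help with {name}")


def resolve(name, cache):
    """Local server: use the cache if possible, else follow referrals."""
    if name in cache:
        return cache[name], []
    server, path = "root", []
    while True:
        path.append(server)
        kind, value = query(server, name)
        if kind == "answer":
            cache[name] = value
            return value, path
        server = value  # follow the pointer to the next server
```

Calling `resolve("www.amazon.com", {})` visits the root, then the .com server, then the amazon.com server; a second call with the same cache contacts no server at all.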
4/19/05 CS1189
Local Name Server
Each ISP (residential ISP, company, university) has one.
  Also called “default name server” or "local cache server"
Every host knows the IP address(es) of its local DNS server(s)
When a host makes a DNS query, query is sent to its local DNS server
4/19/05 CS11810
Example
A host at cs.ucla.edu wants the IP address for gaia.umass.edu
[Figure: requesting host lixia.cs.ucla.edu, local DNS server Toucan.CS.UCLA.EDU, root DNS server, .edu DNS server, authoritative DNS server dns.umass.edu, and target host gaia.umass.edu, connected by messages numbered 1 to 8.]
4/19/05 CS11811
Recursive queries
recursive query:
  puts burden of name resolution on contacted name server
  heavy load?
iterated query:
  contacted server replies with name of server to contact
  “I don’t know this name, but ask this server”
[Figure: requesting host lixia.cs.ucla.edu, local DNS server Toucan.CS.UCLA.EDU, root DNS server, .edu DNS server, authoritative DNS server dns.umass.edu, and target host gaia.umass.edu, connected by messages numbered 1 to 8.]
4/19/05 CS11812
DNS: caching and replication
Virtually all Internet applications invoke DNS lookups
Redundant servers for each zone
  “13” root servers
Once a name server learns a DNS name to IP address mapping, it caches the mapping
  cache entries time out (are deleted) after some time (specified in the DNS query reply)
  TLD servers are typically cached in local name servers
    • Thus root name servers are not often visited
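A minimal sketch of the caching rule above, assuming a simple in-memory table keyed by name. The class name `TtlCache` and the injectable clock are illustrative choices, not part of any DNS implementation.

```python
import time


class TtlCache:
    """Name-to-address cache whose entries expire after their TTL (seconds)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # name -> (address, expiry time)

    def put(self, name, address, ttl):
        # The TTL comes from the DNS reply that supplied the mapping.
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # timed out: forget the mapping
            return None
        return address
```

A lookup that returns `None` means the local server has to query other servers again, as it would for a name it has never seen.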
4/19/05 CS11813
DNS records
DNS: all DNS servers store Resource Records (RR)
RR format: (name, value, type, ttl)
Type=A
  name is hostname
  value is IP address
Type=NS
  name is a domain (e.g. foo.com)
  value is hostname of authoritative name server for this domain
Type=CNAME
  name is an alias name for some “canonical” (the real) name
    www.ibm.com is really servereast.backup2.ibm.com
  value is canonical name
Type=MX
  value is name of mailserver associated with name
  e.g. name = cs.ucla.edu, value = mailman.cs.ucla.edu, type = MX, ttl = 172800
4/19/05 CS11814
DNS protocol, messages
DNS protocol: query and reply messages, with same message format
msg header
  identification: 16 bit # for query, reply to query uses same #
  flags:
    query or reply
    recursion desired
    recursion available
    reply is authoritative
4/19/05 CS11815
DNS protocol, messages
[Figure: DNS message format, annotated as follows]
  Name, type fields for a query
  RRs in response to query
  records for authoritative servers
  additional “helpful” info that may be used
4/19/05 CS11816
Inserting records into DNS
Example: just created startup “Network Utopia”
Register name networkutopia.com at a registrar (e.g., Network Solutions)
  Need to provide registrar with names and IP addresses of your authoritative name servers (primary and secondary)
  Registrar inserts two RRs into the com TLD server:
    (networkutopia.com, dns1.networkutopia.com, NS)
    (dns1.networkutopia.com, 212.212.212.1, A)
Put in authoritative server a Type A record for www.networkutopia.com and a Type MX record for networkutopia.com
How do people get the IP address of the Web site www.networkutopia.com?
4/19/05 CS11817
How to use DNS in practice?
Two popular programs you can play with on a Unix system:
“host” – look up host names using domain servers
  Command: host [-l] [-v] [-w] [-r] [-d] [-t query type] host [server]
  Manual page: man host
“nslookup” – query Internet name servers interactively
  Command: nslookup [-options…] [host-to-find | –[server] ]
  Manual page: man nslookup

> nslookup cs.ucla.edu
Server: Toucan.CS.UCLA.EDU
Address: 131.179.96.16

Name: cs.ucla.edu
Address: 131.179.128.22

> nslookup -q=MX cs.ucla.edu
Server: Toucan.CS.UCLA.EDU
Address: 131.179.96.16

cs.ucla.edu pref. = 3, mail exchanger = Mailman.cs.ucla.edu
cs.ucla.edu pref. = 3, mail exchanger = Toucan.cs.ucla.edu
cs.ucla.edu nameserver = NS0.cs.ucla.edu
cs.ucla.edu nameserver = NS1.cs.ucla.edu
cs.ucla.edu nameserver = NS2.cs.ucla.edu
cs.ucla.edu nameserver = NS3.cs.ucla.edu
Mailman.cs.ucla.edu internet address = 131.179.128.30
Toucan.cs.ucla.edu internet address = 131.179.128.16
NS0.cs.ucla.edu internet address = 131.179.128.30
NS1.cs.ucla.edu internet address = 131.179.128.16
NS2.cs.ucla.edu internet address = 131.179.128.17
NS3.cs.ucla.edu internet address = 131.179.128.18
4/19/05 CS11819
Chapter 2: Application layer
2.1 Principles of network applications
  app architectures
  app requirements
2.2 Web and HTTP
2.4 Electronic Mail
  SMTP, POP3, IMAP
2.5 DNS
2.6 P2P file sharing
2.7 Socket programming with TCP
2.8 Socket programming with UDP
2.9 Building a Web server
4/19/05 CS11820
P2P file sharing
Example
Alice runs P2P client application on her notebook computer
Intermittently connects to Internet; gets new IP address for each connection
Asks for “Hey Jude”
Application displays other peers that have a copy of Hey Jude.
Alice chooses one of the peers, Bob.
File is copied from Bob’s PC to Alice’s notebook: HTTP
While Alice downloads, other users uploading from Alice.
Alice’s peer is both a Web client and a transient Web server.
All peers are servers = highly scalable!
4/19/05 CS11821
P2P: centralized directory
original “Napster” design
1) when peer connects, it informs central server:
  IP address
  content
2) Alice queries for “Hey Jude”
3) Alice requests file from Bob
[Figure: peers, including Alice and Bob, around a centralized directory server, with arrows numbered 1, 2, and 3 matching the steps above.]
4/19/05 CS11822
P2P: problems with centralized directory
Single point of failure
Performance bottleneck
Copyright infringement
file transfer is decentralized, but locating content is highly centralized
4/19/05 CS11823
Query flooding: Gnutella
fully distributed
  no central server
public domain protocol
many Gnutella clients implementing protocol
Overlay network: graph
  edge between peer X and Y if there’s a TCP connection
  all active peers and edges form the overlay net
  Edge is not a physical link
  Given peer will typically be connected with < 10 overlay neighbors
4/19/05 CS11824
Gnutella: protocol
Query message sent over existing TCP connections
peers forward Query message
QueryHit sent over reverse path
File transfer: HTTP
Scalability: limited scope flooding
[Figure: overlay of peers; Query messages spread from peer to peer and QueryHit messages return along the reverse path.]
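Limited-scope flooding can be sketched as a breadth-first walk over the overlay with a hop limit. The overlay graph, the file index, and the function name `flood_query` below are hypothetical, and the real Gnutella message formats are not modeled.

```python
from collections import deque

# Hypothetical overlay: peer -> overlay neighbors (TCP connections).
OVERLAY = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
# Hypothetical index of which peers hold which files.
FILES = {"D": {"hey_jude.mp3"}, "E": {"hey_jude.mp3"}}


def flood_query(origin, filename, max_hops):
    """Flood a Query up to max_hops overlay hops; map each hit to its reverse path."""
    hits = {}
    seen = {origin}                      # peers that already saw this Query
    queue = deque([(origin, [origin])])  # (peer, path from origin to peer)
    while queue:
        peer, path = queue.popleft()
        if peer != origin and filename in FILES.get(peer, set()):
            hits[peer] = list(reversed(path))  # route the QueryHit takes back
        if len(path) - 1 >= max_hops:
            continue                     # hop limit reached: do not forward
        for neighbor in OVERLAY[peer]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [neighbor]))
    return hits
```

With a hop limit of 2, a query from A finds the copy at D but never reaches E; raising the limit to 3 finds both, at the cost of more Query messages.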
4/19/05 CS11825
Gnutella: Peer joining
1. Joining peer X must find some other peer in Gnutella network: use list of candidate peers
2. X sequentially attempts to make TCP connections with peers on list until connection setup with Y
3. X sends Ping message to Y; Y forwards Ping message.
4. All peers receiving Ping message respond with Pong message
5. X receives many Pong messages. It can then setup additional TCP connections
4/19/05 CS11826
Exploiting heterogeneity: KaZaA
Each peer is either a group leader or assigned to a group leader.
  TCP connection between peer and its group leader.
  TCP connections between some pairs of group leaders.
Group leader tracks the content in all its children.
[Figure legend: ordinary peer; group-leader peer; neighboring relationships in overlay network]
4/19/05 CS11827
KaZaA: Querying
Each file has a hash and a descriptor
Client sends keyword query to its group leader
Group leader responds with matches:
  For each match: metadata, hash, IP address
If group leader forwards query to other group leaders, they respond with matches
Client then selects files for downloading
  HTTP requests using hash as identifier sent to peers holding desired file
4/19/05 CS11828
KaZaA tricks
Limitations on simultaneous uploads
Request queuing
Incentive priorities
Parallel downloading
4/19/05 CS11829
Chapter 2: Application layer
2.1 Principles of network applications
2.2 Web and HTTP
2.3 FTP
2.4 Electronic Mail
  SMTP, POP3, IMAP
2.5 DNS
2.6 P2P file sharing
2.7 Socket programming with TCP
2.8 Socket programming with UDP
2.9 Building a Web server
4/19/05 CS11830
Socket-programming using TCP
Socket: a door between application process and end-end transport protocol (UDP or TCP)
TCP service: reliable transfer of bytes from one process to another
[Figure: two hosts (host or server) connected through the internet. On each host a process, controlled by the application developer, sits above a socket; below the socket is TCP with buffers and variables, controlled by the operating system.]
4/19/05 CS11831
Socket programming with TCP
Client must contact server
  server process must first be running
  server must have created socket (door) that welcomes client’s contact
Client contacts server by:
  creating client-local TCP socket
  specifying IP address, port number of server process
  When client creates socket: client TCP establishes connection to server TCP
When contacted by client, server TCP creates new socket for server process to communicate with client
  allows server to talk with multiple clients
  source port numbers used to distinguish clients (more in Chap 3)
application viewpoint: TCP provides reliable, in-order transfer of bytes (“pipe”) between client and server
4/19/05 CS11832
Client/server socket interaction: TCP
Server (running on hostid):
  create socket, port=x, for incoming request: welcomeSocket = ServerSocket()
  wait for incoming connection request: connectionSocket = welcomeSocket.accept()
  read request from connectionSocket
  write reply to connectionSocket
  close connectionSocket
Client:
  create socket, connect to hostid, port=x: clientSocket = Socket()
  send request using clientSocket
  read reply from clientSocket
  close clientSocket
[Figure label: TCP connection setup, between the client's socket creation and the server's accept()]
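The interaction above uses Java-style names (ServerSocket, Socket). As a rough Python counterpart, here is a minimal loopback sketch using the standard `socket` module. The function names `serve_once` and `run_demo`, the upper-casing "service", and the single `recv` call (adequate only for a short message) are assumptions made for the illustration.

```python
import socket
import threading


def serve_once(welcome_socket):
    """Accept one client, upper-case its request, reply, and close."""
    connection_socket, _client_addr = welcome_socket.accept()
    with connection_socket:
        request = connection_socket.recv(1024)
        connection_socket.sendall(request.upper())


def run_demo(message=b"hello"):
    # Server side: welcoming socket on a loopback port chosen by the OS.
    welcome_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    welcome_socket.bind(("127.0.0.1", 0))
    welcome_socket.listen(1)
    port = welcome_socket.getsockname()[1]
    server = threading.Thread(target=serve_once, args=(welcome_socket,))
    server.start()

    # Client side: connecting triggers TCP connection setup.
    with socket.create_connection(("127.0.0.1", port)) as client_socket:
        client_socket.sendall(message)
        reply = client_socket.recv(1024)

    server.join()
    welcome_socket.close()
    return reply
```

Note that `accept()` returns a new socket for each client, which is the per-connection socket the slides call connectionSocket; the welcoming socket stays available for further clients.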
4/19/05 CS11833
Socket programming with UDP
UDP: no “connection” between client and server
  no handshaking
  sender explicitly attaches IP address and port of destination to each packet
  server must extract IP address, port of sender from received packet
UDP: transmitted data may be lost, or received out of order
application viewpoint: UDP provides unreliable transfer of chunks of bytes (“datagrams”) between client and server
4/19/05 CS11834
Client/server socket interaction: UDP
Server (running on hostid):
  create socket, port=x, for incoming request: serverSocket = DatagramSocket()
  read request from serverSocket
  write reply to serverSocket, specifying client host address, port number
Client:
  create socket: clientSocket = DatagramSocket()
  create datagram addressed to (hostid, port=x), send datagram request using clientSocket
  read reply from clientSocket
  close clientSocket
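A comparable loopback sketch for UDP, again with Python's standard `socket` module rather than the Java-style DatagramSocket named above. The function name `udp_round_trip`, the upper-casing reply, and the 2-second timeouts are illustrative assumptions. Both ends run in one thread, which works here because the kernel queues the datagram until the server reads it.

```python
import socket


def udp_round_trip(message=b"hello"):
    # Server socket bound to a loopback port chosen by the OS.
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server_socket.bind(("127.0.0.1", 0))
    server_socket.settimeout(2.0)
    server_addr = server_socket.getsockname()

    client_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client_socket.settimeout(2.0)
    try:
        # No handshake: the destination address travels with each datagram.
        client_socket.sendto(message, server_addr)

        # The server learns the sender's (IP, port) from the datagram itself.
        request, client_addr = server_socket.recvfrom(2048)
        server_socket.sendto(request.upper(), client_addr)

        reply, _ = client_socket.recvfrom(2048)
        return reply
    finally:
        client_socket.close()
        server_socket.close()
```

Unlike the TCP case there is no accept() step and no per-client socket: one server socket handles every client, telling them apart by the address returned from `recvfrom`.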
4/19/05 CS11835
Chapter 2: Summary
Application architectures
  client-server
  P2P
application service requirements: reliability, bandwidth, delay
Internet transport service model
  connection-oriented, reliable: TCP
  unreliable, datagrams: UDP
Specific application protocols: HTTP, FTP, SMTP, POP, IMAP, DNS
Learned about protocols
  typical request/reply message exchange:
    client sends request
    server responds with status code, data
  Typical message formats:
    headers: fields giving info about data
    data: info being communicated
  In-band vs. out-of-band control messages
  Stateless vs. stateful protocols
4/19/05 CS11836
identifying servers and services
Each service is assigned a unique well-known port #
  HTTP: TCP/80, FTP: TCP/21, SMTP: TCP/25, DNS: UDP/53
server application process registers with local protocol software with that port #
a client requests a service by sending request to a specific server host with the well-known port #
server handles multiple requests concurrently
  master process accepts incoming requests and creates a child server process for each client, then goes back to wait for future requests
  the child server process handles all msgs from the same client process; each incoming msg identifies its server process by (source addr + port, destination addr + port)
4/19/05 CS11837
Transport services and protocols
provide logical communication between app processes running on different hosts
transport protocols run in end systems
  send side: breaks app messages into segments, passes to network layer
  rcv side: reassembles segments into messages, passes to app layer
more than one transport protocol available to apps
  Internet: TCP and UDP
[Figure: two end systems with full application/transport/network/data link/physical stacks, connected through intermediate nodes that have only network/data link/physical layers; an arrow labeled "logical end-end transport" joins the two transport layers.]
4/19/05 CS11838
Transport vs. network layer
network layer: logical communication between hosts
transport layer: logical communication between processes
  relies on, enhances, network layer services
Household analogy: 12 kids sending letters to 12 kids
  processes = kids
  app messages = letters in envelopes
  hosts = houses
  transport protocol = Ann and Bill
  network-layer protocol = postal service
4/19/05 CS11839
Internet transport-layer protocols
reliable, in-order delivery (TCP)
  congestion control
  flow control
  connection setup
unreliable, unordered delivery: UDP
  no-frills extension of “best-effort” IP
services not available:
  delay guarantees
  bandwidth guarantees
[Figure: two end systems with full application/transport/network/data link/physical stacks, connected through intermediate nodes that have only network/data link/physical layers; an arrow labeled "logical end-end transport" joins the two transport layers.]
4/19/05 CS11840
Multiplexing/demultiplexing
Demultiplexing at rcv host: delivering received segments to correct socket
Multiplexing at send host: gathering data from multiple sockets, enveloping data with header (later used for demultiplexing)
Each process is identified by IP address and port #
A transport association is identified by [source addr, port#; destination addr, port#]
[Figure: host 1, host 2, and host 3, each with an application/transport/network/link/physical stack; processes P1 to P4 attach to the transport layer through sockets.]
4/19/05 CS11841
Multiplexing/demultiplexing: examples
host receives IP datagrams
  each datagram has source IP address, destination IP address
  each datagram carries 1 transport-layer segment
  each segment has source, destination port number
host uses IP addresses & port numbers to direct segment to appropriate socket
[Figure: port use at a Web server. Web client host A and Web clients on host C send segments to Web server B:
  Source IP: A, Dest IP: B, source port: 1180, dest. port: 80
  Source IP: C, Dest IP: B, source port: 1180, dest. port: 80
  Source IP: C, Dest IP: B, source port: 2211, dest. port: 80]
4/19/05 CS11842
UDP: User Datagram Protocol [RFC 768]
“best effort” service, UDP segments may be:
  lost
  delivered out of order to application processes
connectionless:
  no prior handshaking between UDP sender, receiver
  each UDP segment handled independently of others
Why is there a UDP?
  no connection establishment (which can add delay)
  simple: no connection state at sender, receiver
  small segment header
  no congestion control: UDP can blast away as fast as desired
4/19/05 CS11843
UDP: more
often used for streaming multimedia apps
  loss tolerant
  rate sensitive
other UDP uses
  DNS
  SNMP
reliable transfer over UDP: add reliability at application layer
  application-specific error recovery!
UDP segment format (32 bits wide):
  source port # | dest port #
  length | checksum
  Application data (message)
  length: length, in bytes, of UDP segment, including header
4/19/05 CS11844
UDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment
Sender:
  treat segment contents as sequence of 16-bit integers
  checksum: addition (1’s complement sum) of segment contents
  sender puts checksum value into UDP checksum field
Receiver:
  compute checksum of received segment
  check if computed checksum equals checksum field value:
    NO - error detected
    YES - no error detected. But maybe errors nonetheless? More later ….
4/19/05 CS11845
Internet Checksum Example
Note: When adding numbers, a carryout from the most significant bit needs to be added to the result
Example: add two 16-bit integers
              1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
              1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
wraparound  1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
sum           1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum      0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
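The example can be checked with a few lines of code. This sketch assumes the two operands are the 16-bit values 0xE666 and 0xD555, which is how the bit rows read once the 17-bit intermediate sum 0x1BBBB is taken into account; the function names are illustrative.

```python
def ones_complement_sum16(words):
    """Add 16-bit words, wrapping any carry out of bit 15 back into the sum."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # wraparound of the carry
    return total


def internet_checksum(words):
    """Checksum = bitwise complement of the 1's complement sum."""
    return ~ones_complement_sum16(words) & 0xFFFF
```

A receiver can verify a segment by summing all words including the checksum: with no errors the 1's complement sum is 0xFFFF (all ones).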
4/19/05 CS11846
How to Calculate UDP Checksum
UDP header format (32 bits wide):
  source port # | dest port #
  length | checksum
  Application data (message)
pseudo header (32 bits wide):
  source IP address
  destination IP address
  zero | protocol | UDP length
Length: # of bytes (including both header & data)
checksum: computed over
  • the pseudo header, and
  • UDP header and data.
  • if the field is 0, no checksum
pseudo header: UDP's self-protection against misdelivered IP packets
  pseudo header is not carried in UDP packet, nor counted in the length field
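A sketch of the calculation over pseudo header, UDP header, and data for IPv4, using `struct` to lay out the fields. The function names, addresses, and ports are illustrative. Transmitting a computed value of zero as all ones follows RFC 768, since a zero field means "no checksum".

```python
import socket
import struct


def checksum16(data: bytes) -> int:
    """Complement of the 1's complement sum of the data as 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                     # pad to a whole 16-bit word
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload: bytes) -> int:
    length = 8 + len(payload)               # UDP header is 8 bytes
    # Pseudo header: source IP, destination IP, zero, protocol, UDP length.
    pseudo = struct.pack("!4s4sBBH",
                         socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
                         0, socket.IPPROTO_UDP, length)
    # Real header, with the checksum field set to 0 while computing.
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    value = checksum16(pseudo + header + payload)
    return value or 0xFFFF                  # 0 is reserved for "no checksum"
```

Because the IP addresses are part of the sum, a segment delivered to the wrong host fails verification even if its own bytes are intact.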
4/19/05 CS11847
Chapter 3 outline
3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  segment structure
  reliable data transfer
  flow control
  connection management
3.6 Principles of congestion control
3.7 TCP congestion control
4/19/05 CS11848
Principles of Reliable data transfer
characteristics of the unreliable channel determine the complexity of the reliable data transfer protocol (rdt)
We’ll:
  incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
  consider only unidirectional data transfer
    but control info will flow in both directions!
4/19/05 CS11849
Reliable data transfer: getting started
send side:
  rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
  udt_send(): called by rdt, to transfer packet over unreliable channel to receiver
receive side:
  rdt_rcv(): called when packet arrives on rcv-side of channel
  deliver_data(): called by rdt to deliver data to upper layer
4/19/05 CS11850
Reliable data transfer: getting started
use finite state machines (FSM) to specify sender, receiver
state: when in this “state”, the next state is uniquely determined by next event
[Figure: state 1 and state 2 joined by a transition arrow labeled with the event causing the state transition and the actions taken on the state transition.]
4/19/05 CS11851
Rdt1.0: reliable transfer over a reliable channel
underlying channel perfectly reliable
  no bit errors
  no loss of packets
separate FSMs for sender, receiver:
  sender sends data into underlying channel
  receiver reads data from underlying channel
sender FSM, state "Wait for call from above":
  rdt_send(data) -> packet = make_pkt(data); udt_send(packet)
receiver FSM, state "Wait for call from below":
  rdt_rcv(packet) -> extract(packet,data); deliver_data(data)
4/19/05 CS11852
Rdt2.0: channel with bit errors
underlying channel may flip bits in packet
  checksum to detect bit errors
the question: how to recover from errors:
  acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  sender retransmits pkt on receipt of NAK
new mechanisms in rdt2.0 (beyond rdt1.0):
  error detection
  receiver feedback: control msgs (ACK,NAK) rcvr->sender
4/19/05 CS11853
rdt2.0: FSM specification
sender FSM:
  state "Wait for call from above":
    rdt_send(data) -> sndpkt = make_pkt(data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK"
  state "Wait for ACK or NAK":
    rdt_rcv(rcvpkt) && isNAK(rcvpkt) -> udt_send(sndpkt); stay
    rdt_rcv(rcvpkt) && isACK(rcvpkt) -> (no action); go to "Wait for call from above"
receiver FSM:
  state "Wait for call from below":
    rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> udt_send(NAK)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) -> extract(rcvpkt,data); deliver_data(data); udt_send(ACK)
4/19/05 CS11854
rdt2.0: operation with no errors
[Figure: the rdt2.0 sender and receiver FSMs; the transitions taken when no errors occur are]
  sender: rdt_send(data) -> sndpkt = make_pkt(data, checksum); udt_send(sndpkt)
  receiver: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) -> extract(rcvpkt,data); deliver_data(data); udt_send(ACK)
  sender: rdt_rcv(rcvpkt) && isACK(rcvpkt) -> back to "Wait for call from above"
4/19/05 CS11855
rdt2.0: error scenario
[Figure: the rdt2.0 sender and receiver FSMs; the transitions taken when a packet is corrupted are]
  sender: rdt_send(data) -> sndpkt = make_pkt(data, checksum); udt_send(sndpkt)
  receiver: rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> udt_send(NAK)
  sender: rdt_rcv(rcvpkt) && isNAK(rcvpkt) -> udt_send(sndpkt); stay in "Wait for ACK or NAK"
4/19/05 CS11856
rdt2.0 has a fatal flaw!
What happens if ACK/NAK corrupted?
  sender doesn’t know what happened at receiver!
  can’t just retransmit: possible duplicate
Handling duplicates:
  sender retransmits current pkt if ACK/NAK garbled
  sender adds sequence number to each pkt
  receiver discards (doesn’t deliver up) duplicate pkt
stop and wait: Sender sends one packet, then waits for receiver response
4/19/05 CS11857
rdt2.1: sender, handles garbled ACK/NAKs
state "Wait for call 0 from above":
  rdt_send(data) -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK 0"
state "Wait for ACK or NAK 0":
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ) -> udt_send(sndpkt); stay
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) -> (no action); go to "Wait for call 1 from above"
state "Wait for call 1 from above":
  rdt_send(data) -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); go to "Wait for ACK or NAK 1"
state "Wait for ACK or NAK 1":
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ) -> udt_send(sndpkt); stay
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) -> (no action); go to "Wait for call 0 from above"
4/19/05 CS11858
rdt2.1: receiver, handles garbled ACK/NAKs
state "Wait for 0 from below":
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) -> extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to "Wait for 1 from below"
  rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) -> sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
state "Wait for 1 from below":
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) -> extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt); go to "Wait for 0 from below"
  rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) -> sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)
4/19/05 CS11859
rdt2.1: discussion
Sender:
  seq # added to pkt
  two seq. #’s (0,1) will suffice. Why?
  must check if received ACK/NAK corrupted
  twice as many states
    state must “remember” whether “current” pkt has 0 or 1 seq. #
Receiver:
  must check if received packet is duplicate
    state indicates whether 0 or 1 is expected pkt seq #
  note: receiver cannot know if its last ACK/NAK received OK at sender
4/19/05 CS11860
rdt2.2: a NAK-free protocol
same functionality as rdt2.1, using ACKs only
instead of NAK, receiver sends ACK for last pkt received OK
  receiver must explicitly include seq # of pkt being ACKed
duplicate ACK at sender results in same action as NAK: retransmit current pkt
4/19/05 CS11861
rdt2.2: sender, receiver fragments
sender FSM fragment:
  state "Wait for call 0 from above":
    rdt_send(data) -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); go to "Wait for ACK 0"
  state "Wait for ACK 0":
    rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) ) -> udt_send(sndpkt); stay
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0) -> (no action); leave "Wait for ACK 0"
receiver FSM fragment:
  state "Wait for 0 from below":
    rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt)) -> udt_send(sndpkt)
  transition into "Wait for 0 from below":
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) -> extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)
4/19/05 CS11862
rdt3.0: channels with bit errors and packet loss
New assumption: underlying channel can also lose packets (data or ACKs)
  checksum, seq. #, ACKs, retransmissions will be of help, but not enough
Approach: sender waits “reasonable” amount of time for ACK
  retransmits if no ACK received in this time
  if pkt (or ACK) just delayed (not lost):
    retransmission will be duplicate, but use of seq. #’s already handles this
    receiver must specify seq # of pkt being ACKed
  requires countdown timer
4/19/05 CS11863
rdt3.0 sender
state "Wait for call 0 from above":
  rdt_send(data) -> sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer; go to "Wait for ACK0"
  rdt_rcv(rcvpkt) -> (no action)
state "Wait for ACK0":
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) ) -> (no action)
  timeout -> udt_send(sndpkt); start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0) -> stop_timer; go to "Wait for call 1 from above"
state "Wait for call 1 from above":
  rdt_send(data) -> sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer; go to "Wait for ACK1"
  rdt_rcv(rcvpkt) -> (no action)
state "Wait for ACK1":
  rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) ) -> (no action)
  timeout -> udt_send(sndpkt); start_timer
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1) -> stop_timer; go to "Wait for call 0 from above"
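To see the alternating-bit behaviour end to end, here is a small deterministic simulation in the spirit of rdt3.0. It is a simplification under stated assumptions: the channel only loses packets (no corruption, delay, or reordering), a timeout is modeled as an immediate retransmission, and the function name and the `drop_data` / `drop_acks` parameters are hypothetical.

```python
def run_stop_and_wait(messages, drop_data=(), drop_acks=()):
    """Simulate stop-and-wait with 1-bit sequence numbers over a lossy channel.

    drop_data / drop_acks hold 0-based indices of the data transmissions
    and ACK transmissions that the channel loses.
    Returns (messages delivered to the upper layer, data transmissions made).
    """
    delivered = []
    expected = 0      # receiver: seq # it is waiting for
    seq = 0           # sender: seq # of the current packet
    data_sent = 0
    acks_sent = 0
    for msg in messages:
        while True:
            lost = data_sent in drop_data
            data_sent += 1
            if lost:
                continue              # no ACK arrives: timeout, retransmit
            if seq == expected:       # new packet: deliver it once
                delivered.append(msg)
                expected ^= 1
            # Receiver ACKs the seq # it just received (new or duplicate).
            ack_lost = acks_sent in drop_acks
            acks_sent += 1
            if ack_lost:
                continue              # timeout: retransmit, creating a duplicate
            break                     # ACK for the current seq # arrived
        seq ^= 1
    return delivered, data_sent
```

Losing the first data packet and the first ACK costs two extra transmissions for the first message, yet each message is still delivered exactly once, because the receiver recognises the retransmission by its sequence number.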
4/19/05 CS11864
rdt3.0 in action
4/19/05 CS11865
rdt3.0 in action
4/19/05 CS11866
Performance of rdt3.0
example: 1 Gbps link, 15 ms prop. delay, 1KB packet:
  L: packet length in bits; R: transmission rate, bps
  $$T_{transmit} = \frac{L}{R} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec}$$
U sender: utilization, the fraction of time the sender is busy sending
  $$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027 \quad \text{(times in msec)}$$
  1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
  network protocol limits use of physical resources!
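The utilization arithmetic is easy to reproduce. The helper below is a sketch: the function name and the `window` parameter (the number of packets sent back to back per round trip, 1 for stop-and-wait) are illustrative, and the packet is taken as 8000 bits to match the 8 microsecond transmission time used on the slide.

```python
def sender_utilization(packet_bits, rate_bps, rtt_s, window=1):
    """Fraction of time the sender is busy: (window * L/R) / (RTT + L/R)."""
    t_transmit = packet_bits / rate_bps
    return min(1.0, window * t_transmit / (rtt_s + t_transmit))


# 1 Gbps link, 30 ms RTT (15 ms propagation each way), 8000-bit packet.
stop_and_wait = sender_utilization(8000, 1e9, 0.030)
three_in_flight = sender_utilization(8000, 1e9, 0.030, window=3)
```

Here `stop_and_wait` comes out at about 0.00027, and sending three packets per round trip triples it to about 0.0008.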
4/19/05 CS11867
rdt3.0: stop-and-wait operation
[Figure: sender/receiver timing diagram]
  first packet bit transmitted, t = 0
  last packet bit transmitted, t = L / R
  first packet bit arrives
  last packet bit arrives, send ACK
  ACK arrives, send next packet, t = RTT + L / R
$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027 \quad \text{(times in msec)}$$
4/19/05 CS11868
Pipelined protocols
Pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts
  range of sequence numbers must be increased
  buffering at sender and/or receiver
4/19/05 CS11869
Pipelining: increased utilization
[Figure: sender/receiver timing diagram with three packets in flight]
  first packet bit transmitted, t = 0
  last bit transmitted, t = L / R
  first packet bit arrives
  last packet bit arrives, send ACK
  last bit of 2nd packet arrives, send ACK
  last bit of 3rd packet arrives, send ACK
  ACK arrives, send next packet, t = RTT + L / R
$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008 \quad \text{(times in msec)}$$
Increase utilization by a factor of 3!
4/19/05 CS11870
What if some packets get lost?
Two generic forms of pipelined protocols: go-Back-N, selective repeat
4/19/05 CS11871
Go-Back-N
Sender:
  k-bit seq # in pkt header
  “window” of up to N consecutive unack’ed pkts allowed
  ACK(n): ACKs all pkts up to, including seq # n (cumulative ACK)
    may receive duplicate ACKs (see receiver)
  timer for each in-flight pkt
  timeout(n): retransmit pkt n and all higher seq # pkts in window
4/19/05 CS11872
GBN: sender extended FSM
initially: base=1; nextseqnum=1
state "Wait":
  rdt_send(data) (call from application) ->
    if (nextseqnum < base+N) {
      sndpkt[nextseqnum] = make_pkt(nextseqnum,data,chksum)
      udt_send(sndpkt[nextseqnum])
      if (base == nextseqnum) start_timer
      nextseqnum++
    }
    else refuse_data(data)
  timeout ->
    start_timer
    udt_send(sndpkt[base])
    udt_send(sndpkt[base+1])
    …
    udt_send(sndpkt[nextseqnum-1])
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) (call from network) ->
    base = getacknum(rcvpkt)+1
    if (base == nextseqnum) stop_timer
    else start_timer
  rdt_rcv(rcvpkt) && corrupt(rcvpkt) -> (no action)
4/19/05 CS11873
GBN: receiver extended FSM
ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
  may generate duplicate ACKs
  need only remember expectedseqnum
out-of-order pkt:
  discard (don’t buffer) -> no receiver buffering!
  Re-ACK pkt with highest in-order seq #
initially: expectedseqnum=1; sndpkt = make_pkt(expectedseqnum,ACK,chksum)
state "Wait":
  rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum) -> extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(expectedseqnum,ACK,chksum); udt_send(sndpkt); expectedseqnum++
  default -> udt_send(sndpkt)
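The two Go-Back-N state machines can be sketched as small Python classes. This is an illustration, not a full protocol: packets are `(seq, data)` tuples appended to a plain list standing in for the channel, checksums and real timers are omitted, the timer is a boolean flag, and the `max(...)` guard against stale ACKs is an addition not present in the FSM.

```python
class GbnSender:
    def __init__(self, window):
        self.window = window          # N
        self.base = 1
        self.nextseqnum = 1
        self.sndpkt = {}              # seq # -> data, kept until ACKed
        self.timer_running = False

    def rdt_send(self, data, channel):
        if self.nextseqnum >= self.base + self.window:
            return False              # window full: refuse_data
        self.sndpkt[self.nextseqnum] = data
        channel.append((self.nextseqnum, data))
        if self.base == self.nextseqnum:
            self.timer_running = True
        self.nextseqnum += 1
        return True

    def timeout(self, channel):
        # Go back N: resend every unACKed packet in the window.
        self.timer_running = True
        for seq in range(self.base, self.nextseqnum):
            channel.append((seq, self.sndpkt[seq]))

    def ack(self, acknum):
        # Cumulative ACK: everything up to acknum is acknowledged.
        for seq in range(self.base, acknum + 1):
            self.sndpkt.pop(seq, None)
        self.base = max(self.base, acknum + 1)
        self.timer_running = self.base != self.nextseqnum


class GbnReceiver:
    def __init__(self):
        self.expectedseqnum = 1
        self.delivered = []

    def rdt_rcv(self, seq, data):
        """Return the ACK number to send (0 means nothing received yet)."""
        if seq == self.expectedseqnum:
            self.delivered.append(data)
            self.expectedseqnum += 1
        # Out-of-order packets are discarded; the last in-order seq # is re-ACKed.
        return self.expectedseqnum - 1
```

If packet 2 of three is lost, the receiver discards packet 3 and re-ACKs 1; after the timeout the sender resends both 2 and 3, which is the cost of not buffering at the receiver.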
4/19/05 CS11874
GBN in action
4/19/05 CS11875
Selective Repeat
receiver individually acknowledges all correctly received pkts
  buffers pkts (which may have arrived out-of-order), as needed, for eventual in-order delivery to upper layer
sender only resends pkts for which ACK not received
  sender timer for each unACKed pkt
sender window
  N consecutive seq #’s
  again limits seq #s of sent, unACKed pkts
4/19/05 CS11876
Selective repeat: sender, receiver windows
4/19/05 CS11877
Selective repeat
sender
  data from above:
    if next available seq # in window, send pkt
  timeout(n):
    resend pkt n, restart timer
  ACK(n) in [sendbase,sendbase+N-1]:
    mark pkt n as received
    if n smallest unACKed pkt, advance window base to next unACKed seq #
receiver
  pkt n in [rcvbase, rcvbase+N-1]
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
  pkt n in [rcvbase-N,rcvbase-1]
    ACK(n)
  otherwise:
    ignore
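The receiver rules translate almost line for line into code. This sketch assumes unbounded sequence numbers (no wraparound), so it does not show the window-size dilemma; the class name `SrReceiver` and its return convention are illustrative.

```python
class SrReceiver:
    def __init__(self, window):
        self.window = window      # N
        self.rcvbase = 0
        self.buffer = {}          # out-of-order packets awaiting delivery
        self.delivered = []

    def rdt_rcv(self, n, data):
        """Return the seq # to ACK, or None if the packet is ignored."""
        if self.rcvbase <= n <= self.rcvbase + self.window - 1:
            self.buffer[n] = data
            # Deliver any in-order run starting at rcvbase, advancing the window.
            while self.rcvbase in self.buffer:
                self.delivered.append(self.buffer.pop(self.rcvbase))
                self.rcvbase += 1
            return n
        if self.rcvbase - self.window <= n <= self.rcvbase - 1:
            return n              # already delivered: re-ACK so the sender can advance
        return None               # otherwise: ignore
```

The second branch matters when an ACK was lost: the sender retransmits a packet the receiver has already delivered, and without the re-ACK the sender's window would never move.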
4/19/05 CS11878
Selective repeat in action
4/19/05 CS11879
Selective repeat: dilemma
Example:
  seq #’s: 0, 1, 2, 3
  window size=3
receiver sees no difference in two scenarios!
incorrectly passes duplicate data as new in (a)
Q: what relationship between seq # size and window size?
4/19/05 CS11880
Sequence number: how many bits needed?
[Figure: sender and receiver sequence-number lines, each showing 1 2 3 0 1 2 3 0]
(Max. seq# + 1) / 2 >= window-size
Example: Window size = 4, is 2 bits enough?
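The relationship can be checked numerically. The sketch below assumes the selective-repeat rule that the window may be at most half the sequence-number space; the function names are illustrative.

```python
def window_fits(seq_bits, window):
    """True if (max seq # + 1) / 2 >= window for a seq_bits-bit sequence number."""
    seq_space = 2 ** seq_bits         # max seq # + 1
    return seq_space // 2 >= window


def min_seq_bits(window):
    """Smallest number of sequence-number bits that supports the window."""
    bits = 1
    while not window_fits(bits, window):
        bits += 1
    return bits
```

For a window of 4, `window_fits(2, 4)` is false because 2 bits give only 4 sequence numbers, while `min_seq_bits(4)` returns 3.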
4/19/05 CS11881
Three basic components in reliable data delivery by retransmission
  sequence number: used by both sender and receiver to uniquely identify individual frames
  Acknowledgment (ACK): reception report sent by receiver
  Retransmission by the sender upon TIMEOUT
    must know how long to wait before retry
4/19/05 CS11882
Always Keep the Big Picture in Mind
[Figure: protocol stack (application, transport, network, link, physical) with message M gaining headers Ht, Hn, Hl as it moves down the layers. A Web browser and a Web server each run HTTP over TCP through the socket interface. On one side the application process writes bytes into the TCP send buffer; on the other the application process reads bytes from the TCP receive buffer. TCP segments travel between them over unreliable network data packet delivery.]