Post on 19-Dec-2015
Chapter 3
Berkeley Sockets:
• Sockets are NOT like meeting someone on the street.
• Client sends a message to a server that is “passively” waiting for someone to talk to.
• X Windows is the exception.
• Client-server is the traditional metaphor; it’s changing (p2p).
[Figure: client and server]
Client-Server and Peer-to-Peer Examples:
• HTTP, SMTP, database access protocols
• Napster, Gnutella, Freenet
[Figure: Napster, hybrid cs/p2p. Peers and a centralized directory server. 1: Inform and Update (peers to directory server); 2: Query for Content; 3: File Transfer (peer to peer)]
What is a protocol?
• agreed-upon set of communication standards used by two or more software components.
• protocols are everywhere
• typical TCP/IP connection uses 5+ protocols

Layer              Sender           Protocol           Receiver         Runs in
Application Layer  mail client      SMTP Protocol      mail server      User Process
Transport Layer    TCP              TCP Protocol       TCP              Operating System
Network Layer      IP               IP Protocol        IP               Operating System
Datalink Layer     ethernet driver  Ethernet Protocol  ethernet driver  Operating System

Data flow: down the sender's stack, across the ethernet, up the receiver's stack.
Binary vs Text protocols:
• TCP, IP, Ethernet, ARP are all binary protocols
• Binary Protocol Problems:
  – word size
  – number architectures
  – embedded control characters
  – transmission data format
[Figure: Sender Format → pack() → Network Format → unpack() → Receiver Format]
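The pack()/unpack() step in the figure can be sketched as follows. This is a minimal illustration, not code from the slides: the field layout (a 16-bit port followed by a 32-bit counter) and the variable names are assumptions chosen for the example. The templates 'n' and 'N' are Perl's big-endian ("network order") 16-bit and 32-bit formats.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sender side: convert native Perl numbers to a fixed network format.
# 'n' = 16-bit unsigned, big-endian; 'N' = 32-bit unsigned, big-endian.
# The two fields are hypothetical; they only illustrate the idea.
my ($port, $count) = (13, 0x12345678);
my $wire = pack('n N', $port, $count);

# The wire format is 6 bytes no matter what the host's word size
# or byte order is.
my @hex = map { sprintf('%02x', ord($_)) } split(//, $wire);
print "wire bytes: @hex\n";

# Receiver side: convert the network format back to native numbers.
my ($got_port, $got_count) = unpack('n N', $wire);
print "port=$got_port count=$got_count\n";
```

Because both sides agree on the template, the sender's and receiver's native formats never have to match.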
Text-based Protocols are Becoming Common:
• In this book, all application protocols are text-based.
• SIP, RADIUS, HTTP, SMTP, POP3, IMAP
• Easy to debug
• Resource hungry, time consuming
Berkeley Sockets:
• API to TCP/IP
• Early to UNIX systems (always networked)
• Developed at UC-Berkeley
• Windows/Mac environments started without them, also have their own interface to TCP/IP
• Berkeley Sockets are the most common API
What is a Socket?
• A communications end-point.
• A good metaphor: the mailbox in front of your house:
  – put stuff in there and it disappears
  – open it and miraculously there is mail
  – if you don’t remove your mail, delivery stops
  – only your mail, no one else’s
• Ways they are different:
  – pay per piece of mail
  – one mailbox for all people (applications)
  – fixed pick-up and delivery times
Socket Anatomy:
• Domain:
  – networking protocols, addressing schemes supported
  – local (AF_UNIX) vs remote (AF_INET) communication
    • interprocess communication vs TCP/IP networking
  – Exercise: Find definition location and value in OS
• Type:
  – types of data transmission service: “stream” or “datagram”
  – SOCK_STREAM vs SOCK_DGRAM (continuous vs discrete)
  – Exercise: Find definition location and value in OS
• Protocol:
  – TCP supports SOCK_STREAM
  – UDP supports SOCK_DGRAM
Allowed Combinations
Domain    Type         Protocol
AF_INET   SOCK_STREAM  tcp
AF_INET   SOCK_DGRAM   udp
AF_UNIX   SOCK_STREAM  PF_UNSPEC
AF_UNIX   SOCK_DGRAM   PF_UNSPEC

AF == Address Family
PF == Protocol Family
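The four allowed combinations can be tried directly with socket(). The sketch below is illustrative only: the labels and the loop are not from the slides, and the fallback protocol numbers (6 and 17) are an assumption for systems without a protocols database. It also prints the numeric values of the constants, which is one way to start on the two exercises above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket qw(AF_INET AF_UNIX SOCK_STREAM SOCK_DGRAM PF_UNSPEC);

# Look up protocol numbers; fall back to the IANA values if the
# system has no protocols database (assumption for this sketch).
my $tcp = getprotobyname('tcp'); $tcp = 6  unless defined $tcp;
my $udp = getprotobyname('udp'); $udp = 17 unless defined $udp;

# One row per allowed combination: label, domain, type, protocol.
my @combos = (
    [ 'AF_INET/SOCK_STREAM', AF_INET, SOCK_STREAM, $tcp      ],
    [ 'AF_INET/SOCK_DGRAM',  AF_INET, SOCK_DGRAM,  $udp      ],
    [ 'AF_UNIX/SOCK_STREAM', AF_UNIX, SOCK_STREAM, PF_UNSPEC ],
    [ 'AF_UNIX/SOCK_DGRAM',  AF_UNIX, SOCK_DGRAM,  PF_UNSPEC ],
);

for my $combo (@combos) {
    my ($label, $domain, $type, $protocol) = @$combo;
    if (socket(my $sock, $domain, $type, $protocol)) {
        print "$label: ok (domain=$domain type=$type protocol=$protocol)\n";
        close $sock;
    }
    else {
        print "$label: failed: $!\n";
    }
}
```

The numeric values printed for the domain and type constants differ between operating systems, which is why code should use the symbolic names.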
Datagram Sockets:
• Connectionless, unreliable, unsequenced, record-oriented messages
• UDP, chief implementation
• closely related to postal service (best-effort, uncoordinated, irregular)
• no long-term relationship between sender and receiver
• no built-in acknowledgement of receipt (up to you)
• no flow-control (receipt buffers full? throw it away)
• delivery unreliable, contents reliable
Stream Sockets:
• Sequenced, reliable, bi-directional, byte-oriented messages with end-to-end awareness
• TCP is the chief implementation
• Phone conversation is best metaphor
• stream == file (think of C++)
• Similar to
    open(FH, "| $command");
  but bidirectional.
• TCP is byte-stream on top of datagram (IP). So TCP adds “intelligence” to IP.
Datagram vs Stream Sockets:
• Why use UDP? TCP is more reliable.
• VoIP uses UDP because it is fast and reliable “enough”.
• UDP good for applications with very light service.
• UDP can’t be used if every byte must arrive.
• UDP good for broadcasting and multicasting (only one socket on the sender end).
• TCP is one-to-one; UDP is many-to-many
• DNS uses UDP with a TCP fall-back in case of a large data exchange.
• NFS uses UDP but is decidedly local
Socket Addressing:
[Figure: one host's stack, top to bottom: Application, TCP/UDP, IP]
• host address = IP address
• Application address = host address:port number
• TCP/UDP local address = file descriptor/handle
IP Addresses:
• IP addresses are 32-bit integers.
• IP addresses are presented to the user as four 1-byte numbers:
    a.b.c.d where 0 ≤ a, b, c, d ≤ 255
• The above is always a string.
• Perl often uses the binary form of an IP address; you need to convert:
    # dotted string to packed binary form
    ($a,$b,$c,$d) = split(/\./,'137.140.8.101');
    $packed_ip_addr = pack('C4',$a,$b,$c,$d);
    # packed binary form back to dotted string
    ($a,$b,$c,$d) = unpack('C4', $packed_ip_addr);
    $dotted_ip_addr = join('.',$a,$b,$c,$d);
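The Socket module's inet_aton() and inet_ntoa() do the same conversion as the pack/unpack pair above. A small sketch, with illustrative variable names, showing that the two approaches agree and that the packed form really is a 32-bit integer:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

my $dotted = '137.140.8.101';

# Pack by hand, as on the slide, and with the Socket helper.
my $by_hand   = pack('C4', split(/\./, $dotted));
my $by_socket = inet_aton($dotted);
print "same packed form\n" if $by_hand eq $by_socket;

# The packed form is 4 bytes; 'N' reads it as one 32-bit integer.
my $as_integer = unpack('N', $by_socket);
printf "length=%d integer=%u\n", length($by_socket), $as_integer;

# And back to the dotted string.
my $round_trip = inet_ntoa($by_socket);
print "back to dotted: $round_trip\n";
```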
loopback:
• 127.0.0.1 is the IP address of the current machine (alias localhost).
• 127.0.0.1 is also called the loopback address
[Figure: host stack (Application, TCP/UDP, IP) with the loopback path at the IP layer; loopback tests the IP stack]
host Internet Address:
• hard coded by system administrator or
• provided by a DHCP server (BOOTP in the old days)
[Figure: host stack (Application, TCP/UDP, IP); possible to have two network interfaces with distinct IP addresses]
somewhat complicated example:
• also possible to have one network interface with multiple IP addresses
• mywebsite.com and yourwebsite.com are both hosted at port 80 on the same host.
  – mywebsite.com has IP address: a.b.c.1
  – yourwebsite.com has IP address: a.b.c.2
• GET arrives for a.b.c.2:80 so TCP knows to deliver to your httpd.
[Figure: one host running myhttpd and yourhttpd above TCP and IP; one interface with addresses a.b.c.1 and a.b.c.2]
Reserved IP Addresses:
• SUNY NP has reserved 137.140.0.0 - 137.140.255.255
• Inefficient since we use only 3000-4000 of these, but outside NP all NP addresses are referenced by a single routing table entry: 137.140.0.0/16
• This entry is found by matching it to the number 137.140.a.b & ff.ff.0.0
• This is part of what we call the Routing Algorithm.
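The masking step (address & ff.ff.0.0) can be written out in Perl. The helper name in_network() is hypothetical and only illustrates the match against a single routing table entry; a real routing algorithm chooses among many entries.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket qw(inet_aton);

# Hypothetical helper: true if $addr falls inside $network/$prefix_len.
sub in_network {
    my ($addr, $network, $prefix_len) = @_;

    # A /16 prefix gives the mask ff.ff.0.0, i.e. 0xFFFF0000.
    my $mask = $prefix_len == 0
        ? 0
        : (0xFFFFFFFF << (32 - $prefix_len)) & 0xFFFFFFFF;

    my $addr_n = unpack('N', inet_aton($addr));
    my $net_n  = unpack('N', inet_aton($network));
    return ($addr_n & $mask) == ($net_n & $mask) ? 1 : 0;
}

for my $addr ('137.140.8.101', '137.141.8.101') {
    my $where = in_network($addr, '137.140.0.0', 16) ? 'inside' : 'outside';
    print "$addr is $where 137.140.0.0/16\n";
}
```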
IP Address Ranges:
class  range
A      000.000.000.000 to 127.255.255.255
B      128.000.000.000 to 191.255.255.255
C      192.000.000.000 to 223.255.255.255
D      224.000.000.000 to 239.255.255.255
E      240.000.000.000 to 255.255.255.255

class  leading bits  netid         hostid
A      0             7-bit netid   24 bits (subnetid/16, hostid/8)
B      1 0           14-bit netid  16 bits (subnetid/8, hostid/8)
C      1 1 0         21-bit netid  8-bit hostid
D      1 1 1 0
E      1 1 1 1
Exercise: The loopback belongs to what class? New Paltz has what kind of IP address?
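Since the class is decided by the leading bits, it can be read off the first byte of the address. A minimal sketch follows; the function name ip_class() is made up for this example, and the thresholds come from the range table.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: classify a dotted-quad address by its first byte.
sub ip_class {
    my ($dotted) = @_;
    my ($first) = split(/\./, $dotted);
    return 'A' if $first < 128;    # leading bit  0
    return 'B' if $first < 192;    # leading bits 10
    return 'C' if $first < 224;    # leading bits 110
    return 'D' if $first < 240;    # leading bits 1110
    return 'E';                    # leading bits 1111
}

for my $addr (qw(10.0.0.1 172.16.0.1 192.168.1.1 224.0.0.1)) {
    print "$addr is class ", ip_class($addr), "\n";
}
```

The same function can be used to check your answers to the exercise.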
IP Address Classes:
• Class A address: Network Part is 8 bits
• Class B address: Network Part is 16 bits
• Class C address: Network Part is 24 bits
Example (class B): 137.140.8.101
  network part: 137.140   subnet part: 8   host ID: 101
Special IP Addresses:
• An IP address with HostID == 0 or HostID == 255 is never used for a particular computer.
  – 137.140.8.0 represents the entire subnetwork and is called the subnet address.
  – 137.140.8.255 is the broadcast address on the 137.140.8.0 subnet.
• Some IP addresses are never used on the public internet:
  – 127.0.0.* loopback
  – 10.*.*.* private class A addresses
  – 172.16.*.* - 172.31.*.* private class B addresses
  – 192.168.*.* private class C addresses (used behind home router)
IPv6
• We are running out of IP addresses
• A new version of IP, called IP version 6 or IPv6, has been introduced
• IPv6 uses 128-bit (16-byte) addresses
• We expect to use up the IPv6 address space by 2030.
Port Numbers:
[Figure: host stack: Application, TCP/UDP, IP]
• host address = IP address
• Application address = host address:port number (protocol)
• TCP/UDP local address = file descriptor/handle
• Port Numbers are 2-byte numbers: 1-65535
Reserved Ports:
• Port numbers under 1024 are reserved for “well-known” services (Appendix C).
• Only root can execute a program that opens one of these ports.
• Ports between 1024 and 49151 are used for servers
• Ports above 49151 are called “ephemeral” and are used when you ask for a local port without specifying which one
sockaddr_in
• host address + port number (packed)
• sockaddr_un for AF_UNIX addresses
• useful functions:
    $packed_ip_addr = inet_aton($dotted_ip_addr)
    $dotted_ip_addr = inet_ntoa($packed_ip_addr)
    $packed_sockaddr = sockaddr_in($port,$packed_ip_addr)           # scalar context
    ($port,$packed_ip_addr) = sockaddr_in($packed_sockaddr)         # list context
    $packed_sockaddr = pack_sockaddr_in($port,$packed_ip_addr)
    ($port,$packed_ip_addr) = unpack_sockaddr_in($packed_sockaddr)
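A short round trip through these functions, as a sketch with illustrative variable names. sockaddr_in() packs or unpacks depending on context, while pack_sockaddr_in() and unpack_sockaddr_in() say explicitly which direction is meant.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa sockaddr_in
              pack_sockaddr_in unpack_sockaddr_in);

my $packed_ip = inet_aton('127.0.0.1');

# Scalar context: port + packed IP address -> packed socket address.
my $packed_sockaddr = sockaddr_in(13, $packed_ip);

# List context: packed socket address -> port + packed IP address.
my ($port, $ip) = sockaddr_in($packed_sockaddr);
print "port=$port host=", inet_ntoa($ip), "\n";

# The explicit functions give the same results in any context.
my $explicit = pack_sockaddr_in(13, $packed_ip);
my ($port2, $ip2) = unpack_sockaddr_in($explicit);
print "explicit pack matches\n" if $explicit eq $packed_sockaddr;
```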
    #!/usr/bin/perl
    # file: daytime_cli.pl
    # Figure 3.4: A Daytime Client

    use strict;
    use Socket;

    use constant DEFAULT_ADDR => '127.0.0.1';
    use constant PORT         => 13;
    use constant IPPROTO_TCP  => 6;

    my $address     = shift || DEFAULT_ADDR;
    my $packed_addr = inet_aton($address);
    my $destination = sockaddr_in(PORT,$packed_addr);

    # SOCK is the handle; what did you learn about PF_INET?
    socket(SOCK,PF_INET,SOCK_STREAM,IPPROTO_TCP) or die "Can't make socket: $!";
    connect(SOCK,$destination)                   or die "Can't connect: $!";

    print <SOCK>;
What does socket() do?
• Creates (127.0.0.1,port_num,127.0.0.1,13), which is a 4-tuple that identifies a connection between daytime_cli.pl running on this machine and the daytimed program also running on this machine
• Uses the TCP protocol
Identifying a Connection:
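[Figure: daytime_cli.pl on host a.b.c.d (over TCP, IP), local port 55123, connected to daytimed on host m.n.o.p (over TCP, IP), port 13]
• as seen from the client: (a.b.c.d,55123,m.n.o.p,13)
• as seen from the server: (m.n.o.p,13,a.b.c.d,55123)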
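The two views of a connection can be observed with getsockname() and getpeername(). The sketch below is an assumption-laden illustration rather than the daytime example itself: it sets up a throwaway listener on the loopback address in the same script (standing in for daytimed) so that it runs without a daytime server, and the helper endpoint() is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket qw(PF_INET SOCK_STREAM INADDR_LOOPBACK
              pack_sockaddr_in unpack_sockaddr_in inet_ntoa);

my $tcp = getprotobyname('tcp');
$tcp = 6 unless defined $tcp;

# Throwaway listener standing in for the server; port 0 asks the
# operating system to pick any free port.
socket(my $listener, PF_INET, SOCK_STREAM, $tcp) or die "socket: $!";
bind($listener, pack_sockaddr_in(0, INADDR_LOOPBACK)) or die "bind: $!";
listen($listener, 1) or die "listen: $!";
my ($server_port) = unpack_sockaddr_in(getsockname($listener));

# Client side: no bind(), so the local port is an ephemeral one.
socket(my $client, PF_INET, SOCK_STREAM, $tcp) or die "socket: $!";
connect($client, pack_sockaddr_in($server_port, INADDR_LOOPBACK))
    or die "connect: $!";
accept(my $conn, $listener) or die "accept: $!";

# Hypothetical helper: packed socket address -> (dotted IP, port).
sub endpoint {
    my ($port, $packed_ip) = unpack_sockaddr_in($_[0]);
    return (inet_ntoa($packed_ip), $port);
}

my @client_view = (endpoint(getsockname($client)), endpoint(getpeername($client)));
my @server_view = (endpoint(getsockname($conn)),   endpoint(getpeername($conn)));
printf "client view: (%s,%d,%s,%d)\n", @client_view;
printf "server view: (%s,%d,%s,%d)\n", @server_view;

close $_ for $client, $conn, $listener;
```

The server's view is the client's view with the two halves swapped, as in the figure.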
Network Names and Services:
• Normally we use host names and not IP addresses.
• However, routing works with IP addresses.
• Problem: Translate host names into IP addresses.
    # list context; $name is the canonical (official) name
    ($name,$aliases,$type,$len,$packed_ip_addr) = gethostbyname($hostname)
    # scalar context
    $packed_ip_addr = gethostbyname($hostname)
• Example:
    ($name,$aliases,$type,$len,$packed_ip_addr) = gethostbyname("npmail.newpaltz.edu");
    print "$name\n";
    # prints
    esperanza.newpaltz.edu
gethostbyname() 2:
    $packed_ip_addr = gethostbyname("a.b.c.d");
• pass an IP address to gethostbyname() and it returns the same address packed and ready to go
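A quick check of this behaviour, as a sketch. It assumes the usual C library behaviour of skipping the lookup when the argument is already a dotted-quad address, so it should run without a name server.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

my $dotted = '137.140.8.101';

# Scalar context: returns only the packed address.
my $packed = gethostbyname($dotted);

if (defined $packed) {
    print "gethostbyname gave ", inet_ntoa($packed), "\n";
    print "same as inet_aton\n" if $packed eq inet_aton($dotted);
}
else {
    print "lookup failed for $dotted\n";
}
```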
gethostbyaddr():
• Suppose you know an IP address and want more info. This is what we call “reverse lookup”
    # scalar context returns full domain name
    $name = gethostbyaddr($packed_ip_addr,$family);
    # list context returns same as gethostbyname(); family is usually AF_INET
    ($name,$aliases,$type,$len,$packed_ip_addr) = gethostbyaddr($packed_ip_addr,$family);
gethostbyname() 3:
• How does gethostbyname() work?
  – looks first in /etc/hosts
  – forwards the request to the addresses found in /etc/resolv.conf

    [pletch@joyous etc]$ cat /etc/hosts
    # Do not remove the following line, or various programs
    # that require network functionality will fail.
    127.0.0.1       localhost.localdomain localhost
    137.140.8.101   joyous.cs.newpaltz.edu joyous
    137.140.4.181   avalon.cs.newpaltz.edu
    [pletch@joyous etc]$
gethostbyname() 4:
• How does gethostbyname() work?
  – forwards the request to the addresses found in /etc/resolv.conf

    [pletch@joyous etc]$ cat /etc/resolv.conf
    search cs.newpaltz.edu newpaltz.edu engr.newpaltz.edu
    nameserver 137.140.7.101
    nameserver 137.140.1.98
    nameserver 137.140.1.102
    [pletch@joyous etc]$

• search line: start search for incomplete name by appending these local domain names in order
• nameserver lines: send request here
Ethereal Output:
gethostbyname("npmail")
[Figure: Ethereal capture of the DNS queries]
• 1st: looking for npmail.cs.newpaltz.edu
• 2nd: looking for npmail.newpaltz.edu
• asking first listed name server
• response is CNAME response
Example (cont):
• npmail not found in /etc/hosts
• resolve npmail using /etc/resolv.conf, building names by appending local domain names found on search line.
• first search location is cs.newpaltz.edu so first ethereal output line is looking for npmail.cs.newpaltz.edu and this fails.
• second search location is newpaltz.edu so second ethereal output line is looking for npmail.newpaltz.edu and this succeeds.
Ethereal Output (2):
gethostbyname("npmail.")
[Figure: Ethereal capture of the DNS query]
• “npmail.” considered to be a complete domain name; no appending from search line
• looking for “npmail” with no domain attached
Ethereal Output:
gethostbyname("mycomputer.cs.columbia.edu")
[Figure: Ethereal capture of the DNS queries]
• name was long enough to be considered a complete domain name so try it first
• when first query fails, fall back on search list
• notice we only ask the local name server; it forwards the request to the Columbia University name server
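The three captures follow one rule for building query names, which can be sketched as a small function. This is a simplified model, not the resolver's actual code: candidate_names() is a hypothetical helper, the "long enough" test is modelled as a dot count compared with a threshold (resolv.conf calls it ndots, default 1), and trying the bare name last for short names is an assumption about common resolver behaviour.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: the names a resolver would query, in order.
sub candidate_names {
    my ($name, $ndots, @search) = @_;

    # A trailing dot marks a complete name: no search list appended.
    return ($name) if $name =~ /\.$/;

    my $dots = () = $name =~ /\./g;           # count the dots
    my @with_search = map { "$name.$_" } @search;

    # Enough dots: try the name as given first, then the search list.
    # Too few dots: search list first, the bare name last.
    return $dots >= $ndots ? ($name, @with_search)
                           : (@with_search, $name);
}

my @search = qw(cs.newpaltz.edu newpaltz.edu engr.newpaltz.edu);
for my $name ('npmail', 'npmail.', 'mycomputer.cs.columbia.edu') {
    print "$name:\n";
    print "    $_\n" for candidate_names($name, 1, @search);
}
```

A real resolver stops at the first query that succeeds, so for npmail only the first two candidates are ever sent.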
Sample Program 1:
    #!/usr/bin/perl
    # file: ip_trans.pl
    # Figure 3.5: Translating hostnames into IP addresses

    use Socket;

    while (<>) {                      # read from STDIN or from files named on the command line
        chomp;
        my $packed_address = gethostbyname($_);
        unless ($packed_address) {
            print "$_ => ?\n";
            next;
        }
        my $dotted_quad = inet_ntoa($packed_address);
        print "$_ => $dotted_quad\n";
    }

    $ ./ip_trans.pl hostnames.txt
    pesto.cshl.org => 143.48.31.104
    foo.bar.com => 64.15.205.248
    ntp.css.gov => ?
    $
Sample Program 2:
    #!/usr/bin/perl
    # file: name_trans.pl
    # Figure 3.6: Translating IP addresses into hostnames

    use Socket;
    my $ADDR_PAT = '^\d+\.\d+\.\d+\.\d+$';     # also matches 999.999.999.999

    while (<>) {
        chomp;
        die "$_: Not a valid address" unless /$ADDR_PAT/o;    # /o: compile the pattern only once
        my $name = gethostbyaddr(inet_aton($_),AF_INET);      # pack it first
        $name ||= '?';                                        # same as $name = $name || '?';
        print "$_ => $name\n";
    }
Better pattern match for an IP address:
    sub validIP {
        my ($IP) = @_;
        my @IP = split(/\./,$IP);
        my $Pattern = '^\d+\.\d+\.\d+\.\d+$';
        return 0 unless $IP =~ /$Pattern/o;     # match the argument, not $_
        return 0 if (@IP != 4);
        return 0 unless ($IP[0] >  0 && $IP[0] <  255);
        return 0 unless ($IP[1] >= 0 && $IP[1] <= 255);
        return 0 unless ($IP[2] >= 0 && $IP[2] <= 255);
        return 0 unless ($IP[3] >= 0 && $IP[3] <= 255);
        return 1;
    }
Protocols and Services:
    # scalar context: converts “udp” to 17 and “tcp” to 6
    $number = getprotobyname($protocol);
    ($name,$aliases,$number) = getprotobyname($protocol);

    $name = getprotobynumber($number);
    ($name,$aliases,$number) = getprotobynumber($number);

    $port = getservbyname($service,$protocol);
    ($name,$aliases,$port,$protocol) = getservbyname($service,$protocol);

    $name = getservbyport($port,$protocol);
    ($name,$aliases,$port,$protocol) = getservbyport($port,$protocol);

• the getserv*() functions search the /etc/services file; the getproto*() functions search /etc/protocols
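The lookups can be exercised in both directions. This sketch uses only the built-in functions; because the results depend on the system's protocol and service databases being installed, every result is checked for undef before use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Protocol name -> number, then number -> name (scalar context).
my $tcp_number = getprotobyname('tcp');
my $tcp_name   = defined $tcp_number ? getprotobynumber($tcp_number) : undef;

# Service name -> port, then port -> service name (scalar context).
my $daytime_port = getservbyname('daytime', 'tcp');
my $daytime_name = defined $daytime_port
    ? getservbyport($daytime_port, 'tcp')
    : undef;

for my $pair ([ 'tcp protocol number', $tcp_number   ],
              [ 'protocol name',       $tcp_name     ],
              [ 'daytime port',        $daytime_port ],
              [ 'service name',        $daytime_name ]) {
    my ($label, $value) = @$pair;
    print "$label: ", (defined $value ? $value : '(not found)'), "\n";
}
```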
/etc/services:
    # Copyright (c) 1993-1999 Microsoft Corp.
    #
    # This file contains port numbers for well-known services defined by IANA
    #
    # Format:
    #
    # <service name> <port number>/<protocol> [aliases...] [#<comment>]
    #

    echo       7/tcp
    echo       7/udp
    discard    9/tcp    sink null
    discard    9/udp    sink null
    systat     11/tcp   users    #Active users
    systat     11/tcp   users    #Active users
    daytime    13/tcp
    daytime    13/udp
    . . .
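The file format is simple enough to parse by hand, which shows roughly what the getserv*() functions have to do. The sketch below is illustrative only: parse_services_line() is a hypothetical helper and the sample lines are copied from the listing above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: parse one line in the format
#   <service name> <port number>/<protocol> [aliases...] [#<comment>]
# Returns a hash reference, or nothing for blank and comment lines.
sub parse_services_line {
    my ($line) = @_;
    $line =~ s/#.*//;                              # strip the comment
    my ($name, $port_proto, @aliases) = split(' ', $line);
    return unless defined $port_proto;             # blank or comment-only
    my ($port, $proto) = $port_proto =~ m{^(\d+)/(\w+)$} or return;
    return { name => $name, port => $port,
             protocol => $proto, aliases => \@aliases };
}

my @lines = (
    '# <service name> <port number>/<protocol> [aliases...] [#<comment>]',
    'discard 9/udp sink null',
    'systat 11/tcp users #Active users',
    'daytime 13/tcp',
);

for my $line (@lines) {
    my $entry = parse_services_line($line) or next;
    print "$entry->{name}: port $entry->{port}, protocol $entry->{protocol}",
          " (aliases: @{ $entry->{aliases} })\n";
}
```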
daytime_cli.pl with names:
    #!/usr/bin/perl
    # file daytime_cli2.pl
    # Figure 3.7: Daytime client, using symbolic host and service names

    use strict;
    use Socket;

    use constant DEFAULT_ADDR => '127.0.0.1';

    # shift takes the host name from the command line
    my $packed_addr = gethostbyname(shift || DEFAULT_ADDR) or die "Can't look up host: $!";
    my $protocol    = getprotobyname('tcp');
    my $port        = getservbyname('daytime','tcp')       or die "Can't look up port: $!";
    my $destination = sockaddr_in($port,$packed_addr);

    socket(SOCK,PF_INET,SOCK_STREAM,$protocol) or die "Can't make socket: $!";
    connect(SOCK,$destination)                 or die "Can't connect: $!";

    print <SOCK>;    # read and print on same line
traceroute:
• this program lets you see the route from host A to host B
• to understand traceroute you need to know about udp, ttl, icmp and the fact that routers not
[Figure: traceroute output, annotated in hop order: last NP host; our connection to the Internet; New York City; Seattle; Vancouver]
netstat -nr:
• This command shows your routing table:
[Figure: netstat -nr output, annotated]
  – anything local to the network: no G-flag so 1st hop is the starting IP address
  – anything remote to the default router: G-flag so 1st hop is the Gateway IP address
netstat -an:
• This command shows you all sockets - unix, internet
[Figure: netstat -an output, annotated: sshd; ???; UNIX stream/dgram connections]