Network Applications and Network Programming: Web and P2P

Slides are from Richard Yang from Yale. Minor modifications are made.

Upload: meghan-lambert

Post on 16-Jan-2016




Recap: FTP, HTTP

FTP: file transfer
• ASCII (human-readable format) requests and responses
• stateful server
• one data channel and one control channel

HTTP
• Extensibility: ASCII requests, header lines, entity body, and response status line
• Scalability/robustness
  • stateless server (each request should contain the full information); DNS load balancing
  • client caching; Web caches
• one data channel


Recap: WebServer Flow

TCP socket space (host addresses: 128.36.232.5, 128.36.230.2)
• state: listening; address: {*.6789, *.*}; completed connection queue; sendbuf; recvbuf
• state: listening; address: {*.25, *.*}; completed connection queue; sendbuf; recvbuf
• state: established; address: {128.36.232.5:6789, 198.69.10.10:1500}; sendbuf; recvbuf

Server flow:
1. Create ServerSocket(6789)
2. connSocket = accept()
3. read request from connSocket
4. read local file
5. write file to connSocket
6. close connSocket
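The flow above can be sketched in Java roughly as follows. This is a minimal single-threaded sketch, not the course's reference server: the class name `BlockingWebServerSketch`, the helpers `handle`, `serve`, and `respond`, and the bare-bones HTTP/1.0 replies are assumptions made for illustration. Only port 6789 and the accept/read/write/close steps come from the slide.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

class BlockingWebServerSketch {
    /** Handles one request: read the request line, read the local file, write the reply. */
    static void handle(InputStream in, OutputStream out, Path root) throws IOException {
        Path base = root.toAbsolutePath().normalize();
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.US_ASCII));
        String requestLine = reader.readLine();              // e.g. "GET /index.html HTTP/1.0"
        String[] parts = requestLine == null ? new String[0] : requestLine.split(" ");
        Path file = null;
        if (parts.length >= 2 && parts[0].equals("GET")) {
            try {
                file = base.resolve(parts[1].replaceFirst("^/+", "")).normalize();
            } catch (InvalidPathException e) {
                file = null;                                 // unusable path: treat as not found
            }
        }
        // Serve only regular files inside the document root; everything else is a 404 here.
        if (file == null || !file.startsWith(base) || !Files.isRegularFile(file)) {
            out.write("HTTP/1.0 404 Not Found\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
        } else {
            byte[] body = Files.readAllBytes(file);          // read local file
            String head = "HTTP/1.0 200 OK\r\nContent-Length: " + body.length + "\r\n\r\n";
            out.write(head.getBytes(StandardCharsets.US_ASCII));
            out.write(body);                                 // write file to connSocket
        }
        out.flush();
    }

    /** Accepts and serves `count` connections strictly one after another. */
    static void serve(ServerSocket listener, Path root, int count) throws IOException {
        for (int i = 0; i < count; i++) {
            try (Socket connSocket = listener.accept()) {    // blocks until a client connects
                handle(connSocket.getInputStream(), connSocket.getOutputStream(), root);
            }                                                // close connSocket
        }
    }

    /** Convenience wrapper: feeds a request string to handle() and returns the reply text. */
    static String respond(String request, Path root) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            handle(new ByteArrayInputStream(request.getBytes(StandardCharsets.US_ASCII)), out, root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // In-memory demo; a real run would call serve(new ServerSocket(6789), root, n).
        System.out.println(respond("GET /no-such-file HTTP/1.0\r\n\r\n", Paths.get(".")).trim());
    }
}
```

Because `serve` handles one connection at a time, a slow client stalls every other client, which is the blocking problem the following slides address.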


Recap: Writing High Performance Servers

Major issue: many socket/IO operations can cause processing to block, e.g.,
• accept: waiting for a new connection
• read on a socket: waiting for data or close
• write on a socket: waiting for buffer space
• disk I/O read/write: waiting for the disk to finish

Thus a crucial aspect of network server design is the concurrency design (non-blocking), both for high performance and to avoid denial of service

A technique to avoid blocking: threads


Multi-Threaded Web Server

Main thread:
1. Create ServerSocket(6789)
2. connSocket = accept()
3. Create thread for connSocket, then go back to accept()

Each connection thread:
1. read request from connSocket
2. read local file
3. write file to connSocket
4. close connSocket


Recap: Writing High Performance Servers

Problems of multiple threads
• Too many threads: throughput meltdown, response time explosion


Event-Driven Programming

Event-driven programming, also called asynchronous I/O
• Tell the OS not to block when accepting/reading/writing on sockets
• Java: asynchronous I/O; for an example see: http://www.cafeaulait.org/books/jnp3/examples/12/

Yields efficient and scalable concurrency

Many examples: Click router, Flash web server, TP Monitors, etc.


Web Server

(Figure: the flow from the Multi-Threaded Web Server slide: Create ServerSocket(6789); connSocket = accept(); create thread for connSocket; each thread reads the request, reads the local file, writes the file, and closes connSocket.)

If the OS will not block on sockets, what might the program structure look like?


Typical Structure of Async i/o

Typically, async I/O programs use Finite State Machines (FSMs) to monitor the progress of requests
• The state info keeps track of the execution stage of processing each request, e.g., reading request, writing reply, …

The program has a loop to check potential events at each state
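One way to picture the per-request state machine is a small transition function. The state and event names below (`READING_REQUEST`, `REQUEST_COMPLETE`, and so on) are hypothetical; the slide only says that the state tracks stages such as reading the request and writing the reply.

```java
class RequestFsmSketch {
    // Hypothetical processing stages for one request.
    enum State { READING_REQUEST, READING_FILE, WRITING_REPLY, CLOSED }

    // Hypothetical events that the event loop would detect and dispatch.
    enum Event { REQUEST_COMPLETE, FILE_LOADED, REPLY_FLUSHED, ERROR }

    /** Transition function: an event that does not apply to the current state is ignored. */
    static State next(State state, Event event) {
        if (state == State.CLOSED || event == Event.ERROR) {
            return State.CLOSED;
        }
        switch (state) {
            case READING_REQUEST:
                return event == Event.REQUEST_COMPLETE ? State.READING_FILE : state;
            case READING_FILE:
                return event == Event.FILE_LOADED ? State.WRITING_REPLY : state;
            case WRITING_REPLY:
                return event == Event.REPLY_FLUSHED ? State.CLOSED : state;
            default:
                return state;
        }
    }

    /** Feeds a sequence of events through the machine and returns the final state. */
    static State run(State start, Event... events) {
        State state = start;
        for (Event event : events) {
            state = next(state, event);
        }
        return state;
    }

    public static void main(String[] args) {
        System.out.println(run(State.READING_REQUEST,
                Event.REQUEST_COMPLETE, Event.FILE_LOADED, Event.REPLY_FLUSHED));
    }
}
```

In a real server there would be one such state per connection, and the event loop would call `next` whenever a socket or file becomes ready.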


Async I/O in Java

An important class is the class Selector, which supports the event loop

A Selector is a multiplexer of selectable channel objects
• example channels: DatagramChannel, ServerSocketChannel, SocketChannel
• use configureBlocking(false) to make a channel non-blocking

A selector may be created by invoking the open method of this class


Async I/O in Java

A selectable channel registers events (represented by a SelectionKey) with a selector using the register method

A SelectionKey object contains two operation sets
• interest set
• ready set

A SelectionKey object has an attachment which can store data
• often the attachment is a buffer

(Figure: a Selectable Channel registers with a Selector; the registration is represented by a Selection Key.)


Async I/O in Java

Call select() (or selectNow(), or select(long timeout)) to check for ready events; the keys with ready events form the selected key set

Iterate over the set to process all ready events
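A minimal sketch of such a select loop, assuming a toy echo service on the loopback interface. The names `SelectorEchoSketch`, `serveOnce`, and `roundTrip` are illustrative only; the `java.nio` calls are the standard ones. Writing is simplified to a spin loop, where a fuller server would register interest in `OP_WRITE`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;

class SelectorEchoSketch {
    /** One pass of the event loop: select, then handle every key in the selected key set. */
    static void serveOnce(Selector selector) throws IOException {
        selector.select(100);                                // wait up to 100 ms for ready events
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove();                                     // handled keys must be removed by hand
            if (key.isAcceptable()) {
                SocketChannel conn = ((ServerSocketChannel) key.channel()).accept();
                if (conn != null) {
                    conn.configureBlocking(false);
                    // The attachment holds this connection's buffer.
                    conn.register(selector, SelectionKey.OP_READ, ByteBuffer.allocate(1024));
                }
            } else if (key.isReadable()) {
                SocketChannel conn = (SocketChannel) key.channel();
                ByteBuffer buf = (ByteBuffer) key.attachment();
                if (conn.read(buf) < 0) {                    // peer closed the connection
                    conn.close();                            // closing also cancels the key
                    continue;
                }
                buf.flip();
                while (buf.hasRemaining()) {                 // simplification: spin until echoed
                    conn.write(buf);
                }
                buf.clear();
            }
        }
    }

    /** Sends a short message to an in-process echo server and returns what comes back. */
    static String roundTrip(String message) {
        byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                client.write(ByteBuffer.wrap(bytes));        // short message fits the send buffer
                client.configureBlocking(false);
                ByteBuffer reply = ByteBuffer.allocate(bytes.length);
                while (reply.hasRemaining()) {
                    serveOnce(selector);                     // server side makes progress
                    if (client.read(reply) < 0) {
                        break;
                    }
                }
                return new String(reply.array(), 0, reply.position(), StandardCharsets.UTF_8);
            } finally {
                // Close every channel still registered with the selector.
                for (SelectionKey key : new ArrayList<SelectionKey>(selector.keys())) {
                    key.channel().close();
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

Client and server share one thread here only to keep the demo self-contained; nothing in the loop blocks for longer than the 100 ms select timeout.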


Problems of Event-Driven Server

Difficult to engineer, modularize, and tune

No performance/failure isolation between Finite-State-Machines (FSMs)

FSM code must never block (but page faults, disk I/O, and garbage collection may still force a block), so multiple threads are still needed


Summary of Traditional C-S Web Servers

Is the application extensible, scalable, robust, secure?

(Figure: app. server C0, DNS, and clients 1, 2, 3, …, n.)


Content Distribution History...

“With 25 years of Internet experience, we’ve learned exactly one way to deal with the exponential growth: Caching”.

(1997, Van Jacobson)


Web Caches (Proxy)

Web caches/proxies are placed at the entrance of an ISP

Client sends all HTTP requests to the web cache
• if the object is at the web cache, the web cache immediately returns the object in an HTTP response
• else the web cache requests the object from the origin server, then returns the HTTP response to the client
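The hit/miss logic can be sketched as a map in front of an origin fetch. This is an illustrative sketch only: `WebCacheSketch` and `demo` are made-up names, the origin server is stood in for by a plain function, and real caches also handle expiry and validation, which are omitted here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class WebCacheSketch {
    private final Map<String, String> store = new HashMap<>();
    private final Function<String, String> origin;   // stand-in for a fetch from the origin server
    int hits = 0;
    int misses = 0;

    WebCacheSketch(Function<String, String> origin) {
        this.origin = origin;
    }

    /** Returns the object for the URL, contacting the origin only on a cache miss. */
    String get(String url) {
        String cached = store.get(url);
        if (cached != null) {
            hits++;                                   // object at web cache: return immediately
            return cached;
        }
        misses++;                                     // else: request object from origin server
        String body = origin.apply(url);
        store.put(url, body);
        return body;
    }

    /** Three requests for two distinct URLs; returns {hits, misses}. */
    static int[] demo() {
        WebCacheSketch cache = new WebCacheSketch(url -> "body of " + url);
        cache.get("http://www.provider.com/image.gif");
        cache.get("http://www.provider.com/image.gif");
        cache.get("http://www.provider.com/index.html");
        return new int[] { cache.hits, cache.misses };
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("hits=" + result[0] + " misses=" + result[1]);
    }
}
```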

(Figure: clients exchange HTTP requests and responses with the proxy server, which in turn exchanges HTTP requests and responses with the origin servers.)


Web Proxy/Cache

Web caches give good performance because very often
• a single client repeatedly accesses the same document
• a nearby client also accesses the same document

Cache hit ratio increases logarithmically with the number of users

(Figure: app. server C0; clients 1, 2, 3 behind one ISP cache and clients 4, 5, 6 behind another ISP cache.)


Benefits of Web Caching

Assume: cache is “close” to client (e.g., in same network)
• smaller response time: cache is “closer” to client
• decreased traffic to distant servers: the link out of the institutional/local ISP network is often the bottleneck

(Figure: an institutional network with a 10 Mbps LAN and an institutional cache, connected by a 1.5 Mbps access link to the public Internet and the origin servers.)


What went wrong with Web Caches?

Web protocols evolved extensively to accommodate caching, e.g., HTTP 1.1

However, Web caching was developed with a strong ISP perspective, leaving content providers out of the picture
• It is the ISP who places a cache and controls it
• An ISP's only interest in using Web caches is to reduce bandwidth
  • In the USA: bandwidth is relatively cheap
  • In Europe, there were many more Web caches
• However, ISPs can arbitrarily tune Web caches to deliver stale content


Content Provider Perspective

Content providers care about
• User experience latency
• Content freshness
• Accurate access statistics
• Avoiding flash crowds
• Minimizing bandwidth usage in their access link


Content Distribution Networks

Content Distribution Networks (CDNs) build an overlay network of caches to provide fast, cost-effective, and reliable content delivery, while working tightly with content providers.

Example: Akamai – original and largest commercial CDN; operates over 25,000 servers in over 1,000 networks

Akamai (AH kuh my) is Hawaiian for intelligent, clever and informally “cool”. Founded Apr 99, Boston MA by MIT students


Basics of Akamai Operation

Content provider server provides the base HTML document

Akamai caches embedded objects at a set of its cache servers (called edge servers)

Akamaization of embedded content: e.g.,
    <IMG SRC=http://www.provider.com/image.gif>
changed to
    <IMG SRC=http://a661.g.akamai.net/hash/image.gif>

Akamai customizes DNS to select serving edge servers based on
• closeness to client browser
• server load


More Akamai information

URL akamaization is becoming obsolete and is only supported for legacy reasons
• Currently most content providers prefer to use DNS CNAME techniques to get all their content served from the Akamai servers
• Content providers still need to run their origin servers

Akamai evolution:
• Files/streaming
• Secure pages and whole pages
• Dynamic page assembly at the edge (ESI)
• Distributed applications


Lab: Problems of Traditional Content Distribution

(Figure: app. server C0, DNS, and clients 1, 2, 3, …, n.)


Objectives of P2P

Share the resources (storage and bandwidth) of individual clients to improve scalability/robustness

Bypass DNS to find clients with resources!
• examples: instant messaging, Skype


P2P

But P2P is not new

Original Internet was a P2P system:
• The original ARPANET connected UCLA, Stanford Research Institute, UCSB, and Univ. of Utah
• No DNS or routing infrastructure, just connected by phone lines
• Computers also served as routers

P2P is simply an iteration of scalable distributed systems


P2P Systems

File sharing: BitTorrent, LimeWire

Streaming: PPLive, PPStream, Zattoo, …

Research systems

Collaborative computing:
• SETI@Home project
• Human genome mapping
• Intel NetBatch: 10,000 computers in 25 worldwide sites for simulations, saved about 500 million


Peer-to-Peer Computing
- 40-70% of total traffic in many networks
- upset the music industry, drawn college students, web developers, recording artists and universities into court

Source: ipoque Internet study 2008/2009


Recap: P2P Objectives

Bypass DNS to locate clients with resources!
• examples: instant messaging, Skype

Share the storage and bandwidth of individual clients to improve scalability/robustness


The Lookup Problem

(Figure: nodes N1 to N6 connected through the Internet. A publisher holds Key=“title”, Value=MP3 data…; a client issues Lookup(“title”). Which node has the data?)

• find where a particular file is stored
• pay particular attention to its equivalence to DNS


Outline

• Recap
• P2P
  • the lookup problem
  • Napster


Centralized Database: Napster

Program for sharing music over the Internet

History:
• 5/99: Shawn Fanning (freshman, Northeastern U.) founded Napster Online music service, wrote the program in 60 hours
• 12/99: first lawsuit
• 3/00: 25% of UWisc traffic is Napster
• 2000: est. 60M users
• 2/01: US Circuit Court of Appeals: Napster knew users were violating copyright laws
• 7/01: # simultaneous online users: Napster 160K
• 9/02: bankruptcy

We are referring to the Napster before closure.

03/2000


Napster: How Does it Work?

Application-level, client-server protocol over TCP

A centralized index system that maps files (songs) to machines that are alive and have the files

Steps:
• Connect to Napster server
• Upload your list of files (push) to server
• Give server keywords to search the full list
• Select “best” of hosts with answers
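The central index behind these steps can be sketched as a map from file names to the hosts that currently offer them. Everything here is an illustrative assumption rather than Napster's actual implementation: the class `NapsterIndexSketch`, the method names, and the choice of case-insensitive substring matching for keywords.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class NapsterIndexSketch {
    // file name -> hosts that are alive and have the file
    private final Map<String, Set<String>> hostsByFile = new HashMap<>();

    /** A connecting client pushes its list of files to the server. */
    void publish(String host, String... files) {
        for (String file : files) {
            hostsByFile.computeIfAbsent(file, k -> new TreeSet<>()).add(host);
        }
    }

    /** A departing client is dropped from every entry, so the index lists only live hosts. */
    void disconnect(String host) {
        for (Set<String> hosts : hostsByFile.values()) {
            hosts.remove(host);
        }
        hostsByFile.values().removeIf(Set::isEmpty);
    }

    /** Returns the sorted hosts offering any file whose name contains the keyword. */
    List<String> search(String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : hostsByFile.entrySet()) {
            if (entry.getKey().toLowerCase(Locale.ROOT).contains(needle)) {
                result.addAll(entry.getValue());
            }
        }
        return new ArrayList<>(result);
    }

    /** Replays the publish and search example from the slides. */
    static List<String> demo() {
        NapsterIndexSketch index = new NapsterIndexSketch();
        index.publish("123.2.21.23", "X", "Y", "Z");
        index.publish("123.2.0.18", "A");
        index.publish("124.1.0.1", "A");
        return index.search("A");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The client would then pick the “best” host from the returned list itself, for example by pinging each candidate, and fetch the file directly from that peer.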


Napster Architecture


Napster: Publish

(Figure: the peer at 123.2.21.23 announces “I have X, Y, and Z!” and publishes to the central server, which records insert(X, 123.2.21.23), ...)


Napster: Search

(Figure: a client sends the query “Where is file A?” to the central server; the reply is search(A) --> 123.2.0.18, 124.1.0.1.)


Napster: Ping

(Figure: the client pings the candidate hosts 123.2.0.18 and 124.1.0.1.)


Napster: Fetch

(Figure: hosts 123.2.0.18 and 124.1.0.1; the client fetches the file directly from a peer.)


Napster Messages

General Packet Format

[chunksize] [chunkinfo] [data...]

CHUNKSIZE: Intel-endian 16-bit integer; size of [data...] in bytes

CHUNKINFO: (hex) Intel-endian 16-bit integer.

00 - login rejected
02 - login requested
03 - login accepted
0D - challenge? (nuprin1715)
2D - added to hotlist
2E - browse error (user isn't online!)
2F - user offline
5B - whois query
5C - whois result
5D - whois: user is offline!
69 - list all channels
6A - channel info
90 - join channel
91 - leave channel
…..
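A sketch of how this framing could be encoded and decoded, reading “Intel-endian” as little-endian. The class `NapsterPacketSketch` and its methods are hypothetical, and treating the data field as ASCII text is an assumption; the slide specifies only the two 16-bit header fields.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class NapsterPacketSketch {
    /** Builds [chunksize][chunkinfo][data...] with both header fields little-endian. */
    static byte[] encode(int chunkInfo, String data) {
        byte[] payload = data.getBytes(StandardCharsets.US_ASCII);   // assumed text payload
        if (payload.length > 0xFFFF || chunkInfo < 0 || chunkInfo > 0xFFFF) {
            throw new IllegalArgumentException("field does not fit in 16 bits");
        }
        ByteBuffer buf = ByteBuffer.allocate(4 + payload.length).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) payload.length);   // chunksize: size of data in bytes
        buf.putShort((short) chunkInfo);        // chunkinfo: message type
        buf.put(payload);
        return buf.array();
    }

    /** Reads the message type (chunkinfo) from a packet. */
    static int typeOf(byte[] packet) {
        return ByteBuffer.wrap(packet).order(ByteOrder.LITTLE_ENDIAN).getShort(2) & 0xFFFF;
    }

    /** Reads the data field, using chunksize to find its length. */
    static String dataOf(byte[] packet) {
        int size = ByteBuffer.wrap(packet).order(ByteOrder.LITTLE_ENDIAN).getShort(0) & 0xFFFF;
        return new String(packet, 4, size, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // 0x02 is "login requested" in the table above; the payload "ab" is a placeholder.
        byte[] packet = encode(0x02, "ab");
        StringBuilder hex = new StringBuilder();
        for (byte b : packet) {
            hex.append(String.format("%02X ", b));
        }
        System.out.println(hex.toString().trim());
    }
}
```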


Centralized Database: Napster

Summary of features: a hybrid design
• control: client-server (aka special DNS) for files
• data: peer to peer

Advantages
• simplicity, easy to implement sophisticated search engines on top of the index system

Disadvantages
• application specific (compared with DNS)
• lack of robustness, scalability: central search server is a single point of bottleneck/failure
• easy to sue!


Variation: BitTorrent

A global central index server is replaced by one tracker per file (called a swarm)
• reduces centralization, but needs other means to locate trackers

The bandwidth scalability management technique is more interesting
• more later


Outline

• Recap
• P2P
  • the lookup problem
  • Napster (central query server; distributed data servers)
  • Gnutella


Gnutella

On March 14th 2000, J. Frankel and T. Pepper from AOL’s Nullsoft division (also the developers of the popular Winamp mp3 player) released Gnutella

Within hours, AOL pulled the plug on it

Quickly reverse-engineered and soon many other clients became available: Bearshare, Morpheus, LimeWire, etc.


Decentralized Flooding: Gnutella

On startup, a client contacts other servents (server + client) in the network to form interconnection/peering relationships
• servent interconnection is used to forward control messages (queries, hits, etc.)

How to find a resource record: decentralized flooding
• send requests to neighbors
• neighbors recursively forward the requests


Decentralized Flooding

(Figure: overlay graph with nodes A, B, C, D, E, F, G, H, I, J, K, L, M, N, and S.)


Decentralized Flooding

(Figure: overlay graph with nodes A, B, C, D, E, F, G, H, I, J, K, L, M, N, and S; annotation: send query to neighbors.)

Each node forwards the query to its neighbors other than the one that forwarded it the query


Background: Decentralized Flooding

(Figure: overlay graph with nodes A, B, C, D, E, F, G, H, I, J, K, L, M, N, and S.)

Each node should keep track of forwarded queries to avoid loops! Two options:
• nodes keep state (which will time out: soft state)
• carry the state in the query, i.e., carry a list of visited nodes


Decentralized Flooding: Gnutella

Basic message header
• Unique ID, TTL, Hops

Message types
• Ping – probes network for other servents
• Pong – response to ping, contains IP addr, # of files, etc.
• Query – search criteria + speed requirement of servent
• QueryHit – successful response to Query, contains addr + port to transfer from, speed of servent, etc.

Ping and Query messages are flooded

QueryHit, Pong: sent along the reverse path of the previous message


Advantages and Disadvantages of Gnutella

Advantages:
• totally decentralized, highly robust

Disadvantages:
• not scalable; the entire network can be swamped with flood requests
  • especially hard on slow clients; at some point broadcast traffic on Gnutella exceeded 56 kbps
• to alleviate this problem, each request has a TTL to limit the scope
  • each query has an initial TTL, and each node forwarding it reduces it by one; if TTL reaches 0, the query is dropped (consequence?)
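The interaction between flooding, duplicate suppression, and TTL can be sketched as a hop-by-hop expansion over the overlay graph. This is a simplified model, not the Gnutella wire protocol: `FloodingSketch`, `link`, `demoGraph`, and `flood` are illustrative names, the four-node graph is made up, and all nodes are assumed to forward in lockstep rounds.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class FloodingSketch {
    /** Adds an undirected overlay link between two servents. */
    static void link(Map<String, List<String>> graph, String a, String b) {
        graph.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        graph.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    /** Hypothetical overlay: triangle A-B-C with D hanging off C. */
    static Map<String, List<String>> demoGraph() {
        Map<String, List<String>> graph = new HashMap<>();
        link(graph, "A", "B");
        link(graph, "B", "C");
        link(graph, "A", "C");
        link(graph, "C", "D");
        return graph;
    }

    /** Returns the nodes that receive a query flooded from source with the given initial TTL. */
    static Set<String> flood(Map<String, List<String>> neighbors, String source, int ttl) {
        Set<String> seen = new HashSet<>();      // queries already handled: avoids loops
        seen.add(source);
        Set<String> reached = new TreeSet<>();
        List<String> frontier = Collections.singletonList(source);
        // Each round is one hop; the TTL drops by one per hop and the query dies at 0.
        for (int remaining = ttl; remaining > 0 && !frontier.isEmpty(); remaining--) {
            List<String> next = new ArrayList<>();
            for (String node : frontier) {
                for (String nb : neighbors.getOrDefault(node, Collections.<String>emptyList())) {
                    if (seen.add(nb)) {          // duplicates of the query are ignored
                        reached.add(nb);
                        next.add(nb);
                    }
                }
            }
            frontier = next;
        }
        return reached;
    }

    public static void main(String[] args) {
        System.out.println("TTL 1: " + flood(demoGraph(), "A", 1));
        System.out.println("TTL 2: " + flood(demoGraph(), "A", 2));
    }
}
```

With TTL 1 the query from A never reaches D, which illustrates the consequence raised above: a file held only by D exists in the network but is not found.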


Flooding: FastTrack (aka Kazaa)

Modifies the Gnutella protocol into a two-level hierarchy

Supernodes
• Nodes that have better connection to the Internet
• Act as temporary indexing servers for other nodes
• Help improve the stability of the network

Standard nodes
• Connect to supernodes and report list of files

Search
• Broadcast (Gnutella-style) search across supernodes

Disadvantages
• Kept a centralized registration, prone to lawsuits


Optional Slides


Aside: Search Time?


Aside: All Peers Equal?

(Figure: peers with different access links: 56 kbps modems, 1.5 Mbps DSL lines, and a 10 Mbps LAN.)


Aside: Network Resilience

(Figure panels: Partial Topology; Random 30% die; Targeted 4% die)

from Saroiu et al., MMCN 2002


Asynchronous Network Programming

(C/C++)


A Relay TCP Client: telnet-like Program

(Figure: a TCP client relaying between the user and a TCP server; labels: fgets, writen, readn, fputs.)

http://zoo.cs.yale.edu/classes/cs433/programming/examples-c-socket/tcpclient


Method 1: Process and Thread

Process
• fork()
• waitpid()

Thread: lightweight process
• pthread_create()
• pthread_exit()


pthread

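    // MAXLINE, debug, err_quit(), and socketstream are assumed to be defined
    // elsewhere (e.g., in the course example library); ss and sockfd are
    // globals shared by both threads.
    void *copy_to(void *arg);

    int main() {
        char recvline[MAXLINE + 1];
        ss = new socketstream(sockfd);

        // Second thread: copy stdin to the socket.
        pthread_t tid;
        if (pthread_create(&tid, NULL, copy_to, NULL)) {
            err_quit("pthread_create()");
        }

        // Main thread: copy lines from the socket to stdout.
        while (ss->read_line(recvline, MAXLINE) > 0) {
            fprintf(stdout, "%s\n", recvline);
        }
        return 0;
    }

    void *copy_to(void *arg) {
        char sendline[MAXLINE];

        if (debug) cout << "Thread create()!" << endl;
        while (fgets(sendline, sizeof(sendline), stdin))
            ss->writen_socket(sendline, strlen(sendline));

        // End of input: close the sending half of the connection.
        shutdown(sockfd, SHUT_WR);
        if (debug) cout << "Thread done!" << endl;

        pthread_exit(0);
    }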


Method 2: Asynchronous I/O (Select)

select: deal with blocking system calls

    int select(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);

    FD_CLR(int fd, fd_set *set);
    FD_ZERO(fd_set *set);
    FD_ISSET(int fd, fd_set *set);
    FD_SET(int fd, fd_set *set);


Method 3: Signal and Select

signal: events such as timeout


Examples of Network Programming

Library to make life easier

Four design examples
• TCP client
• TCP server using select
• TCP server using process and thread
• Reliable UDP

Warning: It will be hard to listen to me reading through the code. Read the code.


Example 2: A Concurrent TCP Server Using Process or Thread

• Get a line, and echo it back
• Use select()
• For how to use process or thread, see later
• Check the code at: http://zoo.cs.yale.edu/classes/cs433/programming/examples-c-socket/tcpserver
• Are there potential denial of service problems with the code?


Example 3: A Concurrent HTTP TCP Server Using Process/Thread

• Use process-per-request or thread-per-request
• Check the code at: http://zoo.cs.yale.edu/classes/cs433/programming/examples-c-socket/simple_httpd