distributing over the web

Post on 11-Apr-2017

Category:

Engineering

https://github.com/naighes

distributing over the web

@NicolaBaldi

still no agreement on ReST… :-)

What a Wonderful World

cryptic…

“a resource is a conceptual mapping

to a set of entities, not the entity that

corresponds to the mapping at any

particular point in time”

very cryptic…

coupling

ReST is loosely coupled:

§ there are no «contracts» in ReST!

The only contract is represented

by the URI.

§ clients do not (MUST NOT) depend on server-side implementations.

constraints I

client-server

§ driven by SoC (separation of concerns).

• UI portability.

• scalability (server components simplification).

• loose coupling (client & server will evolve independently).

constraints II

stateless

§ visibility (just look at «request»).

§ reliability (recovering from partial failures).

§ scalability (no need to store client data).

constraints III

cache

NOTE: caching «just» improves network efficiency.

§ data within a response to a request must be implicitly or explicitly labeled as cacheable or non-cacheable.

• improves efficiency, scalability, and user-perceived performance by reducing the average latency of a series of interactions.

constraints III

“the goal of caching in HTTP/1.1 is to

eliminate the need to send requests in many

cases (expiration), and to eliminate the need

to send full responses in many other cases

(validation)”

constraints IV

uniform interface

the ReST interface is designed to be efficient for large-grain hypermedia data transfer.

§ identification of resources.

§ manipulation of resources through representations.

§ self-descriptive messages.

§ HatEoAS, HatEoAS, HatEoAS and HatEoAS again!

data

ReST components communicate by transferring a

representation of a resource.

a representation is a sequence of bytes, plus

representation metadata to describe those bytes.

§ type (not format) selected dynamically:

• based on capabilities or desires of recipients.

• based on the nature of the resource.

resources

§ any information that can be named can be a resource.

§ a resource is a conceptual mapping to a set of entities.

§ every resource must provide an identifier.

URI - URL - URN

URI (Uniform Resource Identifier)

a set of characters used to identify a name or a resource on the Internet

URL (Uniform Resource Locator)

where an identified resource is available and how to retrieve it (http://, ftp://, smb://…)

URN (Uniform Resource Name)

The URN defines something's identity

URI - URL - URN

URL: ftp://ftp.is.co.za/rfc/rfc1808.txt

URL: http://www.ietf.org/rfc/rfc2396.txt

URL: ldap://[2001:db8::7]/c=GB?objectClass?one

URL: mailto:john.doe@example.com

URL: news:comp.infosystems.www.servers.unix

URL: telnet://192.0.2.16:80/

URN: urn:oasis:names:specification:docbook:dtd:xml:4.1.2

URN: urn:isbn:0-486-27557-4
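The difference between a locator and a name shows up when the identifiers are split into their generic parts. A minimal sketch using Python's standard `urllib.parse`; the `describe` helper is an illustrative name, not something from the deck:

```python
from urllib.parse import urlsplit

def describe(uri: str) -> dict:
    """Split a URI into its generic components (scheme, authority, path, query)."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,  # empty for URNs and mailto: URIs
        "path": parts.path,
        "query": parts.query,
    }

# A locator says where the resource lives; a name only says what it is.
locator = describe("ftp://ftp.is.co.za/rfc/rfc1808.txt")
name = describe("urn:isbn:0-486-27557-4")
```

The URL carries an authority (a host to contact), while the URN has only a scheme and a namespace-specific path.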

it's a matter of implementation

HTTP != ReST

HTTP is a ReSTful protocol for exposing resources across

distributed systems.

HTTP doesn’t map 1:1 to ReST, it's an implementation of ReST.

GET

> GET /orders/1772634
< 200 Ok
< ETag: 686897696a7c876b7e
< Last-Modified: Thu, 05 Jul 2012 15:31:30 GMT

GOOD

> GET /GetOrder?id=1772634
< 200 Ok

BAD

include Last-Modified header whenever feasible!

localization

> GET /entries/1772634
> Accept-Language: it, en-gb;q=0.8, en;q=0.7
< 200 Ok
< ...
< Content-Language: en

GOOD

> GET /GetEntry?id=1772634&languageId=4
< 200 Ok

BAD

language is a matter of representation
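How a server might act on the `Accept-Language` header above can be sketched as follows. This is a simplified matcher written for illustration: it ignores the `*` wildcard and other details of real language negotiation, and the function names and the list of available languages are assumptions.

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Return (language-range, q) pairs sorted by descending preference."""
    ranges = []
    for item in header.split(","):
        tag, _, params = item.strip().partition(";")
        params = params.strip()
        q = float(params[2:]) if params.startswith("q=") else 1.0
        ranges.append((tag.strip().lower(), q))
    return sorted(ranges, key=lambda pair: pair[1], reverse=True)

def pick_language(header: str, available: list[str], default: str) -> str:
    """Pick the first available language that matches the client's preferences."""
    for tag, q in parse_accept_language(header):
        if q == 0:
            continue  # q=0 means "not acceptable"
        for candidate in available:
            lowered = candidate.lower()
            if lowered == tag or lowered.startswith(tag + "-"):
                return candidate
    return default

# Hypothetical server that only has English and French representations.
chosen = pick_language("it, en-gb;q=0.8, en;q=0.7", ["en", "fr"], default="en")
```

The URI stays the same for every language; only the representation changes.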

aliasing

> GET /weather/tomorrow
< 302 Found
< Location: /weather/2015-03-21T12%3A24%3A26Z
< Link: </weather/2015-03-21T12%3A24%3A26Z>; rel="canonical"

GOOD

> GET /GetWeatherForecastForTomorrow
< 200 Ok

BAD

POST

> POST /entries/188273/comments
< 201 Created
< Location: /entries/188273/comments/2

GOOD

> POST /AddComment?entryId=188273
< 200 Ok

BAD

[…] is used to request that the origin server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI

PUT

> PUT /entries/188274
< 204 No Content (when resource is updated)
< 201 Created (when resource is created)

GOOD

> PUT /AddEntry
< 200 Ok

BAD

PUT is (MUST be) idempotent, while POST is not.
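A small in-memory model makes the difference visible. This is only a sketch: the `Entries` class and its methods are hypothetical stand-ins for a real server, not an actual framework API.

```python
import itertools

class Entries:
    """In-memory stand-in for a collection resource such as /entries."""

    def __init__(self) -> None:
        self.items: dict[int, dict] = {}
        self._ids = itertools.count(1)

    def post(self, representation: dict) -> tuple[int, str]:
        """Create a new subordinate resource; the server picks the URI."""
        new_id = next(self._ids)
        self.items[new_id] = representation
        return 201, f"/entries/{new_id}"

    def put(self, entry_id: int, representation: dict) -> int:
        """Create or fully replace the resource at a client-chosen URI."""
        status = 204 if entry_id in self.items else 201
        self.items[entry_id] = representation
        return status

store = Entries()
first = store.put(188274, {"title": "hello"})   # creates the resource
second = store.put(188274, {"title": "hello"})  # same request, same end state
after_puts = len(store.items)
store.post({"title": "hello"})
store.post({"title": "hello"})                  # same request, one more resource
after_posts = len(store.items)
```

Repeating the PUT leaves one resource in the same state; repeating the POST keeps creating new ones.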

idempotency

do not PATCH like an idiot!

PATCH is not about sending just an updated value instead of the entire resource.

Stop doing this right now!

> PATCH /entries/188273
{"email": "mario@rossi.com"}

BAD

> PATCH /entries/188273?email=mario%40rossi.com

BAD

PATCH

The PATCH method requests that a set of changes be applied to the resource. This set contains instructions describing how the resource should be modified to produce a new version.

> PATCH /entries/188273
[description of changes]

GOOD

the entire set of changes must be applied atomically.

PATCH

you have to use a media type that defines semantics for PATCH (e.g. JSON Patch, RFC 6902).

> PATCH /entries/188273
[{"op": "replace", "path": "/email", "value": "mario@rossi.com"}]
< 200 Ok

GOOD
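To illustrate "a set of instructions, applied atomically", here is a minimal sketch that handles a small subset of RFC 6902 (`add`, `replace`, `remove` on object members only). It is not a complete JSON Patch implementation: arrays, `~` escapes and the other operations are left out, and `apply_patch` is an illustrative name.

```python
import copy

def apply_patch(document: dict, operations: list[dict]) -> dict:
    """Apply a subset of JSON Patch operations to a copy of the document.

    All changes go to a deep copy, so a failing operation leaves the
    original untouched: either every change applies or none does.
    """
    result = copy.deepcopy(document)
    for operation in operations:
        *parents, key = operation["path"].lstrip("/").split("/")
        target = result
        for name in parents:
            target = target[name]
        op = operation["op"]
        if op == "add":
            target[key] = operation["value"]
        elif op == "replace":
            if key not in target:
                raise KeyError(operation["path"])
            target[key] = operation["value"]
        elif op == "remove":
            del target[key]
        else:
            raise ValueError(f"unsupported op: {op}")
    return result

entry = {"email": "old@example.com", "name": "Mario"}
patched = apply_patch(
    entry, [{"op": "replace", "path": "/email", "value": "mario@rossi.com"}]
)
```

The server would store `patched` only if the whole list of operations succeeded.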

PATCH: side notes

• Fielding's dissertation does not define any way to

partially modify resources.

• PATCH does not transfer a complete

representation, but ReST doesn't require

representations to be complete anyway.

check if a resource exists

> HEAD /orders/1772634
< 404 Not Found

GOOD

> GET /ExistsOrder?id=1772634
< 200 Ok

BAD

save bandwidth!

long running processes

> POST /entries
< 202 Accepted
< Location: /queue/982773

> GET /queue/982773
< 200 Ok
< Link: </queue/982773>; rel="cancel"
< {"status": "pending", "eta": "00:01:23"}

> GET /queue/982773
< 303 See Other
< Location: /entries/188275
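The three exchanges above can be modelled with a tiny in-memory queue. This is a sketch under assumptions: `JobQueue`, its method names and the `complete` hook for a background worker are all hypothetical.

```python
class JobQueue:
    """In-memory stand-in for the /queue status resources."""

    def __init__(self) -> None:
        self.jobs = {}      # job id -> final Location, or None while pending
        self._next_id = 1

    def submit(self) -> tuple[int, dict]:
        """POST /entries: accept the work and point at a status resource."""
        job_id = self._next_id
        self._next_id += 1
        self.jobs[job_id] = None
        return 202, {"Location": f"/queue/{job_id}"}

    def complete(self, job_id: int, location: str) -> None:
        """Called by the (hypothetical) background worker when it is done."""
        self.jobs[job_id] = location

    def status(self, job_id: int) -> tuple[int, dict]:
        """GET /queue/{id}: 200 while pending, 303 once the entry exists."""
        if job_id not in self.jobs:
            return 404, {}
        location = self.jobs[job_id]
        if location is None:
            return 200, {"Link": f'</queue/{job_id}>; rel="cancel"'}
        return 303, {"Location": location}

queue = JobQueue()
accepted = queue.submit()
pending = queue.status(1)
queue.complete(1, "/entries/188275")
done = queue.status(1)
```

The client never blocks on the original POST; it polls the status resource and follows the redirect when the work is finished.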

optimistic concurrency

> GET /orders/123
< 200 Ok
< ETag: 686897696a7c876b7e

> PUT /orders/123
> If-Match: 686897696a7c876b7e
< 412 Precondition Failed

always, always and always rely on ETag!

did I say “always”? :-)
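A minimal sketch of the server-side check, assuming the ETag is derived from a hash of the stored representation (one common choice, not the only one). The `etag_of` and `put_if_match` helpers are illustrative names.

```python
import hashlib
import json

def etag_of(representation: dict) -> str:
    """Derive a validator from the serialized representation."""
    payload = json.dumps(representation, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(payload).hexdigest()[:16] + '"'

def put_if_match(store: dict, path: str, if_match: str, new_value: dict) -> int:
    """Replace the resource only if the caller saw the current version."""
    if path not in store:
        return 404
    if if_match != etag_of(store[path]):
        return 412  # Precondition Failed: someone else changed it first
    store[path] = new_value
    return 204

orders = {"/orders/123": {"total": 10}}
seen = etag_of(orders["/orders/123"])  # both writers GET the same version
first_writer = put_if_match(orders, "/orders/123", seen, {"total": 12})
second_writer = put_if_match(orders, "/orders/123", seen, {"total": 15})
```

The second writer gets 412 and has to re-read the resource before retrying, so no update is silently lost.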

409 Conflict

> POST /users
< 409 Conflict
< {"error": "username_already_taken"}

the request could not be completed due to a conflict

with the current state of the resource.

400 Bad Request

> POST /entries/188273/comments
< 400 Bad Request
< {"message": "problems parsing payload."}

the request could not be understood by the server

due to malformed syntax.

422 Unprocessable Entity

> POST /entries/188273/comments
< 422 Unprocessable Entity
< {"message": "validation failed.", "errors": [{"path": "/title", "code": "missing_field"}]}

the server was unable to process the contained instructions (e.g. semantically erroneous instructions).
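The split between 400 (cannot parse) and 422 (parsed, but invalid) can be sketched as a single handler. The required field names and the `not_an_object` code are assumptions made for the example; only `missing_field` and the messages come from the slides.

```python
import json

REQUIRED_FIELDS = ("title", "body")  # hypothetical schema for a comment

def validate_comment(raw_body: str) -> tuple[int, dict]:
    """Return (status, response body) for a POST .../comments request."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        # The bytes are not even valid JSON: a syntax problem.
        return 400, {"message": "problems parsing payload."}
    if not isinstance(payload, dict):
        errors = [{"path": "", "code": "not_an_object"}]
    else:
        errors = [
            {"path": f"/{field}", "code": "missing_field"}
            for field in REQUIRED_FIELDS
            if field not in payload
        ]
    if errors:
        # Well-formed JSON that breaks the resource's own rules.
        return 422, {"message": "validation failed.", "errors": errors}
    return 201, {}
```

A client can tell from the status alone whether to fix its serializer (400) or its data (422).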

custom error codes

> POST /entries/188273/comments
< 489 Entry Does Not Allow New Comments

don't do that

BAD

503 Service Unavailable

> GET /entries/188273
< 503 Service Unavailable
< Retry-After: Mon, 20 Apr 2015 23:59:59 GMT

the server is currently unable to handle the request

due to a temporary overloading or maintenance of

the server.

pagination

> GET /user/7364/orders?page=3&page_size=10
< 200 Ok
< Link: </user/7364/orders?page=2&page_size=10>; rel="prev", </user/7364/orders?page=4&page_size=10>; rel="next", </user/7364/orders?page=11&page_size=10>; rel="last"

include Link header, embrace HatEoAS!
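A sketch of how a server could build that `Link` header. The `pagination_links` helper and the total of 110 items (which yields 11 pages of 10) are assumptions chosen to reproduce the exchange above.

```python
import math
from urllib.parse import urlencode

def pagination_links(path: str, page: int, page_size: int, total_items: int) -> str:
    """Build a Link header value with prev/next/last relations."""
    last_page = max(1, math.ceil(total_items / page_size))

    def link(target_page: int, rel: str) -> str:
        query = urlencode({"page": target_page, "page_size": page_size})
        return f'<{path}?{query}>; rel="{rel}"'

    links = []
    if page > 1:
        links.append(link(page - 1, "prev"))
    if page < last_page:
        links.append(link(page + 1, "next"))
    links.append(link(last_page, "last"))
    return ", ".join(links)

header = pagination_links("/user/7364/orders", page=3, page_size=10, total_items=110)
```

Clients follow the `rel` values and never have to compute page numbers themselves.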

authorization

> GET /user/7364/orders
< 401 Unauthorized
< WWW-Authenticate: Bearer realm="example"

on 401 status code you MUST include a WWW-Authenticate header field containing a challenge applicable to the requested resource.

conditional requests

> GET /orders/123
> If-None-Match: "644b5b0155e6404a9cc4bd9d8b1ae730"
< 304 Not Modified

body will be (MUST be) empty on a 304 response

save bandwidth!

> GET /orders/123
> If-Modified-Since: Thu, 05 Jul 2012 15:31:30 GMT
< 304 Not Modified
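Both conditional requests boil down to one decision on the server. The sketch below is simplified: it compares entity tags as plain strings (no special handling of weak `W/` validators) and the `conditional_get` name is illustrative.

```python
from email.utils import parsedate_to_datetime

def conditional_get(request_headers: dict, etag: str, last_modified: str) -> int:
    """Return 304 when the client's cached copy is still current, else 200.

    If-None-Match takes precedence over If-Modified-Since when both are sent.
    """
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        return 304 if etag in candidates or "*" in candidates else 200
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        changed_at = parsedate_to_datetime(last_modified)
        cached_at = parsedate_to_datetime(if_modified_since)
        return 304 if changed_at <= cached_at else 200
    return 200

ETAG = '"644b5b0155e6404a9cc4bd9d8b1ae730"'
MODIFIED = "Thu, 05 Jul 2012 15:31:30 GMT"
status = conditional_get({"If-None-Match": ETAG}, ETAG, MODIFIED)
```

On 304 the server sends headers only, so the client reuses its cached body.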

versioning

by URL: it sucks!

§ URI design should have fewer natural constraints and URIs should be preserved over time.

§ it disrupts the concept of HatEoAS.

§ resource URIs that API users can depend on should be

permalinks.

by Accept header: application/vnd.contoso.cart-v2+json

§ the only drawback is that it could give you a few

headaches when it comes to testing / debugging.
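Reading the version out of the `Accept` header can be sketched with a regular expression. The naming scheme `vnd.<vendor>.<resource>-v<N>+<suffix>` is inferred from the single example on the slide and is an assumption, as is the default of version 1.

```python
import re

# Assumed naming scheme: application/vnd.<vendor>.<resource>-v<N>+<suffix>
MEDIA_TYPE = re.compile(
    r"^application/vnd\.(?P<vendor>[a-z0-9]+)\.(?P<resource>[a-z0-9]+)"
    r"-v(?P<version>\d+)\+(?P<suffix>[a-z]+)$"
)

def requested_version(accept: str, default: int = 1) -> int:
    """Extract the representation version from a vendor media type."""
    match = MEDIA_TYPE.match(accept.strip().lower())
    return int(match.group("version")) if match else default

v2 = requested_version("application/vnd.contoso.cart-v2+json")
fallback = requested_version("application/json")
```

The URI of the cart never changes; only the requested representation does.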

versioning

“No, HTTP doesn’t version the interface

names — there are no numbers on the methods

or URIs. That doesn’t mean other aspects of the

communication aren’t versioned. We do want

change, since otherwise we would not be able

to improve over time, and part of change is

being able to declare in what language the data

is spoken. We just don’t want breaking change.

Hence, versioning is used in ways that are

informative rather than contractual”

false myth

ReST is suitable for CRUD…

BULLSHIT

ReST != ODBC => Transaction boundaries are

defined by resources themselves.

ReST is far away from CRUD.

HatEoAS

Hypermedia as the Engine of Application State

§ API clients should not construct URLs on their

own.

§ decoupling: it makes future upgrades of the API easier for developers.

> GET /users/mariorossi
< Link: <https://api.contoso.com/users/mariorossi>; rel="related"...
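On the client side, "do not construct URLs" means looking up a link relation and following whatever URL the server supplied. A minimal sketch with a deliberately simple parser: it only understands the `<url>; rel="name"` form used in these slides, and `parse_links` is an illustrative name.

```python
import re

LINK = re.compile(r'<(?P<url>[^>]*)>\s*;\s*rel="(?P<rel>[^"]*)"')

def parse_links(header: str) -> dict[str, str]:
    """Map each link relation in a Link header to its target URL."""
    return {m.group("rel"): m.group("url") for m in LINK.finditer(header)}

links = parse_links(
    '</user/7364/orders?page=2&page_size=10>; rel="prev", '
    '</user/7364/orders?page=4&page_size=10>; rel="next"'
)
next_url = links["next"]  # followed as-is, never assembled by the client
```

If the server later changes its URL layout, a client written this way keeps working.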

cookies

“An example of where an inappropriate

extension has been made to the

protocol to support features that

contradict the desired properties of the

generic interface is the introduction of

site-wide state information in the form

of HTTP cookies”

cookie-based applications on the web will never be reliable

what's «wrong» with 1.X

§ clients achieve concurrency by using multiple

connections.

§ HTTP headers cause a lot of network traffic on req/rsp.

«mosaic»

bandwidth as the primary metric

for most web-browsing use

cases, an internet connection

over several Mbps offers but a

tiny improvement in performance

TCP Slow-Start

• client sends a SYN packet which advertises its maximum

buffer size

• sender replies by sending several packets back

• then, for each round trip in which those packets are ACKed by the client, it doubles the number of packets that can be "on the wire" (cwnd) while unacknowledged -> that allows exponential growth

• …

avoid sending more data than the network is capable of transmitting

HTTP traffic tends to make use of short and bursty connections - in these cases we often never even reach the full capacity of our pipes

why do we need header compression

§ page with 80 assets, each request has 1400 bytes of

headers.

• 7-8 roundtrips to get the headers out “on the wire”…

without counting response time.
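The effect of slow start on those headers can be estimated with an idealized model: no packet loss, one full window sent per round trip, and the window doubling each time. The initial window size is an assumption, and real stacks differ, so the output is a rough illustration rather than a reproduction of the figure above.

```python
def round_trips(segments: int, initial_cwnd: int) -> int:
    """Round trips needed to send `segments` packets under idealized slow start."""
    cwnd, sent, trips = initial_cwnd, 0, 0
    while sent < segments:
        sent += cwnd   # send one full congestion window
        cwnd *= 2      # an ACKed window doubles the next one
        trips += 1
    return trips

# 80 requests with about 1400 bytes of headers each: roughly one packet per request.
header_packets = 80
slow = round_trips(header_packets, initial_cwnd=1)
faster = round_trips(header_packets, initial_cwnd=10)
```

Shrinking the headers reduces the number of packets, which is the only input to this model that the application controls.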

pipelining

[diagram: requests and responses between client and server]

does not scale very well… :-(

pipelining

[diagram: pipelined requests and responses between client and server]

much better, but it suffers from "head-of-line" blocking

TCP Slow-Start

focus on cutting down the round-trip time between the client and server, not necessarily

just investing in bigger pipes

• re-use your TCP connections

• support HTTP keep-alive and pipelining

• think about end-2-end latency

specs

HTTP/2 - RFC7540

HPACK - RFC7541

terminology

frame anatomy

+-----------------------------------------------+
|                 Length (24)                   |
+---------------+---------------+---------------+
|   Type (8)    |   Flags (8)   |
+-+-------------+---------------+-------------------------------+
|R|                 Stream Identifier (31)                      |
+=+=============================================================+
|                   Frame Payload (0...)                      ...
+---------------------------------------------------------------+
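The fixed 9-byte header in the diagram can be decoded with the standard `struct` module. A minimal sketch; `parse_frame_header` is an illustrative name and the sample bytes are made up for the example.

```python
import struct

def parse_frame_header(header: bytes) -> dict:
    """Decode the fixed 9-byte HTTP/2 frame header."""
    if len(header) != 9:
        raise ValueError("an HTTP/2 frame header is exactly 9 bytes")
    # 24-bit length is read as 8 + 16 bits, then type, flags, R + stream id.
    length_hi, length_lo, frame_type, flags, stream = struct.unpack("!BHBBI", header)
    return {
        "length": (length_hi << 16) | length_lo,  # payload length, 24 bits
        "type": frame_type,                        # 8 bits
        "flags": flags,                            # 8 bits
        "stream_id": stream & 0x7FFFFFFF,          # drop the reserved R bit
    }

# HEADERS frame (type 0x1) with END_STREAM | END_HEADERS (0x1 | 0x4),
# on stream 1, announcing a 13-byte payload.
sample = bytes([0x00, 0x00, 0x0D, 0x01, 0x05, 0x00, 0x00, 0x00, 0x01])
frame = parse_frame_header(sample)
```

Every frame type shares this header; only the payload layout that follows differs.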

frame types

- HEADERS

- DATA

- SETTINGS

- WINDOW_UPDATE

- PUSH_PROMISE

- PRIORITY

- RST_STREAM

- GOAWAY

- PING

- CONTINUATION

(HEADERS and DATA: the basis of an HTTP request)

DATA frame

+---------------+
|Pad Length? (8)|
+---------------+-----------------------------------------------+
|                            Data (*)                         ...
+---------------------------------------------------------------+
|                           Padding (*)                       ...
+---------------------------------------------------------------+

HEADERS frame

+---------------+
|Pad Length? (8)|
+-+-------------+-----------------------------------------------+
|E|                 Stream Dependency? (31)                     |
+-+-------------+-----------------------------------------------+
|  Weight? (8)  |
+-+-------------+-----------------------------------------------+
|                   Header Block Fragment (*)                 ...
+---------------------------------------------------------------+
|                           Padding (*)                       ...
+---------------------------------------------------------------+

GET request example

GET /resource HTTP/1.1           HEADERS
Host: example.org          ==>     + END_STREAM
Accept: image/jpeg                 + END_HEADERS
                                     :method = GET
                                     :scheme = https
                                     :path = /resource
                                     host = example.org
                                     accept = image/jpeg

GET response example

HTTP/1.1 304 Not Modified        HEADERS
ETag: "xyzzy"              ==>     + END_STREAM
Expires: Thu, 23 Jan ...           + END_HEADERS
                                     :status = 304
                                     etag = "xyzzy"
                                     expires = Thu, 23 Jan ...

POST request example

POST /resource HTTP/1.1          HEADERS
Host: example.org          ==>     - END_STREAM
Content-Type: image/jpeg           - END_HEADERS
Content-Length: 123                  :method = POST
                                     :path = /resource
{binary data}                        :scheme = https

                                 CONTINUATION
                                   + END_HEADERS
                                     content-type = image/jpeg
                                     host = example.org
                                     content-length = 123

                                 DATA
                                   + END_STREAM
                                 {binary data}

server push

in addition to the response to the original request, the server can push additional resources to the client without the client

having to request each one explicitly

server push

• server receives HEADERS frame asking for index.html in stream 3,

and it can forecast the need for styles.css and script.js

• server sends a PUSH_PROMISE for styles.css and a PUSH_PROMISE

for script.js in stream 3

• server sends a HEADERS frame in stream 3 for responding to the

request for index.html

• server sends DATA frame(s) with the content of index.html in stream

3

• server sends HEADERS frame for the response to styles.css in

stream 4 and then HEADERS for the response to script.js in stream 6

• server sends DATA frames for the contents of styles.css in stream 4

and DATA frames for the contents of script.js in stream 6

PUSH_PROMISE frame

+---------------+
|Pad Length? (8)|
+-+-------------+-----------------------------------------------+
|R|                  Promised Stream ID (31)                    |
+-+-----------------------------+-------------------------------+
|                   Header Block Fragment (*)                 ...
+---------------------------------------------------------------+
|                           Padding (*)                       ...
+---------------------------------------------------------------+

what's next

• hosting and distributing petabyte datasets

• computing on large data across organizations

• high-volume high-definition on-demand or real-time

media streams

• versioning and linking of massive datasets

• preventing accidental disappearance of

important files

• more

do not power it down!

HTTP encourages hypercentralization

centrally managed web servers inevitably shut

down

HTTP is inefficient

2,576,067,779 views × 117 Megabytes each = 301.4 Petabytes

assuming 1 cent per gigabyte, that means about 3,000,000…
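The arithmetic behind those figures, worked through with decimal units (1 GB = 1,000 MB, 1 PB = 10^9 MB). The variable names are illustrative; the inputs are the numbers from the slide.

```python
views = 2_576_067_779
size_megabytes = 117

total_megabytes = views * size_megabytes
total_petabytes = total_megabytes / 1_000_000_000  # 1 PB = 10^9 MB
total_gigabytes = total_megabytes / 1_000          # 1 GB = 10^3 MB

price_per_gigabyte = 0.01                          # 1 cent per gigabyte
cost = total_gigabytes * price_per_gigabyte
```

Every one of those views was served from a central origin, which is the inefficiency the slide points at.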

overdependence on the Internet backbone

IPFS

instead of looking for a centrally-controlled location and asking it what it thinks /img/neocitieslogo.svg is, what if we instead asked a distributed network of millions of computers not for the name of a file, but for the content that is supposed to be in the file?

This is precisely what IPFS does.
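The idea of asking for content rather than for a location can be sketched with a toy content-addressed store. This only illustrates the principle: it uses a bare SHA-256 hex digest as the address, whereas IPFS has its own identifier format, chunking and peer-to-peer lookup, none of which is modelled here. `ContentStore` is a hypothetical name.

```python
import hashlib

class ContentStore:
    """Toy content-addressed store: the key is derived from the bytes themselves."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def put(self, content: bytes) -> str:
        """Store the content and return its address (hash of the bytes)."""
        address = hashlib.sha256(content).hexdigest()
        self._blocks[address] = content
        return address

    def get(self, address: str) -> bytes:
        """Fetch by address and verify the bytes match what was asked for."""
        content = self._blocks[address]
        if hashlib.sha256(content).hexdigest() != address:
            raise ValueError("content does not match its address")
        return content

store = ContentStore()
logo = b"<svg>...</svg>"   # placeholder bytes standing in for a real file
address = store.put(logo)
```

Because the address is derived from the content, any node holding the bytes can serve them and the receiver can check the result on its own.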

https://ipfs.io/

Q&A time
