Slides on cross-domain call and Remote Procedure Call (RPC)

Upload: joy-joyce-edman

Post on 14-Dec-2015


Page 1: Slides on cross-domain call and Remote Procedure Call (RPC)

Slides on cross-domain call and Remote Procedure Call (RPC)

Page 2: Slides on cross-domain call and Remote Procedure Call (RPC)

This classic paper is a good example of a microbenchmarking study. It also explains the RPC abstraction and serves as a case study of the nuts-and-bolts of I/O, and related performance issues. Or is it “just hacking”?

Page 3: Slides on cross-domain call and Remote Procedure Call (RPC)

Request/reply messaging

[Diagram: the client sends a request to the server; the server computes and sends back a reply.]

Page 4: Slides on cross-domain call and Remote Procedure Call (RPC)

Messaging: examples and variations

• Details vary!

– Supercomputing: MPI over fast interconnect

– High-level messages (e.g., HTTP) over sockets and network communication

– Microkernel / Mach / MacOS: high-speed local cross-domain messaging ports. (Also Windows/NT)

– Android: binder, and per-thread message queues

• Common abstraction: “Remote Procedure Call”

– RPC for clients/servers talking over a network.

– For local processes it is often called cross-domain call or “Local Procedure Call” (LPC, in Windows).

Page 5: Slides on cross-domain call and Remote Procedure Call (RPC)

Network File System (NFS)

[ucla.edu]

Remote Procedure Call (RPC)

External Data Representation (XDR)

Page 6: Slides on cross-domain call and Remote Procedure Call (RPC)

Cross-domain call: the basics

A B

Request: block A, wakeup B. Reply: block B, wakeup A.

A: syscall to post a message to B (e.g., a message queue). Wait for reply.

B: syscalls to receive an incoming message. Wait for request.

Page 7: Slides on cross-domain call and Remote Procedure Call (RPC)

Cross-domain call: the basics

A B

A: syscall to post a message to B (e.g., a message queue). Wait for reply.

B: syscalls to receive an incoming message. Wait for request.

Copy data from A to B, or use a shared memory region.

Transfer control through kernel: block A, wakeup B.

Note: could use a socket, or fast IPC for processes on same host.
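The request/reply handoff above can be sketched in a few lines. This is only an illustration under stated assumptions: two threads stand in for domains A and B (real cross-domain calls involve separate processes), the channel is a `socket.socketpair()`, and the names `domain_b` and `cross_domain_call` are made up for this sketch.

```python
import socket
import threading

def domain_b(sock):
    """B: wait for a request, compute, and send the reply."""
    request = sock.recv(1024)        # B blocks here until A posts a message
    sock.sendall(request.upper())    # the "compute" step; the reply wakes A
    sock.close()

def cross_domain_call(payload: bytes) -> bytes:
    """A: post a message to B, then block until the reply arrives."""
    a_end, b_end = socket.socketpair()
    worker = threading.Thread(target=domain_b, args=(b_end,))
    worker.start()
    a_end.sendall(payload)           # request: kernel copies the data, wakes B
    reply = a_end.recv(1024)         # A blocks here until B replies
    worker.join()
    a_end.close()
    return reply

if __name__ == "__main__":
    print(cross_domain_call(b"ping"))
```

Each `sendall`/`recv` pair is a copy through the kernel; a shared memory region would avoid the copy but still needs the kernel to block and wake the two sides.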

Page 8: Slides on cross-domain call and Remote Procedure Call (RPC)

“Marshalling” (“serializing”)

A B

What if the data is a complex linked structure? Must “pack” it as a sequence of bytes into a message, and reconstitute it on the other side.
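As a minimal sketch of packing a linked structure into bytes and reconstituting it, consider a singly linked list of integers. The wire format here (a 4-byte big-endian count followed by 4-byte signed values) is an assumption chosen for illustration, not the format of any particular RPC system, and `Node`, `marshal`, and `unmarshal` are hypothetical names.

```python
import struct
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    value: int
    next: Optional["Node"] = None

def marshal(head: Optional[Node]) -> bytes:
    """Flatten the list: pointers are dropped, only the values travel."""
    values = []
    node = head
    while node is not None:
        values.append(node.value)
        node = node.next
    # 4-byte big-endian count, then each value as a 4-byte signed int
    return struct.pack(f">I{len(values)}i", len(values), *values)

def unmarshal(data: bytes) -> Optional[Node]:
    """Rebuild an equivalent list with fresh nodes on the receiving side."""
    (count,) = struct.unpack_from(">I", data, 0)
    values = struct.unpack_from(f">{count}i", data, 4)
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head
```

The receiver ends up with a copy that has the same shape and values but different addresses, which is why pointers themselves are never sent.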

Page 9: Slides on cross-domain call and Remote Procedure Call (RPC)

Concept: RPC

Remote Procedure Call (RPC) is request/response interaction through a published API, using IPC messaging to cross an inter-process boundary.

RPC is used in many standard Internet services. It is also the basis for component frameworks like DCOM, CORBA, and Android. Software is packaged into named “objects” or components. Components may publish interfaces and/or invoke published interfaces of other components. Components may execute in different processes and/or on different nodes.

API stubs: generated from an Interface Description Language (IDL)

Establishing an RPC connection to a named remote interface is often called binding.

Page 10: Slides on cross-domain call and Remote Procedure Call (RPC)

The classic picture

Implementing RPC

Birrell/Nelson 1984

Page 11: Slides on cross-domain call and Remote Procedure Call (RPC)

RPC Execution

• In general, RPC enables request/response exchanges (e.g., by messaging over a network) that “look like” a local procedure call.

• In Android, RPC allows flexible interaction among apps running in different processes, across the kernel boundary.

• How is this different from a local procedure call?

• How is it different from a system call?

Page 12: Slides on cross-domain call and Remote Procedure Call (RPC)

RPC: Language integration

Page 13: Slides on cross-domain call and Remote Procedure Call (RPC)

RPC: Language integration

Stubs link with the client/server code to “hide” the boundary crossing.

– They “marshal” args/results

– i.e., translate to/from some standard network stream format

– Also known as linearize, serialize

– …or “flatten”

– Propagate PL-level exceptions

– Stubs are auto-generated from an Interface Description Language (IDL) file by a stub compiler tool at software build time, and linked in.

– Client and server must agree on the protocol signatures in the IDL file.
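A rough sketch of what a hand-written stub pair does, including exception propagation. Everything here is an assumption for illustration: JSON stands in for the standard stream format, `transport` is a direct function call standing in for the network, and `RemoteError`, `divide_stub`, and `server_stub` are hypothetical names. Real stubs are generated from the IDL file rather than written by hand.

```python
import json

class RemoteError(Exception):
    """Raised by the client stub when the server side reports an exception."""

# --- server side -----------------------------------------------------------
def divide(a, b):
    return a / b

PROCEDURES = {"divide": divide}

def server_stub(request_bytes: bytes) -> bytes:
    """Unmarshal the request, call the real procedure, marshal the reply."""
    call = json.loads(request_bytes)
    try:
        result = PROCEDURES[call["method"]](*call["args"])
        reply = {"ok": True, "result": result}
    except Exception as exc:
        # Carry the exception back in the reply instead of crashing the server.
        reply = {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
    return json.dumps(reply).encode()

# --- client side -----------------------------------------------------------
def transport(request_bytes: bytes) -> bytes:
    """Stand-in for the network: hands the bytes straight to the server stub."""
    return server_stub(request_bytes)

def divide_stub(a, b):
    """Looks like a local call, but marshals args and unmarshals the result."""
    request = json.dumps({"method": "divide", "args": [a, b]}).encode()
    reply = json.loads(transport(request))
    if not reply["ok"]:
        raise RemoteError(reply["error"])
    return reply["result"]
```

The caller of `divide_stub` never sees bytes or messages; a failure on the server surfaces as an ordinary language-level exception on the client.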

Page 14: Slides on cross-domain call and Remote Procedure Call (RPC)

Marshalling: a metaphor

Android Architecture and Binder

Dhinakaran Pandiyan, Saketh Paranjape

Page 15: Slides on cross-domain call and Remote Procedure Call (RPC)

Stubs

• RPC stubs are procedures linked into the client and server.

– RPC stubs are similar to system call stubs, but they do more than just trap to the kernel.

– The RPC stubs construct/deconstruct a message transmitted through a messaging system.

– Binder is an example of such a messaging system, implemented as a Linux kernel plug-in module (a driver) and some user-space libraries.

• The stubs are generated by a tool that takes a description of the application’s RPC API written in an Interface Description Language.

– Looks like any interface definition…

– List of method names and argument/result types and signatures.

– Stub code marshals arguments into request message, marshals results into a reply message.
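To illustrate the idea of generating stubs from an interface description, here is a toy generator. It is a sketch under loud assumptions: the "IDL" is just a Python dict (`INTERFACE`), stubs are built at run time rather than by a stub compiler at build time, and `make_client_stub`, `make_server_dispatcher`, and `Calculator` are hypothetical names.

```python
import json

# Hypothetical interface description: method name -> parameter names.
INTERFACE = {"add": ["x", "y"], "echo": ["text"]}

def make_client_stub(interface, send):
    """Build an object with one generated method per interface entry."""
    class Stub:
        pass

    def make_method(name, params):
        def method(self, *args):
            if len(args) != len(params):
                raise TypeError(f"{name} expects {len(params)} argument(s)")
            request = json.dumps({"method": name, "args": list(args)})
            return json.loads(send(request))["result"]
        method.__name__ = name
        return method

    for name, params in interface.items():
        setattr(Stub, name, make_method(name, params))
    return Stub()

def make_server_dispatcher(interface, impl):
    """Build the server-side half: unmarshal, look up, invoke, marshal."""
    def dispatch(request: str) -> str:
        call = json.loads(request)
        if call["method"] not in interface:
            raise ValueError(f"unknown method {call['method']}")
        result = getattr(impl, call["method"])(*call["args"])
        return json.dumps({"result": result})
    return dispatch

class Calculator:
    """The application code the server wants to export."""
    def add(self, x, y):
        return x + y
    def echo(self, text):
        return text
```

Usage: `stub = make_client_stub(INTERFACE, make_server_dispatcher(INTERFACE, Calculator()))` gives an object whose `stub.add(2, 3)` call goes through the marshalling path on both sides. Both halves are derived from the same description, which is how client and server stay in agreement on signatures.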

Page 16: Slides on cross-domain call and Remote Procedure Call (RPC)

Stubs and IDL

This picture illustrates the stub generation and build process for an RPC system based on the C language (e.g., ONC or Sun RPC, used in NFS).

Page 17: Slides on cross-domain call and Remote Procedure Call (RPC)

Another picture of RPC

Implementing RPC

Birrell/Nelson 1984

Page 18: Slides on cross-domain call and Remote Procedure Call (RPC)

Threads and RPC

[OpenGroup, late 1980s]

Q: How do we manage these “call threads”?

A: Create them as needed, and keep idle threads in a thread pool.

When an RPC call arrives, wake up an idle thread from the pool to handle it.

On the client, the client thread blocks until the server thread returns a response.
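A small sketch of the thread-pool idea using Python's standard `concurrent.futures.ThreadPoolExecutor`, which starts worker threads as needed up to a limit and reuses idle ones. The handler `handle_call` and the driver `serve` are hypothetical, and the "client" and "server" share one process here purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def handle_call(x):
    """Hypothetical server procedure run by a pool thread."""
    return x * x

def serve(requests, max_workers=4):
    """Dispatch each incoming call to a pool thread and collect the replies."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(handle_call, r) for r in requests]
        # Like a blocked client thread: result() waits for the server thread.
        return [f.result() for f in futures]

if __name__ == "__main__":
    print(serve([1, 2, 3]))
```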

Page 19: Slides on cross-domain call and Remote Procedure Call (RPC)

Thread pool: idealized

[Diagram: an incoming request (event) queue dispatches to a pool of idle workers; each worker runs a worker loop that invokes a handler.]

Magic elastic worker pool: resize the worker pool to match incoming request load, creating and destroying workers as needed.

Idle workers wait in the pool for the next request dispatch. (Workers are threads.)

Worker loop: handle one event, blocking as necessary. When the handler is complete, return to the worker pool.

Page 20: Slides on cross-domain call and Remote Procedure Call (RPC)

Event/request queue

[Diagram: an incoming event queue dispatches to threads waiting on a CV; each thread runs a worker loop that invokes a handler.]

Worker loop: handle one event, blocking as necessary. When the handler is complete, return to the worker pool.

We can synchronize an event queue with a monitor: a mutex/CV pair. Protect the event queue data structure itself with the mutex.

Workers wait on the CV for the next event if the event queue is empty. Signal the CV when a new event arrives. This is a producer/consumer problem.
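The monitor-protected event queue can be sketched with `threading.Condition`, which bundles a mutex with a condition variable. The class `EventQueue`, the `STOP` sentinel, and the doubling "handler" are assumptions made up for this example.

```python
import threading
from collections import deque

class EventQueue:
    """Event queue synchronized with a monitor (one mutex/CV pair)."""
    def __init__(self):
        self._events = deque()
        self._cv = threading.Condition()   # holds the mutex and the CV

    def put(self, event):
        with self._cv:                     # mutex protects the deque
            self._events.append(event)
            self._cv.notify()              # signal: a new event arrived

    def get(self):
        with self._cv:
            while not self._events:        # re-check the condition after waking
                self._cv.wait()            # releases the mutex while sleeping
            return self._events.popleft()

STOP = object()                            # sentinel telling a worker to exit

def worker(queue, results, results_lock):
    """Worker loop: take one event, handle it, go back for the next."""
    while True:
        event = queue.get()
        if event is STOP:
            return
        with results_lock:
            results.append(event * 2)      # stand-in for a real handler

def run(events, n_workers=3):
    queue, results, lock = EventQueue(), [], threading.Lock()
    workers = [threading.Thread(target=worker, args=(queue, results, lock))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for e in events:
        queue.put(e)                       # producer side
    for _ in workers:
        queue.put(STOP)
    for w in workers:
        w.join()
    return sorted(results)
```

The `while` loop around `wait()` matters: a worker that wakes up may find that another worker already took the event, so it must check again before dequeuing.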

Page 21: Slides on cross-domain call and Remote Procedure Call (RPC)

Some details

• How is incoming data delivered to the correct process?

• On the return, how does the Receiver know which thread to wake up?

• How does the wakeup happen?

• What if a request/reply is dropped in the net?

• What if a request/reply is duplicated?

• How does the client find the server? (binding)

• What if the server fails?

• How to go faster if client/server are on the same host? (“LRPC” or “LPC”)
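The slides pose these as open questions. As one common answer to the dropped/duplicated cases (an assumption of this sketch, not something the slides state), the client can retransmit with the same request ID and the server can cache replies so a duplicate is answered without re-executing the call. `Server`, `call_with_retry`, and the simulated reply loss are all hypothetical.

```python
class Server:
    def __init__(self):
        self.executions = 0
        self._replies = {}                 # request id -> cached reply

    def handle(self, request_id, amount):
        if request_id in self._replies:    # duplicate: resend, do not re-execute
            return self._replies[request_id]
        self.executions += 1
        reply = amount + 1                 # stand-in for the real procedure
        self._replies[request_id] = reply
        return reply

def call_with_retry(server, request_id, amount, drop_first_n_replies=0, max_tries=5):
    """Client side: retransmit the same request ID until a reply gets through."""
    for attempt in range(max_tries):
        reply = server.handle(request_id, amount)
        if attempt < drop_first_n_replies:
            continue                       # simulate a lost reply and a timeout
        return reply
    raise TimeoutError("no reply after retries")
```

With two replies dropped, the call still returns once and `server.executions` stays at 1, so the procedure ran at most once despite three transmissions.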

Page 22: Slides on cross-domain call and Remote Procedure Call (RPC)

Firefly vs. Web/HTTP etc.

• Firefly does not use TCP/IP.

• Instead, it has a custom packet protocol. Tradeoffs?

• But some of the basics of network communication are similar/identical.

• How is (say) HTTP different from RPC?

Page 23: Slides on cross-domain call and Remote Procedure Call (RPC)

Networked services: big picture

[Diagram: a client host (client applications, kernel network software, NIC device) connects through the Internet “cloud” to server hosts with server applications.]

Data is sent on the network as messages called packets.

Page 24: Slides on cross-domain call and Remote Procedure Call (RPC)

A simple, familiar example

“GET /images/fish.gif HTTP/1.1”

client (initiator):

sd = socket(…);
connect(sd, name);
write(sd, request…);
read(sd, reply…);
close(sd);

server:

s = socket(…);
bind(s, name);
sd = accept(s);
read(sd, request…);
write(sd, reply…);
close(sd);

The request flows from client to server; the reply flows from server to client.
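For comparison, here is a runnable sketch of the same exchange using Python's `socket` module on the loopback interface. The reply format and the helper names `serve_once` and `request_reply` are invented for this example; the sketch also calls `listen()`, which the slide's abbreviated call sequence leaves out.

```python
import socket
import threading

def serve_once(listener):
    """Server: accept one connection, read the request, write the reply."""
    conn, _addr = listener.accept()
    with conn:                              # closing the socket ends the reply
        request = conn.recv(1024)           # short request: assume one read
        conn.sendall(b"reply to: " + request)

def request_reply(request: bytes) -> bytes:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))         # port 0: kernel picks a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()

    # Client (initiator): socket, connect, write, read, close.
    sd = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sd.connect(("127.0.0.1", port))
    sd.sendall(request)
    chunks = []
    while True:                             # read until the server closes
        chunk = sd.recv(1024)
        if not chunk:
            break
        chunks.append(chunk)
    sd.close()

    server.join()
    listener.close()
    return b"".join(chunks)

if __name__ == "__main__":
    print(request_reply(b"GET /images/fish.gif HTTP/1.1"))
```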

Page 25: Slides on cross-domain call and Remote Procedure Call (RPC)

End-to-end data transfer

[Diagram: end-to-end data path from sender to receiver, with buffer queues (mbufs, skbufs) and packet queues on each side.]

Sender:

– move data from application to system buffer

– TCP/IP protocol: compute checksum

– network driver: transmit packet to network interface (DMA + interrupt)

Receiver:

– network driver: deposit packet in host memory (DMA + interrupt)

– TCP/IP protocol: compare checksum

– move data from system buffer to application

Page 26: Slides on cross-domain call and Remote Procedure Call (RPC)

Ports and packet demultiplexing

Data is sent on the network in messages called packets addressed to a destination node and port. Kernel network stack demultiplexes incoming network traffic: choose process/socket to receive it based on destination port.

[Diagram: incoming network packets arrive at the network adapter hardware, aka network interface controller (“NIC”), and are delivered to apps with open sockets.]
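Port-based demultiplexing can be pictured as a table lookup. The sketch below is a toy model, not kernel code: `NetworkStack`, its `bind` and `deliver` methods, and the dict-shaped packets are all assumptions for illustration.

```python
from collections import deque

class NetworkStack:
    """Toy model of the kernel's port table."""
    def __init__(self):
        self._sockets = {}                  # destination port -> receive queue

    def bind(self, port):
        """An app opens a socket on a port and gets its receive queue."""
        if port in self._sockets:
            raise OSError(f"port {port} already in use")
        queue = deque()
        self._sockets[port] = queue
        return queue

    def deliver(self, packet):
        """Choose the receiving socket based on the destination port."""
        queue = self._sockets.get(packet["dst_port"])
        if queue is None:
            return False                    # no socket bound: drop the packet
        queue.append(packet["payload"])
        return True
```

For example, after `web = stack.bind(80)`, a packet `{"dst_port": 80, "payload": b"GET"}` lands in `web`, while a packet for an unbound port is dropped.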

Page 27: Slides on cross-domain call and Remote Procedure Call (RPC)

Wakeup from interrupt handler

[Diagram: thread state transitions involving the ready queue and the sleep queue; labels: interrupt, trap or fault, return to user mode, sleep, wakeup, switch.]

Example 1: NIC interrupt wakes thread to receive incoming packets.

Example 2: disk interrupt wakes thread when disk I/O completes.

Example 3: clock interrupt wakes thread after N ms have elapsed.

Note: it isn’t actually the interrupt itself that wakes the thread, but the interrupt handler (software). The awakened thread must have registered for the wakeup before sleeping (e.g., by placing its TCB on some sleep queue for the event).

Page 28: Slides on cross-domain call and Remote Procedure Call (RPC)

Process, kernel, and syscalls

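[Diagram: in process user space, a syscall stub (read() {…}) traps into the kernel. The syscall dispatch table selects the kernel routine (read() {…}, write() {…}), which reaches I/O objects through the I/O descriptor table, moves data to and from user buffers with copyout/copyin, and returns to user mode.]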

Page 29: Slides on cross-domain call and Remote Procedure Call (RPC)

Firefly: shared buffers

Performance of Firefly RPC

Michaels Schroeder and Burrows

Page 30: Slides on cross-domain call and Remote Procedure Call (RPC)

Binding

Implementing RPC

Birrell/Nelson 1984

Page 31: Slides on cross-domain call and Remote Procedure Call (RPC)

Optimize for the common case

Performance of Firefly RPC

Michaels Schroeder and Burrows

The slower path through the operating-system address space is used when the interrupt routine cannot find the appropriate RPC thread in the call table, when it encounters a lock conflict in the call table, or when it handles a non-RPC packet.

Several of the structural features used to improve RPC performance collapse layers of abstraction. Programming a fast RPC is not for the squeamish.

Page 32: Slides on cross-domain call and Remote Procedure Call (RPC)

Latency and throughput

Performance of Firefly RPC

Michaels Schroeder and Burrows

Page 33: Slides on cross-domain call and Remote Procedure Call (RPC)

Marshalling overhead

Performance of Firefly RPC

Michaels Schroeder and Burrows

Page 34: Slides on cross-domain call and Remote Procedure Call (RPC)

Steps and overhead

Performance of Firefly RPC

Michaels Schroeder and Burrows

Page 35: Slides on cross-domain call and Remote Procedure Call (RPC)

Performance of Firefly RPC

Michaels Schroeder and Burrows

Page 36: Slides on cross-domain call and Remote Procedure Call (RPC)

Performance of Firefly RPC

Michaels Schroeder and Burrows

Page 37: Slides on cross-domain call and Remote Procedure Call (RPC)

Performance of Firefly RPC

Michaels Schroeder and Burrows

Page 38: Slides on cross-domain call and Remote Procedure Call (RPC)

Performance of Firefly RPC

Michaels Schroeder and Burrows

Page 39: Slides on cross-domain call and Remote Procedure Call (RPC)

ASPLOS 1991

Page 40: Slides on cross-domain call and Remote Procedure Call (RPC)

Schroeder and Burrows suggest that tripling CPU speed would reduce SRC RPC latency for a small packet by about 50%, on the expectation that the 83% of the time not spent on the wire will decrease by a factor of 3. Looking at Table 3, however, we see that much of the RPC time goes to functions that may not benefit proportionally from modern architectures. … The only real “computation” in RPC, in the traditional sense, is the checksum processing, and this in fact is memory-intensive and not compute-intensive; each checksum addition is paired with a load …. Thus, Ousterhout found in the Sprite operating system [Ousterhout et al. 88] that kernel-to-kernel null RPC time was reduced by only half when moving from a Sun-3/75 to a SPARCstation-1, even though integer performance increased by a factor of five [Ousterhout 90a].
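The arithmetic behind these claims can be checked with a simple model. The assumption (mine, for illustration) is that latency splits into a part that shrinks in proportion to CPU speed and a part that does not change at all; the function names are hypothetical.

```python
def latency_after_speedup(scaling_fraction, speedup):
    """Fraction of the original latency left when only `scaling_fraction`
    of the time shrinks by `speedup` and the rest stays the same."""
    fixed_fraction = 1.0 - scaling_fraction
    return fixed_fraction + scaling_fraction / speedup

def implied_scaling_fraction(remaining, speedup):
    """Invert the model: what fraction must have scaled to leave `remaining`?"""
    return (1.0 - remaining) / (1.0 - 1.0 / speedup)

if __name__ == "__main__":
    # 83% of the time off the wire, CPU three times faster.
    print(round(latency_after_speedup(0.83, 3), 3))
    # Sprite: time halved with a fivefold integer speedup.
    print(round(implied_scaling_fraction(0.5, 5), 3))
```

Under this model, 0.17 + 0.83/3 leaves roughly 45% of the original latency, in line with the quoted "about 50%" reduction. Read the same way, the Sprite measurement (time halved at five times the integer performance) would correspond to only about 62.5% of the null RPC time scaling with the CPU, which is the passage's point.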

Page 41: Slides on cross-domain call and Remote Procedure Call (RPC)

Android: object-based RPC channels

[Diagram: processes (JVM+lib), including the Activity Manager Service etc., run above the Linux kernel and communicate through Android binder (/dev/binder), an add-on kernel driver for binder object RPC.]

Android services and libraries communicate by sending messages through shared-memory channels set up by binder.

Services register to advertise for clients. A client binds to a service. Bindings are reference-counted.

Page 42: Slides on cross-domain call and Remote Procedure Call (RPC)

Binder is an add-on driver module that runs in the kernel. Unix drivers can define arbitrary “I/O control” APIs invoked through the ioctl system call. The ioctl syscall was designed for device control, but it serves as a general mechanism to extend the kernel and syscall interface (“kitchen sink”).
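To give a feel for the ioctl mechanism itself, the sketch below issues a standard request, `FIONREAD`, against a pipe to ask the kernel how many bytes are waiting. It does not talk to binder; it only shows the general shape of an ioctl call (file descriptor, request code, argument buffer). It assumes a Unix-like system where Python's `fcntl` and `termios` modules are available, and `bytes_available` and `demo` are hypothetical helper names.

```python
import fcntl
import os
import struct
import termios

def bytes_available(fd: int) -> int:
    """Ask the kernel, via ioctl, how many bytes are ready to read on fd."""
    # The argument buffer holds one C int that the kernel fills in.
    buf = fcntl.ioctl(fd, termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", buf)[0]

def demo() -> int:
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, b"hello")
        return bytes_available(read_fd)
    finally:
        os.close(read_fd)
        os.close(write_fd)

if __name__ == "__main__":
    print(demo())
```

A driver such as binder defines its own request codes and argument layouts in the same style, which is what makes ioctl a general extension point.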

Kernel space

Page 43: Slides on cross-domain call and Remote Procedure Call (RPC)

Binder: thread pool details

“The system maintains a pool of transaction threads in each process that it runs in. These threads are used to dispatch all IPCs coming in from other processes.

For example, when an IPC is made from process A to process B, the calling thread in A blocks in transact() as it sends the transaction to process B. The next available pool thread in B receives the incoming transaction, calls Binder.onTransact() on the target object, and replies with the result Parcel.

Upon receiving its result, the thread in process A returns to allow its execution to continue. …”

[http://developer.android.com/reference/android/os/IBinder.html]

Note: in this setting, a “transaction” is just an RPC request/response exchange.

Page 44: Slides on cross-domain call and Remote Procedure Call (RPC)

Stubs and Interface Description Language

This picture illustrates the Android class structure for objects invoked over binder RPC. …including classes generated via Android’s IDL (AIDL).