linux kernel library - reusing monolithic kernel

Post on 10-Feb-2017


Linux Kernel Library: Reusing Monolithic Kernel

Hajime Tazaki
IIJ Innovation Institute

2016/07

AIST seminar vol.2

2 . 1

LKL in a nutshell

Linux kernel library
  a library of Linux
  Octavian Purdila (Intel)'s work (since 2007?)
  Proposed on LKML (Nov. 2015)
    2809 LoC (as of Apr. 2016)
    https://lwn.net/Articles/662953/

Purdila et al., LKL: The Linux kernel library, RoEduNet 2010.

2 . 2

LKL (cont'd)

hardware-independent architecture (arch/lkl)
provides an interface to the underlying environment
  outsourced dependencies: clock, memory allocation, scheduler
  runs on Windows, Linux, FreeBSD
simplifies I/O operation of devices
  virtio host implementation
  can use the (virtio) driver in Linux

Purdila et al., LKL: The Linux kernel library, RoEduNet 2010.

2 . 3

Benefit

less ossification of new features
  operating system personality
  userspace library has less deployment cost
Well-matured code base
  (e.g.) Linux kernel running in userspace
  small kernel, a bunch of libraries
  but in a different shape

Any problem in computer science can be solved with another level of indirection.
(Wheeler and/or Lampson)

img src: https://www.flickr.com/photos/thomasclaveirole/305073153

2 . 4

2 . 5

What is reusing monolithic kernel?

Anykernel: originally in NetBSD rump kernel

  We define an anykernel to be an organization of kernel code which allows the kernel's unmodified drivers to be run in various configurations such as application libraries and microkernel style servers, and also as part of a monolithic kernel. -- Kantee 2012.

Using the (unmodified) high-quality code base of a monolithic kernel
  on different environments
  in different shapes
  by gluing additional stuff

2 . 6

2 . 7

(a bit of) History

rump: 2007 (NetBSD)
LKL: 2007 (Linux)
DCE/LibOS: 2008 (Linux/FreeBSD)
LibOS/LKL revival: 2015
  LibOS merged to LKL

http://news.mynavi.jp/news/2015/03/25/285/
https://news.ycombinator.com/item?id=9259292
http://www.phoronix.com/scan.php?page=news_item&px=Linux-Library-LibOS
http://lwn.net/Articles/639333/

2 . 8

2 . 9

LKL vs. LibOS

LKL

LibOS

LKL vs. LibOS (cont'd)

LoC:
  arch/lkl (LKL) < arch/lib (LibOS)
  diff: the amount of stub code
commons
  no modification to the original Linux code
  description of kernel context (by POSIX thread)
  outsourced resources (clock, memory, scheduler)
  CPU independent architecture
diffs
  LibOS: implemented with higher API (timer, irq, kthread) by pthread
  LKL: implements IRQ, kthread, timer with pthread in lower layer

2 . 10

3 . 1

Implementation


3 . 2

Internals

1. Host backend (host_ops)
2. CPU independent arch. (arch/lkl)
3. Application interface

1. host backend

environment dependent part
  unify an interface across different platforms (rump-hypercall like)
device interface with Virtio
  block device <=> disk image
  networking <=> TAP, raw socket, DPDK, VDE
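The host backend idea can be sketched as a table of callbacks that the library kernel calls instead of touching the platform directly. The following is a minimal illustration in Python, not LKL's actual host_ops definition: the `HostOps` fields (`time_ns`, `mem_alloc`, `log`) and the `TinyKernel` class are hypothetical names chosen for this example.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class HostOps:
    """Hypothetical table of host-provided operations (not LKL's real struct)."""
    time_ns: Callable[[], int]             # clock source
    mem_alloc: Callable[[int], bytearray]  # memory allocation
    log: Callable[[str], None]             # console output


class TinyKernel:
    """Stand-in for the library kernel: it reaches the host only via HostOps."""

    def __init__(self, ops: HostOps) -> None:
        self.ops = ops
        self.boot_time = ops.time_ns()

    def uptime_ns(self) -> int:
        return self.ops.time_ns() - self.boot_time

    def alloc_page(self) -> bytearray:
        return self.ops.mem_alloc(4096)


# One possible backend for a POSIX-like host.
posix_ops = HostOps(
    time_ns=time.monotonic_ns,
    mem_alloc=bytearray,
    log=print,
)

# A deterministic backend, e.g. for tests: the clock advances 1000 ns per call.
_ticks = iter(range(0, 10**9, 1000))
fake_ops = HostOps(
    time_ns=lambda: next(_ticks),
    mem_alloc=bytearray,
    log=lambda msg: None,
)

kernel = TinyKernel(fake_ops)
page = kernel.alloc_page()
```

Swapping `fake_ops` for `posix_ops` changes the environment without changing `TinyKernel`, which is the property the slide attributes to the host backend.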

3 . 3

3 . 4

2. CPU independent architecture

architecture (arch/lkl)
  transparent architecture bind (as CPU arch)
  requires no modification to the other parts
2800 LoC
  thread information (struct thread_info)
  irq, timer, syscall handler
  access to underlying layer by host_ops


3 . 5

3. Application interface

1. use exposed API (LKL syscall)
2. use host libc (LD_PRELOAD)
3. extend (alternative) libc

3 . 6

API 1: use exposed API (LKL syscall)

call entry points of LKL kernel
  lkl_sys_open(), lkl_sys_socket()
almost the same as ordinary syscalls
  return value, errno notification are different
can use LKL syscall and host syscall simultaneously
  read ext4 file by lkl_sys_read() => write into host (Windows) by write()
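One way to read "return value, errno notification are different": a host libc wrapper returns -1 and stores the error code in errno, whereas an LKL syscall entry point is assumed here to return the negative error code directly, as Linux in-kernel syscall handlers do. The sketch below only models that convention; `fake_lkl_sys_open` and `check` are hypothetical stand-ins, not the real lkl_sys_open().

```python
import errno
import os


def fake_lkl_sys_open(path: str) -> int:
    """Hypothetical stand-in for an LKL syscall: returns fd >= 0 or -errno."""
    known_files = {"/etc/hostname": 3}
    if path in known_files:
        return known_files[path]
    return -errno.ENOENT


def check(ret: int) -> int:
    """Turn a negative-errno return value into a Python exception."""
    if ret < 0:
        raise OSError(-ret, os.strerror(-ret))
    return ret


fd = check(fake_lkl_sys_open("/etc/hostname"))

try:
    check(fake_lkl_sys_open("/missing"))
except OSError as exc:
    failure = exc.errno
```

A caller mixing LKL and host syscalls would apply a check like this only to the LKL side and keep the usual errno handling for host calls.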

3 . 7

API 2: hijack host standard library

dynamically replace symbols of host syscalls (of libc)
  LD_PRELOAD
  socket() => lkl_sys_socket()
can use host binary (executable) as-is
limitation of replaceable symbols
needs syscall translation on non-linux host
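The effect of symbol replacement can be modelled as a lookup in which preloaded definitions are searched before the host libc, which is how the dynamic linker treats LD_PRELOAD libraries. This is a toy model in Python rather than a real preload library; `host_libc`, `preloaded`, and `resolve` are hypothetical names, and the stand-in functions only report which side handled the call.

```python
from typing import Callable, Dict

# Symbols provided by the host libc (stand-ins that report who handled the call).
host_libc: Dict[str, Callable[..., str]] = {
    "socket": lambda *args: "host:socket",
    "open": lambda *args: "host:open",
    "getpid": lambda *args: "host:getpid",
}

# Symbols overridden by the preloaded hijack library. Only some are replaceable.
preloaded: Dict[str, Callable[..., str]] = {
    "socket": lambda *args: "lkl:lkl_sys_socket",
}


def resolve(symbol: str) -> Callable[..., str]:
    """Preloaded definitions win; everything else falls through to the host."""
    if symbol in preloaded:
        return preloaded[symbol]
    return host_libc[symbol]


socket_result = resolve("socket")(2, 1, 0)
getpid_result = resolve("getpid")()
```

Calls the hijack library does not override, such as `getpid` here, still reach the host, which is why the application binary can run unmodified.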

3 . 8

API 3: extend (alternative) libc

only call LKL syscall with our own libc
also introduced as a virtual CPU architecture
a program can link this instead of host libc
  can't access (underlying) host resources directly via this LKL syscall
as a patch for musl libc

3 . 9

Usecase (applications)

Use Case 1: instant kernel bypass
Use Case 2: programs reusing kernel code in userspace
Use Case 3: unikernel

3 . 10

Use Case 1: instant kernel bypass

syscall redirection by LD_PRELOAD
can use both LKL and host syscalls
new feature without touching host kernel

LD_PRELOAD=liblkl-super-tcp++.so firefox

3 . 11

Use Case 2: programs reusing kernel code in userspace

use kernel code without porting
  mount a filesystem w/o root privilege
can use both LKL and host syscalls

e.g., access a disk image in ext4 format on Windows
1. open disk image (CreateFile())
2. mount (lkl_sys_mount())
3. read a file in the disk image (lkl_sys_read())
4. write a file to the Windows side (WriteFile())

3 . 12

Use Case 3: Unikernel

single-application contained LKL
  python + LKL, nginx + LKL
  only LKL syscalls available
musl libc extension
rump hypercall (frankenlibc)
running on non-OS environment (on Xen Mini-OS via rumprun)

Work in progress

- http://www.linux.com/news/enterprise/cloud-computing/751156-are-cloud-operating-systems-the-next-big-thing-

3 . 13

demos with linux kernel library

Unikernel on Linux (ping6 command, embedded kernel library)

Unikernel on qemu-arm (helloworld)

4 . 1

Kernel bypass/userspace networking

4 . 2

Network Stack

Why in kernel space?
  the cost of a packet was high in that era ('70s)
  now much cheaper
Getting fat (matured) after decades
  code path is longer (and slower)
  hard to add new features
  faced unknown issues

img src: http://www.makelinux.net/kernel_map/

4 . 3

Alternate network stacks

lwip (2002~)
Arrakis [OSDI '14]
IX [OSDI '14]
MegaPipe [OSDI '12]
mTCP [NSDI '14]
SandStorm [SIGCOMM '14]
uTCP [CCR '14]
rumpkernel [ATC '09]
FastSocket [ASPLOS '16]
SolarFlare (2007~?)
StackMap [ATC '16]
libuinet (2013~)
SeaStar (2014~)
Snabb Switch (2012~)

4 . 4

Motivations

Socket API sucks
  StackMap, MegaPipe, uTCP, SandStorm, IX
  New API: no benefit with existing applications
Network stack in kernel space sucks
  FastSocket, mTCP, lwip (SolarFlare?)
Compatibility is (also) important
  rumpkernel, libuinet, Arrakis, IX, SolarFlare
Existing programming model sucks
  SeaStar

4 . 5

Techniques

batching (syscall/NIC access)
  Arrakis, IX, MegaPipe, mTCP, SandStorm, uTCP
Utilize feature-rich kernel stack
  rumpkernel, fastsocket, StackMap
Porting to userspace stack
  libuinet, SandStorm
Kernel bypass (userspace network stack)
  mTCP, SandStorm, uTCP, rumpkernel, libuinet, lwip, SeaStar
bypass technique itself
  netmap, PF_RING, raw socket, Intel DPDK
Connection locality (multi-core scalability)
  SeaStar, MegaPipe, mTCP, fastsocket, .....

4 . 6

Implementation

Full scratch
  lwip (Arrakis, IX, SolarFlare?), mTCP, uTCP, SeaStar
Porting based
  libuinet, SandStorm
New API
  MegaPipe, StackMap
Anykernel
  rumpkernel, (LKL)

4 . 7

What's still missing?

some solve problems by specialization
  avoiding the generality tax
  performance w/ specialization vs. more features w/ generalization
  e.g., fewer TCP stack features, new API breaks existing applications' support
specialized vs. generalized
  generalization often involves indirection
  indirection usually introduces complexity (Wheeler/Lampson)
performant and generalized?

5 . 1

Performance study

5 . 2

Conditions

ThinkStation P310 x2
  CPU: Intel Core i7-6700 CPU @ 3.40GHz (8 cores)
  Memory: 32GB
  NIC: X540-T2
Linux 4.4.6-301 (x86_64) on Fedora 23
  Linux bridge (X540 + tap/raw socket)
  no DPDK... can't with hijack, etc
netperf (git ~v2.7.0)
  netserver (native)
  netperf (varied)

5 . 3

Conditions (cont'd)

combinations
  netperf (sendmmsg) + host stack (native)
  + hijack library, native thread (hijack)
  + frankenlibc/lkl, green thread (lkl-musl)
  netperf (sendmmsg) + lkl extension + frankenlibc (lkl-musl (skb prealloc))
pinned a processor
  using taskset command
disable all offload features (tso/gso/gro, rx/tx cksum)

TCP_RR (netperf)

5 . 4

UDP_STREAM (netperf)

5 . 5

UDP_STREAM (pps, netperf)

5 . 6

TCP_STREAM (netperf)

5 . 7

5 . 8

(ref.) LibOS results (as of Feb. 2015)

1024 bytes UDP, own-crafted tool

throughput: <10% of Linux native

5 . 9

Observations (of benchmark)

Native thread vs Green thread
  better TCP_RR w/ native thread (pthread)
  better TCP_STREAM/UDP_STREAM w/ green thread???
avoiding dynamic allocation contributes a lot
penalized over MTU-sized payload on host stack (?)

6 . 1

Summary

Morphing monolithic kernel into an Anykernel
Various use cases
  Userspace network stack (kernel bypass)
  Unikernel

Performance study in progress

https://github.com/lkl/linux

6 . 2

Reference

Linux Kernel Library
  Purdila et al., LKL: The Linux kernel library, RoEduNet 2010.
Rumpkernel (dissertation)
  Kantee, Flexible Operating System Internals: The Design and Implementation of the Anykernel and Rump Kernels, Ph.D. Thesis, 2012.
Linux LibOS in general
  Tazaki et al., Direct Code Execution: Revisiting Library OS Architecture for Reproducible Network Experiments, CoNEXT 2013.

https://github.com/lkl/linux
http://libos-nuse.github.io/
https://lwn.net/Articles/637658/

7 . 1

Backups

7 . 4

Recent Updates

7 . 5

Updates (diff to lkl)

(musl) libc integration
rump hypercall interface
  via frankenlibc tools (for POSIX environment)
  via rumprun framework (for baremetal/xen/kvm environment)
more applications
  netperf (signal handling, etc)
  nginx
  ghc (Haskell runtime)

performance study

7 . 6

libc integration

standard lib for LKL
  all syscalls go directly to LKL
  application can use LKL transparently
  no special modifications or hijack needed
based on musl libc
  introduce new (sub) architecture lkl

rump hypercall interface

replacement of LKL host_ops
  or yet-another new host environment (rump)
has two thread primitives
  pthread-based (as LKL does)
  ucontext-based (more efficient on non-MP)
can reduce
  the effort of host_ops maintenance
  complexity of tall abstraction turtle

7 . 7

7 . 8

rump hypercall (cont'd)

integration of
  libc (musl for LKL, netbsd libc for rumpkernel)
  rump hypercall (on linux, freebsd, netbsd, qemu-arm, spike)
  host (platform) support code
frankenlibc
  has two namespaced libc(s)
  hypercall implementation can use libc
provides
  a libc.a
  cross-build toolchains (rumprun-cc, etc)


7 . 9

Usage

build

% ./configure CC=rumprun-cc ; make

execution (with rexec launcher)

% rexec ./nginx disk-nginx.img tap:tap0 -- -c nginx.conf

rexec executable [disk image file] [NIC] -- [executable specific options]
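The usage line can be read as: everything before `--` is for the launcher, everything after it is passed to the application. Below is a small sketch of that split in Python. `split_rexec_args` and `RexecArgs` are hypothetical helpers, not part of rexec, and treating the disk image and NIC as strictly positional is an assumption taken from the usage line alone.

```python
from typing import List, NamedTuple, Optional


class RexecArgs(NamedTuple):
    executable: str
    disk_image: Optional[str]
    nic: Optional[str]
    app_options: List[str]


def split_rexec_args(argv: List[str]) -> RexecArgs:
    """Split 'executable [disk image] [NIC] -- [options]' at the first '--'."""
    if "--" in argv:
        sep = argv.index("--")
        launcher, app_options = argv[:sep], argv[sep + 1:]
    else:
        launcher, app_options = argv, []
    if not launcher:
        raise ValueError("an executable is required")
    if len(launcher) > 3:
        raise ValueError("too many launcher arguments before '--'")
    # Assumption: optional launcher arguments are positional, in this order.
    disk_image = launcher[1] if len(launcher) > 1 else None
    nic = launcher[2] if len(launcher) > 2 else None
    return RexecArgs(launcher[0], disk_image, nic, app_options)


parsed = split_rexec_args(
    ["./nginx", "disk-nginx.img", "tap:tap0", "--", "-c", "nginx.conf"]
)
```

With the nginx command above, `parsed.app_options` holds the `-c nginx.conf` pair that nginx itself would receive.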

7 . 10

Codes

https://github.com/libos-nuse/lkl-linux
https://github.com/libos-nuse/musl
https://github.com/libos-nuse/frankenlibc
https://github.com/libos-nuse/rumprun
https://github.com/libos-nuse/nginx
https://github.com/libos-nuse/ghc
