Build High-Performance, Scalable, Distributed Applications with Stacks of Containers
Jérôme Petazzoni

TRANSCRIPT
Build high-performance, scalable, distributed applications with
stacks of containers
Good afternoon!
Jérôme Petazzoni (@jpetazzo)
Grumpy French DevOps
- Go away or I will replace you with a very small shell script
Goal in life: run everything in containers
- Docker-in-Docker
- VPN-in-Docker
- KVM-in-Docker
- Xorg-in-Docker
- etc
Outline
- Why containers?
- Containers vs Virtual Machines
- High speed communication
- Service discovery and load balancing
- Conclusions
why should we use
containers?
Packaging format
It's easier to build a container image than a traditional distro package (.deb, .rpm...)
- Dockerfile

Application dependencies are isolated from the host
- application can use Python X even if the host has Python Y
- application can use distro X even if the host has distro Y

Downside: increased disk usage
- but disks are cheap
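To give a sense of how light the packaging format is, here is what a minimal Dockerfile for a hypothetical Python service could look like (the base image, file names, and paths are illustrative, not from the talk):

```dockerfile
# Base image pins the distro and Python version, independent of the host
FROM python:2.7
# Install the application's dependencies inside the image
COPY requirements.txt /app/requirements.txt
RUN pip install -r /app/requirements.txt
# Add the application code itself
COPY . /app
WORKDIR /app
# Command executed when the container starts
CMD ["python", "app.py"]
```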
Service abstraction*
Service installation → “docker pull”
- no more dependency issues

Service start → “docker run”
- no more fiddling with language-specific wrappers

Service stop → “docker stop/kill”
- no more runaway processes

* If using a standard container format, e.g. Docker
Service abstraction
Service check → “docker inspect”
- no more “is this thing really running or not?”

Service reset → “docker run”
- discard copy-on-write layer to return to original state
- no more “oops, I broke it, how do I fix it?”
- doesn't work in all situations, e.g. data corruption (but recovery is easier because creating test copies is cheap & fast)
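The whole lifecycle above fits in a handful of CLI commands; as a sketch, using a stock image such as `redis` (chosen here purely as an example):

```shell
docker pull redis                   # install: fetch the image and its dependencies
docker run -d --name mydb redis     # start: run the service in the background
docker inspect mydb                 # check: JSON status, including State.Running
docker stop mydb                    # stop: no runaway processes left behind
docker rm mydb                      # reset: discard the copy-on-write layer...
docker run -d --name mydb redis     # ...and start again from the pristine image
```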
No overhead*
CPU performance = native performance
Memory performance = a few % shaved off for (optional) accounting
Network and disk I/O performance = small overhead; can be reduced to zero
*May require tuning!
Immutable infrastructure
What's an immutable infrastructure?
- re-create images each time you change a line of code
- prevent (or track) modifications of running images

Why is it useful?
- no more deviant servers after manual upgrades
- no more “oops, how do we roll back?” after a catastrophic upgrade
- easier security audit (inspect images at rest)

How can containers help?
- container images are easier to create and manage than VM images
Micro-service architecture
What's a micro-service architecture?
- decompose a big application into many small services

Why is it useful?
- it's easier to upgrade/refactor/replace a small service
- encourages having many small teams*, each owning a service (*small teams are supposedly better; see Jeff Bezos' “two-pizza rule”)

How can containers help?
- problem: 10 micro-services instead of 1 big application = 10x more work to deploy everything
- solution: we need extremely easy deployment; hello containers!
containers vs
virtual machines
Virtual Machines
Emulate CPU instructions (painfully slow)
Emulate hardware (storage, network...) (painfully slow)
Run as a userland process on top of a kernel (painfully slow)
Virtual Machines
Use native CPU (fast!)
Paravirtualized storage, network... (fast, but higher resource usage)
Run on top of a hypervisor (faster, but still some overhead)
Containers
Processes isolated from each other
Very little extra code path (in many cases, it's comparable to UID checking)
Virtual Machines vs Containers

Virtual Machines: native CPU, paravirtualized devices, hypervisor
Containers: native CPU, native syscalls, native kernel
Inter-VM communication
Strong isolation, enforced by hypervisor + hardware
- no fast-path data transfer between virtual machines
- yes, there are PCI pass-throughs and things like xenbus, but that's not easy to use, very specific, not portable

Most convenient method: network protocols (L2/L3)
But: huge advantage from a security POV
Inter-container communication
Tunable isolation
- each namespace can be isolated or shared

Allows normal Unix communication mechanisms
- network protocols on loopback interface
- UNIX sockets
- shared memory
- IPC...
Reuse techniques that we know and love (?)
high speed communication for containers
Shared localhost
Multiple containers can share the same “localhost” (by reusing the same network namespace)
Communication over localhost is very, very fast
Also: localhost is a well-known address
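With Docker, one way to do this is the `--net container:` mode, which makes a second container join the first one's network namespace (image and container names below are illustrative):

```shell
# Start the first container normally
docker run -d --name app myimage
# The second container joins app's network namespace:
# both now share the same "localhost" and interfaces
docker run -d --net container:app mysidecar
```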
Shared filesystem
A directory can be shared by multiple containers (by using a bind-mount)
That directory can contain:
- named pipes (FIFOs)
- UNIX sockets
- memory-mapped files
Bind-mount = zero overhead
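The mechanism is plain Unix. This sketch shows a FIFO in a shared directory, with the two halves standing in for two containers; with Docker, each side would simply receive the directory via a bind-mount (`docker run -v "$dir":/shared ...`):

```shell
#!/bin/sh
# Directory that would be bind-mounted into both containers
dir=$(mktemp -d)
mkfifo "$dir/pipe"

# "Producer container": writes into the FIFO
# (blocks until a reader opens the other end)
echo "hello from producer" > "$dir/pipe" &

# "Consumer container": reads from the FIFO
msg=$(cat "$dir/pipe")
echo "$msg"

rm -rf "$dir"
```

The data never leaves the kernel's pipe buffer, which is why the bind-mount path has zero overhead compared to doing the same thing between two ordinary processes.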
Shared IPC
Multiple containers can share IPC resources (using the special IPC namespace)
Semaphores, shared memory, message queues...
Is anybody still using this?
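With Docker this maps to the `--ipc` flag (container and image names are illustrative):

```shell
docker run -d --name worker myimage
# Joins worker's IPC namespace: System V shared memory segments,
# semaphores, and message queues become common to both containers
docker run -d --ipc container:worker mysidecar
```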
Host networking
Containers can share the host's network stack (by reusing its network namespace)
They can use the host's interfaces without penalty (high speed, low latency, no overhead!)
Native performance to talk with external containers
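In Docker terms this is host networking (image name illustrative):

```shell
# --net host: the container sees the host's interfaces (eth0, lo, ...)
# directly, with no bridge or NAT in the data path
docker run -d --net host myimage
```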
Host filesystems
Containers can share a directory with the host
Example: use fast storage (SAN, SSD...) in a container
- mount it on the host
- share it with the container
- done!
Native performance to use I/O subsystem
Device nodes
Containers can use the host's devices (if allowed)
Access is granted through the “devices” control group
Performance (throughput, latency, CPU usage) is the same as using the device on the host
Examples:- /dev/sd*: raw block device in container
- /dev/video*: GPU in container
- /dev/kvm: VM in container
- and more!
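With Docker, device access is granted per device with the `--device` flag (device and image names below are illustrative):

```shell
# Expose /dev/kvm so the container can run a VM at native speed
docker run -d --device /dev/kvm myimage
# Raw block device access from inside the container
docker run -d --device /dev/sdb myimage
```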
service discovery,
load balancing
Service discovery is not a new problem
Usually not important for…
- small architectures
- low server count
- static deployments
But with containers...
- container count can be very high
- deployments are very dynamic
you need service
discovery
Name resolution
Provide a custom /etc/hosts in the container (or give a custom DNS resolver to the container)
Easy to integrate in code (just connect to e.g. “db”)
But:
- no way to push changes to the application
- some libraries won't re-resolve
- might need to restart containers when service address changes
- service must run on well-known port
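With Docker, the custom entry can be injected at run time with `--add-host` (the address and image name here are illustrative):

```shell
# The container's /etc/hosts gains "10.0.0.42  db";
# application code just connects to "db" on its well-known port
docker run -d --add-host db:10.0.0.42 mywebapp
```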
Environment variables (or SRV records)
Solves the “well-known port” requirement
Harder to integrate in code
Doesn't solve the other problems
In-app discovery with config DB
Connect to zookeeper/etcd to find service location
Watch zookeeper/etcd for changes
If a change occurs, reconnect
In-app discovery with config DB
Works well, but requires deep code changes
If you support multiple languages, it's a lot of work
Transitioning from one system to another is hard
ambassadors
[Diagram: a single container host. The database container (“I'm frontdb!”) and the web container (“I want to talk to frontdb!”) connect locally.]

[Diagram: a database host and a web host. The web container still says “I want to talk to frontdb!” and connects locally to a wiring container on its host (“I pretend I'm frontdb!”); that wiring container talks across the network to a wiring container on the database host (“I actually talk to frontdb!”), which connects locally to the database container (“I'm frontdb!”).]
UNICORNS
But! We are adding extra layers!
Yes. But service/ambassador communication can be:
- over localhost
- over UNIX socket (even better but not always possible)
- over iptables or IPVS (to bypass the ambassador process)
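A minimal ambassador can be nothing more than a TCP relay. As a sketch, using `socat` next to the web container (the remote hostname and port are illustrative):

```shell
# Listens on the well-known local port and forwards
# every connection to wherever the real database lives
socat TCP-LISTEN:6379,fork,reuseaddr TCP:db-host.internal:6379
```

When the database moves, only the ambassador is reconfigured; the application keeps connecting to localhost:6379.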
Ambassador implementations
A few examples:
- Registrator
  https://github.com/progrium/registrator
- Grand Ambassador
  https://github.com/cpuguy83/docker-grand-ambassador
- AirBNB SmartStack
  https://github.com/airbnb/nerve
  https://github.com/airbnb/synapse
Or roll your own
- some HA KV store + HAProxy, stunnel, iptables, IPVS...
- serverless, ad-hoc discovery (avahi, multicast...)
conclusions
Thanks to containers, we can...
Build, ship, and run our applications more easily
Decouple application code from “plumbing” (service discovery, load balancing)
Which lets us implement micro-service architectures efficiently (hopefully without the usual drawbacks!)
Which yields higher agility, shorter dev cycles, etc.
Win!
containerize!
thank you! questions?