Docker: Principles and Implementation (Docker 原理與實作)
果凍
About the speaker
● Works at 迎廣科技
  ○ python
  ○ openstack
● http://about.me/ya790206
● http://blog.blackwhite.tw/
● https://github.com/ya790206/call_seq
Agenda
● linux kernel namespace
● seccomp
● cgroup
● lxc
● docker
docker
● lightweight, portable, self-sufficient containers.
● processes running in one container are isolated from processes running in other containers.
Linux startup process
● Linux startup process
  ○ Boot loader ->
  ○ Kernel ->
  ○ Init process
● Difference between Linux distros:
  ○ package manager
  ○ init
Docker is built on:
● aufs
● lxc
● Kernel namespaces
● AppArmor and SELinux profiles
● Seccomp policies
● Control groups
● Kernel capabilities
● Chroots
● btrfs
kernel namespace
● The purpose of each namespace is to wrap a particular global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource.
● Private view
kernel pid namespace
root pid namespace
  pid 1 (pid 1)
  pid 2 (pid 2)
  pid namespace x
    pid 3 (pid 1)
    pid 4 (pid 2)
● first number (black on the slide): the real pid.
● number in parentheses (red on the slide): the pid the process gets from getpid().
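On a live system, the two numbers in the diagram can be observed through the `NSpid` field of `/proc/<pid>/status` (Linux 4.1 and later), which lists a process's pid in each nested PID namespace, outermost first. The sketch below only parses that field; the helper name `parse_nspid` and the sample text are illustrative assumptions, not part of the original slides.

```python
def parse_nspid(status_text):
    """Return the pids from the NSpid line of a /proc/<pid>/status dump.

    The list is ordered from the outermost PID namespace visible to the
    reader to the innermost one the process lives in.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []  # field missing (for example, a kernel older than 4.1)


# Hypothetical status excerpt for "pid 3 (pid 1)" from the diagram.
SAMPLE = "Name:\tbash\nPid:\t3\nNSpid:\t3\t1\n"

if __name__ == "__main__":
    real_pid, *inner = parse_nspid(SAMPLE)
    print("real pid:", real_pid, "pid seen inside the namespace:", inner[-1])
```

To inspect a real process, pass the contents of `/proc/<pid>/status` read on the host instead of `SAMPLE`.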
kernel namespace
● Mount namespaces
● UTS namespaces
● PID namespaces
● Network namespaces
● User namespaces
● IPC namespaces
int child_pid = clone(child_main, child_stack+STACK_SIZE, CLONE_NEWUTS | CLONE_NEWIPC | CLONE_NEWPID | SIGCHLD, NULL);
● https://gist.github.com/ya790206/9855021
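The third argument of the clone() call is a bit mask. As a rough illustration, the sketch below rebuilds that mask in Python. The numeric values are those of the CLONE_NEW* constants in the Linux headers; the table and function names are made up for this example, and nothing here actually creates a namespace.

```python
import signal

# Values of the CLONE_NEW* constants from the Linux headers (<linux/sched.h>).
CLONE_FLAGS = {
    "mount": 0x00020000,  # CLONE_NEWNS
    "uts": 0x04000000,    # CLONE_NEWUTS
    "ipc": 0x08000000,    # CLONE_NEWIPC
    "user": 0x10000000,   # CLONE_NEWUSER
    "pid": 0x20000000,    # CLONE_NEWPID
    "net": 0x40000000,    # CLONE_NEWNET
}


def clone_flags(namespaces, exit_signal=signal.SIGCHLD):
    """OR together the flags for the requested namespaces.

    The low byte of the mask carries the signal sent to the parent when
    the child exits, which is why the slide ORs in SIGCHLD.
    """
    mask = int(exit_signal)
    for name in namespaces:
        mask |= CLONE_FLAGS[name]
    return mask


if __name__ == "__main__":
    # Same combination as the clone() call on the slide.
    print(hex(clone_flags(["uts", "ipc", "pid"])))
```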
The tail is still showing (the isolation is incomplete)
int child_pid = clone(child_main, child_stack+STACK_SIZE, CLONE_NEWUTS | CLONE_NEWIPC | CLONE_NEWPID | CLONE_NEWNS | SIGCHLD, NULL);
mount("proc", "/proc", "proc", 0, NULL);
● https://gist.github.com/ya790206/9855094
seccomp
● A process running in seccomp (strict) mode is severely limited in what it can do.
● Only four system calls are allowed: read(), write(), exit(), and sigreturn(); read() and write() work only on already-open file descriptors.
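Whether a process is currently confined can be checked through the `Seccomp` field of `/proc/<pid>/status` (0 = disabled, 1 = strict, 2 = filter); strict is the four-syscall mode described above. A minimal sketch, assuming a kernel that exposes this field; `seccomp_mode` is a hypothetical helper name.

```python
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_mode(status_text):
    """Map the Seccomp field of a /proc/<pid>/status dump to a mode name."""
    for line in status_text.splitlines():
        if line.startswith("Seccomp:"):
            return SECCOMP_MODES.get(int(line.split()[1]), "unknown")
    return "unknown"  # field absent, e.g. kernel built without seccomp


if __name__ == "__main__":
    try:
        with open("/proc/self/status") as status:
            print(seccomp_mode(status.read()))
    except OSError:
        print("no /proc on this system")
```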
libseccomp example
https://gist.github.com/ya790206/9579145
cgroup
● This work was started by engineers at Google
● Resource limiting
● Prioritization
● Accounting
● Control
cgroup
  ○ blkio: this subsystem sets limits on input/output access to and from block devices such as physical drives (disk, solid state, USB, etc.).
  ○ cpu: this subsystem uses the scheduler to provide cgroup tasks access to the CPU.
  ○ cpuacct: this subsystem generates automatic reports on CPU resources used by tasks in a cgroup.
  ○ cpuset: this subsystem assigns individual CPUs (on a multicore system) and memory nodes to tasks in a cgroup.
  ○ devices: this subsystem allows or denies access to devices by tasks in a cgroup.
  ○ freezer: this subsystem suspends or resumes tasks in a cgroup.
  ○ memory: this subsystem sets limits on memory use by tasks in a cgroup, and generates automatic reports on memory resources used by those tasks.
  ○ net_cls: this subsystem tags network packets with a class identifier (classid) that allows the Linux traffic controller (tc) to identify packets originating from a particular cgroup task.
  ○ net_prio: this subsystem provides a way to dynamically set the priority of network traffic per network interface.
  ○ ns: the namespace subsystem.
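To see which cgroup a process belongs to for each of these subsystems, `/proc/<pid>/cgroup` lists one `hierarchy-ID:subsystem-list:cgroup-path` line per mounted hierarchy. The parser below is a small sketch of reading that format; the function name and the sample text are assumptions made for this example.

```python
def parse_proc_cgroup(text):
    """Map each subsystem name to the cgroup path the process belongs to."""
    membership = {}
    for line in text.splitlines():
        if not line:
            continue
        # Split at most twice: the cgroup path itself may contain ':'.
        _hierarchy_id, controllers, path = line.split(":", 2)
        for controller in controllers.split(","):
            if controller:  # empty on a cgroup v2 unified hierarchy
                membership[controller] = path
    return membership


# Hypothetical contents of /proc/<pid>/cgroup for a task placed in /my.
SAMPLE = "4:freezer:/my\n3:cpu,cpuacct:/\n2:memory:/\n"

if __name__ == "__main__":
    for subsystem, path in sorted(parse_proc_cgroup(SAMPLE).items()):
        print(subsystem, "->", path)
```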
cgroup freezer
● The cgroup freezer is useful to batch job management systems which start and stop sets of tasks in order to schedule the resources of a machine according to the desires of a system administrator.
$ mount -t cgroup -ofreezer freezer /<path>/freezer
/<path>/freezer: root cgroup
  tasks (all processes), other files, my/
$ mkdir /<path>/freezer/my
/<path>/freezer/my: sub cgroup
  tasks (pids), other files
cgroup freezer
$ mount -t cgroup -ofreezer freezer /<path>/freezer
$ cd /<path>/freezer/; ls
cgroup.clone_children cgroup.event_control cgroup.procs cgroup.sane_behavior notify_on_release release_agent tasks
1. mkdir my_group; cd my_group
2. echo $some_pid > tasks
3. echo FROZEN > freezer.state
4. echo THAWED > freezer.state
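The four steps above can be sketched in Python as plain file writes. This assumes a cgroup v1 freezer hierarchy is already mounted and that the script runs as root; the function names are hypothetical. Pointed at an ordinary directory instead, the code just creates regular files, which is a convenient way to dry-run it.

```python
import os


def add_task(cgroup_dir, pid):
    """Step 2: move a process into the cgroup by writing its pid to 'tasks'."""
    with open(os.path.join(cgroup_dir, "tasks"), "w") as tasks:
        tasks.write(f"{pid}\n")


def set_freezer_state(cgroup_dir, state):
    """Steps 3 and 4: write FROZEN or THAWED to 'freezer.state'."""
    if state not in ("FROZEN", "THAWED"):
        raise ValueError(f"unsupported freezer state: {state}")
    with open(os.path.join(cgroup_dir, "freezer.state"), "w") as state_file:
        state_file.write(state + "\n")


def freeze_process(freezer_root, group, pid):
    """Create the sub cgroup (step 1), add the pid, then freeze it."""
    cgroup_dir = os.path.join(freezer_root, group)
    os.makedirs(cgroup_dir, exist_ok=True)
    add_task(cgroup_dir, pid)
    set_freezer_state(cgroup_dir, "FROZEN")
    return cgroup_dir
```

A later call to `set_freezer_state(cgroup_dir, "THAWED")` resumes the tasks in the group.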
other cgroup
● memory cgroup:
  ○ limit process memory usage.
  ○ show various statistics
● blkio cgroup:
  ○ change weight
  ○ show various statistics
lxc
● LXC is a userspace interface for the Linux kernel containment features.
● Container templates
● A set of standard tools to control the containers
lxc
host os
  container A
    process 1
    process 2
  container B
    process 3
    process 4
  process x
Legend: "A → B" means A can see B; "A ↔ B" means A can see B and B can see A.
lxc
1. lxc-create -n test-container -t ubuntu
2. lxc-ls --fancy
3. lxc-start -n test-container
4. lxc-console -n test-container
5. lxc-stop -n test-container
6. lxc-destroy -n test-container
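For scripting, the same lifecycle can be expressed as a list of argument vectors. The sketch below only assembles and prints the six commands by default. The helper names are assumptions, and actually running the commands requires the lxc tools and root. Note that `lxc-console` is interactive, so a real script would likely skip it.

```python
import subprocess


def lxc_lifecycle(name, template="ubuntu"):
    """Return the argv lists for the six commands, in slide order."""
    return [
        ["lxc-create", "-n", name, "-t", template],
        ["lxc-ls", "--fancy"],
        ["lxc-start", "-n", name],
        ["lxc-console", "-n", name],
        ["lxc-stop", "-n", name],
        ["lxc-destroy", "-n", name],
    ]


def run_all(name, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in lxc_lifecycle(name):
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_all("test-container")  # dry run: prints the commands only
```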
start vs execute
● start:
  ○ boots a Linux system
● execute:
  ○ executes a program directly
  ○ make sure you have "/usr/lib/lxc/lxc-init" in your container
sudo lxc-checkpoint --name p1 --statefile a
● output:
  ○ lxc-checkpoint: 'checkpoint' function not implemented
linux aufs
● It allows files and directories from separate filesystems to coexist under a single directory.
/tmp/union
/tmp/a /tmp/b /tmp/c
# apt-get install aufs-tools
# mount -t aufs -o br=/tmp/a:/tmp/b none /tmp/union/
# mount -t aufs -o br=/tmp/a=rw:/tmp/b=rw none /tmp/union
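The lookup rule behind such a union can be modelled in a few lines: branches are searched in the order given in `br=`, and the first branch containing a path wins. This is a simplified user-space model, not how aufs is implemented in the kernel; it ignores whiteouts and write handling, and both function names are made up for illustration.

```python
import os


def union_lookup(branches, relative_path):
    """Return the real path that a union of `branches` would expose.

    Branches are searched left to right, like the br= list; the first
    branch that contains the path wins.
    """
    for branch in branches:
        candidate = os.path.join(branch, relative_path)
        if os.path.lexists(candidate):
            return candidate
    return None


def union_listdir(branches):
    """Merge the top-level names of all branches, without duplicates."""
    names = set()
    for branch in branches:
        names.update(os.listdir(branch))
    return sorted(names)
```

For example, `union_lookup(["/tmp/a", "/tmp/b"], "x")` returns `/tmp/a/x` when both branches contain `x`, and `/tmp/b/x` when only `/tmp/b` does.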
docker vs lxc
● docker is based on lxc.
● docker can create an image from a text file (a Dockerfile).
● docker seldom boots a full system.
● docker provides a user-friendly interface.
● docker uses less disk space (aufs).
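docker
● image (rootfs) --run--> running container (process + rootfs)
● running container --stop--> stopped container (rootfs)
● stopped container --start--> running container
● container --commit--> image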
rootfs in container (aufs, top layer first)
  image: rw
  ZZZ image: ro
  XXX image: ro
  ubuntu image: ro
rootfs in image (aufs, top layer first)
  image: ro
  ZZZ image: ro
  XXX image: ro
  ubuntu image: ro
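The layering in this diagram behaves much like Python's `collections.ChainMap`: lookups fall through the layers from top to bottom, and writes only ever touch the first (read-write) layer. The toy model below uses dictionaries in place of filesystems; the file names and contents are invented for illustration.

```python
from collections import ChainMap

# Read-only image layers, bottom to top (file path -> content).
ubuntu = {"/etc/issue": "Ubuntu"}
xxx = {"/usr/bin/python": "python binary"}
zzz = {"/etc/issue": "Ubuntu, customised"}


def container_rootfs(*image_layers):
    """Stack a fresh writable layer on top of the given image layers.

    image_layers are ordered bottom to top; ChainMap searches its first
    mapping first, so the order is reversed and the rw layer goes in front.
    """
    rw_layer = {}
    return ChainMap(rw_layer, *reversed(image_layers))


rootfs = container_rootfs(ubuntu, xxx, zzz)
rootfs["/tmp/note"] = "written in the container"  # lands in the rw layer only
```

Because the image layers are never modified, several containers can share them and each only pays for its own rw layer, which is why this scheme saves disk space.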
taiwan.py site dockerfile
FROM ubuntu:12.10
RUN apt-get update
RUN apt-get install -y python-dev
RUN apt-get install -y python-pip
RUN apt-get install -y git
RUN pip install mynt
RUN git clone https://github.com/lucemia/taiwan.py
RUN mynt gen -f taiwan.py/src/ taiwan.py/build/
EXPOSE 8000
CMD cd taiwan.py/build/ && python -m SimpleHTTPServer
How to run
1. cat dockerfile | sudo docker build -t taiwanpy -
2. docker run -p 9000:8000 taiwanpy
3. docker stop xxx
4. docker start xxx
5. docker stop xxx
6. docker rm xxx
7. docker rmi taiwanpy
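The `-p` option of `docker run` takes `hostPort:containerPort`, so the container side must match the port the server listens on (8000 here, per `EXPOSE 8000` and the SimpleHTTPServer default). A small sketch of that check follows. It handles only the two-field form of the option, and both helper names are hypothetical.

```python
def parse_publish(spec):
    """Split a 'hostPort:containerPort' string as accepted by `docker run -p`."""
    host_port, container_port = spec.split(":")
    return int(host_port), int(container_port)


def reaches_server(spec, listening_port=8000):
    """True if the published container port is the one the server listens on."""
    _host_port, container_port = parse_publish(spec)
    return container_port == listening_port


if __name__ == "__main__":
    for spec in ("9000:8000", "8000:8000"):
        host_port, _ = parse_publish(spec)
        print(spec, "-> browse http://localhost:%d" % host_port)
```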
simple docker shell
● https://github.com/ya790206/misc_tools/tree/master/docker_wrapper
Summary
● Namespaces for virtualization.
● Cgroups for controlling a group of processes.
● The container and the host system use the same kernel.
● Docker is similar to lxc, but docker is easier to use.
Question
Thank you
References - kernel namespace
● Namespaces in operation, part 1: namespaces overview
● PaaS under the hood, episode 1: kernel namespaces
● Introduction to Linux namespaces – Part 1: UTS
References - cgroup
● cgroup
● http://en.wikipedia.org/wiki/Cgroups
Bibliography
● Linux Kernel Hacks:改善效能、提昇開發效率及節能的技巧與工具 (techniques and tools for improving performance, development efficiency, and power saving)