Kubernetes: Containing the Chaos
Texas Linux Fest, 7/9/2016
Seth Jennings
Special Thanks
Minnowboard Turbot hardware for today's demo by netgate.com
Who am I?
● Seth Jennings
● Red Hat
● 5 years Linux kernel contributor
– Transparent memory compression (zswap)
– Kernel Live Patching (kpatch)
● 5 months Kubernetes developer
– In service to Openshift
Agenda
● Brief history of the isolation problem
● Linux containers overview
● Docker
● What is Kubernetes
– Architecture
– Resources
– Networking
– Demo
Bare Metal
● Perfect kernel-level isolation
● Server sprawl
● Underutilization
● Very rigid resource constraints
– hardware upgrade/downgrade required
Virtualization
● Very good kernel-level isolation
● Good legacy software compatibility
● Better utilization maybe
– VMs still overprovisioned
– Overhead of emulating hardware
● Still fairly rigid resource constraints
– Hot cpu add and memory ballooning
Containers
● Good (?) isolation without duplicating
– guest kernels
– page caches
– root filesystems, init systems, system daemons
● Legacy software might have troubles
● Flexible resource constraints
– cgroups
What is a Linux Container
● A resource-constrained, namespaced environment, initialized by a container manager and enforced by the kernel, where processes can run
● kernel cgroups limit hardware resources
– cpus, memory, i/o
● kernel namespacing limits resource visibility
– mount, pid, user, network, ipc
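On a Linux host, both mechanisms are visible per process under /proc. The sketch below is one way to peek at them from Python; the helper names `list_namespaces` and `read_cgroup_membership` are hypothetical, and both return empty results on systems where /proc is unavailable.

```python
import os


def list_namespaces(pid="self", proc_root="/proc"):
    """Return {namespace_name: identifier} for a process, or {} if unavailable."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    namespaces = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return namespaces  # not Linux, or no permission to look
    for name in names:
        try:
            # Each entry is a symlink such as "pid:[4026531836]".
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue
    return namespaces


def read_cgroup_membership(pid="self", proc_root="/proc"):
    """Return the lines of /proc/<pid>/cgroup, or [] if unavailable."""
    path = os.path.join(proc_root, str(pid), "cgroup")
    try:
        with open(path) as f:
            return [line.strip() for line in f if line.strip()]
    except OSError:
        return []


if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(name, ident)
    for line in read_cgroup_membership():
        print(line)
```

Two processes in the same container report the same namespace identifiers; a process on the host reports different ones.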
Container Managers
● Initialize the namespaces
● Configure cgroups
● Examples:
– docker (runc under the covers since 1.11)
– rkt (nspawn under the covers)
Docker
● One stop shop for
– Image creation (docker build, Dockerfile)
– Image distribution (docker hub, registry/distribution)
– Container runtime (docker daemon)
● The devs rejoiced!
But
● Then the ops guy/gal installs docker on two production nodes and realizes “oh no...”
Dev view of docker
My sweet laptop w/ stickers
LAMP-for-reasons
maybe-good-idea
i-will-surely-regret-this
Ops view of docker
[Figure: a grid of nodes, node01 through node27]
Worse ops view of docker
[Figure: the same grid of nodes, node01 through node27]
I could write a special program!
● Clear and Present Danger? Anyone?
● Why bleed the blood, sweat the sweat, and cry the tears that others have already shed?!
● Do NOT do this and waste their sacrifice!
What is Kubernetes
Kubernetes is an open-source system for automating deployment, scaling, and
management of containerized applications
kubernetes.io
Oblig History and Github Stats
● Distilled patterns from Google's internal Borg system
● Over 800 individual contributors
● Over 17000 non-merge commits
● Contributing Companies
– Google (primary), Red Hat, IBM, CoreOS, Intel
Architecture
● Kubernetes design seeks to be
– decoupled (asynchronous)
– scalable
– self-healing
● Resources have a desired state, and it is the role of the Kubernetes components to converge the actual state toward the desired state
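The desired-state idea can be illustrated with a tiny reconcile step. This is a minimal sketch, not how any Kubernetes component is actually written: `reconcile` and `apply_actions` are hypothetical names, and state is modeled as plain dicts mapping a resource name to a version string.

```python
def reconcile(desired, actual):
    """Return the actions needed to move `actual` toward `desired`."""
    actions = []
    for name, want in sorted(desired.items()):
        have = actual.get(name)
        if have is None:
            actions.append(("create", name, want))
        elif have != want:
            actions.append(("update", name, want))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name, None))
    return actions


def apply_actions(actions, actual):
    """Apply the actions in place and return the updated actual state."""
    for verb, name, state in actions:
        if verb == "delete":
            actual.pop(name, None)
        else:
            actual[name] = state
    return actual


if __name__ == "__main__":
    desired = {"web": "v2", "db": "v1"}
    actual = {"web": "v1", "cache": "v1"}
    print(reconcile(desired, actual))
```

Running the step repeatedly is safe: once actual matches desired, `reconcile` returns an empty list, which is what makes the loop self-healing.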
Architecture
● Kubernetes master
– etcd
● key/value
● watchable
– apiserver
– controller-manager
– scheduler
[Diagram: etcd (cluster), apiserver, controller-manager, scheduler]
Architecture
● Kubernetes node
– kube-proxy
● iptables
– kubelet
● docker management
● node status
– flannel
● flat network for containers
[Diagram: apiserver, kube-proxy, kubelet, docker, flannel]
Resources
● Pods
– One or more containers
– Schedulable unit
– Share network namespace and pod IP address
Pod
Resources
● ReplicaSets (was Replication Controller)
– Ensures that a certain number of pods are running
● Pods selected by label
– Contains template for creating new pods
– Not typically used directly
– Used indirectly by Deployments
[Diagram: ReplicaSet, Pod, Pod]
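A rough sketch of what "ensure N pods selected by label" means. Pods are modeled as plain dicts, and `matches`, `reconcile_replicas`, and the naming scheme are hypothetical, chosen only for illustration.

```python
def matches(selector, labels):
    """True if every key/value in the selector appears in the pod labels."""
    return all(labels.get(k) == v for k, v in selector.items())


def reconcile_replicas(pods, selector, template_labels, replicas, prefix="pod"):
    """Return a new pod list with exactly `replicas` pods matching `selector`.

    Pods that do not match the selector are left alone.
    """
    selected = [p for p in pods if matches(selector, p["labels"])]
    others = [p for p in pods if not matches(selector, p["labels"])]
    existing = {p["name"] for p in pods}
    n = 0
    while len(selected) < replicas:
        name = f"{prefix}-{n}"
        n += 1
        if name in existing:
            continue  # avoid reusing a name that is already taken
        # Too few pods: create one from the template.
        selected.append({"name": name, "labels": dict(template_labels)})
        existing.add(name)
    # Too many pods: drop the extras.
    del selected[replicas:]
    return others + selected
```

Note that selection is purely by label: a pod created by hand with matching labels counts toward the total.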
Resources
● Deployments
– Allows one ReplicaSet to be replaced by another
– Versioned updates
– Rolling updates
[Diagram: Deployment, two ReplicaSets, four Pods]
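A rolling update can be pictured as scaling the new ReplicaSet up while scaling the old one down. The sketch below assumes a simple one-pod-at-a-time policy and that new pods become ready instantly; the function name `rolling_update_steps` is hypothetical, and real Deployments have tunable pacing that is not modeled here.

```python
def rolling_update_steps(replicas):
    """Return the (old, new) replica counts after each step of the rollout."""
    old, new = replicas, 0
    steps = [(old, new)]
    while old > 0 or new < replicas:
        if new < replicas:
            new += 1  # start one pod in the new ReplicaSet
            steps.append((old, new))
        if old > 0:
            old -= 1  # then retire one pod from the old ReplicaSet
            steps.append((old, new))
    return steps


if __name__ == "__main__":
    for old, new in rolling_update_steps(3):
        print(f"old={old} new={new}")
```

With this ordering the total never drops below the requested replica count, so capacity is maintained throughout the update.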
Resources
● Service
– Fronts a service offered by a set of endpoints
– Endpoints are pods selected by label
– Basic load balancing
– Can be assigned a “cluster IP”
● virtual IP address not assigned to any node or pod
● kube-proxy can use this IP to route traffic to and from endpoints using iptables rules
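To make the Service idea concrete, here is a toy model: endpoints are the IPs of pods whose labels match the selector, and each request to the cluster IP is handed to one of them. The `Service` class and the round-robin choice are assumptions for illustration; the real work is done by kube-proxy with iptables rules, not by Python, and all addresses are made up.

```python
def select_endpoints(pods, selector):
    """Return the IPs of pods whose labels satisfy the selector."""
    return [
        p["ip"]
        for p in pods
        if all(p["labels"].get(k) == v for k, v in selector.items())
    ]


class Service:
    """Toy stand-in for a Service with a cluster IP and a label selector."""

    def __init__(self, cluster_ip, selector):
        self.cluster_ip = cluster_ip
        self.selector = selector
        self._next = 0  # round-robin position (illustrative policy only)

    def route(self, pods):
        """Pick the endpoint that should receive the next request."""
        endpoints = select_endpoints(pods, self.selector)
        if not endpoints:
            return None  # no pod currently backs this service
        ip = endpoints[self._next % len(endpoints)]
        self._next += 1
        return ip
```

Because endpoints are recomputed from the current pod list on every call, pods that come and go are picked up without touching the Service itself.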
Resources
● Secrets and Config Maps
– Allows configuration data to be bound to the container at runtime instead of baking it into the image
● secrets for things like TLS keys
● config maps for things like configuration files
– Support updating in-place
– Hint: In practice and in implementation, they are quite similar
Resources
● Persistent Volumes and Claims
– Allows persistent storage to be bound to a container at runtime
– Supported persistent volume types
● NFS, iSCSI, RBD (Ceph), Glusterfs, Host (testing only)
● Cloud Specific: GCE PD, AWS EBS
– A claim for a certain storage capacity is created, and a controller binds the claim to a volume that can satisfy the claim
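Claim binding can be sketched as a matching problem. The policy below (pick the smallest unbound volume that is large enough) is an assumption made for illustration, as are the `bind_claim` name and the dict fields; the actual controller considers more than capacity.

```python
def bind_claim(claim_gib, volumes):
    """Bind a claim to a volume and return the volume name, or None.

    `volumes` is a list of dicts with 'name', 'capacity_gib', and 'bound'.
    """
    candidates = [
        v for v in volumes
        if not v["bound"] and v["capacity_gib"] >= claim_gib
    ]
    if not candidates:
        return None  # nothing can satisfy the claim yet; it stays pending
    best = min(candidates, key=lambda v: v["capacity_gib"])
    best["bound"] = True
    return best["name"]
```

A claim never takes a volume smaller than it asked for, and a bound volume is not offered to a second claim.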
Other Resources
● Ingresses
– L7 load balancing
● DaemonSets
– Run a pod on each node (or subset of nodes)
● Jobs
– For running batch containers that don't run forever
Networking
● Kubernetes assumes a flat network between pods in a cluster
● This is not the default for docker
– NAT over the docker bridge
● flannel is a small daemon, run on the node, that allows pods to appear to be on a flat network
Networking
● flannel retrieves a cluster subnet from etcd
– 172.16.0.0/12
● It picks a subnet of that subnet to use for containers on that node
– 172.16.8.0/24
– flannel tunnel is configured with 172.16.8.0/12 and routes off-node container traffic
– docker bridge is configured with 172.16.8.1/24 and routes on-node container traffic
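The subnet arithmetic above can be checked with Python's standard `ipaddress` module. The helpers `pick_subnet` and `bridge_address` are hypothetical, and picking the first free /24 is an assumption made for the sketch, not a statement about how flannel chooses.

```python
import ipaddress

CLUSTER = ipaddress.ip_network("172.16.0.0/12")


def pick_subnet(cluster, taken, prefix=24):
    """Return the first /prefix subnet of `cluster` not in `taken`, or None."""
    for subnet in cluster.subnets(new_prefix=prefix):
        if subnet not in taken:
            return subnet
    return None


def bridge_address(subnet):
    """Return the first address in the subnet, with the subnet's prefix length."""
    return ipaddress.ip_interface(f"{subnet.network_address + 1}/{subnet.prefixlen}")


if __name__ == "__main__":
    node_subnet = ipaddress.ip_network("172.16.8.0/24")
    print(node_subnet.subnet_of(CLUSTER))   # is the node subnet inside the cluster range?
    print(bridge_address(node_subnet))      # address the docker bridge would take
```

The /12 cluster range leaves room for 4096 node-sized /24 subnets, each with 254 usable container addresses.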
Networking
● flannel registers the subnet it picked and the IP address of the node in etcd
● When container traffic is egressing a node, flannel queries etcd for the IP of the node whose containers are using the subnet of the destination IP and routes over a vxlan tunnel to that node
● That last point was just words strung together
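In code, the egress lookup is less of a mouthful. The sketch below models the registry in etcd as a dict from subnet to node IP; `next_hop` is a hypothetical name, and the node addresses come from a documentation-only range rather than any real cluster.

```python
import ipaddress


def next_hop(dest_ip, local_subnet, registry):
    """Return the node IP to tunnel to, or None if the destination is on this node.

    `registry` maps each registered container subnet to its node's IP.
    """
    dest = ipaddress.ip_address(dest_ip)
    if dest in local_subnet:
        return None  # on-node traffic stays on the docker bridge
    for subnet, node_ip in registry.items():
        if dest in subnet:
            return node_ip  # off-node traffic goes over the tunnel to this node
    raise LookupError(f"no node has registered a subnet containing {dest_ip}")


if __name__ == "__main__":
    registry = {
        ipaddress.ip_network("172.16.8.0/24"): "192.0.2.11",
        ipaddress.ip_network("172.16.9.0/24"): "192.0.2.12",
    }
    local = ipaddress.ip_network("172.16.8.0/24")
    print(next_hop("172.16.9.5", local, registry))
```

A container at 172.16.8.x sending to 172.16.9.5 therefore has its packet wrapped and sent to the node that registered 172.16.9.0/24.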
Demo Cluster
● 3x Minnowboard Turbots with Silverjaw lures
– Intel Atom E3826 1.46 GHz Dual Core
– 2GB DDR3L
– Intel HD Graphics
– Gigabit Ethernet
– micro HDMI, SATA, USB3.0, USB2.0 (mSATA, mPCIe via lure)
Demo!
Openshift
● Openshift is a PaaS built on Kubernetes
– Source-to-Image (s2i)
– Integrated image registry and image streams
– RBAC with project-level and cluster-level roles
Openshift Online Developer Preview
https://www.openshift.com/devpreview/
More Fun Things
● What's new in Kubernetes? - Tim Hockin
– https://www.youtube.com/watch?v=k6vCEc86ihI
● Kubernetes Documentation (excellent!)
– http://kubernetes.io/docs/
● My Kubernetes Tutorial Video Series
– https://www.youtube.com/playlist?list=PLj_IGCS9P2Sn_MwgDO5gMXPGa88h7cPN2
– More videos to come