TRANSCRIPT
Kubernetes Scaling SIG (K8Scale)
Bob Wise – Samsung SDS Research America
Copyright © 2015 Samsung SDS Co., Ltd. All rights reserved
This presentation is intended to provide information concerning Samsung’s efforts around containers and container orchestration. We do our best to make sure that the information presented is accurate and fully up-to-date. However, the presentation may contain technical inaccuracies, out-of-date information, or typographical errors. As a consequence, Samsung does not in any way guarantee the accuracy or completeness of the information provided in this presentation. Samsung reserves the right to make improvements, corrections and/or changes to this presentation at any time.
The information in this presentation or accompanying oral statements may include forward-looking statements. These forward-looking statements include all matters that are not historical facts, statements regarding Samsung Data Systems' intentions, beliefs or current expectations concerning, among other things, market prospects, growth, strategies, and the industry in which Samsung operates. By their nature, forward-looking statements involve risks and uncertainties, because they relate to events and depend on circumstances that may or may not occur in the future. Samsung cautions you that forward-looking statements are not guarantees of future performance and that the actual developments of Samsung, the market, or industry in which Samsung operates may differ materially from those made or suggested by the forward-looking statements contained in this presentation or in the accompanying oral statements. In addition, even if the information contained herein or the oral statements are shown to be accurate, those developments may not be indicative of developments in future periods.
Logos remain the property of their respective owners. So there.
Presentation Goals
• Continue to make a positive contribution.
• K8Scale is a way for Samsung to contribute to a project that is important to us.
• Enhance transparency.
• Encourage involvement in K8Scale if you are interested in this area.
• Share any learnings with the rest of the community to help other SIGs.
• Give a perspective on Kubernetes scalability.
A Bit of History
• July – Kubernetes 1.0 launch
• Post launch… interest in evolving the community by breaking into SIGs…
– Auto-scaling
– Federation (recently active)
– Network
– Scalability
– Storage
– Configuration
– Testing
– UI (just started meeting?)
– Node
– Big Data (just kicked off)
• Aug 5 – first K8Scale meeting
K8Scale Goals
• 1000 nodes
• 100 pods/sec scheduling rate
– Dense microservices, IoT
• 99% of API calls to the apiserver return in less than 1 second
• 99th percentile of end-to-end pod startup time with prepulled images less than 5 seconds (up to 30 pods per node)
• Configurations that are “fieldable”
– HA
– HTTPS/tokens
• AWS, GCE, and bare metal
– The more the merrier, please join in!
• Use and improve standard conformance tests
• Data sharing back to the community
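The latency and startup-time goals above are percentile SLOs. A minimal sketch of how such a check can be computed from collected samples (the sample data and function names are illustrative, not the SIG's actual test tooling):

```python
# Check percentile-based SLOs like "99% of API calls return in < 1 second".
# Sample data and thresholds here are illustrative, not real test output.

def percentile(samples, pct):
    """Return the pct-th percentile using nearest-rank on sorted samples."""
    ranked = sorted(samples)
    # Nearest-rank: smallest value covering pct percent of the samples.
    idx = max(0, int(round(pct / 100.0 * len(ranked))) - 1)
    return ranked[idx]

def slo_met(samples, pct, threshold):
    """True if the pct-th percentile of samples is below the threshold."""
    return percentile(samples, pct) < threshold

api_latencies = [0.05, 0.08, 0.12, 0.30, 0.45, 0.70, 0.90, 0.95, 1.20, 0.10]
print(slo_met(api_latencies, 99, 1.0))  # one slow call in ten blows a 99% SLO
```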
K8Scale Info
• Regular weekly meetings since Aug 5, with only one cancellation.
• Co-chairs
– Bob Wise
– Joe Beda
• Active Slack channel
• Consistent active participation by Google, Red Hat, CoreOS, Samsung, and others
Context – Scaling Dimensions
[Diagram: scaling dimensions]
• Node Count
• Pods/Node (Node Size)
• Pod Rate
• Latency
Group Priorities
• Focus has been on Kubernetes’ ability to manage overall high pod creation/destruction rates
• Logging and metrics collection is critical
• Not yet much work on very high pods-per-node counts (over 100/node)
– Very large nodes
– Dense (very micro) microservice deployments
• No concentration yet on Docker daemon performance
– Possibly an issue at higher pod density?
– Using the Docker version appropriate to the Kubernetes release in test
• Not yet much work on pod-to-pod networking performance
Test Info
• K8Scale has standardized on the density conformance test.
• The density conformance test runs at:
– 3 pods/node
– 30 pods/node
– 100 pods/node
Fun Facts
• Kubemark is a stubbed-out kubelet, very useful for performance testing.
• Kubernetes CI includes e2e at 500 nodes with Kubemark.
• Samsung is running 1000-node tests on AWS regularly, plus 100-node CI.
• Red Hat CI runs high density on a variety of configurations.
• Please join K8Scale and let us know what _you_ are doing!
Challenges
• Sharing dashboard data and performance runs is not really sorted out yet
• Performance issues occur in some environments but not others
– Federated CI
• There are a lot of tuning knobs (50+?), e.g.
– QPS and Burst limits per component: kubelet, API server, etc.
– Resync timers
– Garbage collection timers (Docker images, etc.)
– Really need a better way to share full cluster config settings
– https://github.com/kubernetes/kubernetes/issues/14916
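The QPS and Burst knobs mentioned above are client-side rate limits with token-bucket semantics: sustained request rate is capped at QPS, short spikes at Burst. A rough sketch of those semantics (the class and constants are illustrative, not Kubernetes code):

```python
# Rough token-bucket sketch of client-side QPS/Burst limits: sustained
# rate is capped at `qps`, short spikes at `burst`. Illustrative only.
class TokenBucket:
    def __init__(self, qps, burst):
        self.qps = float(qps)       # tokens refilled per second
        self.burst = float(burst)   # maximum bucket size
        self.tokens = float(burst)  # start full: allows an initial spike
        self.last = 0.0             # timestamp of the last refill

    def allow(self, now):
        """Refill based on elapsed time, then try to spend one token."""
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.qps)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

bucket = TokenBucket(qps=5, burst=10)
# 10 back-to-back requests drain the burst; the 11th is throttled.
results = [bucket.allow(now=0.0) for _ in range(11)]
print(results.count(True))  # → 10
```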
Hypotheses for Investigation
• Implement watch in the apiserver
• Offload events from the apiserver not used by controllers or the scheduler
– Interim: separate etcd server
• Auto-tuning via backpressure and backoff
• etcd v3 w/ gRPC
• Use other backing stores
• Optimize node status messaging
• Scheduler throughput optimization
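One shape "auto-tuning via backpressure and backoff" could take is exponential backoff with jitter, driven by throttling signals. A sketch under assumed constants (the policy is a common building block, not a proposed implementation):

```python
import random

# Sketch of exponential backoff with jitter: a client that gets throttled
# (backpressure) waits base * 2^attempt, capped, with "full jitter" so a
# fleet of clients does not retry in lockstep. Constants are assumptions.
def backoff_delay(attempt, base=0.1, cap=30.0, rng=random.random):
    """Delay in seconds before retry number `attempt` (0-indexed)."""
    exp = min(cap, base * (2 ** attempt))
    # Full jitter: spread retries uniformly in [0, exp).
    return rng() * exp

# With jitter pinned to 1.0 we see the worst-case delays grow geometrically.
delays = [backoff_delay(a, rng=lambda: 1.0) for a in range(4)]
print(delays)  # → [0.1, 0.2, 0.4, 0.8]
```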
Ongoing Questions
• TLS overhead
• JSON overhead
• Performance effect of moving to the 2.2 etcd client
• API server load-balancing practices
• At what point is etcd a bottleneck?
• At what point is the scheduler a bottleneck?
• What’s the effect of rkt?
Observations
• Having the same people in the meeting every week is really important… this really makes it work:
– Tim St. Clair (Red Hat)
– Wojtek Tyczynski, Quinton Hoole, Daniel Smith (Google)
• Engagement from the CI team has been extremely helpful
• Direct support from CoreOS on etcd has been great – thanks to Xiang Li and Yicheng Qin
• We moved to Slack for the group before the main community… it has really worked well
– The email list exists but is effectively completely unused
• Great meeting notes (esp. thanks to Joe)
• Substantive technical discussion has migrated out of the main community to the SIG community… this is a good thing.
Tales from the trenches…
• We have to be data driven
• Intuitions can easily be wrong
• It’s a bit too hard to replicate test setups from one team to another
• Here’s an example…
https://github.com/kubernetes/kubernetes/issues/14216 “Stair-stepping in pods going from Pending to Running”
The guessing…
• Scheduler has a bug
• etcd is misbehaving
• Go garbage collection is firing
• QPS rate limits are causing backpressure
• Some performance difference between AWS and GCE instances is triggering the bug (getting desperate now :) )
– Only showing on AWS
We banged our heads on this one for weeks…
Cause…
• Scheduler logging was verbose
• Scheduler hits a buffer dump sync point
• Nothing gets scheduled until the log buffer gets dumped
• Samsung/AWS setup was just enough bigger to hit the threshold
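The stair-stepping can be reproduced in miniature: a scheduling loop that logs synchronously makes no progress while the log buffer is being flushed. A toy simulation (all numbers invented, not measurements):

```python
# Toy simulation of the root cause above: a scheduler loop whose verbose,
# synchronous logging periodically stalls on a slow buffer flush, so pods
# move Pending -> Running in stair-steps instead of a smooth ramp.
FLUSH_EVERY = 100   # log lines buffered before a blocking flush (invented)
FLUSH_COST = 50     # time units the scheduler stalls per flush (invented)

def schedule(pods, log_lines_per_pod):
    """Return (pod, completion_time) pairs for a loop that logs per pod."""
    t, buffered, timeline = 0, 0, []
    for pod in range(pods):
        t += 1                      # the actual scheduling work is cheap
        buffered += log_lines_per_pod
        while buffered >= FLUSH_EVERY:
            buffered -= FLUSH_EVERY
            t += FLUSH_COST         # nothing is scheduled during the flush
        timeline.append((pod, t))
    return timeline

quiet = schedule(pods=50, log_lines_per_pod=1)
verbose = schedule(pods=50, log_lines_per_pod=10)
print(quiet[-1][1], verbose[-1][1])  # → 50 300
```

With verbose logging the timeline shows flat stretches (the flushes) between bursts of completions, which is exactly the "stair-stepping" shape from the issue.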
Snapshot of Wojtek’s API Server Optimization
• 1.0.X
– All watch requests to the apiserver go to etcd
– Apiserver watches every object that matches
– E.g., if there are 1000 kubelets watching for pods with the host assigned to their machine… 1000 watches in etcd
• 1.1
– Watch implemented in the apiserver
– Apiserver has one watch open to etcd
– https://github.com/kubernetes/kubernetes/pull/10679
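The 1.1 change amounts to fan-out: one upstream etcd watch feeds an in-apiserver cache that redistributes events to every interested client. A minimal sketch of that pattern (the names are illustrative, not the actual implementation):

```python
# Minimal fan-out sketch of the 1.1 watch change described above: instead
# of every kubelet opening its own etcd watch, the apiserver holds one
# upstream watch and redistributes matching events to each subscriber.
class WatchCache:
    def __init__(self):
        self.subscribers = []  # (filter_fn, event_list) pairs

    def subscribe(self, filter_fn):
        """Register a client filter (e.g. 'pods bound to my node')."""
        events = []
        self.subscribers.append((filter_fn, events))
        return events

    def dispatch(self, event):
        """Called once per event from the single upstream etcd watch."""
        for filter_fn, events in self.subscribers:
            if filter_fn(event):
                events.append(event)

cache = WatchCache()
# Two "kubelets", each only interested in pods bound to its own node.
node1 = cache.subscribe(lambda e: e["node"] == "node-1")
node2 = cache.subscribe(lambda e: e["node"] == "node-2")
for ev in [{"pod": "a", "node": "node-1"}, {"pod": "b", "node": "node-2"},
           {"pod": "c", "node": "node-1"}]:
    cache.dispatch(ev)
print(len(node1), len(node2))  # → 2 1
```

However many kubelets subscribe, etcd still sees a single watch; the filtering cost moves into the apiserver.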
Bob’s Final Thoughts on SIGs
• SIGs should be organic and formed by those who have specific goals and interests.
• Multiple organizers is a very good idea. Maybe even more than two.
• Too many SIGs might diffuse energy. Refactor?
• Would really like to see a release SIG.
• The SIG format works really well for deeper/longer technical discussions and planning.
• K8Scale is working really well; we are continuing our long-term involvement here.
• We really need full chat history… Nonprofit version of Slack? CNCF – help!
Why do we want Kubernetes?
Standardize, Containerize, Deploy
…to Samsung data centers.
…to developer systems for agility and productivity.
…to public virtual machine clouds.
…to new and even more efficient public container clouds.
Why Focus on Kubernetes?
• Key technology: container management
– Deployment
– Repair
– Scaling
• Clean open source license
• Good design by a vibrant, healthy community
• Rapid pace of improvement
• Right contributors with the right experience
• Best high-scale public cloud container option
– Google Container Engine – available now
• Supports multiple container specs: Docker and appc
Why are we involved in K8Scale?
• We want “Google infrastructure for everyone else” (us!)
• We want very large clusters with cross-application resource sharing
• We believe we can make a positive contribution to make this happen faster and better.
• We believe we need deep technical involvement to build/deploy/operate at scale
• We’ve been pushing the envelope on AWS
SDSRA Ref Architecture Evolution
OSCON Stack (July)
– Optimized OS: CoreOS
– Container: Docker
– Orchestration: Kubernetes 1.0.0
– Compute: Intel NUC
– Networking: Flannel
– App: Cassandra
– Provisioning: PXE, Vagrant
SDSRA Ref Architecture Evolution
1000 Node Stack (September)
– Optimized OS: CoreOS
– Container: Docker
– Orchestration: Kubernetes 1.0.6
– Compute: AWS
– Networking: Flannel
– App: Cassandra
– Provisioning: Terraform, Ansible
Switched based on demands of scale (iterating quickly!)
Performance Data Shared by Samsung
• A detailed version of this was published via the K8Scale notes, or contact me.
RELEASE                  NODES  DENSITY  PODS/SEC
v1.0.6 (388061f)         1000   15       5.43
release-1.1 (dbb37d9)    100    3        13.98
release-1.1 (dbb37d9)    100    15       17.60
release-1.1 (dbb37d9)    100    30       17.20
release-1.1 (dbb37d9)    1000   3        14.63
release-1.1 (dbb37d9)    1000   15       6.02
master (1524d74)         100    3        16.36
master (1524d74)         100    15       18.75
master (1524d74)         100    30       15.79
master (1524d74)         1000   3        14.63
master (1524d74)         1000   15       6.12
master (1524d74)         1000   30       3.50
• Early numbers. Not tuned. AWS only. • Please understand the details of the tests before jumping to conclusions. • These are **NOT** max numbers, these are numbers from our journey.
Performance Data Observations
• 12,600 pods/hour at the edges of the tests
• Some correlation to number of nodes
• Stronger correlation to total number of pods running
• Performance is improving every release
• Very good gains at lower pod densities
Performance Perspective
• Just tuning is not going to get us to the goal
• End-to-end optimizations are where the biggest gains will come from
• Hard to point a finger at any single component as “the bottleneck”
• Efficient and thorough metrics and log collection is critical – design choices have to be data driven
• Something more efficient than scheduling one pod at a time will be needed
• Will need horizontal scaling on all components
Samsung Engagement – 2016
• We are not forking
• We are pushing our work back either into Kubernetes or into our GitHub repo, Samsung-AG
• Plan to shift scaling work from pod rates to networking and storage
Contact Info
• [email protected]
• bobwise on Kubernetes Slack
• https://github.com/Samsung-AG
( for K8Scale also [email protected] )