Continuous Integration on Steroids

Akbashev Alexander
Highload++ | November 07, 2016

Upload: alexander-akbashev

Post on 15-Apr-2017

TRANSCRIPT

Page 1: Continuous Integration on Steroids

Continuous Integration on Steroids
Akbashev Alexander
Highload++ | November 07, 2016

Page 2: Continuous Integration on Steroids
Page 3: Continuous Integration on Steroids

Agenda

01. CI in HERE
02. Monitoring
03. Scalability
04. Jenkins
05. Nightmares Plugins
06. Morale
07. Q&A

Page 4: Continuous Integration on Steroids

01 Continuous Integration in HERE

Page 5: Continuous Integration on Steroids

Every change goes through validation pipeline

[Diagram: Gerrit with the Gerrit Plugin, two Pre-submit Triggers, five Build jobs, and groups of Tests jobs]

Page 6: Continuous Integration on Steroids

Feedback goes from tests back to Gerrit

[Diagram: Gerrit with the Gerrit Plugin, two Pre-submit Triggers, five Build jobs, and groups of Tests jobs]

Page 7: Continuous Integration on Steroids

Feedback comes from every pipeline

[Diagram: Gerrit with the Gerrit Plugin, two Pre-submit Triggers, five Build jobs, and groups of Tests jobs]

Page 8: Continuous Integration on Steroids

Numbers

100k+ builds per day
• Each “build” is an execution of one build/test job
• The total number correlates with the number of commits
• The number of builds is not as important as the number of commits

~1.5k concurrent builds
• High throughput is extremely important
• Morning commit
• Before lunch
• “Last attempt for today”

1.3-2.5k executors
• Raised on-demand
• Health checks
• Jenkins strategy is not optimized for cloud
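
A back-of-the-envelope check ties these numbers together. The sketch below applies Little's law (concurrency = arrival rate × mean duration) under the assumption, made here only for illustration, that ~1.5k concurrent builds is a steady average; the slide more likely describes a peak, so the real mean duration is probably shorter. The function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def implied_mean_duration_s(builds_per_day: float, concurrent_builds: float) -> float:
    """Little's law: L = lambda * W, therefore W = L / lambda."""
    arrival_rate_per_s = builds_per_day / SECONDS_PER_DAY
    return concurrent_builds / arrival_rate_per_s


# 100k builds per day at 1.5k concurrent builds
mean_s = implied_mean_duration_s(100_000, 1_500)
print(f"implied mean build duration: {mean_s / 60:.1f} min")  # 21.6 min
```

Under that assumption a single build/test job would take about 21.6 minutes on average, which is the same order of magnitude as the feedback-time target mentioned later in the talk.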

Page 9: Continuous Integration on Steroids

02 Monitoring

Page 10: Continuous Integration on Steroids

Collects information about every build in system

[Diagram: Jenkins build → Groovy Event Listener Plugin → Fluentd → InfluxDB → Grafana]

Page 12: Continuous Integration on Steroids

JVM stats are the best “canary”

[Diagram: Jenkins build → Groovy Event Listener Plugin → Fluentd → InfluxDB → Grafana, with the Jenkins JVM added as a second data source]
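
The slide does not say which JVM statistics are watched. One common choice, consistent with the GC log plugin listed among the contributions at the end of the talk, is GC overhead: the share of wall-clock time the JVM spends in garbage collection. The sketch below is an assumption-laden illustration; the sample format and the 10% threshold are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (wall_clock_s, cumulative_gc_time_s)


def gc_overhead(samples: List[Sample]) -> float:
    """Fraction of wall-clock time spent in GC between the first and last sample."""
    (t0, gc0), (t1, gc1) = samples[0], samples[-1]
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")
    return (gc1 - gc0) / elapsed


def is_canary_alarm(samples: List[Sample], threshold: float = 0.10) -> bool:
    """True when GC overhead exceeds the (hypothetical) alert threshold."""
    return gc_overhead(samples) > threshold


# One minute of samples: 12 s of it spent in GC, i.e. 20% overhead.
print(is_canary_alarm([(0.0, 0.0), (30.0, 5.0), (60.0, 12.0)]))  # True
```

A rising GC overhead typically shows up before builds start to queue, which is what makes it useful as an early warning.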

Page 13: Continuous Integration on Steroids
Page 14: Continuous Integration on Steroids

03 Scalability

Page 20: Continuous Integration on Steroids

What do we want to achieve?

Keep feedback time (< 20 min.)
Test as much as possible
… with debug symbols
… and code coverage information
and on physical devices

Page 21: Continuous Integration on Steroids

How to scale

Increase number of executors
Minimize job execution time
Smart testing

Page 22: Continuous Integration on Steroids

How to increase number of executors?

EC2 Plugin
TestDroid

Page 28: Continuous Integration on Steroids

How to minimize job execution time

Split tests by type
Parallel execution
Node as cache storage
Shared compiler cache
Profiling!
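
For the "parallel execution" item, the wall-clock time of a test stage is set by its slowest shard, so shards should be balanced by duration rather than by test count. The talk does not describe how tests are distributed; the sketch below uses the classic greedy longest-first heuristic as one possible approach, with hypothetical test names and timings.

```python
import heapq
from typing import Dict, List


def split_into_shards(durations: Dict[str, float], shard_count: int) -> List[List[str]]:
    """Assign each test (longest first) to the currently least-loaded shard."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    shards: List[List[str]] = [[] for _ in range(shard_count)]
    heap = [(0.0, index) for index in range(shard_count)]  # (total seconds, shard index)
    # Sort by descending duration; the name breaks ties so the result is deterministic.
    for name, seconds in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        load, index = heapq.heappop(heap)
        shards[index].append(name)
        heapq.heappush(heap, (load + seconds, index))
    return shards


# Hypothetical historical timings in seconds.
timings = {"a": 10, "b": 7, "c": 5, "d": 3, "e": 2}
print(split_into_shards(timings, 2))  # [['a', 'd'], ['b', 'c', 'e']]
```

Here the two shards take 13 s and 14 s instead of the 27 s a single sequential job would need.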

Page 29: Continuous Integration on Steroids

04 Jenkins

Page 32: Continuous Integration on Steroids

Is Jenkins really that slow, or are we doing something wrong?

Jenkins is OK.
But…

Page 33: Continuous Integration on Steroids

Surprise #1

Rotation costs a lot

Page 34: Continuous Integration on Steroids

Surprise #2

It works much better with nginx

less jenkins.access.log | tail -n1000 | grep urt=\"\-\" | wc -l
407
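
The one-liner counts, among the last 1000 access-log lines, those whose `urt` value is `-`. Assuming the nginx `log_format` writes `urt="$upstream_response_time"` (a common convention, not confirmed by the slide), a `-` means nginx answered without contacting the upstream, here presumably Jenkins. A Python equivalent of the same count, as a sketch:

```python
from collections import deque
from typing import Iterable


def count_without_upstream(lines: Iterable[str], last_n: int = 1000,
                           marker: str = 'urt="-"') -> int:
    """Count lines containing `marker` among the last `last_n` lines."""
    tail = deque(lines, maxlen=last_n)  # keeps only the final last_n lines
    return sum(1 for line in tail if marker in line)


# Usage (file name taken from the slide):
# with open("jenkins.access.log", encoding="utf-8", errors="replace") as log:
#     print(count_without_upstream(log))
```

With the slide's result of 407, roughly 40% of the sampled requests would not have reached the upstream under this reading.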

Page 35: Continuous Integration on Steroids

Surprise #3

Some buttons are very dangerous

Page 37: Continuous Integration on Steroids

One fundamental issue

[Diagram: one Master connected to eight Slaves and to Users]

Page 39: Continuous Integration on Steroids

What can you find in heap dump of OOM-Killed Jenkins?

Console logs

Page 40: Continuous Integration on Steroids

Console logs

Should be less than X MB
Verbose output goes to file
“>” and “tee” are amazing!
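
In a shell step this is just `command > build.log 2>&1` or a pipe through `tee`. For build steps driven from a script, the same idea can be sketched as follows: the full output goes to a file, and only a short tail is returned for the console. The function name and the 50-line default are assumptions.

```python
import subprocess
from collections import deque
from typing import List, Sequence, Tuple


def run_quietly(cmd: Sequence[str], log_path: str,
                tail_lines: int = 50) -> Tuple[int, List[str]]:
    """Run cmd, write all of its output to log_path, return (exit code, last lines)."""
    tail = deque(maxlen=tail_lines)
    with open(log_path, "w", encoding="utf-8") as log:
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, text=True)
        for line in proc.stdout:      # stream line by line, never buffer everything
            log.write(line)
            tail.append(line)
        returncode = proc.wait()
    return returncode, list(tail)


# Usage (hypothetical build command):
# code, last_lines = run_quietly(["make", "all"], "build.log")
# print("".join(last_lines), end="")
```

Only the returned tail needs to reach the Jenkins console; the complete log can be archived as a file.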

Page 42: Continuous Integration on Steroids

What can you find in heap dump of OOM-Killed Jenkins?

Console logs
Build history

Page 43: Continuous Integration on Steroids

Build history

2000 entities or 3 days
Efficient rotator
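
The talk does not show the rotator itself. The sketch below illustrates one reading of the retention rule, namely that a build is discarded as soon as it falls outside the newest 2000 or is older than 3 days, whichever happens first. That interpretation, the function name, and the data shape are all assumptions.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Build = Tuple[int, datetime]  # (build number, finish time)


def builds_to_delete(builds: List[Build], now: datetime, max_count: int = 2000,
                     max_age: timedelta = timedelta(days=3)) -> List[int]:
    """Return numbers of builds that exceed either the count or the age limit."""
    newest_first = sorted(builds, key=lambda build: build[0], reverse=True)
    doomed = []
    for position, (number, finished_at) in enumerate(newest_first):
        if position >= max_count or now - finished_at > max_age:
            doomed.append(number)
    return doomed


now = datetime(2016, 11, 7)
history = [(1, now - timedelta(days=5)), (2, now - timedelta(days=2)),
           (3, now - timedelta(days=1)), (4, now)]
print(builds_to_delete(history, now))               # [1]: older than 3 days
print(builds_to_delete(history, now, max_count=2))  # [2, 1]: beyond the newest 2
```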

Page 45: Continuous Integration on Steroids

What can you find in heap dump of OOM-Killed Jenkins?

Console logs
Build history
Build artifacts

Page 46: Continuous Integration on Steroids

Build artifacts

Push to S3 directly from slaves
Don’t store anything on master

Page 47: Continuous Integration on Steroids

05 Nightmares Jenkins Plugins

Page 48: Continuous Integration on Steroids

Limit of number of builds

20K

Page 49: Continuous Integration on Steroids

Groovy Event Listener Plugin

all events synchronized
groovy compilation

fixed since 1.010 (Mar 10, 2016)

Page 50: Continuous Integration on Steroids

Limit of number of builds

40K

Page 51: Continuous Integration on Steroids

Warnings Plugin

Just another parser of console log

parseConsole is “deprecated”
parseFile is allowed
0 warnings are very appreciated :)

Page 52: Continuous Integration on Steroids

Limit of number of builds

60K

Page 53: Continuous Integration on Steroids

Timestamper Plugin

Tail needs not only “tail”

fixed since 1.8.5 (Aug 31, 2016)

Page 54: Continuous Integration on Steroids

Limit of number of builds

60K

Page 55: Continuous Integration on Steroids

EC2 Plugin

Full list of all images in AWS

fixed since 1.35 (Jun 30, 2016)

Page 56: Continuous Integration on Steroids

Limit of number of builds

90K

Page 57: Continuous Integration on Steroids

Robot Framework Plugin

Green chart costs 100 times more

Replaced by xUnit Plugin

Page 58: Continuous Integration on Steroids

Limit of number of builds

120K

Page 59: Continuous Integration on Steroids

Build Failure Analyzer Plugin

One regexp
One stream
One thread

PR-57 is not accepted yet

Page 60: Continuous Integration on Steroids

Limit of number of builds

140K

Page 61: Continuous Integration on Steroids

Cleanup Workspace Plugin

`ü` breaks everything

PR-29 is not accepted yet

Page 62: Continuous Integration on Steroids

06 Morale

Page 68: Continuous Integration on Steroids

Final recommendations

Think about scalability first
Flakiness can be a huge problem
Reduce memory allocations
Cache as much as possible
Failing builds can be expensive
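
Why flakiness matters at this scale can be shown with a small calculation. If a change has to pass many independent jobs and each job fails spuriously with a small probability, the chance that the whole pipeline passes drops quickly. The numbers below are hypothetical, and independence between jobs is an assumption.

```python
def pipeline_pass_probability(job_count: int, flake_rate: float) -> float:
    """Probability that all jobs pass when each one flakes independently."""
    return (1.0 - flake_rate) ** job_count


# 100 jobs, each with a 1% chance of a spurious failure.
print(f"{pipeline_pass_probability(100, 0.01):.3f}")  # 0.366
```

In this example almost two out of three correct changes would get a false negative, and every retry consumes executors again, which is one way failing builds become expensive.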

Page 69: Continuous Integration on Steroids

Workflow

Slowness? Profile! Fix! Contribute!

Page 70: Continuous Integration on Steroids

Open source collaboration

Let’s make our life better ;)

Page 71: Continuous Integration on Steroids

Full list of our contributions related to this talk

• Jenkins
• ccache
• clcache
• EC2 Plugin
• S3 Plugin
• FluentD Plugin
• BuildRotator Plugin
• Groovy Event Listener Plugin
• Timestamper Plugin
• Robot Framework Plugin
• Build Failure Analyzer Plugin
• JVM GC Log Plugin for FluentD

Page 72: Continuous Integration on Steroids

Thank you

Contact

Akbashev Alexander
GitHub: Jimilian
E-mail: [email protected]