techniques to speed up your build pipeline
DESCRIPTION
I would like to share our journey of bringing our Jenkins build pipeline time down from over 90 minutes to under 12 minutes. Along the way, I will cover the specific techniques that helped, and also some that made logical sense but did not actually help. If your team is trying to optimize its build times, this session might give you some ideas on how to approach the problem.
Development impact: for one of our build jobs, the graph below shows how the number of builds per day has increased over time as the build time has come down. Code check-ins are more frequent, wait time is lower, and failed test cases are faster to isolate and fix.
Details: http://confengine.com/agile-pune-2014/proposal/458/techniques-to-speed-up-your-build-pipeline-for-faster-feedback
Conference: http://pune.agileindia.org/
TRANSCRIPT
Techniques to Speed Up Your Build Pipeline
[email protected] [email protected]
@AshishParkhi @nashjain
ashishparkhi.com nareshjain.com
Build Pipeline - Best Case to Worst Case time.
About 60 to 90 minutes
Impact on life
Image source – http://ak3.picdn.net/shutterstock/videos/5132438/preview/stock-footage-mixed-ethnicity-group-of-medical-professionals-working-late-at-night-are-looking-at-a-computer.jpg
http://the247analyst.files.wordpress.com/2011/10/dealing-with-pressure.jpg
http://www.dimitri.co.uk/business/business-images/worker-alone-dark-office.jpg
http://cdn.sheknows.com/articles/2012/10/crying-little-girl.jpg
Build Pipeline – Now takes 10 to 12 Minutes
Key Principles to Speed Up Your Build Pipeline
• Focus on the Bottlenecks
• Divide and Conquer
• Fail Fast
Disk IO – Example
File Operations
Focus on Bottleneck
Disk IO – Example
Database operations.
Image Source - https://encrypted-tbn3.gstatic.com/images?q=tbn:ANd9GcTPdVawndjUZbU2PDn-oKgjBPqmgDqr3PPZatZh9kxEgNi71AND
http://www.dba-oracle.com/images/large_disk_hot_files.gif
Focus on Bottleneck
Disk IO – Alternative
Image Source - http://3.bp.blogspot.com/-bqTjSN7pSpg/UbqyjVojEFI/AAAAAAAADBw/PWe0kiuRHJ4/s200/no+duplicate+content.jpg
• Avoid file operations – e.g. duplicating workspace
Focus on Bottleneck
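One way to avoid duplicating a workspace is to expose it under a second path through a link instead of copying it. The Resources slide lists Mklink, which hints at this approach, but the sketch below is only an illustration: `WorkspaceLinker` and its method name are hypothetical, and the build's actual scripts are not shown in the slides.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: expose an existing workspace under a second path without copying it. */
final class WorkspaceLinker {

    private WorkspaceLinker() {}

    /**
     * Creates a symbolic link at {@code link} that points to {@code workspace}.
     * No file contents are read or written, so the cost does not grow with workspace size.
     */
    static Path linkWorkspace(Path workspace, Path link) throws IOException {
        if (!Files.isDirectory(workspace)) {
            throw new IllegalArgumentException("Not a directory: " + workspace);
        }
        Files.deleteIfExists(link); // drop a stale link left by an earlier build
        return Files.createSymbolicLink(link, workspace.toAbsolutePath());
    }

    public static void main(String[] args) throws IOException {
        Path workspace = Files.createTempDirectory("workspace");
        Files.write(workspace.resolve("build.xml"), "<project/>".getBytes(StandardCharsets.UTF_8));
        Path link = workspace.resolveSibling(workspace.getFileName() + "-link");
        linkWorkspace(workspace, link);
        // The file is visible through the link although nothing was copied.
        System.out.println(Files.exists(link.resolve("build.xml")));
    }
}
```

On Windows, creating symbolic links may require elevated privileges, so this would need checking on the build machines before relying on it.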
Disk IO – Alternative
• Avoid file operations – e.g. Jar creation.
Focus on Bottleneck
Image Source - http://i1.wp.com/blog.quoteroller.com/wp-content/uploads/2013/04/Dont-start-from-scratch.png?resize=800%2C264
Disk IO – Alternative
Image source - http://4.bp.blogspot.com/_4hvqisoH9CE/TSZIs7eiSAI/AAAAAAAAA7E/vanj6bGD8XQ/s1600/big-vs-small-left.jpg
• Test on smaller but apt data set.
Focus on Bottleneck
Disk IO – Alternative - SSD
CrystalDiskMark - http://crystalmark.info/software/CrystalDiskMark/index-e.html
• HDD (Toshiba MQ01ACF050 500GB SATA III) vs SSD (Samsung PM851 512GB mSATA)
Focus on Bottleneck
Disk IO – Alternative - SSD
• HDD vs SSD [figure: results shown side by side]
Focus on Bottleneck
Disk IO – Alternative - In Memory DB
• MySQL Memory (HEAP) engine
– had some limitations compared with the MyISAM engine.
Focus on Bottleneck
Disk IO – Alternative - In Memory DB
– did not support many MySQL queries, so it was discarded.
Focus on Bottleneck
Disk IO – Alternative - In Memory DB
database
– looked promising as it could support many MySQL queries, but it still required a couple of modifications to our code.
Focus on Bottleneck
Disk IO – Alternative - In Memory DB
– looked most promising as it is wire-compatible with MySQL, which means that without code changes I could just point to MemSQL and be done with it.
Focus on Bottleneck
Disk IO – Alternative - RAM Drive
• SoftPerfect RAM Disk
Focus on Bottleneck
Disk IO – Alternative – RAM Drive
• RAM Drive
Focus on Bottleneck
Disk IO – Alternative – RAM Drive
• RAM Drive – did not work
Focus on Bottleneck
CPU - Profiling
Focus on Bottleneck
CPU – Profiling - Insights
Image source - https://encrypted-tbn1.gstatic.com/images?q=tbn:ANd9GcQde6NeSrbuv40CIhKtFa1OuIQXf7F7esMJKp1Ie7zmH2t29l6Z
Scanning resource bundle files from jars.
Focus on Bottleneck
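If the same resource bundles are scanned from jars again and again, one option is to perform each lookup once and reuse the result. The following is a minimal sketch of that idea; `ResourceCache` is a hypothetical class, and the loader passed to it stands in for whatever expensive scan the build really performs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical cache: run an expensive lookup once per key and reuse the result afterwards. */
final class ResourceCache<K, V> {

    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final AtomicInteger loads = new AtomicInteger();

    /** The loader is expected to return a non-null value for every key. */
    ResourceCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, invoking the loader only on the first request for a key. */
    V get(K key) {
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    /** Number of times the expensive loader actually ran. */
    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        // Stand-in for scanning jars for a resource bundle.
        ResourceCache<String, String> bundles = new ResourceCache<>(name -> "bundle:" + name);
        bundles.get("messages");
        bundles.get("messages");
        System.out.println("loader ran " + bundles.loadCount() + " time(s)");
    }
}
```

The load counter is there only to make the saving visible; it could be dropped in real code.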
CPU – Profiling - Insights
Image source - http://2.bp.blogspot.com/-uKMyLlB3F7o/Tqn_6yqdElI/AAAAAAAAB94/_1FMbHJFQBQ/s1600/weight-lift-cartoon.jpg
Loading Spring Application Context.
Focus on Bottleneck
CPU – Profiling - Insights
Image source - http://dilbert.com/dyn/str_strip/000000000/00000000/0000000/000000/00000/7000/000/7029/7029.strip.gif
Avoid unnecessary activities during the build, e.g. sending out email.
Focus on Bottleneck
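One possible way to keep a build from sending real email is to route all mail through a small interface and choose a silent implementation unless mail is explicitly enabled. This is a sketch under assumptions: the `Mailer` interface, the `RecordingMailer` class, and the `build.mail.enabled` property name are all invented for illustration and do not come from the talk.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical seam for outgoing mail so that builds can swap in a silent implementation. */
interface Mailer {
    void send(String to, String subject);
}

/** Records messages in memory instead of contacting a mail server. */
final class RecordingMailer implements Mailer {
    final List<String> sent = new ArrayList<>();

    @Override
    public void send(String to, String subject) {
        sent.add(to + ": " + subject);
    }
}

final class Mailers {

    private Mailers() {}

    /**
     * Picks the mailer for the current run. The system property name
     * "build.mail.enabled" is an assumption made for this sketch.
     */
    static Mailer forBuild(Mailer real, Mailer silent) {
        return Boolean.getBoolean("build.mail.enabled") ? real : silent;
    }

    public static void main(String[] args) {
        RecordingMailer real = new RecordingMailer();   // stands in for an SMTP-backed mailer
        RecordingMailer silent = new RecordingMailer();
        forBuild(real, silent).send("team@example.com", "Build finished");
        System.out.println("real=" + real.sent.size() + ", silent=" + silent.sent.size());
    }
}
```

Tests can still assert that a message would have been sent by inspecting the recorded list, without paying for a mail server round trip.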
CPU – Profiling - Insights
java.util.Calendar is horribly slow.
Total processing took 20.72 minutes, of which date arithmetic took 18.15 minutes. That is about 87.6% of the total processing time!
Focus on Bottleneck
CPU – Profiling - Insights
java.util.Calendar is horribly slow. We switched to the Joda-Time library and deprecated the java.util.Date API.
Date arithmetic now takes 1.30 minutes; that’s a massive saving of 93.77%.
Focus on Bottleneck
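To illustrate the kind of change involved, the sketch below computes the same day difference in two styles. Note the substitution: it uses `java.time` from the JDK rather than Joda-Time, purely to keep the example free of third-party dependencies. The class and method names are hypothetical, and the example makes no claim about the timings quoted above.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.Calendar;
import java.util.GregorianCalendar;

/** Hypothetical example: the same day difference computed with Calendar and with java.time. */
final class DateArithmetic {

    private DateArithmetic() {}

    /** Calendar style: step a mutable calendar forward one day at a time. */
    static long daysBetweenWithCalendar(int y1, int m1, int d1, int y2, int m2, int d2) {
        Calendar cursor = new GregorianCalendar(y1, m1 - 1, d1); // Calendar months are 0-based
        Calendar end = new GregorianCalendar(y2, m2 - 1, d2);
        long days = 0;
        while (cursor.before(end)) {
            cursor.add(Calendar.DAY_OF_MONTH, 1);
            days++;
        }
        return days;
    }

    /** java.time style: immutable dates and a single arithmetic call. */
    static long daysBetweenWithJavaTime(int y1, int m1, int d1, int y2, int m2, int d2) {
        return ChronoUnit.DAYS.between(LocalDate.of(y1, m1, d1), LocalDate.of(y2, m2, d2));
    }

    public static void main(String[] args) {
        System.out.println(daysBetweenWithCalendar(2014, 1, 1, 2014, 3, 1));
        System.out.println(daysBetweenWithJavaTime(2014, 1, 1, 2014, 3, 1));
    }
}
```

Whether a replacement library is actually faster for a given workload is something only a profiler run can confirm, which is how the original hotspot was found.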
CPU - Ant 1.7 JUnit task options.
Focus on Bottleneck
CPU - Running Tests Concurrently
• Create parallel jobs.
Divide and Conquer
CPU - Running Tests Concurrently
• Distribute tasks across multiple slaves.
Divide and Conquer
Image source - https://wiki.jenkins-ci.org/download/attachments/2916393/logo.png?version=1&modificationDate=1302753947000
CPU - Running Tests Concurrently
Image source - http://sharpreflections.com/wp-content/uploads/2012/06/multi_core_cpu.png
• Using @RunWith(ConcurrentJunitRunner.class).
– Courtesy - Mathieu Carbou http://java.dzone.com/articles/concurrent-junit-tests
– The Maven Surefire plugin has a built-in mechanism.
Divide and Conquer
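The idea behind running tests concurrently can be sketched with the JDK alone: submit independent checks to a thread pool and collect the failures. This is not the ConcurrentJunitRunner implementation and not Surefire's mechanism; `ParallelChecks` is a hypothetical stand-in that only works when the checks share no mutable state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical runner: executes independent checks on a fixed pool and counts failures. */
final class ParallelChecks {

    private ParallelChecks() {}

    /** Runs every check concurrently; a check fails by returning false or by throwing. */
    static int countFailures(List<Callable<Boolean>> checks, int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int failures = 0;
            // invokeAll blocks until all checks have completed.
            for (Future<Boolean> result : pool.invokeAll(checks)) {
                try {
                    if (!result.get()) {
                        failures++;
                    }
                } catch (ExecutionException e) {
                    failures++; // the check threw an exception
                }
            }
            return failures;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<Callable<Boolean>> checks = new ArrayList<>();
        checks.add(() -> 1 + 1 == 2);
        checks.add(() -> "abc".length() == 3);
        checks.add(() -> false); // deliberately failing check
        int threads = Runtime.getRuntime().availableProcessors();
        System.out.println("failures: " + countFailures(checks, threads));
    }
}
```

Tests that touch shared files, databases, or static fields would need to be isolated first, otherwise parallel runs can produce intermittent failures.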
Restructure The Build Pipeline
Image Source - http://javapapers.com/wp-content/uploads/2012/11/failfast.jpg
• We want our builds to give us fast feedback. Hence it is very important to prioritize build tasks based on what is most likely to fail first.
• Push unnecessary work to a separate build. Things like JavaDocs can be generated nightly.
• Separate fast-running tests from slow-running ones.
Fail Fast
Incremental Build vs. Clean Build
• Local dev builds are incremental rather than clean, which gives faster feedback and helps us fail fast.
Fail Fast
Prioritize Test
• We prioritize and group our tests so that the tests which are fast and most likely to fail are run first.
– ProTest framework
Fail Fast
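A simple way to picture this ordering is a sort over per-test history: highest recent failure rate first and, among equals, shortest duration first. The sketch below is not the ProTest framework's algorithm; `TestStats`, `TestPrioritizer`, and the two history fields are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical record of a test's recent history. */
final class TestStats {
    final String name;
    final double failureRate;  // fraction of recent runs that failed, 0.0 to 1.0
    final long averageMillis;  // average duration of recent runs

    TestStats(String name, double failureRate, long averageMillis) {
        this.name = name;
        this.failureRate = failureRate;
        this.averageMillis = averageMillis;
    }
}

final class TestPrioritizer {

    private TestPrioritizer() {}

    /** Orders tests so that likely failures come first and, among equals, faster tests lead. */
    static List<String> order(List<TestStats> stats) {
        List<TestStats> sorted = new ArrayList<>(stats);
        sorted.sort(Comparator
                .comparingDouble((TestStats t) -> t.failureRate).reversed()
                .thenComparingLong(t -> t.averageMillis));
        List<String> names = new ArrayList<>();
        for (TestStats t : sorted) {
            names.add(t.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<TestStats> stats = new ArrayList<>();
        stats.add(new TestStats("slowFlaky", 0.5, 900));
        stats.add(new TestStats("fastStable", 0.0, 5));
        stats.add(new TestStats("fastFlaky", 0.5, 20));
        System.out.println(order(stats));
    }
}
```

Other weightings are possible, for example ranking by failure rate divided by duration; which works best depends on the test suite.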
Summary
• Focus on bottlenecks
– Avoid Disk IO - File operations, file based database operations.
– Use smaller datasets.
– Use in-memory databases, Ram Drives, SSDs.
– Perform CPU profiling, scan logs, to uncover the unknown.
– Verify build tool settings.
• Divide and Conquer
– Create smaller jobs that can run in parallel.
– Distribute jobs across multiple slaves.
– Write tests that can run in isolation and use ConcurrentJunitRunner to run them in parallel.
• Fail Fast
– Restructure the build pipeline to uncover failures sooner.
– Incremental Builds
– Prioritize tests.
Build Time vs. Number of Builds
[Chart annotations:]
• Removed Workspace Duplication
• Ant JUnit Task – Fork Once
• RAM Disk
• Caching Resource
• Caching Spring Context
• Avoided Email
• Joda DateTime
• Deprecated Date API
• Concurrent JUnit Runner
Impact on life
Image source - http://t3.gstatic.com/images?q=tbn:ANd9GcTCvK8pY5qcp7Gl3ZBjxN1mc1HVHdiy1sQhByKeGgUk_5eJuUk7cA
https://encrypted-tbn3.gstatic.com/images?q=tbn:ANd9GcQpoUXqhEpdGl1cLzn4gQsng_GyxUmOKWxYUH6GfrjN_FRUYPxw-Q
Resources
• Jenkins – http://jenkins-ci.org/
• CI – http://en.wikipedia.org/wiki/Continuous_integration
• Mklink – http://technet.microsoft.com/en-us/library/cc753194.aspx
• http://ant.apache.org/manual/Tasks/junit.html
• http://java.dzone.com/articles/javalangoutofmemory-permgen
• SSD – http://en.wikipedia.org/wiki/Solid-state_drive
• Hybrid disk – http://en.wikipedia.org/wiki/Hybrid_drive
• HSQL – http://hsqldb.org/
• H2 – http://www.h2database.com/html/main.html
• Memsql – http://www.memsql.com/
• MySQL is bazillion times faster than MemSQL
• Tmpfs – http://en.wikipedia.org/wiki/Tmpfs
• http://blog.laptopmag.com/faster-than-an-ssd-how-to-turn-extra-memory-into-a-ram-disk
• RAM Disk Software Benchmarked
• http://jvmmonitor.org/
• http://searchvmware.techtarget.com/tip/VMware-snapshot-size-and-other-causes-for-slow-snapshots
• http://blogs.agilefaqs.com/2014/10/03/key-principles-for-reducing-continuous-integration-build-time/
• http://googletesting.blogspot.com/2011/06/testing-at-speed-and-scale-of-google.html
• http://www.infoq.com/presentations/Development-at-Google
• http://crystalmark.info/software/CrystalDiskMark/index-e.html
© Copyright Integrated Decisions and Systems, Inc. (IDeaS – A SAS COMPANY)
Visit IDeaS online at www.ideas.com
Thank you.