Continuous Integration on Steroids

Akbashev Alexander
Highload++ | November 07, 2016

Agenda
01. CI in HERE
02. Monitoring
03. Scalability
04. Jenkins
05. Plugins (nightmares)
06. Morale
07. Q&A

01. Continuous Integration in HERE

Every change goes through validation pipeline
[Diagram: Gerrit, Gerrit Plugin, Pre-submit Triggers, Build jobs, Tests jobs]

Feedback goes from tests back to Gerrit
[Diagram: Gerrit, Gerrit Plugin, Pre-submit Triggers, Build jobs, Tests jobs]

Feedback comes from every pipeline
[Diagram: Gerrit, Gerrit Plugin, Pre-submit Triggers, Build jobs, Tests jobs]

Numbers
100k+ builds per day
• Each “build” is the execution of one build/test job
• Total number correlates with number of commits
• Number of builds is not as important as number of commits
~1.5k concurrent builds
• High throughput is extremely important
• Morning commit
• Before lunch
• “Last attempt for today”
1.3-2.5k executors
• Raised on-demand
• Health checks
• Jenkins strategy is not optimized for cloud

Collects information about every build in system
[Diagram: Jenkins build → Groovy Event Listener Plugin → Fluentd → InfluxDB → Grafana]
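
The pipeline above ends with one record per build stored in InfluxDB. As a rough illustration of what such a record might look like, the sketch below formats a build event as an InfluxDB line-protocol point. The `BuildEvent` type, the measurement name `jenkins_build`, and the tag and field names are all hypothetical; the talk does not show the actual schema.

```python
from dataclasses import dataclass


@dataclass
class BuildEvent:
    """Hypothetical build-finished event; not the plugin's real payload."""
    job: str
    result: str
    duration_ms: int
    finished_at_ns: int  # Unix time in nanoseconds


def escape_tag(value: str) -> str:
    # Line protocol requires commas, equals signs and spaces
    # in tag values to be escaped with a backslash.
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value


def to_line_protocol(event: BuildEvent) -> str:
    # Layout: measurement,tag_set field_set timestamp
    tags = f"job={escape_tag(event.job)},result={escape_tag(event.result)}"
    fields = f"duration_ms={event.duration_ms}i"  # trailing "i" marks an integer field
    return f"jenkins_build,{tags} {fields} {event.finished_at_ns}"
```

Keeping the job name and result as tags (indexed) and the duration as a field is one plausible choice for dashboards that group by job; other layouts are equally possible.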

JVM stats are the best “canary”
[Diagram: Jenkins build → Groovy Event Listener Plugin → Fluentd → InfluxDB → Grafana, plus the Jenkins JVM as a second source]

What do we want to achieve?
Keep feedback time short (< 20 min.)

Test as much as possible

… with debug symbols

… and code coverage information

… and on physical devices

How to scale
Increase number of executors
Minimize job execution time
Smart testing

How to increase number of executors?
EC2 Plugin
TestDroid

How to minimize job execution time

Split tests by type
Parallel execution
Node as cache storage
Shared compiler cache
Profiling!
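
One way to picture "split tests" plus "parallel execution" is to spread tests over several executors so that each shard takes roughly the same time. The sketch below is not from the talk: it assumes per-test durations from earlier runs are available, and the function name and data shape are hypothetical. It uses a simple greedy rule (longest test first, always onto the least-loaded shard).

```python
import heapq
from typing import Dict, List


def split_into_shards(durations: Dict[str, float], shard_count: int) -> List[List[str]]:
    """Greedily assign tests to shards: longest first, to the lightest shard."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # Heap entries are (total_seconds, shard_index), so the lightest shard pops first.
    heap = [(0.0, i) for i in range(shard_count)]
    heapq.heapify(heap)
    shards: List[List[str]] = [[] for _ in range(shard_count)]
    # Sort by descending duration; ties are broken by name for determinism.
    for name, seconds in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        load, index = heapq.heappop(heap)
        shards[index].append(name)
        heapq.heappush(heap, (load + seconds, index))
    return shards
```

With durations of 10, 8, 5, 4 and 3 seconds and two shards, this yields totals of 14 and 16 seconds instead of 30 seconds in a single job.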

Is Jenkins really that slow, or are we doing something wrong?

Jenkins is ok.
But…

Surprise #1
Rotation costs a lot

Surprise #2
It works much better with nginx
tail -n1000 jenkins.access.log | grep 'urt="-"' | wc -l
407

Surprise #3
Some buttons are very dangerous

One fundamental issue
[Diagram: one Master between many Slaves and the Users]

What can you find in the heap dump of an OOM-killed Jenkins?

Console logs
Should be less than X MB
Verbose output goes to file
“>” and “tee” are amazing!
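
The talk points to shell redirection (“>” and “tee”) for this. For build steps driven from a script, the same idea can be sketched in Python: write the full output to a file and echo only the last few lines to the console log. The function name `run_quietly` and its parameters are hypothetical.

```python
import subprocess
import sys
from collections import deque
from typing import List


def run_quietly(command: List[str], log_path: str, tail_lines: int = 20) -> int:
    """Run command, write all output to log_path, echo only the last lines."""
    tail = deque(maxlen=tail_lines)  # keeps only the most recent lines in memory
    with open(log_path, "w", encoding="utf-8") as log_file:
        process = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into the same stream
            text=True,
        )
        assert process.stdout is not None
        for line in process.stdout:
            log_file.write(line)
            tail.append(line)
        return_code = process.wait()
    # Only the tail reaches the console, so the console log stays small.
    sys.stdout.writelines(tail)
    return return_code
```

The full log file can then be archived alongside the other build artifacts, while the console log keeps only the part most likely to matter for a quick look.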

Build history
2000 entries or 3 days
Efficient rotator
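
A retention rule like the one above can be sketched as a pure function. The slide does not say how the two limits combine, so this sketch assumes a build is discarded as soon as it exceeds either one; the `Build` record and the function name are hypothetical and do not reflect the BuildRotator Plugin's actual code.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Build = Tuple[int, datetime]  # (build number, finish time); hypothetical record


def builds_to_discard(
    builds: List[Build],
    now: datetime,
    max_count: int = 2000,
    max_age: timedelta = timedelta(days=3),
) -> List[int]:
    """Return numbers of builds that exceed either limit (assumed semantics)."""
    newest_first = sorted(builds, key=lambda b: b[0], reverse=True)
    discard = []
    for position, (number, finished) in enumerate(newest_first):
        too_many = position >= max_count  # beyond the newest max_count builds
        too_old = now - finished > max_age
        if too_many or too_old:
            discard.append(number)
    return discard
```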

Build artifacts
Push to S3 directly from slaves
Don’t store anything on master

Groovy Event Listener Plugin
all events
synchronized
groovy compilation
fixed since 1.010 (Mar 10, 2016)

Warnings Plugin
Just another parser of console log
parseConsole is “deprecated”
parseFile is allowed
Zero warnings is much appreciated :)

Timestamper Plugin
Showing the tail of a log read more than just the tail
fixed since 1.8.5 (Aug 31, 2016)

EC2 Plugin
Full list of all images in AWS
fixed since 1.35 (Jun 30, 2016)

Robot Framework Plugin
Green chart costs 100 times more
Replaced by xUnit Plugin

Limit of number of builds
120K

Build Failure Analyzer Plugin
One regexp
One stream
One thread
PR-57 is not accepted yet

Limit of number of builds
140K

Cleanup Workspace Plugin
`ü` breaks everything
PR-29 is not accepted yet

Final recommendations
Think about scalability first

Flakiness could be a huge problem

Reduce memory allocations

Cache as much as possible
Failing builds can be expensive

Workflow
Slowness? Profile! Fix! Contribute!

Open source collaboration
Let’s make our lives better ;)

Full list of our contributions related to this talk
• Jenkins
• ccache
• clcache
• EC2 Plugin
• S3 Plugin
• FluentD Plugin
• BuildRotator Plugin
• Groovy Event Listener Plugin
• Timestamper Plugin
• Robot Framework Plugin
• Build Failure Analyzer Plugin
• JVM GC Log Plugin for FluentD

Thank you
Contact
Akbashev Alexander
GitHub: Jimilian
E-mail: alexander.akbashev@here.com
