How do you operate over 1,200 deployments on a single BOSH Director? Many past talks have covered Cloud Foundry at scale, but what about the underlying automation layer? BOSH has its own set of challenges and limits for running VMs and deployments at scale. Learn which obstacles and limits came up and how we solved them with the help of the BOSH core development team. Learn how we monitor the directors, be it via logging, metrics, or performance indicators. We'll also show you how we automate BOSH itself to ensure the best experience for end users, and to keep them blissfully unaware of the complexity of the processes working on their behalf. After this talk you too will be able to run at least 1,200 deployments on your directors.
7. BOSH Setup
Overbosh: Deploys the runtime and Underbosh
Underbosh: Deploys service brokers and services (this is where the 1,200 deployments live); uses a CredHub colocated on Overbosh
Utilsbosh: Utilities and Prometheus monitoring
8. Lessons Learned
● Deploy less with create-env
● Only create-env Utilsbosh
● Deploy the other directors from Utilsbosh (sketch after this list)
● Using Overbosh as the CredHub provider for Underbosh can be
suboptimal
○ A recreate of Overbosh means people cannot create services in the meantime
○ A dedicated CredHub/UAA deployment is better
● Using an external RDS does not solve all problems
○ More about that later
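As a hedged sketch of that setup (file names, ops files, and environment aliases here are illustrative): only the utility director is bootstrapped with `bosh create-env`; the others become ordinary deployments on it, so they get resurrection and `bosh recreate` for free.
  # Bootstrap only Utilsbosh with create-env
  bosh create-env bosh-deployment/bosh.yml \
    --state utilsbosh-state.json \
    --vars-store utilsbosh-creds.yml \
    -o bosh-deployment/aws/cpi.yml \
    -v director_name=utilsbosh \
    -l utilsbosh-vars.yml

  # Deploy the other directors as regular deployments on Utilsbosh
  bosh -e utilsbosh -d overbosh deploy overbosh.yml
  bosh -e utilsbosh -d underbosh deploy underbosh.yml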
10. IO Credits are Fun
● AWS limits IOPS on gp2 SSD disks to a baseline of 3 IOPS/GB, with a burstable credit budget on top
● No IOPS credits on magnetic/HDD volumes (st1, sc1, standard)
● AWS-Stage BOSH DB runs on gp2
● AWS-Prod BOSH DB runs on standard
● You can see the disk's IOPS budget in CloudWatch (the BurstBalance metric)
● Unless it's an RDS instance
● You have to create an alert for each single volume (sketch below)
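A hedged sketch of one such per-volume alert on the BurstBalance metric (volume ID, SNS topic ARN, and threshold are placeholders):
  aws cloudwatch put-metric-alarm \
    --alarm-name "bosh-db-disk-burst-balance" \
    --namespace AWS/EBS \
    --metric-name BurstBalance \
    --dimensions Name=VolumeId,Value=vol-0abc1234def567890 \
    --statistic Average --period 300 \
    --evaluation-periods 3 \
    --threshold 20 --comparison-operator LessThanThreshold \
    --alarm-actions arn:aws:sns:eu-west-1:123456789012:bosh-alerts
Since the metric carries the volume ID as a dimension, you repeat this once per volume.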
11. Effects
● The database in AWS-P is consistently slower, but with no variation in response times
● The database in AWS-S went unresponsive at some points
○ BOSH sometimes sends a few thousand queries in a row which do large joins
● The EU-P BOSH has 50 GB of standard disk ($3/mo)
● The EU-S BOSH has 1 TB of gp2 disk ($119/mo)
12. Things That Drain Your IOPS
● The daily snapshot task, even if snapshots are disabled
○ Has since been made less severe
● `bosh vms`/`bosh deployments`
○ More on that later
● If the IOPS on your director disk get depleted repeatedly, switch to magnetic storage like sc1 or st1 (cloud-config sketch below)
○ Slower than gp2 at max speed
○ Costs about half as much
○ Consistent, and fast enough for BOSH
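A minimal sketch of what that looks like in a BOSH cloud-config for the AWS CPI (disk name and size are illustrative; note that st1 has a large minimum volume size):
  disk_types:
  - name: director-db
    disk_size: 512000        # in MB; st1 volumes start at 500 GB
    cloud_properties:
      type: st1              # throughput-optimized HDD instead of gp2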
14. September 2018
● 670 deployments
● BOSH director is very slow
● Some queries take 2-3 minutes to complete
● Scaling BOSH and the DB brings only minor improvements
● An m4.2xlarge RDS instance is a bit faster, but does not solve it
● More disk IOPS does not help
15. Solution
● Updating the director fixed it
● The reason: every `bosh vms` also made the director select the deployment configs for each deployment separately
○ Even though they were not part of the output
● SAP stumbled over the issue first and fixed it
16. November 2018
● BOSH unresponsive or very slow
● No uploads/deploys possible
● Persistent disk 50% free
● `df -i` showed all inodes exhausted (see the checks below)
● BOSH stores task logs on disk
○ And deletes them regularly
○ But with 900 deployments and the Prometheus BOSH exporter running a `bosh vms` every 5 minutes, you create tasks faster than BOSH cleans them up
○ 1.8m task log folders on disk
○ Each one contained 0-3 log files
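Two quick checks on the director VM (the task log path below is the usual `/var/vcap/store` location; verify it on your stemcell):
  df -i    # inode usage per filesystem; 100% IUse% despite free space matches this failure
  sudo find /var/vcap/store/director/tasks -maxdepth 1 -type d | wc -l    # task log folders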
17. Solution
● Removed some older log files (1.79m of them)
● Scaled the disk
● Notified BOSH core
● Set up an alert for inode usage on all persistent disks (rule sketch below)
● Switched from the BOSH exporter to the Graphite HM plugin
● BOSH core made the director more aggressive at purging old task logs
○ Went from 1.6m task logs on disk to just 18,000
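A sketch of the inode alert as a Prometheus rule over node_exporter metrics (the mountpoint matcher and the 10% threshold are our choices, not part of the original slides):
  groups:
  - name: bosh-disks
    rules:
    - alert: PersistentDiskInodesLow
      expr: node_filesystem_files_free{mountpoint="/var/vcap/store"} / node_filesystem_files{mountpoint="/var/vcap/store"} < 0.10
      for: 10m
      annotations:
        summary: "Less than 10% inodes free on {{ $labels.instance }}"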
18. December 2018
● BOSH very slow
● Sometimes locks up for minutes
● The database works on some queries longer than BOSH is willing to wait
● This happens whenever a service is deployed or updated
19. Investigation
● Turns out `bosh tasks -r` queries the last 30 tasks
● We had 3.5m tasks in the DB
● Query: `SELECT * FROM "tasks" WHERE ("deployment_name" = 'd27eda6') ORDER BY "timestamp" DESC LIMIT 30`
○ No index on deployment_name (see the index sketch below)
○ So if a deployment only has 29 tasks, the query crawls through all 3.5m rows looking for task number 30
○ Most deployments have fewer than 30 tasks in the DB
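The fix amounts to indexing the filtered column; as a sketch, the equivalent PostgreSQL statement would be (index name illustrative):
  CREATE INDEX tasks_deployment_name_idx ON tasks (deployment_name, timestamp);
With the composite index the planner can walk the index backwards per deployment and stop after at most 30 entries, instead of scanning the whole table.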
20. Solution
● Changed to `-r=1` (see below)
● Ran a deploy task for each deployment to make sure every deployment has at least one task
● Filed an issue with BOSH core (No. 2105)
● BOSH core fix:
○ BOSH deletes old tasks faster, so you have fewer of them (10 instead of 2 per cleanup run)
○ Put an index on task types
○ 3.5m tasks → 1,100 tasks in the DB
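On the CLI side the workaround is simply (environment alias illustrative):
  bosh -e underbosh tasks -r=1    # fetch only the most recent task instead of the last 30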
22. Things You Should Monitor
● Network IP exhaustion
○ IaaS-dependent, but running out of IPs during deploys is suboptimal
○ Especially when the customer notices first
● Disk IOPS (depending on IaaS)
● Quota limitations
○ The record holder is Azure, where a limit increase took 9 days
● CPU credits on important instances
● Disk inode usage, not just how full it is in terms of data
● Certificate expiration
● Check whether metrics are missing (alert sketches below)
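Hedged Prometheus sketches for the last two points, assuming a blackbox exporter probe against the director endpoint (job names are placeholders):
  # Missing metrics: the series vanished entirely (e.g. exporter removed or renamed)
  - alert: DirectorMetricsMissing
    expr: absent(up{job="bosh-director"})
    for: 15m

  # Certificate expires in less than 14 days
  - alert: DirectorCertExpiresSoon
    expr: probe_ssl_earliest_cert_expiry{job="blackbox-bosh"} - time() < 14 * 24 * 3600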
23. What 1200 deployments taught us
● The BOSH team is usually rather fast at fixing issues that block the director
● BOSH itself is pretty stable
● Change from the Prometheus BOSH exporter to the Graphite HM plugin
● For most small-to-medium environments a t2.large (2 CPUs, 8 GB RAM with burst CPU) or equivalent is plenty
● For large environments an m5.xlarge or m5.2xlarge is enough
○ Disk I/O or network speed will most likely be the bottleneck
24. Advice
● Don't overdo it on the worker count (ops-file sketch below)
○ Our biggest director still has only 9 workers for tasks
○ The others usually have 3-4 workers
● Otherwise you run the risk of CPU-starving yourself when all workers are busy at once
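The count maps to the director job's `director.workers` property; as a sketch, an ops file against a bosh-deployment-style manifest could look like this:
  # Cap the director at 4 task workers
  - type: replace
    path: /instance_groups/name=bosh/jobs/name=director/properties/director/workers?
    value: 4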