Sensu and Sensibility 
Cycle of failure and 
• Manually edited and deployed monitoring 
• Changes require two teams 
• Low developer visibility about production 
• Escalation of issues is hard 
• Ops ignore alerts from services 
• Postmortems 
• High friction, low trust, low visibility. 
This is 
“51 % viewed their ERP implementation as 
The Robbins-Gioia Survey (2001)
The Conference Board Survey (2001) 
“40 % of the projects failed to achieve their 
business case within one year of going live” 
McKinsey & Company in conjunction 
with the University of Oxford (2012) 
• “17 percent of large IT projects go so 
badly that they can threaten the very 
existence of the company” 
• “On average, large IT projects run 45 
percent over budget and 7 percent over 
time, while delivering 56 percent less 
value than predicted” 
Failure is an option 
Why Sensu? 
• Designed to be pluggable / extensible 
• Arbitrary check metadata 
• Simple model 
• Components do exactly one thing 
• Ruby 
• Not afraid to extend (or fork!) 
‘industry standard’ 
‘enterprise class’ 
Cheap shot 
How we use Sensu 
• Don’t use all of this! 
• ‘Standalone’ checks only 
• Default in the Puppet module 
Sensu data flow 
• Sensu client runs checks on each machine 
• Pushes results to RabbitMQ 
• Clustered, clients/messages will fail over. 
• Sensu server (multiple, HA) 
• Processes check results, invokes handlers 
• Writes state to Redis 
• Redis + Sentinel 
• Read by API (2 instances) 
• All layers behind HAProxy 
Quis custodiet ipsos custodes? 
Mutually assured monitoring 
• Multiple independent Sensu installs (per-datacenter) 
• Monitor each other! 
Machine readable config 
• /etc/sensu/conf.d/checks/check_name.json 
• Extensible with arbitrary metadata 
• Hash merge 
• Never edit by hand! 
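The "hash merge" bullet can be illustrated with a small Python sketch. It assumes a recursive merge in which later fragments win on conflicting keys; the function name `deep_merge`, the top-level `checks` key, and the two sample fragments are illustrative assumptions, not Sensu's actual config loader.

```python
import json


def deep_merge(base, override):
    """Recursively merge two dicts; values from `override` win on conflict."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            # Both sides are hashes: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, or mismatched types: the later value replaces.
            merged[key] = value
    return merged


# Two hypothetical fragments, as if read from separate files under conf.d/checks/.
fragment_a = json.loads('{"checks": {"disk_ro_mounts": {"interval": 60}}}')
fragment_b = json.loads('{"checks": {"disk_ro_mounts": {"team": "operations"}}}')

config = deep_merge(fragment_a, fragment_b)
print(json.dumps(config, sort_keys=True))
```

Because every check lives in its own generated file and the files are merged, extra metadata keys can be added without touching any other check's definition.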
monitoring_check { 'systems-apache-external':
  page          => true,
  command       => "/usr/lib/nagios/plugins/check_tcp -H ${external_ip_address} -p 443",
  check_every   => '5m',
  alert_after   => '30m',
  realert_every => 10,
  runbook       => 'y/apache',
}
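A wrapper of this kind has to turn human-friendly values such as '5m' into a machine-readable check definition. The Python sketch below shows one way that translation could look. Everything in it is an assumption made for illustration: the helper names `to_seconds` and `render_check`, the set of unit suffixes, the mapping of `check_every` to `interval`, the top-level `checks` key, and the placeholder address 192.0.2.10. It is not the code of the Puppet module.

```python
import json

# Assumed unit suffixes; the slides only show values such as '5m' and '30m'.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def to_seconds(value):
    """Convert '5m' / '30m' style strings (or plain ints) to seconds."""
    if isinstance(value, int):
        return value
    text = value.strip()
    if text and text[-1] in _UNIT_SECONDS:
        return int(text[:-1]) * _UNIT_SECONDS[text[-1]]
    return int(text)


def render_check(name, command, check_every, alert_after, **metadata):
    """Build the JSON document for one check (hypothetical helper)."""
    definition = {
        "standalone": True,
        "command": command,
        "interval": to_seconds(check_every),
        "alert_after": to_seconds(alert_after),
    }
    # Arbitrary extra keys (team, runbook, page, ...) are passed through as-is.
    definition.update(metadata)
    return json.dumps({"checks": {name: definition}}, indent=2, sort_keys=True)


print(render_check(
    "systems-apache-external",
    "/usr/lib/nagios/plugins/check_tcp -H 192.0.2.10 -p 443",  # placeholder IP
    check_every="5m",
    alert_after="30m",
    page=True,
    runbook="y/apache",
))
```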
• monitoring_check wraps this 
• Writes a JSON file for each check 
• Comment safe 
"disk_ro_mounts": {
  "standalone": true, "handlers": ["default"], "subscribers": [],
  "command": "/usr/lib/nagios/plugins/yelp/check_ro_mounts",
  "interval": 60,
  "alert_after": 0, "realert_every": "-1",
  "dependencies": [],
  "runbook": "http://lmgtfy.com/?q=linux+read+only+disk",
  "annotation": "https://gitweb.yelpcorp.com/?
  "team": "operations",
  "irc_channels": "operations-notifications",
  "notification_email": "undef",
  "ticket": true,
  "project": "OPS",
  "page": false,
  "tip": false
}
Check scripts 
• Same as Nagios checks 
• Simple (text) output 
• Exit code 
• Result sent to server, along with check definition 
• Including all the custom metadata 
• Our handlers use the extra data. 
• base 
• email 
• irc 
• pagerduty 
• awsprune 
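Because each result arrives with the check's custom metadata, a handler can decide where a notification goes from the event alone. The Python sketch below illustrates that idea using the field names from the JSON example in the slides (`page`, `ticket`, `project`, `irc_channels`, `notification_email`, `team`). The routing rules, the treatment of "undef" as unset, and the function name `plan_notifications` are assumptions for illustration, not the behaviour of Yelp's sensu_handlers.

```python
def plan_notifications(check, status):
    """Return a list of (channel, target) pairs for one check result."""
    if status == 0:
        return []  # OK result: nothing to notify in this simplified sketch
    actions = []
    if check.get("page"):
        actions.append(("pagerduty", check.get("team", "unknown")))
    if check.get("ticket"):
        actions.append(("ticket", check.get("project", "unknown")))
    channels = check.get("irc_channels")
    if channels:
        if isinstance(channels, str):
            channels = [channels]  # accept a single channel or a list
        actions.extend(("irc", channel) for channel in channels)
    email = check.get("notification_email")
    if email and email != "undef":  # assumption: "undef" means "not set"
        actions.append(("email", email))
    return actions


check = {
    "team": "operations",
    "irc_channels": "operations-notifications",
    "notification_email": "undef",
    "ticket": True,
    "project": "OPS",
    "page": False,
}
for action in plan_notifications(check, status=2):
    print(action)
```

With the metadata shown, a failing result would produce a ticket and an IRC message but no page, which matches the intent of setting `page` to false on a low-urgency check.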
How do checks get run? 
• Every machine runs the client. 
• Client managed by Puppet 
• Client has a TCP socket you can send JSON to 
• Custom checks + pysensu-yelp 
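Sending a result to the client's TCP socket can be sketched in Python as below. The payload uses the `name`, `status` and `output` fields, with status following the Nagios exit-code convention. The address 127.0.0.1:3030 is assumed here as the client socket's default and should be checked against your client configuration; the helper names and the sample check are hypothetical, and this is not the pysensu-yelp API.

```python
import json
import socket

# Nagios-style exit codes, reused as the result "status".
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def build_result(name, status, output, **metadata):
    """Build the JSON payload for one check result."""
    if status not in (OK, WARNING, CRITICAL, UNKNOWN):
        raise ValueError("status must be 0, 1, 2 or 3")
    result = {"name": name, "status": status, "output": output}
    result.update(metadata)  # custom metadata travels with the result
    return json.dumps(result)


def send_result(payload, host="127.0.0.1", port=3030, timeout=5.0):
    """Send one payload to the local client socket (address is an assumption)."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload.encode("utf-8"))


payload = build_result(
    "nightly_backup", CRITICAL, "CRITICAL: backup older than 24h",
    team="operations", page=False,
)
print(payload)
# send_result(payload)  # uncomment on a machine running the Sensu client
```

This lets application code report its own state without a scheduled check command: the process that knows something went wrong pushes the result itself.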
Situational awareness 
Single source of truth 
• DNS is canonical for Sensu servers 
• Configure things in one place! 
Automatic monitoring 
• E.g. cron jobs - check successful recently! 
• cron::d 
Generate monitoring_check 
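A "has this cron job succeeded recently?" check could look roughly like the Python sketch below. It assumes the job touches a marker file each time it succeeds and that the check compares the file's age with a threshold; that mechanism, the function name `check_freshness`, and the thresholds are assumptions for illustration, not what `cron::d` generates.

```python
import os
import tempfile
import time

OK, CRITICAL = 0, 2  # Nagios-style exit codes


def check_freshness(marker_path, max_age_seconds, now=None):
    """Return (status, message) based on the marker file's modification time."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(marker_path)
    except OSError:
        return CRITICAL, "CRITICAL: %s missing, job has never succeeded" % marker_path
    if age > max_age_seconds:
        return CRITICAL, "CRITICAL: last success %d seconds ago" % age
    return OK, "OK: last success %d seconds ago" % age


# Demonstration with a freshly created temporary marker file.
with tempfile.NamedTemporaryFile() as marker:
    status, message = check_freshness(marker.name, max_age_seconds=3600)
    print(status, message)
```

A real check script would print the message and exit with the status code, so the same function fits the Nagios-style contract described earlier in the deck.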
User specified monitoring 
• Data lives in the service config 
• Next to the code to emit metrics!
• Simple checks for free! 
• Next to metadata about SLAs and LB timeouts 
• Developers can push without Ops 
Cluster checks 
• We’re working on this currently 
• Assert some % of machines are healthy. 
• Use to reduce alert noise. 
• If a service becomes fully unavailable to clients, 
you want to page someone. 
• If one machine goes belly up, you don’t (make 
a JIRA ticket for handling later!) 
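The cluster-check idea (page when the service as a whole is unhealthy, ticket when only some machines are) can be sketched in Python as follows. The 50% threshold, the three outcome names, and the function `cluster_action` are assumptions chosen for illustration; the slides describe this feature as still in progress and do not give its real rules.

```python
def cluster_action(healthy, total, min_healthy_fraction=0.5):
    """Decide how to react given how many machines pass their check."""
    if total <= 0:
        raise ValueError("total must be positive")
    if healthy == total:
        return "ok"
    if healthy / total < min_healthy_fraction:
        return "page"    # too few healthy machines: wake someone up
    return "ticket"      # degraded but serving: handle during working hours


for healthy in (10, 9, 0):
    print(healthy, "of 10 healthy ->", cluster_action(healthy, 10))
```

Evaluating the cluster as a whole is what reduces noise: one failed machine produces one ticket rather than a page for every per-host check.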
• This is all still a work in progress. 
• We’ve not 100% migrated off of Nagios 
• Open sourcing the pieces 
• Slides will be online shortly: 
• slideshare.net/bobtfish 
• @bobtfish 
• Some (most?) of our code is open source: 
• https://github.com/Yelp/sensu/commit/ 
• https://github.com/Yelp/puppet-monitoring_check 
• https://github.com/Yelp/puppet-netstdlib 
• https://github.com/Yelp/sensu_handlers 
• https://github.com/Yelp/pysensu-yelp 

Messaging, interoperability and log aggregation - a new framework
Tomas Doran
Zero mq logs
Zero mq logsZero mq logs
Zero mq logs
Tomas Doran
Cooking a rabbit pie
Cooking a rabbit pieCooking a rabbit pie
Cooking a rabbit pie
Tomas Doran
High scale flavour
High scale flavourHigh scale flavour
High scale flavour
Tomas Doran
Large platform architecture in (mostly) perl - an illustrated tour
Large platform architecture in (mostly) perl - an illustrated tourLarge platform architecture in (mostly) perl - an illustrated tour
Large platform architecture in (mostly) perl - an illustrated tour
Tomas Doran
Large platform architecture in (mostly) perl
Large platform architecture in (mostly) perlLarge platform architecture in (mostly) perl
Large platform architecture in (mostly) perl
Tomas Doran
Web frameworks don't matter
Web frameworks don't matterWeb frameworks don't matter
Web frameworks don't matter
Tomas Doran
Real time system_performance_mon
Real time system_performance_monReal time system_performance_mon
Real time system_performance_mon
Tomas Doran
Tomas Doran

More from Tomas Doran (20)

Deploying puppet code at light speed
Deploying puppet code at light speedDeploying puppet code at light speed
Deploying puppet code at light speed
Thinking through puppet code layout
Thinking through puppet code layoutThinking through puppet code layout
Thinking through puppet code layout
Docker puppetcamp london 2013
Docker puppetcamp london 2013Docker puppetcamp london 2013
Docker puppetcamp london 2013
"The worst code I ever wrote"
"The worst code I ever wrote""The worst code I ever wrote"
"The worst code I ever wrote"
Test driven infrastructure development (2 - puppetconf 2013 edition)
Test driven infrastructure development (2 - puppetconf 2013 edition)Test driven infrastructure development (2 - puppetconf 2013 edition)
Test driven infrastructure development (2 - puppetconf 2013 edition)
Test driven infrastructure development
Test driven infrastructure developmentTest driven infrastructure development
Test driven infrastructure development
London devops - orc
London devops - orcLondon devops - orc
London devops - orc
Message:Passing - lpw 2012
Message:Passing - lpw 2012Message:Passing - lpw 2012
Message:Passing - lpw 2012
Webapp security testing
Webapp security testingWebapp security testing
Webapp security testing
Webapp security testing
Webapp security testingWebapp security testing
Webapp security testing
Dates aghhhh!!?!?!?!
Dates aghhhh!!?!?!?!Dates aghhhh!!?!?!?!
Dates aghhhh!!?!?!?!
Messaging, interoperability and log aggregation - a new framework
Messaging, interoperability and log aggregation - a new frameworkMessaging, interoperability and log aggregation - a new framework
Messaging, interoperability and log aggregation - a new framework
Zero mq logs
Zero mq logsZero mq logs
Zero mq logs
Cooking a rabbit pie
Cooking a rabbit pieCooking a rabbit pie
Cooking a rabbit pie
High scale flavour
High scale flavourHigh scale flavour
High scale flavour
Large platform architecture in (mostly) perl - an illustrated tour
Large platform architecture in (mostly) perl - an illustrated tourLarge platform architecture in (mostly) perl - an illustrated tour
Large platform architecture in (mostly) perl - an illustrated tour
Large platform architecture in (mostly) perl
Large platform architecture in (mostly) perlLarge platform architecture in (mostly) perl
Large platform architecture in (mostly) perl
Web frameworks don't matter
Web frameworks don't matterWeb frameworks don't matter
Web frameworks don't matter
Real time system_performance_mon
Real time system_performance_monReal time system_performance_mon
Real time system_performance_mon

Recently uploaded

call bomber software for call centers.pdf
call bomber software for call centers.pdfcall bomber software for call centers.pdf
call bomber software for call centers.pdf
Asfera Technologies
Guide to Improving QA Testing with Gen AI.pdf
Guide to Improving QA Testing with Gen AI.pdfGuide to Improving QA Testing with Gen AI.pdf
Guide to Improving QA Testing with Gen AI.pdf
Top 10 ERP Companies in UAE Banibro IT Solutions.pdf
Top 10 ERP Companies in UAE Banibro IT Solutions.pdfTop 10 ERP Companies in UAE Banibro IT Solutions.pdf
Top 10 ERP Companies in UAE Banibro IT Solutions.pdf
Banibro IT Solutions
Understanding Automated Testing Tools for Web Applications.pdf
Understanding Automated Testing Tools for Web Applications.pdfUnderstanding Automated Testing Tools for Web Applications.pdf
Understanding Automated Testing Tools for Web Applications.pdf
Mastering MicroStation DGN: How to Integrate CAD and GIS
Mastering MicroStation DGN: How to Integrate CAD and GISMastering MicroStation DGN: How to Integrate CAD and GIS
Mastering MicroStation DGN: How to Integrate CAD and GIS
Safe Software
Viswanath_Cover letter_Scrum Master_10+yrs
Viswanath_Cover letter_Scrum Master_10+yrsViswanath_Cover letter_Scrum Master_10+yrs
Viswanath_Cover letter_Scrum Master_10+yrs
Top 5 ERP Companies in India Banibro IT Solutions.pdf
Top 5 ERP Companies in India Banibro IT Solutions.pdfTop 5 ERP Companies in India Banibro IT Solutions.pdf
Top 5 ERP Companies in India Banibro IT Solutions.pdf
Banibro IT Solutions
Asset Management software Technologies.pdf
Asset Management software Technologies.pdfAsset Management software Technologies.pdf
Asset Management software Technologies.pdf
Hr365.us smith
The two flavors of Python 3.13 - PyHEP 2024
The two flavors of Python 3.13 - PyHEP 2024The two flavors of Python 3.13 - PyHEP 2024
The two flavors of Python 3.13 - PyHEP 2024
Henry Schreiner
'Build Your First Website with WordPress' Workshop Introduction
'Build Your First Website with WordPress' Workshop Introduction'Build Your First Website with WordPress' Workshop Introduction
'Build Your First Website with WordPress' Workshop Introduction
Sunita Rai
Limited Time Offer! Pay One Time to Access to Sociosight for Only $95
Limited Time Offer! Pay One Time to Access to Sociosight for Only $95Limited Time Offer! Pay One Time to Access to Sociosight for Only $95
Limited Time Offer! Pay One Time to Access to Sociosight for Only $95
Sri Damayanti
How Odoo Accounting Can Save Your Business, Time and Money.pdf
How Odoo Accounting Can Save Your Business, Time and Money.pdfHow Odoo Accounting Can Save Your Business, Time and Money.pdf
How Odoo Accounting Can Save Your Business, Time and Money.pdf
Banibro IT Solutions
Software Development Company in Florida.pdf
Software Development Company in Florida.pdfSoftware Development Company in Florida.pdf
Software Development Company in Florida.pdf
SOCRadar's Hand Guide For the 2024 Paris Olympics--.pdf
SOCRadar's Hand Guide For the 2024 Paris Olympics--.pdfSOCRadar's Hand Guide For the 2024 Paris Olympics--.pdf
SOCRadar's Hand Guide For the 2024 Paris Olympics--.pdf
Learning Rust with Advent of Code 2023 - Princeton
Learning Rust with Advent of Code 2023 - PrincetonLearning Rust with Advent of Code 2023 - Princeton
Learning Rust with Advent of Code 2023 - Princeton
Henry Schreiner
Experience Enhanced Testing with the Best Test Automation Tools for Salesforc...
Experience Enhanced Testing with the Best Test Automation Tools for Salesforc...Experience Enhanced Testing with the Best Test Automation Tools for Salesforc...
Experience Enhanced Testing with the Best Test Automation Tools for Salesforc...
JavaScript API Deprecation in the Wild: A First Assessment (SANER 2020)
JavaScript API Deprecation in the Wild: A First Assessment (SANER 2020)JavaScript API Deprecation in the Wild: A First Assessment (SANER 2020)
JavaScript API Deprecation in the Wild: A First Assessment (SANER 2020)
Andre Hora
How and Why Developers Migrate Python Tests (SANER 2022)
How and Why Developers Migrate Python Tests (SANER 2022)How and Why Developers Migrate Python Tests (SANER 2022)
How and Why Developers Migrate Python Tests (SANER 2022)
Andre Hora
A House In The Rift 0.7.10 b1 (Gallery Unlock, MOD)
A House In The Rift 0.7.10 b1 (Gallery Unlock, MOD)A House In The Rift 0.7.10 b1 (Gallery Unlock, MOD)
A House In The Rift 0.7.10 b1 (Gallery Unlock, MOD)

Recently uploaded (20)

call bomber software for call centers.pdf
call bomber software for call centers.pdfcall bomber software for call centers.pdf
call bomber software for call centers.pdf
Guide to Improving QA Testing with Gen AI.pdf
Guide to Improving QA Testing with Gen AI.pdfGuide to Improving QA Testing with Gen AI.pdf
Guide to Improving QA Testing with Gen AI.pdf
Top 10 ERP Companies in UAE Banibro IT Solutions.pdf
Top 10 ERP Companies in UAE Banibro IT Solutions.pdfTop 10 ERP Companies in UAE Banibro IT Solutions.pdf
Top 10 ERP Companies in UAE Banibro IT Solutions.pdf
Understanding Automated Testing Tools for Web Applications.pdf
Understanding Automated Testing Tools for Web Applications.pdfUnderstanding Automated Testing Tools for Web Applications.pdf
Understanding Automated Testing Tools for Web Applications.pdf
Mastering MicroStation DGN: How to Integrate CAD and GIS
Mastering MicroStation DGN: How to Integrate CAD and GISMastering MicroStation DGN: How to Integrate CAD and GIS
Mastering MicroStation DGN: How to Integrate CAD and GIS
Viswanath_Cover letter_Scrum Master_10+yrs
Viswanath_Cover letter_Scrum Master_10+yrsViswanath_Cover letter_Scrum Master_10+yrs
Viswanath_Cover letter_Scrum Master_10+yrs
Top 5 ERP Companies in India Banibro IT Solutions.pdf
Top 5 ERP Companies in India Banibro IT Solutions.pdfTop 5 ERP Companies in India Banibro IT Solutions.pdf
Top 5 ERP Companies in India Banibro IT Solutions.pdf
Asset Management software Technologies.pdf
Asset Management software Technologies.pdfAsset Management software Technologies.pdf
Asset Management software Technologies.pdf
The two flavors of Python 3.13 - PyHEP 2024
The two flavors of Python 3.13 - PyHEP 2024The two flavors of Python 3.13 - PyHEP 2024
The two flavors of Python 3.13 - PyHEP 2024
'Build Your First Website with WordPress' Workshop Introduction
'Build Your First Website with WordPress' Workshop Introduction'Build Your First Website with WordPress' Workshop Introduction
'Build Your First Website with WordPress' Workshop Introduction
Limited Time Offer! Pay One Time to Access to Sociosight for Only $95
Limited Time Offer! Pay One Time to Access to Sociosight for Only $95Limited Time Offer! Pay One Time to Access to Sociosight for Only $95
Limited Time Offer! Pay One Time to Access to Sociosight for Only $95
How Odoo Accounting Can Save Your Business, Time and Money.pdf
How Odoo Accounting Can Save Your Business, Time and Money.pdfHow Odoo Accounting Can Save Your Business, Time and Money.pdf
How Odoo Accounting Can Save Your Business, Time and Money.pdf
Software Development Company in Florida.pdf
Software Development Company in Florida.pdfSoftware Development Company in Florida.pdf
Software Development Company in Florida.pdf
SOCRadar's Hand Guide For the 2024 Paris Olympics--.pdf
SOCRadar's Hand Guide For the 2024 Paris Olympics--.pdfSOCRadar's Hand Guide For the 2024 Paris Olympics--.pdf
SOCRadar's Hand Guide For the 2024 Paris Olympics--.pdf
Learning Rust with Advent of Code 2023 - Princeton
Learning Rust with Advent of Code 2023 - PrincetonLearning Rust with Advent of Code 2023 - Princeton
Learning Rust with Advent of Code 2023 - Princeton
Experience Enhanced Testing with the Best Test Automation Tools for Salesforc...
Experience Enhanced Testing with the Best Test Automation Tools for Salesforc...Experience Enhanced Testing with the Best Test Automation Tools for Salesforc...
Experience Enhanced Testing with the Best Test Automation Tools for Salesforc...
JavaScript API Deprecation in the Wild: A First Assessment (SANER 2020)
JavaScript API Deprecation in the Wild: A First Assessment (SANER 2020)JavaScript API Deprecation in the Wild: A First Assessment (SANER 2020)
JavaScript API Deprecation in the Wild: A First Assessment (SANER 2020)
How and Why Developers Migrate Python Tests (SANER 2022)
How and Why Developers Migrate Python Tests (SANER 2022)How and Why Developers Migrate Python Tests (SANER 2022)
How and Why Developers Migrate Python Tests (SANER 2022)
A House In The Rift 0.7.10 b1 (Gallery Unlock, MOD)
A House In The Rift 0.7.10 b1 (Gallery Unlock, MOD)A House In The Rift 0.7.10 b1 (Gallery Unlock, MOD)
A House In The Rift 0.7.10 b1 (Gallery Unlock, MOD)

Sensu and Sensibility - Puppetconf 2014

  • 1. Sensu and Sensibility Tomas Doran @bobtfish 2014-09-23
  • 2. Sensu and Sensibility
  • 7. Cycle of failure and disappointment • Manually edited and deployed monitoring • Changes require two teams • Low developer visibility about production • Escalation of issues is hard • Ops ignore alerts from services • Postmortems • High friction, low trust, low visibility.
  • 8. “Normality” - http://gunshowcomic.com/648
  • 9. “Normality” This is dysfunctional - http://gunshowcomic.com/648
  • 12. “51% viewed their ERP implementation as unsuccessful” The Robbins-Gioia Survey (2001)
  • 13. The Conference Board Survey (2001) “40% of the projects failed to achieve their business case within one year of going live”
  • 14. McKinsey & Company in conjunction with the University of Oxford (2012) • “17 percent of large IT projects go so badly that they can threaten the very existence of the company” • “On average, large IT projects run 45 percent over budget and 7 percent over time, while delivering 56 percent less value than predicted”
  • 15. Failure is an option - blog.parasoft.com/single-greatest-barrier-with-sw-delivery
  • 18. Why Sensu? • Designed to be pluggable / extensible • Arbitrary check metadata • Simple model • Components do exactly one thing • Ruby • Not afraid to extend (or fork!)
  • 26. How we use Sensu • Don’t use all of this! • ‘Standalone’ checks only • Default in the Puppet module
  • 27. Sensu data flow • Sensu client runs checks on each machine • Pushes results to RabbitMQ • Clustered, clients/messages will fail over. • Sensu server (multiple, HA) • Processes check results, invokes handlers • Writes state to Redis • Redis + Sentinel • Read by API (2 instances) • All layers behind HAProxy
  • 28. Quis custodiet ipsos custodes? “Sensu has so many moving parts that I wouldn’t be able to sleep at night unless I set up a Nagios instance to make sure they were all running.”
  • 29. Mutually assured monitoring • Multiple independent Sensu installs (per-datacenter) • Monitor each other!
  • 30. Machine readable config • /etc/sensu/conf.d/checks/check_name.json • Extensible with arbitrary metadata • Hash merge • Never edit by hand!
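The “hash merge” bullet on slide 30 can be illustrated with a minimal sketch. It assumes that each JSON file under conf.d contributes a nested dictionary and that fragments are merged recursively; the `deep_merge` helper and the sample fragments are hypothetical, and Sensu's own merge rules (for arrays in particular) may differ.

```python
import json


def deep_merge(base, override):
    """Return a new dict with override merged recursively into base."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists are simply replaced in this sketch.
            merged[key] = value
    return merged


# Two hypothetical config fragments, as if read from separate files.
fragment_a = json.loads('{"checks": {"disk_ro_mounts": {"interval": 60}}}')
fragment_b = json.loads('{"checks": {"disk_ro_mounts": {"team": "operations"}}}')

config = deep_merge(fragment_a, fragment_b)
print(json.dumps(config, sort_keys=True))
```

Because the fragments merge rather than overwrite each other, one file can carry the check definition while another adds metadata such as the owning team.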
  • 31. monitoring_check
      monitoring_check { 'systems-apache-external':
        page          => true,
        command       => "/usr/lib/nagios/plugins/check_tcp -H ${external_ip_address} -p 443",
        check_every   => '5m',
        alert_after   => '30m',
        realert_every => 10,
        runbook       => 'y/apache',
      }
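As a rough illustration of how the `check_every` and `alert_after` values in the monitoring_check example relate to each other, the sketch below converts the human-readable durations to seconds and counts how many consecutive failing runs fit in the alert window. It assumes durations are a number followed by s, m, h, or d; the helper names are hypothetical, and this is not the Puppet module's own code.

```python
import re

# Seconds per unit suffix; the supported suffixes are an assumption.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def to_seconds(duration):
    """Convert a string such as '5m' (or a bare number of seconds) to seconds."""
    match = re.fullmatch(r"(\d+)([smhd]?)", str(duration).strip())
    if match is None:
        raise ValueError("unrecognised duration: %r" % (duration,))
    value, unit = match.groups()
    return int(value) * UNIT_SECONDS.get(unit, 1)


def failures_before_alert(check_every, alert_after):
    """Number of consecutive failing runs that span the alert_after window."""
    interval = to_seconds(check_every)
    window = to_seconds(alert_after)
    # Ceiling division, so a partial interval still counts as a full run.
    return -(-window // interval)


print(to_seconds("5m"))
print(failures_before_alert("5m", "30m"))
```

With the values from the slide, a check that runs every five minutes would need to fail for six runs in a row before the thirty-minute window has elapsed.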
  • 35. sensu::check • monitoring_check wraps this • Writes a JSON file for each check • Comment safe
  • 36. "disk_ro_mounts": {
        "standalone": true,
        "handlers": ["default"],
        "subscribers": [],
        "command": "/usr/lib/nagios/plugins/yelp/check_ro_mounts",
        "interval": 60,
        "alert_after": 0,
        "realert_every": "-1",
        "dependencies": [],
        "runbook": "http://lmgtfy.com/?q=linux+read+only+disk",
        "annotation": "https://gitweb.yelpcorp.com/?p=puppet.git;a=blob;f=modules/profile/manifests/server.pp#l80",
        "team": "operations",
        "irc_channels": "operations-notifications",
        "notification_email": "undef",
        "ticket": true,
        "project": "OPS",
        "page": false,
        "tip": false
      }
  • 42. Check scripts • Same as Nagios checks • Simple (text) output • Exit code • Result sent to server, along with check definition • Including all the custom metadata • Our handlers use the extra data.
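The check-script contract on slide 42 (one line of text plus an exit code) can be sketched as follows. The exit codes are the conventional Nagios plugin values; the read-only-mount logic is a hypothetical stand-in and not the `check_ro_mounts` script named in the JSON example.

```python
# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def find_ro_mounts(mounts_text):
    """Return mount points whose options include 'ro' in /proc/mounts-style text."""
    read_only = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        mount_point, options = fields[1], fields[3].split(",")
        if "ro" in options:
            read_only.append(mount_point)
    return read_only


def check(mounts_text):
    """Return (exit_code, one-line message) for the given mount table."""
    read_only = find_ro_mounts(mounts_text)
    if read_only:
        return CRITICAL, "CRITICAL: read-only mounts: " + ", ".join(read_only)
    return OK, "OK: no read-only mounts"


# Hypothetical mount table used in place of the real /proc/mounts.
sample = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "/dev/sdb1 /data ext4 ro,relatime 0 0\n"
)
code, message = check(sample)
print(message)
# A real plugin would read /proc/mounts, print the message, and then
# call sys.exit(code) so the client can pick up the exit status.
```

Keeping the decision in a function that returns the code and message makes the check easy to test without touching the real mount table.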
  • 43. Handlers • base • JIRA • email • irc • pagerduty • awsprune
  • 44. How do checks get run? • Every machine runs the client. • Client managed by Puppet • Client has a TCP socket you can send JSON to • Custom checks + pysensu-yelp
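Sending a result to the client's TCP socket, as mentioned on slide 44, might look like the sketch below. It assumes the client listens on 127.0.0.1 port 3030 (the usual default for the Sensu client socket, but check your own configuration) and that a result needs at least `name`, `status`, and `output`. The check name and helper functions are hypothetical, and this is not the pysensu-yelp API.

```python
import json
import socket


def build_result(name, status, output, **metadata):
    """Build the JSON payload for one check result."""
    result = {"name": name, "status": status, "output": output}
    # Extra keys travel with the result, like the custom check metadata.
    result.update(metadata)
    return json.dumps(result)


def send_result(payload, host="127.0.0.1", port=3030):
    """Send one JSON payload to the local Sensu client socket."""
    with socket.create_connection((host, port), timeout=5) as connection:
        connection.sendall(payload.encode("utf-8"))


payload = build_result(
    "nightly_backup", 0, "OK: backup finished", team="operations", page=False
)
print(payload)
# send_result(payload)  # uncomment on a machine running the Sensu client
```

This lets a cron job or application report its own status without being scheduled by the client.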
  • 47. Single source of truth • DNS is canonical for Sensu servers • Configure things in one place!
  • 49. Automatic monitoring • E.g. cron jobs - check successful recently! • cron::d
  • 53. User specified monitoring • Data lives in the service config • Next to the code to emit metrics!
  • 54. User specified monitoring • Simple checks for free!
  • 55. User specified monitoring • Data lives in the service config • Next to the code to emit metrics • Next to metadata about SLAs and LB timeouts • Developers can push without ops
  • 56. Cluster checks • We’re working on this currently • Assert some % of machines are healthy. • Use to reduce alert noise. • If a service becomes fully unavailable to clients, you want to page someone. • If one machine goes belly up, you don’t (make a JIRA ticket for handling later!)
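The cluster-check idea on slide 56 can be sketched as a small decision function. The slide describes this as work in progress, so the function name, the 50% default threshold, and the three outcomes below are all illustrative assumptions rather than the implementation described in the talk.

```python
def cluster_action(host_healthy, min_healthy_percent=50.0):
    """Decide what to do for a cluster given per-host health (True = healthy)."""
    total = len(host_healthy)
    if total == 0:
        return "unknown"
    healthy = sum(1 for ok in host_healthy.values() if ok)
    percent = 100.0 * healthy / total
    if percent < min_healthy_percent:
        # Too few healthy machines: the service is at risk, so page.
        return "page"
    if healthy < total:
        # Some machines are down but the cluster is fine: file a ticket.
        return "ticket"
    return "ok"


# Hypothetical per-host results for a four-machine cluster.
hosts = {"web1": True, "web2": True, "web3": False, "web4": True}
print(cluster_action(hosts))
```

With one of four machines down, the cluster is still 75% healthy, so the sketch files a ticket instead of paging anyone.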
  • 57. WIP • This is all still a work in progress. • We’ve not 100% migrated off of Nagios • Open sourcing the pieces
  • 58. Thanks! • Slides will be online shortly: • slideshare.net/bobtfish • @bobtfish • Some (most?) of our code is open source: • https://github.com/Yelp/sensu/commit/aa5c43c2fdfde5e8739952c0b8082000934f3ad2 • https://github.com/Yelp/puppet-monitoring_check • https://github.com/Yelp/puppet-netstdlib • https://github.com/Yelp/sensu_handlers • https://github.com/Yelp/pysensu-yelp