This document discusses logging best practices and tools for log centralization and analysis. It recommends:
1. Using Monolog for PHP logging to centralize logs from application code in a standardized format. Monolog supports multiple handlers for storage and alerts.
2. Leveraging Rsyslog to also centrally collect logs from software, services and the operating system. Rsyslog can scale across servers.
3. Ingesting logs into Logstash for filtering, parsing and forwarding. Logstash supports many input sources and output targets.
4. Storing logs long-term in Graylog2 for powerful searching, analytics, dashboards and alerting. Graylog2 is highly scalable and easy to use.
2. Who?
● Ex-pat Englishman, living in Southern Ontario.
● Web developer for 5 years, mostly PHP.
● Senior Developer at MRX Digital Sports Solutions.
● Ex-professional musician.
10. What's wrong with error_log?
● Nothing at all but...
● It's limited:
– Have to format the message yourself.
– Limited number of destinations.
– Doesn't support all logging levels defined in RFC 5424.
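For reference, the eight severity levels that RFC 5424 defines can be sketched as below. Python is used here only as a neutral sketch language; the names `Severity` and `is_at_least` are illustrative and not part of PHP or Monolog.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity codes from RFC 5424 (a lower number is more severe)."""
    EMERGENCY = 0
    ALERT = 1
    CRITICAL = 2
    ERROR = 3
    WARNING = 4
    NOTICE = 5
    INFORMATIONAL = 6
    DEBUG = 7


def is_at_least(level, threshold):
    """Return True when `level` is as severe as, or more severe than, `threshold`."""
    return level <= threshold
```

A logging library that supports all eight levels lets you route, say, everything at ERROR or worse to an alert while keeping DEBUG output in a file.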
12. Introducing Monolog
● PHP 5.3+ logging library by Jordi Boggiano.
● Based on Python's Logbook library.
● PSR-3 compliant.
● Supports logging levels defined in RFC 5424.
13. Installing Monolog
● Symfony2, Laravel4, Silex and PPI all come with Monolog.
● CakePHP and Slim have plug-ins to use it.
● Most easily installed with Composer.
15. Channels
● A channel is a name or category for a logger.
● Each logger instance is given a channel when instantiated.
● Allows for multiple loggers, each with a different channel.
16. Handlers
● Handlers write log messages to a storage medium.
● Multiple handlers can be attached to each logger.
● Each handler can be configured to handle different log levels and to 'bubble' or not.
● Many handlers available or you can write your own.
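The interaction between channels, minimum levels and bubbling can be sketched with a toy model. This is not Monolog's code or API: `Logger`, `ListHandler` and `push_handler` are hypothetical names, and the only Monolog details borrowed are its numeric levels (higher is more severe) and the rule that handlers added last are called first.

```python
# Numeric levels mirroring Monolog's constants (higher is more severe).
DEBUG, WARNING, ERROR = 100, 300, 400


class ListHandler:
    """Hypothetical handler that keeps messages in a list instead of a file."""

    def __init__(self, min_level, bubble=True):
        self.min_level = min_level
        self.bubble = bubble
        self.records = []

    def handle(self, level, message):
        """Store the message; return True if it must NOT reach later handlers."""
        if level < self.min_level:
            return False  # below this handler's threshold, pass it on
        self.records.append(message)
        return not self.bubble


class Logger:
    """Toy logger: one channel name and a stack of handlers."""

    def __init__(self, channel):
        self.channel = channel
        self.handlers = []

    def push_handler(self, handler):
        # Handlers added last are called first.
        self.handlers.insert(0, handler)

    def log(self, level, message):
        for handler in self.handlers:
            if handler.handle(level, f"[{self.channel}] {message}"):
                break  # a non-bubbling handler took the record


logger = Logger("payments")
file_handler = ListHandler(DEBUG)                 # accepts everything
alert_handler = ListHandler(ERROR, bubble=False)  # errors only, stops bubbling
logger.push_handler(file_handler)
logger.push_handler(alert_handler)

logger.log(ERROR, "gateway timeout")
logger.log(DEBUG, "cache miss")
```

In this sketch the error reaches only `alert_handler`, because it does not bubble, while the debug message falls through to `file_handler`. That is the situation that needs care when sharing non-bubbling handlers between loggers.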
18. Formatters
● Process a log message into the appropriate format for a handler.
● Each handler has a default formatter to use but this can be overridden.
22. Processors
● Used to amend or add to the log message.
● PHP callable, called when a message is logged.
● Built in processors available:
– IntrospectionProcessor
– WebProcessor
– MemoryUsageProcessor
– MemoryPeakUsageProcessor
– ProcessIdProcessor
– UidProcessor
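As a rough illustration of the idea, a processor is just a callable that receives a log record, adds to it, and returns it. The sketch below models that in Python rather than PHP; the record layout (a dict with an `extra` key) and the function names are assumptions made for the example, not Monolog's implementation.

```python
import os


def process_id_processor(record):
    """Add the current process id to the record, like a ProcessIdProcessor would."""
    record["extra"]["process_id"] = os.getpid()
    return record


def apply_processors(record, processors):
    """Run every processor in order; each receives and returns the record."""
    for processor in processors:
        record = processor(record)
    return record


record = {"message": "User logged in", "context": {"user": 42}, "extra": {}}
record = apply_processors(record, [process_id_processor])
```

Because the processor runs on every message, the call site does not have to repeat that context each time it logs.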
24. Where does this get us to?
● Centralised. Maybe...
● Accepts messages from application code, software and the OS.
● Performant. Maybe...
● Scalable. Maybe...
● Easily searchable.
● Alarms and alerting. Yes but crude.
27. Why Syslog?
● Loggable events don't only happen in code!
● Many apps/services send messages to syslog.
● To get a full picture of what's going on we need to monitor these too.
28. Syslog basics
● OS daemon to process log messages.
● Messages are assigned a facility, such as auth, authpriv, daemon or cron.
● Custom facilities of local0 – local7.
● Messages are also assigned a severity, defined in RFC 5424.
● Messages can be sent to files, console or a remote location.
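Facility and severity travel together on the wire as a single priority value (PRI), which RFC 5424 defines as facility × 8 + severity. A small sketch of that arithmetic follows; the helper names are illustrative, and the facility table lists only a few of the standard codes.

```python
# A few standard syslog facility codes (local0 to local7 occupy 16 to 23).
FACILITIES = {
    "daemon": 3,
    "auth": 4,
    "cron": 9,
    "authpriv": 10,
    "local0": 16,
    "local7": 23,
}


def encode_pri(facility, severity):
    """Combine facility and severity codes into the syslog PRI value."""
    if not 0 <= severity <= 7:
        raise ValueError("severity must be between 0 and 7")
    return facility * 8 + severity


def decode_pri(pri):
    """Split a PRI value back into (facility, severity)."""
    return divmod(pri, 8)


# An error (severity 3) logged to local0 starts with "<131>" on the wire.
pri = encode_pri(FACILITIES["local0"], 3)
```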
29. Which Syslog daemon to use?
● In part will depend on your OS.
● Features:
– Syslog is the oldest and has the fewest features.
– Syslog-ng is produced under a dual license.
– Rsyslog is fully featured and open source.
30. Introduction to Rsyslog
● Fork of sysklogd by Rainer Gerhards.
● Drop-in replacement for syslog.
● Many, many features including a plugin system for extending.
● Default syslogger in Debian, can be installed on other distros too.
31. Remote logging with Rsyslog
● Rsyslog can be configured to work in a client-server setup.
– One or more machines are set up as clients to forward log messages.
– One machine is set up to receive and store them.
● Probably want to filter sender on the receiving machine...
34. Leveling up with Rsyslog
● Apache can send all error logs to syslog directly.
● Rsyslog can also monitor other log files using the Text File Input module.
– Example of monitoring Apache access log at https://gist.github.com/joseph12631/2580615
35. Where does this get us to?
● Centralised. Yes.
● Accepts messages from application code, software and the OS. Possibly.
● Performant. Depends.
● Scalable. Depends.
● Easily searchable.
● Alarms and alerting. Yes but crude.
37. What is Logstash?
● Tool to collect, filter and output log messages.
● Currently accepts 34 inputs, has 28 filters and 47 different outputs.
● Built-in web interface, or a richer web interface project called Kibana is available.
● Full information at http://logstash.net/
● Kibana web demo at http://demo.logstash.net/
38. Installing Logstash
● Current release is 1.1.13, available for download.
● Run from cli, use supervisord or an init.d/upstart script (cookbook entry on how to do this at http://cookbook.logstash.net/).
40. Logstash config
● When starting specify the path to a config file for Logstash to use.
● Config file has JSON-like syntax.
● Three main sections: input, filter and output.
● Each section may have multiple instances of each type.
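The input, filter and output sections describe a pipeline: events from every input pass through each filter in order and are then handed to every output. The toy Python model below shows only that flow. It is not Logstash's config syntax or implementation, and all names in it are made up for the example.

```python
def run_pipeline(inputs, filters, outputs):
    """Feed events from all inputs through all filters to all outputs."""
    for source in inputs:
        for event in source():
            for event_filter in filters:
                event = event_filter(event)
                if event is None:  # a filter may drop an event
                    break
            else:
                # Reached only when no filter dropped the event.
                for output in outputs:
                    output(event)


def add_tag(event):
    """Filter that annotates every event."""
    return dict(event, tags=["seen"])


def drop_debug(event):
    """Filter that discards debug-level events."""
    return None if event["level"] == "debug" else event


stored, alerted = [], []
run_pipeline(
    inputs=[
        lambda: [{"message": "disk full", "level": "error"}],
        lambda: [{"message": "heartbeat", "level": "debug"}],
    ],
    filters=[add_tag, drop_debug],
    outputs=[stored.append, alerted.append],
)
```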
42. Where does this get us to?
● Centralised. Yes.
● Accepts messages from application code, software and the OS. Yes.
● Performant. Yes.
● Scalable. Yes.
● Easily searchable. Possibly.
● Alarms and alerting. Yes.
44. What is Graylog2?
● Log storage and search application.
● Can accept thousands of messages per second and store terabytes of data.
● Web interface for searching and analytics.
● Built in alerting and metrics.
45. Installing Graylog2
● Components:
– Elasticsearch
– MongoDB
– Graylog2 server
– Graylog2 web interface
● Full info on installing at http://support.torch.sh/help/kb
● Live demo at http://public-graylog2.taulia.com/login
46. Getting log messages into Graylog2
● Can accept log messages in 3 ways:
– Graylog Extended Log Format (GELF) via UDP.
– Syslog via UDP or TCP.
– AMQP.
● Multiple Graylog2 server instances can be run in parallel to spread processing of logs.
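Since GELF is essentially JSON sent in a compressed UDP datagram, a message can be assembled by hand. The sketch below follows the field names of the GELF 1.1 specification (`version`, `host`, `short_message`, `timestamp`, `level`, plus custom fields prefixed with an underscore); the version string, the server address and the helper names are assumptions, so check them against the Graylog2 release you run.

```python
import json
import socket
import time
import zlib


def build_gelf(short_message, host, level=6, **extra):
    """Return a zlib-compressed GELF datagram (field names per GELF 1.1)."""
    payload = {
        "version": "1.1",  # assumption: the server accepts GELF 1.1
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog severity number, 6 = informational
    }
    for key, value in extra.items():
        payload["_" + key] = value  # custom fields carry an underscore prefix
    return zlib.compress(json.dumps(payload).encode("utf-8"))


def send_gelf(datagram, server="127.0.0.1", port=12201):
    """Fire-and-forget UDP send; 12201 is the usual GELF port (assumed here)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram, (server, port))


datagram = build_gelf("Payment failed", host="web01", level=3, order_id=1234)
```

Calling `send_gelf(datagram)` would hand the message to a Graylog2 server listening on that address. With UDP there is no acknowledgement either way.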
47. Graylog2 web interface
● Main view shows all recent log messages and graphs of number of messages received over the last several hours.
● Single message can be clicked on to view all details for it.
● Dashboard views.
● Full search functionality.
● Analytics dashboard and metrics.
51. Searches and streams
● Web interface allows fine-grained searching by different fields.
● Frequently used searches can be saved as streams.
● Streams can be marked as favourites by users and can be viewed as dashboards.
52. Stream alarms
● Alarms can be sent for a stream with user-defined sensitivity.
● Plugins for sending alarms include:
– Email
– PagerDuty
– HipChat
– Twilio SMS
– Jabber/XMPP
● You can also write your own plugins.
53. Where does this get us to?
● Centralised. Yes.
● Accepts messages from application code, software and the OS. Yes.
● Performant. Yes.
● Scalable. Yes.
● Easily searchable. Yes.
● Alarms and alerting. Yes.
55. Putting it all together
A few possible implementations.
Editor's Notes
Of all the things you would come to a conference like this to hear about...
At any given moment full of information about how things are performing right now. Mention framework logs, error logs, Apache logs, MySQL logs, system logs.
Ask how many people have any sort of log monitoring setup. For those that have, how many are looking at log data in real time? Ask how many only look at logs during an outage.
Many log files are generated by many applications/pieces of software. The last time you want to be digging through these is in a crisis.
Mention that I can't tell you how to do this. This talk will introduce some tools that can get you to this point. A combination of tools will get you to a proactive log monitoring solution. Also mention that for each tool I'm talking about there are many alternatives... Mention closed source alternatives. Mention that this is being used in production at MRX.
Of course this will be different for everyone!
Also mention that it's specifically for logging errors, not informational or debug messages. Difficult to format messages. Destinations: file or email. Define the log levels in RFC 5424.
Note that Monolog recommends not using notice and emergency.
Mention that there are many logging libraries but Monolog seems to have gained the most traction. Describe what PSR-3 is.
PPI takes pieces of Zend 2, Sf2 and Doctrine2 and mashes them! Silex allows you to register a Monolog provider.
Channel equates to facility in Syslog. Makes it easy to use different loggers for different parts/functionality in an app. Each logger can have different handlers.
The handler's constructor accepts the minimum log level that the handler should accept. The default differs depending on the handler. Handlers can be shared between multiple loggers. Needs care when not bubbling! Add more specific handlers later.
Rotating File Handler: creates one file per day but is meant as a quick and dirty solution. Mail handlers include native mail and SwiftMailer handlers. Pushover handler sends mobile notifications through the Pushover API. HipChatHandler sends notifications to a HipChat chat room (Rafael Dohms wrote it). FirePHP and ChromePHP write to the Firebug or Chrome consoles. DEV ONLY!!
Use Handler::setFormatter() method to set the formatter for a handler.
Mention that logging a message accepts up to two arguments: The message (string) and an array of context.
Mention that handlers added last are called first.
Mention that this takes away some of the repetition of adding context to each log message. IntrospectionProcessor: Adds the line/file/class/method from which the log call originated. WebProcessor: Adds the current request URI, request method and client IP to a log record. MemoryUsageProcessor: Adds the current memory usage to a log record. MemoryPeakUsageProcessor: Adds the peak memory usage to a log record. ProcessIdProcessor: Adds the process id to a log record. UidProcessor: Adds a unique identifier to a log record.
Mention that the system logger can log messages from the OS (e.g. kernel) or applications.
Mention that you can often replace the default syslog daemon in an OS.
Mention that I'm not going into all features of Rsyslog, just focusing on remote logging. Suggest 'man rsyslogd' or 'man rsyslog.conf'. Also mention that you can use something like Rsyslog or iptables to filter remote loggers.
Note this should be added to main rsyslog config file or a file that's included in it. This is for UDP forwarding. TCP would use @@.
Mention that normally you would need just one of these. Also that the corresponding port needs to be opened in the server config. This would only load the handler for the remote logs. Still needs to be processed with other directives.
Note that if all you want is to centralise all of your logs this could be the solution...
Mention that Logstash is written in Java.
Varnishlog – input from Varnish's memory log. Anonymize – anonymise fields using a consistent hash. Grok – regex library for parsing log messages and processing matches. Geoip – add geo data to IP addresses in log messages. Mutate – general mutations (rename, remove, replace, modify) to fields.
KV filter – parses key=value pairs into message parts (e.g. “key”: “value”).
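To make the key=value idea concrete, here is a minimal Python sketch of that kind of parsing. It is not the Logstash KV filter itself: `parse_kv` and the regular expression are illustrative, and the sketch only handles bare values and double-quoted values.

```python
import re

# key=value where the value is either "double quoted" or a run of non-spaces.
KV_PATTERN = re.compile(r'(\w+)=("([^"]*)"|\S+)')


def parse_kv(message):
    """Extract key=value pairs from a log line into a dict of strings."""
    fields = {}
    for match in KV_PATTERN.finditer(message):
        key = match.group(1)
        quoted = match.group(3)  # None when the value was not quoted
        fields[key] = quoted if quoted is not None else match.group(2)
    return fields


fields = parse_kv('user=bob action="log in" status=200')
```

Once a line is split into fields like this, each key becomes something you can search or filter on downstream.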
Of course this will be different for everyone!
Discuss advantages and disadvantages to using Graylog or Logstash.
Mention that graylog server and elasticsearch are written in Java, web interface is a Rails app. Mention login details for the demo – username admin or user, password graylog2.
Benefits of UDP – 'fire and forget'. Drawbacks of UDP – no acknowledgement that messages were received. TCP can mitigate packet loss but is slower. AMQP guarantees delivery, but is more complex to set up and run. GELF is basically JSON. Ideal for sending messages from app code. Libraries in many languages, including a Monolog handler.