This document discusses NHN Japan's use of Fluentd for log collection, conversion, and analysis. It summarizes their architecture which involves collecting logs from servers using Scribeline, delivering them to Fluentd processes, converting the logs to a structured format using Fluentd plugins, writing the structured data to HDFS, and performing analysis and reporting with Hive. Key aspects discussed include the scaling of Fluentd processes, performance improvements over their previous Hadoop Streaming-based approach, and implementation details of plugins used.
Report
Share
Report
Share
1 of 62
Download to read offline
More Related Content
Distributed Stream Processing on Fluentd / #fluentd
4. What we are doing about logs with fluentd
data mining
reporting
page views, unique users,
traffic amount per page,
...
12 2 4
5. What we are doing about logs with fluentd
super large scale
'sed | grep | wc'
like processes
12 2 4
6. What fluentd? (not Storm, Kafka or Flume?)
Ruby, Ruby, Ruby! (NOT Java!)
we are working in lightweight language culture
easy to try, easy to patch
Plugin model architecture
Builtin TimeSlicedOutput mechanism
12 2 4
7. What I talk today
What we are trying with fluentd
How we did, and how we are doing now
What is distributed stream process topologies like?
What is important about stream processing
Implementation details
(appendix)
12 2 4
8. Architecture in last week's presentation
archive
deliver server(scribed)
archive server
(scribed)
deliver server RAID
server (scribed)
(scribed) Large volume RAID
server
send data both archive servers and
Fluentd workers (as stream)
import past logs and convert
on demand (as batch)
Hadoop
Hadoop Hadoop
Hadoop Hadoop
Hadoop Shib
Cluster
Hadoop
Cluster
Hadoop Cluster
Hadoop
Cluster
Fluentd Cluster
Hadoop
Cluster
Hadoop Hadoop Hive
Cluster
Cluster Cluster
Cluster Cluster
Cluster Web Client
aggregation queries
convert logs as structured data
on demand
and write HDFS (as stream)
12 2 4
9. Now
archive
deliver server(scribed)
archive server
(scribed)
deliver server RAID
server (scribed)
(Fluentd) Large volume RAID
server
Hadoop
Hadoop Hadoop
Hadoop Hadoop
Hadoop Shib
Cluster
Hadoop
Cluster
Hadoop Cluster
Hadoop
Cluster
Fluentd Cluster
Hadoop
Cluster
Hadoop Hadoop Hive
Cluster
Cluster Cluster
Cluster Cluster
Cluster Web Client
Fluentd
Watcher
12 2 4
16. What we are trying with fluentd
log conversion
from: raw log
(apache combined like format)
to: structured and query-friendly log
(TAB separated, masked some fields,
many flags added)
12 2 4
17. What we are trying with fluentd
log conversion
99.999.999.99 - - [03/Feb/2012:10:59:48 +0900] "GET /article/detail/6246245/ HTTP/1.1" 200
17509 "http://news.livedoor.com/topics/detail/6246245/" "Mozilla/4.0 (compatible; MSIE 8.0;
Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR
3.0.30729; Media Center PC 6.0; InfoPath.1; .NET4.0C)" "news.livedoor.com"
"xxxxxxx.xx.xxxxxxx.xxx" "-" 163266
152930 news.livedoor.com /topics/detail/6242972/ GET 302 210 226 - 99.999.999.99
TQmljv9QtXkpNtCSuWVGGg Mozilla/5.0 (iPhone; CPU iPhone OS 5_0_1 like Mac OS X)
AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A406 Safari/7534.48.3 TRUE
TRUE FALSE FALSE FALSE FALSE FALSE
hhmmdd vhost path method status bytes duration referer rhost userlabel agent FLAG [FLAGS]
FLAGS: status_redirection status_errors rhost_internal suffix_miscfile suffix_imagefile agent_bot
FLAG: logical OR of FLAGS
userlabel: hash of (tracking cookie / terminal id (mobile phone) / rhost+agent)
12 2 4
18. What we are trying with fluentd
TimeSlicedOutput of fluentd
Traditional 'log rotation' is important,
but troublesome
We want:
2/3 23:59:59 log in access.0203_23.log
2/4 00:00:00 log in access.0204_00.log
12 2 4
19. How we did, and how we are doing now
collect
archive
convert
aggregate
show
12 2 4
20. How we did in past (2011)
collect (scribed)
stream
stream
store to hdfs
archive (scribed) HIGH LATENCY
hourly/daily time to flush +
hourly invocation +
convert (Hadoop Streaming) running time
20-25mins
on demand
aggregate (Hive)
on demand
show
12 2 4
21. How we are doing now
collect (Fluentd)
stream
stream
archive (scribed)
stream convert
stream VELY LOW LATENCY
convert (Fluentd) 2-3 minutes
(only time to wait flush)
store to hdfs
(over Cloudera's Hoop)
on demand
aggregate (Hive)
on demand
show
12 2 4
23. What is important about stream processing
reasonable efficiency
(compared with batch throughput)
ease to re-run same conversion as
batch
None SPOF
ease to add/remove nodes
12 2 4
24. Stream processing and batch
How to re-run conversion as batch
when we got troubles?
We want to use 'just one' converter
program for both stream processes
and batch processes!
12 2 4
25. out_exec_filter (fluentd built-in plugin)
1. fork and exec 'command' program
2. write data to child process stdin as TAB separated
fields specified by 'out_keys' (for tag, remove_prefix available)
3. read data from child process stdout as TAB
separated fields named by 'in_keys' (for tag, add_prefix
available)
4. set message's timestamp by 'time_key' value in
parsed data as format specified by 'time_format'
12 2 4
26. 'out_exec_filter' and 'Hadoop Streaming'
read from stdin / write to stdout
TAB separated values as input/output
WOW!!!!!!!
difference: 'tag' may be needed with
out_exec_filter
simple solution: if not exists, ignore.
12 2 4
27. What is important about stream processing
reasonable efficiency
(compared with batch throughput)
ease to re-run same conversion as
batch
None SPOF
ease to add/remove nodes
12 2 4
28. What is distributed stream process toplogies like?
deliver archiver backup
servers
deliver
servers
servers
worker worker worker worker
worker worker worker worker worker
servers
servers serializer serializer
Redundancy and load balancing
HDFS
MUST be guaranteed anywhere. (Hoop Server)
12 2 4
29. Deliver nodes
deliver archiver backup
servers
deliver
servers
Accept connections from web servers,
servers Copy messages and send to:
worker worker worker worker
worker worker worker worker worker
servers 1. archiver (and its backup)
2. convert workers (w/ load balancing)
servers serializer and ...
3. serializer
useful for casual worker append/remove
HDFS
(Hoop Server)
12 2 4
30. Worker nodes
deliver archiver backup
servers Under load balancing,
deliver
workers as many as you want
servers
servers
worker worker worker worker
worker worker worker worker worker
servers
servers serializer serializer
HDFS
(Hoop Server)
12 2 4
31. Serializer nodes
deliver archiver backup
Receive converted data stream from workers,
servers
deliver
aggregate by services, and :
servers
1. write to storage(hfds/hoop)
servers
2. and...
worker worker worker worker
useful to reduce overhead of storage from many
worker worker worker worker worker
servers concurrent write operations
servers serializer serializer
HDFS
(Hoop Server)
12 2 4
32. Watcher nodes
deliver archiver backup
servers
deliver Watching data for
servers real-time workload repotings
and trouble notifications
servers
worker worker worker worker
worker
1. for raw data from delivers
worker worker worker worker
servers 2. for structured data from serializers
servers serializer serializer
watcher HDFS
watcher (Hoop Server)
12 2 4
35. log agent: scribeline
log delivery agent tool, python 2.4, scribe/thrift
easy to setup and start/stop
works with any httpd configuration updates
works with logrotate-ed log files
automatic delivery target failover/takeback
(NEW) Cluster support
(random select from server list)
https://github.com/tagomoris/scribe_line
12 2 4
36. From scribeline To deliver
deliver server (primary)
category: blog
message: RAW LOG fluentd
(Apache combined + α)
in_scribe
scribeline scribe
servers in_scribe
fluentd
deliver server (secondary)
12 2 4
37. deliver 01 (primary)
From scribeline To deliver
deliver 02 (secondary)
xNN servers
x8 fluentd
per node
deliver 03 (primary for high throughput nodes)
12 2 4
38. From scribeline To deliver
deliver server (primary)
category: blog
message: RAW LOG fluentd
(Apache combined + α)
in_scribe
scribeline
servers in_scribe
fluentd
deliver server (secondary)
12 2 4
39. deliver node internal routing
deliver server (primary) x8 fluentd instances
deliver fluentd copy scribe.*
in_scribe out_scribe category: blog
host archive.server.local message: RAW LOG
add_prefix scribe remove_prefix scribe
add_newline true
remove_newline true
out_flowcounter (see later..)
time: received_at
tag: scribe.blog
roundrobin (see next)
message: RAW LOG
out_forward (see later with out_flowcounter..)
12 2 4
40. deliver node: roundrobin strategy to workers
roundrobin
x56 substore configurations (7workers x 8instances)
out_forward
server: worker01 port 24211
secondary
server: worker02 port 24211
out_forward time: received_at
server: worker01 port 24212
secondary tag: scribe.blog
server: worker03 port 24212 message: RAW LOG
out_forward
server: worker01 port 24213
secondary
server: worker04 port 24213
out_forward
server: worker01 port 24214
secondary
server: worker05 port 24214
12 2 4
41. From deliver To worker
deliver server worker server X
deliver fluentd worker fluentd Xn1
copy scribe.* in_forward
roundrobin time: received_at
tag: scribe.blog
out_forward message: RAW LOG
in_forward
time: received_at
tag: scribe.blog
message: RAW LOG
worker fluentd Yn2
worker server Y
12 2 4
43. out_exec_filter (review.)
1. fork and exec 'command' program
2. write data to child process stdin as TAB separated
fields specified by 'out_keys' (for tag, remove_prefix available)
3. read data from child process stdout as TAB
separated fields named by 'in_keys' (for tag, add_prefix
available)
4. set message's timestamp by 'time_key' value in
parsed data as format specified by 'time_format'
12 2 4
53. Fluentd stream processing
Finally, works fine, now
Log conversion latency dramatically
reduced
Many useful plugins for monitoring are waiting
shipped
Hundreds of cool features to implement are also
waiting for us!
12 2 4
56. input traffics: by fluent-plugin-flowcounter
deliver server (primary) x8 fluentd instances
deliver fluentd copy scribe.*
in_scribe out_scribe category: blog
host archive.server.local message: RAW LOG
add_prefix scribe remove_prefix scribe
add_newline true
remove_newline true
out_flowcounter (see later..)
time: received_at
tag: scribe.blog
roundrobin (see next)
message: RAW LOG
out_forward (see later with out_flowcounter..)
12 2 4
57. bytes/messages counting on fluentd
1. 'out_flowcounter' counts input message and its
size (specified fields) and its rate (/sec)
2. Counting results emitted per minute/hour/day
3. Worker fluentd sends results to 'Watcher' node
over out_forward
4. Watcher receives counting results, and pass to
'out_growthforecast'.
'GrowthForecast' is graph drawing tool with REST
API for data registration, by kazeburo
12 2 4
58. Why not out_forward roundrobin in deliver?
out_forward roundrobin is per buffer
flushing !
(per buffer size, or flush_interval)
For high throughput stream,
this unit is too large.
We needs roundrobin per 'emit'.
12 2 4
59. deliver node: roundrobin strategy to workers
roundrobin
x56 substore configurations (7workers x 8instances)
out_forward
server: worker01 port 24211
secondary
server: worker02 port 24211
out_forward time: received_at
server: worker01 port 24212
secondary tag: scribe.blog
server: worker03 port 24212 message: RAW LOG
out_forward
server: worker01 port 24213
secondary
server: worker04 port 24213
out_forward
server: worker01 port 24214
secondary
server: worker05 port 24214
12 2 4
60. Why not out_forward roundrobin in deliver?
out_forward roundrobin is per buffer
flushing !
(per buffer size, or flush_interval)
For high throughput stream,
this unit is too large.
We needs roundrobin per 'emit'.
12 2 4
61. From worker To serializer: details
worker server x8 worker instances, x1 serializer instance
worker fluentd serializer fluentd
out_forward converted.* in_forward
server: localhost
secondary: worker1, worker2, worker3, worker4
worker5, worker6, worker7
normally send to localhost
in trouble, balance all traffic to all
other worker's serializers
12 2 4