Distributed Stream Processing on Fluentd / #fluentd

Working at NHN Japan
we are hiring!

12 2 4

What we are doing about logs with ﬂuentd

data mining
reporting
page views, unique users,
traffic amount per page,
...
12 2 4

What we are doing about logs with ﬂuentd

super large scale
'sed | grep | wc'
like processes

12 2 4

What ﬂuentd? (not Storm, Kafka or Flume?)

Ruby, Ruby, Ruby! (NOT Java!)
we are working in lightweight language culture

easy to try, easy to patch
Plugin model architecture
Builtin TimeSlicedOutput mechanism

12 2 4

What I talk today

What we are trying with fluentd

How we did, and how we are doing now

What is distributed stream process topologies like?

What is important about stream processing

Implementation details

(appendix)

12 2 4

Architecture in last week's presentation
archive
deliver server(scribed)
archive server
(scribed)
deliver server RAID
server (scribed)
(scribed) Large volume RAID
server

send data both archive servers and
Fluentd workers (as stream)
import past logs and convert
on demand (as batch)

Hadoop
Hadoop Hadoop
Hadoop Hadoop
Hadoop Shib
Cluster
Hadoop
Cluster
Hadoop Cluster
Hadoop
Cluster
Fluentd Cluster
Hadoop
Cluster
Hadoop Hadoop Hive
Cluster
Cluster Cluster
Cluster Cluster
Cluster Web Client

aggregation queries
convert logs as structured data
on demand
and write HDFS (as stream)

12 2 4

Now
archive
deliver server(scribed)
archive server
(scribed)
deliver server RAID
server (scribed)
(Fluentd) Large volume RAID
server

Hadoop
Hadoop Hadoop
Hadoop Hadoop
Hadoop Shib
Cluster
Hadoop
Cluster
Hadoop Cluster
Hadoop
Cluster
Fluentd Cluster
Hadoop
Cluster
Hadoop Hadoop Hive
Cluster
Cluster Cluster
Cluster Cluster
Cluster Web Client

Fluentd
Watcher

12 2 4

Fluentd in production service

10 days

12 2 4

Scale of Fluentd processes

from 127 Web Servers
146 log streams

12 2 4


70,000 messages/sec
120 Mbps
(at peak time)

12 2 4


650 GB/day
(non-blog: 100GB)

12 2 4


89 fluentd instances
on
12 nodes (4Core HT)

12 2 4

We can't go back.

crouton by kbysmnr
12 2 4

What we are trying with ﬂuentd

log conversion

from: raw log
(apache combined like format)

to: structured and query-friendly log
(TAB separated, masked some fields,
many flags added)
12 2 4


log conversion
99.999.999.99 - - [03/Feb/2012:10:59:48 +0900] "GET /article/detail/6246245/ HTTP/1.1" 200
17509 "http://news.livedoor.com/topics/detail/6246245/" "Mozilla/4.0 (compatible; MSIE 8.0;
Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR
3.0.30729; Media Center PC 6.0; InfoPath.1; .NET4.0C)" "news.livedoor.com"
"xxxxxxx.xx.xxxxxxx.xxx" "-" 163266

152930 news.livedoor.com /topics/detail/6242972/ GET 302 210 226 - 99.999.999.99
TQmljv9QtXkpNtCSuWVGGg Mozilla/5.0 (iPhone; CPU iPhone OS 5_0_1 like Mac OS X)
AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A406 Safari/7534.48.3 TRUE
TRUE FALSE FALSE FALSE FALSE FALSE
hhmmdd vhost path method status bytes duration referer rhost userlabel agent FLAG [FLAGS]
FLAGS: status_redirection status_errors rhost_internal suffix_miscfile suffix_imagefile agent_bot
FLAG: logical OR of FLAGS
userlabel: hash of (tracking cookie / terminal id (mobile phone) / rhost+agent)

12 2 4


TimeSlicedOutput of fluentd

Traditional 'log rotation' is important,
but troublesome

We want:
2/3 23:59:59 log in access.0203_23.log
2/4 00:00:00 log in access.0204_00.log
12 2 4

How we did, and how we are doing now

collect

archive

convert

aggregate

show
12 2 4

How we did in past (2011)

collect (scribed)
stream
stream
store to hdfs
archive (scribed) HIGH LATENCY
hourly/daily time to ﬂush +
hourly invocation +
convert (Hadoop Streaming) running time
20-25mins
on demand
aggregate (Hive)

on demand
show
12 2 4

How we are doing now

collect (Fluentd)
stream
stream
archive (scribed)
stream convert
stream VELY LOW LATENCY
convert (Fluentd) 2-3 minutes
(only time to wait ﬂush)
store to hdfs
(over Cloudera's Hoop)
on demand
aggregate (Hive)

on demand
show
12 2 4

crouton by kbysmnr

break.

12 2 4

What is important about stream processing

reasonable efficiency
(compared with batch throughput)

ease to re-run same conversion as
batch
None SPOF
ease to add/remove nodes

12 2 4

Stream processing and batch

How to re-run conversion as batch
when we got troubles?

We want to use 'just one' converter
program for both stream processes
and batch processes!

12 2 4

out_exec_ﬁlter (ﬂuentd built-in plugin)

1. fork and exec 'command' program

2. write data to child process stdin as TAB separated
fields specified by 'out_keys' (for tag, remove_prefix available)

3. read data from child process stdout as TAB
separated fields named by 'in_keys' (for tag, add_prefix
available)

4. set message's timestamp by 'time_key' value in
parsed data as format specified by 'time_format'

12 2 4

'out_exec_ﬁlter' and 'Hadoop Streaming'

read from stdin / write to stdout
TAB separated values as input/output
WOW!!!!!!!

difference: 'tag' may be needed with
out_exec_filter
simple solution: if not exists, ignore.
12 2 4

What is distributed stream process toplogies like?

deliver archiver backup
servers
deliver

servers

servers
worker worker worker worker
worker worker worker worker worker
servers

servers serializer serializer

Redundancy and load balancing
HDFS
MUST be guaranteed anywhere. (Hoop Server)

12 2 4

Deliver nodes

servers
deliver

servers
Accept connections from web servers,
servers Copy messages and send to:
servers 1. archiver (and its backup)
2. convert workers (w/ load balancing)
servers serializer and ...
3. serializer

useful for casual worker append/remove
HDFS
(Hoop Server)

12 2 4

Worker nodes

servers Under load balancing,
deliver
workers as many as you want
servers

servers
servers


HDFS
(Hoop Server)

12 2 4

Serializer nodes

Receive converted data stream from workers,
servers
deliver
aggregate by services, and :
servers
1. write to storage(hfds/hoop)
servers
2. and...
useful to reduce overhead of storage from many
servers concurrent write operations

HDFS
(Hoop Server)

12 2 4

Watcher nodes

servers
deliver Watching data for
servers real-time workload repotings
and trouble notiﬁcations
servers
worker
1. for raw data from delivers
servers 2. for structured data from serializers

watcher HDFS
watcher (Hoop Server)

12 2 4

crouton by kbysmnr
break.
12 2 4

Implementation details

log agents on servers (scribeline)

deliver (copy, in_scribe, out_scribe, out_forward)

worker (in/out_forward, out_exec_filter)

serializer/hooper (in/out_forward, out_hoop)

watcher (in_forward, out_flowcounter, out_growthforecast)

12 2 4

log agent: scribeline

log delivery agent tool, python 2.4, scribe/thrift
easy to setup and start/stop
works with any httpd configuration updates
works with logrotate-ed log files
automatic delivery target failover/takeback

(NEW) Cluster support
(random select from server list)

https://github.com/tagomoris/scribe_line

12 2 4

From scribeline To deliver

deliver server (primary)
category: blog
message: RAW LOG ﬂuentd
(Apache combined + α)
in_scribe

scribeline scribe
servers in_scribe

ﬂuentd
deliver server (secondary)

12 2 4

deliver 01 (primary)


deliver 02 (secondary)

xNN servers

x8 ﬂuentd
per node
deliver 03 (primary for high throughput nodes)

12 2 4


deliver server (primary)
category: blog
message: RAW LOG ﬂuentd
(Apache combined + α)
in_scribe

scribeline
servers in_scribe

ﬂuentd
deliver server (secondary)

12 2 4

deliver node internal routing

deliver server (primary) x8 fluentd instances
deliver fluentd copy scribe.*
in_scribe out_scribe category: blog
host archive.server.local message: RAW LOG
add_prefix scribe remove_prefix scribe
add_newline true
remove_newline true

out_flowcounter (see later..)
time: received_at
tag: scribe.blog
roundrobin (see next)
message: RAW LOG

out_forward (see later with out_flowcounter..)

12 2 4

deliver node: roundrobin strategy to workers

roundrobin
x56 substore conﬁgurations (7workers x 8instances)
out_forward
server: worker01 port 24211
secondary

out_forward time: received_at
secondary tag: scribe.blog
server: worker03 port 24212 message: RAW LOG

out_forward
secondary

out_forward
secondary

12 2 4

From deliver To worker

deliver server worker server X
deliver fluentd worker fluentd Xn1
copy scribe.* in_forward
roundrobin time: received_at
tag: scribe.blog
out_forward message: RAW LOG

in_forward
time: received_at
tag: scribe.blog
message: RAW LOG
worker fluentd Yn2
worker server Y
12 2 4

worker node internal routing

worker server x8 worker instances, x1 serializer instance
worker fluentd serializer fluentd
in_forward out_exec_filter scribe.* in_forward
command: convert.sh
in_keys: tag,message
remove_prefix scribe out_hoop converted.blog
out_keys: ....... hoop_server servername.local
add_prefix: converted username
time_key: timefield path /on_hdfs/%Y%m%d/blog-%H.log
time_format: %Y%m%d%H%M%S
time:received_at out_hoop converted.news
tag: scribe.blog out_forward converted.* path /on_hdfs/%Y%m%d/news-%H.log
message: RAW LOG

time:written_time TAB separated
tag: converted.blog text data
[many data fields]
12 2 4

out_exec_ﬁlter (review.)

1. fork and exec 'command' program

2. write data to child process stdin as TAB separated
fields specified by 'out_keys' (for tag, remove_prefix available)

3. read data from child process stdout as TAB
separated fields named by 'in_keys' (for tag, add_prefix
available)

4. set message's timestamp by 'time_key' value in
parsed data as format specified by 'time_format'

12 2 4

out_exec_filter behavior details time: 2012/02/04 17:50:35
tag: converted.blog
path:... agent:...
worker fluentd referer:... flag1:TRUE

out_exec_filter scribe.*
command: convert.sh in_keys: tag,message remove_prefix: scribe
out_keys: ....... add_prefix: converted
time_key: timefield time_format: %Y%m%d%H%M%S
time: received_at
tag: scribe.blog
message: RAW LOG

blog RAWLOG blog 20120204175035 field1 field2.....

stdin stdout
Forked Process (convert.sh -> perl convert.pl)

12 2 4

From serializer To HDFS (Hoop)

worker server Hadoop NameNode
serializer ﬂuentd Hoop Server
HTTP
in_forward out_hoop converted.blog
hoop_server servername.local
username
path /on_hdfs/%Y%m%d/blog-%H.log
time:written_time
tag: converted.blog
[many data ﬁelds] out_hoop converted.news
path /on_hdfs/%Y%m%d/news-%H.log TAB separated
text data

HDFS
12 2 4

Overview deliver node cluster

servers
deliver

servers

servers
servers


worker node cluster
HDFS
(Hoop Server)

12 2 4

crouton by kbysmnr

12 2 4

Trafﬁcs: Bytes/sec (on deliver 2/3-4)

• bytes

12 2 4

Trafﬁcs: Messages/sec (on deliver 2/3-4)

• counts

12 2 4

Trafﬁc/CPU/Load/Memory: deliver nodes (2/3-4)

12 2 4

Traffics: workers network traffics total

• total network traffics

12 2 4

Trafﬁc/CPU/Load/Memory: a worker (2/3-4)

12 2 4

Fluentd stream processing

Finally, works fine, now
Log conversion latency dramatically
reduced

Many useful plugins for monitoring are waiting
shipped

Hundreds of cool features to implement are also
waiting for us!

12 2 4

Thank you!

crouton by kbysmnr

12 2 4

crouton by kbysmnr

Appendix

12 2 4

input traffics: by fluent-plugin-flowcounter

deliver server (primary) x8 fluentd instances
deliver fluentd copy scribe.*
in_scribe out_scribe category: blog
host archive.server.local message: RAW LOG
add_prefix scribe remove_prefix scribe
add_newline true
remove_newline true

out_flowcounter (see later..)
time: received_at
tag: scribe.blog
roundrobin (see next)
message: RAW LOG

out_forward (see later with out_flowcounter..)

12 2 4

bytes/messages counting on ﬂuentd

1. 'out_flowcounter' counts input message and its
size (specified fields) and its rate (/sec)

2. Counting results emitted per minute/hour/day

3. Worker fluentd sends results to 'Watcher' node
over out_forward

4. Watcher receives counting results, and pass to
'out_growthforecast'.

'GrowthForecast' is graph drawing tool with REST
API for data registration, by kazeburo

12 2 4

Why not out_forward roundrobin in deliver?

out_forward roundrobin is per buffer
flushing !
(per buffer size, or flush_interval)

For high throughput stream,
this unit is too large.
We needs roundrobin per 'emit'.
12 2 4

From worker To serializer: details

worker server x8 worker instances, x1 serializer instance
worker ﬂuentd serializer ﬂuentd
out_forward converted.* in_forward
server: localhost
secondary: worker1, worker2, worker3, worker4
worker5, worker6, worker7

normally send to localhost
in trouble, balance all traffic to all
other worker's serializers
12 2 4

Software list:
scribed: github.com/facebook/scribe/

scribeline: github.com/tagomoris/scribe_line

fluent-plugin-scribe: github.com/fluent/fluent-plugin-scribe

Hoop: http://cloudera.github.com/hoop/docs/latest/ServerSetup.html

fluent-plugin-hoop: github.com/fluent/fluent-plugin-hoop

GrowthForecast: github.com/kazeburo/growthforecast

fluent-plugin-growthforecast: github.com/tagomoris/fluent-plugin-growthforecast

fluent-plugin-flowcounter: github.com/tagomoris/fluent-plugin-flowcounter

12 2 4

Distributed Stream Processing on Fluentd / #fluentd

More Related Content

Distributed Stream Processing on Fluentd / #fluentd