Slides for a solution, developed entirely in Go, for real-time log analysis at scale using Mesos, Docker, Kafka, Spark, Cassandra and Solr (DataStax Enterprise Edition). Many organizations need or want real-time log analysis, where you can see within a second what is happening across your entire infrastructure. With the hardware and software systems available today, you can develop, build and run these solutions as a service.
Real-Time Log Analysis with Apache Mesos, Kafka and Cassandra
5. Mesos Papers
Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
http://static.usenix.org/event/nsdi11/tech/full_papers/Hindman_new.pdf
Google Borg - https://research.google.com/pubs/pub43438.html
Google Omega: flexible, scalable schedulers for large compute clusters
http://eurosys2013.tudos.org/wp-content/uploads/2013/paper/Schwarzkopf.pdf
10. Fine Grained Resource Elasticity
"If people knew how low it really is, we’d all get fired."
https://gigaom.com/2013/11/30/the-sorry-state-of-server-utilization-and-the-impending-post-hypervisor-era/
17. Kafka papers
Apache Kafka was first open sourced by LinkedIn in 2011
Papers
● Building a Replicated Logging System with Apache Kafka http://www.vldb.org/pvldb/vol8/p1654-wang.pdf
● Kafka: A Distributed Messaging System for Log Processing http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
● Building LinkedIn’s Real-time Activity Data Pipeline http://sites.computer.org/debull/A12june/pipeline.pdf
● The Log: What Every Software Engineer Should Know About Real-time Data's Unifying Abstraction http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
http://kafka.apache.org/
28. Kafka on Mesos
• smart broker.id assignment.
• preservation of broker placement (through constraints and/or new features).
• ability to make configuration changes.
• rolling restarts (for things like configuration changes).
• scaling the cluster up and down with automatic, programmatic and manual options.
• smart partition assignment via constraints vis-à-vis roles, resources and attributes.
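As an illustration of the rolling-restart idea only, here is a minimal Go sketch. The `Broker` interface, `RollingRestart` function and `fakeBroker` type are hypothetical names invented for this example; they are not part of the Kafka on Mesos scheduler, and the single `Healthy()` check stands in for whatever readiness polling a real scheduler would do.

```go
package main

import (
	"errors"
	"fmt"
)

// Broker is a hypothetical handle for one Kafka broker (an assumption for
// this sketch, not a type from the Kafka on Mesos project).
type Broker interface {
	ID() int
	Stop() error
	Start() error
	Healthy() bool
}

// ErrUnhealthy is returned when a restarted broker fails its health check.
var ErrUnhealthy = errors.New("broker did not become healthy")

// RollingRestart restarts brokers one at a time and aborts on the first
// failure, so at most one broker is down at any moment.
func RollingRestart(brokers []Broker) error {
	for _, b := range brokers {
		if err := b.Stop(); err != nil {
			return fmt.Errorf("stop broker %d: %w", b.ID(), err)
		}
		if err := b.Start(); err != nil {
			return fmt.Errorf("start broker %d: %w", b.ID(), err)
		}
		// Simplification: one check instead of polling with a timeout.
		if !b.Healthy() {
			return fmt.Errorf("broker %d: %w", b.ID(), ErrUnhealthy)
		}
	}
	return nil
}

// fakeBroker is an in-memory stand-in used only to exercise the sketch.
type fakeBroker struct {
	id       int
	running  bool
	restarts int
}

func (f *fakeBroker) ID() int       { return f.id }
func (f *fakeBroker) Stop() error   { f.running = false; return nil }
func (f *fakeBroker) Start() error  { f.running = true; f.restarts++; return nil }
func (f *fakeBroker) Healthy() bool { return f.running }

func main() {
	brokers := []Broker{
		&fakeBroker{id: 0, running: true},
		&fakeBroker{id: 1, running: true},
	}
	if err := RollingRestart(brokers); err != nil {
		fmt.Println("rolling restart failed:", err)
		return
	}
	fmt.Println("all brokers restarted")
}
```

Stopping at the first failure keeps the rest of the cluster untouched, which is the usual reason to restart brokers sequentially rather than all at once.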
29. CLI & REST API
• scheduler - starts the scheduler.
• broker
– add - adds one or more brokers to the cluster.
– update - changes the resources, constraints or broker properties of one or more brokers.
– remove - takes a broker out of the cluster.
– start - starts a broker up.
– stop - this can be either a graceful shutdown or a forced kill (./kafka-mesos.sh help stop)
• topic
– list - list topics in cluster
– add - add new topics in cluster
– update - change topics in cluster
– rebalance - allows you to rebalance a cluster either by selecting the brokers or topics to rebalance. Manual assignment is still possible using the Apache Kafka project tools. Rebalance can also change the replication factor on a topic.
• help - ./kafka-mesos.sh help || ./kafka-mesos.sh help {command}
32. Consume from Kafka → Write to Cassandra
Implement the CQL write here:
https://github.com/stealthly/go_kafka_client/blob/master/consumers/consumers.go#L186-L194
using https://github.com/gocql/gocql
The Go Kafka Client fans out work processing, and a rebalance doesn't upset consumers that are already reading.
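A minimal sketch of the consume-and-write pattern, using only the Go standard library so it runs without a Kafka or Cassandra cluster. The `LogMessage`, `Writer`, `FanOut` and `memoryWriter` names are hypothetical and are not taken from go_kafka_client; a channel stands in for the Kafka message stream. In a real deployment, a `Writer` implementation could wrap a gocql session and run a CQL INSERT with `session.Query(...).Exec()`.

```go
package main

import (
	"fmt"
	"sync"
)

// LogMessage stands in for a message consumed from Kafka (hypothetical type).
type LogMessage struct {
	Key   string
	Value string
}

// Writer abstracts the storage write. A Cassandra-backed implementation
// could wrap a gocql session and call session.Query(...).Exec().
type Writer interface {
	Write(m LogMessage) error
}

// FanOut starts `workers` goroutines that drain msgs and write each message.
// It returns the number of successful writes once msgs is closed and drained.
func FanOut(msgs <-chan LogMessage, w Writer, workers int) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	written := 0
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for m := range msgs {
				if err := w.Write(m); err != nil {
					continue // a real consumer would retry or hand off the failure
				}
				mu.Lock()
				written++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return written
}

// memoryWriter is an in-memory Writer used only for this example.
type memoryWriter struct {
	mu   sync.Mutex
	rows []LogMessage
}

func (mw *memoryWriter) Write(m LogMessage) error {
	mw.mu.Lock()
	defer mw.mu.Unlock()
	mw.rows = append(mw.rows, m)
	return nil
}

func main() {
	msgs := make(chan LogMessage, 4)
	for i := 0; i < 4; i++ {
		msgs <- LogMessage{Key: fmt.Sprintf("host-%d", i), Value: "GET /index.html 200"}
	}
	close(msgs)

	mw := &memoryWriter{}
	fmt.Println("rows written:", FanOut(msgs, mw, 2))
}
```

Keeping the write behind an interface lets the worker pool be tested in memory and lets the Cassandra write be swapped in without touching the fan-out logic.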