© 2018 Bloomberg Finance L.P. All rights reserved.
Why My Streaming Job is Slow
A case study on profiling a Kafka Streams application
Kafka Summit 2019 London
Nishchay Sinha,
Lei Chen
Application in a nutshell
[Diagram: Market data (bid/ask/trade) enters a chain of Transformers, Stage-1 through Stage-N, with State, built on the Kafka Streams Processor API; the output is the Composite price.]
Initial Latency – ~10ms
Review configuration
• Persistent state store
• Cache enabled
• Changelog enabled
• Kryo serde
• Cache size
• Commit interval
• EOS disabled
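The application-wide items in this checklist (cache size, commit interval, EOS) map onto Kafka Streams configuration keys. The sketch below collects them in a plain `java.util.Properties`, using the string keys directly so that it has no library dependency. The application id, broker address, and every numeric value are placeholders chosen for illustration, not the settings used in the talk. The store-level items (persistent store, caching, changelog, serde) are normally chosen on the store builder rather than here.

```java
import java.util.Properties;

final class StreamsSettings {
    /** Builds an illustrative configuration; all values are assumptions. */
    static Properties build() {
        Properties props = new Properties();
        // Placeholder identifiers, not taken from the talk.
        props.put("application.id", "composite-pricer");
        props.put("bootstrap.servers", "localhost:9092");
        // Total bytes of record cache shared across stream threads (illustrative value).
        props.put("cache.max.bytes.buffering", String.valueOf(64L * 1024 * 1024));
        // Commit frequency in milliseconds; caches are flushed on commit (illustrative value).
        props.put("commit.interval.ms", "100");
        // EOS disabled corresponds to the at-least-once guarantee.
        props.put("processing.guarantee", "at_least_once");
        return props;
    }
}
```

The resulting `Properties` object is what would be passed to the `KafkaStreams` constructor alongside the topology. Newer Kafka releases may prefer a different key for the cache size, so check the version in use.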
Profiler to the rescue
• Profilers: VisualVM/YourKit/async-profiler/etc.
• async-profiler
o https://github.com/jvm-profiling-tools/async-profiler
o Profiles CPU and memory
o Integrated with IntelliJ IDEA 2019
Where is Time Spent?
Under the Hood: Persistent State Store
[Diagram: store layers on the Put/Get path. MeteredKeyValueStore ((de)serialization); CachingKeyValueStore (TreeMap, LRU cache, flush by commit interval); ChangelogBytesStore (changelog); RocksDBBytesStore (segments), reached on a cache miss.]
withCachingEnabled()?
• A performance optimization
• Bytes in, Bytes out!
• For Kafka Streams, not RocksDB
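"Bytes in, Bytes out" means the cache sits below the serialization layer, so a cache hit still pays for deserialization. The toy model below illustrates that cost with counters. It is not Kafka Streams code: `BytesCachedStore` is a hypothetical class, strings stand in for application objects, UTF-8 encoding stands in for the Kryo serde, a `TreeMap` stands in for RocksDB, and eviction and flush-on-commit are left out.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a serde layer on top of a byte cache on top of a persistent store. */
final class BytesCachedStore {
    int serializations = 0;
    int deserializations = 0;
    int backendReads = 0;

    // Stand-in for the byte-oriented record cache (eviction omitted).
    private final Map<String, byte[]> cache = new HashMap<>();
    // Stand-in for the persistent store.
    private final Map<String, byte[]> backend = new TreeMap<>();

    void put(String key, String value) {
        byte[] bytes = serialize(value);   // paid on every put, before the cache sees the value
        cache.put(key, bytes);
        backend.put(key, bytes);           // simplification: written through instead of on flush
    }

    String get(String key) {
        byte[] bytes = cache.get(key);
        if (bytes == null) {               // cache miss: fall through to the backend
            backendReads++;
            bytes = backend.get(key);
            if (bytes == null) {
                return null;
            }
            cache.put(key, bytes);
        }
        return deserialize(bytes);         // paid on every get, even on a cache hit
    }

    private byte[] serialize(String value) {
        serializations++;
        return value.getBytes(StandardCharsets.UTF_8);
    }

    private String deserialize(byte[] bytes) {
        deserializations++;
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

After one `put` and three `get` calls on the same key, the counters read one serialization, three deserializations, and zero backend reads: the cache removes the disk access but not the serde work.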
Revisit our application
[Diagram: Market data (bid/ask/trade) enters a chain of Transformers, Stage-1 through Stage-N, with State, built on the Kafka Streams Processor API; the output is the Composite price.]
Solution 1 – Switch to InMemoryKeyValueStore
[Diagram: GET/PUT -> TreeMap, with ChangeLogs]
Pros
• ~10 times faster with logging disabled (slower with logging enabled)
Cons
• Cannot handle state larger than available RAM
Why bytes conversion? Can it be deferred?
Solution 2 – Move serde to after the caching store
[Diagram: GET/PUT -> Cache -> (De)Serialization -> Persistence and Changelogs, via async flush]
Pros
• Faster
• Transparent to application
Cons
• More memory
• Cache size measurement
• Increased commit-time burden
• Library change
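The idea behind Solution 2 can be sketched as a cache that holds live objects and serializes only when dirty entries are flushed. This is a toy model under stated assumptions, not the library change the speakers evaluated: `DeferredSerdeStore` is a hypothetical class, strings stand in for application objects, and the flush here is synchronous where the slide shows an asynchronous one.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model: objects are cached as-is and serialized only at flush time. */
final class DeferredSerdeStore {
    int serializations = 0;
    int deserializations = 0;

    private final Map<String, String> cache = new HashMap<>();        // live objects
    private final Map<String, String> dirty = new LinkedHashMap<>();  // written since last flush
    private final Map<String, byte[]> persisted = new HashMap<>();    // stand-in for store + changelog

    void put(String key, String value) {
        cache.put(key, value);   // no serialization on the hot path
        dirty.put(key, value);   // a later put to the same key overwrites the pending value
    }

    String get(String key) {
        String hit = cache.get(key);
        if (hit != null) {
            return hit;          // hit: the object is returned without deserialization
        }
        byte[] bytes = persisted.get(key);
        if (bytes == null) {
            return null;
        }
        deserializations++;
        String value = new String(bytes, StandardCharsets.UTF_8);
        cache.put(key, value);
        return value;
    }

    /** All deferred serialization happens here, which is the added commit-time burden. */
    void flush() {
        for (Map.Entry<String, String> e : dirty.entrySet()) {
            serializations++;
            persisted.put(e.getKey(), e.getValue().getBytes(StandardCharsets.UTF_8));
        }
        dirty.clear();
    }
}
```

Three puts to one key followed by a flush cost a single serialization, since only the latest value is written. The trade-off matches the cons on the slide: object sizes are hard to measure, and the serialization work moves to flush time.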
Solution 3 – Cache both Bytes and Object
[Diagram: GET/PUT -> Cache (Object and Bytes) -> Persistence and Changelogs, via async flush]
Pros
• Faster
• Get is optimized
• Transparent to application
Cons
• More memory
• Cache size measurement
• Put not optimized
• Library change
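Solution 3 keeps the eager serialization on put but stores the object next to its bytes, so that reads skip deserialization. A minimal sketch of that trade-off follows. `DualCacheStore` and its `Entry` type are hypothetical names, strings stand in for application objects, dirty tracking is omitted, and the flush is synchronous for simplicity.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Toy model: each cache entry holds both the serialized bytes and the object. */
final class DualCacheStore {
    private static final class Entry {
        final byte[] bytes;
        final String object;

        Entry(byte[] bytes, String object) {
            this.bytes = bytes;
            this.object = object;
        }
    }

    int serializations = 0;
    int deserializations = 0;

    private final Map<String, Entry> cache = new HashMap<>();
    private final Map<String, byte[]> persisted = new HashMap<>();  // stand-in for store + changelog

    void put(String key, String value) {
        serializations++;                                  // put is not optimized
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        cache.put(key, new Entry(bytes, value));
    }

    String get(String key) {
        Entry hit = cache.get(key);
        if (hit != null) {
            return hit.object;                             // get is optimized: no deserialization
        }
        byte[] bytes = persisted.get(key);
        if (bytes == null) {
            return null;
        }
        deserializations++;
        String value = new String(bytes, StandardCharsets.UTF_8);
        cache.put(key, new Entry(bytes, value));
        return value;
    }

    /** Flush reuses the bytes produced at put time, so it adds no serde work. */
    void flush() {
        for (Map.Entry<String, Entry> e : cache.entrySet()) {
            persisted.put(e.getKey(), e.getValue().bytes);
        }
    }
}
```

Holding both representations is where the extra memory on the cons list comes from.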
What’s the best way to pass transient state across processors?
Solution 4 – Application layer cache
[Diagram: GET/PUT -> Application cache -> Kafka Streams State]
Pros
• Faster
• More flexible caching/flushing strategy
• Coarse-grained size estimator
• No library change
Cons
• Need to flush explicitly
• More memory
CacheableStateStore - explained
• CacheableStateStore extends KeyValueStore[K, V]
• persistentKeyValueStore as backend
• On top of Guava cache
• Tunable per state store, not global setting
• Defers serialization until the end of the topology
• Partition specific
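The shape of such an application-layer cache can be sketched as an object cache in front of a backing store, with a per-instance size limit and an explicit flush. This is not the speakers' CacheableStateStore. `AppLayerCache`, `BackingStore`, and `MapBackingStore` are hypothetical names; an access-ordered `LinkedHashMap` stands in for the Guava cache, and an entry count stands in for the coarse-grained size estimator.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for the persistent KeyValueStore underneath. */
interface BackingStore<K, V> {
    V get(K key);
    void put(K key, V value);
}

/** In-memory test double that counts writes reaching the backend. */
final class MapBackingStore<K, V> implements BackingStore<K, V> {
    final Map<K, V> data = new HashMap<>();
    int writes = 0;

    public V get(K key) { return data.get(key); }
    public void put(K key, V value) { writes++; data.put(key, value); }
}

/** Object cache in front of a backing store; the limit is tunable per instance. */
final class AppLayerCache<K, V> {
    private final BackingStore<K, V> backend;
    private final int maxEntries;   // coarse size estimate: entries, not bytes
    private final LinkedHashMap<K, V> cache = new LinkedHashMap<>(16, 0.75f, true);
    private final Set<K> dirty = new HashSet<>();

    AppLayerCache(BackingStore<K, V> backend, int maxEntries) {
        this.backend = backend;
        this.maxEntries = maxEntries;
    }

    V get(K key) {
        V value = cache.get(key);
        if (value == null) {
            value = backend.get(key);       // only a miss touches the serializing store
            if (value != null) {
                cache.put(key, value);
                evictIfNeeded();
            }
        }
        return value;
    }

    void put(K key, V value) {
        cache.put(key, value);
        dirty.add(key);
        evictIfNeeded();
    }

    /** Must be called explicitly, for example at the end of the topology. */
    void flush() {
        for (K key : dirty) {
            backend.put(key, cache.get(key));
        }
        dirty.clear();
    }

    private void evictIfNeeded() {
        Iterator<Map.Entry<K, V>> it = cache.entrySet().iterator();
        while (cache.size() > maxEntries && it.hasNext()) {
            Map.Entry<K, V> eldest = it.next();   // least recently used entry
            if (dirty.remove(eldest.getKey())) {
                backend.put(eldest.getKey(), eldest.getValue());  // write back before dropping
            }
            it.remove();
        }
    }
}
```

Because each instance takes its own `maxEntries`, the limit is set per store rather than globally. In a real topology one instance would exist per task, which is one way to read "partition specific".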
Profile again - The hot spot was gone!
Final Latency – ~1ms
Random latency spikes?
• Punctuators
• Commit
• State directory
• GC pressure
CMS GC to G1 GC
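When changing collectors, it helps to confirm which one the JVM is actually running and how much time it has spent collecting. The sketch below uses the standard `java.lang.management` API for that. `GcReport` is a hypothetical helper, and the collector is assumed to be selected with a JVM flag such as `-XX:+UseG1GC`; the bean names reported vary by JVM version and collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcReport {
    /** Returns one line per collector with its cumulative count and time. */
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", totalMs=" + gc.getCollectionTime());
        }
        return lines;
    }
}
```

Logging this at startup and periodically gives a rough view of GC pressure to set beside the latency measurements.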
Does this apply to you?
Does this apply only to Kafka Streams?
Questions?
