
1

The Future of Apache Storm
Hadoop Summit 2016 - Dublin
P. Taylor Goetz, Hortonworks
@ptgoetz

2

About Me
• Tech Staff @ Hortonworks
• PMC Chair, Apache Storm
• ASF Member
• PMC, Apache Incubator
• PMC, Apache Arrow
• PMC, Apache Kylin
• Mentor/PPMC, Apache Eagle (Incubating)
• Mentor/PPMC, Apache Metron (Incubating)
• Mentor/PPMC, Apache Apex (Incubating)

3

Apache Storm 0.9.x
Storm moves to Apache

4

Apache Storm 0.9.x
• First official Apache Release
• Storm becomes an Apache TLP
• ZeroMQ (0mq) to Netty for inter-worker communication
• Expanded Integration (Kafka, HDFS, HBase)
• Dependency conflict reduction (It was a start ;) )

5

Apache Storm 0.10.x
Enterprise Readiness

6

Apache Storm 0.10.x
• Security, Multi-Tenancy
• Enable Rolling Upgrades
• Flux (declarative topology wiring/configuration; see the sketch after this list)
• Partial Key Groupings
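
For a sense of what Flux looks like, here is a minimal sketch of a topology definition; the class names are hypothetical and only the YAML structure is the point:

name: "word-count"
config:
  topology.workers: 1
spouts:
  - id: "sentence-spout"
    className: "my.topo.SentenceSpout"   # hypothetical
    parallelism: 1
bolts:
  - id: "count-bolt"
    className: "my.topo.WordCountBolt"   # hypothetical
    parallelism: 2
streams:
  - from: "sentence-spout"
    to: "count-bolt"
    grouping:
      type: FIELDS
      args: ["word"]

Deployed with something like: storm jar topo.jar org.apache.storm.flux.Flux --local topology.yaml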

7

Apache Storm 0.10.x
• Improved logging (Log4j 2)
• Streaming Ingest to Apache Hive
• Azure Event Hubs Integration
• Redis Integration
• JDBC Integration

8

Apache Storm 1.0
Maturity and Improved Performance
Release Date: April 12, 2016

9

Pacemaker
Heartbeat Server

10

Pacemaker
• Replaces Zookeeper for Heartbeats
• In-Memory key-value store
• Allows Scaling to 2k-3k+ Nodes
• Secure: Kerberos/Digest Authentication
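
Enabling Pacemaker is a cluster-side change. A sketch of the relevant storm.yaml entries (host and port are illustrative):

storm.cluster.state.store: "org.apache.storm.pacemaker.pacemaker_state_factory"
pacemaker.host: "pacemaker.example.com"
pacemaker.port: 6699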

11

Pacemaker
• Compared to Zookeeper:
• Less Memory/CPU
• No Disk
• Avoids the overhead of maintaining consistency

12

Distributed Cache API

13

Distributed Cache API
• Topology resources:
• Dictionaries, ML Models, Geolocation Data, etc.
• Typically packaged in topology jar
• Fine for small files
• Large files negatively impact topology startup time
• Immutable: Changes require repackaging and deployment

14

Distributed Cache API
• Allows sharing of files (BLOBs) among topologies
• Files can be updated from the command line
• Allows for files from several KB to several GB in size
• Files can change over the lifetime of the topology
• Allows for compression (e.g. zip, tar, gzip)

15

Distributed Cache API
• Two implementations: LocalFsBlobStore and HdfsBlobStore
• Local implementation supports a replication factor (not needed for the HDFS-backed implementation)
• Both support ACLs

16

Distributed Cache API
Creating a blob:

storm blobstore create --file dict.txt --acl o::rwa --repl-fctr 2 key1

Making it available to a topology:

storm jar topo.jar my.topo.Class test_topo -c 'topology.blobstore.map={"key1":{"localname":"dict.txt","uncompress":"false"}}'

17

High Availability Nimbus

18

Before HA Nimbus

19

HA Nimbus

20

HA Nimbus - Failover

21

HA Nimbus - Failover

22

HA Nimbus
• Increase overall availability of Nimbus
• Nimbus hosts can join/leave at any time
• Leverages Distributed Cache API
• Topology JAR, Config, and Serialized Topology uploaded to the Distributed Cache
• Replication guarantees availability of all files
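
Configuring HA Nimbus amounts to listing the candidate hosts and a replication requirement. A storm.yaml sketch (hostnames illustrative):

nimbus.seeds: ["nimbus1.example.com", "nimbus2.example.com"]
topology.min.replication.count: 2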

23

Native Streaming Windows

24

Streaming Windows
• Window Length - Duration or Tuple Count
• Slide Interval - How often to advance the window
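
As a sketch, a 30-second window sliding every 10 seconds, and a 100-tuple tumbling window, might be declared like this (the bolt classes and stream names are hypothetical; the builder methods come from BaseWindowedBolt):

import org.apache.storm.topology.base.BaseWindowedBolt.Count;
import org.apache.storm.topology.base.BaseWindowedBolt.Duration;

// Sliding window: 30s long, advances every 10s (windows overlap)
builder.setBolt("rolling-avg", new AvgBolt()
        .withWindow(Duration.seconds(30), Duration.seconds(10)), 2)
        .shuffleGrouping("spout");

// Tumbling window: one window per 100 tuples (no overlap)
builder.setBolt("batch-sum", new SumBolt()
        .withTumblingWindow(Count.of(100)), 2)
        .shuffleGrouping("spout");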

25

Sliding Windows
Windows can overlap

26

Tumbling Windows
Windows do not overlap

27

Streaming Windows
• Timestamps (Event Time, Ingestion Time and Processing Time)
• Out of Order Tuples
• Watermarks
• Window State Checkpointing
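
Event-time processing hangs off the same builder API. A sketch using a tuple field as the timestamp (field and component names hypothetical, Duration as in the previous sketch):

// Window on the "ts" tuple field (event time), tolerate 5s of
// out-of-order arrival, and emit watermarks once per second
builder.setBolt("event-window", new AvgBolt()
        .withWindow(Duration.minutes(1), Duration.seconds(30))
        .withTimestampField("ts")
        .withLag(Duration.seconds(5))
        .withWatermarkInterval(Duration.seconds(1)), 2)
        .shuffleGrouping("spout");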

28

State Management
Stateful Bolts with Automatic Checkpointing

29

What you see.
State Management

30

State Management
public class WordCountBolt extends BaseStatefulBolt<KeyValueState> {
    private KeyValueState wordCounts;
    private OutputCollector collector;

    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    public void initState(KeyValueState state) {
        this.wordCounts = state;
    }

    public void execute(Tuple tuple) {
        String word = tuple.getString(0);
        Integer count = (Integer) wordCounts.get(word, 0);
        count++;
        wordCounts.put(word, count);
        collector.emit(new Values(word, count));
    }
}

31

public class WordCountBolt extends BaseStatefulBolt<KeyValueState> {
    private KeyValueState wordCounts;
    private OutputCollector collector;

    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    public void initState(KeyValueState state) {
        this.wordCounts = state;
    }

    public void execute(Tuple tuple) {
        String word = tuple.getString(0);
        Integer count = (Integer) wordCounts.get(word, 0);
        count++;
        wordCounts.put(word, count);
        collector.emit(new Values(word, count));
    }
}
Initialize State
State Management

32

public class WordCountBolt extends BaseStatefulBolt<KeyValueState> {
    private KeyValueState wordCounts;
    private OutputCollector collector;

    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    public void initState(KeyValueState state) {
        this.wordCounts = state;
    }

    public void execute(Tuple tuple) {
        String word = tuple.getString(0);
        Integer count = (Integer) wordCounts.get(word, 0);
        count++;
        wordCounts.put(word, count);
        collector.emit(new Values(word, count));
    }
}
Read/Update State
State Management
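
By default, checkpointed state lives in memory; pointing it at a durable store is a configuration change. A sketch using the Redis-backed provider from storm-redis (values illustrative):

topology.state.provider: "org.apache.storm.redis.state.RedisKeyValueStateProvider"
topology.state.checkpoint.interval.ms: 1000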

33

State Management
Automatic Checkpointing

34

Checkpointing/Snapshotting
• Asynchronous Barrier Snapshotting (ABS) algorithm [1]
• Chandy-Lamport Algorithm [2]
[1] http://arxiv.org/pdf/1506.08603v1.pdf
[2] http://research.microsoft.com/en-us/um/people/lamport/pubs/chandy.pdf
State Management

35

Checkpointing/Snapshotting: What you see.
Storm State Management

36

Checkpointing/Snapshotting: What you get.
Storm State Management

37

Automatic Back Pressure

38

Automatic Back Pressure
• In previous Storm versions, the only way to throttle topologies was to enable ACKing and set topology.spout.max.pending.
• If you don’t require at-least-once guarantees, this imposed a significant performance penalty.**

** In Storm 1.0 this penalty is drastically reduced (more on this later)
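
The old throttle, for reference, was a single knob on the topology config. A minimal sketch (value illustrative):

Config conf = new Config();
// Cap on un-acked tuples in flight per spout task; requires ACKing enabled
conf.setMaxSpoutPending(1000);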

39

Automatic Back Pressure
• High/Low Watermarks (expressed as % of buffer size)
• Back Pressure thread monitors buffers
• If High Watermark reached, slow down Spouts
• If Low Watermark reached, stop throttling
• All Spouts Supported
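
Back pressure is driven by a topology-level switch plus cluster-side watermark settings in storm.yaml. A sketch with illustrative values (close to the shipped defaults, but verify against your version):

topology.backpressure.enable: true
backpressure.disruptor.high.watermark: 0.9
backpressure.disruptor.low.watermark: 0.4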

40

Resource Aware Scheduler
(RAS)

41

Resource Aware Scheduler
• Specify the resource requirements (Memory/CPU) for individual topology components (Spouts/Bolts)
• Memory: On-Heap / Off-Heap (if off-heap is used)
• CPU: Point system based on number of cores
• Resource requirements are per component instance (parallelism matters)

42

Resource Aware Scheduler
• CPU and Memory availability described in storm.yaml on each supervisor node, e.g.:

supervisor.memory.capacity.mb: 3072.0
supervisor.cpu.capacity: 400.0

• Convention for CPU capacity is to use 100 for each CPU core

43

Resource Aware Scheduler
Setting component resource requirements:
SpoutDeclarer spout = builder.setSpout("sp1", new TestSpout(), 10);
//set cpu requirement
spout.setCPULoad(20);
//set onheap and offheap memory requirement
spout.setMemoryLoad(64, 16);
BoltDeclarer bolt1 = builder.setBolt("b1", new MyBolt(), 3).shuffleGrouping("sp1");
//sets cpu requirement. Not necessary to set both CPU and memory.
//For requirements not set, a default value will be used
bolt1.setCPULoad(15);
BoltDeclarer bolt2 = builder.setBolt("b2", new MyBolt(), 2).shuffleGrouping("b1");
bolt2.setMemoryLoad(100);

44

Storm Usability Improvements
Enhanced Debugging and Monitoring of Topologies

45

Dynamic Log Level Settings

46

Dynamic Log Levels
• Set the log level for a running topology
• Via Storm UI and CLI
• Optional timeout after which changes will be reverted
• Logs searchable from Storm UI/Logviewer

47

Dynamic Log Levels
Via Storm UI:

48

Dynamic Log Levels
Via Storm CLI:
./bin/storm set_log_level [topology name] -l [logger_name]=[LEVEL]:[TIMEOUT]
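
For example, to raise one logger to DEBUG for 30 seconds on a topology named test_topo (logger name illustrative):

./bin/storm set_log_level test_topo -l my.topo.WordCountBolt=DEBUG:30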

49

Tuple Sampling
• No more debug bolts or Trident functions!
• In Storm UI: Select a Topology or component and click “Debug”
• Specify a sampling percentage (% of tuples to be sampled)
• Click on the “Events” link to view the sample log.

50

Distributed Log Search
• Search across all log files for a specific topology
• Search in archived (ZIP) logs
• Results include matches from all Supervisor nodes

51

Dynamic Worker Profiling
• Request worker profile data from Storm UI:
• Heap Dumps
• JStack Output
• JProfile Recordings
• Download generated files for off-line analysis
• Restart workers from UI

52

Supervisor Health Checks
• Identify Supervisor nodes that are in a bad state
• Automatically decommission bad nodes
• Simple shell script
• You define what constitutes “Unhealthy”
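
A health check is just an executable dropped into the configured check directory (storm.health.check.dir). A sketch, assuming the Hadoop-style convention Storm follows, where any output line beginning with ERROR marks the node unhealthy:

#!/bin/bash
# check_disk.sh - flag this node when the data disk is nearly full
USED=$(df --output=pcent /data | tail -1 | tr -d ' %')
if [ "$USED" -gt 95 ]; then
  echo "ERROR disk /data is ${USED}% full"
fi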

53

New Integrations
• Cassandra
• Solr
• Elasticsearch
• MQTT

54

Integration Improvements
• Kafka
• HDFS Spout
• Avro Integration for HDFS
• HBase
• Hive

55

Before I forget...

56

Performance

57

Up to 16x faster throughput.
Realistically ~3x; highly dependent on use case

58

> 60% Latency Reduction

59

Bear in mind performance varies widely depending on the use case.

60

The most important benchmarks
are the ones you do.

61

Storm 2.0
Underway @ Apache

62

Clojure to Java
Broadening the contributor base

63

Clojure to Java
Alibaba JStorm Contribution

64

Questions?

65

Thank you!
P. Taylor Goetz, Hortonworks
@ptgoetz
