Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
SlideShare a Scribd company logo
Apache Kafka Best
Practices
Manikumar Reddy
@omkreddy
2 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Apache Kafka
 Core APIs
– The Producer API
– The Consumer API
– The Connector API
– The Streams API
 Broad classes of applications
– Building real-time streaming data pipelines
– Building real-time streaming applications
– core building block in other data systems
3 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Key Concepts and Terminology
4 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Component Layout
5 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Hardware Guidance
Cluster Size Memory CPU Storage
Kafka Brokers 3+
24G+ (for small)
64GB+ (for large)
Multi- core
processors( 12 CPU+
core), Hyper
threading enabled
6+ x 1TB dedicated
disks( RAID or JBOD)
Zookeeper
3 (for small)
5 (for large)
8GB+ (for small)
24GB+ (for large)
2 core +
SSD for Transaction
logs
6 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
OS Tuning
 OS Page Cache
– Ex: Allocate to hold all the active segments of the log.
 File descriptor limits : >100k
 less swapping
 Tcp tuning
 JVM Configs
– Java 8 with G1 Collector
– 6-8 GB heap
7 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Kafka Disk Storage
 Use multiple disk spindles, dedicated to kafka
 JBOD vs RAID10
 JBOD
– Gives all the disk I/O
 JBOD Limitations
– any disk failure causes an unclean shutdown and requires lengthy recovery
– data is not distributed consistently across disks
– Multiple directories
 KIP-112/113
– necessary tools for users to manage JBOD
– Intelligent partition assignment
– On disk failure, broker can serve replicas on the good disks
– re-assign replicas between disks of the same broker
8 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
RAID
 RAID10
– Can survive single disk failure
– Performance and protection
– balance load across disks
– Single mount point
– Performance hit and reduces the space
 File System
– EXT or XFS
– SSD
– Issues on NFS.
– SAN, NAS
9 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Basic Monitoring
 CPU Load
 Network Metrics
 File Handle Usage
 Disk Space
 Disk I/O Performance
 Garbage Collection
 ZooKeeper Monitoring
10 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Kafka Replication
 Partition has replicas – Leader replica, Follower replicas
 Leader maintains in-sync-replicas (ISR)
– replica.lag.time.max.ms, num.replica.fetchers
– min.insync.replica – used by producer to ensure greater durability
https://www.slideshare.net/junrao/kafka-replication-apachecon2013
11 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Under Replicated Partitions
 Number of partitions which are not fully replicated within the cluster
 Mbean - kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions
 ISR Shrink/Expand Rate
 Under Replicated Partitions
– Lost Broker?
– Controller Issues
– Zookeeper Issues
– Network Issues
 Solutions
– Tune the ISR settings
– Expand brokers
12 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Controller
 Manages Partitions Life cycle
 Avoid controller's ZK session expires
– Soft failures – ISR Churn/Under replicated partitions
– ZK Server performance
– Long GC pauses on Broker
– Bad network configuration
 Monitoring
– Mbean : kafka.controller:type=KafkaController,name=ActiveControllerCount
– only one broker in the cluster should have 1
– LeaderElectionRate
13 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Unclean leader election
 Enable replicas not in the ISR set to be elected as leader
 Availability vs correctness
– By-default kafka chooses availability
 Monitoring
– Mbean : kafka.controller:type=ControllerStats,name=UncleanLeaderElectionsPerSec
 Default will be changed in next release
14 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Broker Configs
 log.retention.{ms, minutes, hours} , log.retention.bytes
 message.max.bytes, replica.fetch.max.bytes
 delete.topic.enable
 unclean.leader.election.enable = false
 min.insync.replicas = 2
 replica.lag.time.max.ms, num.replica.fetchers
 replica.fetch.response.max.bytes
 zookeeper.session.timeout.ms = 30s
 num.io.threads
15 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Cluster Sizing
 Broker Sizing
– Partition count on each broker (<2K)
– Keep partition size on disk manageable (under 25GB per partition )
 Cluster Size (no. of brokers)
– how much retention we need
– how much traffic cluster is getting
 Cluster Expansion
– Disk usage on the log segments partition should stay under 60%
– Network usage on each broker should stay under 75%
 Cluster Monitoring
– Keep cluster balanced
– Ensure that partitions of a topic are fairly distributed across brokers
– Ensure that nodes in a cluster are not running out of disk and network
16 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Broker Monitoring
 Partition Counts
– Mbean: kafka.server:type=ReplicaManager,name=PartitionCount
 Leader replica counts
– Mbean: kafka.server:type=ReplicaManager,name=LeaderCount
 ISR Shrink Rate/ISR expansion rate
– kafka.server:type=ReplicaManager,name=IsrExpandsPerSec
 Message in rate/Byte in rate/Byte out rate
 NetworkProcessorAvgIdlePercent
 RequestHandlerAvgIdlePercent
17 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Topic Sizing
 No. of partitions
– Have at least as many partitions as there are consumers in the largest group
– topic is very busy – more partitions
– Keep partition size on disk manageable (under 25GB per partition )
– Take into account any other application requirements
– Special use cases – single partition
 Keyed messages
– enough partitions to deal with future growth
 expanding partitions
– whenever the size of the partition on disk is larger than threshold
18 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Choosing Partitions
 Based on throughput requirements one can pick a rough number of partitions.
– Lets call the throughput from producer to a single partition is P
– Throughput from a single partition to a consumer is C
– Target throughput is T
– At least max (T/P, T/C)
 More Partitions
– More open file handles
– May increase unavailability
– May increase end-to-end latency
– More memory for clients
19 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Quotas
 Protect from bad clients and maintain SLAs
 byte-rate thresholds on produce and fetch requests
 can be applied to (user, client-id), user or client-id groups.
 Server delays the responses
 Broker Metrics for monitoring – throttle-rate, byte-rate
 replica.fetch.response.max.bytes
– Limit memory usage of replica fetch response
 Limiting bandwidth usage during data migration
– kafka-reassign-partitions.sh -- -throttle option
20 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Kafka Producer
 User new java based clients
 Test in your Environment
– kafka-producer-perf-test.sh
 Memory
 CPU
 Batch Compression
 Avoid large messages
– creates more memory pressure
– slows down the brokers
21 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Critical Configs
 batch.size
– size based batching
– larger size -> high throughput, higher latency
 linger.ms
– time based batching
– larger size -> high throughput, higher latency
 max.in.flight.requests.per.connection
– Better throughput, affects ordering
 compression.type
– adding more user threads can help throughput
 acks
– Affects message durability
22 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Performance tuning
 If throughput < network capacity
– Add more user threads
– Increase batch size
– Add more producers instances
– Add more partitions
 Latency when acks = -1
– Increase num.replica.fetchers
 Cross datacenter data transfer
– Tune socket buffer settings, OS tcp buffer settings
23 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Producer Monitoring
 batch-size-avg
 compression-rate-avg
 waiting-threads
 buffer-available-bytes
 record-queue-time-max
 record-send-rate
 records-per-request-avg
24 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Kafka Consumer
 Test in your Environment
– kafka-consumer-perf-test.sh
 Throughput Issues
– not enough partitions
– OS Page Cache - allocate enough to hold all the messages for your consumers for say, 30s
– Application/Processing logic
 Offsets topic
– __consumer_offsets
– offsets.topic.replication.factor
– offsets.retention.minutes
– Monitor ISR, topic size
 Slow offset commits
– commit async, manual commits
25 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Consumer Configs
 fetch.min.bytes and fetch.max.wait.ms
 max.poll.interval.ms
 max.poll.records
 session.timeout.ms
 Consumer Rebalance
– check timeouts
– check processing times/logic
– GC Issues
 Tune network settings
26 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Consumer Monitoring
 Whether or not the consumer is keeping up with the messages that are being produced
 Consumer Lag: Difference between the end of the log and the consumer offset
 Monitoring
– Metrics Monitoring - records-lag-max
– bin/kafka-consumer-groups.sh
– LinkedIn’s Burrow for consumer monitoring
 Decreasing Lag
– Analyze consumer - GC Issues, hung instance
– Add more consumer Instances
– increase the number of partitions and consumers
27 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
No data loss settings
 Producer
– block.on.buffer.full=true
– retries=Long.MAX_VALUE
– acks=all
– max.in.flight.requests.per.connection=1
– close producer
 Broker
– replication factor >= 3
– min.insync.replicas=2
– disable unclean leader election
 Consumer
– disable auto.offset.commit
– Commit offsets only after the messages are processed
28 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Authorizer - Ranger Auditing
29 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Kafka Mirror Maker
 Tool to mirror a source Kafka cluster into a target (mirror) Kafka cluster
30 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Kafka Mirror Maker
 Run multiple mirroring processes
– high fault-tolerance
– high throughput
 --num.streams option to specify the number of consumer threads
– no.of threads in num.streams
31 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Kafka Mirror Maker
 Consumer and source cluster socket buffer sizes
– high value for the socket buffer size
– consumer's fetch size
– OS networking Tuning
 Source and Target Clusters are independent entities
– Can be different numbers of partitions
– offsets will not be the same.
– partitioning order is preserved on a per-key basis.
 Create topics in target cluster
 Monitor whether a mirror is keeping up
– Consumer Lag
 Running In Secure Clusters
– We recommend to use SSL
– We can run MM on source cluster
32 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Open source Operational Tools
 Ambari Metrics
– https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.2.0/bk_ambari-user-
guide/content/grafana_kafka_dashboards.html
 Removing brokers and rebalancing partitions in a cluster
– https://github.com/linkedin/kafka-tools
 Consumer Lag Monitoring
– Burrow (https://github.com/linkedin/Burrow)
 Kafka Manager - https://github.com/yahoo/kafka-manager
33 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Apache Kafka 0.10.2 release
 Includes 15 KIPs, over 200 bug fixes and improvements
 The newest Java Clients now support older brokers (0.10.0 and higher)
 Separation of Internal and External traffic
 Create Topic Policy
 Security Improvements
– Support for SASL/SCRAM mechanisms
– Dynamic JAAS configuration for Kafka clients
– Support for authentication of multiple Kafka clients in single JVM
 Producer and Consumer Improvements
 Connect API & Streams API improvements
34 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Thank You
35 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
References
 http://kafka.apache.org/documentation.html
 https://community.hortonworks.com/articles/80813/kafka-best-practices-1.html
 https://www.slideshare.net/JiangjieQin/producer-performance-tuning-for-apache-
kafka-63147600
 https://www.slideshare.net/ToddPalino/tuning-kafka-for-fun-and-profit
 https://www.slideshare.net/JiangjieQin/no-data-loss-pipeline-with-apache-kafka-
49753844
 https://www.slideshare.net/ToddPalino/putting-kafka-into-overdrive
 https://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-
kafka-cluster/

More Related Content

Apache Kafka Best Practices

  • 2. 2 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Apache Kafka  Core APIs – The Producer API – The Consumer API – The Connector API – The Streams API  Broad classes of applications – Building real-time streaming data pipelines – Building real-time streaming applications – core building block in other data systems
  • 3. 3 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Key Concepts and Terminology
  • 4. 4 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Component Layout
  • 5. 5 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Hardware Guidance Cluster Size Memory CPU Storage Kafka Brokers 3+ 24G+ (for small) 64GB+ (for large) Multi- core processors( 12 CPU+ core), Hyper threading enabled 6+ x 1TB dedicated disks( RAID or JBOD) Zookeeper 3 (for small) 5 (for large) 8GB+ (for small) 24GB+ (for large) 2 core + SSD for Transaction logs
  • 6. 6 © Hortonworks Inc. 2011 – 2017. All Rights Reserved OS Tuning  OS Page Cache – Ex: Allocate to hold all the active segments of the log.  File descriptor limits : >100k  less swapping  Tcp tuning  JVM Configs – Java 8 with G1 Collector – 6-8 GB heap
  • 7. 7 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Kafka Disk Storage  Use multiple disk spindles, dedicated to kafka  JBOD vs RAID10  JBOD – Gives all the disk I/O  JBOD Limitations – any disk failure causes an unclean shutdown and requires lengthy recovery – data is not distributed consistently across disks – Multiple directories  KIP-112/113 – necessary tools for users to manage JBOD – Intelligent partition assignment – On disk failure, broker can serve replicas on the good disks – re-assign replicas between disks of the same broker
  • 8. 8 © Hortonworks Inc. 2011 – 2017. All Rights Reserved RAID  RAID10 – Can survive single disk failure – Performance and protection – balance load across disks – Single mount point – Performance hit and reduces the space  File System – EXT or XFS – SSD – Issues on NFS. – SAN, NAS
  • 9. 9 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Basic Monitoring  CPU Load  Network Metrics  File Handle Usage  Disk Space  Disk I/O Performance  Garbage Collection  ZooKeeper Monitoring
  • 10. 10 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Kafka Replication  Partition has replicas – Leader replica, Follower replicas  Leader maintains in-sync-replicas (ISR) – replica.lag.time.max.ms, num.replica.fetchers – min.insync.replica – used by producer to ensure greater durability https://www.slideshare.net/junrao/kafka-replication-apachecon2013
  • 11. 11 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Under Replicated Partitions  Number of partitions which are not fully replicated within the cluster  Mbean - kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions  ISR Shrink/Expand Rate  Under Replicated Partitions – Lost Broker? – Controller Issues – Zookeeper Issues – Network Issues  Solutions – Tune the ISR settings – Expand brokers
  • 12. 12 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Controller  Manages Partitions Life cycle  Avoid controller's ZK session expires – Soft failures – ISR Churn/Under replicated partitions – ZK Server performance – Long GC pauses on Broker – Bad network configuration  Monitoring – Mbean : kafka.controller:type=KafkaController,name=ActiveControllerCount – only one broker in the cluster should have 1 – LeaderElectionRate
  • 13. 13 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Unclean leader election  Enable replicas not in the ISR set to be elected as leader  Availability vs correctness – By-default kafka chooses availability  Monitoring – Mbean : kafka.controller:type=ControllerStats,name=UncleanLeaderElectionsPerSec  Default will be changed in next release
  • 14. 14 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Broker Configs  log.retention.{ms, minutes, hours} , log.retention.bytes  message.max.bytes, replica.fetch.max.bytes  delete.topic.enable  unclean.leader.election.enable = false  min.insync.replicas = 2  replica.lag.time.max.ms, num.replica.fetchers  replica.fetch.response.max.bytes  zookeeper.session.timeout.ms = 30s  num.io.threads
  • 15. 15 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Cluster Sizing  Broker Sizing – Partition count on each broker (<2K) – Keep partition size on disk manageable (under 25GB per partition )  Cluster Size (no. of brokers) – how much retention we need – how much traffic cluster is getting  Cluster Expansion – Disk usage on the log segments partition should stay under 60% – Network usage on each broker should stay under 75%  Cluster Monitoring – Keep cluster balanced – Ensure that partitions of a topic are fairly distributed across brokers – Ensure that nodes in a cluster are not running out of disk and network
  • 16. 16 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Broker Monitoring  Partition Counts – Mbean: kafka.server:type=ReplicaManager,name=PartitionCount  Leader replica counts – Mbean: kafka.server:type=ReplicaManager,name=LeaderCount  ISR Shrink Rate/ISR expansion rate – kafka.server:type=ReplicaManager,name=IsrExpandsPerSec  Message in rate/Byte in rate/Byte out rate  NetworkProcessorAvgIdlePercent  RequestHandlerAvgIdlePercent
  • 17. 17 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Topic Sizing  No. of partitions – Have at least as many partitions as there are consumers in the largest group – topic is very busy – more partitions – Keep partition size on disk manageable (under 25GB per partition ) – Take into account any other application requirements – Special use cases – single partition  Keyed messages – enough partitions to deal with future growth  expanding partitions – whenever the size of the partition on disk is larger than threshold
  • 18. 18 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Choosing Partitions  Based on throughput requirements one can pick a rough number of partitions. – Lets call the throughput from producer to a single partition is P – Throughput from a single partition to a consumer is C – Target throughput is T – At least max (T/P, T/C)  More Partitions – More open file handles – May increase unavailability – May increase end-to-end latency – More memory for clients
  • 19. 19 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Quotas  Protect from bad clients and maintain SLAs  byte-rate thresholds on produce and fetch requests  can be applied to (user, client-id), user or client-id groups.  Server delays the responses  Broker Metrics for monitoring – throttle-rate, byte-rate  replica.fetch.response.max.bytes – Limit memory usage of replica fetch response  Limiting bandwidth usage during data migration – kafka-reassign-partitions.sh -- -throttle option
  • 20. 20 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Kafka Producer  User new java based clients  Test in your Environment – kafka-producer-perf-test.sh  Memory  CPU  Batch Compression  Avoid large messages – creates more memory pressure – slows down the brokers
  • 21. 21 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Critical Configs  batch.size – size based batching – larger size -> high throughput, higher latency  linger.ms – time based batching – larger size -> high throughput, higher latency  max.in.flight.requests.per.connection – Better throughput, affects ordering  compression.type – adding more user threads can help throughput  acks – Affects message durability
  • 22. 22 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Performance tuning  If throughput < network capacity – Add more user threads – Increase batch size – Add more producers instances – Add more partitions  Latency when acks = -1 – Increase num.replica.fetchers  Cross datacenter data transfer – Tune socket buffer settings, OS tcp buffer settings
  • 23. 23 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Producer Monitoring  batch-size-avg  compression-rate-avg  waiting-threads  buffer-available-bytes  record-queue-time-max  record-send-rate  records-per-request-avg
  • 24. 24 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Kafka Consumer  Test in your Environment – kafka-consumer-perf-test.sh  Throughput Issues – not enough partitions – OS Page Cache - allocate enough to hold all the messages for your consumers for say, 30s – Application/Processing logic  Offsets topic – __consumer_offsets – offsets.topic.replication.factor – offsets.retention.minutes – Monitor ISR, topic size  Slow offset commits – commit async, manual commits
  • 25. 25 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Consumer Configs  fetch.min.bytes and fetch.max.wait.ms  max.poll.interval.ms  max.poll.records  session.timeout.ms  Consumer Rebalance – check timeouts – check processing times/logic – GC Issues  Tune network settings
  • 26. 26 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Consumer Monitoring  Whether or not the consumer is keeping up with the messages that are being produced  Consumer Lag: Difference between the end of the log and the consumer offset  Monitoring – Metrics Monitoring - records-lag-max – bin/kafka-consumer-groups.sh – LinkedIn’s Burrow for consumer monitoring  Decreasing Lag – Analyze consumer - GC Issues, hung instance – Add more consumer Instances – increase the number of partitions and consumers
  • 27. 27 © Hortonworks Inc. 2011 – 2017. All Rights Reserved No data loss settings  Producer – block.on.buffer.full=true – retries=Long.MAX_VALUE – acks=all – max.in.flight.requests.per.connection=1 – close producer  Broker – replication factor >= 3 – min.insync.replicas=2 – disable unclean leader election  Consumer – disable auto.offset.commit – Commit offsets only after the messages are processed
  • 28. 28 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Authorizer - Ranger Auditing
  • 29. 29 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Kafka Mirror Maker  Tool to mirror a source Kafka cluster into a target (mirror) Kafka cluster
  • 30. 30 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Kafka Mirror Maker  Run multiple mirroring processes – high fault-tolerance – high throughput  --num.streams option to specify the number of consumer threads – no.of threads in num.streams
  • 31. 31 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Kafka Mirror Maker  Consumer and source cluster socket buffer sizes – high value for the socket buffer size – consumer's fetch size – OS networking Tuning  Source and Target Clusters are independent entities – Can be different numbers of partitions – offsets will not be the same. – partitioning order is preserved on a per-key basis.  Create topics in target cluster  Monitor whether a mirror is keeping up – Consumer Lag  Running In Secure Clusters – We recommend to use SSL – We can run MM on source cluster
  • 32. 32 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Open source Operational Tools  Ambari Metrics – https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.2.0/bk_ambari-user- guide/content/grafana_kafka_dashboards.html  Removing brokers and rebalancing partitions in a cluster – https://github.com/linkedin/kafka-tools  Consumer Lag Monitoring – Burrow (https://github.com/linkedin/Burrow)  Kafka Manager - https://github.com/yahoo/kafka-manager
  • 33. 33 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Apache Kafka 0.10.2 release  Includes 15 KIPs, over 200 bug fixes and improvements  The newest Java Clients now support older brokers (0.10.0 and higher)  Separation of Internal and External traffic  Create Topic Policy  Security Improvements – Support for SASL/SCRAM mechanisms – Dynamic JAAS configuration for Kafka clients – Support for authentication of multiple Kafka clients in single JVM  Producer and Consumer Improvements  Connect API & Streams API improvements
  • 34. 34 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Thank You
  • 35. 35 © Hortonworks Inc. 2011 – 2017. All Rights Reserved References  http://kafka.apache.org/documentation.html  https://community.hortonworks.com/articles/80813/kafka-best-practices-1.html  https://www.slideshare.net/JiangjieQin/producer-performance-tuning-for-apache- kafka-63147600  https://www.slideshare.net/ToddPalino/tuning-kafka-for-fun-and-profit  https://www.slideshare.net/JiangjieQin/no-data-loss-pipeline-with-apache-kafka- 49753844  https://www.slideshare.net/ToddPalino/putting-kafka-into-overdrive  https://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a- kafka-cluster/