Custom Partition - Building Data Streaming Applications With Apache Kafka

Uploaded by

Dallas Guy
Copyright
© All Rights Reserved

Custom partition
Recall that we discussed key serializers, value serializers, and partitions in the context of the Kafka producer.

So far, we have used only the default partitioner and the built-in serializers. Let's see how we can create a
custom partitioner.

By default, Kafka selects a partition based on the hash of the key specified in the message, so records with the
same key land on the same partition. If the key is not specified (null), the producer spreads messages across
partitions in a round-robin fashion (newer client versions use a sticky, batch-at-a-time variant of this). However,
sometimes you may want your own partitioning logic so that you control exactly which partition the records for a
given key go to. We will see some best practices for partitions later in this chapter. Kafka provides you with an
API to implement your own partitioner.
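The default behaviour described above can be sketched in plain Java. This is only an illustration, not Kafka's code: the class and method names are hypothetical, and `Arrays.hashCode` is an assumed stand-in for the hash that the Kafka client applies to the serialized key bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of default partition selection; not Kafka's actual implementation.
class DefaultPartitioningSketch {

    // Shared counter used to rotate through partitions for records without a key.
    private static final AtomicInteger counter = new AtomicInteger(0);

    static int choosePartition(String key, int numPartitions) {
        if (key == null) {
            // No key: hand out partitions in round-robin order.
            return Math.floorMod(counter.getAndIncrement(), numPartitions);
        }
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        // Assumption: Arrays.hashCode stands in for the client's real key hash.
        // floorMod keeps the result non-negative even when the hash is negative.
        return Math.floorMod(Arrays.hashCode(keyBytes), numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 4;
        // The same key always maps to the same partition.
        System.out.println(choosePartition("customer-42", numPartitions));
        System.out.println(choosePartition("customer-42", numPartitions));
        // Records without a key rotate across partitions.
        System.out.println(choosePartition(null, numPartitions));
        System.out.println(choosePartition(null, numPartitions));
    }
}
```

The point of the sketch is that a key-based choice is deterministic, while the keyless path depends only on how many records came before.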

In most cases, the default hash-based partitioning suffices. In some scenarios, however, a single key accounts for
a very large share of the data, and we may need to dedicate a separate partition to that key. For example, if key K
carries 30 percent of the total data, it can be assigned to partition N, with no other key assigned to partition N,
so that the partition does not run out of space or slow down. There can be other use cases as well where you may
want to write a custom partitioner. Kafka provides the Partitioner interface, which lets us implement our own
partitioning logic.
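As a rough illustration of the hot-key scenario, the following plain-Java sketch reserves the last partition for one heavy key and hashes every other key over the remaining partitions. The class name, method name, and the choice of the last partition as "partition N" are assumptions made for this example, and `Arrays.hashCode` again stands in for a real key hash. Logic of this kind would sit inside the `partition` method of a custom partitioner.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: dedicate one partition to a single heavy key.
class HotKeyPartitioningSketch {

    static int choosePartition(String key, int numPartitions, String hotKey) {
        if (numPartitions < 2) {
            throw new IllegalArgumentException("need at least two partitions to reserve one");
        }
        // Assumption: the last partition plays the role of the reserved partition N.
        int reserved = numPartitions - 1;
        if (hotKey.equals(key)) {
            return reserved;
        }
        // Assumption: records without a key are treated as hash 0 in this sketch.
        int hash = (key == null) ? 0 : Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        // All other keys share partitions 0 .. numPartitions - 2.
        return Math.floorMod(hash, numPartitions - 1);
    }

    public static void main(String[] args) {
        // With 5 partitions, the hot key "K" goes to the reserved partition 4.
        System.out.println(choosePartition("K", 5, "K"));
        // Any other key lands somewhere in partitions 0 to 3.
        System.out.println(choosePartition("some-other-key", 5, "K"));
    }
}
```

Reserving a partition this way trades some balance among the remaining keys for isolation of the heavy one.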

Here is a skeleton implementation of the Partitioner interface in Java:

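import java.util.List;
import java.util.Map;

import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;

public class CustomPartition implements Partitioner {

    @Override
    public int partition(String topicName, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        // Look up the partitions currently known for this topic.
        List<PartitionInfo> partitions = cluster.partitionsForTopic(topicName);
        int numPartitions = partitions.size();

        // TODO: partition logic here; the result must be in [0, numPartitions).
        return 0;
    }

    @Override
    public void close() {
        // Nothing to release in this skeleton.
    }

    @Override
    public void configure(Map<String, ?> map) {
        // No configuration is needed in this skeleton.
    }
}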

Scala:

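import java.util

import org.apache.kafka.clients.producer.Partitioner
import org.apache.kafka.common.{Cluster, PartitionInfo}

class CustomPartition extends Partitioner {

  override def close(): Unit = {}

  override def partition(topicName: String, key: scala.Any, keyBytes: Array[Byte],
                         value: scala.Any, valueBytes: Array[Byte], cluster: Cluster): Int = {
    // Look up the partitions currently known for this topic.
    val partitions: util.List[PartitionInfo] = cluster.partitionsForTopic(topicName)
    val numPartitions: Int = partitions.size

    // TODO: your partition logic here; the result must be in [0, numPartitions).
    0
  }

  override def configure(map: util.Map[String, _]): Unit = {}
}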
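A custom partitioner takes effect only once the producer is configured to use it, which is done through the `partitioner.class` producer property. The sketch below builds such a configuration using only `java.util.Properties`; the helper class, the broker address `localhost:9092`, and the package name `com.example` are assumptions for illustration.

```java
import java.util.Properties;

// Hypothetical helper that assembles producer properties naming a custom partitioner.
class PartitionerConfigSketch {

    static Properties producerProperties(String partitionerClassName) {
        Properties props = new Properties();
        // Assumption: a broker is reachable at this address.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Fully qualified name of the class that implements Partitioner.
        props.put("partitioner.class", partitionerClassName);
        return props;
    }

    public static void main(String[] args) {
        // Assumption: CustomPartition lives in the hypothetical package com.example.
        Properties props = producerProperties("com.example.CustomPartition");
        System.out.println(props.getProperty("partitioner.class"));
    }
}
```

These properties would then be passed to the `KafkaProducer` constructor in the same way as the producer configuration used so far.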

© 2021 O'Reilly Media, Inc.