The question

I would like to understand how the default Kafka Partitioner actually works.

The research

  • Kafka provides an interface1 Partitioner here which should be implemented by every partitioning method.
  • If not otherly defined, the DefaultPartitioner is used, as defined here. At the time of my research, this still was the default; now this one is deprecated. I have to find out what the replacement is.
  • The partitioning logic is quite eaesy:
    • If a partition is defined then use it.
    • If no key is given then return something called StickyPartition - this probably is the result of the deprecation. Find I have to check on that one.
    • If a partition is given then in the end, the method partitionForKey is called, as defined here. The code is: return Utils.toPositive(Utils.murmur2(serializedKey)) % numPartitions.
      • serializedKey is just a Bytes object. Generation of that one is unique and reversible.
      • numPartitions is the number of partitions, unique for a topic.
      • murmur2 is a Hashaing algorithm, used for Lookups.
      • So in the end we get a number in the Modulo numPartitions space. One can easily see that the operation is deterministic for every key.

The result

The topic itself does not play a role when computing the partition for a message, only the number of partitions. This also explains why Co-Partitioning of topics works: If we have the same key for two messages in separate topics then the only the number of partitions plays a role for finding the partition assignment. If this number is equal then also the assigned partition is equal.

Another thing to note is that partitioning happens at Consumer Level. So we have to trust the Consumer Code (which may also be the rdkafka C++ implementation) to correctly implement the same method.

Next steps

Find out why the Partitioner is deprecated and what the new default Partitioner looks and works like. Helpful references for that may be KIP-480 or KIP-794 as they were mentioned next to the deprecation.


  1. An interface is essentially the abstraction of a class. A specific class (in our case, a Partitioner) needs to implement the interface specification. ↩︎