Topic Partitioning

Topic partitioning works differently in Pulsar compared to Kafka.

In Kafka, the broker acts as both the serving and the storage layer, so the two are tightly coupled. In Pulsar, the broker acts as the serving layer, while the BookKeeper bookies act as the storage layer. The number of brokers and the number of bookies can therefore be scaled independently, and the brokers are stateless (though they do cache data for efficiency).

If we need to scale up the maximum throughput of an individual topic, we can partition it. Each partition can then be served by a different broker, which is what allows the throughput to scale. Since the brokers are stateless, whole topics, and likewise individual partitions, can be shifted between brokers as needed. This allows for an extremely flexible architecture: the number of topic partitions can be easily increased, the number of brokers can be easily increased or decreased, and the broker serving each partition can change depending on how busy each broker is at any time.

In this exercise we demonstrate how easy it is to increase the number of topic partitions and view the partition each message is consumed from.

Edit the number of partitions for the schema topic using ChangeNumberOfPartitions.java in the Partition package.

  • edit the credentials URL to point to the OAuth2 json file in the resources folder
  • edit the topic to include your student id

Execute ChangeNumberOfPartitions.java to change the number of partitions from 1 to 5 for the topic.
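As a rough sketch of what ChangeNumberOfPartitions.java does, the Pulsar admin client can update the partition count of an existing topic. The service URL, issuer URL, audience, credentials file name, and topic name below are placeholders, not the actual values from the exercise:

```java
import java.net.URL;

import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.client.impl.auth.oauth2.AuthenticationFactoryOAuth2;

public class ChangeNumberOfPartitionsSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder values: substitute your cluster's admin URL, OAuth2 issuer,
        // audience, credentials file, and a topic that includes your student id.
        String adminUrl = "https://your-cluster.example.com:8443";
        String topic = "persistent://public/default/schema-topic-studentid";
        URL credentialsUrl = ChangeNumberOfPartitionsSketch.class
                .getResource("/oauth2-credentials.json");

        try (PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl(adminUrl)
                .authentication(AuthenticationFactoryOAuth2.clientCredentials(
                        new URL("https://auth.example.com/"), credentialsUrl, "your-audience"))
                .build()) {
            // Pulsar only allows the partition count to grow, never shrink.
            admin.topics().updatePartitionedTopic(topic, 5);
        }
    }
}
```

Note that this operation is one-way: once a topic has 5 partitions, it cannot be reduced back to 1.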

KProducer.java in the Partition package will write 10,000 messages. Before executing this file:

  • add your jwtToken (we are configuring the Kafka connection directly in the Java code)
  • edit topic1 to include your student id
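A hedged sketch of the producer configuration, assuming the cluster exposes a Kafka-compatible endpoint secured with SASL/PLAIN over TLS. The bootstrap server, topic name, and the "public/default" / "token:" username-password convention are assumptions; check KProducer.java for the actual values your cluster expects:

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class KProducerSketch {
    public static void main(String[] args) {
        // Placeholder values: replace the bootstrap server, topic, and token.
        String jwtToken = "<your-jwt-token>";
        String topic1 = "schema-topic-studentid";

        Properties props = new Properties();
        props.put("bootstrap.servers", "your-cluster.example.com:9093");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // SASL/PLAIN over TLS; the exact username convention depends on your cluster.
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                        + "username=\"public/default\" password=\"token:" + jwtToken + "\";");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 10_000; i++) {
                producer.send(new ProducerRecord<>(topic1, Integer.toString(i), "message-" + i));
            }
        }
    }
}
```

Because the records carry keys, the client's default partitioner hashes each key to pick one of the topic's 5 partitions, so the 10,000 messages are spread across all of them.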

Before executing KConsumer.java:

  • add your jwtToken
  • edit topic1 to include your student id

As we consume each message, we will print out the partition the message came from using:

System.out.println(record.value() + " from partition " + record.partition());
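For context, that print statement sits inside the consumer's poll loop. The following is a sketch of such a loop, not the exact contents of KConsumer.java; the bootstrap server, group id, and SASL username convention are assumptions:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class KConsumerSketch {
    public static void main(String[] args) {
        // Placeholder values: replace the bootstrap server, topic, and token.
        String jwtToken = "<your-jwt-token>";
        String topic1 = "schema-topic-studentid";

        Properties props = new Properties();
        props.put("bootstrap.servers", "your-cluster.example.com:9093");
        props.put("group.id", "partition-demo");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("auto.offset.reset", "earliest");
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                        + "username=\"public/default\" password=\"token:" + jwtToken + "\";");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList(topic1));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // record.partition() is the partition index (0-4 after the change).
                    System.out.println(record.value() + " from partition " + record.partition());
                }
            }
        }
    }
}
```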

For testing, you can create multiple Kafka consumers using the same subscription name (the group.id, in Kafka terms); the topic's partitions will then be distributed among them.
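To see how partitions get split across consumers in the same group, the small sketch below mimics the behavior of Kafka's default range assignment for a single topic. It is an illustrative helper, not the actual Kafka assignor class: each consumer receives a contiguous block of partitions, and the remainder consumers get one extra.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeAssignmentSketch {
    // Mimics Kafka's default range assignment for one topic: each consumer
    // gets a contiguous block, with the first (numPartitions % numConsumers)
    // consumers receiving one extra partition.
    static List<List<Integer>> assign(int numPartitions, int numConsumers) {
        List<List<Integer>> assignments = new ArrayList<>();
        int base = numPartitions / numConsumers;
        int extra = numPartitions % numConsumers;
        int next = 0;
        for (int c = 0; c < numConsumers; c++) {
            int count = base + (c < extra ? 1 : 0);
            List<Integer> partitions = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                partitions.add(next++);
            }
            assignments.add(partitions);
        }
        return assignments;
    }

    public static void main(String[] args) {
        // 5 partitions split across 2 consumers in the same group.
        System.out.println(assign(5, 2)); // prints [[0, 1, 2], [3, 4]]
    }
}
```

So with the 5-partition topic and two consumers sharing a subscription, one consumer reads three partitions and the other reads two; each message is still delivered to only one consumer in the group.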