Introducing Partitioned Topics for Scaling Throughput

Before discussing how increasing the number of partitions of our topic allows for increased throughput, it’s important to learn a little about Pulsar’s unique architecture.

Pulsar’s architecture is more sophisticated than other platforms. The brokers are responsible for managing all the communication and the processing of the topics. Broker nodes are stateless: they don’t store data except caches for efficiency.

In contrast, the bookies (Apache BookKeeper) nodes are responsible for storage. They have state and are where the messages are stored.

This allows for a highly scalable elastic architecture where the number of brokers or bookies can be quickly scaled independently.

When the throughput of a single topic partition becomes too large for a single broker, we can increase the number of partitions. Each individual partition can be serviced by different brokers, increasing the overall throughput of the topic. This can be done quickly because the topic ownership by a broker is independent of where the data is stored. We will test the effect this has on each subscription type and ordering guarantees. Before starting this testing, we will also introduce keyed messages.