Durability and Offloading

Let’s dive a little deeper into how Pulsar’s storage model works.

A topic’s data is divided into multiple segments, and each segment contains multiple entries (messages).

Each segment is persisted on multiple bookies. For example, the diagram to the left shows Segment 4 is persisted on Bookie 2, Bookie 3, and Bookie 4. If a consumer requests the data while Bookie 3 is down, Broker 2 can still reach out to Bookie 2 or Bookie 4 to obtain the data. There is no data loss and no interruption to producers and consumers.
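The failover read described above can be illustrated with a small toy model. This is only a sketch and not Pulsar or BookKeeper client code: the Bookie class, the BookieDown exception, and the read_segment function are hypothetical names invented for this example, and the model assumes the broker simply tries each replica in turn.

```python
class BookieDown(Exception):
    """Raised when a simulated bookie cannot be reached."""


class Bookie:
    """Hypothetical in-memory stand-in for a bookie holding segment replicas."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.segments = {}  # segment id -> list of entries

    def store(self, segment_id, entries):
        self.segments[segment_id] = list(entries)

    def read(self, segment_id):
        if not self.up:
            raise BookieDown(self.name)
        return self.segments[segment_id]


def read_segment(segment_id, replicas):
    """Return (bookie name, entries) from the first replica that responds."""
    for bookie in replicas:
        try:
            return bookie.name, bookie.read(segment_id)
        except BookieDown:
            continue  # this replica is unavailable, try the next one
    raise RuntimeError(f"no replica of segment {segment_id} is reachable")


# Segment 4 is stored on Bookie 2, Bookie 3, and Bookie 4.
bookies = {n: Bookie(f"Bookie {n}") for n in (2, 3, 4)}
for bookie in bookies.values():
    bookie.store(4, ["m1", "m2", "m3"])

# Bookie 3 goes down; the read still succeeds from another replica.
bookies[3].up = False
served_by, entries = read_segment(4, [bookies[3], bookies[2], bookies[4]])
print(served_by, entries)
```

In this sketch the read is attempted on Bookie 3 first, fails, and is then served by Bookie 2, so the caller never sees the outage.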


You may need to store data for long periods of time. With Pulsar, you can offload that data to external storage. Instead of keeping it on the fast, expensive disks in the bookies, you can move it to a third-party cloud storage system, which serves as a more cost-effective storage tier.

Reads are transparent to the consumer, whether the data comes from a bookie node or from tiered storage.
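The idea of offloading with transparent reads can be sketched with a toy model. This is not Pulsar’s implementation or API: the TieredTopic class and its offload_threshold parameter are hypothetical, and the threshold here counts segments purely to keep the example short, which is a simplification of how offloading is actually configured.

```python
class TieredTopic:
    """Hypothetical model of a topic whose oldest segments move to cheaper storage."""

    def __init__(self, offload_threshold):
        # Assumption for this sketch: maximum number of segments kept on bookies.
        self.offload_threshold = offload_threshold
        self.bookie_tier = {}   # segment id -> entries on fast bookie disks
        self.offload_tier = {}  # segment id -> entries in external storage

    def add_segment(self, segment_id, entries):
        self.bookie_tier[segment_id] = list(entries)
        self._offload_oldest()

    def _offload_oldest(self):
        # Move the oldest segments out until the bookie tier fits the threshold.
        while len(self.bookie_tier) > self.offload_threshold:
            oldest = min(self.bookie_tier)
            self.offload_tier[oldest] = self.bookie_tier.pop(oldest)

    def read(self, segment_id):
        # The caller uses one method and does not need to know the tier.
        if segment_id in self.bookie_tier:
            return self.bookie_tier[segment_id]
        return self.offload_tier[segment_id]


topic = TieredTopic(offload_threshold=2)
for seg in range(1, 5):
    topic.add_segment(seg, [f"s{seg}-m{i}" for i in range(2)])

print(sorted(topic.bookie_tier))   # segments still on bookies
print(sorted(topic.offload_tier))  # segments moved to external storage
print(topic.read(1))               # read of an offloaded segment
print(topic.read(4))               # read of a segment still on bookies
```

The consumer-facing read call is the same in both cases; only the lookup behind it differs.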

The ONE StreamNative Platform supports Lakehouse Storage, where data can be read directly from the external storage. For more information, see Partner with StreamNative -> Lakehouse Storage.


For more information on Pulsar’s storage layer, Apache BookKeeper, click here.

More information on Pulsar’s tiered storage can be found here.