Data archiving at scale

Santosh_Paudel · April 24, 2026, 4:30am

As data keeps growing over time, managing and archiving it at scale is becoming a challenge.

I’m curious how the community is dealing with this. What approaches are being used when data keeps piling up over time, and what is actually working in practice?

Would be great to hear different experiences from the community.
Thank you
cc @sanjay @Raghav_Raj_Karkee @andra @antony

diana · April 24, 2026, 4:54am

Hi @Santosh_Paudel

Thank you for the question.
We are currently working on a built-in cold storage solution for the CHT that will allow old, stale records that are not used in any workflow to be pruned from the main database. Instances that opt in to this will be able to keep their databases lighter.
We are expecting to ship this in v5.3.0 (the release after next).

mrjones · April 24, 2026, 5:09am

In addition to the upcoming feature Diana mentioned, be sure to check client-side and server-side purging as well as replication depth. Both can help manage a growing data set.
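
To make the purging idea a bit more concrete, here is a minimal sketch of age-based selection: given a list of reports, pick the IDs of those older than a cutoff. The `Report` shape, the `selectStaleReportIds` helper, and the 365-day threshold are all hypothetical and only for illustration. How this would plug into an actual CHT purge configuration is an assumption, so please check the purging docs for the exact function signature and config options.

```typescript
// Hypothetical, simplified report shape; real documents carry more fields.
interface Report {
  _id: string;
  form: string;
  reported_date: number; // milliseconds since epoch
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns the IDs of reports whose reported_date is older than maxAgeDays,
// measured relative to `now`. Nothing is deleted here; the caller decides
// what to do with the returned IDs.
function selectStaleReportIds(
  reports: Report[],
  maxAgeDays: number,
  now: number = Date.now()
): string[] {
  const cutoff = now - maxAgeDays * DAY_MS;
  return reports
    .filter((report) => report.reported_date < cutoff)
    .map((report) => report._id);
}
```

In practice a purely age-based rule is probably too blunt: you would likely also want to filter by form type, so that records still needed by tasks or targets are kept.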