Initial_replication_date

Hi community,

I’m seeking some technical insights regarding the initial_replication_date of our data records. I’ve been analyzing this field for our data records in the associated info doc since 2023, and I’ve observed that:

  • Prior to CHT version 4: Approximately 4% of our data_record documents were missing the initial_replication_date.
  • CHT version 4 onwards: The occurrence of missing initial_replication_date has significantly decreased. This April, we’re seeing around 0.01%, and it was even lower in February.

I’m trying to understand why some info docs lack this initial_replication_date. My current hypothesis is that when CouchDB has a large number of documents to process simultaneously, Sentinel might not be able to keep up with all the _changes events, so some info docs are never tagged with their initial replication date.

Has anyone else encountered this issue or have an explanation?


Hi @bamatic

This is a hard question to answer without knowing more specifics about the docs that are missing the initial_replication_date.

  1. What does it mean that they are missing this date? What value do you see for this field?
  2. What kind of docs look like this? Do they have associated SMS / outbound workflows?

Sentinel goes over every document and doesn’t skip any of them, even if it gets overloaded. Also, Sentinel is not the one adding the initial_replication_date; API adds it.
What it could mean is that there is a race condition where Sentinel processes a document so quickly (to save transitions or outbound tasks) that API adds the infodoc only afterwards.

Thank you for this context. I’m focusing only on data records and their associated info docs.
I’ve been looking at the PostgreSQL database, and in the JSON docs I find that the initial replication date has these values:

typing                             f0_
unknown                            1 471 758
a timestamp string                 32 475 940
a numeric epoch in milliseconds    28 823

The SQL I used returned the literal string ‘unknown’ for some values, while integer values came back as NULL. As a result, I mistakenly labeled these as ‘missed’.
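
For illustration, here is a minimal Python sketch of a classification that keeps these cases apart. It is not the SQL used above; the function name, the category labels, and the sample docs are hypothetical, and it assumes each info doc has already been loaded as a parsed JSON object with timestamp strings in ISO 8601 form.

```python
from collections import Counter
from datetime import datetime
from typing import Any


def classify_replication_date(value: Any) -> str:
    """Return a coarse type label for an initial_replication_date value."""
    if value is None:
        return "missing"
    # bool is a subclass of int in Python, so rule it out explicitly.
    if isinstance(value, bool):
        return "other"
    if isinstance(value, (int, float)):
        return "numeric epoch (ms)"
    if isinstance(value, str):
        if value == "unknown":
            return "unknown"
        try:
            # Assumes ISO 8601 strings; a trailing "Z" is rewritten so that
            # older Python versions can parse it as well.
            datetime.fromisoformat(value.replace("Z", "+00:00"))
            return "timestamp string"
        except ValueError:
            return "other"
    return "other"


# Hypothetical sample info docs, not real data.
info_docs = [
    {"initial_replication_date": "unknown"},
    {"initial_replication_date": "2023-04-01T10:15:30.000Z"},
    {"initial_replication_date": 1680344130000},
    {},
]

counts = Counter(
    classify_replication_date(doc.get("initial_replication_date"))
    for doc in info_docs
)
```

Checking the JSON type before casting is what keeps a numeric epoch from being silently turned into NULL and then counted as missing.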

All of our data_records have a corresponding info doc.
Could you please explain the possible scenarios or underlying logic that would produce the string value ‘unknown’ in our data?

The scenario for ‘unknown’ is the one I explained before: Sentinel or outbound processes a doc before API has written the infodoc with the initial_replication_date. It’s a race condition, or was.
