What scenarios will result in a telemetry doc with a `conflicted` property?

I’m seeing some telemetry documents that have a conflicted property. Is this expected? And if so, under what scenarios?

"metadata": {
        "user": "michael",
        "year": 2021,
        "month": 1,
        "deviceId": "e5c3bbc2-d24f-4251-a315-dab325ae747b",
        "versions": {
            "app": "3.8.2",
            "forms": {
                ...<removed>...
            }
        },
        "conflicted": true
    }

Code that adds conflicted docs: cht-core/webapp/src/js/services/telemetry.js on the 3.8.x branch of medic/cht-core (GitHub)

This can happen because of a race condition when creating the telemetry aggregate document, or if the aggregation is interrupted after the document is created and then gets triggered again. It could also happen if the device time is moved back into the past.

The way telemetry docs are created: when a new telemetry entry is written, we check whether we should aggregate old telemetry data (that is, whether the “reporting” interval has turned over). If so, we aggregate into a single doc, which will later be synced to the server. This doc has a “fixed” id. If, when we attempt the write, we discover that the doc already exists, we don’t overwrite it; instead we create this “conflicted” entry.
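
To make the flow above concrete, here is a minimal sketch of the "write, and on conflict store a separate conflicted doc" pattern. It is not the actual cht-core code: the in-memory store, the function names (createStore, storeAggregate), the example doc id, and the exact `-conflicted-<timestamp>` id suffix are all assumptions for illustration. The real service writes to a local PouchDB database asynchronously; this sketch is synchronous to keep it short.

```javascript
// Hypothetical in-memory stand-in for the local meta database.
// put() rejects an already-existing _id with a 409, mimicking a
// CouchDB/PouchDB document update conflict.
function createStore() {
  const docs = new Map();
  return {
    docs,
    put(doc) {
      if (docs.has(doc._id)) {
        const err = new Error('Document update conflict');
        err.status = 409;
        throw err;
      }
      docs.set(doc._id, doc);
    },
  };
}

// Write the aggregate under its fixed id. If that id is already taken,
// do not overwrite: store a separate copy flagged as conflicted.
function storeAggregate(store, aggregateDoc, now = Date.now()) {
  try {
    store.put(aggregateDoc);
    return aggregateDoc._id;
  } catch (err) {
    if (err.status !== 409) {
      throw err; // unrelated failure, let the caller handle it
    }
    const conflicted = {
      ...aggregateDoc,
      // Assumed id format: fixed id plus a suffix with the current epoch time.
      _id: `${aggregateDoc._id}-conflicted-${now}`,
      metadata: { ...aggregateDoc.metadata, conflicted: true },
    };
    store.put(conflicted);
    return conflicted._id;
  }
}

// Usage: the same reporting period is aggregated twice (for example from
// two tabs, or after an interrupted aggregation is triggered again).
const store = createStore();
const doc = {
  _id: 'telemetry-2021-1-michael-device1', // hypothetical fixed id
  metadata: { user: 'michael', year: 2021, month: 1 },
};
const firstId = storeAggregate(store, doc, 1612137600000);
const secondId = storeAggregate(store, doc, 1612137600500);
```

The first write lands under the fixed id; the second one hits the conflict and ends up as a separate doc with `metadata.conflicted` set, which is the kind of doc shown in the question.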


Are these conflicted documents frequent?
Finding the corresponding “non-conflicted” doc would be useful, just to know when it was created and whether it contains identical data. Created seconds before? Milliseconds before? Completely different data?

Thinking about it, conflicted documents can also be triggered by running the app in two tabs (or by some weirdness in cht-android).


Are these conflicted documents frequent?

In one instance there were 20 conflicts out of 5,618 total telemetry docs (so about 0.36%).
In another there were 328 conflicts out of 27,475 total telemetry docs (so about 1.2%).

completely different data for the corresponding “non-conflicted” doc?

I’ve seen a mix. For the one with 20 conflicts, 10 of the conflicted documents had a corresponding document that had the exact same data for metrics, dbInfo, and device. The other 10 had some of those properties the same and some different.

Created seconds before? Milliseconds before?

Telemetry docs don’t have a timestamp property. For the docs that are conflicted, the UUID has the UNIX epoch time on the end, but since the non-conflicted doc doesn’t have a timestamp, I can’t tell how close together they were.
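
For reference, a small sketch of pulling that epoch time out of a conflicted doc's id. It assumes the id ends in `-conflicted-<epoch in milliseconds>`; both the suffix format and the millisecond unit are assumptions here, and the example id and the conflictTimestamp helper are hypothetical, so check them against real ids before relying on this.

```javascript
// Return the time encoded at the end of a conflicted telemetry doc id,
// or null when the id does not carry the assumed suffix.
function conflictTimestamp(docId) {
  const match = /-conflicted-(\d+)$/.exec(docId);
  return match ? new Date(Number(match[1])) : null;
}

// Hypothetical id, built from the deviceId shown in the question.
const exampleId =
  'telemetry-2021-1-michael-e5c3bbc2-d24f-4251-a315-dab325ae747b-conflicted-1612137600000';
const writtenAt = conflictTimestamp(exampleId);
console.log(writtenAt.toISOString()); // 2021-02-01T00:00:00.000Z
```

This only dates the conflicted doc; the non-conflicted doc still has no comparable timestamp.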

Here’s the SQL I’m using just FYI:

SELECT
	doc#>>'{metadata,user}' AS t_user,
	(doc#>>'{metadata,year}')::int AS year,
	(doc#>>'{metadata,month}')::int AS month,
	doc#>>'{metadata,deviceId}' AS device_id,
	COUNT(*) AS count_telemetry_docs,
	COUNT(DISTINCT(doc#>>'{metrics}')) AS count_distinct_metrics,
	COUNT(DISTINCT(doc#>>'{dbInfo}')) AS count_distinct_dbinfo,
	COUNT(DISTINCT(doc#>>'{device}')) AS count_distinct_device
	
FROM
	couchdb_users_meta
	
WHERE
	doc->>'type' = 'telemetry'

GROUP BY 
	t_user,
	year,
	month,
	device_id
	
ORDER BY
	count_telemetry_docs DESC,
	count_distinct_metrics DESC,
	count_distinct_dbinfo DESC,
	count_distinct_device DESC

In another there were 328 conflicts out of 27,475 total telemetry docs (so about 1.2%).

That’s a pretty high incidence. What version is this instance running?
There have been changes made to telemetry recording since 3.8. We could look at an instance that is running a more recent version (3.12+) and check the frequency of conflicted telemetry docs after the upgrade to 3.12.

For the one with 20 conflicts, 10 of the conflicted documents had a corresponding document that had the exact same data for metrics, dbInfo, and device.

Were all the users Android users?