Sync issues & status scrutiny

We’ve recently encountered an issue where some Community Health Workers (CHWs) have reported syncing two weeks’ worth of data (indicated by the green status), resetting the device, and subsequently discovering that all their work is missing.

Feedback from the field indicates that when a significant amount of work needs to be synced, as with the user mentioned in this thread, users sometimes need to press the sync button multiple times before the operation completes.
Has this behavior been observed before?

Upon reviewing the sync trigger documentation, we found that CHT actively attempts to push/pull data as quickly as possible.
So, barring a data cap, load shedding (power outages), or poor signal (due to location), the app should stay in step with the server fairly regularly.

What’s worrying is that our session expiry is set to 12 hours, so the user HAD to log in in order to work.

Interestingly, one of our managers tested the ‘on login’ case:

  • Log in as a CHW, disconnect from the internet, and create some pieces of work.
  • Wait a day or two.
  • Log in as the CHW again for a short while, then log out.
  • Log in as admin: no data is present.

How long after login does it take for sync to be kicked off?

We directed our attention to the Access Point Name (APN) to investigate whether the issue stemmed from a CHW hitting the data cap. The APN portal indicates that user requests were made on the dates when data appears to have gone missing, suggesting that data should have been successfully synced.

Our MDM solution, SureMDM, also reports that the app was indeed online during the days where the data went missing.

Is there a log or telemetry available for scrutiny to track when sync operations were executed for a specific user?

Additionally, is there a method in CouchDB to determine when a CHW last synced? We’ve discovered the last_replication_date property on a user record within the medic-sentinel database:


We hoped to create some sort of admin page where one can see the users in the hierarchy across the app and when each last synced, showing just the name, the lineage underneath, and the timestamp. The timestamp could be flagged by age: for instance, green if it is 1 day behind, yellow if it is 2 days behind, and red if it is 1 week behind. That way we can try to get ahead of such situations from an operational perspective.
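The traffic-light idea could be sketched roughly as follows. This is only an illustration: the cut-offs are one reading of the thresholds above (under 2 days is green, 2 days up to a week is yellow, a week or more is red), and the `(name, lineage, last_synced)` row shape is hypothetical rather than anything CHT provides.

```python
from datetime import datetime, timedelta, timezone

# Assumed thresholds, one interpretation of the green/yellow/red idea.
YELLOW_AFTER = timedelta(days=2)
RED_AFTER = timedelta(days=7)


def sync_status(last_synced, now):
    """Return 'green', 'yellow' or 'red' for a last-sync timestamp."""
    if last_synced is None:
        return "red"  # never synced: treat as the worst case
    age = now - last_synced
    if age >= RED_AFTER:
        return "red"
    if age >= YELLOW_AFTER:
        return "yellow"
    return "green"


def status_report(users, now):
    """Build report rows from (name, lineage, last_synced) tuples.

    The tuple shape is hypothetical; last_synced is a timezone-aware
    datetime or None. Rows are returned most stale first.
    """
    rows = [
        (name, lineage, last_synced, sync_status(last_synced, now))
        for name, lineage, last_synced in users
    ]
    # Users who never synced sort before everyone else.
    epoch = datetime.min.replace(tzinfo=timezone.utc)
    rows.sort(key=lambda row: row[2] or epoch)
    return rows
```

Sorting the most stale users to the top means the people who need a follow-up call are the first thing an operator sees.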


Hi @Anro !

Sorry to hear some CHWs are reporting issues synchronizing. A good way to see a report of users who have synchronized is to use couch2pg. With that set up, you can use this query to show usernames and dates, with the most recent date first:

select 
  distinct doc#>>'{metadata,user}' as user, 
  concat_ws(
       '-',
       doc#>>'{metadata,year}',
       doc#>>'{metadata,month}',
       doc#>>'{metadata,day}' 
    ) as date
from 
	couchdb_users_meta 
where 
	doc#>>'{type}' = 'telemetry'  
order by 
	date desc

The query will return results that look like this:

user    date
----------------
bob     2024-3-3
lisa    2024-3-3
tom     2024-3-3
wonda   2024-3-3
marisa  2024-3-3
marisa	2024-3-3

When using telemetry documents, remember that they may not be created every day, depending on how the CHW uses your app.
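One thing to watch for: the date column in the query above is text built from unpadded parts (for example 2024-3-3), so sorting it as text can put March after December. If you post-process the rows outside the database, a minimal sketch like the one below parses them into real dates first. It assumes the rows arrive as `(user, 'YYYY-M-D')` pairs with calendar month and day values, as in the sample output; the function names are made up for illustration.

```python
from datetime import date


def parse_telemetry_date(text):
    """Parse an unpadded 'YYYY-M-D' string such as '2024-3-3' into a date."""
    year, month, day = (int(part) for part in text.split("-"))
    return date(year, month, day)


def latest_per_user(rows):
    """Return (user, latest_date) pairs, most recent first.

    rows: iterable of (user, 'YYYY-M-D') pairs, the assumed shape of the
    query result.
    """
    latest = {}
    for user, text in rows:
        day = parse_telemetry_date(text)
        if user not in latest or day > latest[user]:
            latest[user] = day
    # Sort on the parsed date, not on the original text.
    return sorted(latest.items(), key=lambda item: item[1], reverse=True)
```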

Please let us know if you need any more help!

We also record several telemetry keys related specifically to replication. These are tracked for both the medic DB (contacts, reports) and the meta DB (telemetry, etc.).

For example, the presence of replication:medic:to:success tells you when the most recent replication to the server completed successfully for the medic DB (though what do these look like when there is nothing to replicate?).

There are also keys like: replication:medic:from:success and replication:meta:sync:success, as well as tracking failures.
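If you end up processing these keys in a script, they can be split into their parts. The sketch below is based only on the three keys named above and assumes the pattern replication:<database>:<direction>:<outcome>. The failure key names are not spelled out in this thread, so anything after the third colon is passed through unchanged rather than guessed at. `ReplicationKey` and `parse_replication_key` are hypothetical names.

```python
from typing import NamedTuple, Optional


class ReplicationKey(NamedTuple):
    database: str   # e.g. "medic" or "meta"
    direction: str  # e.g. "to", "from" or "sync"
    outcome: str    # e.g. "success"; other values are kept as-is


def parse_replication_key(key: str) -> Optional[ReplicationKey]:
    """Split a replication telemetry key into its parts.

    Returns None for keys that do not match the assumed
    replication:<database>:<direction>:<outcome> pattern.
    """
    parts = key.split(":", 3)
    if len(parts) != 4 or parts[0] != "replication":
        return None
    return ReplicationKey(parts[1], parts[2], parts[3])
```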

Here’s a way to see the most recent telemetry record and most recent successful replication to the medic DB by user.

SELECT
	user_name,
	max(period_start) AS most_recent_telemetry,
	max(period_start) FILTER (WHERE metric = 'replication:medic:to:success') AS most_recent_medic_to_success

FROM
	useview_telemetry_metrics

GROUP BY
	user_name


Hi @Anro

Automatic sync starts 10 seconds after the app starts, then every 5 minutes, and also immediately after a local write happens.
Being logged out should not affect local document storage.
Have the managers tried checking the sync status by opening the hamburger menu and seeing what the app reports?
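To reason about the "log in for a short while" test, here is a rough model of when sync attempts would be triggered during one app session, using only the three triggers described above. It is not CHT code: the function name is made up, and it assumes the 5-minute timer is counted from app start. It also says nothing about whether an attempt succeeds, only when one would begin.

```python
APP_START_DELAY_S = 10     # "10 seconds after starting the app"
PERIODIC_INTERVAL_S = 300  # "every 5 minutes"


def expected_sync_attempts(session_length_s, write_offsets_s=()):
    """Return sorted offsets (seconds since app start) of sync triggers.

    session_length_s: how long the app stayed open.
    write_offsets_s: offsets at which local writes happened.
    """
    attempts = set()
    if session_length_s >= APP_START_DELAY_S:
        attempts.add(APP_START_DELAY_S)
    # Assumption: the periodic timer is measured from app start.
    t = PERIODIC_INTERVAL_S
    while t <= session_length_s:
        attempts.add(t)
        t += PERIODIC_INTERVAL_S
    # A local write triggers a sync attempt straight away.
    for offset in write_offsets_s:
        if 0 <= offset <= session_length_s:
            attempts.add(offset)
    return sorted(attempts)
```

Under this model, a session shorter than 10 seconds with no new writes would not trigger any sync attempt, so it is worth noting how long the app was actually open in the manager's test.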

Could you please share which CHT version you have deployed?