Capturing sync date

We would like to capture the date the CHV syncs their records to the database. We have a CHV P4P scheme in place where we pay the CHVs on a particular date every month . We have received many complaints from CHVs claiming that the amount they were paid does not reflect the work they did. We suspect that they sync the data late after the payment deadline but we have no way of proving that unless we have the sync date information.

1 Like

From 3.12, we are tracking new replication-related metrics in telemetry. You can read about the specific metrics by searching for replication: on that page.

Do let us know if any would appear beneficial in resolving your challenge.

2 Likes

There is also a metadata document (an info doc) available that might be useful in this situation. This will tell you when each document was initially replicated to the server, as well as when the latest update was replicated to the server.

You can find the documentation for info docs here.

1 Like

Based on the link you shared, is it fair to assume that latest_replication_date is the date (or the closest date) the chv clicks on the Sync Now button?

Not really… the latest_replication_date only gets updated when it’s corresponding document is EDITED and synced. So when the CHV submits a report and syncs it the first time, the initial_replication_date and latest_replication_date will be the same. If the CHV never edits that report, the latest_replication_date will never change.

Also, just in case it’s not clear, there is one -info document per replicated document. So if you submit 5 pregnancy reports and then sync them to the server, the CHT will create 5 corresponding -info documents on the server.

In your situation, I think you will be interested in comparing the reported_date on the report with the initial_replication_date on the -info doc. The reported_date will be the date that the CHV submitted that report (using the date on their phone), and the initial_replication_date will be the date the report was synced to the server for the first time.

You may need to tweak this for your situation, but here is some SQL that can help you see the difference between the three dates (reported, initial replication, and latest replication).

SELECT
	form.uuid,
	form.formname,
	form.reported AS chv_submitted_date,
	infodoc.doc->>'initial_replication_date' AS initial_replication_date,
	infodoc.doc->>'latest_replication_date' AS latest_replication_date
	
FROM
	form_metadata AS form
	INNER JOIN couchdb AS infodoc ON (form.uuid = infodoc.doc->>'doc_id')

LIMIT 50
1 Like

Hi @michael , @derick,

I did some analysis by running the following query to follow-up with CHVs whose reported date is greater than the sync date. My assumption was that their device date/time might be incorrect but that wasn’t the case. So I am not sure what’s the explanation behind this discrepancy.

select * from useview_chv_latest_reported_date latest_r inner join useview_chv_sync sync
on latest_r.chv_uuid = sync.chv_uuid and latest_reported_date > latest_replication_date

Hi @iesmail …

I’m not sure how your two useview’s are defined, are you just JOINing the CHV’s UUID or do those views already have some other logic in them? You need to JOIN the info doc with the report (from form_metadata) and compare those dates.

From my understanding of your intention, I think the ones you will be interested in will be where the reported date is in one month and the initial_replication_date is in a different (later) month. For example… you could have a scenario where the report was submitted on January 25th (form_metadata.reported) but the CHV didn’t sync until say February 10th (initial_replication_date).

HI @michael,

Here are the useview definitions:

useview_chv_.latest_reported_date

------------------------------------------------------------

-- Materialized view to show table of the most recent document upload date for each CHV.

------------------------------------------------------------

DROP MATERIALIZED VIEW IF EXISTS useview_chv_latest_reported_date;

CREATE MATERIALIZED VIEW useview_chv_latest_reported_date AS

(

  SELECT DISTINCT

      CASE

          WHEN doc ->> 'type' = 'data_record' THEN doc #>> '{fields,inputs,user,contact_id}'

          WHEN doc ->> 'type' = 'person' THEN CASE

            WHEN doc #>> '{meta,created_by_person_uuid}' != '' THEN doc #>> '{meta,created_by_person_uuid}'

            WHEN doc #>> '{meta,created_by_person_uuid}' = '' THEN doc #>> '{meta,last_edited_by_person_uuid}' END

          WHEN doc ->> 'type' = 'clinic' THEN doc #>> '{meta,created_by_person_uuid}' END

          AS chv_uuid,

      MAX(TO_TIMESTAMP((NULLIF(doc ->> 'reported_date', '')::bigint / 1000)::double precision)) AS latest_reported_date

  FROM

      couchdb

  WHERE

    doc ->> 'type' = 'data_record' OR doc ->> 'type' = 'person' OR doc ->> 'type' = 'clinic'

  GROUP BY

      chv_uuid

);

-- indexes

CREATE UNIQUE INDEX useview_chv_latest_reported_date_index ON useview_chv_latest_reported_date USING btree (chv_uuid,latest_reported_date);

useview_chv_sync

CREATE OR replace FUNCTION f_cast_dtts(TEXT) RETURNS TIMESTAMP WITHOUT TIME ZONE

    immutable

    LANGUAGE SQL

AS $$

SELECT $1::TIMESTAMP WITHOUT TIME ZONE

$$;

------------------------------------------------------------

-- Materialized view to show table of last CHV sync dates.

------------------------------------------------------------

DROP MATERIALIZED VIEW IF EXISTS useview_chv_sync;

CREATE MATERIALIZED VIEW useview_chv_sync AS

(

  WITH info_CTE AS

    (

      SELECT

        doc->>'doc_id' AS doc_uuid,

        f_cast_dtts(doc ->> 'latest_replication_date') AS latest_replication_date

      FROM

        couchdb

      WHERE

        doc ->> 'latest_replication_date' != 'unknown'

        AND f_cast_dtts(doc ->> 'latest_replication_date') >= (NOW() - INTERVAL '3 months')

        AND doc->>'type' = 'info'

    ),

  clinic_cte AS

    (

      SELECT

        clinic.uuid AS doc_uuid,

        chw_area.contact_uuid AS chw_uuid

      FROM

        contactview_metadata clinic

      LEFT JOIN

        contactview_metadata chw_area

      ON

        chw_area.uuid = clinic.parent_uuid

      WHERE

        clinic.type='clinic'

    ),

  persons_cte AS

    (

      SELECT

        person.uuid AS doc_uuid,

        clinic.chw_uuid AS chw_uuid

      FROM

        contactview_metadata person

      INNER JOIN

        clinic_cte clinic

      ON

        clinic.doc_uuid = person.parent_uuid

      WHERE

        person.type='person'

    ),

  reports_cte AS

    (

      SELECT

        meta.uuid,

        meta.chw AS chw_uuid

      FROM

        form_metadata meta

    )

  SELECT DISTINCT

    chw_uuid AS chv_uuid

    , MAX(latest_replication_date) AS latest_replication_date

  FROM

    (SELECT DISTINCT ON

      (latest_replication_date, info.doc_uuid)

      info.doc_uuid,

      docs.chw_uuid,

      latest_replication_date

    FROM

      info_CTE info

    LEFT JOIN

    (

      SELECT * FROM

        clinic_cte

      UNION ALL

      SELECT * FROM

        persons_cte

      UNION ALL

      SELECT * FROM reports_cte

    )

    docs ON docs.doc_uuid = info.doc_uuid

    ORDER BY latest_replication_date DESC) AS all_docs

  GROUP BY chw_uuid

  ORDER BY latest_replication_date DESC

);

-- indexes

CREATE UNIQUE INDEX useview_chv_sync_chw_uuid_latest_replication_date ON useview_chv_sync USING btree (chv_uuid,latest_replication_date);

So if you join these two views based on chv_uuid and search for instances where reported date is ahead of sync date, it will return a list of chvs that match this criteria This shouldn’t be the case because a CHV cannot submit a form after it has been synced.
This then begs the question as to why the reported date is greater than the sync date?
One reason might be that the CHV’s device date/time is in the future. Therefore, when they submitted the form, the reported date was saved as a future date. But upon investigating with the chvs, we found out that their device date/ times are correct. After ruling out this possibility, I can’t seem to think of any other reason behind this discrepancy.

Could it be that the date on the CHV phone was in the future and after being warned by the app of a misconfigured date issue, they ended up correcting it?

I personally called the CHV and asked them to read the date and time on their phone and also immediately send me a screenshot.

Thanks @iesmail …

As you correctly pointed out, if the phone’s date/time was wrong when they submitted the record, the reported date on the record will have the wrong date/time. So even if they immediately update their date/time after submitting the report, it will be permanently stored with the wrong time… so I don’t think you should rule out wrong date/time on the phone even though you have called them to ask what the date/time is on their phones.

Your queries are both using the aggregate MAX function which will obscure much of the detail you probably want to look at. To better understand what is going on, I would recommend JOINing individual records with their info docs and looking for records where reported > initial_replication_date. The info doc timestamps are based on the server time, just to be clear.

The query below will return all reports (form_metadata) where the reported day is LATER than the initial replication day (I’m casting to date for ease). Does this query return any records for you?

SELECT
	form.uuid,
	form.reported::date AS chv_submitted_date,
	(infodoc.doc->>'initial_replication_date')::date AS initial_replication_date,
	form.reported::date - (infodoc.doc->>'initial_replication_date')::date AS days_diff
	
FROM
	form_metadata AS form
	INNER JOIN couchdb AS infodoc ON (form.uuid = infodoc.doc->>'doc_id')

WHERE
	infodoc.doc->>'initial_replication_date' <> 'unknown' AND
	form.reported::date - (infodoc.doc->>'initial_replication_date')::date > 0

@michael,

An interesting observation is that after I called the chv to verify the date/time, I then re-ran the query I posted earlier and I still saw the person in the result set.

I ran the query you shared and it returns 110 rows with a max days_diff of 90 days.

OK. So those 110 rows are examples where the date on the phone was definitely wrong. We sometimes see dates on phones in the opposite direction too (2010, for example), just FYI.

Happy to connect one-on-one to have a look at things more closely if you’d like.

The connected_users view is also very useful for this. It isn’t exactly a “sync date” but it is the timestamp when the user last connected to the server. This is basically when did the user “last attempt to sync”.

https://<instance>/medic-logs/_design/logs/_view/connected_users?reduce=false&descending=true&end_key=1676592000000
  • descending=true show users who connected most recently
  • end_key=1676592000000 show connections only after Feb 17 2023
4 Likes