We would like to capture the date the CHV syncs their records to the database. We have a CHV P4P scheme in place where we pay the CHVs on a particular date every month . We have received many complaints from CHVs claiming that the amount they were paid does not reflect the work they did. We suspect that they sync the data late after the payment deadline but we have no way of proving that unless we have the sync date information.
From 3.12, we are tracking new replication-related metrics in telemetry. You can read about the specific metrics by searching for replication:
on that page.
Do let us know if any would appear beneficial in resolving your challenge.
There is also a metadata document (an info
doc) available that might be useful in this situation. This will tell you when each document was initially replicated to the server, as well as when the latest update was replicated to the server.
You can find the documentation for info
docs here.
Based on the link you shared, is it fair to assume that latest_replication_date
is the date (or the closest date) the chv clicks on the Sync Now button?
Not really… the latest_replication_date
only gets updated when it’s corresponding document is EDITED and synced. So when the CHV submits a report and syncs it the first time, the initial_replication_date
and latest_replication_date
will be the same. If the CHV never edits that report, the latest_replication_date
will never change.
Also, just in case it’s not clear, there is one -info
document per replicated document. So if you submit 5 pregnancy reports and then sync them to the server, the CHT will create 5 corresponding -info
documents on the server.
In your situation, I think you will be interested in comparing the reported_date
on the report with the initial_replication_date
on the -info
doc. The reported_date
will be the date that the CHV submitted that report (using the date on their phone), and the initial_replication_date
will be the date the report was synced to the server for the first time.
You may need to tweak this for your situation, but here is some SQL that can help you see the difference between the three dates (reported, initial replication, and latest replication).
SELECT
form.uuid,
form.formname,
form.reported AS chv_submitted_date,
infodoc.doc->>'initial_replication_date' AS initial_replication_date,
infodoc.doc->>'latest_replication_date' AS latest_replication_date
FROM
form_metadata AS form
INNER JOIN couchdb AS infodoc ON (form.uuid = infodoc.doc->>'doc_id')
LIMIT 50
I did some analysis by running the following query to follow-up with CHVs whose reported date is greater than the sync date. My assumption was that their device date/time might be incorrect but that wasn’t the case. So I am not sure what’s the explanation behind this discrepancy.
select * from useview_chv_latest_reported_date latest_r inner join useview_chv_sync sync
on latest_r.chv_uuid = sync.chv_uuid and latest_reported_date > latest_replication_date
Hi @iesmail …
I’m not sure how your two useview
’s are defined, are you just JOINing the CHV’s UUID
or do those views already have some other logic in them? You need to JOIN the info
doc with the report (from form_metadata
) and compare those dates.
From my understanding of your intention, I think the ones you will be interested in will be where the reported date is in one month and the initial_replication_date
is in a different (later) month. For example… you could have a scenario where the report was submitted on January 25th (form_metadata.reported
) but the CHV didn’t sync until say February 10th (initial_replication_date
).
HI @michael,
Here are the useview definitions:
useview_chv_.latest_reported_date
------------------------------------------------------------
-- Materialized view to show table of the most recent document upload date for each CHV.
------------------------------------------------------------
DROP MATERIALIZED VIEW IF EXISTS useview_chv_latest_reported_date;
CREATE MATERIALIZED VIEW useview_chv_latest_reported_date AS
(
SELECT DISTINCT
CASE
WHEN doc ->> 'type' = 'data_record' THEN doc #>> '{fields,inputs,user,contact_id}'
WHEN doc ->> 'type' = 'person' THEN CASE
WHEN doc #>> '{meta,created_by_person_uuid}' != '' THEN doc #>> '{meta,created_by_person_uuid}'
WHEN doc #>> '{meta,created_by_person_uuid}' = '' THEN doc #>> '{meta,last_edited_by_person_uuid}' END
WHEN doc ->> 'type' = 'clinic' THEN doc #>> '{meta,created_by_person_uuid}' END
AS chv_uuid,
MAX(TO_TIMESTAMP((NULLIF(doc ->> 'reported_date', '')::bigint / 1000)::double precision)) AS latest_reported_date
FROM
couchdb
WHERE
doc ->> 'type' = 'data_record' OR doc ->> 'type' = 'person' OR doc ->> 'type' = 'clinic'
GROUP BY
chv_uuid
);
-- indexes
CREATE UNIQUE INDEX useview_chv_latest_reported_date_index ON useview_chv_latest_reported_date USING btree (chv_uuid,latest_reported_date);
useview_chv_sync
CREATE OR replace FUNCTION f_cast_dtts(TEXT) RETURNS TIMESTAMP WITHOUT TIME ZONE
immutable
LANGUAGE SQL
AS $$
SELECT $1::TIMESTAMP WITHOUT TIME ZONE
$$;
------------------------------------------------------------
-- Materialized view to show table of last CHV sync dates.
------------------------------------------------------------
DROP MATERIALIZED VIEW IF EXISTS useview_chv_sync;
CREATE MATERIALIZED VIEW useview_chv_sync AS
(
WITH info_CTE AS
(
SELECT
doc->>'doc_id' AS doc_uuid,
f_cast_dtts(doc ->> 'latest_replication_date') AS latest_replication_date
FROM
couchdb
WHERE
doc ->> 'latest_replication_date' != 'unknown'
AND f_cast_dtts(doc ->> 'latest_replication_date') >= (NOW() - INTERVAL '3 months')
AND doc->>'type' = 'info'
),
clinic_cte AS
(
SELECT
clinic.uuid AS doc_uuid,
chw_area.contact_uuid AS chw_uuid
FROM
contactview_metadata clinic
LEFT JOIN
contactview_metadata chw_area
ON
chw_area.uuid = clinic.parent_uuid
WHERE
clinic.type='clinic'
),
persons_cte AS
(
SELECT
person.uuid AS doc_uuid,
clinic.chw_uuid AS chw_uuid
FROM
contactview_metadata person
INNER JOIN
clinic_cte clinic
ON
clinic.doc_uuid = person.parent_uuid
WHERE
person.type='person'
),
reports_cte AS
(
SELECT
meta.uuid,
meta.chw AS chw_uuid
FROM
form_metadata meta
)
SELECT DISTINCT
chw_uuid AS chv_uuid
, MAX(latest_replication_date) AS latest_replication_date
FROM
(SELECT DISTINCT ON
(latest_replication_date, info.doc_uuid)
info.doc_uuid,
docs.chw_uuid,
latest_replication_date
FROM
info_CTE info
LEFT JOIN
(
SELECT * FROM
clinic_cte
UNION ALL
SELECT * FROM
persons_cte
UNION ALL
SELECT * FROM reports_cte
)
docs ON docs.doc_uuid = info.doc_uuid
ORDER BY latest_replication_date DESC) AS all_docs
GROUP BY chw_uuid
ORDER BY latest_replication_date DESC
);
-- indexes
CREATE UNIQUE INDEX useview_chv_sync_chw_uuid_latest_replication_date ON useview_chv_sync USING btree (chv_uuid,latest_replication_date);
So if you join these two views based on chv_uuid
and search for instances where reported date is ahead of sync date, it will return a list of chvs that match this criteria This shouldn’t be the case because a CHV cannot submit a form after it has been synced.
This then begs the question as to why the reported date is greater than the sync date?
One reason might be that the CHV’s device date/time is in the future. Therefore, when they submitted the form, the reported date was saved as a future date. But upon investigating with the chvs, we found out that their device date/ times are correct. After ruling out this possibility, I can’t seem to think of any other reason behind this discrepancy.
Could it be that the date on the CHV phone was in the future and after being warned by the app of a misconfigured date issue, they ended up correcting it?
I personally called the CHV and asked them to read the date and time on their phone and also immediately send me a screenshot.
Thanks @iesmail …
As you correctly pointed out, if the phone’s date/time was wrong when they submitted the record, the reported
date on the record will have the wrong date/time. So even if they immediately update their date/time after submitting the report, it will be permanently stored with the wrong time… so I don’t think you should rule out wrong date/time on the phone even though you have called them to ask what the date/time is on their phones.
Your queries are both using the aggregate MAX
function which will obscure much of the detail you probably want to look at. To better understand what is going on, I would recommend JOINing individual records with their info
docs and looking for records where reported > initial_replication_date
. The info
doc timestamps are based on the server time, just to be clear.
The query below will return all reports (form_metadata) where the reported day is LATER than the initial replication day (I’m casting to date
for ease). Does this query return any records for you?
SELECT
form.uuid,
form.reported::date AS chv_submitted_date,
(infodoc.doc->>'initial_replication_date')::date AS initial_replication_date,
form.reported::date - (infodoc.doc->>'initial_replication_date')::date AS days_diff
FROM
form_metadata AS form
INNER JOIN couchdb AS infodoc ON (form.uuid = infodoc.doc->>'doc_id')
WHERE
infodoc.doc->>'initial_replication_date' <> 'unknown' AND
form.reported::date - (infodoc.doc->>'initial_replication_date')::date > 0
An interesting observation is that after I called the chv to verify the date/time, I then re-ran the query I posted earlier and I still saw the person in the result set.
I ran the query you shared and it returns 110 rows with a max days_diff
of 90 days.
OK. So those 110 rows are examples where the date on the phone was definitely wrong. We sometimes see dates on phones in the opposite direction too (2010, for example), just FYI.
Happy to connect one-on-one to have a look at things more closely if you’d like.
The connected_users
view is also very useful for this. It isn’t exactly a “sync date” but it is the timestamp when the user last connected to the server. This is basically when did the user “last attempt to sync”.
https://<instance>/medic-logs/_design/logs/_view/connected_users?reduce=false&descending=true&end_key=1676592000000
-
descending=true
show users who connected most recently -
end_key=1676592000000
show connections only after Feb 17 2023