Purging on 3.14.2 - After 8 hours waiting, how can I know if purging is still in progress?

Purging hasn’t worked since December on this instance. We have upgraded to 3.14.2. I scheduled purging for cron: "0 7 * * SAT". This is blocking users from using the system, so it is very important to complete the purging and I’m trying to understand if anything is happening.

There is no purgelog or purgelog:error docs. I have waited about 9 hours and I’m just wanting some indication that it is still in progress. How can I know?

How long should I wait? It took ~15 hours last time.

I don’t really know how to interpret the logs except to know that it started on time. If sentinel crashes or restarts, do I need to reschedule the purge? If sentinel hasn’t restarted and I reschedule, will it cancel the existing purge, or can I accidentally start two purges in parallel? I’m also slightly concerned that two purges are running at once.

Apr 16 02:00:00 -- [2022-04-16 07:00:00] 2022-04-16 07:00:00 INFO: Running purging 
Apr 16 02:00:01 -- [2022-04-16 07:00:00] 2022-04-16 07:00:00 INFO: Purging: Starting contacts batch: key "", doc id "", batch size 1000 
Apr 16 02:00:21 -- [2022-04-16 07:00:20] 2022-04-16 07:00:20 INFO: Task purging completed 
Apr 16 02:00:21 -- [2022-04-16 07:00:20] 2022-04-16 07:00:20 INFO: Task outbound started 
Apr 16 02:00:21 -- [2022-04-16 07:00:20]   cron: '',
Apr 16 02:00:21 -- [2022-04-16 07:00:20] 2022-04-16 07:00:20 INFO: Task reminders completed 
Apr 16 02:00:21 -- [2022-04-16 07:00:20]   mute_after_form_for: '',
Apr 16 02:00:21 -- [2022-04-16 07:00:20] 2022-04-16 07:00:20 INFO: Task reminders started 
Apr 16 02:00:21 -- [2022-04-16 07:00:20] 2022-04-16 07:00:20 INFO: Task replications completed 
Apr 16 02:00:21 -- [2022-04-16 07:00:20] 2022-04-16 07:00:20 INFO: Task replications started 
Apr 16 02:00:21 -- [2022-04-16 07:00:20]   message: '' } 
Apr 16 02:00:21 -- [2022-04-16 07:00:20]   text_expression: '',
Apr 16 02:00:21 -- [2022-04-16 07:00:20] 2022-04-16 07:00:20 WARN: Reminder configuration invalid: { form: '',
Apr 16 02:00:21 -- [2022-04-16 07:00:20] 2022-04-16 07:00:20 INFO: Task dueTasks started 
Apr 16 02:00:21 -- [2022-04-16 07:00:21] 2022-04-16 07:00:21 INFO: Background cleanup batch: 59747098 -> 59747134 (36) 
Apr 16 02:00:21 -- [2022-04-16 07:00:20] 2022-04-16 07:00:20 INFO: Task backgroundCleanup started 
Apr 16 02:00:21 -- [2022-04-16 07:00:20] 2022-04-16 07:00:20 INFO: Task purging started 
Apr 16 02:00:21 -- [2022-04-16 07:00:21] 2022-04-16 07:00:21 INFO: Task backgroundCleanup completed 
Apr 16 02:00:21 -- [2022-04-16 07:00:20] 2022-04-16 07:00:20 INFO: Task dueTasks completed 
Apr 16 02:00:23 -- [2022-04-16 07:00:22] 2022-04-16 07:00:22 WARN: Purging: Too many reports to purge. Decreasing contacts batch size to 500 
Apr 16 02:00:23 -- [2022-04-16 07:00:22] 2022-04-16 07:00:22 INFO: Purging: Starting contacts batch: key "", doc id "", batch size 500 
...
Apr 16 02:00:27 -- [2022-04-16 07:00:26] 2022-04-16 07:00:26 INFO: Task outbound completed 
Apr 16 02:00:27 -- [2022-04-16 07:00:26] 2022-04-16 07:00:26 ERROR: Failed to push d7e22b25-77c1-4c43-8903-aa2a62d01474 to muso-sih, server responsed with 400 
Apr 16 02:00:27 -- [2022-04-16 07:00:26] 2022-04-16 07:00:26 ERROR: Response body: {"statut":400,"message":"The date_of_birth is required !"} 
Apr 16 02:00:50 -- [2022-04-16 07:00:50] 2022-04-16 07:00:50 INFO: Purging: Starting contacts batch: key "", doc id "", batch size 250 
Apr 16 02:00:50 -- [2022-04-16 07:00:50] 2022-04-16 07:00:50 WARN: Purging: Too many reports to purge. Decreasing contacts batch size to 250 
Apr 16 02:01:27 -- [2022-04-16 07:01:27] 2022-04-16 07:01:27 WARN: Purging: Too many reports to purge. Decreasing contacts batch size to 125 
Apr 16 02:01:27 -- [2022-04-16 07:01:27] 2022-04-16 07:01:27 INFO: Purging: Starting contacts batch: key "", doc id "", batch size 125 
Apr 16 02:02:03 -- [2022-04-16 07:02:02] 2022-04-16 07:02:02 INFO: Purging: Starting contacts batch: key "", doc id "", batch size 62 
Apr 16 02:02:03 -- [2022-04-16 07:02:02] 2022-04-16 07:02:02 WARN: Purging: Too many reports to purge. Decreasing contacts batch size to 62 
Apr 16 02:02:11 -- [2022-04-16 07:02:11] 2022-04-16 07:02:11 INFO: transitions: processing enabled 
Apr 16 02:02:31 -- [2022-04-16 07:02:30] 2022-04-16 07:02:30 INFO: Found 8095 records 
Apr 16 02:05:21 -- [2022-04-16 07:05:20] 2022-04-16 07:05:20 INFO: Task purging started 
Apr 16 02:05:21 -- [2022-04-16 07:05:20]   cron: '',

Hi @kenn

The length of the last purging cycle is a good indication of how long it will take to purge.

By looking at the logs you can see things like:

Apr 16 20:38:27 | [2022-04-16 17:38:27] 2022-04-16 17:38:27 INFO: Purging: Starting contacts batch: key "c50_family", doc id "7a30d4f0-8b43-4b8a-baff-63d2f85e5937", batch size 124 
Apr 16 20:38:29 | [2022-04-16 17:38:29] 2022-04-16 17:38:29 INFO: Found 8207 records 

So purging is definitely still ongoing.

Progress is hard to estimate, as we don’t know how many contacts we still have to process, how many reports they have, or whether we’ll need to reduce the batch size if there are too many reports for any of these contacts. What you can know is that contacts are purged alphabetically, by type and then by uuid, so the key and doc id in the latest batch line (key “c50_family”, doc id “7a30d4f0-8b43-4b8a-baff-63d2f85e5937”) give you a rough idea of how far it has progressed.
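To make that check less manual, the batch lines can be picked out with a small script. This is only a sketch, assuming log lines in the format shown above; the name `lastPurgeBatch` is made up for illustration and is not part of CHT.

```javascript
// Hypothetical helper (not part of CHT): find the last
// "Purging: Starting contacts batch" entry in a block of sentinel logs.
const BATCH_LINE = /Purging: Starting contacts batch: key "([^"]*)", doc id "([^"]*)", batch size (\d+)/;

function lastPurgeBatch(logText) {
  let last = null;
  for (const line of logText.split('\n')) {
    const match = BATCH_LINE.exec(line);
    if (match) {
      // Keep overwriting so the final value is the last matching line.
      last = { key: match[1], docId: match[2], batchSize: Number(match[3]) };
    }
  }
  return last; // null when no batch line was found
}

const sample = [
  'Apr 16 20:38:27 | [2022-04-16 17:38:27] 2022-04-16 17:38:27 INFO: Purging: Starting contacts batch: key "c50_family", doc id "7a30d4f0-8b43-4b8a-baff-63d2f85e5937", batch size 124',
  'Apr 16 20:38:29 | [2022-04-16 17:38:29] 2022-04-16 17:38:29 INFO: Found 8207 records',
].join('\n');

console.log(lastPurgeBatch(sample));
```

Running it on log excerpts taken some time apart and comparing the `key` and `docId` values shows whether the purge has moved on to later contacts.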

Do you think logs that specifically mention “purging”, like “Purging: Starting contacts batch”, are not sufficient to indicate that purging is ongoing?

Purging took ~31 hours this time. It was about 2x as long, but it did complete successfully.

I think I just didn’t know how to interpret the log entries, and sometimes they are very far apart. I’m happy with the logging now that I know what the entries are saying. I’ll just look for “Purging: Starting contacts batch” if I’m wondering whether purging is in progress.

@kenn
I’m starting a project with many offline users (CHVs) on low-spec devices that can’t hold a lot of data, so I will have to write client-side purging that runs at the end of every week, on Friday or Saturday evening.
Would you kindly share the code that worked for you?

You’re going to need to write a solution that works for you and your project - but here is an example we use in Malawi that can get you started.


@kenn, I’m currently working on getting the community strategy to work with the custom contact types. I’ll review this once that is done. Thank you.

@kenn, my devices are pulling many documents, so it is time for a purging rule. The rule above helps.

I’m testing the rule below on a test instance. It purges documents for chw, chw_supervisor and health_worker users, Monday to Friday at 4 pm.

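module.exports = {
  text_expression: 'at 4 pm Monday to Friday',
  run_every_days: 7,
  cron: '0 16 * * 1-5',

  fn: (userCtx, contact, reports) => {
    // Only purge for users holding at least one of these roles.
    const purgeRoles = ['chw_supervisor', 'chw', 'health_worker'];
    if (!userCtx.roles.some(role => purgeRoles.includes(role))) {
      return [];
    }

    // Cutoff timestamp: one year ago. Only older reports are purged.
    const old = Date.now() - (1000 * 60 * 60 * 24 * 365);

    // Return the ids of the reports to purge.
    return reports
      .filter(r => r.reported_date < old)
      .map(r => r._id);
  }
};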
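One thing to check when matching several roles in a rule like this: in JavaScript, `'a' && 'b'` evaluates to its last operand, so `roles.includes('a' && 'b')` only tests for `'b'`. Below is a minimal sketch of a check against a list of roles that can be dry-run with Node before uploading anything. `hasAnyRole` and the sample user are illustrative and not part of CHT.

```javascript
// Hypothetical helper: true when the user holds at least one of the wanted roles.
const hasAnyRole = (userCtx, wanted) =>
  (userCtx.roles || []).some(role => wanted.includes(role));

// && between non-empty strings returns the last one.
console.log('chw_supervisor' && 'chw' && 'health_worker'); // health_worker

const chw = { roles: ['chw'] };
const wanted = ['chw_supervisor', 'chw', 'health_worker'];

// The chained-&& form only looks for 'health_worker', so a plain chw is missed.
console.log(chw.roles.includes('chw_supervisor' && 'chw' && 'health_worker')); // false

// The list-based check matches any of the three roles.
console.log(hasAnyRole(chw, wanted)); // true
```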

I noted that you can add the code to app_settings and push it to the instance using cht --local compile-app-settings backup-app-settings upload-app-settings.
You can also create a separate purge.js file as above. How do you push that file to the instance?
