Ability to check if a form is previously filled for purged forms

Prajwol · January 16, 2025, 8:58am

Your Organization: Medic

Organization Type: Nepal Ministry of Health - eCHIS

What Other Organizations Would Benefit From This Feature: All current deployments need a history of filled forms to avoid duplication

Describe the Feature: As a CHN user (offline user), I want to check whether I have previously filled a (purgeable) form for a person or household, to avoid filling it again on consecutive visits.

What “Pain Point” Does The Proposed Feature Address:

Data Duplication
Extra work for CHN
Data correctness

Proposed Solution:

Each time a report is filed for a contact, capture the form name and the date it was filled in the contact summary, or
Keep a summary of the contact’s form-filling history, with the above details, computed at a set interval.
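
For illustration, the first option could be computed roughly as below. This is only a sketch: `summarizeFormHistory` is a hypothetical helper, not part of the CHT API, and it assumes a `reports` array like the one the contact summary receives, with `form` and `reported_date` on each report. It only sees reports that are still on the device, so on its own it does not cover forms that have already been purged.

```javascript
// Hypothetical helper (not a CHT API): builds one entry per form type with
// the number of submissions and the date of the most recent one.
// Assumes each report has a `form` name and a numeric `reported_date`.
function summarizeFormHistory(reports) {
  const history = {};
  for (const report of reports) {
    const entry = history[report.form] || { form: report.form, count: 0, lastFilled: 0 };
    entry.count += 1;
    entry.lastFilled = Math.max(entry.lastFilled, report.reported_date);
    history[report.form] = entry;
  }
  return Object.values(history);
}

// Made-up sample data, for illustration only
const sampleReports = [
  { form: 'death_report', reported_date: 1725001661000 },
  { form: 'copc-hhscreening', reported_date: 1725001000000 },
  { form: 'copc-hhscreening', reported_date: 1725009999000 },
];

for (const entry of summarizeFormHistory(sampleReports)) {
  console.log(entry.form, entry.count, new Date(entry.lastFilled).toISOString());
}
```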

Do you have funding to cover external developers?: No

Do You Have Resources (Designers, Developers, PMs) Available: Nepal App dev might be able to work on it.

Links To Supporting Information / Uploads: Might be similar to how OpenMRS tracks patient visits/encounters

diana · January 16, 2025, 10:31am

I believe the simplest approach would be not to purge forms that have a high duplication rate, because people end up resubmitting them.

jkuester · January 16, 2025, 2:24pm

There is a lot of discussion regarding preventing duplicates on this issue: Prevent duplicate sibling contact capture · Issue #9601 · medic/cht-core · GitHub

The initial focus is on preventing duplicate contacts, but we are trying to implement it in such a way as to be able to be naturally extended to reports in the future.

Most relevant to this thread, though, is that it is still going to depend on the existing report not being purged from the device. I do wonder if it is possible to set up your purge config to always leave the most recent copy of any purged form type. If not, that may be a possible feature to pursue adding…

Anro · January 21, 2025, 1:09pm

We’re interested in the "purge config to always leave the most recent copy of any purge form type" as well. Unfortunately, this is the only mention of such functionality we found while combing through the CHT forum threads.

Similar to the original post, we want to keep the last report to ensure that a subsequent form can load previous values via the passed-in context (provided through the contact summary), for example for legitimate follow-ups.

At the moment, we only have time-based purging, which, to quote our site manager, is a dealbreaker if it compromises the continuity of care data. For that reason, we can’t risk rolling out the purge script to CHWs just yet.

As far as I know, the purge script loops through documents one by one. Since it’s a self-contained function, it’s unable to “check” against other documents to determine if the current item is the newest. @diana would that be trivial to add?

Our purge script content:

// As of 3.14.0, contacts that have more than 20,000 associated reports + messages will be skipped, 
// and none of their associated reports and messages will be purged.
// NOTE: Purging does not touch documents in the medic database, everything is done in separate purge databases(medic-purged-roles-<roles-hash>).
// A purgelog document is saved in the medic-sentinel database after every purge. The purgelog has a meaningful id: purgelog:<timestamp>, where timestamp represents the moment when purging was completed.
// Errors can be found in the purgelog as purgelog:error:<timestamp>
// https://docs.communityhealthtoolkit.org/apps/guides/performance/purging/#considerations

// ############################################### PURPOSE ############################################### 
// We've been experiencing replication issues on the topmost users (DHO, Team Lead).
// The purge script will reduce the number of records synced to the client devices of these login users.
module.exports = {
  // We need to find a sweet spot, as firing too frequently could overburden our already modestly specced server
  // Purging may take more than 31 hours - https://forum.communityhealthtoolkit.org/t/purging-on-3-14-2-after-8-hours-waiting-how-can-i-know-if-purging-is-still-in-progress/1837/2
  run_every_days: 7, //  The interval (in days) at which purges will be downloaded client-side. Default 7.
  // 'text_expression': 'at 11:00 pm on Fri', // Any valid text expression to describe the interval of running purge server-side. For more information, see https://bunkat.github.io/later/parsers.html#text
  cron: '0 23 * * 5', // Same as above, just a different syntax. Either can be used, but one is required.
  // For more info on how to set the intervals see: https://docs.communityhealthtoolkit.org/building/guides/performance/purging/#schedule-configuration
  // The cron strings can be tested here: https://crontab.guru/

  /* eslint-disable-next-line no-unused-vars */ // Provides more info about the available tie-ins
  fn: function (userCtx = {}, contact = {}, reports = [], messages, chtScriptApi, permissions) { 
    // NOTE: the purge function CAN purge contacts, but it does not purge linked children.
    // This means that there can be dangling records replicated and could still impact performance.
    // We could consider having a static list of "top-ish level place IDs", and then remove any record with such a parent_id.
    // That could be very cumbersome to maintain, and contacts probably have a much smaller impact on performance than app forms.

    const NOW = Date.now();
    // The purge function is self-contained. See linked CHT docs above
    // This method could add an overhead of between 0 - 3 days depending on the month, per month.
    // It will ensure all records for the given months are retrieved.
    const monthsAgo = months => NOW - 1000 * 60 * 60 * 24 * 31 * months;

    // NOTE: Make sure this is kept up to date as the roles & responsibilities of the app evolve over time!
    const householdCOPC = 'copc-hhscreening';
    const householdCSharp = 'csharp-householdconsentandquestionnaire';
    const TEST_FORM = 'YYYZ'; // This form is only used when testing the purge script for new roles
    const individualCOPC = 'copc-individualhealthcaretasks';
    const individualCSharp = 'csharp-individualhhconsentedquestionnaire';
    const individualDeath = 'death_report';
    // The "appliesIf" property could perform some sort of calculation to check if the purge applies. Default true.
    // Three "preserve" "types" can be defined: -1 = all, 0 = none, any other number of months
    // The same config can be applied to report_ and contact_types.
    const PRESERVE_ALL = -1;
    const PRESERVE_NONE = 0;

    const CONF = Object.freeze({
      // TODO: dho
      // 'vap': {
      //   // We don't care about any reports except for the death report
      //   'reportTypes': {
      //     [householdCOPC] : PRESERVE_NONE,
      //     [householdCSharp]: PRESERVE_NONE,
      //     [individualCOPC]: PRESERVE_NONE,
      //     [individualCSharp]: PRESERVE_NONE,
      //     [individualDeath]: 3
      //   },
      //   // We also don't care about any individuals that aren't marked for death
      //   'contactTypes': {
      //     // We can't delete team_areas, indawos, dwellings, or households as we don't know which contains flagged individuals
      //     'hhm': {
      //       'appliesIf': (doc) => !doc['death_flag'] || doc['death_flag'] === '' || doc['death_flag'] === 'no',
      //       'preserve': 3
      //     },
      //     // NOTE: We can, however, remove all other hierarchy persons EXCEPT our own!
      //     // If you don't take care the app will become unusable. It's a terrible experience when you delete yourself
      //     'dho': PRESERVE_NONE,
      //     'team_lead': PRESERVE_NONE,
      //     'chw': PRESERVE_NONE
      //   }
      // },
      'team_lead': {
        'reportTypes': {
          [householdCOPC] : 1,
          [householdCSharp]: 1,
          [individualCOPC]: 1,
          [individualCSharp]: 1,
          [individualDeath]: 1,
          [TEST_FORM]: PRESERVE_NONE,
        },
      }
      // TODO: chw
    });

    // TODO: we have not considered login users with multiple roles!
    const role = userCtx && userCtx.roles && userCtx.roles.length >= 1 ? userCtx.roles[0] : false;
    
    if (role && role in CONF) {
      const reportTypes = CONF[role]['reportTypes'];
      const contactTypes = CONF[role]['contactTypes'];

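      const shouldPurge = (doc, conf) => {
        // Accept either a bare number of months or an object of the form { appliesIf, preserve }
        const { appliesIf, preserve } = conf && typeof conf === 'object' ? conf : { preserve: conf };
        if (typeof preserve !== 'number' || preserve === PRESERVE_ALL) {
          return false; // Unknown config or "keep everything": never purge
        }
        if (appliesIf && !appliesIf(doc)) {
          return false;
        }
        // PRESERVE_NONE purges regardless of age; otherwise purge only docs older than the cutoff
        return preserve === PRESERVE_NONE || doc.reported_date <= monthsAgo(preserve);
      };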

      const purgeContact = (contactTypes && contact.contact_type in contactTypes && shouldPurge(contact, contactTypes[contact.contact_type]) ? [contact._id] : []).filter((value) => value);

      // Guard against roles that define no reportTypes; `in` would throw on undefined
      const reportsToPurge = reports.filter((doc) => reportTypes && doc.form in reportTypes && shouldPurge(doc, reportTypes[doc.form])).map(r => r._id).filter((v) => v);

      // Purging of messages is not needed as we don't use the feature
      return [
        ...purgeContact,
        ...reportsToPurge
      ];
    }

    return []; // Do not purge anything
  }
};

// Because JSON is fantastic with comments, we have to add the `forms.json` content here
// This can be used to allow forms posted via the CHT "/api/v2/records" API
// {
//   "YYYZ": {
//     "meta": {
//       "code": "YYYZ"
//     },
//     "fields": { 
//       "patient_id": {
//           "labels": {
//               "short": {
//                   "translation_key": "form.flag.patient_id.short"
//               },
//               "tiny": "pid"
//           },
//           "position": 0,
//           "type": "string",
//           "length": [
//           5,
//           13
//           ],
//           "required": true
//       }
//     },
//     "public_form": true
//   }
// }

// The POST request body that adheres to the above config & API:
// {
//   "patient_id": "<your_patient_id_here>",
//   "form": "YYYZ",
//   "nurse": "Sam",
//   "week": 23,
//   "year": 2015,
//   "visit": "ANC",
//   "fields": {
//     "patient_id": "<your_patient_id_here>"
//   },
//   "_meta": {
//     "form": "YYYZ",
//     "reported_date": 1725001661000
//   }
// }

Edit:
Changes related to this topic discussed further in the following thread:
Setup purge config to always leave the most recent report (per type) - Technical Support / Development - Community Health Toolkit

diana · January 21, 2025, 2:02pm

This is not entirely accurate. The purge function receives a context, just like the contact-summary function, that contains a contact and all reports and messages that are about that contact. So you most definitely have access to “other documents”, as long as they are from the same context.

In your example, you can group reports by type, and only keep the most recent one, for example.
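
A minimal sketch of that idea, assuming every report carries the `_id`, `form` and `reported_date` fields used in the script above. The name `purgeAllButLatestPerForm` is hypothetical, the grouping lives inside the function because the purge function has to be self-contained, and this is a starting point rather than a tested configuration:

```javascript
// Sketch only: purge every report except the most recent one of each form
// type within the context passed to the purge function. The parameter list
// mirrors the signature used in the script above.
const purgeAllButLatestPerForm = function (userCtx, contact, reports, messages, chtScriptApi, permissions) {
  // Track the newest report seen so far for each form name
  const latestByForm = new Map();
  for (const report of reports || []) {
    const latest = latestByForm.get(report.form);
    if (!latest || report.reported_date > latest.reported_date) {
      latestByForm.set(report.form, report);
    }
  }

  // Everything that is not the newest report of its form gets purged
  return (reports || [])
    .filter((report) => latestByForm.get(report.form) !== report)
    .map((report) => report._id);
};

// Made-up sample data, for illustration only
const sample = [
  { _id: 'a', form: 'death_report', reported_date: 1000 },
  { _id: 'b', form: 'death_report', reported_date: 2000 },
  { _id: 'c', form: 'copc-hhscreening', reported_date: 1500 },
];

console.log(purgeAllButLatestPerForm({}, {}, sample)); // only 'a' is returned for purging
```

To combine this with time-based rules like the ones in the script above, the same `latestByForm` lookup could be used as an extra condition in the existing `reports.filter(...)` call, so that the newest report of a form is skipped even when it is older than the cutoff.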

Prajwol · January 21, 2025, 4:37pm

While this is a possible solution, we are purging because a large number of documents can slow the device down. If you look at the doc count here, quite a few CHNs have over 10k records. Furthermore, with each CHN covering more than 600 houses on average and many continuous-care use cases being released, this is a highly requested feature, not only for purged reports but also for other submitted reports.

diana · January 23, 2025, 3:48am

Unfortunately, with the way purging is designed now, there is no way to achieve this.

I can only think of extremely elaborate and complex ways to snapshot which forms were submitted. One similar recurring discussion was about snapshotting the contact summary, for example, but that never got past passing discussion and was never thought through in depth because of its complexity.

I strongly suggest you rely on firm and well put together purging rules to avoid duplication right now.

jkuester · January 24, 2025, 3:03pm

For future reference, the conversation regarding purging all but the most recent report for a form continued in a new thread: Setup purge config to always leave the most recent report (per type)