Configure downwards replication

We’re hoping to filter the amount/type of records that get replicated to specific users.

This is partly because we noticed replication warnings for our DHO users, which have recently begun causing timeouts, and partly because we’re trying to limit data usage.

From what I’ve seen it should be possible to filter the documents that will be replicated.
In this case, for a Verbal Autopsy Practitioner (VAP), we want to replicate all records that are not of type “hhm”, plus records of type “hhm” that have been flagged as possibly deceased.
Something along the lines of:

var filterFunction = function (doc, req) {
    // Retrieve user context from the request
    var userCtx = req.userCtx;

    // Check if the user has the role "vap"
    var hasVapRole = userCtx && userCtx.roles && userCtx.roles.indexOf('vap') !== -1;
    // Replicate all records if not the VAP role.
    if(!hasVapRole){
        return true;
    }
    // For users with the "vap" role, only replicate documents
    // where contact_type is not "hhm",
    // or where contact_type is "hhm" and death_flag is "true"
    return doc.contact_type !== "hhm" || doc.death_flag === "true";
};

What eludes me is where this piece of code should live, and how to make it fire on downwards replication only.

Hi @Anro

I suggest you use purging instead of adding this custom code.
CHT-Core used to use a replication filter function, but this was removed in 2016. There’s a very long thread discussing the very high performance impact that replication filter functions have: Replication performance · Issue #2286 · medic/cht-core · GitHub

We have since rewritten the replication mechanism 3 times, each time to improve performance and stability and to reduce complexity. We are also shifting away from using the CouchDB replication protocol entirely, and have instead implemented our own protocol in the latest versions of the CHT.

I suggest you look into purging, and adjust your purging function to exclude documents from being replicated: Purging | Community Health Toolkit

Hi @diana,

Thank you for the detailed response and guidance. I’ve always wondered why the CHT broke away from the default CouchDB replication mechanisms; now I’ve got a little bit of background.
I believe the downwards replication was further tweaked in the 4.3 upgrade.

Per your suggestion I’ve been having a look at the purging, just a few questions/confirmations:

  1. Does the purging function live on the server and run as a scheduled task by sentinel?
  2. Purging does not actually delete the document in the medic database, but rather adds a type of exclusion right?
  3. Will purging only work on downwards replication and not upwards? Might be a silly question.
  4. The documentation mentions creating a purge.js file on project root, whereas another section notes creating an entry in the app_settings.json. Do both work? I’d very much prefer to write code in a JS file.
  5. Is there a way of “editing” a record before it gets replicated downwards? Like omitting a property in the record.

We have two use cases:

  1. Pull consent reports to the client, without pulling their attachments.
  2. Pull household members marked as deceased to VAP users.

Hi @Anro

Does the purging function live on the server and run as a scheduled task by sentinel?

This is correct.

Purging does not actually delete the document in the medic database, but rather adds a type of exclusion right?

Also correct.

Will purging only work on downwards replication and not upwards? Might be a silly question.

Only downwards. New documents that are created and uploaded, and that should be purged, will be purged the next time the purging cron runs.

The documentation mentions creating a purge.js file on project root,

purge.js is the right place. It gets copied into app_settings when the config is compiled.
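For reference, a minimal purge.js could look something like the sketch below. It follows the general shape used in the purging documentation (a schedule plus an fn that returns the IDs to purge). The one-year cutoff and the name purgeConfig are assumptions made for illustration, not recommendations.

```javascript
// purge.js (sketch): purge reports older than roughly one year.
// The cutoff and the name purgeConfig are illustrative assumptions.
const purgeConfig = {
  run_every_days: 7,
  fn: function (userCtx, contact, reports, messages) {
    // Assumption: because the function is copied into app_settings,
    // it is safest to keep everything it needs inside fn itself.
    const ONE_YEAR_MS = 1000 * 60 * 60 * 24 * 365;
    const cutoff = Date.now() - ONE_YEAR_MS;

    // Return the _id of every report that should be purged for this user.
    return (reports || [])
      .filter((report) => report.reported_date < cutoff)
      .map((report) => report._id);
  }
};

module.exports = purgeConfig;
```

Returning an empty array means nothing is purged for that user and contact.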

Is there a way of “editing” a record before it gets replicated downwards? Like omitting a property in the record.

No, there is no way to transform documents arbitrarily.

Thank you for confirming the above.

Just making 100% sure: this would be on the root of the project, not the root of the project-specific config folder entry?

We need to transform all our consent forms, essentially omitting the signature attachment, prior to replicating the record downwards.
If purging does not facilitate that functionality, and db filters aren’t advised, what other mechanism can be used to achieve the same goal?

To enable purging, write your purge configuration to purge.js in your project root:

Purging is config. It should live where your config lives.

If purging does not facilitate that functionality, and db filters aren’t advised, what other mechanism can be used to achieve the same goal?

I’m afraid we have no such mechanism, and I would discourage anyone from doing that on a regular basis. If you need to do this, it should be a one-off, and you should update your config so that new documents have the right properties.

I would personally advise you to search for a workaround where this is not needed. If you can’t find one, the only alternative is to write your own custom script that would do this migration.

We have written several migrations ourselves, but editing large batches of documents can cause major disruption: it adds server load, since everyone needs to download the edited documents again. We no longer support or create migrations that alter large numbers of documents; instead, we make our code backwards compatible with all previous “versions” of these documents.
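As a rough illustration of what such a custom script might do per document, the sketch below removes a single named attachment from a CouchDB-style document object. The function name withoutAttachment and the attachment name signature.png are hypothetical, and reading and saving the documents is deliberately left out. Saving the result would change the stored document itself, so the attachment would be gone on the server as well.

```javascript
// Returns a copy of `doc` without the attachment called `attachmentName`.
// CouchDB lists a document's attachments under the `_attachments` property.
function withoutAttachment(doc, attachmentName) {
  if (!doc._attachments || !(attachmentName in doc._attachments)) {
    return doc; // nothing to strip, return the document as-is
  }
  const remaining = { ...doc._attachments };
  delete remaining[attachmentName];

  const copy = { ...doc };
  if (Object.keys(remaining).length > 0) {
    copy._attachments = remaining;
  } else {
    delete copy._attachments; // drop the property once it is empty
  }
  return copy;
}

// Hypothetical consent report, for illustration only.
const consent = {
  _id: 'consent-1',
  form: 'consent',
  _attachments: {
    'signature.png': { content_type: 'image/png', length: 1234, stub: true }
  }
};

console.log(withoutAttachment(consent, 'signature.png'));
```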

Thank you for your detailed response!

I’m hoping our very recent upgrade to 4.3.x, containing the downwards replication rewrite, will help address some of our downward syncing woes with higher hierarchy level users. That being said, as work continues the sheer amount of data these folks will need to download will increase. It does not make sense, at least for our purposes, to replicate signatures down. It also won’t be a one-off thing, with consent being given and taken away at any given time.
This is why the CouchDB filter functions would suit our needs so well: they allow us to omit the parts of records we’re no longer interested in, rather than the entire record as with purging.

I’m not sure if custom migration scripts will work for us either as our server is quite modestly resourced, especially since our last migration conversation.
And, as you’ve noted, we don’t really want to introduce additional syncing overhead.

Am I correct in thinking that, while the CHT doesn’t use them, it’s still possible to have CouchDB use filter functions?

The above only pertains to our one use case, where we need to omit a specific property before it’s replicated down.


Our other use case, only pulling specific users & their death reports, I believe can be facilitated via purging.
I’ve yet to do some testing as the form/feature itself is still under development, but this is what I’ve dreamt up so far:

// As of 3.14.0, contacts that have more than 20,000 associated reports + messages will be skipped, 
// and none of their associated reports and messages will be purged.
// https://docs.communityhealthtoolkit.org/apps/guides/performance/purging/#considerations

module.exports = {
  run_every_days: 7,
  fn: function(userCtx = {}, contact = {}, reports = [], messages, chtScriptApi, permissions) {
    // The following purging rules only apply to the "Verbal Autopsy Practitioner (VAP)" user.
    // We've been experiencing some replication issues on the uppermost levels (DHO, Team Lead).
    // We're attempting to proactively guard this role against those same woes.
    if((userCtx && userCtx.roles && userCtx.roles.indexOf('vap') !== -1)){
      // We're only concerned with the following contact types.
      // This includes the household member ("hhm"), which has some additional checks and is therefore a special entry.
      const HHM = 'hhm';
      const allowedContactTypes = ["npo", "team_area", "indawo", "dwelling", "household", HHM];
      // For the VAP we're only really interested in hhm's that are flagged for death.
      // However, we need to replicate the entire hierarchy down in order to get to them.
      const DEATH_FLAGGED = 'death_flag';

      const shouldPurgeContact = (doc) => {
        // Purge if the document does not contain a "contact_type" property
        if(!doc.contact_type){
          return true;
        }

        // Purge if the contact_type is not in the allowed list
        if(!allowedContactTypes.includes(doc.contact_type)) {
          return true;
        }
        
        // Purge hhm documents that have not been flagged as deceased
        if(doc.contact_type === HHM && (!doc[DEATH_FLAGGED] || doc[DEATH_FLAGGED] === 'false')) {
          return true;
        }

        return false;
      };

      const purgeContact = (shouldPurgeContact(contact) ? [contact._id] : []).filter((value) => value);

      // Purge all the reports, except the death report so that VAPs can see the work they've done
      const reportsToPurge = reports.filter((value) => value.form !== 'death_report').map(r => r._id).filter((v) => v);
      
      // Purging of messages is not needed as we don't use the feature
      return [
        ...purgeContact,
        ...reportsToPurge
      ]; 
    }

    return []; // Do not purge anything
  }
};
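One way to sanity-check rules like these before deploying is to call the function directly with hand-made data. The sketch below uses a trimmed-down stand-in for the function above (reports only) and entirely made-up documents; the name purgeFn, the "chw" role, and the sample IDs are hypothetical.

```javascript
// Trimmed-down stand-in for the purge function: for "vap" users,
// purge every report except death reports; purge nothing for other roles.
const purgeFn = function (userCtx = {}, contact = {}, reports = []) {
  const isVap = Array.isArray(userCtx.roles) && userCtx.roles.includes('vap');
  if (!isVap) {
    return [];
  }
  return reports
    .filter((report) => report.form !== 'death_report')
    .map((report) => report._id);
};

// Made-up sample data.
const contact = { _id: 'hhm-1', contact_type: 'hhm', death_flag: 'true' };
const reports = [
  { _id: 'r1', form: 'death_report' },
  { _id: 'r2', form: 'consent' }
];

console.log(purgeFn({ roles: ['vap'] }, contact, reports)); // only 'r2' is purged
console.log(purgeFn({ roles: ['chw'] }, contact, reports)); // nothing is purged
```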

I’m not sure what “possible” means here. If you intend to write your own replication script to download documents somewhere then yes, you can use filter functions. These functions cannot be inserted or included in any way in the CHT, unless you edit core code and hardcode your functions in your client.
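For such a standalone script, a CouchDB filter function is stored as a string under the filters key of a design document and is referenced by name when replicating or reading the changes feed. A filter only returns true or false for a whole document; it cannot drop individual properties. The sketch below builds such a design document. The names vap_filters and hhm_death_only are made up, and nothing here is wired into the CHT.

```javascript
// Filter: pass everything except "hhm" contacts not flagged as deceased.
// (Role checks via req.userCtx are left out to keep the sketch short.)
const hhmDeathOnly = function (doc, req) {
  return doc.contact_type !== 'hhm' || doc.death_flag === 'true';
};

// Design document as CouchDB stores it: functions are kept as strings.
// The _id and the filter name are hypothetical.
const designDoc = {
  _id: '_design/vap_filters',
  filters: {
    hhm_death_only: hhmDeathOnly.toString()
  }
};

// A replication or _changes request references the filter as
// "<design doc name>/<filter name>", e.g. filter=vap_filters/hhm_death_only.
const filterRef = 'vap_filters/hhm_death_only';

console.log(JSON.stringify(designDoc, null, 2));
console.log(filterRef);
console.log(hhmDeathOnly({ contact_type: 'hhm', death_flag: 'true' }, {})); // true
console.log(hhmDeathOnly({ contact_type: 'hhm' }, {}));                     // false
```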

Thanks Diana, appreciate the guidance.
Last question on this: can the contact, as shown in the above script, be purged just like the reports?

Hi @Anro

I believe you can purge contacts too - as in the system won’t stop you, as far as I remember. But I don’t believe we test for it. Our testing covers purging reports.

Thank you, @diana!

I managed to pin the script against the feature that is still under development, and it seemed to work as intended. Currently, it’s not very optimized, as all the places will still pull regardless of whether they have an HHM flagged for death, but it’s a step in the right direction :blush:.
Luckily most reports and persons are omitted :rocket: .
Further testing will tell if the script holds up.
