Destroying historical test data on instance

@Job_Isabai, regarding our conversation earlier this week about deleting historical reports from a test instance, I have put together this NodeJS script that should at least provide a starting point to work from. Currently, to run the script (with NodeJS 18+), you must supply a COUCH_URL environment variable that points to your CHT instance (ending with /medic) and includes an admin username and password.

COUCH_URL=https://medic:password@my.cht.instance.com/medic node purge_docs_from_server.js

The script will prompt you to enter a date. All reports on the instance with a reported_date on or before the given date will be completely removed from the server.

Warnings/caveats:

  • Should NEVER be used anywhere close to a prod instance. It is intended to cause data loss!
  • Totally removes the data from the server using the Couch _purge endpoint. There is no way to roll back these changes or recover the deleted data.
  • Will only remove reports (docs where type = data_record). No contact docs or other docs will be affected.
  • Requires NodeJS 18+ to run.
  • Documents are totally removed from the server, but this deletion will NOT be replicated to any clients. The documents will remain on the client.
    • For a test instance with a small set of client devices the easiest thing to do is to just reset and re-sync all the devices after running this script to ensure the devices only contain the data remaining on the server.
    • Alternatively, you can implement a CHT Purging strategy that will purge all reports from client devices after a specified period of time. Then, use this script to hard-delete these purged docs from the server.
  • This script is presented as-is with no guarantees of success. Use at your own risk!
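
For anyone curious what a server-side purge like this involves, here is a minimal sketch of the general approach. It is not the script itself. The helper names (parseCouchUrl, buildSelector, buildPurgeBody, purgeReports) are hypothetical, and it assumes that reported_date is stored as milliseconds since the epoch and that the CouchDB version behind the instance supports the _find and _purge endpoints. It purges only the revision returned by _find, so a doc with conflicting revisions may need more than one pass.

```javascript
// Illustrative sketch only; the helper names below are hypothetical and are
// not taken from purge_docs_from_server.js.

// Node's built-in fetch rejects URLs with embedded credentials, so move the
// username/password from COUCH_URL into a Basic auth header.
function parseCouchUrl(couchUrl) {
  const url = new URL(couchUrl);
  const user = decodeURIComponent(url.username);
  const pass = decodeURIComponent(url.password);
  url.username = '';
  url.password = '';
  return {
    baseUrl: url.toString().replace(/\/$/, ''),
    auth: 'Basic ' + Buffer.from(`${user}:${pass}`).toString('base64'),
  };
}

// Mango selector for reports reported on or before the cutoff.
// Assumption: reported_date is stored as milliseconds since the epoch.
function buildSelector(cutoffMs) {
  return { type: 'data_record', reported_date: { $lte: cutoffMs } };
}

// The _purge endpoint expects a map of doc id -> list of revisions to purge.
function buildPurgeBody(docs) {
  return Object.fromEntries(docs.map((doc) => [doc._id, [doc._rev]]));
}

async function post(baseUrl, auth, path, body) {
  const res = await fetch(`${baseUrl}/${path}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Authorization: auth },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`${path} failed with HTTP ${res.status}`);
  return res.json();
}

// Purges matching reports in batches until none remain. IRREVERSIBLE.
async function purgeReports(couchUrl, cutoffMs, batchSize = 100) {
  const { baseUrl, auth } = parseCouchUrl(couchUrl);
  let total = 0;
  for (;;) {
    const { docs } = await post(baseUrl, auth, '_find', {
      selector: buildSelector(cutoffMs),
      fields: ['_id', '_rev'],
      limit: batchSize,
    });
    if (docs.length === 0) return total;
    const { purged } = await post(baseUrl, auth, '_purge', buildPurgeBody(docs));
    // Stop rather than loop forever if the server purged nothing.
    if (Object.keys(purged || {}).length === 0) {
      throw new Error('Nothing was purged; aborting');
    }
    total += docs.length;
  }
}
```

Under these assumptions, something like `purgeReports(process.env.COUCH_URL, Date.parse('2023-01-31'))` would remove every report dated on or before midnight UTC of that day. Without a suitable index, the _find query scans the whole database, which is tolerable on a small test instance but slow on a large one.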

We have a partner that requested a similar strategy and I’m putting together a draft of a cold storage solution script. I’ll share final source code here, after it’s been reviewed and approved.

Hi Josh,
I will test this later this week and keep you posted.
Thanks,

Just realized that I am on Node 12. I will need to assess whether the upgrade might affect my existing setup before upgrading.

For what it is worth, the script can be run from anywhere that is able to connect to the test instance. It does not have to be run on the instance itself. Another option is to run it from within a Docker container that already has Node 18:

docker run -it --rm -v "$PWD":/usr/src/app -w /usr/src/app -e COUCH_URL=https://medic:password@192-168-1-248.my.local-ip.co:8443/medic node:18 node src/purge_docs_from_server.js

Thank you for that script, it works well.

Would it be possible to tweak it to remove contacts too?

Like by changing this?

br

We can definitely tweak the script to also purge contacts. (I actually just pushed some refactoring changes to the repo that will make adding features a lot easier in the future.)

I think the big question is what would be the most useful way to qualify on which contacts to purge? Some options:

  • Just use reported_date same as the reports
  • Purge all contacts of a certain type (e.g. all persons)
  • Purge all contacts in a particular hierarchy tree (probably going to require significant logic to make this happen)
  • Some combination of these

They all sound good, but I would start with the “patient” type, because tons of those are created during testing, training, or demoing.

In addition, there are fewer contacts of other types, so manual deletion is doable in my case. Other contacts may also bring more complexity, since they can be attached to users.

br

FYI, I have just merged support for purging “patient” contacts! Currently I am considering a contact to be a patient if they have "role": "patient" and the type of the contact matches the specified contact type (person by default). Definitely open to tweaking this logic if it is not suitable!
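
To make the rule concrete, here is a rough sketch of that check as a standalone predicate. The function name isPatientContact and its signature are hypothetical and not necessarily what the script uses. The sketch compares only the doc's type field, so a contact whose type is stored in some other field would not match.

```javascript
// Hypothetical predicate illustrating the "patient" rule described above:
// the doc has "role": "patient" and its type equals the chosen contact type.
function isPatientContact(doc, contactType = 'person') {
  return Boolean(doc) && doc.role === 'patient' && doc.type === contactType;
}

// Example: pick the patient contacts out of a mixed batch of docs.
const sampleDocs = [
  { _id: 'p1', type: 'person', role: 'patient' },
  { _id: 'p2', type: 'person', role: 'chw' },
  { _id: 'c1', type: 'clinic' },
  { _id: 'r1', type: 'data_record' },
];
const patientIds = sampleDocs.filter((doc) => isPatientContact(doc)).map((doc) => doc._id);
console.log(patientIds);
```

Only p1 is selected here: p2 has a different role, and c1 and r1 are not of the person type.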
