I’m looking to migrate org units from our Community Health Toolkit test instance to the UAT environment. I currently have approximately 13,000 contacts that range from level 1 to level 7. Given the size and the hierarchical nature of the data, I’m exploring the best methods to ensure a smooth transition.
Has anyone managed a similar migration? What best practices or tips can you share to handle these concerns effectively? Any advice or resources would be greatly appreciated.
Is there already existing data on the UAT instance or is it empty?
Do you want to also copy the reports associated with the migrated contacts from the test instance to UAT?
Do you also want to migrate the users (including credentials, etc.) from the test instance to UAT?
Are you migrating all of the contacts from the test instance to the UAT instance, or just a subset of them?
Depending on your precise needs here, a number of options come to mind.
To dump all of the data from the medic database and copy it to a new instance, you might be able to use the couchbackup utility discussed recently. It is also possible to set up direct replication between two Couch databases. This can be configured as a one-time job that copies all the data from one db to another.
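For the direct replication option, here is a minimal sketch of what a one-time job could look like against CouchDB's standard `/_replicate` endpoint. The host names, credentials, and helper function names are placeholders I made up for illustration, and I have not run this against a CHT instance, so treat it as a starting point rather than a tested procedure.

```python
import base64
import json
import urllib.request


def build_replication_request(source_url, target_url, create_target=False):
    """Return the JSON body for a one-time (non-continuous) replication."""
    return {
        "source": source_url,
        "target": target_url,
        # Only set create_target=True if the target db does not exist yet.
        "create_target": create_target,
        # Not continuous: the job stops once the target has caught up.
        "continuous": False,
    }


def start_replication(couch_url, body, username, password):
    """POST the body to /_replicate on the given CouchDB server."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"{couch_url.rstrip('/')}/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical URLs: replace with your own test and UAT instances.
    body = build_replication_request(
        "https://admin:secret@test.example.org/medic",
        "https://admin:secret@uat.example.org/medic",
    )
    print(json.dumps(body, indent=2))
    # Uncomment to actually start the job on the UAT server:
    # print(start_replication("https://uat.example.org", body, "admin", "secret"))
```

A non-continuous request to `/_replicate` only returns once the replication has finished, so for a larger database the HTTP call may time out on the client side. Saving the same body as a document in the `_replicator` database is the usual alternative in that case.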
Copying just the contact docs (and not any reports, etc.) is going to be trickier. However, filtered replication should be possible with a bit more configuration.
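As a rough illustration of the filtered option, CouchDB replication accepts a Mango `selector` that limits which docs are copied. The sketch below builds such a request body. It assumes contact docs are identified either by a legacy `type` (e.g. `person`, `clinic`) or by `type: "contact"` plus a `contact_type` field; check that against your own data. The function names and URLs are hypothetical.

```python
import json


def contact_selector(contact_types):
    """Build a Mango selector matching only the given contact types."""
    types = sorted(set(contact_types))
    return {
        "$or": [
            # Assumed legacy hierarchy: the type field holds the contact type.
            {"type": {"$in": types}},
            # Assumed configurable hierarchy: type is "contact" and the
            # actual type lives in contact_type.
            {"type": "contact", "contact_type": {"$in": types}},
        ]
    }


def build_filtered_replication(source_url, target_url, contact_types):
    """Return a one-time replication body restricted to contact docs."""
    return {
        "source": source_url,
        "target": target_url,
        "continuous": False,
        "selector": contact_selector(contact_types),
    }


if __name__ == "__main__":
    # Hypothetical URLs: replace with your own test and UAT instances.
    body = build_filtered_replication(
        "https://admin:secret@test.example.org/medic",
        "https://admin:secret@uat.example.org/medic",
        ["person", "clinic"],
    )
    print(json.dumps(body, indent=2))
```

The resulting body can be POSTed to `/_replicate` or saved as a doc in `_replicator`. Since reports do not match the selector, they should be left behind on the source.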
Migrating users/credentials is probably going to be the most difficult (if that is something that you need to do). I have never tried it before, but I doubt you can simply copy/replicate docs from one _users db to another. I think the user’s password salt is tied to the secret value that is specific to a given Couch instance.
One simple way to replicate docs between Couch databases is to use the doc replicate functionality of the CHToolbox. Under the hood, this tool uses normal Couch-to-Couch replication, so the data transfer is reasonably robust/efficient.
The chtx doc replicate command supports filtered replication by contact type. So, you can specify just the contact types you want to replicate and none of the other doc types (e.g. reports) will be copied.
Running something like this will replicate all the person and clinic docs:
CHToolbox is a testing utility and is not intended for use with a production instance or production data.
Replicating large numbers of documents can be a time-consuming process (and is affected by bandwidth and available compute power). 13,000 docs is a small batch, and I would not anticipate any performance issues with that number of docs. However, replicating hundreds of thousands of docs can take time, particularly if the target database has existing view indexes that will need to be updated with the newly replicated docs. The same performance considerations noted for the test-data-generator are relevant here.