A person assigned to a place, regardless of offline/online, should have their data scoped

This issue has been discussed at length in this GitHub issue and has also been opened here at @kenn’s request.

I’d like the behavior to be consistent for both the reports and contact data.
If a user, regardless of online or offline, is scoped to a specific place, the data they pull should be scoped too.

At the moment the hierarchy data seems to be scoped, but if a user is online all the report data across the system is made available to them.

Since it is possible to create a user that is not linked to a hierarchy person, it would still be possible to see ALL reports if necessary.

This would also solve a replication timeout issue we’re currently facing on our top level hierarchy persons. These folks don’t need to be offline users, as some of them will never capture an app form. Unfortunately, due to how the report data is scoped we are forced to make them offline.

I think the related cht-core mega-issue for this is (somewhat misleadingly named):


Just to clarify, online users should have access to all data on both the Reports and Contacts pages. The difference is that the view on the Contacts page changes a bit depending on whether or not you have an associated place (online users are not required to have a “place”; offline users are). You can test this by logging in as your DHO user and searching for a contact that is in NPO, or by going to the detail view of a report in NPO and tapping on the contact. It should have no problem navigating to the Contacts page for the contact in NPO (even though the DHO user is associated with another DHO). So while the UI does change a bit, data access should be consistent across the Reports and Contacts pages for online users. Online users’ data is never scoped; offline users’ data is always scoped.

I know the above doesn’t help you achieve what you want to achieve, but I wanted to make sure it was clear how things are expected to work today.

My understanding is that it would be a very significant technical effort to make online users’ data scoped… something to the effect of having to update every single CouchDB view in the app.


Thank you for all the info @michael and @jkuester.

It seems that, for now, we’re going to use the purge script to reduce the amount of data replicated down to devices, including contacts.

The hope is to make our 3 top level users (DHO, VAP, Team Lead) usable again, as currently they draw too much data which causes their login to fail on the replication step.

Taking into consideration that some of our calculations rely on certain forms being available, it seems like purging reports older than 4 months should be a safe bet.
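The four-month cutoff described above could be sketched roughly as follows. This is a minimal TypeScript illustration, not the project's actual purge configuration: the `AgedReport` shape, the `staleReportIds` helper, the `keepForms` list, the 120-day approximation of "4 months", and the parameter list of `purgeOldReports` are all assumptions. A real CHT purge function is written in plain JavaScript and should be checked against the purging documentation for the CHT version in use.

```typescript
// Hypothetical minimal report shape; real CHT reports carry many more fields.
interface AgedReport {
  _id: string;
  form: string;          // form code the report was submitted with
  reported_date: number; // epoch milliseconds
}

const DAY_MS = 24 * 60 * 60 * 1000;
// "4 months" approximated as 120 days (an assumption, not a CHT constant).
const MAX_AGE_MS = 120 * DAY_MS;

// Return the ids of reports older than the cutoff, skipping any form that
// calculations still depend on (keepForms is a hypothetical allow-list).
function staleReportIds(
  reports: AgedReport[],
  now: number,
  maxAgeMs: number,
  keepForms: string[]
): string[] {
  const cutoff = now - maxAgeMs;
  return reports
    .filter((r) => r.reported_date < cutoff && keepForms.indexOf(r.form) === -1)
    .map((r) => r._id);
}

// Shaped like a per-contact purge function: it receives one contact's reports
// and returns the ids to purge. The parameter list here is an assumption.
function purgeOldReports(
  _userCtx: unknown,
  _contact: unknown,
  reports: AgedReport[]
): string[] {
  return staleReportIds(reports, Date.now(), MAX_AGE_MS, []);
}
```

Keeping the date arithmetic in a separate helper that takes `now` as an argument makes the cutoff easy to test without waiting four months.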

We’re also planning to use the purge script as a sort of archiving tool, since we have been unable to find such a capability listed within the app documentation itself.
What this means is that each edit form (for both places and persons) will set an additional property called archived. It will also prefix the name property value with “(archived)” to indicate the record status to online users. The purge script will then look for this property in order to omit the contact record from replication. That’s the goal at least.
Ideally these records will be hidden for online users too, perhaps similar to users marked as deceased.

@jkuester would omitting an Indawo also omit the Households, people and reports underneath it?

@michael how exactly would one go about amending the views in order to achieve such a result? Perhaps we can give it a test.

would omitting an Indawo also omit the Households, people and reports underneath it?

If by “omit” you mean purge, then the answer is no. Purging should be done on a doc-by-doc basis. There might be some edge cases where this could work for a user’s initial sync (after they first log in), but to properly purge the docs from a device that is already logged in, your purge function needs to return the ids of each doc to purge (both contacts and reports).
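To make the doc-by-doc point concrete, here is a small TypeScript sketch that simulates a purge function being called once per contact. It assumes the `archived` property proposed earlier in the thread; the `ContactDoc` and `ReportDoc` shapes, the `purgeArchived` function, and the sample ids are all hypothetical, and whether contact ids returned from a purge function are honoured should be verified against the CHT version in use.

```typescript
// Hypothetical minimal shapes for illustration only.
interface ContactDoc {
  _id: string;
  archived?: boolean; // the property proposed in this thread, not a CHT built-in
}

interface ReportDoc {
  _id: string;
}

// Decides, for ONE contact and its own reports, which ids to purge.
// Nothing here cascades to children of the contact.
function purgeArchived(contact: ContactDoc, reports: ReportDoc[]): string[] {
  if (!contact.archived) {
    return [];
  }
  return [contact._id].concat(reports.map((r) => r._id));
}

// Simulate the per-contact calls for a tiny hierarchy in which only the
// top-level place has been marked as archived.
const calls: Array<{ contact: ContactDoc; reports: ReportDoc[] }> = [
  { contact: { _id: 'indawo-1', archived: true }, reports: [] },
  { contact: { _id: 'household-1' }, reports: [] },
  { contact: { _id: 'person-1' }, reports: [{ _id: 'report-1' }] },
];

let purgedIds: string[] = [];
for (const call of calls) {
  purgedIds = purgedIds.concat(purgeArchived(call.contact, call.reports));
}
// purgedIds is ['indawo-1']: the household, person and report are untouched.
```

In this sketch, getting the household, person and report purged as well would require each of those contacts to carry its own `archived` flag, since each call only sees the contact it was invoked for.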

Also, I just want to note that data “archival” continues to be hotly discussed by CHT maintainers. The general consensus so far is that if you have data that no users need to see anymore (either offline or online users), then ideally it would get totally removed from the main CouchDB altogether. The idea is that Couch is good for “hot” storage of data that is actively being used by the system (or that users may need to reference in the course of their work), but it is not the best for archiving historical data for “cold” storage. Beyond just the scalability concerns with keeping an indeterminate amount of historical data in Couch, there is the practical consideration that historical data is mostly useful for analytics and Couch is less than ideal for these kinds of queries. (Hence the existence of cht-sync to allow for data analytics with Postgres.)

We are very much still in the investigation phases, trying to find the best approach for this kind of archival, so we don’t have anything helpful to recommend yet. If you are interested in bleeding-edge technical details, we are currently exploring a flow where server-side purging will remove data from client devices, then cht-sync can “archive” data into Postgres, and finally we use the _purge endpoint to completely remove the “archived” data from the production Couch database. Poor performance of the _purge endpoint has been something of a problem though…


The hope is to make our 3 top level users (DHO, VAP, Team Lead) usable again, as currently they draw too much data which causes their login to fail on the replication step.

Have you configured replication depth for these users?
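For readers unfamiliar with the setting, replication depth limits how many levels below an offline user's place are replicated to their device. The TypeScript sketch below only illustrates that idea: the `DepthRule` shape mirrors the `role` and `depth` fields of the `replication_depth` app setting as I understand it, while the role names, the depth values and the `isWithinDepth` helper are hypothetical and are not CHT's own implementation.

```typescript
// One rule per role, mirroring the role/depth pairs of the app setting.
interface DepthRule {
  role: string;
  depth: number; // levels below the user's own place to replicate
}

// Hypothetical role names and depths for the three top-level users.
const replicationDepth: DepthRule[] = [
  { role: 'dho', depth: 1 },
  { role: 'vap', depth: 1 },
  { role: 'team_lead', depth: 2 },
];

// Illustrative check: would a contact N levels below the user's place be
// replicated? A role without a rule is treated as having no depth limit.
function isWithinDepth(
  role: string,
  levelsBelowUserPlace: number,
  rules: DepthRule[]
): boolean {
  for (const rule of rules) {
    if (rule.role === role) {
      return levelsBelowUserPlace <= rule.depth;
    }
  }
  return true;
}
```

With rules like these, a `dho` user would receive their own place and its direct children, but not households or people further down the hierarchy.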