Visualizing CHT data using Superset, couch2pg and Postgres

Hello community

We have been able to use cht-couch2pg docker-compose file to replicate remote CHT data to postgres and connected superset postgres though the data was exported from couchdb as a json file and imported into Postgres as a one column table with all the data still formatted with JSON.


With data in this format its hard to do some metrics ,
Is there a way to change this data when it gets into postgres in a relational table like form with columns and rows ?
Thanks
cc @mrjones @michael @diana @gareth

Hi @cliff. You can create views and materialized views to help you structure the data in a more relational way. Below is an example of how to create a materialized view that gets pregnancy registrations:

CREATE MATERIALIZED VIEW useview_pregnancy AS 
  SELECT 
    doc ->> '_id' AS uuid, 
    doc ->> 'form' AS form, 
    to_timestamp((NULLIF(doc ->> 'reported_date', '')::bigint / 1000)::double precision) AS reported, 
    COALESCE((doc #>> '{fields, patient_uuid}'), '') AS patient_uuid, 
    COALESCE((doc #>> '{fields, inputs, source}'), (doc #>> '{fields, source}'), '') AS source, 
    COALESCE((doc #>> '{fields, inputs, source_id}'), (doc #>> '{fields, source_id}'), '') AS source_id, 
    doc #>> '{fields, patient_age_in_years}' AS patient_age_in_years,
    doc #>> '{fields, lmp_date}' AS lmp_date, 
    doc #>> '{fields, edd_8601}' AS edd, 
    doc #>> '{fields, weeks_since_lmp}' AS weeks_since_lmp
  FROM 
    couchdb
    WHERE
      doc ->> 'form' = 'pregnancy';
1 Like

thanks @samuel ,
Gonna explore this :+1:

1 Like

@cliff a couple other tips:

  • Every time couch2pg it runs, it calls the refresh_matviews() function so data in materialized views is kept up to date
  • Depending on how much data you have and how performant the query is that defines your materialized view, refreshing the matviews can take some time
  • Materialized views are not refreshed in any particular order so be mindful of dependencies between materialized views (ie one materialized view probably shouldn’t depend on another one)
  • Since the matviews are refreshed CONCURRENTLY, you MUST have a UNIQUE index defined for every materialized view.
2 Likes

hello @michael @samuel
With superset where can one create the views and materialized views from ?
TJe SQL editor only allows SELECT statements

Hi @cliff.

PostgreSQL is where you have your data transformation while Superset is where you have your data use. The views and materialized views would be written in PostgreSQL. The queries you write in Superset would only read the data from the views and materialized views.

A rough idea of the steps you would follow after setting up cht-couch2pg:

  • Create materialized views on PostgreSQL
  • Create views on PostgreSQL that read from materialized views and any other tables or functions
  • Configure your database connector on Superset
  • Write SQL SELECT statements on Superset that will retrieve the data that will feed into your dashboards

I hope this helps.

1 Like

Some more tips and tricks:

  1. Make sure your dashboard mockups are complete before starting to write your queries
  2. Make sure you understand user access needs for each dashboard
  3. Make sure the indicator definitions/calculations are complete & understood before starting
  4. Plan your query architecture approach to minimize duplication
  5. To the extent possible, do calculations in PostgreSQL, NOT Superset
  6. Queries should always explicitly include field names in the SELECT (ie don’t use SELECT *)
  7. Never write the same calculation twice
  8. Use parameterized functions where useful
  9. “Shape” your query results in a way that will make it easier to work with in Superset
  10. Be mindful of query performance and the amount of data it will return to Superset
1 Like

helo @samuel @michael

i created a materialized view of data from a form with

CREATE MATERIALIZED VIEW care_form_view AS 
  SELECT 
    doc ->> '_id' AS uuid, 
    doc ->> 'form' AS form, 
    doc #>> '{fields, care, medical}' AS medical,
    doc #>> '{fields, care, adherence}' AS adherence,
    doc #>> '{fields, care, mental_health}' AS mental_health,
    doc #>> '{fields, care, support_system}' AS support_system,
    doc #>> '{fields, care, transportation}' AS transportation,
    doc #>> '{fields, care, living_situation}' AS living_situation,
    doc #>> '{fields, care, knowledge_disease}' AS knowledge_disease,
    doc #>> '{fields, care, communication_skills}' AS communication_skills
  FROM 
    couchdb
    WHERE
      doc ->> 'form' = 'care';

however on viewing the materilazed view in superset i get Error: permission denied for materialized view care_form_view

Wondering if you encountered this before ?
Thanks

Sounds like the user in your connection details from Superset does not have the permission to query the new view you created in PostgreSQL. Are you able to query other tables or views in PostgreSQL from Superset? If yes, I’d recommend looking in PostgreSQL to see who the owner is of the new view you created and compare that to the other views that you are able to query from Superset.

You might need to REASSIGN OWNED BY abc TO xyz

2 Likes

thanks @michael

i altered the view owner to the one accepted in superset and works fine now

1 Like