According to the CHT tech radar, cht-sync is still in the trial phase. We are considering other ETL tools already used in industry, such as Airflow, and would appreciate your thoughts on the following:
What would be the pros and cons of this approach for CHT compared to cht-sync?
Will it require introducing custom code and scripts to achieve
Hello @cliff
The main advantage of using cht-sync over another tool is that you do not have to develop much custom code, or solve again the issues that are common to all CHT instances and that cht-sync may already have solutions for.
Some examples include performance, monitoring, and schema design:
Performance: Getting dashboards and aggregates to update from CouchDB to SQL in near real time, while still keeping good query performance and flexibility, can be tricky. cht-sync has been performance tested against larger databases, both for near real-time updates and for query performance against aggregate tables.
Monitoring: cht-sync comes with monitoring and observability out of the box.
Schema design: cht-sync provides dbt models that map data from the CouchDB document format to a SQL schema that is easier for data consumers to work with.
Some custom code is still necessary in either case for the forms and aggregates that are only relevant to a particular CHT instance.
With cht-sync, this means adding dbt models on top of the base schema. With another tool, it means first writing ETL tasks to convert the CouchDB document data into a SQL schema, and then adding aggregates on top of that.
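To make that last point concrete, here is a rough sketch of the kind of task you would have to write and maintain yourself with a generic ETL tool. It is only an illustration: the function names (`flatten_report`, `count_by_form`) and the choice of columns are made up for this example and do not come from cht-sync or Airflow, and the document fields (`_id`, `form`, `reported_date`, `fields`) are assumed to look like a typical CHT report document.

```python
from collections import Counter
from typing import Any


def flatten_report(doc: dict[str, Any]) -> dict[str, Any]:
    """Map one CouchDB-style JSON document to a flat row for a SQL table.

    The field names are assumptions about the document shape, not a
    definitive CHT schema.
    """
    fields = doc.get("fields", {})
    return {
        "uuid": doc.get("_id"),
        "form": doc.get("form"),
        "reported": doc.get("reported_date"),
        # Hypothetical form-specific field; real forms will differ.
        "patient_id": fields.get("patient_id"),
    }


def count_by_form(rows: list[dict[str, Any]]) -> dict[str, int]:
    """Toy aggregate built on top of the flattened rows: reports per form."""
    return dict(Counter(row["form"] for row in rows))


if __name__ == "__main__":
    docs = [
        {"_id": "a1", "form": "pregnancy", "reported_date": 1700000000000,
         "fields": {"patient_id": "p1"}},
        {"_id": "a2", "form": "delivery", "reported_date": 1700000500000,
         "fields": {"patient_id": "p1"}},
    ]
    rows = [flatten_report(d) for d in docs]
    print(rows)
    print(count_by_form(rows))
```

With a generic tool, both steps (the flattening and the aggregate) are yours to schedule, test, and keep fast. With cht-sync, the first step is already covered by the base models, so only the instance-specific part remains.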