We used goodtables when building the matching tool at DSaPP. It’s designed primarily for testing data at ingestion: the data have to be loaded into Python and held in RAM. That makes it a great fit for the purpose Lucas suggests (testing data being ingested into modeling pipelines, etc.), and also for testing data uploaded to a repository before attempting to write them to a database table. For testing data resting in or moving through the database, it’s probably quite a bit slower than tools like dbt, which can test the data inside the database without extracting them.
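To make the distinction concrete, here is a rough sketch using only the Python standard library. It is not the goodtables or dbt API; the sample data, the `people` table, and the function names are all made up for illustration. The first check parses every row into Python, as an ingestion-time validator does, while the second lets the database count the bad rows so that only the result leaves it.

```python
import csv
import io
import sqlite3

# Hypothetical sample upload; the column names are invented for illustration.
RAW = "id,age\n1,34\n2,\n3,-5\n"


def check_in_memory(text):
    """Ingestion-time check: every row is parsed into Python objects in RAM."""
    errors = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        if not row["age"]:
            errors.append((line_no, "age is missing"))
        elif int(row["age"]) < 0:
            errors.append((line_no, "age is negative"))
    return errors


def load(conn, text):
    """Write the upload to a table, storing empty ages as NULL."""
    conn.execute("CREATE TABLE people (id INTEGER, age INTEGER)")
    rows = [
        (int(r["id"]), int(r["age"]) if r["age"] else None)
        for r in csv.DictReader(io.StringIO(text))
    ]
    conn.executemany("INSERT INTO people VALUES (?, ?)", rows)


def check_in_database(conn):
    """At-rest check: the database scans the table; only a count comes back."""
    query = "SELECT COUNT(*) FROM people WHERE age IS NULL OR age < 0"
    return conn.execute(query).fetchone()[0]
```

With a file this small the two approaches are indistinguishable. The gap I’d expect shows up when the table is large enough that pulling it out of the database dominates the cost of the check itself.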