Additional scalability testing for CHT 4.0.0

gareth · September 22, 2022, 2:39am

History

Previously, CHT scalability was tested by measuring how many users could replicate documents from the server, simulating first login. This exercised the main endpoints that CHT applications use, but was incomplete because it didn’t cover uploading documents to the CHT or incremental synchronization. To address this, an additional scalability test suite has been developed that continuously generates documents to find the maximum throughput of the CHT.

Setup

The test ran with the following parameters.

CHT 4.0.0 pre-release.
API and Sentinel running on 8 cores with 15 GB RAM.
A cluster of 3 CouchDB nodes, each with 4 cores and 30 GB RAM.
300 users, each starting with 2,100 documents and creating 1,000 additional documents. The test timed how long it took the server to complete the replication. Here we’re making the assumption that this is similar to more users each creating fewer documents.
This test did not include some document types, for example: tasks, targets, and telemetry. This is because the replication code performs the same regardless of document type.
The result is expressed as the maximum number of documents synced per month, assuming the load is spread evenly over the month. If most of a project’s activity happens at a specific time of the month, the amount the server can process in a month will be lower.
All servers and the simulated clients ran within the same datacenter so the internet connection was ideal. This helped to isolate server scalability by eliminating network limitations.

Results

Under the above conditions, the test showed that the CHT server could process more than 53,000 documents per hour, or about 38 million documents per month. From this, the maximum number of active users for a given deployment can be calculated if you know how many documents the average user will create.
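
The calculation described above can be sketched roughly as follows. This is only an illustration, not part of the test suite: the function names, the 30-day month, the `active_fraction` parameter (a crude way to model load that is concentrated in part of the month), and the figure of 1,000 documents per user per month are all assumptions made for the example.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: a 30-day month


def monthly_capacity(docs_per_hour: float, active_fraction: float = 1.0) -> float:
    """Documents the server could process in a month.

    active_fraction is a hypothetical knob: 1.0 means load is spread evenly
    over the whole month; 0.25 means all activity lands in a quarter of it.
    """
    if not 0 < active_fraction <= 1:
        raise ValueError("active_fraction must be in (0, 1]")
    return docs_per_hour * HOURS_PER_MONTH * active_fraction


def max_active_users(
    docs_per_hour: float,
    docs_per_user_per_month: float,
    active_fraction: float = 1.0,
) -> int:
    """Upper bound on active users, rounded down to a whole user."""
    if docs_per_user_per_month <= 0:
        raise ValueError("docs_per_user_per_month must be positive")
    capacity = monthly_capacity(docs_per_hour, active_fraction)
    return int(capacity // docs_per_user_per_month)


if __name__ == "__main__":
    # 53,000 docs/hour is the measured throughput from this post.
    print(monthly_capacity(53_000))
    # Hypothetical deployment: each user creates 1,000 documents per month.
    print(max_active_users(53_000, 1_000))
    # Same deployment, but all activity falls in a quarter of the month.
    print(max_active_users(53_000, 1_000, active_fraction=0.25))
```

Real deployments will have their own per-user document counts and activity patterns, so the inputs should be replaced with project-specific numbers.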

Open questions

An additional test was executed with each user starting with more documents. In this case performance degraded quickly. We need to investigate further whether this is a limitation of the testing framework, the client, or the server.
It would be very interesting to compare the results above with older CHT versions.
Another interesting dimension would be testing how these results change when using a larger CouchDB cluster, for example 6 or 12 nodes.

marc · September 23, 2022, 6:30pm

Thanks for posting the preliminary results. It’s exciting to see the scalability testing results after so much hard work has gone into the new architecture.

From my quick calculations, an existing project of over 4000 users with eight use cases would reasonably be able to increase the number of users 3x to over 12,000 on CHT v4.0 with multiple CouchDB nodes. This is a huge gain!

Each project would have different document calculations and synchronization distribution throughout the month, so it’s worth digging into the numbers further. I’m happy to walk through the calculations with anyone interested in making their own estimates. In the meantime, it’s a great sign for deployments looking to scale further!