Hosting cost estimation improvement suggestions

I think the short answer is yes. Even this first iteration of the calculator is based on data collected from production instances.

However, it is important to note that the current algorithm is extremely primitive (mostly for the sake of starting simple and then layering on complexity as necessary). For example, server costs are currently scoped to one of five EC2 instance configurations. The actual production costs for that configuration are included, but there are many other EC2 configurations that could be used instead.

I think it should be pretty feasible to get accurate numbers for things like hosting cost per CPU/RAM/disk, as well as for things like docs per GB of disk space. Precision is going to be harder for things like projecting the doc growth rate, though. Currently I am approximating that rate from the number of workflows implemented and the number of contacts in the instance. But perhaps there is a better way?
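To make the shape of that approximation concrete, here is a rough sketch in Python. It is not the calculator's actual code: the multiplicative growth formula, the function names, and every coefficient below are hypothetical placeholders that I am assuming for illustration. Real values would have to come from the production data.

```python
from dataclasses import dataclass


@dataclass
class InstanceProfile:
    workflows: int  # number of workflows implemented
    contacts: int   # number of contacts in the instance


# Hypothetical placeholder coefficients, NOT measured values.
DOCS_PER_CONTACT_PER_WORKFLOW_PER_MONTH = 0.5
DOCS_PER_GB = 1_000_000          # docs that fit in 1 GB of disk
DISK_COST_PER_GB_MONTH = 0.10    # USD per GB per month
SERVER_COST_PER_MONTH = 100.0    # USD for the one EC2 configuration in scope


def monthly_doc_growth(profile: InstanceProfile) -> float:
    """Assumed model: growth scales with workflows times contacts."""
    return (profile.workflows * profile.contacts
            * DOCS_PER_CONTACT_PER_WORKFLOW_PER_MONTH)


def projected_monthly_cost(profile: InstanceProfile, months: int) -> float:
    """Monthly cost once `months` of doc growth has accumulated on disk."""
    docs = monthly_doc_growth(profile) * months
    disk_gb = docs / DOCS_PER_GB
    return SERVER_COST_PER_MONTH + disk_gb * DISK_COST_PER_GB_MONTH


if __name__ == "__main__":
    profile = InstanceProfile(workflows=4, contacts=10_000)
    print(monthly_doc_growth(profile))
    print(projected_monthly_cost(profile, months=50))
```

The point of splitting it this way is that the per-GB and per-server coefficients can be pinned down from billing data, while the growth function is the part most likely to be swapped for something better.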

So, yeah, the main goal here is to leverage the data we have about production instances to produce the most accurate estimation of the TCO for instances with various properties. But some level of abstraction and approximation is always going to be required. My hope is that we can continue to evaluate and refine the algorithm and “test” it against known production instances to gain confidence in the accuracy of the estimation.
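As a sketch of what that “testing” could look like, the snippet below compares estimated costs with known actual costs and reports the error. The function names and the idea of using mean absolute percentage error are my own assumptions, and the instance names and figures in the usage example are made up purely to show the mechanics.

```python
def relative_errors(estimates: dict, actuals: dict) -> dict:
    """Signed relative error per instance: (estimate - actual) / actual."""
    return {
        name: (estimates[name] - actuals[name]) / actuals[name]
        for name in actuals
    }


def mean_absolute_percentage_error(estimates: dict, actuals: dict) -> float:
    """Average size of the error across instances, as a percentage."""
    errors = relative_errors(estimates, actuals)
    return 100 * sum(abs(e) for e in errors.values()) / len(errors)


if __name__ == "__main__":
    # Made-up monthly costs in USD, for illustration only.
    actuals = {"instance-a": 100.0, "instance-b": 100.0}
    estimates = {"instance-a": 110.0, "instance-b": 90.0}
    print(relative_errors(estimates, actuals))
    print(mean_absolute_percentage_error(estimates, actuals))
```

Keeping the signed per-instance errors alongside the aggregate helps show whether the algorithm is consistently over- or under-estimating for a certain kind of instance.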