In the first quarter of 2022, we embarked on an exercise to track live site incidents from various CHT deployments. Live site incidents are defined as issues in production that have a demonstrable impact on users, such as an inability to perform their regular work. We respond to these issues with the highest priority and ensure appropriate resolution. From the sample instances we tracked, we observed that:
@binod - this is a great list - thanks for putting it together! I was wondering for this item:
Do you know how much of the 20% was just needing to upgrade to the latest available CHT version vs how much of it exposed a novel bug that needed to be fixed, so that the incident had to wait longer to be resolved?
Another way of asking might be: Could all 20% of these incidents be avoided by staying up to date?
Looking back at the incident reports, around two-thirds of those 20% of incidents would have been avoided by staying up to date. For the remaining one-third, we found and fixed the bug in cht-core.