Impact
An unknown number of users encountered difficulties with the Automated Software integration due to machine scaling issues in the CI. The issue started at UTC-5 24-02-06 08:00 and was proactively discovered 1.1 days (TTD) later by the product team during their regular workflow. The problem was resolved in 2.8 hours (TTF), resulting in a total impact of 1.2 days (TTR) [1].
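As a quick check of the figures above, the short sketch below adds the reported TTD and TTF. It assumes that the total impact (TTR) is simply detection time plus fix time; the report does not define TTR explicitly, so treat that as an assumption.

```python
from datetime import timedelta

# Values reported in the Impact section.
ttd = timedelta(days=1.1)    # time to detect
ttf = timedelta(hours=2.8)   # time to fix

# Assumption: total impact (TTR) is the sum of detection and fix time.
ttr = ttd + ttf
ttr_days = ttr / timedelta(days=1)

print(f"TTR = {ttr_days:.1f} days")  # prints "TTR = 1.2 days"
```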
Cause
The workers' disks were nearly full, so they ran out of space when the workers' operating systems grew slightly in size.
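The condition described above could be detected with a check along the following lines. This is only an illustrative sketch: the helper names and the 0.9 threshold are hypothetical, and the report does not state how full the disks actually were or how the workers are monitored.

```python
import shutil


def used_fraction(total_bytes: int, used_bytes: int) -> float:
    """Return the share of the disk that is in use, between 0 and 1."""
    return used_bytes / total_bytes


def is_nearly_full(path: str = "/", threshold: float = 0.9) -> bool:
    """Report whether the disk holding `path` is at or above `threshold`.

    The 0.9 default is a hypothetical value chosen for illustration.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return used_fraction(usage.total, usage.used) >= threshold


if __name__ == "__main__":
    print("nearly full" if is_nearly_full("/") else "enough headroom")
```

Running such a check on each worker before a job starts would surface low headroom before a small growth in the operating system exhausts the disk.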
Solution
The size of the workers' disks was increased [2].
Conclusion
Gradually, the workers' operating systems grew beyond the disk capacity, leading to the issue. Increasing disk size by 50% mitigates future occurrences. IMPOSSIBLE_TO_TEST