Impact
An unknown number of users encountered the automatic vuln scanning system (Machine) unavailable. The issue started on UTC-5 24-02-20 09:37 and was proactively discovered 2.1 hours (TTD) later by one of our monitoring tools and staff members, indicating a disruption in this component. The problem was resolved in 57.6 minutes (TTF) resulting in a total impact of 3.1 hours (TTR).
Cause
A bug in the code affecting all Machine service jobs was introduced by a change in the script creating machine configurations, leading to job failures detected through AWS alerts [1].
Solution
The developer in charge reverted the changes that introduced the error, restoring the Machine service to normal operation [2].
Conclusion
The absence of functional tests for the component allowed the bug to slip into the production environment. To mitigate such occurrences in the future, we’re transitioning to a more robust component while simultaneously implementing thorough functional testing procedures [3]. COMMUNICATION_FAILURE < INCOMPLETE_PERSPECTIVE < MISSING_TEST