Impact
An unknown number of users experienced issues while using the platform. The issue started on UTC-5 25-06-03 16:56 and was proactively discovered 57.6 minutes (TTD) later by a staff member, who reported through our help desk [1] that vulnerabilities could not be resubmitted. Additional reports followed, concerning vulnerability management, repository access, and event handling. The problem was resolved in 20.1 hours (TTF), resulting in a total window of exposure (WOE) of 21.1 hours [2].
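The three figures above are consistent if the time to fix is counted from the moment of detection, so that WOE = TTD + TTF. That reading is an assumption based on the reported numbers, not a definition stated in this report. A minimal check of the arithmetic:

```python
from datetime import timedelta

# Durations as reported in the Impact section.
ttd = timedelta(minutes=57.6)  # time to detect
ttf = timedelta(hours=20.1)    # time to fix (assumed to start at detection)

# Window of exposure, assuming it is the sum of the two intervals.
woe = ttd + ttf
woe_hours = woe.total_seconds() / 3600

print(f"WOE = {woe_hours:.2f} hours")  # 21.06, reported as 21.1 after rounding
```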
Cause
While updating our build system to a new version, we also updated the tool responsible for managing dependencies for one of our platform components. The newer version removed a key command used to provide these dependencies. Because this process ran inside a part of our infrastructure setup that wasn’t designed to catch such errors early (due to technical debt), the problem wasn’t visible until it affected the platform directly [3].
Solution
We improved how dependencies are handled, using an updated, recommended method. Additionally, we moved the dependency-building process out of the infrastructure setup and into a simpler script. This way, if something goes wrong in the future, it will fail earlier and more visibly, preventing broken updates from reaching the platform [4].
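The fail-early idea can be sketched as a small standalone script. This is only an illustration under stated assumptions: the report does not name the dependency tool or its commands, so `require_tool`, `run_step`, `build_dependencies`, and the placeholder step below are all hypothetical. The sketch checks that each tool exists before running it and stops at the first non-zero exit status.

```python
import shutil
import subprocess
import sys


def require_tool(name: str) -> str:
    """Return the resolved path of a tool, or fail immediately if it is missing."""
    path = shutil.which(name)
    if path is None:
        raise RuntimeError(f"required tool not found on PATH: {name}")
    return path


def run_step(args: list[str]) -> None:
    """Run one build step; check=True raises CalledProcessError on a non-zero exit."""
    subprocess.run(args, check=True)


def build_dependencies(steps: list[list[str]]) -> None:
    """Run the steps in order, stopping at the first missing tool or failed step."""
    for args in steps:
        require_tool(args[0])
        run_step(args)


if __name__ == "__main__":
    # Placeholder step: the real dependency command is not named in the report.
    steps = [[sys.executable, "-c", "print('dependencies built')"]]
    try:
        build_dependencies(steps)
    except (RuntimeError, subprocess.CalledProcessError) as err:
        # A non-zero exit here fails the pipeline before anything is deployed.
        sys.exit(f"dependency build failed: {err}")
```

Run as a separate pipeline step, a script like this exits with a non-zero status as soon as a command is missing or fails, so the error surfaces during the build rather than on the running platform.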
Conclusion
By adjusting the process to detect errors earlier, we’ve made the system more reliable and easier to maintain moving forward.
INFRASTRUCTURE_ERROR < INCOMPLETE_PERSPECTIVE