Impact
At least two groups experienced issues storing CI Agent execution logs on the platform. The issue started on UTC-5 24-10-16 08:58 and was discovered 22.1 days later (TTD) by a client, who reported to one of our engagement managers [1] that, although the Agent completed its execution successfully, the logs for some executions were not recorded in the DevSecOps tab on the platform. The problem was resolved in 8.3 days (TTF), for a total impact of about 1 month (TTR) [2].
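The reported figures are consistent with TTR being the sum of TTD and TTF. The following is a minimal Python sketch of that arithmetic, assuming the metrics are defined this way; the variable names are illustrative and not taken from any internal tooling.

```python
from datetime import timedelta

# Durations as reported in this section.
ttd = timedelta(days=22.1)  # time to detect
ttf = timedelta(days=8.3)   # time to fix

# Assumption: total impact (TTR) is detection time plus fix time.
ttr = ttd + ttf
ttr_days = ttr.total_seconds() / 86400

print(f"TTR = {ttr_days:.1f} days")
```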
Cause
A security update was applied to the back-end to fix a vulnerability. However, the update unintentionally broke uploads of large logs (over 5 MB) to the platform. As a result, the logs of some executions failed to upload, which could disrupt client builds [3].
Solution
Switching to a new library for handling data uploads allowed large logs to be successfully uploaded to the platform [4].
Conclusion
The lack of tests for uploading large files to the platform was identified as a gap. A test that uploads logs of approximately 5 to 10 MB has been added to catch failures during the upload process and to protect the platform’s stability in these cases [5]. MISSING_TEST < INFRASTRUCTURE_ERROR
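A test of the kind described above could look roughly like the following Python sketch. It is not the test that was actually added: `InMemoryStore`, `make_log`, and `check_large_uploads` are hypothetical names, and the in-memory store stands in for the platform's real upload path, which this report does not describe.

```python
MB = 1024 * 1024


def make_log(size_bytes: int) -> bytes:
    """Build a synthetic log of exactly size_bytes bytes."""
    line = b"INFO agent: synthetic log line for upload test\n"
    repeats = size_bytes // len(line) + 1
    return (line * repeats)[:size_bytes]


class InMemoryStore:
    """Hypothetical stand-in for the platform's log storage."""

    def __init__(self) -> None:
        self.saved: dict[str, bytes] = {}

    def upload(self, execution_id: str, payload: bytes) -> int:
        # Store the payload and report how many bytes were kept.
        self.saved[execution_id] = payload
        return len(payload)


def check_large_uploads(store: InMemoryStore, sizes_mb=(5, 8, 10)) -> dict[int, bool]:
    """Upload one synthetic log per size and record whether it was stored whole."""
    results = {}
    for size in sizes_mb:
        payload = make_log(size * MB)
        stored = store.upload(f"exec-{size}mb", payload)
        results[size] = stored == len(payload)
    return results


results = check_large_uploads(InMemoryStore())
```

In a real suite, the store would be replaced by a client for a test instance of the back-end, so that a regression in the upload library surfaces as a failed size check at or above the 5 MB threshold.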