Impact
At least one user had difficulty reviewing vulnerabilities because the platform failed to load correctly. The issue started on UTC-5 25-11-28 15:23 and was proactively discovered 21 minutes later (TTD) by a staff member, who reported through our help desk [1] that the vulnerabilities view remained stuck in a loading state. An additional customer report arrived after this initial one, confirming that the problem was affecting multiple users. The problem was resolved 10 minutes after detection (TTF), resulting in a total window of exposure (WOE) of 31 minutes.
Cause
A large batch of approximately 3,900 automated tasks was executed against the platform. Each task performed several resource-intensive operations, and up to 200 tasks were running simultaneously. This created a sudden, unusually high load that the platform could not absorb quickly enough.
Because the platform needs several minutes to increase its capacity when activity spikes, requests kept accumulating before the additional capacity was available. This led to delays, timeouts, and error responses for about half an hour, affecting both the automated tasks and regular users who were trying to interact with the platform during that period.
The incident was not caused by a recent update, but rather by a combination of a huge volume of simultaneous work and the current limitations of the platform’s ability to adapt to sudden increases in demand.
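The load pattern described above (thousands of tasks with a high number running at once) can be bounded on the client side. The following is a minimal sketch, not the tooling actually involved in the incident: `run_batch`, `fake_task`, `demo`, and the concurrency cap of 20 are hypothetical names and values chosen for illustration. It assumes the tasks can be expressed as Python coroutines and uses a semaphore to cap how many are in flight.

```python
import asyncio


async def run_batch(task_factories, max_concurrency):
    """Run all tasks, keeping at most max_concurrency in flight at once."""
    semaphore = asyncio.Semaphore(max_concurrency)
    in_flight = 0
    peak = 0  # highest number of tasks observed running simultaneously

    async def guarded(factory):
        nonlocal in_flight, peak
        # Tasks wait here until a slot is free, so the platform never
        # sees more than max_concurrency tasks from this batch at a time.
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await factory()
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(f) for f in task_factories))
    return results, peak


async def fake_task():
    # Hypothetical stand-in for one automated task calling the platform.
    await asyncio.sleep(0.001)
    return "ok"


def demo(total=3900, max_concurrency=20):
    """Run `total` fake tasks and report (completed count, peak concurrency)."""
    results, peak = asyncio.run(
        run_batch([fake_task] * total, max_concurrency)
    )
    return len(results), peak
```

Calling `demo()` runs all 3,900 placeholder tasks while never exceeding the configured cap. A suitable cap for the real platform would depend on its measured capacity and scaling time, which this sketch does not model.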
Solution
Stopping the ongoing tasks was not possible because, by the time the cause was clearly identified, most of them had already been submitted and were nearly complete. The situation was monitored closely until the workload naturally decreased. Once the activity level went down, the platform gradually recovered and returned to normal operation.
Conclusion
To prevent similar incidents, we will avoid generating large, highly concurrent workloads against the platform until it is better prepared to support them. We will also adjust internal workflows so the platform receives fewer unnecessary requests, and we will continue improving its ability to scale more quickly during periods of high demand. These measures are intended to provide greater stability and a smoother experience for all users.
PERFORMANCE_DEGRADATION < INCOMPLETE_PERSPECTIVE