Increase in Platform response times

Incident Report for Fluid Attacks

Postmortem

Impact

An unknown number of users experienced increased response times when accessing the Platform. The issue started on 2023-11-08 at 12:54 (UTC-5) and was proactively discovered 1.2 hours later (TTD) by one of our monitoring tools and staff members, who observed web response times above 5 seconds in this component. The problem was resolved in 28.8 minutes (TTF), resulting in a total window of exposure (WOE) of 1.6 hours.

Cause

In an effort to adopt simpler, more valuable technologies, a new experimental library was being tested in the authorization module. When a new version of the Platform was deployed, this change degraded performance, leading to increased response times [1].

Solution

The engineering team rolled the Platform back to the previous version, which does not include the experimental library changes [2].

Conclusion

A satisfactory and accurate way to measure performance degradation before changes reach production has not yet been implemented. The engineering team is still investigating how to put such measurements in place to keep situations like this under control. IMPOSSIBLE_TO_TEST
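
As an illustration only, one possible shape for such a pre-production measurement is a latency gate that compares response-time samples from a candidate build against a baseline. The sketch below is not the engineering team's method. The function names, the 1.5x ratio, and the way samples are collected are all assumptions; only the 5-second figure comes from the alert threshold mentioned in this report.

```python
from statistics import quantiles


def p95(samples):
    """Return the 95th percentile of response times (in seconds)."""
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    return quantiles(samples, n=20, method="inclusive")[-1]


def is_regression(baseline, candidate, max_ratio=1.5, budget_seconds=5.0):
    """Flag a candidate build whose p95 latency looks degraded.

    Hypothetical policy (an assumption, not the Platform's real one):
    fail if the candidate p95 exceeds the absolute budget, or if it is
    more than max_ratio times the baseline p95.
    """
    candidate_p95 = p95(candidate)
    baseline_p95 = p95(baseline)
    return candidate_p95 > budget_seconds or candidate_p95 > max_ratio * baseline_p95


if __name__ == "__main__":
    # Made-up sample data, for demonstration only.
    baseline = [0.20, 0.22, 0.25, 0.21, 0.30, 0.24, 0.23, 0.26, 0.22, 0.28]
    candidate = [0.90, 1.10, 5.60, 1.00, 6.20, 0.95, 1.20, 5.90, 1.05, 6.10]
    print("regression detected:", is_regression(baseline, candidate))
```

A gate like this would only be as good as the traffic used to produce the samples, which is likely part of why such a measurement is hard to make accurate before production.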

Posted Nov 08, 2023 - 17:52 GMT-05:00

Resolved

The incident has been resolved and response times are now on target.
Posted Nov 08, 2023 - 15:11 GMT-05:00

Update

We are continuing to work on a fix for this issue.
Posted Nov 08, 2023 - 14:12 GMT-05:00

Identified

Web Response Time goes above 5 seconds in Platform.
Click for details: https://availability.fluidattacks.com
Posted Nov 08, 2023 - 14:05 GMT-05:00
This incident affected: Platform.