Duplicate reports of automatic events
Incident Report for Fluid Attacks
Postmortem

Impact

At least 16 groups experienced duplicate automatic event reports related to cloning issues. The issue started on 2024-02-16 at 15:36 (UTC-5) and was proactively discovered 2.8 days later (TTD) by one of our engagement managers, who reported through our help desk that previously reported events related to cloning issues had been duplicated. The problem was resolved in 3.8 hours (TTF), for a total impact of 3 days (TTR) [1][2].

Cause

Duplicate events were generated during cloning operations on roots because of a validation error caused by a recent change in the email used by the scheduler. The validation mechanism failed to recognize the scheduler’s new email, so events were created even for roots whose issues were still unresolved, which produced the duplicates [3].
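The failure mode described above can be illustrated with a minimal sketch. It assumes that the validation looks for an open event created by the address it expects before reporting a new one; the report does not describe the actual implementation, so the `Event` type, both functions, and both email addresses below are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical addresses; the real values are not given in the report.
OLD_SCHEDULER_EMAIL = "scheduler-old@example.com"
NEW_SCHEDULER_EMAIL = "scheduler-new@example.com"


@dataclass(frozen=True)
class Event:
    root_id: str
    created_by: str
    solved: bool = False


def has_open_clone_event(events, root_id, batch_clone_email):
    """Return True if the root already has an unsolved event created by the expected address."""
    return any(
        e.root_id == root_id and not e.solved and e.created_by == batch_clone_email
        for e in events
    )


def report_clone_failure(events, root_id, scheduler_email, batch_clone_email):
    """Append a new event unless an open one is already recognized (assumed behavior)."""
    if not has_open_clone_event(events, root_id, batch_clone_email):
        events.append(Event(root_id=root_id, created_by=scheduler_email))
    return events


# Stale constant: every run fails to recognize the event created by the previous run.
stale = []
for _ in range(3):
    report_clone_failure(stale, "root-1", NEW_SCHEDULER_EMAIL, OLD_SCHEDULER_EMAIL)

# Matching constant: only the first run creates an event.
fixed = []
for _ in range(3):
    report_clone_failure(fixed, "root-1", NEW_SCHEDULER_EMAIL, NEW_SCHEDULER_EMAIL)

print(len(stale), len(fixed))  # 3 1
```

Under these assumptions, a mismatch between the expected address and the one the scheduler actually uses is enough to turn every scheduler run into a new event for the same root.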

Solution

The email in the BATCH_CLONE_EMAIL constant was updated to match the one used by the scheduler, which fixed the validation issue [4]. Additionally, a migration was performed to remove all events generated automatically during the scheduler’s last execution [5].
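A cleanup of this kind might be sketched as follows. This is an illustration only: the report does not show the migration, so the event fields, the placeholder value of BATCH_CLONE_EMAIL, the execution window, and the `select_events_to_remove` helper are all assumptions.

```python
from datetime import datetime, timedelta, timezone

# Placeholder value; the real address is not given in the report.
BATCH_CLONE_EMAIL = "scheduler@example.com"

UTC_MINUS_5 = timezone(timedelta(hours=-5))


def select_events_to_remove(events, run_start, run_end):
    """Pick events created by the scheduler address within one execution window."""
    return [
        e
        for e in events
        if e["created_by"] == BATCH_CLONE_EMAIL
        and run_start <= e["created_at"] <= run_end
    ]


# Hypothetical window for the scheduler's last execution.
run_start = datetime(2024, 2, 19, 12, 0, tzinfo=UTC_MINUS_5)
run_end = datetime(2024, 2, 19, 13, 0, tzinfo=UTC_MINUS_5)

events = [
    # Automatic event created inside the window: selected for removal.
    {"id": "e1", "created_by": BATCH_CLONE_EMAIL,
     "created_at": datetime(2024, 2, 19, 12, 10, tzinfo=UTC_MINUS_5)},
    # Event created by a person inside the window: kept.
    {"id": "e2", "created_by": "manager@example.com",
     "created_at": datetime(2024, 2, 19, 12, 20, tzinfo=UTC_MINUS_5)},
    # Automatic event from an earlier execution: kept.
    {"id": "e3", "created_by": BATCH_CLONE_EMAIL,
     "created_at": datetime(2024, 2, 16, 15, 40, tzinfo=UTC_MINUS_5)},
]

to_remove = select_events_to_remove(events, run_start, run_end)
print([e["id"] for e in to_remove])  # ['e1']
```

Filtering on both the author and the execution window keeps manually reported events and earlier automatic events untouched, which matches the stated scope of removing only what the last execution generated.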

Conclusion

The issue stemmed from a functional test that malfunctioned because of a cache-related problem. The test has been adjusted to accurately reflect real-time data [6], which ensures better detection of duplicate scenarios before deployment.

Posted Feb 20, 2024 - 09:38 GMT-05:00

Resolved
The issue has been resolved, and the automatic event reports are now operating as expected.
Posted Feb 19, 2024 - 18:14 GMT-05:00
Identified
Duplication of automatic event reports linked to cloning problems detected.
Posted Feb 19, 2024 - 14:23 GMT-05:00
This incident affected: Platform.