Issues cloning Git roots via Connector and Egress IP

Incident Report for Fluid Attacks

Postmortem

Impact

Cloning Git roots via Connector and Egress IP was failing. The issue started on 2025-02-21 at 13:33 (UTC-5) and was proactively discovered 2 months later (TTD) by one of our engagement managers, who reported through our help desk that Git roots were failing to clone correctly. Additional reports followed from other groups observing the same behavior. Initially, the failures were intermittent and limited to groups using Connector. Further investigation revealed related errors when using Warp. The problem was resolved in 20.6 days (TTF), resulting in a total window of exposure (WOE) of 2.6 months.

Cause

A misconfiguration in setting up secure network connections caused the system to miss or delay applying the correct network settings. This led to situations where the system couldn’t find the correct path to access external resources, such as code repositories, because the necessary internet routing information wasn’t available in time.

Solution

The processes were modified to establish secure network connections only when strictly necessary. New dedicated execution queues were created for tasks that use Connector and Egress IPs, and scheduling rules were adjusted to ensure those processes are routed correctly. A verification step was added to check internet routing settings before starting root cloning, and the base software images used in the virtual machines were updated.
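As an illustration of the verification step described above, the following is a minimal sketch of a pre-clone routing check, not the platform's actual implementation. It assumes a Linux host where `ip route show` is available, and that "routing information is available" can be approximated by the presence of a default route. The function names, the retry count, and the delay are all hypothetical.

```python
import subprocess
import time
from typing import Callable


def has_default_route(route_table: str) -> bool:
    """Return True if text in `ip route show` format contains a default route."""
    for line in route_table.splitlines():
        fields = line.split()
        # A default route line starts with "default" (or the equivalent 0.0.0.0/0).
        if fields and fields[0] in ("default", "0.0.0.0/0"):
            return True
    return False


def read_routes_from_ip_command() -> str:
    """Read the current routing table (assumes the iproute2 `ip` tool is installed)."""
    result = subprocess.run(
        ["ip", "route", "show"], capture_output=True, text=True, check=True
    )
    return result.stdout


def wait_for_default_route(
    read_routes: Callable[[], str],
    attempts: int = 5,            # hypothetical retry budget
    delay_seconds: float = 1.0,   # hypothetical pause between checks
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the routing table until a default route appears or attempts run out."""
    for attempt in range(attempts):
        if has_default_route(read_routes()):
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

A cloning task could call `wait_for_default_route(read_routes_from_ip_command)` and start cloning only if it returns `True`, failing early with a clear routing error otherwise instead of a less specific clone failure.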

Conclusion

The incident highlighted the importance of isolating specialized network operations to prevent configuration conflicts. With the changes implemented (separating execution flows, enhancing validation steps, and updating system components), the platform is now better equipped to handle scenarios involving Connector and Egress IP. Ongoing improvements will continue to strengthen the reliability and resilience of these processes. THIRD_PARTY_ERROR < INFRASTRUCTURE_ERROR

Posted May 15, 2025 - 13:50 GMT-05:00

Resolved

The incident has been resolved, and cloning operations via Connector and Egress have stabilized and returned to their expected behavior.
Posted May 09, 2025 - 22:45 GMT-05:00

Identified

The platform is experiencing failures when attempting to automatically clone Git roots through Connector and Egress.
Posted May 06, 2025 - 07:57 GMT-05:00
This incident affected: Platform.