Impact
Failures were identified when cloning roots via Connector and Egress IP. The issue started on UTC-5 25-02-21 13:33 and was proactively discovered 2 months (TTD) later by one of our engagement managers, who reported through our help desk that Git roots were failing to clone correctly. Additional reports followed from other groups observing the same behavior. Initially, failures were intermittent and limited to groups using Connector. Further investigation revealed related errors when using Warp. The problem was resolved in 20.6 days (TTF), resulting in a total window of exposure of 2.6 months (WOE).
Cause
A misconfiguration in the setup of secure network connections caused the system to miss or delay applying the correct network settings. As a result, the system sometimes could not find the correct path to external resources, such as code repositories, because the necessary internet routing information was not available in time.
Solution
The processes were modified to establish secure network connections only when strictly necessary. New dedicated execution queues were created for tasks that use Connector and Egress IPs, and scheduling rules were adjusted to ensure those processes are routed correctly. A verification step was added to check internet routing settings before starting root cloning, and the base software images used in the virtual machines were updated.
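The verification step described above could look roughly like the following Python sketch. It is an illustration, not the platform's actual implementation. The function names (`host_and_port`, `tcp_probe`, `wait_for_route`), the retry count, and the delay are all hypothetical. It assumes that "checking internet routing settings" can be approximated by confirming that the repository host resolves and accepts a TCP connection before cloning starts. Only HTTPS and `ssh://` URLs are handled; scp-style addresses such as `git@host:org/repo.git` are rejected.

```python
import socket
import time
from typing import Callable, Tuple
from urllib.parse import urlparse


def host_and_port(repo_url: str) -> Tuple[str, int]:
    """Extract host and port from an HTTPS or ssh:// Git URL (hypothetical helper)."""
    parsed = urlparse(repo_url)
    if parsed.hostname is None:
        raise ValueError(f"cannot parse host from {repo_url!r}")
    # Assumed defaults: 22 for ssh, 443 for everything else.
    default_port = 22 if parsed.scheme == "ssh" else 443
    return parsed.hostname, parsed.port or default_port


def tcp_probe(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if the host resolves and a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, missing routes, refusals, and timeouts.
        return False


def wait_for_route(
    repo_url: str,
    probe: Callable[[str, int], bool] = tcp_probe,
    attempts: int = 5,
    delay: float = 2.0,
) -> bool:
    """Retry the probe until the repository host is reachable or attempts run out."""
    host, port = host_and_port(repo_url)
    for attempt in range(attempts):
        if probe(host, port):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # give delayed network settings time to apply
    return False
```

A cloning task would call `wait_for_route(url)` first and start `git clone` only if it returns `True`. If it returns `False`, the task fails early with a clear routing error instead of a partial or failed clone. The `probe` parameter is injectable so the check can be exercised without real network access.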
Conclusion
The incident highlighted the importance of isolating specialized network operations to prevent configuration conflicts. With the changes implemented (separating execution flows, enhancing validation steps, and updating system components), the platform is now better equipped to handle scenarios involving Connector and Egress IP. Ongoing improvements will continue to strengthen the reliability and resilience of these processes.
THIRD_PARTY_ERROR < INFRASTRUCTURE_ERROR