Postmortem Index

Explore incident reports from various companies

Cloudflare outage on July 17, 2020

Cloudflare · Cloudflare backbone network

2020-07-17 config-change security

On July 17, 2020, Cloudflare experienced a 27-minute outage affecting its backbone network and various services globally. The incident began at 21:12 UTC when a configuration change was made on a router in Atlanta. Service was largely restored by 21:39 UTC after the Atlanta router was disabled, with full recovery, including logs and metrics, achieved by 22:10 UTC.

The root cause was a configuration error introduced by a network engineer responding to backbone congestion that followed an issue on the link between Newark and Chicago. The intent was to remove specific Atlanta routes from the backbone. Instead, a single-line change leaked all BGP routes into the backbone with a higher local-preference (200) than local server routes (100).

The misconfiguration caused the Atlanta router to advertise all of its BGP routes to the other backbone routers. Because BGP prefers the route with the highest local-preference, routers at other locations chose the leaked routes over their own local ones, and traffic intended for local compute nodes was redirected to Atlanta. This rapidly overwhelmed the Atlanta router and caused Cloudflare network locations connected to the backbone to fail.
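The effect of the higher local-preference can be illustrated with a minimal Python sketch. It models only the local-preference step of BGP best-path selection, not the full decision process. The values 100 and 200 come from the summary above; the `Route` type, the `best_route` helper, the prefix, and the next-hop names are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    local_pref: int


def best_route(candidates):
    """Return the candidate with the highest local-preference.

    Real BGP applies further tie-breakers (AS-path length, origin, MED, ...);
    this sketch models only the local-preference comparison.
    """
    return max(candidates, key=lambda route: route.local_pref)


# Hypothetical prefix and next-hop names; 100 and 200 are from the report.
local = Route("198.51.100.0/24", "local-servers", 100)
leaked = Route("198.51.100.0/24", "atlanta", 200)

before = best_route([local])          # only the local route is known
after = best_route([local, leaked])   # the leaked route is now also present

print(before.next_hop)  # local-servers
print(after.next_hop)   # atlanta
```

Once the leaked route is present, every router running this comparison picks Atlanta for the prefix, which matches the traffic shift described above.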

The outage caused traffic to drop by approximately 50% across Cloudflare’s network. Numerous data centers in North America (e.g., San Jose, Dallas, Chicago, Atlanta), Europe (e.g., London, Amsterdam, Frankfurt), and South America (e.g., São Paulo) were affected, leading to service disruption for customers in these regions.

Immediate remediation involved disabling the overwhelmed Atlanta router to restore service. For long-term prevention, Cloudflare deployed a change to the BGP local-preference for local server routes, so that a single location cannot attract other locations’ traffic in the same way. Cloudflare also planned to introduce a maximum-prefix limit on backbone BGP sessions as a fail-safe.
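The idea behind a maximum-prefix limit can be sketched as follows. This is a simplified Python model, not Cloudflare's configuration or any router vendor's implementation: the `BgpSession` class, the `feed` helper, the limit of 3, and the example prefixes are all assumptions made for illustration. The session is shut down as soon as a peer announces more prefixes than the configured limit, so a leak of the full routing table is cut off rather than propagated.

```python
class BgpSession:
    """Toy model of a BGP session guarded by a maximum-prefix limit."""

    def __init__(self, peer, max_prefixes):
        self.peer = peer
        self.max_prefixes = max_prefixes
        self.prefixes = set()
        self.established = True

    def receive(self, prefix):
        """Accept a prefix; return False if the session is (or becomes) down."""
        if not self.established:
            return False
        self.prefixes.add(prefix)
        if len(self.prefixes) > self.max_prefixes:
            # Limit exceeded: drop the session and discard everything learned.
            self.established = False
            self.prefixes.clear()
            return False
        return True


def feed(session, prefixes):
    """Send prefixes until the session refuses one; return how many were accepted."""
    accepted = 0
    for prefix in prefixes:
        if not session.receive(prefix):
            break
        accepted += 1
    return accepted


# Hypothetical peer name, limit, and prefixes.
session = BgpSession("atlanta", max_prefixes=3)
leaked = [f"10.0.{i}.0/24" for i in range(10)]
accepted = feed(session, leaked)

print(accepted)             # 3
print(session.established)  # False
print(len(session.prefixes))  # 0
```

In this model the fourth announcement trips the limit, the session goes down, and the routes learned from that peer are withdrawn, which is the fail-safe behavior the remediation plan describes.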

Keywords

cloudflare, outage, backbone, network, atlanta, bgp, configuration error, router, backbone network