Postmortem Index

Explore incident reports from various companies

Category

Config Change

Bad configuration pushed to production: feature flags, network ACLs, IAM policies, build settings, and routing rules.

Postmortems: 111
Companies: 47
Years covered: 16
Date range: Apr 2011 – Feb 2026
Title Company Date Other categories
GitHub Actions and Codespaces outage of February 2026 GitHub 2026-02-02 – 2026-02-03
Cloudflare systemwide outage, November 18, 2025 Cloudflare 2025-11-18
incident.io service disruption during AWS us-east-1 outage on October 20, 2025 incident.io 2025-10-20
Global Google Cloud API outage due to Service Control null pointer exception Google 2025-06-12 – 2025-06-13
CircleCI workflows latency and failures on April 4, 2025 CircleCI 2025-04-04
CircleCI UI and build capabilities disruption on April 4, 2025 CircleCI 2025-04-04
GitHub DNS infrastructure failure and service degradation on October 11, 2024 GitHub 2024-10-11 – 2024-10-12
GitHub.com database configuration change causes 36-minute outage GitHub 2024-08-14
Amazon Kinesis Data Streams US-EAST-1 Degradation July 2024 Amazon 2024-07-30 – 2024-07-31
CrowdStrike Falcon Content Update Incident of July 2024 CrowdStrike 2024-07-19
GitHub Copilot degradation on July 13, 2024 GitHub 2024-07-13
Turso free tier data leak and loss Turso 2023-12-01 – 2023-12-04
Honeycomb total outage on July 25th, 2023 Honeycomb 2023-07-25
GitHub availability incidents May 9-11, 2023 GitHub 2023-05-09
Cloudflare service token incident on January 24, 2023 Cloudflare 2023-01-24
Intermittent downtime from repeated crashes incident.io 2022-11-18
Honeycomb Ingest System Outage: Shepherd Cache Delays Honeycomb 2022-09-08
Google Cloud Networking, Storage, and BigQuery reduced capacity for lower priority traffic Google 2022-07-15
Cloudflare outage on June 21, 2022 Cloudflare 2022-06-21
Heroku April 2022 security incident Heroku 2022-04-07 – 2022-04-14
Atlassian April 2022 customer site deletion outage Atlassian 2022-04-05 – 2022-04-18
GitHub mysql1 cluster repeated service disruptions (March 2022) GitHub 2022-03-16 – 2022-03-23
Slack’s Incident on 2-22-22 Slack 2022-02-22
Firefox HTTP/3 network stack outage Firefox 2022-01-13
AWS US-EAST-1 Internal Network Congestion on December 7, 2021 Amazon 2021-12-07 – 2021-12-08
GitHub November 2021 Availability Incident due to MySQL Schema Migration GitHub 2021-11-27
Google Cloud Networking and Load Balancing outage of November 2021 Google 2021-11-16
Google Cloud Networking issues in Europe and other regions on November 12, 2021 Google 2021-11-12
CircleCI jobs stuck in "not running" state on November 8, 2021 CircleCI 2021-11-08
Roblox 73-hour outage due to Consul and BoltDB issues (October 2021) Roblox 2021-10-28 – 2021-10-31
Google Meet Livestream degraded quality Google 2021-10-25 – 2021-10-26
Fastly global outage of June 8, 2021 Fastly 2021-06-08
Delay in starting Docker Jobs. Machine & remote Docker environments blocked CircleCI 2021-05-21 – 2021-05-22
GitHub Actions and Pages impacted by scoped token INT32 overflow GitHub 2021-05-16
Slack Outage on January 4th 2021 Slack 2021-01-04
Amazon Kinesis US-EAST-1 outage November 2020 Amazon 2020-11-25 – 2020-11-26
Leaderboarded production database accidental deletion Keepthescore 2020-10-17
GitHub background job system degraded availability October 2020 GitHub 2020-10-09 – 2020-10-10
Cloudflare outage on July 17, 2020 Cloudflare 2020-07-17
GitHub February 2020 mysql1 service disruptions GitHub 2020-02-19 – 2020-02-27
Cloudflare global outage on July 2, 2019 Cloudflare 2019-07-02
Firefox Add-ons Outage due to Certificate Expiration Mozilla 2019-05-04
CircleCI Workflow Delay Incidents March 26 - April 10, 2019 CircleCI 2019-03-26 – 2019-04-10
Google Cloud internal blob storage disruption March 2019 Google 2019-03-13
Elastic Cloud AWS us-east-1 outage of February 2019 Elastic 2019-02-04
GitHub October 2018 Service Degradation due to MySQL Failover GitHub 2018-10-21 – 2018-10-22
Authentication Latency on DUO1 Deployment Duo 2018-08-29
Fortnite service outages of February 3-4, 2018 Epic Games 2018-02-03 – 2018-02-05
Unavailable Guilds & Connection Issues Discord 2017-10-13
GoCardless API and Dashboard outage on 10 October 2017 GoCardless 2017-10-10
Discord Connectivity Issues (March 2017) Discord 2017-03-20
Square service disruption of March 16, 2017 Square 2017-03-16
Amazon S3 US-EAST-1 outage of February 2017 Amazon 2017-02-28
Instapaper AWS RDS MySQL 2TB File Size Limit Outage Instapaper 2017-02-09 – 2017-02-14
Travis CI container-based Linux builds outage due to worker rollback failure Travis CI 2017-02-02 – 2017-02-05
GitLab.com database outage of January 31, 2017 GitLab 2017-01-31 – 2017-02-01
Google Compute Engine, Cloud VPN, and Network Load Balancer connectivity issues Google 2017-01-30
Mailgun Website Intermittent Timeouts Mailgun 2017-01-12
Cloudflare parser bug causes memory leak Cloudflare 2016-09-22 – 2017-02-18
Buildkite outage of August 22nd, 2016 Buildkite 2016-08-22
Platform.sh EU region outage of August 2016 Platform.sh 2016-08-18
Reddit outage and degraded performance on August 11, 2016 Reddit 2016-08-11 – 2016-08-12
Travis CI GCE base image deletion Travis CI 2016-08-09
AWS Sydney Region EC2 and EBS power disruption Amazon 2016-06-05
Google Compute Engine global connectivity loss April 2016 Google 2016-04-12
ShapeShift Cyberattack ShapeShift 2016-03-14 – 2016-04-09
GitHub January 28th, 2016 datacenter power disruption GitHub 2016-01-28
CircleCI Linux build queue backing up October 2015 CircleCI 2015-10-14 – 2015-10-15
High queue times on OS X builds (.com and .org) Travis CI 2015-08-04 – 2015-08-06
EVE Online long downtime on July 15th, 2015 CCP Games 2015-07-15
CircleCI DB performance issue CircleCI 2015-07-07 – 2015-07-08
Stack Exchange network outage due to StackEgg on March 31, 2015 Stack Exchange 2015-03-31
Azure Storage service interruption Microsoft 2014-11-19
Azure Storage service interruption November 2014 Microsoft 2014-11-19
GitHub DDoS attack of March 2014 GitHub 2014-03-11
GitHub DNS Outage on January 8, 2014 GitHub 2014-01-08
AWS SA-EAST-1 Availability Zone Power and Network Incident, December 2013 Amazon 2013-12-18
MongoHQ security breach impacting CircleCI customer data CircleCI 2013-10-27 – 2013-10-30
Stackdriver Intelligent Monitoring application outage on October 23, 2013 Stackdriver 2013-10-23 – 2013-10-26
PagerDuty notification dispatch system outage of April 2013 PagerDuty 2013-04-13
Amazon ELB Service Event in US-East Region on December 24, 2012 Amazon 2012-12-24 – 2012-12-25
GitHub.com outage of December 2012 GitHub 2012-12-22 – 2012-12-23
Chrome SyncDataType parsing crash Google 2012-12-10 – 2012-12-11
GitHub network problems on November 30, 2012 GitHub 2012-11-30 – 2012-12-01
AWS US-East Region Service Event of October 22, 2012 Amazon 2012-10-22 – 2012-10-23
Netflix's response to October 2012 AWS EBS degradation Netflix 2012-10-22
GitHub.com availability issues in September 2012 GitHub 2012-09-10
Knight Capital SMARS algorithmic trading incident of August 2012 Knight Capital 2012-08-01
Knight Capital SMARS deployment incident Knight Capital 2012-08-01
AWS US East-1 power failure and service disruption in June 2012 Amazon 2012-06-30
Windows Azure Service Disruption on Feb 29th, 2012 Azure 2012-02-29 – 2012-03-01
Amazon EC2 and Amazon RDS Service Disruption in US East Region Amazon 2011-04-21 – 2011-04-24
15 seconds of API downtime during PostgreSQL migration GoCardless
Amazon EC2, EBS, and RDS EU West Region Service Event Amazon
Untitled postmortem Elastic
Engineering Archives Heroku
Etsy site outage caused by multicast rsync Etsy
Foursquare MongoDB memory exhaustion outage Foursquare
GitHub availability incidents in February and March 2026 GitHub
Untitled postmortem Gliffy
Google Cloud GCVE deletion incident impacting UniSuper Google
Honeycomb operational burden and scaling issues in September and October Honeycomb
How I Broke `git push heroku main` Heroku
incident.io GKE Dataplane V2 `anetd` CPU saturation causes connection timeouts incident.io
incident.io incident ID sequence jump incident.io
Incident.io intermittent database connection pool timeouts incident.io
Steam client recursively deleted user files on Linux Valve
Subversion SHA1 Collision Affects WebKit Repository WebKit code repository
Summary of the Amazon DynamoDB Service Disruption and Related Impacts in the US-East Region Amazon
Supermarket Intermittent Unresponsiveness Chef.io
Untitled postmortem WebKit code repository