{"UUID":"d44e5d9b-5c18-4cbe-b624-46bb9a311b84","URL":"https://kickstarter.engineering/the-day-the-replication-died-e543ba45f262","ArchiveURL":"https://web.archive.org/web/20170728131458if_/https://kickstarter.engineering/the-day-the-replication-died-e543ba45f262","Title":"Kickstarter MySQL replication failure","StartTime":"2013-03-07T00:00:00Z","EndTime":"0001-01-01T00:00:00Z","Categories":["cloud"],"Keywords":["mysql","replication","amazon rds","database","inconsistency","bug","order by","kickstarter","data inconsistency","mixed mode replication","backer sequences"],"Company":"Kickstarter","Product":"Amazon RDS MySQL replication","SourcePublishedAt":"2013-04-09T21:03:00Z","SourceFetchedAt":"2026-05-04T18:18:49.845972Z","Summary":"Primary DB became inconsistent with all replicas, which wasn't detected until a query failed. This was caused by a MySQL bug which sometimes caused `order by` to be ignored.","Description":"On Thursday, March 7th, Kickstarter experienced a critical incident where all MySQL replicas stopped updating, leading to stale and inconsistent data across the site. The issue was signaled by `Duplicate entry` errors during replication, indicating a divergence between the master and its replicas.\n\nInvestigation revealed that specific data, particularly backer sequences, had been inconsistent for days, with some inconsistencies dating back over a month. The replication process eventually broke when an \"unsafe query\" triggered row-based replication, exposing the underlying data divergence.\n\nThe core issue was traced to a MySQL bug where the `ORDER BY` clause in an `UPDATE` query was sometimes ignored during replication. 
This, combined with InnoDB's transaction optimization, which let transactions finish out of order on the master, led the master and replicas to apply updates in different implicit orders, leaving their data inconsistent.\n\nThe incident sent Kickstarter scrambling to stabilize the site, minimize the effects of stale replicas, and communicate with users. Backer sequences are crucial for creator reports, so the inconsistent sequences affected the accuracy of the information provided to project creators.\n\nThe immediate crisis was resolved within hours once the site was stabilized, but the underlying bug had been causing data inconsistencies for an extended period. The primary outcome of the incident was the discovery and understanding of this critical MySQL replication bug, which highlights the importance of understanding database internals beyond ORM abstractions."}