{"UUID":"0dfe934b-9234-4383-9d1c-de9b860a43d1","URL":"https://digital.ai/catalyst-blog/subversion-sha1-collision-problem-statement-prevention-and-remediation-options","ArchiveURL":"https://web.archive.org/web/20240701173654/https://web.archive.org/web/20210306015541/https://digital.ai/catalyst-blog/subversion-sha1-collision-problem-statement-prevention-and-remediation-options","Title":"Subversion SHA1 Collision Affects WebKit Repository","StartTime":"0001-01-01T00:00:00Z","EndTime":"0001-01-01T00:00:00Z","Categories":["automation","config-change"],"Keywords":["webkit","subversion","sha1","collision","repository","deduplication","checksum mismatch","representation sharing","data corruption"],"Company":"WebKit code repository","Product":"Subversion","SourcePublishedAt":"2022-09-07T11:55:04-04:00","SourceFetchedAt":"2026-05-04T18:11:12.726716Z","Summary":"The WebKit repository, a Subversion repository configured to use deduplication, became unavailable after two files with the same SHA-1 hash were checked in as test data, with the intention of implementing a safety check for collisions. The two files had different md5 sums and so a checkout would fail a consistency check. For context, the first public SHA-1 hash collision had very recently been announced, with an example of two colliding files.","Description":"The WebKit Subversion repository experienced an incident in late February 2017 when a user committed two files with different content but identical SHA1 hashes. This SHA1 collision led to unexpected behavior and data corruption within the repository.\n\nThe root cause was Subversion's \"Representation Sharing\" feature, introduced in version 1.6. This feature uses SHA1 hashes to deduplicate file content. 
When the two colliding files were committed, the system, encountering the same hash, stored only a pointer for the second file, effectively replacing its content with a reference to the first.\n\nCustomer impact included \"Checksum mismatch\" errors for users attempting to check out or update the affected file, rendering it inaccessible. Tools such as `svnsync` and `git-svn` also failed when encountering the corrupted revision during transaction history replay.\n\nThe remediation options considered were to disable the representation sharing feature, implement pre-commit hooks to block known colliding files, or delete the problematic file from the HEAD revision. The WebKit repository specifically implemented a Subversion permission rule (authz) to block access to the affected files, providing a robust long-term solution and enabling potential migration."}