xxHash
Also known as: xxhash, xxhash fast hash, non-cryptographic hash, XXH64, XXH3
xxHash is an extremely fast non-cryptographic hash function used to fingerprint files and data blocks. On large inputs it runs at close to memory-bandwidth speed, which suits deduplication, integrity checks, and hash tables, where speed matters more than resisting collisions crafted by an attacker.
- xxHash comes in 32-, 64-, and 128-bit variants (XXH32, XXH64, and XXH3 in 64- and 128-bit forms); XXH3 is the current high-speed flagship.
- It is non-cryptographic: great for dedupe and checksums, never for security or passwords.
- On large inputs, throughput is typically limited by memory bandwidth rather than the CPU, which makes it a good fit for large file scans.
What xxHash is and how it works
xxHash is a family of non-cryptographic hash functions designed by Yann Collet to produce a fixed-size fingerprint (32-, 64-, or 128-bit) from arbitrary input as fast as the CPU can read memory. The modern variant, XXH3, produces 64- or 128-bit hashes and uses SIMD-friendly arithmetic and a large internal accumulator to reach throughput that often saturates the memory bus on large inputs, far faster than cryptographic hashes such as SHA-256 or the older MD5.
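To make the accumulate-then-mix idea concrete, here is a minimal pure-Python sketch of XXH32, the oldest and simplest member of the family. It is meant only as an illustration: XXH3 uses a different and wider design, a real tool would call the optimized C library rather than Python arithmetic, and the helper names `_rotl32` and `_round` are chosen here for readability.

```python
import struct

MASK32 = 0xFFFFFFFF
PRIME32_1 = 0x9E3779B1
PRIME32_2 = 0x85EBCA77
PRIME32_3 = 0xC2B2AE3D
PRIME32_4 = 0x27D4EB2F
PRIME32_5 = 0x165667B1


def _rotl32(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def _round(acc, lane):
    """Mix one 32-bit input lane into an accumulator."""
    acc = (acc + lane * PRIME32_2) & MASK32
    return (_rotl32(acc, 13) * PRIME32_1) & MASK32


def xxh32(data: bytes, seed: int = 0) -> int:
    """Return the 32-bit XXH32 hash of data as an integer."""
    n = len(data)
    pos = 0
    if n >= 16:
        # Four independent accumulators, each fed 4 bytes per 16-byte stripe.
        v1 = (seed + PRIME32_1 + PRIME32_2) & MASK32
        v2 = (seed + PRIME32_2) & MASK32
        v3 = seed & MASK32
        v4 = (seed - PRIME32_1) & MASK32
        while pos + 16 <= n:
            l1, l2, l3, l4 = struct.unpack_from("<4I", data, pos)
            v1 = _round(v1, l1)
            v2 = _round(v2, l2)
            v3 = _round(v3, l3)
            v4 = _round(v4, l4)
            pos += 16
        # Merge the four accumulators into one value.
        h = (_rotl32(v1, 1) + _rotl32(v2, 7)
             + _rotl32(v3, 12) + _rotl32(v4, 18)) & MASK32
    else:
        # Short inputs skip the stripe loop entirely.
        h = (seed + PRIME32_5) & MASK32

    h = (h + n) & MASK32

    # Consume the remaining input: 4 bytes at a time, then single bytes.
    while pos + 4 <= n:
        (lane,) = struct.unpack_from("<I", data, pos)
        h = (h + lane * PRIME32_3) & MASK32
        h = (_rotl32(h, 17) * PRIME32_4) & MASK32
        pos += 4
    while pos < n:
        h = (h + data[pos] * PRIME32_5) & MASK32
        h = (_rotl32(h, 11) * PRIME32_1) & MASK32
        pos += 1

    # Final avalanche so every input bit can affect every output bit.
    h ^= h >> 15
    h = (h * PRIME32_2) & MASK32
    h ^= h >> 13
    h = (h * PRIME32_3) & MASK32
    h ^= h >> 16
    return h
```

With seed 0, `xxh32(b"")` returns `0x02CC5D05`, the published XXH32 value for empty input. The four accumulators do not depend on each other inside the loop, which is what lets a CPU work on them in parallel; XXH3 pushes the same idea further with wider vectors.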
Because it is *non-cryptographic*, xxHash is not designed to resist a malicious party deliberately crafting collisions, and it should never be used for passwords, signatures, or security tokens. For its intended jobs, like spotting accidentally identical content, building hash-table keys, or checksumming data in transit, its statistical quality is excellent and its speed is the whole point.
Why dedupe and cleanup tools rely on it
A storage cleaner that scans thousands of photos and files needs to decide which ones are byte-for-byte identical without comparing every pair of files in full. A common pattern is to hash each file (or a fast partial sample plus size) and group files whose hashes match. Because xxHash is so cheap to compute, that pass is usually limited by how fast storage can be read rather than by hashing, so even on a phone a full-library scan can finish in seconds rather than minutes.
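The grouping pass described above might look like the following sketch. It assumes the third-party `xxhash` Python package and its `xxh3_64` hasher; if that package is not installed, it falls back to `hashlib.blake2b` purely so the example still runs (that fallback is a stand-in, not xxHash). `SAMPLE_BYTES`, `sample_key`, and `candidate_groups` are hypothetical names chosen for illustration.

```python
import hashlib
import os
from collections import defaultdict

try:
    import xxhash  # third-party package: pip install xxhash

    def new_hasher():
        return xxhash.xxh3_64()
except ImportError:
    # Stand-in so the sketch runs without the package. This is NOT xxHash.
    def new_hasher():
        return hashlib.blake2b(digest_size=8)

SAMPLE_BYTES = 64 * 1024  # assumed sample size; tune for the workload


def sample_key(path):
    """Return (file size, hash of the first SAMPLE_BYTES bytes)."""
    size = os.path.getsize(path)
    hasher = new_hasher()
    with open(path, "rb") as f:
        hasher.update(f.read(SAMPLE_BYTES))
    return size, hasher.hexdigest()


def candidate_groups(paths):
    """Group paths that share a size and sample hash; drop singletons."""
    groups = defaultdict(list)
    for path in paths:
        groups[sample_key(path)].append(path)
    return [group for group in groups.values() if len(group) > 1]
```

Files that land in the same group are only candidates: they share a size and a sampled hash, but bytes beyond the sample may still differ.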
After hashing flags candidate duplicates, a careful tool confirms a true match with a full byte comparison before deleting anything, since any fixed-size hash can in theory collide and a non-cryptographic one makes no promise against deliberately crafted collisions. The hash is the fast filter; the byte compare is the safety net. This two-stage approach is how Cleanor finds exact duplicate files efficiently while staying safe about what it removes.
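The confirmation stage can be sketched with Python's standard `filecmp` module. This is a generic illustration of the byte-compare safety net, not Cleanor's actual implementation; `confirm_duplicates` is a hypothetical helper that takes one group of candidate paths whose hashes already matched.

```python
import filecmp


def confirm_duplicates(candidates):
    """Split candidate paths into sets of files with identical bytes.

    shallow=False makes filecmp.cmp compare file contents rather than
    trusting matching os.stat() signatures.
    """
    confirmed = []
    for path in candidates:
        for group in confirmed:
            # Comparing against one representative is enough, because
            # every file already in the group is byte-identical to it.
            if filecmp.cmp(group[0], path, shallow=False):
                group.append(path)
                break
        else:
            confirmed.append([path])
    # Only groups with at least two members are real duplicates.
    return [group for group in confirmed if len(group) > 1]
```

Only the files returned here are safe to treat as exact duplicates; a candidate that fails the byte comparison is simply left alone.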