xxHash

xxHash is an extremely fast non-cryptographic hash function used to fingerprint files and data blocks. On large inputs it can run at close to memory-bandwidth speed, making it well suited to deduplication, integrity checks, and hash tables where speed matters more than collision resistance against attackers.

Also known as: xxhash fast hash, non cryptographic hash, XXH64, XXH3, xxhash

  • xxHash comes in 32-, 64-, and 128-bit variants (XXH32, XXH64, and XXH3 with 64- or 128-bit output); XXH3 is the current high-speed flagship.
  • It is non-cryptographic: great for dedupe and checksums, never for security or passwords.
  • On large inputs, throughput is typically limited by memory bandwidth rather than the CPU, which suits large file scans.

What xxHash is and how it works

xxHash is a family of non-cryptographic hash functions designed by Yann Collet to produce a fixed-size fingerprint (32-, 64-, or 128-bit) from arbitrary input as fast as the CPU can read memory. The modern variant, XXH3, produces 64- or 128-bit output and uses SIMD-friendly arithmetic with a wide internal accumulator. On large inputs its throughput often saturates the memory bus, which puts it far ahead of cryptographic hashes such as SHA-256 and of older designs such as MD5.

Because it is *non-cryptographic*, xxHash is not designed to resist a malicious party deliberately crafting collisions, and it should never be used for passwords, signatures, or security tokens. For its intended jobs, like spotting accidentally identical content, building hash-table keys, or checksumming data in transit, its statistical quality is excellent and its speed is the whole point.
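As a rough illustration of the "fixed-size fingerprint from arbitrary input" idea, here is a minimal Python sketch. It assumes the third-party `xxhash` package is installed and uses its `xxh3_64` streaming hasher; if the package is missing, it falls back to BLAKE2b from the standard library purely so the sketch still runs (that fallback is not xxHash and is slower). The helper names `new_hasher` and `fingerprint` are hypothetical.

```python
import hashlib

try:
    import xxhash  # third-party package, assumed installed via: pip install xxhash

    def new_hasher():
        # XXH3 with 64-bit output
        return xxhash.xxh3_64()
except ImportError:
    def new_hasher():
        # Stand-in only: BLAKE2b is NOT xxHash, it just keeps the sketch runnable.
        return hashlib.blake2b(digest_size=8)


def fingerprint(chunks):
    """Feed an iterable of bytes chunks into one hasher; return a hex digest."""
    h = new_hasher()
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()
```

The digest depends only on the bytes, not on how they are chunked, so `fingerprint([b"hello ", b"world"])` and `fingerprint([b"hello world"])` give the same 16-character hex string. That property is what allows large files to be hashed in fixed-size reads.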

Why dedupe and cleanup tools rely on it

A storage cleaner that scans thousands of photos and files needs to decide which ones are byte-for-byte identical without comparing every pair of files in full. A common pattern is to hash each file (or its size plus a fast partial sample) and group files whose hashes match. Because xxHash is so cheap to compute, the hashing pass adds little beyond the time needed to read the data, which keeps a full-library scan quick even on a phone.
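The grouping pattern described above might look roughly like the following sketch. It is an illustration under stated assumptions, not any particular tool's implementation: it assumes the third-party `xxhash` package (falling back to standard-library BLAKE2b, which is not xxHash, only so the code runs without it), and the function names `hash_file` and `candidate_groups` are hypothetical. Files are first bucketed by size, and only files that share a size are hashed.

```python
import hashlib
import os
from collections import defaultdict

try:
    import xxhash  # third-party package, assumed installed via: pip install xxhash
    new_hasher = xxhash.xxh3_64
except ImportError:
    # Stand-in only: BLAKE2b is NOT xxHash, it just keeps the sketch runnable.
    def new_hasher():
        return hashlib.blake2b(digest_size=8)


def hash_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB reads so memory use stays flat."""
    h = new_hasher()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def candidate_groups(paths):
    """Return lists of paths that share both size and hash."""
    by_size = defaultdict(list)
    for p in paths:
        by_size[os.path.getsize(p)].append(p)

    by_key = defaultdict(list)
    for size, same_size in by_size.items():
        if len(same_size) < 2:
            continue  # a file with a unique size cannot have an identical twin
        for p in same_size:
            by_key[(size, hash_file(p))].append(p)

    return [group for group in by_key.values() if len(group) > 1]
```

Bucketing by size first is a cheap design choice: it needs only file metadata, so files with a unique size are never read at all.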

After hashing flags candidate duplicates, a careful tool confirms a true match with a full byte comparison before deleting anything, since any fixed-size hash can in theory collide and a non-cryptographic one offers no protection against deliberately crafted collisions. The hash is the fast filter; the byte compare is the safety net. This two-stage approach is how Cleanor finds exact duplicate files efficiently while staying safe about what it removes.
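The safety-net step can be sketched in a few lines of Python using only the standard library. This is a generic illustration, not a description of how any specific product implements it; `confirm_duplicates` is a hypothetical helper that takes a list of paths already flagged by matching hashes and keeps only the files that are truly byte-identical.

```python
import filecmp


def confirm_duplicates(group):
    """Split a candidate group into sets of byte-identical files.

    `group` is a list of paths whose hashes matched. Each path is compared
    against the first member of every confirmed bucket, reading file
    contents in full (shallow=False) rather than trusting metadata.
    """
    confirmed = []
    for path in group:
        for bucket in confirmed:
            if filecmp.cmp(bucket[0], path, shallow=False):
                bucket.append(path)
                break
        else:
            # No existing bucket matched, so this file starts a new one.
            confirmed.append([path])
    # Only buckets with two or more files are real duplicate sets.
    return [bucket for bucket in confirmed if len(bucket) > 1]
```

Only paths returned in the same inner list are safe to treat as exact copies; a file whose hash matched but whose bytes differ ends up alone in its bucket and is dropped from the result.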
