MD5 Checksum
Also known as: md5 hash, file checksum md5
An MD5 checksum is a 128-bit value computed from a file's exact bytes. Two byte-identical files always produce the same MD5, while changing a single byte produces a completely different value, making it a fast way to detect exact-duplicate files.
- MD5 produces a fixed 128-bit digest (32 hexadecimal characters) regardless of input file size.
- It detects only byte-identical files; a single changed byte yields a completely different hash.
- MD5 is cryptographically broken (collisions can be forged), so it should not be used for security, but it remains fast and useful for exact-duplicate detection.
How MD5 works
MD5 is a cryptographic hash function that reads a file's bytes and outputs a fixed 128-bit (16-byte) digest, usually shown as a 32-character hexadecimal string. It pads the input and processes it in 512-bit blocks, folding each block into a 128-bit internal state, so the output length is the same whatever the file size.
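As a minimal sketch of the block-by-block behaviour described above, the following Python function computes a file's MD5 with the standard library's `hashlib` module. The function name `md5_of_file` and the 1 MiB chunk size are illustrative choices, not something the original text prescribes.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 digest of a file as a 32-character hex string."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Feed the file to the hash in fixed-size chunks so that large
        # files never have to be held in memory all at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The chunk size affects only memory use and speed, not the result: feeding the same bytes in pieces of any size yields the same digest.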
MD5 is deterministic and avalanche-sensitive: identical input always yields the same digest, and flipping even one bit cascades into a totally different hash. This is the opposite of a perceptual hash, which is designed to stay similar when the image changes only a little.
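The two properties above can be checked with a short experiment. This is an illustrative sketch using Python's `hashlib`; the helper names `md5_hex` and `differing_bits` and the sample sentence are assumptions made for the example.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of `data` as a hex string."""
    return hashlib.md5(data).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions at which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


original = b"The quick brown fox jumps over the lazy dog"
# Flip the lowest bit of the first byte; every other bit stays the same.
altered = bytes([original[0] ^ 0x01]) + original[1:]

a = md5_hex(original)
b = md5_hex(altered)

print(a)  # 9e107d9d372bb6826bd81d3542a419d6
print(b)
print(a == md5_hex(original))  # deterministic: same input, same digest
print(differing_bits(a, b), "of 128 digest bits differ")
```

For a well-mixed hash, roughly half of the 128 output bits are expected to change after a one-bit edit, which is why two MD5 values tell you nothing about how similar two inputs are.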
Exact duplicates vs visual duplicates
Because MD5 reflects raw bytes, it only matches byte-identical files. Two copies of the exact same photo file produce the same MD5, but the same picture re-saved at a different quality, resized, or with edited metadata produces a different MD5 entirely.
That makes MD5 well suited to catching true duplicate files, but unsuitable for finding visually similar images. That job belongs to perceptual hashes such as dHash and pHash, compared by Hamming distance. MD5 is broken for security use because collisions can be deliberately forged, but accidental collisions between ordinary files are vanishingly unlikely, so it remains a fast, adequate choice for simple duplicate detection.
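Exact-duplicate detection with MD5 can be sketched as grouping files by digest. The example below is a hedged illustration, not any particular product's implementation: the names `md5_of_file` and `find_exact_duplicates` are hypothetical, and the size pre-filter is an added optimisation that relies only on the fact that byte-identical files must have the same length.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 digest of a file as a 32-character hex string."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root):
    """Map each MD5 digest to the files under `root` sharing it (2+ files only)."""
    # Step 1: bucket files by size. A file with a unique size cannot have
    # a byte-identical twin, so it never needs to be hashed.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    # Step 2: hash only the files that share a size with another file.
    by_digest = defaultdict(list)
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        for path in same_size:
            by_digest[md5_of_file(path)].append(path)

    # Step 3: keep only digests that occur more than once.
    return {d: sorted(paths) for d, paths in by_digest.items() if len(paths) > 1}
```

Calling `find_exact_duplicates` on a folder returns a dictionary whose values are lists of paths with identical contents. A re-saved or resized copy of a photo will not appear in any group, because its bytes, and therefore its MD5, differ.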
Where Cleanor uses it
Cleanor can use a byte-level checksum to confirm that two files are exactly identical before treating them as duplicates. An exact byte-level match is the safest kind of match. This complements perceptual hashing, which handles the looser, look-alike cases.
Each file only has to be read once to compute its checksum; after that, comparing two 16-byte digests is far cheaper than comparing whole files byte by byte against every other candidate. This lets Cleanor confidently identify exact-duplicate photos and large files on-device and let you remove the redundant copies to free up storage.