Reference

pHash (DCT Hash)

pHash (DCT hash) is a perceptual image hashing method that applies a Discrete Cosine Transform to a grayscale thumbnail and keeps the low-frequency coefficients. This makes its fingerprint robust to scaling, recompression, and many edits, so visually similar photos hash alike.

APIs & internalsGeneral

pHash (DCT Hash)

Also known as: dct perceptual hash, phash algorithm, phash dct, perceptual hash dct

pHash (DCT hash) is a perceptual image hashing method that applies a Discrete Cosine Transform to a grayscale thumbnail and keeps the low-frequency coefficients. This makes its fingerprint robust to scaling, recompression, and many edits, so visually similar photos hash alike.

  • Applies a Discrete Cosine Transform to a grayscale thumbnail (often 32x32) and keeps the low-frequency 8x8 block.
  • Coefficients are thresholded against their median to build a 64-bit hash; similarity is the Hamming distance.
  • More robust to compression, gamma, and minor edits than aHash/dHash, at a higher compute cost.

How DCT-based pHash works

pHash first shrinks the image to a small grayscale square, commonly 32x32 pixels. It then applies a Discrete Cosine Transform (DCT), the same frequency transform used in JPEG, which separates the image into components from low frequency (broad structure) to high frequency (fine detail and noise).

The algorithm keeps only the top-left low-frequency block, typically the upper-left 8x8 of DCT coefficients, because low frequencies capture the dominant visual structure that survives editing. The very first DC coefficient is usually dropped since it represents overall brightness.

Each remaining coefficient is compared to the median of that block: above-median becomes 1, below becomes 0, yielding a 64-bit hash. As with other perceptual hashes, two images are compared by Hamming distance between their hashes.

Why pHash is more robust

By focusing on low-frequency content, pHash discards the high-frequency detail that compression, sharpening, watermarks, and small edits tend to alter. This makes it noticeably more resilient than aHash or dHash to gamma shifts, recompression, and minor cropping.

The trade-off is cost: the DCT is more computation per image than the simple averaging used by aHash. A common pattern in a duplicate finder is to run a cheap hash for a first pass, then confirm candidate near-duplicates with a DCT-based pHash to keep false positives low.

Related terms

Keep reading the reference.

Act on it

Guides and tools for this topic.