
Searchable PDF

Also known as: OCR PDF, text-searchable PDF, searchable scan

A searchable PDF is a scanned document that has a hidden, selectable text layer added by OCR (optical character recognition) beneath the page image. You see the original scan, but you can search, select, and copy the recognized text — unlike a plain image-only scan.

A scan with a hidden OCR text layer underneath
Searchable, selectable, and accessible — not just an image
The text layer is tiny; page images set the file size

Image-only vs searchable

When you scan a page, the result is usually a picture of the text — a computer cannot read the words. OCR analyzes that image and writes the recognized characters into an invisible layer aligned with the scan, turning it into a searchable PDF.
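The alignment step can be sketched as a coordinate conversion. OCR engines usually report each word's box in image pixels with the origin at the top-left, while PDF pages by default use points (1/72 inch) with the origin at the bottom-left. The `WordBox` type and `to_pdf_rect` function below are hypothetical names for illustration, not part of any particular OCR tool; a real pipeline would then draw each word at that rectangle in invisible text mode (text rendering mode 3 in the PDF specification).

```python
from dataclasses import dataclass

POINTS_PER_INCH = 72.0  # default PDF user-space unit is 1/72 inch


@dataclass
class WordBox:
    """One recognized word and its box in image pixels (origin top-left)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def to_pdf_rect(box: WordBox, dpi: float, page_height_px: int):
    """Convert a pixel box to a PDF rectangle (x0, y0, x1, y1) in points.

    PDF's y axis points up from the bottom of the page, so the vertical
    coordinates are flipped relative to the scanned image.
    """
    x0 = box.left * POINTS_PER_INCH / dpi
    x1 = (box.left + box.width) * POINTS_PER_INCH / dpi
    y1 = (page_height_px - box.top) * POINTS_PER_INCH / dpi               # top edge
    y0 = (page_height_px - box.top - box.height) * POINTS_PER_INCH / dpi  # bottom edge
    return (x0, y0, x1, y1)


# Assumed example: an 11-inch-tall page scanned at 300 DPI (3300 px high).
word = WordBox(text="Invoice", left=300, top=300, width=600, height=150)
rect = to_pdf_rect(word, dpi=300, page_height_px=3300)
print(word.text, rect)
```

Because the hidden word sits at the same place as its picture in the scan, selecting text on screen highlights the matching region of the image.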

The visible page looks identical, but now Find works, screen readers can read it aloud, and you can copy passages. This is the difference between an archive you can dig through and a stack of pictures you cannot.

Making and using searchable PDFs

The text layer adds little to the file size because it holds only characters and their positions; the page images account for almost all of the total. Compressing the scans keeps the file manageable.
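A rough back-of-the-envelope comparison shows why the images dominate. The figures here are illustrative assumptions, not measurements: a US Letter page scanned at 300 DPI in 8-bit grayscale, about 3,000 characters of text per page, and one byte per character. Real files store compressed images and add positioning data to the text, so actual numbers differ, but the gap remains large.

```python
def raw_image_bytes(width_in: float, height_in: float, dpi: int,
                    bytes_per_pixel: int = 1) -> int:
    """Uncompressed size of one scanned page image, in bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel


def text_layer_bytes(chars_per_page: int, bytes_per_char: int = 1) -> int:
    """Very rough size of the recognized characters alone, in bytes."""
    return chars_per_page * bytes_per_char


# Assumed inputs: Letter page, 300 DPI, grayscale, 3000 characters.
image = raw_image_bytes(8.5, 11, 300)
text = text_layer_bytes(3000)
print(f"image: {image:,} bytes, text: {text:,} bytes, ratio: {image // text}x")
```

Even if compression shrinks the image a hundredfold, it would still be well over an order of magnitude larger than the characters under these assumptions.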

To add a text layer to a scan, /tools/searchable-scan-to-pdf runs OCR and outputs a searchable PDF; /tools/ocr-pdf-to-text instead extracts the recognized words as plain text.
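Before running OCR, it can help to check whether a PDF already has a text layer. The sketch below is one possible heuristic, not how the tools above work: it assumes the third-party `pypdf` package is installed, and the `has_text_layer` helper and its `min_chars` threshold are hypothetical choices for illustration.

```python
from typing import Iterable, List


def has_text_layer(page_texts: Iterable[str], min_chars: int = 20) -> bool:
    """Return True if any page yields at least min_chars non-whitespace characters.

    The threshold is an arbitrary guard against stray characters; a document
    where only some pages were OCRed still counts as having a text layer.
    """
    return any(len("".join(text.split())) >= min_chars for text in page_texts)


def extract_page_texts(path: str) -> List[str]:
    """Read the extractable text of each page using pypdf (assumed installed)."""
    from pypdf import PdfReader  # third-party: pip install pypdf

    reader = PdfReader(path)
    return [page.extract_text() or "" for page in reader.pages]


# Usage with a hypothetical file name:
# if not has_text_layer(extract_page_texts("scan.pdf")):
#     print("Image-only scan: run OCR to make it searchable")
```

An image-only scan typically yields empty strings here, which is the same reason Find returns nothing in a viewer.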
