OCR (optical character recognition)
Also known as: optical character recognition, image to text, extract text from image
OCR (optical character recognition) reads the text inside an image or scanned document, such as a photo, screenshot, or scanned PDF, and converts it into real text characters that you can select, copy, search, and edit.
- Turns pictures of text into selectable, searchable text
- Works on photos, screenshots, and scanned PDFs
- Accuracy drops on handwriting, skew, and low resolution
What OCR actually does
A photo or scan stores text as pixels, not letters, so you cannot select or search it. OCR analyzes the shapes in the image, recognizes each character, and outputs machine-readable text. That lets you copy a quote from a screenshot, search a scanned contract, or paste a receipt’s numbers into a spreadsheet.
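To make the idea of "recognizing shapes" concrete, here is a toy sketch in Python. It is not how production OCR engines work (those rely on far more robust image processing and trained models); it only illustrates the pixels-to-characters step. The 3x5 glyph templates, the `#`/`.` pixel notation, and all function names are assumptions made up for this example.

```python
# Toy 3x5 glyph templates: '#' is ink, '.' is background (hypothetical font).
TEMPLATES = {
    "O": ["###", "#.#", "#.#", "#.#", "###"],
    "C": ["###", "#..", "#..", "#..", "###"],
    "R": ["##.", "#.#", "##.", "#.#", "#.#"],
}


def split_glyphs(image):
    """Cut the image into glyphs wherever a fully blank column appears."""
    width = len(image[0])
    glyphs, start = [], None
    for x in range(width + 1):
        blank = x == width or all(row[x] == "." for row in image)
        if not blank and start is None:
            start = x  # first inked column of a new glyph
        elif blank and start is not None:
            glyphs.append([row[start:x] for row in image])
            start = None
    return glyphs


def distance(glyph, template):
    """Count differing pixels; glyphs of a different size never match."""
    if len(glyph) != len(template) or len(glyph[0]) != len(template[0]):
        return float("inf")
    return sum(a != b for g, t in zip(glyph, template) for a, b in zip(g, t))


def recognize(image):
    """Return the closest template character for each glyph, left to right."""
    return "".join(
        min(TEMPLATES, key=lambda ch: distance(glyph, TEMPLATES[ch]))
        for glyph in split_glyphs(image)
    )


# A tiny "image" of three letters, stored as pixels rather than characters.
IMAGE = [
    "###.###.##.",
    "#.#.#...#.#",
    "#.#.#...##.",
    "#.#.#...#.#",
    "###.###.#.#",
]

print(recognize(IMAGE))  # prints: OCR
```

Because each glyph is matched to its nearest template rather than an exact one, a single stray pixel usually still yields the right letter, which loosely mirrors why clean scans read better than noisy ones.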
Accuracy depends on the source. Clean, high-contrast, upright text is usually recognized with few or no errors; handwriting, low resolution, skew, and unusual fonts lower accuracy, so the output may need a quick proofread.
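One common way to put a number on OCR accuracy is the character error rate: the number of single-character edits needed to turn the OCR output into the correct text, divided by the length of the correct text. The sketch below is a minimal, assumed implementation using plain Python; the sample strings are invented for illustration and do not come from any real OCR run.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        prev = cur
    return prev[-1]


def character_error_rate(ocr_text, reference):
    """Edits per reference character; 0.0 means a perfect match."""
    if not reference:
        return 0.0 if not ocr_text else 1.0
    return edit_distance(ocr_text, reference) / len(reference)


# Invented example: typical look-alike confusions (l/1, O/0).
reference = "Total: 105.00"
ocr_text = "Tota1: lO5.00"
print(f"{character_error_rate(ocr_text, reference):.2f}")
```

A rate near zero suggests the text can be used as is; a higher rate is a signal that a proofread is worthwhile.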
Images vs PDFs
OCR works on both image files and PDFs. A scanned PDF is often just a stack of page images with no real text underneath — running OCR adds a searchable text layer so you can find and copy words. The same applies to a photo of a page or a screenshot.
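A quick way to tell whether a PDF already has a text layer is to try extracting text from its pages and see whether anything comes back. The sketch below assumes the third-party `pypdf` package is installed; the `min_chars` threshold and the file name `scan.pdf` are arbitrary placeholders chosen for illustration, not recommended values.

```python
def has_text_layer(page_texts, min_chars=20):
    """Heuristic: the PDF counts as searchable if any page yields enough text."""
    return any(len(text.strip()) >= min_chars for text in page_texts)


def extract_page_texts(path):
    """Read each page's embedded text with pypdf (third-party, assumed installed)."""
    from pypdf import PdfReader  # lazy import so the heuristic works without it

    return [page.extract_text() or "" for page in PdfReader(path).pages]


# Usage (hypothetical file name):
# if not has_text_layer(extract_page_texts("scan.pdf")):
#     print("No text layer found; this PDF probably needs OCR.")
```

This only detects a missing text layer. Adding one still requires running the pages through an OCR tool.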