How to OCR a PDF: Make Scanned Documents Searchable

You have a PDF. You try to search for a word in it. Nothing happens. You try to copy a sentence. Nothing comes through. The text is right there on the page — you can see it — but to your computer, it doesn't exist.

That PDF is a scanned document. Underneath the surface it's just a collection of images, with no text data attached. To OCR a PDF means to run Optical Character Recognition over it — and that fixes the problem.

What OCR actually does

OCR software looks at an image of a page and identifies the shapes that form letters and words. For each shape it finds, it guesses which character it represents based on a trained model. Then it stitches those characters together into words, words into lines, and lines into paragraphs.
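
The stitching step can be illustrated with a minimal sketch. It assumes the recognizer has already produced words with pixel positions; the `Word` type, the `group_into_lines` function, and the 10-pixel tolerance are hypothetical names and values chosen for illustration, not part of any specific OCR engine.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge of the word, in pixels
    y: float  # vertical center of the word, in pixels


def group_into_lines(words, line_tolerance=10.0):
    """Group words with similar vertical position into lines, left to right."""
    lines = []
    for word in sorted(words, key=lambda w: w.y):
        # A word joins the current line if it sits close to the previous word vertically.
        if lines and abs(word.y - lines[-1][-1].y) <= line_tolerance:
            lines[-1].append(word)
        else:
            lines.append([word])
    # Within each line, order words by horizontal position and join them.
    return [" ".join(w.text for w in sorted(line, key=lambda w: w.x)) for line in lines]


words = [
    Word("world", 80, 101),
    Word("Hello", 10, 100),
    Word("line", 70, 140),
    Word("Second", 10, 141),
]
print(group_into_lines(words))  # ['Hello world', 'Second line']
```

Real engines use more robust layout analysis than a fixed tolerance, but the idea is the same: position decides reading order.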

The output is a layer of invisible, selectable text positioned exactly where the visible text appears in the image. OCR doesn't change how the page looks; it changes what the computer can read in it. Now you can:

Search the document for any word
Highlight, copy, and paste text
Convert the PDF to Word, Excel, or plain text
Have a screen reader read it aloud
Index it in a document management system

Tip

OCR doesn't replace the image with text. It adds text behind the image. If you want plain text without the original page layout, run PDF to Text after OCR completes.
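
To place invisible text behind the image, each recognized word's pixel box has to be mapped into PDF page coordinates. The sketch below shows that conversion under two facts about PDF: the default unit is the point (1/72 inch) and the origin is the bottom-left corner of the page. The function name and the example box are hypothetical.

```python
def pixel_box_to_pdf_points(box, dpi, page_height_px):
    """Convert an image-space box to PDF user space.

    box is (left, top, right, bottom) in pixels with the origin at the top-left.
    The result is (x0, y0, x1, y1) in points with the origin at the bottom-left.
    """
    left, top, right, bottom = box
    x0 = left * 72 / dpi
    x1 = right * 72 / dpi
    # Flip the vertical axis: the bottom of the box becomes the lower y value.
    y0 = (page_height_px - bottom) * 72 / dpi
    y1 = (page_height_px - top) * 72 / dpi
    return (x0, y0, x1, y1)


# A word one inch from the top-left corner of a US Letter page scanned at 300 dpi
# (the page is 3300 pixels tall at that resolution).
print(pixel_box_to_pdf_points((300, 300, 900, 360), dpi=300, page_height_px=3300))
```

A PDF writer would then draw the word at that position using an invisible text rendering mode, leaving the scanned image as the only thing the reader sees.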

How accurate is OCR in 2026?

Modern OCR is remarkably good. On clean, well-scanned pages with standard fonts, accuracy regularly exceeds 99%. That means fewer than 1 in every 100 characters is misread, and most of those errors are obvious typos a spell-check would catch.
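
To make the percentage concrete, here is a small worked calculation. The figure of 2,000 characters per page is an assumption for illustration, and `expected_errors` is a hypothetical helper.

```python
def expected_errors(char_count, accuracy):
    """Estimate how many characters are misread at a given character accuracy."""
    return round(char_count * (1 - accuracy))


# Assumed: a dense page holds roughly 2,000 characters.
print(expected_errors(2000, 0.99))   # 20
print(expected_errors(2000, 0.999))  # 2
```

Even a small gain in accuracy makes a large difference in how much proofreading a long document needs.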

Accuracy drops sharply in three situations:

Poor scan quality. Low resolution (below 300 dpi), skewed alignment, faded ink, or coffee stains all confuse the recognizer.
Unusual fonts or handwriting. Open-source OCR models are trained on common printed fonts. Handwriting recognition exists but is a different problem entirely (and a different model).
Multi-column or table layouts. The recognizer reads left to right, top to bottom, so text from side-by-side columns or table cells can get interleaved.

For best results, scan at 300 dpi minimum, in grayscale or black-and-white, with the page aligned squarely.
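
If you only have the image file, you can estimate its resolution from its pixel width and the physical width of the paper. This is a rough sketch: the helper names are hypothetical, and the 8.5-inch width assumes a US Letter page.

```python
def effective_dpi(pixel_width, page_width_inches):
    """Resolution implied by an image's pixel width and the paper's physical width."""
    return pixel_width / page_width_inches


def meets_minimum(pixel_width, page_width_inches, minimum_dpi=300):
    """True if the scan reaches the recommended minimum resolution."""
    return effective_dpi(pixel_width, page_width_inches) >= minimum_dpi


# Assumed: US Letter paper, 8.5 inches wide.
print(effective_dpi(2550, 8.5))  # 300.0
print(meets_minimum(1275, 8.5))  # False (this scan is only 150 dpi)
```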

How OCR works in your browser

iSavePDF runs OCR fully in your browser using Tesseract.js, a port of the open-source Tesseract OCR engine that runs in the browser via WebAssembly. Your file never leaves your device: the entire recognition pipeline runs on your computer.

The tradeoff is upfront cost: the first time you use the tool, your browser downloads about 12 MB of model data. After that, runs are fast and offline-capable. Each page typically takes 5–15 seconds to process depending on your device.
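
For a rough sense of how long a document will take, the per-page range above can be turned into a back-of-the-envelope estimate. The function below is a hypothetical helper that ignores the one-time model download, since that depends on your connection speed.

```python
def estimate_processing_seconds(pages, seconds_per_page=(5, 15)):
    """Return the (best case, worst case) recognition time in seconds."""
    fastest, slowest = seconds_per_page
    return pages * fastest, pages * slowest


low, high = estimate_processing_seconds(20)
print(f"A 20-page scan: about {low // 60}-{high // 60} minutes")
```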

How to OCR a PDF online with iSavePDF

Open the OCR PDF tool
Drop your scanned PDF onto the upload zone
Wait for the model to download (first time only, ~12 MB)
Watch the per-page progress as recognition runs
Download the OCR'd PDF — same pages, now with a searchable text layer

The output file is roughly the same size as the original. The added text layer is small (text is much smaller than images). Your PDF is now fully searchable in any PDF viewer.

When OCR helps — and when it doesn't

OCR is the right tool when:

You scanned paper documents and need to search them — OCR is what makes a PDF searchable
You have a PDF that won't let you copy text
You want to convert a scanned PDF to Word or Excel
You're archiving documents that need to be findable later

OCR is not the right tool when:

The PDF already has selectable text — running OCR adds nothing and may degrade copy quality. Try copying first; if it works, skip OCR.
The PDF is your only copy of a sensitive document and the scan is poor. OCR will introduce errors. Re-scan first if you can.
You only need a few lines of text. Re-typing is faster than waiting for the model download and per-page recognition.
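
The "try copying first" check can also be approximated in code. The sketch below assumes you have already extracted each page's text with a PDF library of your choice (not shown); the function name and the 20-character threshold are hypothetical.

```python
def pages_needing_ocr(page_texts, min_chars=20):
    """Return the 1-based numbers of pages with little or no extractable text."""
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if len(text.strip()) < min_chars
    ]


# Assumed input: one extracted string per page. Page 1 has real text,
# pages 2 and 3 are effectively empty, which suggests they are scanned images.
extracted = ["A full page of real, selectable text lives here.", "", "  \n"]
print(pages_needing_ocr(extracted))  # [2, 3]
```

If the list comes back empty, the PDF already has a text layer and OCR can be skipped.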

After OCR: what changed and what didn't

The visible page is identical. What's different:

Searchable. Cmd-F or Ctrl-F now finds words across the whole document.
Selectable. Click and drag selects text under the cursor.
Copyable. Copy and paste pulls real text, not page coordinates.
Accessible. Screen readers can now read the document.
Convertible. PDF to Word, Excel, or plain text now produces useful output.

What didn't change:

File size is roughly the same (text layer is small)
Visual appearance is unchanged
Page count is unchanged
Existing annotations, bookmarks, and form fields are preserved

How it compares

If you're looking to OCR PDF online without installing anything, here's how the main options stack up.

| Tool | Where it runs | Languages | Cost |
|---|---|---|---|
| iSavePDF | Browser (fully private) | English | Free |
| Adobe Acrobat | Desktop app | 40+ | Paid ($23/mo) |
| ABBYY FineReader | Desktop app | 200+ | Paid (one-time) |
| Google Drive | Cloud | 100+ | Free with account |
| Tesseract (CLI) | Local install | 100+ | Free, technical |

ABBYY's recognition is generally considered best-in-class for language coverage, table extraction, and handwriting support. For typical English documents, iSavePDF's browser-based OCR typically produces output comparable to premium tools.

FAQ