PDF OCR
Run OCR on a scanned PDF and download a plain-text file with the recognized text.
100% in your browser. Your file never leaves your device.
Features
- Six built-in languages (EN, DE, FR, ES, IT, PT)
- Runs in your browser — no upload
- Per-page progress display
- Plain-text output (.txt)
- Cached language models on repeat use
- Searchable-PDF output in the native app
About this tool
Tesseract.js runs entirely in your browser (loaded on demand — never on idle pageviews). Pick a language, drop a scanned PDF, and the tool produces a clean .txt file you can paste into Word, Notes or anywhere else. The native app produces searchable-PDF output with embedded text layers.
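The "loaded on demand" behaviour can be sketched as a small wrapper that runs a loader only on first use and then reuses the result. This is a minimal sketch, not this tool's actual code: `lazyOnce`, `getEngine` and the fake engine object are hypothetical names used for illustration.

```typescript
// Wrap any async loader so it runs at most once, and only when first requested.
function lazyOnce<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      pending = load(); // first call triggers the (possibly large) download
    }
    return pending; // later calls reuse the same promise
  };
}

// Hypothetical stand-in for the real OCR engine, so the sketch runs anywhere.
let loadCount = 0;
const getEngine = lazyOnce(async () => {
  loadCount += 1;
  return { name: "fake-ocr-engine" };
});
```

Nothing is loaded until `getEngine()` is first called, for example when a file is dropped. On a real page the loader could be a dynamic import such as `() => import("tesseract.js")`; that wiring is an assumption here.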
How to OCR a PDF
- Pick a language. Choose the dominant language of the document.
- Drop your PDF. Best results come from scans at ≥ 200 DPI.
- Wait for recognition. A progress bar shows per-page progress.
- Download the text. A .txt file with one section per page is saved.
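The steps above involve a few small calculations, sketched here under stated assumptions: that each page is rasterized before recognition, that progress is combined across pages, and that page texts are joined under a header. All function names and the `--- Page N ---` header format are hypothetical, not taken from the tool.

```typescript
// PDF page sizes are expressed in points, with 72 points per inch.
const POINTS_PER_INCH = 72;

// Scale factor needed to rasterize a page at the requested DPI.
function renderScale(dpi: number): number {
  return dpi / POINTS_PER_INCH;
}

// Overall progress in percent, given the zero-based page being processed
// and how far recognition of that page has come (0 to 1).
function overallProgress(pageIndex: number, pageFraction: number, pageCount: number): number {
  if (pageCount <= 0) return 0;
  const done = (pageIndex + pageFraction) / pageCount;
  return Math.round(Math.min(1, Math.max(0, done)) * 100);
}

// Join recognized page texts into one string with a labelled section per page.
// The header format is an assumption, not the tool's actual output format.
function joinPages(pageTexts: string[]): string {
  const sections = pageTexts.map((text, i) => `--- Page ${i + 1} ---\n${text.trim()}`);
  return sections.join("\n\n") + "\n";
}
```

For example, `renderScale(144)` gives a scale of 2, and `overallProgress(2, 0, 4)` reports 50 when the third of four pages is about to start.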
Frequently asked questions
Why is the first run slow?
The language data (~10 MB) is downloaded once and cached for next time.
How accurate is it?
Tesseract 5 reaches 95%+ accuracy on clean, modern scans. Handwriting recognition is poor.
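The figure above does not say how accuracy is measured. One common measure is character accuracy: 1 minus the edit distance between the true text and the recognized text, divided by the length of the true text. The sketch below assumes that definition; `editDistance` and `characterAccuracy` are hypothetical helpers for checking your own scans, not part of the tool.

```typescript
// Levenshtein edit distance between two strings (insert, delete, substitute).
function editDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Character accuracy: 1 minus the character error rate, floored at 0.
function characterAccuracy(reference: string, recognized: string): number {
  if (reference.length === 0) return recognized.length === 0 ? 1 : 0;
  const errors = editDistance(reference, recognized);
  return Math.max(0, 1 - errors / reference.length);
}
```

Under this measure, one wrong character in every twenty corresponds to 95% accuracy.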
Can I get a searchable PDF?
Searchable-PDF output (text layer over images) is in the native app.
Is my scan uploaded?
No — OCR runs entirely in your browser.
Is it free?
Yes.