Is the PDF to Markdown tool free?

Yes. The tool is completely free: no signup, no watermark on output, and no paid tier or premium features. The upload zone accepts single PDF files up to about 50 MB. iSavePDF is funded by display ads on the page, and there is no upsell.

Will it work on scanned PDFs?

No, not directly. Scanned PDFs are images of pages — there's no extractable text layer, so the tool will return empty or near-empty Markdown. To convert a scanned PDF, you need to run optical character recognition (OCR) first using a desktop tool like Adobe Acrobat, Tesseract, or Apple Preview's text recognition feature. Once the PDF has an embedded text layer, iSavePDF can extract from it normally.
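A quick way to picture the "empty or near-empty" check is to count the characters extracted from each page. The sketch below is only an illustration: the function name and the threshold of 20 characters per page are assumptions, not the values iSavePDF actually uses.

```python
def looks_scanned(chars_per_page: list[int], min_chars_per_page: int = 20) -> bool:
    """Guess whether a PDF is image-only from per-page extracted character counts.

    The threshold is a hypothetical value chosen for illustration.
    """
    if not chars_per_page:
        # No pages yielded any text at all.
        return True
    average = sum(chars_per_page) / len(chars_per_page)
    return average < min_chars_per_page
```

If a check like this returns true for your file, run OCR first and try again.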

How accurate is the conversion?

For PDFs created from text sources (Word, Google Docs, web exports, LaTeX), accuracy is typically very high for paragraphs and lists, good for headings, and partial for tables. For complex layouts (multi-column academic papers, magazine-style designs, PDFs with floating text boxes) the result is best treated as a starting point that you'll clean up by hand. Markdown is fast to edit in any text editor, so even a partial extraction saves significant work over starting from scratch.

Are my PDF files uploaded to a server?

No. The entire extraction happens inside your browser tab using pdfjs-dist (the JavaScript PDF library Mozilla ships in Firefox). No upload occurs at any point — you can verify this by opening DevTools, switching to the Network tab, and watching during conversion. There will be no outbound requests carrying your PDF content. The PDF, the extracted Markdown, and any intermediate state all live in your browser's memory and never leave the device.

Will images from the PDF appear in the Markdown?

In v1, no — only text is extracted. Inline images, charts, diagrams, and figures are not extracted or embedded into the resulting Markdown. If your PDF relies heavily on imagery, the Markdown will preserve the prose around the images but the images themselves will be missing. For PDF-to-image extraction, use our PDF to JPG or PDF to PNG tools.

What heading levels does it use?

Heading levels are inferred from font size. The largest non-body font size becomes H1 (#), the next becomes H2 (##), and so on, capped at H6. If your PDF uses only one heading size, you'll get only H1s. If it uses many sizes that are very close together, the bucketing may merge them into one level. Review the output and adjust manually if your downstream tool needs a specific heading structure.
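The bucketing idea can be sketched in a few lines of Python. This is a simplified illustration, not the tool's actual code: it assumes body text is the most common font size, and the `tolerance` parameter (how close two sizes must be to share a level) is a hypothetical value.

```python
from collections import Counter


def heading_levels(line_sizes: list[float], tolerance: float = 0.5) -> dict[float, int]:
    """Map each font size larger than body text to a heading level from 1 to 6."""
    if not line_sizes:
        return {}
    # Assumption: the most frequent size in the document is body text.
    body_size = Counter(line_sizes).most_common(1)[0][0]
    larger = sorted({s for s in line_sizes if s > body_size + tolerance}, reverse=True)

    levels: dict[float, int] = {}
    level = 0
    previous = None
    for size in larger:
        # Sizes within `tolerance` of the previous one are merged into the same level.
        if previous is None or previous - size > tolerance:
            level = min(level + 1, 6)  # cap at H6
        levels[size] = level
        previous = size
    return levels
```

With sizes 24, 18, 17.8, and a body size of 11, the 24 pt lines become H1 and both 18 pt and 17.8 pt lines are merged into H2, which is the merging behavior described above.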

Are bold and italic preserved?

Partially. The tool checks the font name of each text item — fonts ending in 'Bold', 'Black', or 'Heavy' are treated as bold; fonts ending in 'Italic' or 'Oblique' are treated as italic. Whole-line emphasis (a single bold or italic line) is detected reliably. Per-word emphasis inside a paragraph (a single bold word in otherwise-regular prose) is harder to detect from pdfjs's output and may not always survive in v1.
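As a rough illustration of the font-name check, here is a minimal Python sketch. It assumes the real font name is already available as a plain string, and it adds one assumption beyond the rule above: an italic suffix is stripped before testing for bold, so combined names such as "Helvetica-BoldOblique" count as both. The function names are hypothetical.

```python
BOLD_SUFFIXES = ("bold", "black", "heavy")
ITALIC_SUFFIXES = ("italic", "oblique")


def emphasis_from_font(font_name: str) -> tuple[bool, bool]:
    """Return (is_bold, is_italic) guessed from the end of a font name."""
    name = font_name.lower()
    is_italic = name.endswith(ITALIC_SUFFIXES)
    if is_italic:
        # Assumption: drop the italic suffix so "BoldItalic" names still match bold.
        for suffix in ITALIC_SUFFIXES:
            if name.endswith(suffix):
                name = name[: -len(suffix)]
                break
    is_bold = name.endswith(BOLD_SUFFIXES)
    return is_bold, is_italic


def wrap_line(text: str, font_name: str) -> str:
    """Wrap a whole line in Markdown emphasis markers based on its font."""
    bold, italic = emphasis_from_font(font_name)
    if bold and italic:
        return f"***{text}***"
    if bold:
        return f"**{text}**"
    if italic:
        return f"*{text}*"
    return text
```

This covers the whole-line case only; per-word emphasis would need the same check applied to each text item before lines are joined.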

Can I extract from password-protected PDFs?

No. Encrypted PDFs need to be unlocked first in your PDF reader. In Adobe Reader, the Security tab under Properties shows which protections apply; if you know the password, most readers let you open the file and use File → Print → Save as PDF to produce an unprotected copy. iSavePDF doesn't unlock encrypted PDFs as a matter of security policy: the tools that do this often double as bypass tools for legitimately protected documents.

What about tables, footnotes, and references?

Simple tables with clear borders are detected and emitted as Markdown pipe tables; complex tables become jumbled lines in reading order — fixing them requires manual cleanup. Footnote markers (like ¹ or [1]) appear inline next to their callout in the prose; footnote bodies appear at the end of the page section in the output. References sections are extracted as plain paragraphs — formatting them as Markdown citations would require an actual reference parser, which is out of scope.
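For reference, a Markdown pipe table is just rows of cells separated by `|`, with a separator row after the header. The sketch below shows how already-detected rows could be written out in that form; it assumes the hard part (finding rows and cells in the PDF) has been done, and the function name is hypothetical.

```python
def to_pipe_table(rows: list[list[str]]) -> str:
    """Render rows of cell text as a Markdown pipe table; the first row is the header."""
    if not rows:
        return ""
    width = max(len(row) for row in rows)

    def fmt(cells: list[str]) -> str:
        # Pad short rows and escape literal pipes so they don't split a cell.
        padded = list(cells) + [""] * (width - len(cells))
        escaped = [cell.strip().replace("|", "\\|") for cell in padded]
        return "| " + " | ".join(escaped) + " |"

    header, *body = rows
    lines = [fmt(header), "| " + " | ".join(["---"] * width) + " |"]
    lines.extend(fmt(row) for row in body)
    return "\n".join(lines)
```

If a complex table comes out jumbled, rebuilding it by hand in this shape is usually the quickest fix.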

CONVERT TOOL

PDF to Markdown

Extract a PDF as Markdown — headings, lists, paragraphs.

Works best with text-based PDFs. Scanned (image-only) PDFs and complex multi-column layouts may produce empty or inconsistent output. For best results, use PDFs originally created from Word, Google Docs, web exports, or LaTeX.


PDFs are great at preserving exactly how a document looks, but terrible at letting you reuse the content inside them. The moment you want to pull text out of a PDF, edit it, version-control it, or feed it into another tool (a note app, a static site, an LLM), you hit the wall of PDF's frozen structure. iSavePDF's PDF to Markdown tool extracts the readable content of a PDF and reconstructs it as a Markdown (.md) file, with headings, lists, paragraphs, and basic formatting inferred from the original layout. Markdown is the format Obsidian, Notion, Bear, Logseq, Hugo, Astro, and most modern knowledge tools speak natively, so this is often the cleanest way to move PDF content into a system that lets you actually work with it.

**An honest caveat up front**: PDF to Markdown is a best-effort conversion, not an exact one. PDFs created from text sources (Word documents, Google Docs exports, web pages saved as PDF, LaTeX output) convert well. Scanned PDFs, where each page is an image of a scanned document, contain no extractable text and will produce empty or near-empty output. PDFs with multi-column layouts, complex tables, or unusual typography may produce inconsistent results. We surface this disclosure on the upload zone so you know what to expect before clicking convert.

Step by step

How to convert PDF to Markdown on iSavePDF

Open PDF to Markdown on iSavePDF
Visit isavepdf.com/pdf-to-md in any modern browser. The tool loads instantly, requires no signup, and works offline once the page has been cached. Read the warning banner near the upload zone — it explains which PDFs convert well and which don't, so you can decide whether the tool fits your file before uploading.
Upload your PDF
Drag your PDF into the upload zone or click to pick it from your file picker. The tool accepts single PDF files up to about 50 MB. For best results, use PDFs that were created from a text source (Word, Google Docs, LaTeX, browser print-to-PDF, design tools that include a text layer) rather than scanned image PDFs.
Click Extract as Markdown
Click the convert button. iSavePDF uses pdfjs-dist (the same PDF rendering library used by Mozilla's PDF reader) to extract every text item from every page along with its font, size, and position metadata. Heuristics then group items into lines, infer heading levels from font size, detect list markers, and emit clean Markdown.
Review the result summary
After conversion, you'll see a summary showing how many characters were extracted from how many pages. If the count is zero or near-zero, your PDF is probably scanned — the tool will surface a hint suggesting OCR is needed first. For PDFs with extractable text, the summary confirms the extraction worked and you can move to download.
Download the .md file
Save the Markdown file to your device. Open it in any text editor, paste it into your note app, commit it to a repo, or feed it into an LLM as cleaner input than raw PDF. The original PDF is never sent to a server — extraction happened entirely in your browser tab.
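The "group items into lines" step from the conversion stage can be sketched as follows. This is an illustration in Python rather than the tool's browser code, and it makes several assumptions: each text item is a dict with hypothetical keys `text`, `x`, and `y`, coordinates follow the PDF convention where `y` grows upward, and the 2-unit tolerance is an arbitrary choice.

```python
def group_into_lines(items: list[dict], y_tolerance: float = 2.0) -> list[str]:
    """Group positioned text items into lines, top of page first.

    Items whose y position is within `y_tolerance` of the first item
    on the current line are treated as part of that line.
    """
    lines: list[list[dict]] = []
    # Larger y means higher on the page, so sort descending.
    for item in sorted(items, key=lambda i: -i["y"]):
        if lines and abs(lines[-1][0]["y"] - item["y"]) <= y_tolerance:
            lines[-1].append(item)
        else:
            lines.append([item])
    # Within a line, read left to right.
    return [
        " ".join(i["text"] for i in sorted(line, key=lambda i: i["x"]))
        for line in lines
    ]
```

A single-pass grouping like this is also why multi-column layouts are hard: two columns at the same height look like one long line unless the horizontal gaps are analyzed as well.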

How it works

How PDF to Markdown works

Upload your PDF
Drop the PDF in. Text-based PDFs (Word/Google Docs/web exports) convert best; scanned PDFs may produce empty output.
Preview the Markdown
We extract text + structure, infer headings and lists from font size and bullet patterns, and show the result for review.
Download as .md
Save the Markdown file to your device. The original PDF is never sent anywhere.
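To give a feel for what "bullet patterns" means here, the sketch below normalizes common list markers into Markdown list syntax. The exact set of bullet characters and numbering styles is an assumption made for illustration, and the function name is hypothetical.

```python
import re

# Assumed marker sets; a real extractor may recognize more or fewer.
BULLET = re.compile(r"^\s*[•◦▪‣\-\*–]\s+(.*)$")
NUMBERED = re.compile(r"^\s*(\d+)[.)]\s+(.*)$")


def normalize_list_line(line: str) -> str:
    """Rewrite a line that starts with a list marker as a Markdown list item."""
    match = BULLET.match(line)
    if match:
        return f"- {match.group(1)}"
    match = NUMBERED.match(line)
    if match:
        return f"{match.group(1)}. {match.group(2)}"
    # Not a list item: leave the line untouched.
    return line
```

Requiring whitespace after the marker keeps ordinary sentences such as "1999 was a good year" from being mistaken for numbered items.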

When to use it

Common use cases

Migrating documents into Obsidian, Notion, or Logseq
If you're moving years of PDF-stored notes, articles, or documents into a Markdown-first knowledge system like Obsidian, Notion, or Logseq, retyping them by hand is impractical. PDF to Markdown gives you a starting point: each PDF becomes a Markdown file you can drop into your vault, optionally clean up, and link to other notes. The conversion preserves heading structure where it can be inferred, so your existing PDFs slot into your knowledge graph without losing their outline. The tool converts one PDF at a time, so organize the downloaded files in batches with your file manager; working this way, you can migrate a reference library in an afternoon.
Cleaning up text for LLM input
Large language models work better with structured text than with raw PDF data. Many AI tools either can't read PDFs at all or read them poorly: text gets jumbled, tables flatten into noise, headings disappear. Converting your PDF to clean Markdown first and feeding the Markdown to the LLM usually produces better answers, because the model sees real headings, real paragraphs, real lists. This is especially useful for research, technical writing, content analysis, or anything where the source document's structure carries meaning the model needs to respect.
Republishing PDF content on a Markdown-based website
Writers and publishers who have content sitting in PDF form (whitepapers, ebooks, old blog exports, research reports) and want to republish it on a Markdown-based site (Hugo, Astro, Eleventy, Gatsby, Next.js MDX) would otherwise face hours of manual retyping. PDF to Markdown gets you most of the way there: heading structure, paragraphs, lists, and whole-line emphasis carry over, while per-word emphasis may need a manual pass. You polish the result, add frontmatter and image links, and you've turned a static PDF into a live, searchable, linkable web article.
Extracting structured notes from academic papers
Academics, researchers, and students often have PDFs of papers, lecture notes, and textbook chapters that they want to reorganize into their own notes. Plain PDF-to-text extraction gives a wall of unformatted prose; PDF to Markdown gives the same prose plus the heading structure, which makes it much easier to skim, restructure, and synthesize. Note: this works only for papers with an embedded text layer. Modern journal PDFs and most preprints have one; old scanned papers may not. For the latter, run OCR first using a desktop tool, then re-export to a text PDF and run that through iSavePDF.

Why iSavePDF

The privacy-first way to convert PDF to Markdown

Most online PDF-to-Markdown converters work by uploading your PDF to a server, running extraction there, and sending the Markdown back. That means your PDF — which may contain confidential research, internal company documents, draft writing, financial reports, or personal records — leaves your device and lives temporarily on someone else's infrastructure. Some services then keep your file longer than they admit, train models on it, or share it with third-party processors. For documents subject to NDA, professional confidentiality, or compliance requirements (HIPAA, GDPR, SOC 2), this is often a hard policy violation.

iSavePDF runs the entire extraction in your browser using pdfjs-dist, the same PDF parsing library Mozilla ships in Firefox. The PDF is read into memory in your tab, its text and structure are reconstructed locally, the Markdown is generated locally, and the result is handed to your browser's download mechanism. There is no server-side processing. You can open DevTools, switch to the Network tab, and watch: you'll see zero outbound requests carrying your PDF content. The tool is free, with no signup, no watermark, and no upsell. We fund the site with banner ads on the page, not by monetizing the documents people convert.

Tips & limits

Tips for the best results

Best results from text-based PDFs
PDFs that started as Word documents, Google Docs exports, LaTeX output, or web pages saved via Print to PDF have a clean text layer and convert reliably. PDFs created by photographing or scanning paper documents are images of text — they contain no extractable text layer and will produce empty Markdown. Use Adobe Acrobat, Tesseract, or a similar OCR tool to add a text layer first, then re-run through iSavePDF.
Heading detection uses font size
The tool infers heading levels by comparing font sizes across the document. The largest size that isn't body text becomes H1, the next largest becomes H2, and so on. PDFs that mark headings with bold text at body size (instead of larger fonts) may produce no headings; PDFs with unconventional typography may produce surprising results. Review the output and adjust heading levels manually if needed. Markdown is fast to edit in any text editor.
Tables get partial support
PDFs don't store tables as tables — they store text positioned in columns, which extractors have to reconstruct heuristically. Simple two- or three-column tables with clear borders often work; complex tables with merged cells, nested headers, or many columns usually come out as jumbled lines. For data you need to preserve as a table, consider extracting to Excel using our PDF to Excel tool, which uses a different extraction strategy tuned for tabular data.
Headers and footers are stripped automatically
Page headers and footers that repeat across pages (book title, chapter name, page numbers) are detected by their repetition and stripped from the output. This keeps your Markdown clean. If a repeating header or footer carries content you want to keep, it will be removed along with the rest, so review the output and re-add it if needed.
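Detection by repetition can be sketched like this. The example is a simplified illustration, not the tool's actual logic: it only looks at the first and last line of each page, it treats digits as interchangeable so that "Page 3" and "Page 4" count as the same footer, and the 60% threshold is a hypothetical value.

```python
import re
from collections import Counter


def strip_repeated_lines(pages: list[list[str]], min_share: float = 0.6) -> list[list[str]]:
    """Remove first/last lines that recur on most pages (likely headers or footers)."""

    def key(line: str) -> str:
        # Assumption: mask digits so changing page numbers still compare equal.
        return re.sub(r"\d+", "#", line.strip())

    counts: Counter = Counter()
    for lines in pages:
        # Count each edge line once per page, even on single-line pages.
        counts.update({key(line) for line in lines[:1] + lines[-1:]})

    threshold = max(2, min_share * len(pages))
    repeated = {k for k, n in counts.items() if n >= threshold}

    cleaned = []
    for lines in pages:
        kept = list(lines)
        if kept and key(kept[0]) in repeated:
            kept = kept[1:]
        if kept and key(kept[-1]) in repeated:
            kept = kept[:-1]
        cleaned.append(kept)
    return cleaned
```

The same rule explains the caveat above: anything that repeats at the page edge is treated as furniture, whether or not you wanted it.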

