iSavePDF
CONVERT TOOL

PDF to Markdown

Extract a PDF as Markdown — headings, lists, paragraphs.

Works best with text-based PDFs. Scanned (image-only) PDFs and complex multi-column layouts may produce empty or inconsistent output. For best results, use PDFs originally created from Word, Google Docs, web exports, or LaTeX.

PDFs are great at preserving exactly how a document looks, but poor at letting you reuse the content inside them. The moment you want to pull text out of a PDF, edit it, version-control it, or feed it into another tool (a note app, a static site, an LLM), you run into PDF's fixed-layout structure. iSavePDF's PDF to Markdown tool extracts the readable content of a PDF and reconstructs it as a Markdown (.md) file, with headings, lists, paragraphs, and basic formatting inferred from the original layout. Markdown is the format that Obsidian, Notion, Bear, Logseq, Hugo, Astro, and most modern knowledge tools read natively or import directly, so this is often the cleanest way to move PDF content into a system that lets you actually work with it.

An honest caveat up front: PDF to Markdown is a best-effort conversion, not an exact one. PDFs created from text sources (Word documents, Google Docs exports, web pages saved as PDF, LaTeX output) convert well. Scanned PDFs, where each page is an image of a scanned document, contain no extractable text and will produce empty or near-empty output. PDFs with multi-column layouts, complex tables, or unusual typography may produce inconsistent results. We show this notice on the upload zone so you know what to expect before clicking convert.

Step by step

How to convert PDF to Markdown on iSavePDF

  1. Open PDF to Markdown on iSavePDF

    Visit isavepdf.com/pdf-to-md in any modern browser. The tool loads instantly, requires no signup, and works offline once the page has been cached. Read the warning banner near the upload zone — it explains which PDFs convert well and which don't, so you can decide whether the tool fits your file before uploading.

  2. Upload your PDF

    Drag your PDF into the upload zone, or click the zone to choose the file with your file picker. The tool accepts one PDF at a time, up to about 50 MB. For best results, use PDFs that were created from a text source (Word, Google Docs, LaTeX, browser print-to-PDF, design tools that include a text layer) rather than scanned image PDFs.

  3. Click Extract as Markdown

    Click the Extract as Markdown button. iSavePDF uses pdfjs-dist (the packaged build of PDF.js, the library behind Firefox's built-in PDF viewer) to extract every text item from every page along with its font, size, and position metadata. Heuristics then group items into lines, infer heading levels from font size, detect list markers, and emit clean Markdown.

  4. Review the result summary

    After conversion, you'll see a summary showing how many characters were extracted from how many pages. If the count is zero or near-zero, your PDF is probably scanned — the tool will surface a hint suggesting OCR is needed first. For PDFs with extractable text, the summary confirms the extraction worked and you can move to download.

  5. Download the .md file

    Save the Markdown file to your device. Open it in any text editor, paste it into your note app, commit it to a repo, or feed it into an LLM as cleaner input than raw PDF. The original PDF is never sent to a server — extraction happened entirely in your browser tab.
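Steps 3 and 4 above can be sketched roughly as follows. This is a minimal TypeScript illustration, not iSavePDF's actual code: the `TextItem` shape is a simplified, hypothetical stand-in for the text items pdfjs-dist returns, and the `tolerance` and `minCharsPerPage` values are assumed thresholds chosen for the example.

```typescript
// Simplified, hypothetical stand-in for a pdfjs-dist text item.
interface TextItem {
  text: string;
  x: number; // left edge, in PDF units
  y: number; // baseline, in PDF units (larger = higher on the page)
  fontSize: number;
}

interface Line {
  text: string;
  y: number;
  fontSize: number;
}

interface Summary {
  pages: number;
  characters: number;
  likelyScanned: boolean;
}

// Group one page's items into lines: items whose baseline is within
// `tolerance` units of the first item in a group share that line.
function groupIntoLines(items: TextItem[], tolerance = 2): Line[] {
  const sorted = [...items].sort((a, b) => b.y - a.y || a.x - b.x);
  const groups: TextItem[][] = [];
  for (const item of sorted) {
    const last = groups[groups.length - 1];
    if (last && Math.abs(last[0].y - item.y) < tolerance) {
      last.push(item);
    } else {
      groups.push([item]);
    }
  }
  return groups.map((group) => {
    // Re-sort left to right, since baselines within a line can differ slightly.
    const ordered = [...group].sort((a, b) => a.x - b.x);
    return {
      text: ordered.map((i) => i.text).join(" ").replace(/\s+/g, " ").trim(),
      y: group[0].y,
      fontSize: Math.max(...group.map((i) => i.fontSize)),
    };
  });
}

// Count extracted characters and flag documents that look scanned.
// The per-page threshold is an assumption, not a documented value.
function summarize(pages: Line[][], minCharsPerPage = 20): Summary {
  const characters = pages.reduce(
    (total, lines) => total + lines.reduce((n, line) => n + line.text.length, 0),
    0
  );
  const likelyScanned =
    pages.length > 0 && characters / pages.length < minCharsPerPage;
  return { pages: pages.length, characters, likelyScanned };
}
```

When `likelyScanned` comes back true, a caller would show the OCR hint instead of offering the download.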

How it works

How PDF to Markdown works

  1. Upload your PDF

    Drop the PDF in. Text-based PDFs (Word/Google Docs/web exports) convert best; scanned PDFs may produce empty output.

  2. Preview the Markdown

    We extract text + structure, infer headings and lists from font size and bullet patterns, and show the result for review.

  3. Download as .md

    Save the Markdown file to your device. The original PDF is never sent anywhere.
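The heading and list inference mentioned in step 2 might look something like the sketch below. It is illustrative only: the function names are hypothetical, font sizes are assumed to be already rounded to comparable values, "body text" is assumed to be the size that carries the most characters, and the bullet characters in the pattern are an example set rather than the tool's real list.

```typescript
interface Line {
  text: string;
  fontSize: number;
}

// Assumption: the body size is the font size that carries the most characters.
function bodyFontSize(lines: Line[]): number {
  const weight: Record<string, number> = {};
  for (const line of lines) {
    const key = String(line.fontSize);
    weight[key] = (weight[key] || 0) + line.text.length;
  }
  let best = 0;
  let bestWeight = -1;
  for (const key of Object.keys(weight)) {
    if (weight[key] > bestWeight) {
      best = Number(key);
      bestWeight = weight[key];
    }
  }
  return best;
}

// Distinct sizes larger than body text, largest first: index 0 maps to H1,
// index 1 to H2, and so on, capped at Markdown's six heading levels.
function headingSizes(lines: Line[]): number[] {
  const body = bodyFontSize(lines);
  const larger: number[] = [];
  for (const line of lines) {
    if (line.fontSize > body && larger.indexOf(line.fontSize) === -1) {
      larger.push(line.fontSize);
    }
  }
  return larger.sort((a, b) => b - a).slice(0, 6);
}

function toMarkdown(lines: Line[]): string {
  const sizes = headingSizes(lines);
  return lines
    .map((line) => {
      const level = sizes.indexOf(line.fontSize) + 1; // 0 when not a heading size
      if (level > 0) return "######".slice(0, level) + " " + line.text;
      // Bullet glyphs become "- " list items.
      const bullet = line.text.match(/^[•◦▪‣*-]\s+(.*)$/);
      if (bullet) return "- " + bullet[1];
      // "1." or "1)" becomes a Markdown ordered-list item.
      const numbered = line.text.match(/^(\d+)[.)]\s+(.*)$/);
      if (numbered) return numbered[1] + ". " + numbered[2];
      return line.text;
    })
    .join("\n");
}
```

Because the levels come from relative font size alone, a document that marks headings only with bold body-size text would produce no headings under this approach.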

When to use it

Common use cases

  • Migrating documents into Obsidian, Notion, or Logseq

    If you're moving years of PDF-stored notes, articles, or documents into a Markdown-based or Markdown-friendly knowledge system like Obsidian, Notion, or Logseq, retyping everything by hand is impractical. PDF to Markdown gives you a starting point: each PDF becomes a Markdown file you can drop into your vault, optionally clean up, and link to other notes. The conversion preserves heading structure, so your existing PDFs slot into your knowledge graph without losing their outline. The tool accepts one PDF at a time, so for a large reference library, plan to work through it file by file.

  • Cleaning up text for LLM input

    Large language models work better with structured text than with raw PDF data. Many AI tools either can't read PDFs at all or read them poorly: text gets jumbled, tables flatten into noise, headings disappear. Converting your PDF to clean Markdown first and feeding the Markdown to the LLM usually produces better answers, because the model sees real headings, real paragraphs, and real lists. This is especially useful for research, technical writing, content analysis, or anything where the source document's structure carries meaning the model needs to respect.

  • Republishing PDF content on a Markdown-based website

    Writers and publishers who have content sitting in PDF form (whitepapers, ebooks, old blog exports, research reports) and want to republish it on a Markdown-based site (Hugo, Astro, Eleventy, Gatsby, Next.js MDX) would otherwise face hours of manual retyping. PDF to Markdown gets you most of the way there: heading structure, paragraphs, lists, and inline emphasis carry over. You polish the result, add frontmatter and image links, and you've turned a static PDF into a live, searchable, linkable web article.

  • Extracting structured notes from academic papers

    Academics, researchers, and students often have PDFs of papers, lecture notes, and textbook chapters that they want to reorganize into their own notes. Plain PDF-to-text extraction gives a wall of unformatted prose; PDF to Markdown keeps the heading structure alongside the prose, which makes it much easier to skim, restructure, and synthesize. Note: this works only for papers with an embedded text layer. Modern journal PDFs and most preprints have one; old scanned papers may not. For the latter, run OCR first using a desktop tool, then re-export to a text PDF and run that through iSavePDF.

Why iSavePDF

The privacy-first way to convert PDF to Markdown

Most online PDF-to-Markdown converters work by uploading your PDF to a server, running extraction there, and sending the Markdown back. That means your PDF — which may contain confidential research, internal company documents, draft writing, financial reports, or personal records — leaves your device and lives temporarily on someone else's infrastructure. Some services then keep your file longer than they admit, train models on it, or share it with third-party processors. For documents subject to NDA, professional confidentiality, or compliance requirements (HIPAA, GDPR, SOC 2), this is often a hard policy violation.

iSavePDF runs the entire extraction in your browser using pdfjs-dist — the same PDF parsing library Mozilla ships in Firefox. The PDF is read into memory in your tab, its text and structure are reconstructed locally, the Markdown is generated locally, and the result is handed to your browser's download mechanism. There is no server-side processing. You can open DevTools, switch to the Network tab, and watch — you'll see zero outbound requests carrying your PDF content. The tool is free with no enforced limits, no signup, no watermark, and no upsell. We fund the site with banner ads on the page, not by monetizing the documents people convert.

Tips & limits

Tips for the best results

  • Best results from text-based PDFs

    PDFs that started as Word documents, Google Docs exports, LaTeX output, or web pages saved via Print to PDF have a clean text layer and convert reliably. PDFs created by photographing or scanning paper documents are images of text — they contain no extractable text layer and will produce empty Markdown. Use Adobe Acrobat, Tesseract, or a similar OCR tool to add a text layer first, then re-run through iSavePDF.

  • Heading detection uses font size

    The tool infers heading levels by comparing font sizes across the document. The largest size that isn't body text becomes H1, the next largest becomes H2, and so on. PDFs that use bold body text for emphasis (instead of larger fonts) may produce no headings; PDFs with unconventional typography may produce surprising results. Review the output and adjust heading levels manually if needed — Markdown is fast to edit in any text editor.

  • Tables get partial support

    PDFs don't store tables as tables — they store text positioned in columns, which extractors have to reconstruct heuristically. Simple two- or three-column tables with clear borders often work; complex tables with merged cells, nested headers, or many columns usually come out as jumbled lines. For data you need to preserve as a table, consider extracting to Excel using our PDF to Excel tool, which uses a different extraction strategy tuned for tabular data.

  • Headers and footers are stripped automatically

    Page headers and footers that repeat across pages (book title, chapter name, page numbers) are detected by their repetition and stripped from the output. This keeps your Markdown clean. If your document uses unique footers per page that contain meaningful content, you may lose them — review the output and re-add if needed.
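Repetition-based header and footer stripping, as described in the last tip, could be sketched like this. It is a simplified illustration under stated assumptions, not the tool's real logic: each page is assumed to be already split into lines, only the first and last line of each page are considered, digits are masked so that "Page 3" and "Page 4" count as the same footer, and the `minShare` threshold is an arbitrary example value.

```typescript
type Page = string[];

// Mask digits so running page numbers compare equal across pages.
function normalize(line: string): string {
  return line.replace(/\d+/g, "#").trim().toLowerCase();
}

// Remove a page's first and/or last line when the same (normalized) line
// appears at a page edge on at least `minShare` of all pages.
function stripRepeatedEdges(pages: Page[], minShare = 0.6): Page[] {
  // With very few pages, repetition is not a reliable signal.
  if (pages.length < 3) return pages.map((p) => [...p]);

  const counts: Record<string, number> = Object.create(null);
  for (const page of pages) {
    if (page.length === 0) continue;
    const edges = [normalize(page[0])];
    if (page.length > 1) edges.push(normalize(page[page.length - 1]));
    // Count each distinct edge line once per page.
    const seen: string[] = [];
    for (const edge of edges) {
      if (seen.indexOf(edge) === -1) {
        seen.push(edge);
        counts[edge] = (counts[edge] || 0) + 1;
      }
    }
  }

  const threshold = pages.length * minShare;
  const isRepeated = (line: string): boolean =>
    (counts[normalize(line)] || 0) >= threshold;

  return pages.map((page) => {
    let start = 0;
    let end = page.length;
    if (end > start && isRepeated(page[start])) start++;
    if (end > start && isRepeated(page[end - 1])) end--;
    return page.slice(start, end);
  });
}
```

A repeated line that carries real content is removed along with genuine headers and footers, which is why reviewing the output remains worthwhile.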

FAQ

Frequently asked questions

  • Is PDF to Markdown free?

    Yes. The tool is completely free: no signup, no watermark on output, and no file-size limit beyond what your browser can handle. iSavePDF is funded by display ads on the page, not by a paid tier or premium features. There is no upsell.