HTML to Markdown Converter
Convert HTML to clean Markdown in your browser — GFM tables, task lists, and links. Choose ATX/Setext headings and inline or reference links. Great for migrating web content or feeding LLMs. 100% private, no upload.
What is HTML to Markdown Conversion?
HTML to Markdown conversion takes an HTML document (the tags, attributes, and nesting a browser parses in order to render a page) and rewrites it as Markdown, the lightweight plain-text format built for writing and version control. Where Markdown to HTML expands compact text into markup for display, this is the reverse and reductive direction: you start with rich, verbose HTML and distil it down to the small, readable set of conventions Markdown offers.
Under the hood, the converter parses your HTML into a DOM tree (the same node structure a browser builds), then walks that tree and emits the Markdown equivalent for each node it recognises. An <h2> becomes ##, a <strong> becomes **text**, a <ul> becomes a bulleted list, an <a> becomes a link, and a <table> becomes a GFM pipe table. Traversing a real DOM, rather than running regular expressions over the raw string, is what lets it handle nested lists, mixed inline formatting, and tables correctly instead of breaking on edge cases.
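The parse-then-walk approach can be sketched in a few lines. This is a minimal illustration, not the tool's actual code: it assumes well-formed input, builds a tiny tree with Python's standard `html.parser`, and handles only a handful of tags. The names `Node`, `TreeBuilder`, `render`, and `html_to_markdown` are hypothetical.

```python
from html.parser import HTMLParser


class Node:
    """Hypothetical minimal element node: tag, attributes, ordered children."""

    def __init__(self, tag, attrs=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.children = []  # Node instances or plain strings (text)


class TreeBuilder(HTMLParser):
    """Builds a small tree. Assumes well-formed HTML; void tags such as
    <br> without a closing slash are not handled in this sketch."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1].children.append(data)


def render(node):
    """Walk the tree depth-first and emit Markdown for recognised tags."""
    if isinstance(node, str):
        return node
    inner = "".join(render(child) for child in node.children)
    if node.tag == "h2":
        return "## " + inner.strip() + "\n\n"
    if node.tag == "p":
        return inner.strip() + "\n\n"
    if node.tag == "strong":
        return "**" + inner + "**"
    if node.tag == "a":
        return "[" + inner + "](" + node.attrs.get("href", "") + ")"
    return inner  # unknown tags are unwrapped: content kept, tag dropped


def html_to_markdown(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return render(builder.root).strip()
```

Because each node renders its children before wrapping them, inline formatting nested inside a paragraph or link comes out in the right order without any string-level pattern matching.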
You reach for this conversion when you are migrating out of HTML, not into it. Content trapped in a CMS, a WYSIWYG editor, an old web page, or a rich-text field is hard to diff, hard to review, and hard to move. Converting it to Markdown frees it into a format that lives happily in a Git repo, a static-site generator, or a notes app — and, increasingly, into a format that large language models read efficiently. The catch, which honest tools state plainly, is that the conversion is lossy: HTML can express things Markdown cannot, so some structure and every styling detail are deliberately discarded in exchange for clean, portable text.
The reverse operation — Markdown back to HTML, for when you are ready to publish or preview — is just as useful. Switch to the Markdown → HTML tab or open the dedicated Markdown to HTML converter.
HTML in:
<h2>Pricing</h2>
<p>Plans start at <strong>$9/mo</strong>. See the <a href="https://example.com/pricing">details</a>.</p>
<table>
<thead><tr><th>Plan</th><th>Price</th></tr></thead>
<tbody><tr><td>Pro</td><td>$9</td></tr></tbody>
</table>
Markdown out:
## Pricing
Plans start at **$9/mo**. See the [details](https://example.com/pricing).
| Plan | Price |
| ---- | ----- |
| Pro | $9 |
<!-- <div>, classes, and inline styles in the source are dropped; Markdown can't represent them. -->
Key Features
GFM-Aware Output
Targets GitHub Flavored Markdown, not just plain CommonMark: HTML tables become pipe tables, checkbox <li>s become task lists (`- [x]`), and <del>/<s> become ~~strikethrough~~. The Markdown drops straight into a README, a GitHub issue, or a docs site and renders the same way.
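As a rough sketch of the task-list and strikethrough mapping described above, the following reads flat `<li>` items with Python's standard `html.parser`. It is an illustration under simplifying assumptions (no nested lists, checkbox as the first thing in the item), and `TaskListParser` and `task_list_lines` are hypothetical names, not part of the tool.

```python
from html.parser import HTMLParser


class TaskListParser(HTMLParser):
    """Hypothetical helper: turns flat <li> items into GFM list lines."""

    def __init__(self):
        super().__init__()
        self.items = []      # finished Markdown lines
        self.current = None  # state for the <li> being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li":
            self.current = {"box": None, "text": []}
        elif self.current is None:
            return
        elif tag == "input" and attrs.get("type") == "checkbox":
            # A valueless "checked" attribute arrives as ("checked", None).
            self.current["box"] = "checked" in attrs
        elif tag in ("del", "s"):
            self.current["text"].append("~~")

    def handle_endtag(self, tag):
        if self.current is None:
            return
        if tag in ("del", "s"):
            self.current["text"].append("~~")
        elif tag == "li":
            text = "".join(self.current["text"]).strip()
            box = self.current["box"]
            if box is None:
                prefix = "- "        # ordinary bullet, no checkbox found
            else:
                prefix = "- [x] " if box else "- [ ] "
            self.items.append(prefix + text)
            self.current = None

    def handle_data(self, data):
        if self.current is not None:
            self.current["text"].append(data)


def task_list_lines(html):
    parser = TaskListParser()
    parser.feed(html)
    parser.close()
    return parser.items
```

For example, an item containing `<input type="checkbox" checked>` yields a `- [x]` line, while an unchecked box yields `- [ ]`.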
ATX or Setext Headings
Choose hash-prefixed ATX headings (# H1) or underlined Setext headings (=== for H1, --- for H2). Setext covers only the top two levels, so the converter falls back to ATX for H3 and deeper automatically — you never get an invalid heading.
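The fallback rule is easy to express in code. The sketch below is one plausible way to write it; the function name `heading` and the choice to make the underline as long as the text (minimum three characters) are assumptions for illustration, since Markdown accepts an underline of any length.

```python
def heading(text, level, style="atx"):
    """Format a heading; Setext exists only for levels 1 and 2."""
    if not 1 <= level <= 6:
        raise ValueError("heading level must be between 1 and 6")
    if style == "setext" and level <= 2:
        # "=" underlines an H1, "-" underlines an H2.
        underline = ("=" if level == 1 else "-") * max(len(text), 3)
        return text + "\n" + underline
    # ATX covers every level, and is the fallback for Setext at H3 and below.
    return "#" * level + " " + text
```

Calling `heading("Notes", 3, style="setext")` returns the ATX form `### Notes`, because there is no Setext syntax for that level.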
Inline or Reference Links
Switch between inline links — [text](url) next to the prose — and reference links, which collect every URL into a numbered list at the foot of the document. Reference style keeps link-heavy paragraphs readable and lets you reuse a URL by label.
Fenced Code Blocks
A <pre><code> block becomes a fenced code block with triple backticks, and a language-* class on the <code> element (for example, language-js) carries through as the fence's info string (js). Inline <code> becomes backtick spans, so snippets survive the trip intact.
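A minimal sketch of that mapping, assuming the language is carried in a class beginning with `language-` as described above. The function name `fenced_code` is hypothetical, and the fence is built with string multiplication only so the example can sit inside a fenced block itself.

```python
def fenced_code(code, css_class=""):
    """Wrap code in a triple-backtick fence, using a language-* class
    from the <code> element (if any) as the info string."""
    lang = ""
    for name in css_class.split():
        if name.startswith("language-"):
            lang = name[len("language-"):]
            break
    fence = "`" * 3
    body = code.rstrip("\n")  # avoid a blank line before the closing fence
    return fence + lang + "\n" + body + "\n" + fence
```

Other classes on the element (syntax-highlighter hooks, for instance) are ignored; only the first `language-` class is used.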
Handles Nested Lists and Tables
Walks the real DOM, so nested <ul>/<ol> structures convert to correctly indented Markdown lists and ordered lists renumber from 1. Simple tables flatten to pipe tables; genuinely complex ones fall back to raw HTML rather than losing data.
100% Private, In-Browser
Every conversion runs locally with JavaScript — your HTML and the resulting Markdown never leave your device, never hit a server, and work offline after the page loads. Safe for internal CMS exports, customer content, and unpublished pages.
Examples
Web <table> to a GFM pipe table
<table>
<thead><tr><th>Region</th><th>Sales</th></tr></thead>
<tbody>
<tr><td>EMEA</td><td>1,204</td></tr>
<tr><td>APAC</td><td>980</td></tr>
</tbody>
</table>

| Region | Sales |
| ------ | ----- |
| EMEA | 1,204 |
| APAC | 980 |
A scraped or copied HTML <table> collapses into a GitHub Flavored Markdown pipe table. The <thead> row becomes the header, the dashed delimiter row is generated for you, and each <tr> becomes one pipe-delimited line — ready to drop into a README or a docs page.
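Once the header and body cells have been read from the DOM, producing the pipe table is mostly string joining. The sketch below assumes the cells are already extracted as plain strings; `pipe_table` is a hypothetical name, and the escaping of `|` inside cells follows the GFM rule that a literal pipe in a cell must be backslash-escaped.

```python
def pipe_table(header, rows):
    """Render a header row and body rows as a GFM pipe table."""

    def escape(cell):
        # A literal pipe would end the cell; newlines cannot appear in a row.
        return str(cell).replace("|", "\\|").replace("\n", " ")

    def line(cells):
        return "| " + " | ".join(cells) + " |"

    head = [escape(cell) for cell in header]
    # Delimiter row: dashes as wide as each header cell, at least three.
    out = [line(head), line(["-" * max(len(cell), 3) for cell in head])]
    for row in rows:
        cells = [escape(cell) for cell in row][: len(head)]
        cells += [""] * (len(head) - len(cells))  # pad short rows
        out.append(line(cells))
    return "\n".join(out)
```

With the header `["Region", "Sales"]` and the two body rows from the example, this reproduces the four-line table shown above.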
Links: inline vs reference style
<p>Read the <a href="https://example.com/guide">setup guide</a> and the <a href="https://example.com/api">API reference</a>.</p>
Inline:
Read the [setup guide](https://example.com/guide) and the [API reference](https://example.com/api).
Reference:
Read the [setup guide][1] and the [API reference][2].

[1]: https://example.com/guide
[2]: https://example.com/api
The same anchors render two ways. Inline keeps the URL next to the text; reference style moves every URL to a numbered list at the bottom, which keeps long paragraphs readable when a sentence carries several links. Pick the style with the Links radio.
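The two link styles differ only in where the URL is written. A small sketch, assuming the paragraph has already been split into plain-text pieces and `(label, url)` pairs; the function name `render_links` and that input shape are illustrative choices, not the tool's internals.

```python
def render_links(parts, style="inline"):
    """Join text pieces and (label, url) pairs as inline or reference links."""
    body = []
    numbers = {}  # url -> reference number, so a repeated URL reuses its label
    for part in parts:
        if isinstance(part, str):
            body.append(part)
            continue
        label, url = part
        if style == "inline":
            body.append(f"[{label}]({url})")
        else:
            number = numbers.setdefault(url, len(numbers) + 1)
            body.append(f"[{label}][{number}]")
    text = "".join(body)
    if not numbers:
        return text
    # Reference definitions go in a numbered list after the prose.
    definitions = "\n".join(f"[{n}]: {url}" for url, n in numbers.items())
    return text + "\n\n" + definitions
```

In reference mode, two anchors pointing at the same URL share one number, which is where the "reuse a URL by label" benefit comes from.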
Nested <ul>/<ol> to indented Markdown lists
<ul>
<li>Build
<ol>
<li>Compile</li>
<li>Bundle</li>
</ol>
</li>
<li>Ship</li>
</ul>

- Build
  1. Compile
  2. Bundle
- Ship
Nesting is preserved by indentation: the inner <ol> sits two spaces under its parent <li> and switches from a `-` bullet to `1.` numbering. Ordered lists are renumbered from 1 and Markdown renderers count sequentially from the first item, so the output stays clean even if the HTML used explicit value attributes, though those explicit values are not preserved.
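The indentation rule can be sketched as a short recursive function. It assumes the list has already been read into nested Python data, where each item is a `(text, sublist)` pair and a sublist is an `(ordered, items)` pair or `None`; that shape and the name `render_list` are hypothetical.

```python
def render_list(items, ordered=False, indent=""):
    """Render nested list items as Markdown lines.

    Children are indented by the width of the parent's marker, so they
    line up under the parent's text: two spaces under "- ", three under "1. ".
    """
    lines = []
    for number, (text, sublist) in enumerate(items, start=1):
        marker = f"{number}. " if ordered else "- "
        lines.append(indent + marker + text)
        if sublist is not None:
            child_ordered, child_items = sublist
            child_indent = indent + " " * len(marker)
            lines.extend(render_list(child_items, child_ordered, child_indent))
    return lines
```

Numbering always restarts at 1 for each ordered list, which matches the renumbering behaviour described above.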
A chunk of web-page HTML to clean Markdown
<article> <h1>Changelog</h1> <p>We shipped <strong>dark mode</strong> and fixed <code>parseDate()</code>.</p> <blockquote><p>Thanks to everyone who reported it.</p></blockquote> </article>
# Changelog

We shipped **dark mode** and fixed `parseDate()`.

> Thanks to everyone who reported it.
Paste a slice of a real page — the <article> wrapper is dropped (Markdown has no container element), the <h1> becomes `#`, <strong> becomes `**`, inline <code> becomes backticks, and the <blockquote> becomes a `>` line. Structural wrappers with no Markdown equivalent simply fall away.
How to Convert HTML to Markdown
1. Paste your HTML
Drop in a copied web page, a CMS or WYSIWYG export, or a scraped HTML snippet. The DOM is parsed and serialised to Markdown in your browser as you paste, with no upload and no size cap beyond your browser's memory.
2. Choose heading and link styles
Pick ATX (#) or Setext (===) headings and inline or reference links. The Markdown re-renders live, so you can compare styles instantly. The output targets GitHub Flavored Markdown, including tables, task lists, and strikethrough.
3. Copy or download
Click Copy to grab the Markdown, or Download to save a .md file. To go the other way, switch to the Markdown → HTML tab and paste your Markdown to get rendered HTML back.
Common Pitfalls
Expecting <div>/<span> Structure to Survive
Generic containers carry no Markdown equivalent, so they are unwrapped — their content stays but the tag, and any class or style on it, vanishes. If your layout depended on a wrapping <div> or a styled <span>, that styling is gone in the Markdown. This is expected, not a bug; Markdown simply has no way to express it.
<div class="callout warning"><span style="color:red">Heads up!</span></div> <!-- expecting the callout box and red colour to survive -->
Heads up! <!-- container and styles dropped; only the text remains in Markdown -->
Lost <br> Line Breaks Inside Paragraphs
A <br> inside a paragraph is a hard line break, which Markdown represents with two trailing spaces before the newline (or a trailing backslash). A plain newline is only a soft break, so pasting HTML and expecting visible line breaks to survive can surprise you when adjacent lines reflow into one. The converter emits the hard-break form, but if you hand-edit afterward, do not strip the trailing spaces.
Line one<br>Line two <!-- if the break form is removed, these merge into one line -->
Line one  
Line two
<!-- two trailing spaces after "Line one" preserve the <br> as a hard break -->
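To make the failure mode concrete, here is a small sketch of the two hard-break spellings and of what a "trim trailing whitespace" editor setting does to them. Both function names are hypothetical, and which spelling the converter uses is not specified here.

```python
def join_with_hard_breaks(segments, style="spaces"):
    """Join the text on either side of each <br> with a Markdown hard break."""
    if style == "backslash":
        ending = "\\"   # a backslash immediately before the newline
    elif style == "spaces":
        ending = "  "   # two trailing spaces before the newline
    else:
        raise ValueError("style must be 'spaces' or 'backslash'")
    return (ending + "\n").join(segments)


def strip_trailing_whitespace(markdown):
    """What many editors do on save; it removes space-style hard breaks."""
    return "\n".join(line.rstrip() for line in markdown.split("\n"))
```

Running the space-style output through `strip_trailing_whitespace` leaves a bare newline, which a renderer treats as a soft break, while the backslash form passes through unchanged.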
Deeply Nested Tables Degrading
GFM pipe tables cannot nest or hold block content. A legacy layout that puts a table (or a list, or multiple paragraphs) inside a table cell cannot become a clean pipe table — the converter flattens what it can and leaves the rest as raw HTML so nothing is lost. The fix is to simplify the source, not the output.
<table><tr><td><table><tr><td>inner</td></tr></table></td></tr></table> <!-- nested table can't become a flat pipe table -->
<!-- Flatten to a single-level table first: -->
<table><tr><td>inner</td></tr></table>
→
| inner |
| ----- |
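Before converting, it can help to check whether a table is simple enough to become a pipe table at all. The following is a rough pre-flight check, not the converter's own logic: it uses Python's standard `html.parser` to flag nested tables, merged cells, and a few block-level tags inside a table. The names `TableCheck` and `table_problems`, and the exact set of tags treated as block content, are assumptions.

```python
from html.parser import HTMLParser


class TableCheck(HTMLParser):
    """Hypothetical checker: collects reasons a table cannot be a flat grid."""

    BLOCK_TAGS = {"ul", "ol", "pre", "blockquote"}

    def __init__(self):
        super().__init__()
        self.table_depth = 0
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_depth += 1
            if self.table_depth > 1:
                self.problems.append("nested table")
        elif tag in ("td", "th"):
            for name, value in attrs:
                # A span of 1 is the default and harmless.
                if name in ("colspan", "rowspan") and value not in (None, "", "1"):
                    self.problems.append(f"{name}={value}")
        elif self.table_depth and tag in self.BLOCK_TAGS:
            self.problems.append(f"block content <{tag}>")

    def handle_endtag(self, tag):
        if tag == "table" and self.table_depth:
            self.table_depth -= 1


def table_problems(html):
    checker = TableCheck()
    checker.feed(html)
    checker.close()
    return checker.problems
```

An empty result suggests the table should flatten cleanly; any entry in the list points at the part of the source worth simplifying first.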
Expecting <script> or Styles to Survive
<script>, <style>, and head-level elements are code and presentation, not document content, so they are stripped entirely — not converted, not preserved as raw HTML. Pasting a full page and expecting behaviour or CSS to carry into the Markdown will disappoint. Markdown is a content format; if you need the code or styling, keep the HTML.
<style>.x{color:blue}</style>
<script>track()</script>
<p>Body</p>
<!-- expecting the style and script to come through -->
Body
<!-- only the content survives; <script>/<style> are dropped -->
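Stripping these elements means dropping both the tags and everything inside them. A minimal sketch with Python's standard `html.parser`, which keeps a depth counter so text inside a skipped element is ignored; the names `ContentOnly` and `visible_text` are hypothetical, and the output here is plain text rather than full Markdown.

```python
from html.parser import HTMLParser


class ContentOnly(HTMLParser):
    """Hypothetical filter: collects text, skipping script/style/head."""

    SKIPPED = {"script", "style", "head"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def visible_text(html):
    parser = ContentOnly()
    parser.feed(html)
    parser.close()
    # Collapse the whitespace left behind by the removed elements.
    return " ".join("".join(parser.chunks).split())
```

Fed the three-line sample above, this returns just `Body`.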
Common Use Cases
- Migrate web or CMS content into Notion, Obsidian, or a static site
- Pull pages out of a CMS, a WordPress export, or an old HTML site and convert them to Markdown that drops straight into Notion, Obsidian, Hugo, or Jekyll. You trade verbose markup for portable text that lives cleanly in a Git repo and diffs sensibly in review.
- Export from a WYSIWYG editor
- Rich-text editors emit dense, attribute-heavy HTML. Paste that output here to recover the clean Markdown underneath — headings, lists, links, and emphasis — so the content can move into a docs pipeline or a Markdown-based knowledge base instead of staying locked in the editor.
- Clean HTML into Markdown to feed LLMs and RAG pipelines
- Raw HTML burns tokens on tags, scripts, and styling a model never needs. Converting a scraped page to Markdown strips that noise while keeping the structure an LLM reads well, so you fit more real content in the context window and get cleaner embeddings for retrieval.
- Convert a rich-text paste into Markdown
- Copy formatted text from a web page, an email, or a doc and it arrives as HTML on the clipboard. Paste it here to turn that rich text into Markdown you can commit, send in a pull request, or write into your notes — formatting preserved, clutter gone.
- Archive a page as Markdown
- Save the meaningful content of a web page as a small, future-proof .md file instead of a heavy HTML snapshot full of scripts and tracking. Markdown stays readable in any text editor decades from now and takes a fraction of the space.
- Turn legacy HTML docs into Markdown
- Old documentation written as hand-coded HTML is painful to maintain. Convert it to Markdown to bring it into a modern docs-as-code workflow — where it can be linted, reviewed in pull requests, and rendered by a static-site generator.
Technical Details
- CommonMark vs GitHub Flavored Markdown Output
- The converter can target plain CommonMark or, by default, the GitHub Flavored Markdown superset. CommonMark defines headings, emphasis, lists, links, images, code, and blockquotes precisely. GFM adds four constructs that map directly from common HTML: <table> → pipe table, checkbox list items → task lists, <del>/<s> → strikethrough, and bare URLs → autolinks. Because most web content uses tables and the like, GFM output is the practical default; choose CommonMark only when the destination renderer does not understand GFM extensions, in which case tables fall back to raw HTML.
- Lossy, Irreversible Conversion — Stated Plainly
- HTML is strictly more expressive than Markdown, so the conversion cannot be lossless, and it is worth being upfront about that. Markdown has no syntax for <div>, <span>, or other generic containers; no way to carry class names, id, inline style, colspan/rowspan, or arbitrary data-* attributes; and no representation for most semantic or layout elements. Those are unwrapped (content kept, tag dropped), discarded (attributes), or — when dropping would lose meaning — preserved as raw inline HTML. A round-trip HTML → Markdown → HTML will not reproduce the original. This is a deliberate trade: Markdown exists to be clean, diffable, and human-editable, not to mirror HTML. Most competitors gloss over this; stating it lets you decide with eyes open whether Markdown is the right target.
- Style Trade-offs: ATX/Setext, Inline/Reference, Fenced/Indented
- Three output choices have real trade-offs. ATX headings (#) cover all six levels and grep cleanly; Setext (underlined) exists only for H1 and H2, so the tool emits it for the top two levels and falls back to ATX below. Inline links keep the URL beside the text, which suits sparse links; reference links pull URLs to the document foot, which suits link-dense prose and reuse by label. For code, fenced blocks (triple backticks) carry a language info string and nest safely, whereas indented (four-space) code blocks cannot express a language and are fragile inside lists, where the required indentation depends on the list marker. This converter therefore always emits fenced blocks from <pre><code>.
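One reason fenced blocks nest safely is that the fence can be made longer than any run of backticks inside the code. The sketch below shows that general technique; whether this converter lengthens its fences this way is an assumption, and `safe_fence` and `wrap_code` are hypothetical names.

```python
import re


def safe_fence(code):
    """Return a backtick fence longer than any backtick run in the code."""
    longest = max((len(run) for run in re.findall(r"`+", code)), default=0)
    return "`" * max(3, longest + 1)


def wrap_code(code, lang=""):
    """Wrap code in a fence that its own content cannot close early."""
    fence = safe_fence(code)
    return fence + lang + "\n" + code + "\n" + fence
```

Code that itself contains a triple-backtick fence gets a four-backtick wrapper, so a Markdown snippet can be quoted inside another Markdown document without ending the outer block.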
Best Practices
- Format the HTML Before You Convert
- Minified or deeply tangled HTML — especially nested layout tables and stray inline elements — converts more cleanly when it is well-formed first. Run messy source through our HTML Formatter to pretty-print and normalise it, then convert. Clean input yields clean Markdown with fewer raw-HTML fallbacks.
- Expect and Review the Lossy Drops
- Treat the conversion as lossy by design. Classes, inline styles, <div>/<span> wrappers, and exotic attributes are gone in the output, and that is usually what you want for portable Markdown — but skim the result for anything semantically important that lived only in an attribute (an aria-label, a colspan-merged cell) and add it back by hand if it matters.
- Pick the Link Style for the Document's Density
- Use inline links for prose with a link here and there — the URL stays next to its text and the source reads naturally. Switch to reference links when a section is link-heavy or reuses the same URLs: pulling them to a numbered list at the foot keeps paragraphs scannable and avoids repeating long URLs.
- Convert to Markdown Before Sending Pages to an LLM
- When you feed web content to a model — for a prompt, an embedding, or a RAG store — convert the HTML to Markdown first. You strip tags, scripts, and styling that waste tokens and add noise, keep the structure the model actually uses, and fit substantially more real content inside the context window.
- Verify Complex Tables After Conversion
- GFM pipe tables are flat — no nested tables, no block content in cells, no merged cells. After converting a data-heavy or layout table, check the Markdown: simple grids convert perfectly, but anything with colspans or nested blocks degrades and may appear as raw HTML. Simplify the source table first if a clean pipe table matters.
Frequently Asked Questions
How are inline vs reference links handled?
ATX vs Setext headings — which should I use?
What happens to HTML that Markdown can't represent, like <div> and <span>?
Does it strip <script> and styles?
How are nested tables and lists handled?
Is HTML to Markdown lossless?
Can I feed the Markdown to an LLM or ChatGPT?
Are my files uploaded to a server?
Does it work offline?
Can I convert Markdown back to HTML?