Skip to content

htmlToMarkdown

Useful and high-quality HTML to Markdown converter. Internally, it just calls htmlToMdast and mdastToMarkdown in that order.

Usage

import { htmlToMarkdown } from "webforai";
 
const html = "<h1>Hello, world!</h1>";
const markdown = htmlToMarkdown(html);
=> "# Hello, world!"

Returns

string

The converted Markdown string.

Parameters

htmlOrHast

type: string | Hast

The HTML string or HAST tree to convert.

const markdown = htmlToMarkdown("<h1>Hello, world!</h1>");
// => "# Hello, world!"

options.baseUrl

type: string

The base URL to use for replacing relative links.

const markdown = htmlToMarkdown("<a href='/foo'>bar</a>", {
	baseUrl: "https://example.com",
});
// => "[bar](https://example.com/foo)"

options.extractors

type: ExtractorSelectors

An array of extractors to extract specific elements from the HTML. You can define your own functions in addition to the Extractor provided as a preset.

import { htmlToMarkdown, type Extractor, takumiExtractor } from "webforai"
 
const yourCustomExtractor: Extractor = (params) => {
	const { hast, url } = params
	// ... your logic ...
	return hast
};
 
const html = "<h1>Hello, world!</h1>"
const markdown = htmlToMarkdown(html, {
	extractors: [yourCustomExtractor, takumiExtractor]
});
// => "# Hello, world!"

options.formatting

type: Omit<MdastToMarkdownOptions, "baseUrl">

Formatting options passed to mdast-util-to-markdown.

const markdown = htmlToMarkdown("<h1>Hello, world!</h1>", {
	formatting: {
		bullet: "*",
	},
});
// => "* Hello, world!"

options.linkAsText

type: boolean

Whether to convert links to plain text.

const markdown = htmlToMarkdown("<a href='/foo'>bar</a>", {
	linkAsText: true,
});
// => "bar"

options.tableAsText

type: boolean

Whether to convert tables to plain text.

options.hideImage

type: boolean

Whether to hide images.

options.lang

type: string

The language of the HTML.