
Structured Output

October 19, 2024 by inaridiy

You can get structured JSON directly from any website by using webforai and the Vercel AI SDK.

Install dependencies

Install the necessary packages:

npm init -y
npm pkg set type=module  # run the project as ESM so top-level await in src/index.ts works under tsx
npm install webforai ai @ai-sdk/google zod
npm install -D tsx

Prepare API Key

This example uses Google Generative AI (Gemini 1.5 Flash) through the AI SDK. Set your API key in the GOOGLE_GENERATIVE_AI_API_KEY environment variable. You can create a key in Google AI Studio.

For other providers, see the AI SDK provider documentation.
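
A missing or empty key may only surface once the first request is made. As a minimal sketch, a small guard can check the variable at startup instead. `requireEnv` is a hypothetical helper name, not part of webforai or the AI SDK, and the environment is passed in explicitly so the function stays easy to test.

```typescript
// Hypothetical helper (not provided by webforai or the AI SDK):
// returns the trimmed value of an environment variable, or throws if it is unset or blank.
function requireEnv(name: string, env: Record<string, string | undefined>): string {
	const value = env[name]?.trim();
	if (!value) {
		throw new Error(`Missing environment variable: ${name}`);
	}
	return value;
}

// Possible usage at the top of src/index.ts:
// requireEnv("GOOGLE_GENERATIVE_AI_API_KEY", process.env);
```

Calling it before `generateObject` turns a missing key into an immediate, clearly named error.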

Write code

Here’s how to load a page, convert its HTML to Markdown with webforai, and then turn that Markdown into a structured object with the AI SDK:

src/index.ts
import { google } from "@ai-sdk/google";
import { generateObject } from "ai";
import { htmlToMarkdown } from "webforai";
import { loadHtml } from "webforai/loaders/fetch";
import { z } from "zod";
 
// Fetch the page HTML and convert it to Markdown.
const html = await loadHtml("https://github.com/inaridiy?tab=repositories");
const markdown = htmlToMarkdown(html);
 
// Ask the model for an object that matches the Zod schema.
const { object: repositories } = await generateObject({
	model: google("gemini-1.5-flash-latest"),
	schema: z.object({
		repositories: z.array(
			z.object({
				name: z.string(),
				url: z.string(),
				stars: z.number(),
				license: z.string(),
			}),
		),
	}),
	prompt: `Please generate a list of repositories from the following markdown content.\n\n${markdown}`,
});
 
// Prints an object of the form { repositories: [...] }.
console.log(repositories);
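
The prompt above embeds the whole Markdown document, which can get large for long pages. One possible way to keep it bounded is sketched below. `buildPrompt` and its `maxChars` default are illustrative assumptions, not limits documented by webforai, the AI SDK, or Gemini.

```typescript
// Hypothetical helper: builds the same prompt as the example, but caps the Markdown
// at maxChars UTF-16 code units. The default of 20_000 is an arbitrary illustration.
function buildPrompt(markdown: string, maxChars = 20_000): string {
	const body =
		markdown.length > maxChars
			? `${markdown.slice(0, maxChars)}\n\n[truncated]`
			: markdown;
	return `Please generate a list of repositories from the following markdown content.\n\n${body}`;
}
```

The result can be passed as the `prompt` option in place of the inline template string. Cutting by character count is crude and may drop entries near the end of the page, so treat it as a starting point only.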

Launch 🚀

Just run the following command:

npx tsx src/index.ts
 
# => {
# =>   repositories: [
# =>     {
# =>       name: 'webforai',
# =>       url: 'https://github.com/inaridiy/webforai',
# =>       stars: 46,
# =>       license: 'MIT'
# =>     },
# =>     ...
# =>   ]
# => }
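
Because the result is a plain typed object, it can be processed like any other data. As a rough sketch, the helper below picks the most-starred repositories. The `Repository` type mirrors the schema fields from the example, while `topByStars` is a hypothetical name introduced only for illustration.

```typescript
// Mirrors the fields of the Zod schema used in src/index.ts.
type Repository = { name: string; url: string; stars: number; license: string };

// Hypothetical helper: returns up to `limit` repositories, highest star count first.
// The input array is copied before sorting, so the original order is left untouched.
function topByStars(repos: Repository[], limit: number): Repository[] {
	return [...repos].sort((a, b) => b.stars - a.stars).slice(0, limit);
}
```

With the object from the example, `topByStars(repositories.repositories, 3)` would return the three entries with the highest `stars` values.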