Structured Output
October 19, 2024
You can get structured JSON directly from any website by using webforai and the Vercel AI SDK.
Install dependencies
Install the necessary packages:
npm init -y
npm install webforai ai @ai-sdk/google zod
npm install -D tsx
Prepare API Key
This example uses Google Generative AI (Gemini 1.5 Flash) via the AI SDK. Set your API key in the GOOGLE_GENERATIVE_AI_API_KEY environment variable, which the @ai-sdk/google provider reads by default. You can create a key in Google AI Studio.
For other providers, see the AI SDK provider documentation.
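A missing key usually only surfaces once the first model request fails, so it can help to check the environment up front. The following is a minimal sketch, not part of webforai or the AI SDK: requireEnv is a hypothetical helper name, and it takes the environment as a plain object so it can be used outside Node.js as well.

```typescript
// Hypothetical helper (not part of webforai or the AI SDK): fail fast when
// a required environment variable is missing or blank.
function requireEnv(
  env: Record<string, string | undefined>,
  name: string,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
}

// In a Node.js script you would typically pass process.env:
//   requireEnv(process.env, "GOOGLE_GENERATIVE_AI_API_KEY");
```

Calling it at the top of the script turns a confusing provider error into a clear message that names the variable to set.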
Write code
Here’s how to convert HTML to Markdown using webforai and then transform it into a structured object with the AI SDK:
src/index.ts
import { google } from "@ai-sdk/google";
import { generateObject } from "ai";
import { htmlToMarkdown } from "webforai";
import { loadHtml } from "webforai/loaders/fetch";
import { z } from "zod";

// Fetch the page and convert its HTML to Markdown.
const html = await loadHtml("https://github.com/inaridiy?tab=repositories");
const markdown = htmlToMarkdown(html);

// Ask the model for an object that matches the zod schema.
const { object: repositories } = await generateObject({
  model: google("gemini-1.5-flash-latest"),
  schema: z.object({
    repositories: z.array(
      z.object({
        name: z.string(),
        url: z.string(),
        stars: z.number(),
        license: z.string(),
      }),
    ),
  }),
  prompt: `Please generate a list of repositories from the following markdown content.\n\n${markdown}`,
});

console.log(repositories);
Launch 🚀
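Run the script with tsx. Because tsx was installed as a local dev dependency, invoke it through npx. The script uses top-level await, so if tsx reports that top-level await is not supported for CommonJS output, add "type": "module" to package.json.
npx tsx src/index.ts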
# => {
# =>   repositories: [
# =>     {
# =>       name: 'webforai',
# =>       url: 'https://github.com/inaridiy/webforai',
# =>       stars: 46,
# =>       license: 'MIT'
# =>     }
# =>   ]
# =>   ...
# => }
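Once the result is a typed object, it can be processed like any other data. The sketch below assumes the result has the shape described by the zod schema in src/index.ts. The Repository and RepositoryList interfaces and the topByStars helper are hypothetical names introduced for illustration, and the second sample entry is made up.

```typescript
// Plain TypeScript shapes assumed to mirror the zod schema in src/index.ts.
interface Repository {
  name: string;
  url: string;
  stars: number;
  license: string;
}

interface RepositoryList {
  repositories: Repository[];
}

// Hypothetical helper: return the `limit` most-starred repositories
// without mutating the input array.
function topByStars(result: RepositoryList, limit: number): Repository[] {
  return [...result.repositories]
    .sort((a, b) => b.stars - a.stars)
    .slice(0, limit);
}

// Sample data: the webforai entry comes from the output above;
// "example-repo" is a made-up entry for illustration only.
const sample: RepositoryList = {
  repositories: [
    { name: "example-repo", url: "https://example.com/example-repo", stars: 3, license: "MIT" },
    { name: "webforai", url: "https://github.com/inaridiy/webforai", stars: 46, license: "MIT" },
  ],
};

console.log(topByStars(sample, 1).map((repo) => repo.name));
```

In the real script you would pass the object returned by generateObject instead of the sample data.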