Loaders Utilities
The Loaders Utilities provide simple tools for fetching HTML from websites.
All the utilities are designed to be straightforward, requiring no configuration.
Overview of Loaders
Webforai provides four different loaders:
- Fetch Loader: The simplest option, using JavaScript's built-in Fetch API.
- Playwright Loader: Ideal for sites requiring JavaScript execution, like SPAs.
- Puppeteer Loader: An alternative to Playwright for sites that require JavaScript execution.
- CF Puppeteer Loader: For loading sites that require JavaScript execution from code running on Cloudflare Workers.
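Because each loader is exposed as an async function that takes a URL and resolves to an HTML string, loaders can be combined. The sketch below tries a list of loaders in order and returns the first usable result. `loadWithFallback`, the `Loader` type, and the `isUsable` check are hypothetical names introduced for illustration; they are not part of webforai.

```typescript
// A loader here is any function with the shape (url) => Promise<string>.
type Loader = (url: string) => Promise<string>;

// Hypothetical helper (not part of webforai): try each loader in order and
// return the first HTML string accepted by the caller-supplied check.
async function loadWithFallback(
  url: string,
  loaders: Loader[],
  // Assumed default: accept any HTML that is not blank.
  isUsable: (html: string) => boolean = (html) => html.trim().length > 0,
): Promise<string> {
  const failures: string[] = [];
  for (let index = 0; index < loaders.length; index++) {
    try {
      const html = await loaders[index](url);
      if (isUsable(html)) {
        return html;
      }
      failures.push(`loader ${index}: result rejected by isUsable`);
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      failures.push(`loader ${index}: ${message}`);
    }
  }
  throw new Error(`All loaders failed for ${url}: ${failures.join("; ")}`);
}
```

Passing the `loadHtml` functions from `webforai/loaders/fetch` and `webforai/loaders/playwright` in that order would attempt the lightweight request first and start a browser only when needed. Since a plain fetch of an SPA usually succeeds but returns an empty shell, a stricter `isUsable` (for example, checking for an expected element) is likely needed in that case.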
Fetch Loader
The Fetch Loader is the simplest utility, using JavaScript’s Fetch API. It retrieves HTML from a given URL, using a basic User-Agent for the request.
Usage
import { loadHtml } from "webforai/loaders/fetch";
const html = await loadHtml("https://example.com");
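To make the behavior concrete, here is a rough sketch of what a fetch-based loader does: send a GET request with a User-Agent header and return the response body as text. This is not webforai's actual implementation; `fetchHtmlSketch`, the `FetchLike` type, the default User-Agent string, and the choice to throw on non-2xx responses are all assumptions made for illustration.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
type HtmlResponse = { ok: boolean; status: number; text(): Promise<string> };
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<HtmlResponse>;

// Hypothetical stand-in, NOT webforai's actual implementation.
async function fetchHtmlSketch(
  url: string,
  fetchImpl: FetchLike,
  userAgent = "Mozilla/5.0 (compatible; example-loader)", // assumed value
): Promise<string> {
  const response = await fetchImpl(url, { headers: { "User-Agent": userAgent } });
  if (!response.ok) {
    // Assumed policy: treat non-2xx responses as errors.
    throw new Error(`Request to ${url} failed with status ${response.status}`);
  }
  // The body is the markup as sent by the server; no scripts are executed.
  return response.text();
}
```

In a runtime with a global `fetch`, that function can be passed as the second argument. Because no JavaScript runs, pages that render their content client-side come back mostly empty, which is the case the browser-based loaders cover.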
Playwright Loader
The Playwright Loader is a more powerful tool, using Playwright to fetch HTML from websites that need JavaScript execution, like SPAs (Single Page Applications).
Usage
Before using the Playwright Loader, you need to install the Playwright browser and its dependencies.
npx playwright-core install
And then you can use the Playwright Loader as follows:
import { loadHtml } from "webforai/loaders/playwright";
const html = await loadHtml("https://example.com");
Puppeteer Loader
The Puppeteer Loader is another advanced tool that uses Puppeteer to load HTML from sites that rely on JavaScript execution, similar to Playwright.
Usage
Before using the Puppeteer Loader, you need to install the Puppeteer browser and its dependencies.
npm install puppeteer
And then you can use the Puppeteer Loader as follows:
import { loadHtml } from "webforai/loaders/puppeteer";
const html = await loadHtml("https://example.com");
CF Puppeteer Loader
The CF Puppeteer Loader is intended for code running on Cloudflare Workers that needs to load HTML from sites that rely on JavaScript execution. It uses Puppeteer on Cloudflare Workers and takes a browser instance that you create yourself.
Usage
Before using the CF Puppeteer Loader, you need to set up a Wrangler environment and install @cloudflare/puppeteer. Refer to the cookbook for instructions on how to create a project.
npm install @cloudflare/puppeteer --save-dev
And then you can use the CF Puppeteer Loader as follows:
import { loadHtml } from "webforai/loaders/cf-puppeteer";
const html = await loadHtml("https://example.com", browser); // browser is the puppeteer browser instance
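Since this loader takes a browser instance that the caller creates, the caller is also responsible for closing it. One way to guarantee that is a small wrapper that closes the browser whether loading succeeds or throws. `withBrowser` and `Closable` below are hypothetical names and not part of webforai or @cloudflare/puppeteer.

```typescript
// Anything with an async close() qualifies, including a Puppeteer browser.
interface Closable {
  close(): Promise<void>;
}

// Hypothetical helper: launch a browser, run a task with it, always close it.
async function withBrowser<B extends Closable, T>(
  launch: () => Promise<B>,
  use: (browser: B) => Promise<T>,
): Promise<T> {
  const browser = await launch();
  try {
    return await use(browser);
  } finally {
    // Runs on success and on error, so the browser session is always released.
    await browser.close();
  }
}
```

In a Worker, `launch` would typically be `() => puppeteer.launch(env.MYBROWSER)`, where `MYBROWSER` is an assumed name for the browser binding declared in the Wrangler configuration, and `use` would be `(browser) => loadHtml("https://example.com", browser)`.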