The scrape capability fetches a URL and returns the page's raw HTML, preserving the page structure so that specific elements can be extracted from it.

Providers

Provider     Features
Firecrawl    JavaScript rendering, handles dynamic content
ScraperAPI   Rotating proxies, handles blocks

Basic Usage

// Assumes `saturn` is an already-initialized client instance
const result = await saturn.scrape({
  url: 'https://example.com/products',
});

console.log(result.data.html);
// → Raw HTML content

Parameters

url (string, required)
  The URL to scrape.
provider (string)
  Specific provider to use (firecrawl or scraperapi).
waitFor (number)
  Milliseconds to wait for JavaScript rendering.
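
The parameters above can be checked before a request is sent. The following is a minimal sketch, not part of the SDK: `ScrapeOptions` and `validateScrapeOptions` are hypothetical names, the type covers only the three parameters listed here, and the restriction to http/https URLs is an assumption.

```typescript
// Hypothetical type covering only the parameters documented above.
interface ScrapeOptions {
  url: string;
  provider?: string;
  waitFor?: number;
}

// Provider identifiers as they appear in the parameter description.
const KNOWN_PROVIDERS: readonly string[] = ['firecrawl', 'scraperapi'];

// Hypothetical helper: returns a list of problems; an empty list means the options look usable.
function validateScrapeOptions(options: ScrapeOptions): string[] {
  const problems: string[] = [];

  try {
    const parsed = new URL(options.url); // throws on relative or malformed URLs
    // Assumption: only http and https pages can be scraped.
    if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
      problems.push('url must use http or https');
    }
  } catch {
    problems.push('url is not a valid absolute URL');
  }

  if (options.provider !== undefined && KNOWN_PROVIDERS.indexOf(options.provider) === -1) {
    problems.push('provider must be firecrawl or scraperapi');
  }

  if (options.waitFor !== undefined && (!Number.isFinite(options.waitFor) || options.waitFor < 0)) {
    problems.push('waitFor must be a non-negative number of milliseconds');
  }

  return problems;
}
```

Returning a list of problems rather than throwing on the first one lets a caller report every invalid field at once.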

Response

interface ScrapeResponse {
  data: {
    html: string;       // Raw HTML content
    url: string;        // Final URL (after redirects)
    statusCode: number; // HTTP status code
  };
  metadata: {
    chargedUsdCents: number;
    provider: string;
    latencyMs: number;
    auditId: string;
  };
}
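
As an illustration of reading these fields, here is a small sketch with two hypothetical helpers, `isSuccessful` and `describeScrape`. It assumes that `statusCode` reflects the response of the scraped page and that `chargedUsdCents` is expressed in US cents; neither helper is part of the SDK.

```typescript
// Shape copied from the ScrapeResponse interface above.
interface ScrapeResponse {
  data: { html: string; url: string; statusCode: number };
  metadata: { chargedUsdCents: number; provider: string; latencyMs: number; auditId: string };
}

// Hypothetical helper: true when the status code is in the 2xx range.
function isSuccessful(res: ScrapeResponse): boolean {
  return res.data.statusCode >= 200 && res.data.statusCode < 300;
}

// Hypothetical helper: one-line summary suitable for logs.
// Assumption: chargedUsdCents is in cents, so dividing by 100 yields dollars.
function describeScrape(res: ScrapeResponse): string {
  const usd = (res.metadata.chargedUsdCents / 100).toFixed(4);
  return (
    `${res.data.url} -> ${res.data.statusCode} via ${res.metadata.provider} ` +
    `in ${res.metadata.latencyMs} ms, charged $${usd} (audit ${res.metadata.auditId})`
  );
}
```

Logging the `auditId` alongside the charge makes it easier to trace an individual request later.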

Examples

Extract Product Data

const page = await saturn.scrape({
  url: 'https://store.example.com/product/123',
});

// Use an LLM to extract structured data
const extraction = await saturn.reason({
  prompt: `Extract product information from this HTML as JSON:
  - name
  - price
  - description
  - availability

  HTML:
  ${page.data.html}`,
});

// JSON.parse throws if the model returns anything other than valid JSON
const product = JSON.parse(extraction.data.content);
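
Model output is not guaranteed to be bare JSON; it is often wrapped in a Markdown code fence. The sketch below shows one defensive way to parse it. `parseLlmJson` is a hypothetical helper, not part of the SDK, and it makes no assumption about the shape of the parsed value.

```typescript
// Three backticks, built at runtime so this example contains no literal fence.
const FENCE = '`'.repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + '(?:json)?\\s*([\\s\\S]*?)' + FENCE, 'i');

// Hypothetical helper: returns the parsed JSON value, or undefined when the
// text cannot be parsed (undefined is never a valid JSON value, so it is unambiguous).
function parseLlmJson(content: string): unknown {
  const fenced = content.match(FENCED_BLOCK);
  // Prefer the contents of the first fenced block; otherwise use the whole reply.
  const candidate = (fenced ? fenced[1] : content).trim();
  try {
    return JSON.parse(candidate);
  } catch {
    return undefined; // Caller decides whether to retry, log, or skip
  }
}
```

With this in place, the final line of the example could become `const product = parseLlmJson(extraction.data.content);` followed by a check for `undefined`.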

Handle Dynamic Content

// Wait for JavaScript to render
const page = await saturn.scrape({
  url: 'https://spa.example.com',
  waitFor: 3000, // Wait 3 seconds
});

When to Use scrape vs read

Capability   Use case                               Returns
read         Articles, blog posts, documentation    Clean text
scrape       Data extraction, structured content    Raw HTML
Use scrape when you need to:
  • Extract specific elements (prices, tables, lists)
  • Parse structured data
  • Work with page layout

Pricing

Provider     Cost per page
Firecrawl    ~$0.002
ScraperAPI   ~$0.005
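
To budget a crawl, the approximate per-page prices above can be multiplied out. This is a back-of-the-envelope sketch only: `estimateScrapeCostUsd` is a hypothetical helper, the prices are the approximate figures from the table, and the amount actually charged is whatever `metadata.chargedUsdCents` reports for each request.

```typescript
// Approximate per-page prices from the table above, in USD.
const APPROX_USD_PER_PAGE: Record<string, number> = {
  firecrawl: 0.002,
  scraperapi: 0.005,
};

// Hypothetical helper: rough cost estimate for scraping `pages` pages with one provider.
function estimateScrapeCostUsd(provider: string, pages: number): number {
  const perPage = APPROX_USD_PER_PAGE[provider];
  if (perPage === undefined) {
    throw new Error(`Unknown provider: ${provider}`);
  }
  return perPage * pages;
}

// 10,000 pages at roughly $0.002 each
console.log(estimateScrapeCostUsd('firecrawl', 10000).toFixed(2)); // "20.00"
```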

Combining with LLM Extraction

async function extractJobListings(url: string) {
  const page = await saturn.scrape({ url });

  const jobs = await saturn.reason({
    prompt: `Extract job listings from this HTML as JSON array:
    Each job should have: title, company, location, salary (if listed)

    HTML:
    ${page.data.html}`,
  });

  // JSON.parse throws if the model's reply is not valid JSON
  return JSON.parse(jobs.data.content);
}