Automate Web Scraping in n8n with the ScrapingBot API
Combining n8n and ScrapingBot gives you the best of both worlds: a visual no-code workflow builder and a battle-tested scraping API that handles JavaScript rendering, rotating IPs, and anti-bot measures. In this guide, you will learn how to connect n8n's HTTP Request node to the ScrapingBot API, handle pagination and errors, and ship production-ready scraping automations — without writing complex infrastructure code.
Table of contents
1. Why combine n8n and ScrapingBot API?
2. Prerequisites
3. Setting up n8n ScrapingBot API with HTTP Request node
4. Parsing the response
5. Handling multiple URLs
6. Common errors and how to fix them
7. Production recipes
1. Why combine n8n and ScrapingBot API?
Building a scraping pipeline typically requires two things: a tool to extract data from pages, and a tool to orchestrate what happens with that data. In practice, most developers end up stitching these together manually with custom scripts. n8n and ScrapingBot solve this more cleanly:
| Tool | What it does |
|---|---|
| n8n | Visual workflow builder — triggers, branching, batching, scheduling, and integrations with 400+ services |
| ScrapingBot | Scraping API — handles JavaScript rendering, geo-location, anti-bot measures, and rotating IPs |
Together, they let you build a full data pipeline — from scraping a page to storing results in a database, sending a Slack alert, or updating a Google Sheet — all without maintaining brittle infrastructure.
The n8n + ScrapingBot API combination is particularly powerful for teams that want to automate data collection without writing custom scrapers, and n8n's visual interface makes it easy to iterate on and debug each step independently.
2. Prerequisites
Before building your workflow, make sure you have the following ready:
- An n8n instance — Desktop app, self-hosted, or n8n Cloud
- Your ScrapingBot username and API key — available in your ScrapingBot dashboard
- A target URL you want to scrape
3. Setting up n8n ScrapingBot API with HTTP Request node
Step 1 — Create your credentials
First, set up a reusable credential in n8n so you don't have to paste your API key into every node:
- In n8n, go to Credentials → Add Credential
- Select Basic Auth
- Name it ScrapingBot API
- Set User to your ScrapingBot username
- Set Password to your ScrapingBot API key
- Click Save
Step 2 — Configure the HTTP Request node
Next, add an HTTP Request node to your workflow and configure it as follows:
| Field | Value |
|---|---|
| HTTP Method | POST |
| URL | https://api.scraping-bot.io/scrape/raw-html |
| Authentication | Basic Auth → select ScrapingBot API |
| Body Content Type | JSON |
| Response Format | JSON |
Step 3 — Set the request body
In the JSON body field, pass the URL you want to scrape along with any options:
```json
{
  "url": "https://example.com/products",
  "options": {
    "premiumProxy": false,
    "country": "us",
    "waitForNetworkIdle": true
  }
}
```
For dynamic URLs coming from a previous node (for example, a Google Sheets row), use n8n's expression syntax instead:
```json
{
  "url": "{{ $json.url }}",
  "options": {
    "premiumProxy": false
  }
}
```
- premiumProxy (boolean): enables residential IPs for harder targets
- country: sets the geo-location (e.g. "fr", "de", "us")
- waitForNetworkIdle: waits for all JavaScript to finish loading before the HTML is returned
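Before wiring everything together, it can help to sanity-check your credentials and request body outside n8n. Here is a minimal sketch using Node 18+ and its built-in fetch; the username and API key are placeholders for your own values:

```javascript
// Standalone sanity check of the ScrapingBot request outside n8n.
// Run with Node 18+ (built-in fetch), e.g. `node check.mjs`.
const USERNAME = "your-scrapingbot-username"; // placeholder
const API_KEY = "your-scrapingbot-api-key";   // placeholder

const response = await fetch("https://api.scraping-bot.io/scrape/raw-html", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    // Same Basic Auth as the n8n credential: base64 of "username:apiKey"
    Authorization: "Basic " + Buffer.from(`${USERNAME}:${API_KEY}`).toString("base64"),
  },
  body: JSON.stringify({
    url: "https://example.com/products",
    options: { premiumProxy: false, country: "us", waitForNetworkIdle: true },
  }),
});

const data = await response.json();
console.log(response.status, data.html ? data.html.slice(0, 200) : data);
```

If this prints a 200 status and the start of the rendered HTML, the same credentials and body will work in the HTTP Request node.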
4. Parsing the response
Understanding the response structure
ScrapingBot returns a structured JSON object. The main field you will use is html, which contains the fully rendered page content:
```json
{
  "html": "<html>...rendered page content...</html>",
  "statusCode": 200,
  "captchaFound": false,
  "host": "example.com"
}
```
Extracting data with the HTML Extract node
After the HTTP Request node, add an HTML Extract node to pull specific data from the response. For example, to extract all product titles from a page:
| Field | CSS Selector | Return Value |
|---|---|---|
| productTitle | h2.product-title | Text |
| productPrice | span.price | Text |
| productUrl | a.product-link | HTML Attribute → href |
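If you prefer code over point-and-click selectors, a Code node can do the same extraction. Here is a sketch assuming a self-hosted n8n where cheerio has been whitelisted as an external module (NODE_FUNCTION_ALLOW_EXTERNAL=cheerio); the selectors and the wrapping div structure are illustrative, matching the table above:

```javascript
// n8n Code node (Run Once for All Items).
// Assumes cheerio is whitelisted via NODE_FUNCTION_ALLOW_EXTERNAL=cheerio (self-hosted only).
const cheerio = require('cheerio');

const results = [];
for (const item of $input.all()) {
  const $ = cheerio.load(item.json.html ?? '');
  $('h2.product-title').each((_, el) => {
    const card = $(el).closest('div'); // illustrative: assumes one wrapping div per product
    results.push({
      json: {
        productTitle: $(el).text().trim(),
        productPrice: card.find('span.price').text().trim(),
        productUrl: card.find('a.product-link').attr('href') ?? null,
      },
    });
  });
}
return results;
```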
Checking for errors before parsing
Always add an IF node after the HTTP Request to check that the scrape succeeded before processing the data:
```
// Condition in the IF node
{{ $json.statusCode === 200 && $json.captchaFound === false }}
```
If the condition is false, route that branch to a retry or error handler instead of continuing the workflow.
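If you want both branches to carry the failure context, one option is a small Code node before the IF that tags each item; a sketch:

```javascript
// n8n Code node: tag each scrape result so the IF node can branch on a single flag.
return $input.all().map((item) => ({
  json: {
    ...item.json,
    scrapeOk: item.json.statusCode === 200 && item.json.captchaFound === false,
  },
}));
```

The IF condition then becomes simply {{ $json.scrapeOk }}.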
5. Handling multiple URLs
Recommended workflow structure
When you need to scrape a large list of URLs, batching is essential to avoid overloading the target server and hitting rate limits. Here is the recommended pattern:
| Step | Node | Purpose |
|---|---|---|
| 1 | Trigger (Manual or Cron) | Start the workflow |
| 2 | Google Sheets / Database | Read the list of URLs to scrape |
| 3 | Split In Batches | Process 5–10 URLs at a time |
| 4 | HTTP Request → ScrapingBot | Scrape each URL |
| 5 | Wait | Add a 500–1500ms delay between batches |
| 6 | HTML Extract / Code | Parse the response |
| 7 | Write results | Push to database, sheet, or CRM |
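If the sheet or database isn't wired up yet, you can stub step 2 with a Code node that emits one item per URL; a sketch with placeholder URLs:

```javascript
// n8n Code node standing in for the Google Sheets step while you test.
// Each returned item becomes one input to the Split In Batches node.
const urls = [
  'https://example.com/products?page=1', // placeholders
  'https://example.com/products?page=2',
  'https://example.com/products?page=3',
];
return urls.map((url) => ({ json: { url } }));
```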
Adding a polite delay
In the Wait node, set a random delay between requests to avoid triggering rate limits:
```javascript
// In a Code node before the Wait node:
// generate a random delay between 500ms and 1500ms.
const delay = Math.floor(Math.random() * 1000) + 500;
return [{ json: { delay } }];
```
Then in the Wait node, set the duration to {{ $json.delay }} milliseconds. As a result, your workflow behaves more like a human browser and is far less likely to get blocked.
6. Common errors and how to fix them
Even with ScrapingBot handling most protections, errors can still occur. Here is how to handle the most common ones:
| Error | Cause | Fix |
|---|---|---|
| 401 Unauthorized | Wrong credentials | Double-check your username and API key in the n8n credential |
| 429 Too Many Requests | Rate limit exceeded | Increase the delay between requests or reduce batch size |
| captchaFound: true | CAPTCHA not bypassed | Enable premiumProxy: true in the request options |
| statusCode: 404 | Page no longer exists | Add an IF node to skip 404s and log them separately |
| Empty HTML response | JavaScript not rendered | Set waitForNetworkIdle: true in the options |
| Workflow timeout | Too many URLs in one run | Reduce batch size and add a Wait node between batches |
Adding retry logic
For transient errors, add automatic retries using n8n's built-in retry mechanism. In the HTTP Request node settings, enable "Retry on Fail" and set:
- Max Tries: 3
- Wait Between Tries: 2000ms
Additionally, for persistent failures, route them to a dedicated error branch that logs the failed URL to a Google Sheet or sends a Slack notification for manual review.
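On that error branch, a small Code node can shape each failure into a log-ready row before the Google Sheets or Slack node; the field names below are assumptions about what your workflow passes through:

```javascript
// n8n Code node on the error branch: one log row per failed scrape.
return $input.all().map((item) => ({
  json: {
    failedUrl: item.json.url ?? 'unknown',        // assumes the URL was kept on the item
    statusCode: item.json.statusCode ?? null,
    captchaFound: item.json.captchaFound ?? false,
    failedAt: new Date().toISOString(),
  },
}));
```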
7. Production recipes
Once your basic workflow is working, here are three ready-to-ship automations you can build today:
Price monitor
Track product prices and get alerted when they change:
- Cron trigger — run every hour
- HTTP Request → ScrapingBot scrapes the product page
- HTML Extract — pulls the current price
- IF node — compares with the last stored price (see the sketch after this list)
- Slack / Email node — sends an alert if the price changed
- Google Sheets — updates the stored price
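The comparison in step 4 is the only nontrivial logic. Here is a sketch of a Code node placed just before the IF; the productPrice and lastPrice field names are assumptions about what the HTML Extract and Google Sheets nodes return:

```javascript
// n8n Code node: normalize the scraped price and compare it with the stored one.
return $input.all().map((item) => {
  // Strip currency symbols and separators, e.g. "$1,299.00" -> 1299
  const currentPrice = parseFloat(String(item.json.productPrice).replace(/[^0-9.]/g, ''));
  const lastPrice = parseFloat(item.json.lastPrice);
  return {
    json: {
      ...item.json,
      currentPrice,
      priceChanged: !Number.isNaN(lastPrice) && currentPrice !== lastPrice,
    },
  };
});
```

The IF node condition is then just {{ $json.priceChanged }}.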
Lead capture pipeline
Turn a list of company pages into enriched CRM records:
- Google Sheets — reads a list of company LinkedIn or website URLs
- Split In Batches — processes 5 URLs at a time
- HTTP Request → ScrapingBot scrapes each page
- Code node — extracts name, email, phone, address (a partial sketch follows this list)
- HubSpot / Salesforce node — creates or updates the contact record
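For step 4, here is a deliberately naive sketch covering just email and phone; real pages almost always need tuned selectors or regexes:

```javascript
// n8n Code node: naive contact extraction from the raw HTML.
// These regexes are illustrations only; adjust them for your real targets.
return $input.all().map((item) => {
  const html = item.json.html ?? '';
  const email = html.match(/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/)?.[0] ?? null;
  const phone = html.match(/\+?\d[\d\s().-]{7,}\d/)?.[0] ?? null;
  return { json: { ...item.json, email, phone } };
});
```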
SEO audit
Audit your entire site for missing titles, broken H1s, and status codes:
- HTTP Request — fetches your sitemap.xml
- XML node — extracts all URLs from the sitemap
- Split In Batches — processes pages in groups of 10
- HTTP Request → ScrapingBot scrapes each page
- HTML Extract — pulls the title, H1, and meta description (the status code comes from the HTTP response itself); see the sketch after this list
- Google Sheets — exports the full audit as a CSV-ready spreadsheet
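As an alternative to the HTML Extract step, a Code node can compute the audit flags directly. A sketch, again assuming cheerio is whitelisted as an external module; the output field names are illustrative:

```javascript
// n8n Code node: derive audit flags for each scraped page.
// Assumes NODE_FUNCTION_ALLOW_EXTERNAL=cheerio on a self-hosted instance.
const cheerio = require('cheerio');

return $input.all().map((item) => {
  const $ = cheerio.load(item.json.html ?? '');
  const title = $('title').first().text().trim();
  const h1Count = $('h1').length;
  return {
    json: {
      url: item.json.url ?? null,
      title: title || null,
      missingTitle: title === '',
      h1Count,
      brokenH1: h1Count !== 1, // zero or duplicate H1s both count as broken
      metaDescription: $('meta[name="description"]').attr('content') ?? null,
    },
  };
});
```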
Ready to automate your scraping workflows? Get 100 free credits when you sign up for ScrapingBot — no credit card required.
Try ScrapingBot for free →