Vision-Based AI Agent Scraper Overview

This workflow uses a vision-based AI Agent, together with Google Sheets, ScrapingBee, and the Gemini-1.5-Pro model, to extract structured data from webpages. The agent first tries to extract the data from a screenshot of the page and falls back to HTML scraping when the screenshot does not contain enough information. The template is designed primarily for e-commerce scraping but can be customized for other web data extraction tasks.

Key Features

AI-Powered Vision Scraping: Uses Gemini-1.5-Pro to analyze webpage screenshots and extract data from visual context, which suits dynamic or visually complex sites.
HTML Fallback: Switches to HTML-based scraping (with the HTML converted to Markdown) when the screenshot is ambiguous or incomplete, so that fields missing from the visual pass can still be recovered.
Google Sheets Integration: Reads the list of URLs to scrape from a Google Sheet and writes the extracted, structured data back to a designated sheet.
Structured Data Output: Transforms raw extracted data into a clean, easy-to-use JSON format, ready for analysis or further processing.
Cost-Optimized Processing: Converts HTML to Markdown before AI processing to minimize token usage and reduce operational costs.
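
To illustrate why converting HTML to Markdown reduces the amount of text sent to the model, here is a minimal sketch in Python using only the standard library. It is not the converter the workflow uses; the class `MarkdownLite` and the function `html_to_markdown` are hypothetical names, and the conversion is deliberately lossy (it keeps headings, paragraphs, and list items, and drops scripts, styles, and attributes).

```python
from html.parser import HTMLParser


class MarkdownLite(HTMLParser):
    """Tiny, lossy HTML-to-Markdown converter (illustrative stand-in only)."""

    SKIP = {"script", "style", "noscript"}          # content is dropped entirely
    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.HEADINGS:
            self.parts.append("\n" + self.HEADINGS[tag])
        elif tag == "li":
            self.parts.append("\n- ")
        elif tag in ("p", "div", "br"):
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep visible text only, with whitespace collapsed.
        if self.skip_depth == 0 and data.strip():
            self.parts.append(" ".join(data.split()) + " ")


def html_to_markdown(html: str) -> str:
    parser = MarkdownLite()
    parser.feed(html)
    parser.close()
    lines = (line.strip() for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = (
        '<html><head><style>.a{color:red}</style></head><body>'
        '<div class="product"><h1>Acme Mug</h1>'
        '<p>Price: <b>$19.99</b></p><ul><li>In stock</li></ul></div>'
        '</body></html>'
    )
    markdown = html_to_markdown(sample)
    print(markdown)
    print(len(sample), "characters of HTML ->", len(markdown), "characters of Markdown")
```

Character count is only a rough proxy for token count, but on real product pages most of the markup is tags, attributes, scripts, and styles, so the reduction is usually much larger than in this small sample.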

How It Works

The workflow begins by fetching a list of target URLs from a Google Sheet. For each URL, it captures a full-page screenshot using ScrapingBee. This screenshot, along with the URL, is then passed to the Vision-based Scraping Agent.

The agent, powered by the Gemini-1.5-Pro large language model, first attempts to extract predefined data (e.g., product title, price, brand, promo info) directly from the screenshot. If it finds that any information is missing or ambiguous in the visual data, it calls an HTML-based scraping tool. This tool retrieves the webpage's HTML, converts it to a token-efficient Markdown format, and returns it to the agent for a second extraction attempt.

Once the data is extracted, a Structured Output Parser formats it into a JSON array. Finally, the 'Split Out' node splits this array into individual records, which are appended as new rows to a designated 'Results' sheet in Google Sheets.
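
The screenshot-first, HTML-fallback flow described above can be sketched in Python as follows. This is a simplified illustration, not the workflow's implementation: in the workflow the agent itself decides when to call the HTML tool, whereas here a plain check for empty fields stands in for that decision. The field names in `REQUIRED_FIELDS` and the functions `scrape_url` and `to_rows` are hypothetical, and the two extractors are passed in as callables so that no particular ScrapingBee or Gemini API is assumed.

```python
from typing import Callable, Optional

# Hypothetical field names, based on the examples in the text above.
REQUIRED_FIELDS = ("product_title", "price", "brand", "promo_info")

Record = dict[str, Optional[str]]
Extractor = Callable[[str], list[Record]]


def missing_fields(record: Record) -> list[str]:
    """Return the required fields that are absent or empty in a record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


def scrape_url(url: str, vision_extract: Extractor, html_extract: Extractor) -> list[Record]:
    """Try the screenshot-based extractor first; fall back to the HTML one."""
    records = vision_extract(url)
    if records and not any(missing_fields(r) for r in records):
        return records
    # Screenshot result was empty or incomplete: second attempt from HTML/Markdown.
    return html_extract(url)


def to_rows(records: list[Record]) -> list[list[str]]:
    """Split a list of records into one spreadsheet row per record."""
    return [[record.get(field) or "" for field in REQUIRED_FIELDS] for record in records]


if __name__ == "__main__":
    # Stub extractors standing in for the model calls.
    def vision_stub(url: str) -> list[Record]:
        return [{"product_title": "Acme Mug", "price": None, "brand": "Acme", "promo_info": "10% off"}]

    def html_stub(url: str) -> list[Record]:
        return [{"product_title": "Acme Mug", "price": "$19.99", "brand": "Acme", "promo_info": "10% off"}]

    for row in to_rows(scrape_url("https://example.com/mug", vision_stub, html_stub)):
        print(row)
```

Treating every field as required is a simplification: a product may legitimately have no promotion, in which case a stricter rule here would trigger the fallback more often than necessary.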

AI Agent Automates Vision-Based Web Scraping to Google Sheets
