Automate PDF & Image Text Extraction to Categorized CSV

Manually extracting and categorizing data from PDFs and images, such as bank statements, is tedious and error-prone. This workflow uses Vertex AI (Gemini) and an LLM served through OpenRouter to extract text, categorize transactions, and output them as a structured CSV file, removing the manual data-entry step.

Key Features

Automatically monitors Google Drive for new PDF and image files.
Routes each file by type (PDF or image) to a dedicated processing path.
Extracts text from PDF documents with an OCR step.
Uses Vertex AI (Gemini) to recognize text in images and extract data from them.
Applies an LLM (Llama 3.1 via OpenRouter) to structure and categorize the text extracted from PDFs, such as financial transactions.
Automatically converts extracted and categorized data into CSV format.
Uploads the final categorized CSV files directly to a designated Google Drive folder.
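
The routing feature above can be illustrated with a minimal Python sketch. It assumes each incoming file carries a MIME type string, as Google Drive file metadata does; the function name and the "pdf" / "image" / "skip" labels are hypothetical and are not taken from the workflow itself.

```python
def route_by_mime(mime_type: str) -> str:
    """Pick a processing path from a file's MIME type.

    The returned labels are illustrative names for the two branches
    described in this workflow, plus a fallback for anything else.
    """
    if mime_type == "application/pdf":
        return "pdf"      # OCR, then LLM categorization
    if mime_type.startswith("image/"):
        return "image"    # sent to Vertex AI (Gemini)
    return "skip"         # not a PDF or an image


if __name__ == "__main__":
    for mime in ("application/pdf", "image/png", "text/plain"):
        print(mime, "->", route_by_mime(mime))
```

Returning an explicit "skip" label keeps unsupported uploads from silently entering either branch.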

How It Works

The workflow monitors a specified Google Drive folder for new PDF or image files. When a file appears, the workflow checks its type and sends it down one of two paths.

If the file is a PDF, the workflow downloads it, extracts the text with an OCR step, and sends the raw text to an external LLM (Llama 3.1 via OpenRouter), which extracts structured transaction data and assigns categories.

If the file is an image, the workflow downloads it and sends it to Vertex AI (Gemini), which handles text extraction, transaction parsing, and categorization in a single step.

Both paths then convert the processed data into a CSV file and upload it to a designated output folder in Google Drive, giving an automated pipeline from document ingestion to categorized export.
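
The final conversion step can be sketched in Python as follows. This is an illustration under stated assumptions, not the workflow's actual node configuration: it assumes the model returns a JSON array of transaction objects, and the field names (date, description, amount, category) and the function name are hypothetical.

```python
import csv
import io
import json

# Assumed column layout; the real workflow's fields may differ.
FIELDS = ["date", "description", "amount", "category"]


def transactions_to_csv(llm_json: str) -> str:
    """Convert a JSON array of transaction objects into CSV text."""
    rows = json.loads(llm_json)
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=FIELDS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        # Missing fields become empty cells instead of raising an error.
        writer.writerow({field: row.get(field, "") for field in FIELDS})
    return buf.getvalue()


if __name__ == "__main__":
    sample = json.dumps([
        {"date": "2024-01-05", "description": "Coffee, downtown",
         "amount": -4.5, "category": "Food"},
    ])
    print(transactions_to_csv(sample))
```

Using the csv module rather than joining strings by hand means descriptions that contain commas or quotes are escaped correctly in the output.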
