Convert Bank Statements to Markdown for Effortless Data Extraction
Extract specific financial data, such as all deposit rows, from complex bank statements with 95% accuracy in under 2 minutes, virtually eliminating manual data entry.
Manually extracting financial data from bank statements, especially scanned PDFs, is time-consuming and error-prone. This n8n workflow uses AI to convert bank statements into structured markdown, so that key financial data such as deposit transactions can be extracted precisely and efficiently.

Documentation
Automate Bank Statement Data Extraction
Manual data entry from bank statements, especially those that are scanned images, is a significant bottleneck for finance teams and individuals. This n8n workflow provides a robust solution by leveraging advanced Vision Language Models (VLMs) to convert bank statement PDFs into a structured markdown format. This enables precise and efficient extraction of critical financial data, such as deposit and withdrawal transactions, transforming unstructured documents into actionable insights.
Key Features
- Seamlessly converts both digital and scanned PDF bank statements into structured markdown.
- Accurately extracts specific financial data, like all deposit line items, from complex table layouts.
- Utilizes powerful Vision Language Models (VLMs) for superior document understanding compared to traditional OCR.
- Processes multi-page documents page by page and resizes page images to stay within token and timeout limits.
- Reduces manual data entry and errors, saving significant time and improving data quality.
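
As a rough illustration of how page images can be kept within model limits, the Python sketch below computes downscaled dimensions that preserve the aspect ratio. The function name fit_within and the 1024-pixel default are assumptions made for this example and are not values taken from the workflow.

```python
def fit_within(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_side.

    The aspect ratio is preserved and images that already fit are unchanged.
    max_side=1024 is an illustrative default, not a setting from the workflow.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Round to whole pixels, never dropping below 1 pixel on either side.
    return max(1, round(width * scale)), max(1, round(height * scale))


# An A4 page rendered at 300 DPI is 2480 x 3508 pixels.
print(fit_within(2480, 3508))
```

Smaller images generally mean fewer input tokens and faster responses per page, at the cost of legibility for very small print, so the limit is worth tuning against real statements.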
How It Works
This workflow begins by retrieving a bank statement PDF, which can be sourced from Google Drive or supplied by another trigger. The PDF is then split into one image per page using an external service (Stirling PDF, which can be self-hosted for privacy). These images are resized for AI processing and fed page by page into a Google Gemini Vision Language Model. The VLM transcribes each page into markdown, preserving headings, tables, and transactional details. All transcribed pages are then combined into a single markdown document. Finally, a second Google Gemini model, guided by a specific schema, extracts the requested financial data, such as all deposit table rows, from the combined markdown and returns structured output ready for further analysis or integration.
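
The last two stages described above (combining the transcribed pages, then pulling out deposit rows) can be sketched in Python. This is only an illustration of the data flow: it does not call Gemini, and a plain markdown table parser stands in for the schema-guided extraction. The column names (Date, Description, Withdrawal, Deposit), the sample rows, and all function names are hypothetical.

```python
import json


def combine_pages(pages: list[str]) -> str:
    """Join per-page markdown into one document, skipping empty pages."""
    return "\n\n".join(p.strip() for p in pages if p.strip())


def parse_table_rows(markdown: str) -> list[dict[str, str]]:
    """Return each markdown table row as a dict keyed by its table's header."""
    rows: list[dict[str, str]] = []
    header = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if not (line.startswith("|") and line.endswith("|")):
            header = None  # any non-table line ends the current table
            continue
        cells = [c.strip() for c in line[1:-1].split("|")]
        if header is None:
            header = cells  # first row of a table is its header
        elif all(c and set(c) <= set("-:") for c in cells):
            continue  # separator row such as |---|---|
        elif len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


def extract_deposits(markdown: str, column: str = "Deposit") -> list[dict[str, str]]:
    """Keep only rows with a non-empty value in the (assumed) deposit column."""
    return [r for r in parse_table_rows(markdown) if r.get(column, "").strip()]


# Hypothetical transcriptions of a two-page statement.
pages = [
    "# Statement\n\n"
    "| Date | Description | Withdrawal | Deposit |\n"
    "|---|---|---|---|\n"
    "| 01/02 | Salary | | 2,500.00 |\n"
    "| 01/03 | Groceries | 54.10 | |",
    "| Date | Description | Withdrawal | Deposit |\n"
    "|---|---|---|---|\n"
    "| 01/15 | Refund | | 20.00 |",
]

deposits = extract_deposits(combine_pages(pages))
print(json.dumps(deposits, indent=2))
```

A fixed parser like this only works when every page repeats the same table header and layout. Real statements vary widely, which is the reason the workflow delegates this step to a schema-guided model instead.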