Category Archives: Website

Receipts to Data: An n8n & OCR Guide

Do you dread budget day? The stack of receipts on the counter, the cryptic scribbles, the tedious process of entering every last detail into a spreadsheet? I recently took on a project to build a solution to this exact problem: a completely automated workflow that turns a pile of physical receipts into a clean, organized budget spreadsheet. No manual data entry, no messy transcription. Just a fully automated, no-code solution powered by n8n, OCR, and AI. The end result is a working automated budget tracker and a clear takeaway: automation is a powerful accelerator, but it’s a partnership that still requires your guidance. In this post, I’ll walk you through the entire process of building this solution, from scanning the receipts to a final, organized spreadsheet.

The Problem & The Solution

Keeping a budget is a best practice, but maintaining it with physical receipts is anything but efficient. The solution? Build a personal “digital assistant” that automates the entire process. The workflow starts with Receipt Ingestion, where we simply drop a batch of receipt images into a designated Nextcloud folder. From there, Data Extraction takes over, using Azure Computer Vision to perform OCR on each image and pull out the raw, unstructured text. This is followed by Data Interpretation, where a Chat model acts as a brain, analyzing the text to identify the paid amount, store name, and a specific budget category. The structured output is then sent to a spreadsheet for Data Storage & Analysis, where it’s logged and used to run real-time budget statistics. Finally, n8n serves as the Workflow Orchestrator, tying every single step together and ensuring the process runs seamlessly from start to finish.

The n8n workflow

Capturing the Files

The workflow kicks off with a Schedule Trigger node, which starts a new batch process at a set interval, like every hour. This node tells the Nextcloud node to pull a list of all items from our designated receipts folder. The items are then sent to a Filter node, which performs a crucial check to ensure that only valid files—and not folders or already-processed receipts—are passed on. This ensures our workflow is robust and efficient. From there, a Loop node handles the cleaned batch, processing each file individually and starting the action to download the image.
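As a rough illustration, the Filter node’s conditions can be plain n8n expressions. The field name (path) and the idea of moving finished receipts into a processed subfolder are assumptions for this sketch; they depend on what your Nextcloud list operation actually returns and how you choose to mark files as done:

{{ $json.path.toLowerCase().endsWith('.jpg') || $json.path.toLowerCase().endsWith('.png') }}
{{ !$json.path.includes('/processed/') }}

The first condition keeps image files only; the second skips anything that has already been moved out of the inbox folder.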

The OCR

The Azure Computer Vision node is the workhorse of this workflow, performing Optical Character Recognition (OCR) on our images. Its role is to take the digital image of a receipt and extract every piece of printed text it can find. You’ll connect this node directly to the output of the Nextcloud node. In the node’s configuration, you’ll pass the binary data of the receipt image using an expression like {{ $json.data }}. The output is a rich JSON object containing the extracted text, organized into lines with bounding box coordinates. This structured data is the raw material for our next step, where a chat model will analyze it to find the specific values we need.
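To give a sense of what the next step receives, the OCR output looks roughly like this. The snippet is heavily trimmed and the values are made up; the exact structure depends on which Azure endpoint and API version the node calls:

{
  "analyzeResult": {
    "readResults": [
      {
        "lines": [
          { "text": "MY SUPERMARKET", "boundingBox": [58, 42, 412, 42, 412, 80, 58, 80] },
          { "text": "TOTAL 25.45", "boundingBox": [58, 610, 412, 610, 412, 648, 58, 648] }
        ]
      }
    ]
  }
}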

The AI Node

An AI model is needed because OCR output is unstructured. While Azure Computer Vision can extract every word from the receipt, it doesn’t understand the context. It can’t tell the difference between a total amount, a subtotal, a tax line, or a random number in the receipt’s text. An AI model, on the other hand, can be prompted to read the entire text and identify specific pieces of information, such as the store name and the final paid amount, and even categorize the expense.

The prompt you’ll use is:

You are a helpful assistant that identifies key information from receipt text. Take this input, which is a text generation from Azure's Computer Vision of a receipt: ```{{ JSON.stringify($json) }}```. Extract the store name, total amount paid, and guess the category of this receipt. Most receipts will be of Food type (which includes Groceries & Food Receipts). The paid amount should include the tax. The output should be in a JSON format and here is an example: { "store": "My Supermarket", "paid_amount": "25.45", "paid_tax": "1.00", "category": "Food" }
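One practical note: most chat nodes return that JSON as a single block of text, so it helps to parse it into real fields before the spreadsheet step. Here is a minimal sketch for an n8n Code node set to “Run Once for All Items”; the text property holding the reply is an assumption and depends on which chat node you use:

// Parse the model's JSON reply so later nodes can use $json.store, $json.paid_amount, etc.
// "text" is an assumption – adjust it to whatever property your chat node writes the reply to.
return $input.all().map(item => ({ json: JSON.parse(item.json.text) }));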

The Spreadsheet Node

The Google Sheets node is where all our parsed data finds its final home. You’ll connect this node directly after the AI chat model. To get it set up, you authenticate your Google account and specify the spreadsheet and sheet name you want to work with. The node is configured to Append Row, which means it will add a new row of data to the bottom of the sheet for every receipt it processes. Once connected, you’ll map the data from the AI’s JSON output to the correct columns in your spreadsheet. For example, to populate your “Store” column, you’ll enter the expression {{ $json.store }}; for “Paid Amount,” you’ll use {{ $json.paid_amount }}; and so on for “Paid Tax” and “Category.” This split also keeps the budget statistics decoupled from the workflow: n8n only logs the raw data, while the spreadsheet does the heavy lifting with formulas that calculate totals, visualize trends, or perform other statistical analysis. This is where a completion flag column comes into play; it is crucial for the next step because it tells n8n when a specific row has been fully processed and can be moved on from.
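The statistics themselves can live entirely in ordinary spreadsheet formulas. As a small illustration, assuming Paid Amount lands in column C and Category in column E (adjust the letters to your own mapping), a running total for the Food category and an overall total would be:

=SUMIF(E:E, "Food", C:C)
=SUM(C:C)

If the amounts arrive as text rather than numbers, converting them first (for example with VALUE() in a helper column) keeps these formulas working.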

My Future Enhancements

To improve the chat model’s accuracy, you can refine your prompt with specific examples. If the AI struggles with a certain store or receipt type, provide it with an example of the OCR text from that receipt and show it the correct JSON output you expect. This is a form of in-context learning that helps the model perform better. For error handling, it’s wise to add a Conditional node after the AI step. This node can check if the AI’s output contains an “unidentified” value for the store or amount. If it does, you can set up a parallel path that sends the problematic receipt to a different spreadsheet or a manual review folder, so it doesn’t get lost. Finally, you can add notifications to stay informed about your workflow’s status. For example, you can set up a notification that says, “Receipt processed successfully for [Store Name],” or a more urgent message like, “Error: Receipt could not be processed.”
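As a sketch of that error-handling branch, an IF node placed after the chat model can route on a condition like the one below. The literal string 'unidentified' is an assumption here: it only works if your prompt also instructs the model to output exactly that value when it cannot find the store or the amount:

{{ $json.store === 'unidentified' || $json.paid_amount === 'unidentified' }}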

It Works

The end result of our automated workflow is an organized budget spreadsheet. Instead of a messy pile of receipts, you have a clean Google Sheet where each new row reflects a single transaction. Columns for Date, Store, Paid Amount, and Category are all automatically populated.

Building a robust, automated budget tracker was a welcome challenge. I wanted to solve the tedious problem of manual data entry, so I used n8n to build a no-code workflow. The solution I came up with seamlessly connects several services to handle everything from receipt ingestion to final data analysis.

The workflow begins with Nextcloud, which acts as our receipt inbox. I simply drop a batch of receipts into a folder, and the workflow kicks off. I used Azure Computer Vision for the OCR, which extracts the raw, unstructured text from the receipt images. The real brain of the operation is an AI chat model that takes this text and converts it into a clean, structured JSON object with the store name, paid amount, and a budget category. Finally, a Google Sheets node appends this structured data to my budget spreadsheet.

The result is a solution that takes the manual work out of budgeting. The AI handles the tricky part of interpreting the data, and the automation ensures my budget is always up to date. It was a great project that proved to me that the right tools can simplify even the most frustrating tasks. If you have a repetitive chore, I’d highly recommend you give n8n a try.

From Zero to Crossword Generator with Windsurf AI

I recently took on a new project: building a crossword generator. Instead of starting from scratch, I decided to test out Windsurf AI in VS Code. It was an interesting five-hour sprint that gave me some working code and a few key takeaways about AI-assisted development.

My biggest lesson was that you have to break down the problem. Giving the AI a single prompt to “build a crossword generator” was overwhelming and unproductive. The successful approach was to tackle smaller, more specific tasks, like generating the grid or placing a word.

The code it produced was functional, but it wasn’t perfect. I quickly learned that the AI is a great starting point, but it doesn’t always produce performant or elegant code. A developer’s critical eye is still essential. I spent a good amount of time reviewing and refactoring the generated code with the AI to make it more efficient.

Debugging is still a part of my workflow, but it’s a different kind of debugging. I’m now less focused on finding errors from a blank page and more on refining and optimizing the AI’s output. It makes me wonder if there’s a way to feed my debugging process and code improvements back into the AI automatically.

The end result is a functional crossword generator (code on my GitHub) and a clear takeaway: AI is a powerful accelerator, but it’s a partnership that still requires your guidance.

Five Letter Word Six Guesses

Been enjoying this little game over the past week or so and thought I would try my hand at it.

Short little exercise that took 3 days

Check it out here

Been a little busy…

With my new job, I haven’t had much time to add to this website.

So here’s a small Perl script I wrote that sends a daily summary of the weather as an SMS from my iPhone.
There was a need, so I figured why not help out those who need the weather forecast =].

use LWP::Simple;
use JSON qw( decode_json );

# read the timestamp of the last successful run (0 if the file is missing or empty)
my $lastRun = 0;
open (MYFILE, 'weatherSMS_LastRun');
while (<MYFILE>) {
        chomp;
        $lastRun = $_;
}
close (MYFILE);

# if the SMS already went out today, do nothing; launchd fires this hourly,
# so runs that fail (e.g. no network while the phone sleeps) get retried later in the day
my ($S,$M,$H,$d,$m,$Y) = localtime($lastRun);
my ($S2,$M2,$H2,$d2,$m2,$Y2) = localtime(time);
if ( $Y2 == $Y && $m2 == $m  && $d2 == $d ){
	exit 0;
}

# fetch the daily forecast (fill in your own APIKEY, LATITUDE and LONGITUDE);
# if the request fails, decode_json dies here and the last-run marker below is never written
my $json = get("https://api.forecast.io/forecast/APIKEY/LATITUDE,LONGITUDE?units=ca&exclude=currently,minutely,hourly,flags");
my $decoded = decode_json($json);

#ensure we can write the file
open(F,'>weatherSMS_LastRun') || die "Could not open file to mark execution";
print F time;
close(F);

my @timeslot = @{ $decoded->{'daily'}{'data'} };
my $forcast = "";
foreach my $f ( @timeslot ) {
	my ($S,$M,$H,$d,$m,$Y) = localtime($f->{"time"});
	$m = $m + 1;
	$forcast .= "$m/$d: ".sprintf("%.3f",$f->{"temperatureMin"})."-".sprintf("%.3f",$f->{"temperatureMax"})."C ".$f->{"summary"}."\n";
}
#printf($forcast);

`/Applications/biteSMS.app/biteSMS -send -carrier PHONENUMBER "$forcast"`;

In order to execute this automatically, the following launchd plist was used (it fires hourly; the script itself ensures the SMS only goes out once a day):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>Label</key>
	<string>com.joshho.weatherSMS</string>
	<key>LowPriorityIO</key>
	<true/>
	<key>Nice</key>
	<integer>1</integer>
	<key>AbandonProcessGroup</key>
	<true/>
	<key>OnDemand</key>
	<true/>
	<key>ProgramArguments</key>
	<array>
		<string>/usr/local/bin/perl</string>
		<string>/location to perlscript/weatherSMS.pl</string>
	</array>
	<key>StartCalendarInterval</key>
	<dict>
		<key>Minute</key>
		<integer>0</integer>
	</dict>
</dict>
</plist>
Submit it to be executed hourly by running `launchctl load [path to plist]`

Edit
Edits were made to cover the fact that your phone may not always have internet.
(It may be locked/sleeping – which may turn off the wifi?)

Website Maintenance (done)

With the change of my webhost, all my PHP scripts were moved up from PHP 5.2 to a newer version.

Who knew PHP introduced a ton of changes that practically broke a large number of my apps.
Absolutely ridiculous, but nevertheless, an experience.

I would say all but one app is working just fine.
Fortunately, the app in question works 99% of the time.

Oh the woes of maintenance of program code.
Wouldn’t it be nice to write it once and never have to look at it again?

Website Maintenance

Hi all.

This site may be unavailable over the coming days due to a change in my web hosting company.

HTML5 – Bejeweled

Hi all.. haven’t been doing much as I’ve been spending my time looking for jobs.

Anyways, here’s yet another completed game – Bejeweled. This one took almost 4 days and isn’t as awesome as the PopCap one.

The main point of this exercise was to add the ability to click a location on the canvas and have the corresponding (top) object fire an onclick method. The previous version of the engine it’s running on did not support this. As a by-product of adding this, the exercise let me dabble in building a robust solution for onclick() handling and object mapping. I do not think that was fully achieved, as there is no propagation to parent classes.
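For the curious, the core idea is roughly the following. This is a minimal sketch, not the engine’s actual code; the canvas id and the objects array (drawn back to front) are assumptions made for the illustration:

// Illustrative only – not the engine's real API.
var canvas = document.getElementById('game');   // assumed canvas id
var objects = [];   // assumed display list of { x, y, width, height, onclick }, last entry drawn on top

canvas.addEventListener('click', function (e) {
  var rect = canvas.getBoundingClientRect();
  var x = e.clientX - rect.left;   // click position in canvas coordinates
  var y = e.clientY - rect.top;
  // scan from the topmost object down and stop at the first hit
  for (var i = objects.length - 1; i >= 0; i--) {
    var o = objects[i];
    if (x >= o.x && x <= o.x + o.width && y >= o.y && y <= o.y + o.height) {
      if (typeof o.onclick === 'function') o.onclick(x - o.x, y - o.y);
      break;   // no propagation to parent objects – exactly the limitation mentioned above
    }
  }
});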


You can play it here.

I tried to polish it up with the menu and the layout, but graphics unfortunately aren’t my strength.

HTML5 – Tetris

Not much to say about this one… after completing Pacman, I had an itch to do Tetris, an old favourite of mine.

Completed it in 2.5 days…
At this rate, I’m definitely going to need to step up my productivity if I want to make something comparable to Ludum Dare quality.
(2.5 days… as a consolation, that only included about 14 hours of coding.)

You can play it here.

HTML5 – Pacman

I was talking to a colleague the other day about HTML5-Snake, and somehow the conversation drifted to Pacman…

After a weekend of investigating the Pacman (enemy) strategies, I took a break from doing EulerProject questions and began work on it.
It seemed simple enough: the background was pre-made (the sprites were handmade), and the whole thing looked very doable.

It took me 2 days for the base code, +1 for memory/cpu profiling.
It is without sound, but I think it’s quite playable.

You can try it out here.

Reddit key-code distribution tool

I recently received an email from cadenzainteractive.com inquiring about a distribution method for steam keys for their then-upcoming giveaway. At the time, all I had was a key-captcha webpage here that utilized captchas to prevent key farming by bots.

After investigating what existed, I found a Python script – code here – that had successfully distributed 50,000 keys. This script required you to have Python and additional modules installed.

I was then interested to see if I could write something that others could get started with in 1-2 minutes. I ended up whipping up a prototype JavaScript/PHP tool, available here, within a day of that email.

Getting back to cadenzainteractive: using my tool, they successfully delivered 4,000 Steam keys on reddit: http://redd.it/ve7dj.

Try it here

Notes:  The program uses the reddit API to reply to unread messages from reddit users requesting a key.
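The original tool was a JavaScript/PHP page, but the core loop is simple enough to sketch. Below is a rough modern Node.js illustration, not the original code: it assumes you already have an OAuth bearer token for the giveaway account and a list of unclaimed keys, pulls the unread inbox, and answers each requester with one key before marking the message read.

// Rough sketch of the distribution loop (illustrative, not the original tool).
// REDDIT_TOKEN and the keys array are assumptions you must supply yourself.
const TOKEN = process.env.REDDIT_TOKEN;            // OAuth bearer token for the giveaway account
const keys = ['AAAAA-BBBBB-CCCCC'];                 // placeholder list of unclaimed keys
const headers = {
  'Authorization': `bearer ${TOKEN}`,
  'User-Agent': 'key-distributor-sketch/0.1'
};

async function run() {
  // fetch unread private messages
  const inbox = await fetch('https://oauth.reddit.com/message/unread', { headers })
    .then(r => r.json());

  for (const msg of inbox.data.children) {
    const key = keys.shift();
    if (!key) break;                                // out of keys – stop replying
    // reply to the message with a key...
    await fetch('https://oauth.reddit.com/api/comment', {
      method: 'POST',
      headers: { ...headers, 'Content-Type': 'application/x-www-form-urlencoded' },
      body: new URLSearchParams({ thing_id: msg.data.name, text: 'Your key: ' + key })
    });
    // ...and mark it read so the same request is not answered twice
    await fetch('https://oauth.reddit.com/api/read_message', {
      method: 'POST',
      headers: { ...headers, 'Content-Type': 'application/x-www-form-urlencoded' },
      body: new URLSearchParams({ id: msg.data.name })
    });
  }
}

run();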