Extract receipt data with Mindee’s API using NodeJS

 

In this example, we'll build from scratch a NodeJS-based web application that accepts a receipt image or PDF and returns the data extracted from it.

 

The sample code is available on GitHub and on Glitch.  For simplicity, we'll follow along with the Glitch example.

 

 

 

Glitch example

 

In server.js, we first need to load all of our required modules:

 

const express = require('express');
const app = express();
const pug = require('pug');

const path = require('path');
const fs = require('fs');
const axios = require('axios');
const FormData = require('form-data');
// formidable parses the incoming form: it saves the uploaded file and exposes the remaining fields as an object
const formidable = require('formidable');

 

We'll begin by building the website for our application.  We'll run it using Express, with Pug as the page templating tool.  The other modules handle the forms: formidable parses the uploaded form on the server, form-data builds the form we send to Mindee, and axios sends it.

 

Private Token

 

We'll use an environment variable to store the Mindee API token:

 

let receiptToken = process.env.receiptToken;

 

This keeps the token out of the source code, so it stays private.  In Glitch, if you remix this code, you'll have to add your own token to the .env file for the API call to work.

 

 

To create your own private token, visit the Mindee dashboard, create an account, confirm your email, and enter the "Receipt Parsing" section of the dashboard.  Click the "Credentials" tab to create your token:

 

 

Add this value to the .env file.
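
One way to catch a missing token early is to check for it when the server starts, rather than waiting for the first API call to fail. The sketch below assumes a hypothetical helper named requireEnv (it is not part of the sample code) and the receiptToken variable name used above:

```javascript
// Hypothetical helper (not in the sample code): read a required
// environment variable and fail fast with a clear message if it is missing.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(`Missing environment variable "${name}". Add it to your .env file.`);
  }
  // Trim stray whitespace that can sneak in when pasting a token
  return value.trim();
}

// Possible usage in server.js (commented out so this snippet runs without a token set):
// const receiptToken = requireEnv('receiptToken');
```

With a check like this, a remix that forgot the .env entry would stop at startup with a readable error instead of returning a failed API call later.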

 

Webpage

 

The pages we are building here have no CSS, but if they did, the stylesheet would live in the public folder.  In case you are more ambitious, we'll make the public folder available to the browser, and set Pug as the view engine used for rendering the pages.

 

// make all the files in 'public' available
// https://expressjs.com/en/starter/static-files.html
app.use(express.static("public"));
//set up pug as view engine
app.set('view engine','pug');
// https://expressjs.com/en/starter/basic-routing.html

 

The first page is rendered when an initial GET request is made:

 

app.get("/", (request, response) => {
    return response.render('index');
});

 

The index.pug file is a simple page with a form.  Users are invited to add a receipt image and click the Submit button. The source is really basic:

 

head
  link(rel='stylesheet', href='../style.css')
  link(rel='icon', href='icon.ico', type='image/x-icon')
body
  header.header 
    h1 Node.js Receipt parsing

  h2 Please upload an image of your receipt

  form(action="/" method="POST" enctype="multipart/form-data").form
    p
    | Your receipt
    input(type='file', name='imageSource').input
    input(type='submit', value='Submit')

 

This renders the following in your browser:

 

When the user adds their receipt and presses Submit, the image is sent to the server.

 

Receipt image management

 

The form sends the image to the server root via POST.  When the request comes in, formidable reads the form and parses out the files, which are saved in a temporary directory under a temporary name. For simplicity, I rename the temp file by appending the original receipt filename (without its extension), and pass the new path to the makeRequest function, which sends the image to Mindee for parsing:

 

app.post("/", (request, response) => {
  // The request carries a form with an image.
  // formidable reads the form and saves the image to a temporary file.
  let form = new formidable.IncomingForm({ maxFileSize: 2000 * 1024 }); // ~2 MB

  form.parse(request, (err, fields, files) => {
    if (err) {
      console.error('Error', err);
      // Respond instead of throwing: a throw inside this callback would crash the server
      return response.status(400).send('Could not read the uploaded form.');
    }
    // PARSED FORM
    // Note: name, path and type are formidable v1 property names
    // (formidable v2+ renamed them to originalFilename, filepath and mimetype)
    console.log("files data", JSON.stringify(files.imageSource));
    let imageName = path.parse(files.imageSource.name).name;
    let imagePath = files.imageSource.path;
    let imageType = files.imageSource.type;
    let imageSize = files.imageSource.size;

    // FORMIDABLE USES A RANDOM NAME ON UPLOAD.  RENAME
    let newImagePath = imagePath + imageName;
    fs.rename(imagePath, newImagePath, function (err) {
      if (err) {
        console.error('Error', err);
        return response.status(500).send('Could not store the uploaded file.');
      }
      console.log('File uploaded and moved!');
      // FILE IS RENAMED
      // NOW UPLOAD IT TO MINDEE WITH THE MAKEREQUEST FUNCTION
      makeRequest(newImagePath);
    });
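
The rename step above builds the new path by joining the temp path and the filename supplied by the browser. Because that filename is user input, it may be worth stripping directory parts and unusual characters first. The following is a minimal sketch of that idea, assuming a hypothetical helper named buildUploadPath that is not part of the sample code:

```javascript
const path = require('path');

// Hypothetical helper (not in the sample code): derive the renamed path from
// formidable's temp path and the user-supplied file name, keeping a safe base name.
function buildUploadPath(tempPath, originalName) {
  // path.basename drops directory components such as "../../";
  // path.parse(...).name then drops the extension, as the sample code does
  const base = path.parse(path.basename(originalName)).name;
  // Replace anything other than letters, digits, dash, and underscore
  const safe = base.replace(/[^A-Za-z0-9_-]/g, '_') || 'upload';
  return `${tempPath}_${safe}`;
}

// Possible usage inside the form.parse callback:
// let newImagePath = buildUploadPath(files.imageSource.path, files.imageSource.name);
```

The underscore separator is a design choice for readability; the sample code joins the two parts directly.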

 

 

Receipt extraction

 

The makeRequest function takes the image and sends it to Mindee, receives the response, and sends the relevant fields back to the user. The axios library sends the data to Mindee, and we wait for the results.

First we take the image and place it into a new form called 'data' under the key "file".  Then we create a request configuration that points to the Mindee endpoint and includes our token and the 'data' form with the image:

 

    async function makeRequest(newImagePath) {
      let data = new FormData();
      data.append('file', fs.createReadStream(newImagePath));
      const config = {
        method: 'POST',
        url: 'https://api.mindee.net/products/expense_receipts/v2/predict',
        headers: {
          'X-Inferuser-Token': receiptToken,
          ...data.getHeaders()
        },
        data
      };
      // Log only the URL: the full config contains the API token
      console.log("request url", config.url);
      try {
        let apiResponse = await axios(config);
        console.log("api response", apiResponse.data);
        // pull out the data I want to show on the page
        let prediction = apiResponse.data.predictions[0];
        let merchant = prediction.merchant.name;
        let merchantType = prediction.category.value;
        let tax = prediction.taxes.amount;
        let total = prediction.total.amount;

        console.log(merchant, merchantType, tax, total);
        return response.render('receipt', { merchant, tax, merchantType, total });
      } catch (error) {
        console.log(error);
        // Send an error back so the browser is not left waiting
        return response.status(500).send('Receipt parsing failed.');
      }
    }
  }); // end of the form.parse callback
}); // end of the app.post handler
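
For readers who would rather not depend on axios and form-data, roughly the same request can be put together with the fetch, FormData, and Blob globals that ship with Node 18 and later. This is a sketch of an alternative, not the code the Glitch example uses: the helper name buildReceiptRequest is hypothetical, while the URL and header name are the ones shown above.

```javascript
// Sketch: build the request with Node 18+ built-ins instead of axios and form-data.
// The URL and header name are copied from the sample code above.
const MINDEE_URL = 'https://api.mindee.net/products/expense_receipts/v2/predict';

// Hypothetical helper: returns the URL and fetch options for one receipt.
function buildReceiptRequest(token, fileBuffer, fileName) {
  const body = new FormData();
  // The third argument sets the file name sent in the multipart body
  body.append('file', new Blob([fileBuffer]), fileName);
  return {
    url: MINDEE_URL,
    options: {
      method: 'POST',
      // fetch adds the multipart Content-Type header (with its boundary) itself
      headers: { 'X-Inferuser-Token': token },
      body,
    },
  };
}

// Possible usage inside an async function (commented out: it needs a real token and file):
// const file = await fs.promises.readFile(newImagePath);
// const { url, options } = buildReceiptRequest(receiptToken, file, 'receipt.jpg');
// const apiResponse = await fetch(url, options);
// if (!apiResponse.ok) throw new Error(`Mindee returned HTTP ${apiResponse.status}`);
// const data = await apiResponse.json();
```

Unlike axios, fetch does not reject on HTTP error statuses, which is why the usage sketch checks apiResponse.ok explicitly.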

 

 

Parsing the results

 

The response comes back as JSON (for full details, read more here).  In this case, we will extract just four pieces of data: Merchant, Merchant Type, Taxes, and Total.

 

I extract these from the API response into variables, and send them to receipt.pug for rendering to the end user.
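
Not every receipt will yield every field, and reading a property of a missing object throws. The extraction described above can be sketched more defensively with optional chaining. The helper name extractReceiptFields is hypothetical, and the sample object is made up to match the fields this post reads; it is not real Mindee output.

```javascript
// Hypothetical helper (not in the sample code): pull the four display fields
// out of the parsed response body, returning null for anything that is missing.
function extractReceiptFields(apiData) {
  const prediction = apiData?.predictions?.[0];
  return {
    merchant: prediction?.merchant?.name ?? null,
    merchantType: prediction?.category?.value ?? null,
    tax: prediction?.taxes?.amount ?? null,
    total: prediction?.total?.amount ?? null,
  };
}

// Made-up sample shaped like the fields read in this post (not real Mindee output)
const sample = {
  predictions: [
    { merchant: { name: 'Corner Cafe' }, category: { value: 'food' }, total: { amount: 12.5 } },
  ],
};

// tax comes back as null here because the sample has no taxes entry
console.log(extractReceiptFields(sample));
```

The resulting object could be passed straight to response.render('receipt', ...) in place of the four separate variables.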

 

Showing the results

 

The receipt.pug file is again extremely basic, rendering just the four variables on the page, and a link back to the original form.

 

body
  header.header
    p Here's the data extracted:

    p Merchant: #{merchant}
    p Merchant Type: #{merchantType}
    p Tax: #{tax}
    p Total: #{total}
    a(href='/') Parse another receipt!

 

 

Of course, you can remix this Glitch and change the output to better suit your needs.

 

Conclusion

In this post, we've walked through the steps to create a NodeJS application that receives a receipt, uploads it to Mindee, extracts the results, and presents them back to the user.  Give the code a try, and let us know what you think through the chat box at the bottom of the page.