Extract invoice details with Mindee’s Invoice API and NodeJS

 

In this example, we will build from scratch a NodeJS-based web application that accepts an image or PDF of an invoice and returns several data points extracted from it.

 

The sample code is available on GitHub and on Glitch.  For simplicity, we'll follow along in the Glitch example.

 

 

Glitch example

 

In server.js, we first need to load all of our required modules:

 

const express = require('express');
const app = express();
const pug = require('pug');

const path = require('path');
const fs = require("fs");
const axios = require('axios');
let FormData = require('form-data');
//formidable takes the form data and saves the file, and parameterises the fields into JSON
const formidable = require('formidable')

 

We'll begin by building the website for our application, running it on Express with Pug as the page templating tool.  The other modules handle the forms: formidable parses the uploaded form on the server, form-data builds the form we send to Mindee, and axios makes the HTTP request.

 

Private Token

 

We'll use an environment variable to store the Mindee API token:

 

let invoiceToken = process.env.invoiceToken;

 

This keeps the token private and out of the source code.  If you remix this code in Glitch, you'll have to add your own token to the .env file for the API call to work correctly.

 

 

To create your own private token, visit the Mindee dashboard, create an account, confirm your email, and enter the "Invoice API" section of the dashboard.  Click the "Credentials" tab to create your token:

 

 

Add this value to the .env file.
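A forgotten token only shows up later as a failed API call, so it can help to check for it when the server starts. The following is a minimal sketch, not part of the sample: `readInvoiceToken` is a hypothetical helper, and it assumes the variable is named `invoiceToken` as in the line above.

```javascript
// Hypothetical helper (not in the sample): read the Mindee token from an
// environment object and fail early with a clear message if it is missing.
function readInvoiceToken(env = process.env) {
  const token = (env.invoiceToken || '').trim();
  if (!token) {
    throw new Error('invoiceToken is not set; add it to the .env file');
  }
  return token;
}
```

In server.js this could replace the direct lookup with `let invoiceToken = readInvoiceToken();`, so a missing token stops the app at start-up instead of at the first upload.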

 

Webpage

 

The pages we are building here have no CSS, but any stylesheet would be stored in the public folder.  In case you are more ambitious, we'll make the public folder available to the browser, and set Pug as the view engine used for rendering the pages.

 

// make all the files in 'public' available
// https://expressjs.com/en/starter/static-files.html
app.use(express.static("public"));
//set up pug as view engine
app.set('view engine','pug');
// https://expressjs.com/en/starter/basic-routing.html

 

 The first page is rendered when an initial GET request is made:

 

app.get("/", (request, response) => {
    return response.render('index');
});

 

The index.pug file is a simple page with a form, where users can add an invoice image, and click the upload button. Not a lot to see here:

 

head
  link(rel='stylesheet', href='../style.css')
  link(rel='icon', href='icon.ico', type='image/x-icon')
body
  header.header 
   
 
    h1 Node.js Invoice extraction

  h2 please upload an image or PDF of your invoice

  
  form(action="/" method = "POST" enctype="multipart/form-data").form  
    p
    |Your invoice
    input(type='file', name='imageSource').input   
    input(type='submit', value='Submit')

 

This renders the following in your browser:

 

 

When the user adds an invoice and presses Submit, the image is sent up to the server for processing.

 

Invoice image processing on the server

 

The form sends the image to the server root via POST.  When it arrives, formidable reads the form and parses out the files, which are saved in a temporary directory under a temporary name.  For simplicity, I rename the temp file by appending the original invoice's name (without its extension) to the temporary path.  I do this because the filename in the API response matches whatever is uploaded to Mindee, so keeping the original name in the filename makes identification easier.  The file is then sent to the makeRequest function:

 

app.post("/", (request, response) => {
    // this route receives a form with an image
    // formidable reads the form and saves the image
    let form = new formidable.IncomingForm({ maxFileSize: 2000 * 1024 }); // ~2 MB

    form.parse(request, (err, fields, files) => {
        if (err) {
            console.error('Error', err);
            throw err;
        }
        // PARSED FORM
        console.log("files data", JSON.stringify(files.imageSource));
        let imageName = path.parse(files.imageSource.name).name;
        let imagePath = files.imageSource.path;
        let imageType = files.imageSource.type;
        let imageSize = files.imageSource.size;

        // FORMIDABLE USES A RANDOM NAME ON UPLOAD.  RENAME
        let newImagePath = imagePath + imageName;
        fs.rename(imagePath, newImagePath, function (err) {
            if (err) throw err;
            console.log('File uploaded and moved!');
            // FILE IS RENAMED
            // NOW UPLOAD IT TO MINDEE WITH THE MAKEREQUEST FUNCTION
            makeRequest(newImagePath);
        });
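To make the rename step concrete, here is a small sketch of the path it produces. The first function mirrors the string concatenation used above. The second is a hypothetical variant, not in the sample, that also keeps the file extension. The paths and filenames are made up for illustration.

```javascript
const path = require('path');

// Mirrors the approach above: append the original base name
// (without extension) to formidable's temporary path.
function renamedUploadPath(tempPath, originalName) {
  return tempPath + path.parse(originalName).name;
}

// Hypothetical variant: keep the extension too. path.basename drops any
// directory parts a client might have put in the filename.
function renamedUploadPathWithExt(tempPath, originalName) {
  return tempPath + '-' + path.basename(originalName);
}

// Made-up example values
console.log(renamedUploadPath('/tmp/upload_1a2b', 'invoice-42.pdf'));
console.log(renamedUploadPathWithExt('/tmp/upload_1a2b', 'invoice-42.pdf'));
```

Either way the random temporary name stays in the path, so two uploads with the same original filename do not overwrite each other.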
	

 

Invoice data extraction

 

The makeRequest function takes the image, sends it to Mindee, receives the response, and sends the relevant fields back to the user.  The axios library sends the data to Mindee, and we wait for the results.

First we take the image and place it into a new form called 'data' under the key "file".  Then we create a request configuration that points to the Mindee endpoint and carries our token and the 'data' form with the image:

 

async function makeRequest(newImagePath) {
    let data = new FormData();
    data.append('file', fs.createReadStream(newImagePath));

    // console.log("form data ", data);

    const config = {
        method: 'POST',
        url: 'https://api.mindee.net/products/invoices/v1/predict',
        headers: {
            'X-Inferuser-Token': invoiceToken,
            ...data.getHeaders()
        },
        data
    };
    console.log("config", config);
    try {
        let apiResponse = await axios(config);
        console.log(" api response", apiResponse.data);
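The snippet above opens a try block around the axios call. As a sketch of how a matching catch block might classify failures (this is an assumption, not the sample's actual error handling), the hypothetical helper `describeRequestError` below relies on the `response`, `request` and `message` properties that axios sets on its errors. The status codes chosen for the fallbacks are illustrative.

```javascript
// Hypothetical helper (not in the sample): turn an axios error into a
// status code and message that the route can send back to the browser.
function describeRequestError(err) {
  if (err.response) {
    // The API answered with a non-2xx status, e.g. a rejected token.
    return { status: err.response.status, message: 'The API rejected the request' };
  }
  if (err.request) {
    // The request was sent but no response arrived.
    return { status: 502, message: 'No response from the API' };
  }
  // Something failed before the request was sent.
  return { status: 500, message: err.message };
}

// Possible use inside the catch block of makeRequest:
//   const { status, message } = describeRequestError(err);
//   return response.status(status).send(message);
```

Sending a short message back, rather than letting the promise reject unhandled, means the user sees that the upload failed instead of waiting on a request that never completes.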
				  
			

 

 

Results

 

The response comes back as JSON containing the data that has been extracted from the invoice (for full details, [read more here](https://mindee.com/documentation/apis/invoice-parsing)).  We'll extract the invoice number, date, company name, currency, and the totals before and after taxes.

 

I extract these from the API response into variables, and send them to invoice.pug for rendering to the end user.

 

// pull out the data I want to show on the page
let currency = apiResponse.data.predictions[0].locale.currency;
let invoice = apiResponse.data.predictions[0].invoice_number.value;
let merchant = apiResponse.data.predictions[0].supplier.value;
let date = apiResponse.data.predictions[0].invoice_date.iso;
let beforeTax = apiResponse.data.predictions[0].total_excl.amount;
let total = apiResponse.data.predictions[0].total_incl.amount;

console.log(invoice, currency, merchant, date, beforeTax, total);
return response.render('invoice', { invoice, currency, merchant, date, beforeTax, total });
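The property chains above throw a TypeError if any intermediate object is missing from the response. A more defensive version might look like the sketch below, which assumes Node 14 or later for optional chaining. `extractInvoiceFields` is a hypothetical helper, and the sample object is made up to match the paths used above; it is not real API output.

```javascript
// Hypothetical helper (not in the sample): pull the six fields out of the
// API response body, falling back to null when a field is absent.
function extractInvoiceFields(body) {
  const prediction = body?.predictions?.[0] ?? {};
  return {
    currency: prediction.locale?.currency ?? null,
    invoice: prediction.invoice_number?.value ?? null,
    merchant: prediction.supplier?.value ?? null,
    date: prediction.invoice_date?.iso ?? null,
    beforeTax: prediction.total_excl?.amount ?? null,
    total: prediction.total_incl?.amount ?? null,
  };
}

// Made-up sample shaped like the paths used above (not real API output)
const sampleBody = {
  predictions: [{
    locale: { currency: 'USD' },
    invoice_number: { value: 'INV-001' },
    supplier: { value: 'Acme' },
    invoice_date: { iso: '2020-01-15' },
    total_excl: { amount: 100 },
    total_incl: { amount: 120 },
  }],
};
console.log(extractInvoiceFields(sampleBody));
```

The route could then call `response.render('invoice', extractInvoiceFields(apiResponse.data))`, since the returned keys match the names the template expects.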
				

 

 

Displaying the data

 

The invoice.pug file renders just the six results we extracted and a link back to the original form.

 

body
  header.header 
    
    
    
    p Here's the data extracted:

    p Merchant: #{merchant}
    p Invoice number: #{invoice}
    p Date: #{date}
    p Currency: #{currency}
    p Total before tax: #{beforeTax}
    p Total with tax: #{total}
    a(href='/') Parse another invoice!
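The template prints the totals as raw numbers next to a separate currency code. If you prefer a single formatted value, one option is to format the amounts on the server before rendering. This is a sketch only: `formatAmount` is a hypothetical helper, and the 'en-US' locale is an assumption you may want to change.

```javascript
// Hypothetical helper (not in the sample): format an amount with its
// currency code using the built-in Intl API.
function formatAmount(amount, currency) {
  if (typeof amount !== 'number') return 'n/a';
  try {
    return new Intl.NumberFormat('en-US', { style: 'currency', currency }).format(amount);
  } catch (err) {
    // Missing or malformed currency code: fall back to the bare number.
    return String(amount);
  }
}

console.log(formatAmount(120, 'USD')); // prints "$120.00"
```

The formatted strings can be passed to `response.render` in place of `beforeTax` and `total`, leaving the template unchanged.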

 

 

 

Of course, you can remix this Glitch and change the output to better suit your needs.

 

 

Conclusion

In this post, we've walked through the steps to create a NodeJS application that receives an invoice, uploads it to Mindee, extracts the results, and presents them back to the user.  Give the code a try, and let us know what you think through the chat box at the bottom of the page.