Extract receipt data with Mindee’s API using Python

Jonathan Grandperrin

September 10, 2020

In this tutorial, you will learn how to parse a receipt image using Python. This lets you extract data from receipts automatically, without human intervention, and use it in your backend applications for accounting purposes.

The final script can be used within any Python middleware application or REST API serving your frontend interfaces.

At the end of this tutorial, we’ll show you how to highlight the extracted receipt fields on the image using the open source computer vision library OpenCV.

The final result looks like this:

API Prerequisites

You need to have Python installed in your environment. We’re using version 3.7 in this tutorial.
You’ll need a free Mindee account. Sign up and confirm your email to log in.
A receipt. Look in your bag/wallet for a recent one, or do a Google Image search for a receipt and download a few to test with.

Set up the project

Create an empty directory anywhere on your laptop (we’ll call ours “python_receipt”) and create a “main.py” file in it.

Open up a terminal and go to the directory you just created. In this tutorial we’ll use the Python library “requests” to call the API. If it’s not installed in your environment, you can install it by running:

pip install requests

Make sure you have a receipt image somewhere on your laptop and that you know its path.

That’s it for the setup. Let’s call the API.

Call the Mindee receipt API

If you don’t have an API token yet for this API, go to the “credentials” section and create a new API Key.

Click on the documentation section in the navbar, and then click on the Python link on the sample code area. Copy the code.

Open the main.py file you created a few minutes ago and paste your sample code. It should look like this:

import requests
url = "https://api.mindee.net/products/expense_receipts/v2/predict" 
with open("/path/to/my/file", "rb") as myfile: 
    files = {"file": myfile} 
    headers = {"X-Inferuser-Token": "my-token-here"} 
    response = requests.post(url, files=files, headers=headers) 
    print(response.text)

Replace the “/path/to/my/file” placeholder in the code with the path of the receipt image you want to extract data from.

Replace “my-token-here” with the API key you created earlier on the platform.
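Hard-coding the token is fine for a quick test, but you may prefer to keep it out of the script. Here is a minimal sketch, assuming you store the key in an environment variable that we call MINDEE_API_TOKEN (the variable name and both helpers are our own choices for this example, not something defined by the Mindee platform):

```python
import os


def load_api_token(env=None, name="MINDEE_API_TOKEN"):
    """Return the API token stored in an environment variable.

    MINDEE_API_TOKEN is a hypothetical name chosen for this sketch.
    """
    env = os.environ if env is None else env
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {name} environment variable first")
    return token


def build_headers(token):
    # Same header name as in the sample code above.
    return {"X-Inferuser-Token": token}
```

In main.py, you could then write headers = build_headers(load_api_token()) instead of pasting the key into the file.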

Go back to your console and run the script:

python main.py

You should see a JSON object in your console with the results extracted by the receipt parsing API. Now let’s parse this JSON to get the data we need.
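The raw response can be hard to read when it is printed on a single line. As a small optional sketch, the standard json module can re-indent it (the helper name and the sample string below are made up for illustration and are much shorter than a real response):

```python
import json


def pretty_json(raw_text):
    """Re-indent a JSON string so nested fields are easier to scan."""
    return json.dumps(json.loads(raw_text), indent=2, sort_keys=True)


# In main.py you could call: print(pretty_json(response.text))
sample = '{"predictions": [{"total": {"amount": 12.5}}]}'  # made-up sample
print(pretty_json(sample))
```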

Parse the result

The API extracts a list of different fields from the receipt (total amount, taxes, date…). You can find a description of all of the fields in the “documentation” tab on the Mindee API platform page.

Now that we have coded the API request, we’ll use the JSON object to get a few extracted features.

First, let’s create a parsing function that returns the receipt total amount, the date and time of the payment, and the expense category. We’ll focus only on those four features for the tutorial.

The features are inside the predictions object in the JSON response. Because you can send a multi-page PDF to the API, the predictions object is an array, with one prediction object for each page.

In our case, as we are sending a single JPG image, the predictions array contains only one object.

def get_features(json_response):
    parsed_data = {}    
    prediction = json_response["predictions"][0]    
    parsed_data["total_amount"] = prediction["total"]["amount"]
    parsed_data["time"] = prediction["time"]["iso"]
    parsed_data["date"] = prediction["date"]["iso"]
    parsed_data["category"] = prediction["category"]["value"]
    return parsed_data
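If you later send a multi-page PDF, the same idea extends to every entry of the predictions array. The sketch below is one possible variant, assuming each page has the same shape as the single-page case. We do not know exactly how the API reports a field it could not read, so the missing-key handling is an assumption, and the helper names and sample response are made up for illustration:

```python
def get_nested(data, *keys, default=None):
    """Walk nested dicts, returning default when a key is missing."""
    for key in keys:
        if not isinstance(data, dict) or key not in data:
            return default
        data = data[key]
    return data


def get_features_per_page(json_response):
    """Return one dict of features for every page in the response."""
    pages = []
    for prediction in json_response.get("predictions", []):
        pages.append({
            "total_amount": get_nested(prediction, "total", "amount"),
            "time": get_nested(prediction, "time", "iso"),
            "date": get_nested(prediction, "date", "iso"),
            "category": get_nested(prediction, "category", "value"),
        })
    return pages


# Made-up two-page response; the second page has no "time" entry.
sample = {"predictions": [
    {"total": {"amount": 12.5}, "time": {"iso": "14:30"},
     "date": {"iso": "2020-09-10"}, "category": {"value": "food"}},
    {"total": {"amount": 3.0}, "date": {"iso": "2020-09-11"},
     "category": {"value": "transport"}},
]}
print(get_features_per_page(sample))
```

Returning None for a missing field keeps the script running, at the cost of having to check for None before using a value.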

Now that our function is ready, we’ll just print the result in the console and add a simple check to ensure the API call succeeded and we can access the result. You can replace the contents of main.py with this:

import requests

def get_features(json_response):
    parsed_data = {}
    prediction = json_response["predictions"][0]
    parsed_data["total_amount"] = prediction["total"]["amount"]
    parsed_data["time"] = prediction["time"]["iso"]
    parsed_data["date"] = prediction["date"]["iso"]
    parsed_data["category"] = prediction["category"]["value"]
    return parsed_data

url = "https://api.mindee.net/products/expense_receipts/v2/predict"

with open("/path/to/my/file", "rb") as myfile:
    files = {"file": myfile}
    headers = {"X-Inferuser-Token": "my-token-here"}
    response = requests.post(url, files=files, headers=headers)

    if response.status_code != 200:
        print("Request error")
    else:
        json_response = response.json()
        features = get_features(json_response)
        print("Date:", features["date"])
        print("Time:", features["time"])
        print("Total amount:", features["total_amount"])
        print("Category:", features["category"])

Run the script and check the results against your image.
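For accounting purposes, you may want a single timestamp rather than two separate strings. Here is a minimal sketch, assuming the “iso” values are formatted as YYYY-MM-DD for the date and HH:MM for the time (the key name suggests this, but check it against your own responses; the helper name is ours):

```python
from datetime import date, datetime, time


def to_datetime(date_iso, time_iso=None):
    """Combine ISO date and time strings into one datetime.

    Returns None when the date is missing; a missing time defaults
    to midnight.
    """
    if not date_iso:
        return None
    day = date.fromisoformat(date_iso)
    clock = time.fromisoformat(time_iso) if time_iso else time.min
    return datetime.combine(day, clock)


# Made-up values in the assumed formats.
print(to_datetime("2020-09-10", "14:30"))
```

Both fromisoformat methods are available from Python 3.7, the version used in this tutorial.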

Finally, we are going to add a last piece of code that highlights the extracted features on the image. This can help you or your users validate the data extraction very quickly.

Highlight features on the image

Let’s try to highlight the features as if someone did it with a pen.

First, you’ll need to install the computer vision Python library OpenCV if it isn’t already installed in your environment. To do so, run:

pip install opencv-python

We need to change our get_features function slightly so that it also returns the coordinates of each feature, which tell us where to draw the rectangles.

def get_features(json_response):
    parsed_data = {}
    coordinates = []
    prediction = json_response["predictions"][0]
    parsed_data["total_amount"] = prediction["total"]["amount"]
    parsed_data["time"] = prediction["time"]["iso"]
    parsed_data["date"] = prediction["date"]["iso"]
    parsed_data["category"] = prediction["category"]["value"]
    coordinates.append(prediction["total"]["segmentation"]["bounding_box"])
    coordinates.append(prediction["time"]["segmentation"]["bounding_box"])
    coordinates.append(prediction["date"]["segmentation"]["bounding_box"])
    return parsed_data, coordinates

Once that’s done, we are going to create our highlighter function in several steps, as follows:

Open the image from path
Create a mask image
Loop on each feature coordinates and draw the feature rectangle on our mask
Overlay the mask and original image with alpha
Display image to the user

Note: each coordinate returned by the API is relative to the image size (a fraction of its width or height), so the code multiplies it by the image width or height to convert it to absolute pixel coordinates.
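The conversion mentioned in the note can be sketched on its own. This example assumes a bounding box is a list of four [x, y] points with values between 0 and 1, where the first and third points are opposite corners, as the indexing used in this tutorial implies. The helper names and the sample box are ours:

```python
def to_pixels(point, width, height):
    """Convert a relative (x, y) point into integer pixel coordinates."""
    x, y = point
    return int(width * x), int(height * y)


def box_corners(bounding_box, width, height):
    """Return two opposite pixel corners of a relative bounding box.

    Assumes four [x, y] points with the first and third being
    opposite corners.
    """
    return (to_pixels(bounding_box[0], width, height),
            to_pixels(bounding_box[2], width, height))


# Made-up box on an image 400 pixels wide and 800 pixels high.
box = [[0.25, 0.25], [0.75, 0.25], [0.75, 0.5], [0.25, 0.5]]
print(box_corners(box, 400, 800))  # ((100, 200), (300, 400))
```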

Here is the code step by step:

def highlight_features(img_path, coordinates):
    # step 1: Open the image from path
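    cv_image = cv2.imread(img_path)
    if cv_image is None:
        # cv2.imread returns None instead of raising when the file cannot be read
        raise FileNotFoundError(f"Could not read image: {img_path}")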

    # step 2: create mask image
    overlay = cv_image.copy()
    h, w = cv_image.shape[:2]

    # step 3: Loop on each feature coordinates and draw the feature rectangle on our mask
    for coord in coordinates:
        pt1 = (int(w*coord[0][0]), int(h*coord[0][1]))
        pt2 = (int(w*coord[2][0]), int(h*coord[2][1]))
        cv2.rectangle(overlay, pt1, pt2, (70, 230, 244), cv2.FILLED)

    # step 4: Overlay the mask and original image with alpha
    final_image = cv2.addWeighted(overlay, 0.5, cv_image, 0.5, 0)

    # step 5: Display image to the user
    cv2.imshow("highlighted_image", cv2.resize(final_image, (400, int(400*h/w))))
    cv2.waitKey(0)
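For intuition about step 4, cv2.addWeighted computes a weighted sum of the two images, pixel by pixel, plus a constant. The pure-Python sketch below reproduces that arithmetic for a single pixel. The helper name is ours, and rounding details may differ slightly from OpenCV’s implementation:

```python
def blend_pixel(overlay_px, image_px, alpha=0.5, beta=0.5, gamma=0):
    """Weighted sum of two pixels, clamped to the 0-255 range."""
    blended = []
    for o, i in zip(overlay_px, image_px):
        value = round(o * alpha + i * beta + gamma)
        blended.append(max(0, min(255, value)))
    return tuple(blended)


# Highlighter colour from the tutorial blended over a mid-grey pixel.
print(blend_pixel((70, 230, 244), (100, 100, 100)))  # (85, 165, 172)
```

With both weights at 0.5, pixels outside the rectangles are unchanged (the overlay is a copy of the image there), while pixels inside them end up halfway between the original colour and the highlighter colour.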

Finally, we just have to modify the code slightly to call the highlighting function and print the results. Here is what the final code looks like:

import requests
import cv2

def get_features(json_response):
    parsed_data = {}
    coordinates = []
    prediction = json_response["predictions"][0]
    parsed_data["total_amount"] = prediction["total"]["amount"]
    parsed_data["time"] = prediction["time"]["iso"]
    parsed_data["date"] = prediction["date"]["iso"]
    parsed_data["category"] = prediction["category"]["value"]
    coordinates.append(prediction["total"]["segmentation"]["bounding_box"])
    coordinates.append(prediction["time"]["segmentation"]["bounding_box"])
    coordinates.append(prediction["date"]["segmentation"]["bounding_box"])
    return parsed_data, coordinates

def highlight_features(img_path, coordinates):
    # step 1: Open the image from path
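    cv_image = cv2.imread(img_path)
    if cv_image is None:
        # cv2.imread returns None instead of raising when the file cannot be read
        raise FileNotFoundError(f"Could not read image: {img_path}")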

    # step 2: create mask image
    overlay = cv_image.copy()
    h, w = cv_image.shape[:2]

    # step 3: Loop on each feature coordinates and draw the feature rectangle on our mask
    for coord in coordinates:
        pt1 = (int(w*coord[0][0]), int(h*coord[0][1]))
        pt2 = (int(w*coord[2][0]), int(h*coord[2][1]))
        cv2.rectangle(overlay, pt1, pt2, (70, 230, 244), cv2.FILLED)

    # step 4: Overlay the mask and original image with alpha
    final_image = cv2.addWeighted(overlay, 0.5, cv_image, 0.5, 0)

    # step 5: Display image to the user
    cv2.imshow("highlighted_image", cv2.resize(final_image, (400, int(400*h/w))))
    cv2.waitKey(0)

url = "https://api.mindee.net/products/expense_receipts/v2/predict"
with open("/path/to/my/file", "rb") as myfile:
    files = {"file": myfile}
    headers = {"X-Inferuser-Token": "my-token-here"}
    response = requests.post(url, files=files, headers=headers)

    if response.status_code != 200:
        print("Request error")
    else:
        json_response = response.json()
        features, coords = get_features(json_response)
        print("Date:", features["date"])
        print("Time:", features["time"])
        print("Total amount:", features["total_amount"])
        print("Category:", features["category"])
        highlight_features("/path/to/my/file", coords)

In the last line of code, don’t forget to replace the file path placeholder with the path to your image, just as you did in the open(…) call.

And the final result!

Conclusion

In just over 1 second, a receipt was uploaded, the API extracted the receipt data, and the result was parsed and highlighted on the image. Cool, right?

If you want to use this kind of script to display results to your users, I’d advise you to do the highlighting in the front-end application, as sending images back from your middleware is not the best option because of payload sizes. The other solution would be to store the final image using cv2.imwrite(…), but this would make your client download the result. To help you with client-side rendering of your images, you can check out our open source ReactJS SDK.
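To illustrate the front-end approach, your middleware could return only the extracted values and the relative coordinates, which are far smaller than an image. This is a minimal sketch: the payload layout and the helper name are our own assumptions, not a format expected by the ReactJS SDK.

```python
import json


def build_frontend_payload(features, coordinates):
    """Bundle extracted values and relative boxes into a JSON string."""
    return json.dumps({"features": features, "boxes": coordinates})


# Made-up values shaped like the output of get_features.
features = {"total_amount": 12.5, "date": "2020-09-10"}
coords = [[[0.25, 0.25], [0.75, 0.25], [0.75, 0.5], [0.25, 0.5]]]
payload = build_frontend_payload(features, coords)
print(len(payload), "bytes")
```

Because the coordinates are relative, the client can scale them to whatever size it displays the image at.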

If you have questions, please reach out to us!
