Invoice OCR

Instant key information extraction for any Latin-alphabet invoice

Transform invoices into structured data in real time with our Invoice OCR API

Accurate

No setup or retraining required. Benefit from our battle-tested OCR trained on millions of invoices from all around the world.

Seamless integration

The full extraction process is performed without any humans in the loop, allowing you to offer a real-time experience with a maximum level of data privacy.

Instant

Our API works synchronously, with an average processing time of 1.3 seconds per page for PDFs and 0.9 seconds for images.

What is an invoice OCR?

Invoice OCR stands for Invoice Optical Character Recognition. Invoice OCRs are software technologies that transform unstructured invoices, such as PDFs or images, into structured data. These technologies are mainly used to automate invoice scanning and reduce the need for manual data entry.

Optical character recognition (OCR)

OCR refers to technologies capable of detecting and reading text from images or documents in order to transform it into a machine-readable format. More details in our blog.

Invoice OCR vs OCR

The acronym OCR commonly refers to the generic problem of text detection and recognition. When it is associated with a type of document, as in Invoice OCR, the meaning changes slightly: it refers to technologies that perform key information extraction rather than generic text extraction. This naming is common for any document type, for example Receipt OCR or Passport OCR.

Invoice scanning

Invoice scanning refers to the whole process of collecting and processing invoices using software. Sometimes the processing step includes an Invoice OCR that extracts data from the unstructured invoice; the extracted data may or may not then be validated by a human.

Businesses using invoice OCR

Automated invoice parsing is becoming essential in many industries, particularly financial services. Invoice OCRs tend to reduce errors and shorten invoice processing time in accounting, accounts payable or receivable, procurement and, more generally, in any workflow involving the payment, validation, or analysis of invoices.

Move past traditional Invoice OCR language limitations with our global computer vision approach

Like humans, our algorithms don’t need to read all of a document's text in its original language to extract the relevant information.

Extracted invoice data

Transform any scanned, photographed, or native PDF invoice into usable data in your software

Customer and Supplier information

Three fields are extracted for both the customer and the supplier:

- Name
- Address
- Company Taxpayer ID (TIN, VAT NUMBER, SIRET... See full list here)

Amounts

Each amount is returned in the currency of the invoice.

- Total including taxes
- Total excluding taxes
- Tax breakdown: each tax object includes the tax amount and, when applicable, the tax rate
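
As an illustration of how these amounts relate, the sketch below checks that the total excluding taxes plus the sum of the tax amounts matches the total including taxes. It is a minimal sanity check you could run in your own software, not part of the API; the function name, parameter names, and the rounding tolerance are assumptions made for this example.

```python
from decimal import Decimal

def totals_are_consistent(total_excl, total_incl, tax_amounts,
                          tolerance=Decimal("0.01")):
    """Return True when total_excl + sum(tax_amounts) matches total_incl.

    All names and the default tolerance are hypothetical; adapt them to
    the fields you actually receive.
    """
    expected = total_excl + sum(tax_amounts, Decimal("0"))
    return abs(expected - total_incl) <= tolerance

# 100.00 excluding taxes plus a single 20.00 tax should give 120.00.
print(totals_are_consistent(Decimal("100.00"), Decimal("120.00"),
                            [Decimal("20.00")]))
```

A failed check does not say which field is wrong, only that the three values disagree, so it is best used to flag an invoice for human review.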

Invoice identifiers

- Invoice number
- Purchase Order (PO) number - coming soon

Geographic information

- Currency as an ISO 4217 code (USD, EUR...)
- Language as an ISO 639-1 code (en, es...)

Dates

Each date is returned in ISO 8601 format (YYYY-MM-DD).

- Invoice date
- Due date (either read directly from the invoice or computed from the invoice date and the payment terms)
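
The due date logic described above can be sketched as follows. This is a simplified illustration, not the actual implementation: it assumes the payment terms have already been reduced to a number of days (for example 30 for "net 30"), and the function and parameter names are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

def resolve_due_date(stated_due_date: Optional[str],
                     invoice_date: Optional[str],
                     net_days: Optional[int]) -> Optional[str]:
    """Return a due date as YYYY-MM-DD, or None if it cannot be determined.

    Prefers a due date printed on the invoice; otherwise falls back to
    invoice date + payment terms (assumed here to be a number of days).
    """
    if stated_due_date:
        # Parsing and re-formatting validates the YYYY-MM-DD input.
        return date.fromisoformat(stated_due_date).isoformat()
    if invoice_date and net_days is not None:
        issued = date.fromisoformat(invoice_date)
        return (issued + timedelta(days=net_days)).isoformat()
    return None

print(resolve_due_date(None, "2024-01-15", 30))          # 2024-02-14
print(resolve_due_date("2024-03-01", "2024-01-15", 30))  # 2024-03-01
```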

Payment details

Supplier payment details are extracted as a list of objects that include the information needed to make a payment:

- Routing number
- Account number
- IBAN
- BIC
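
To show how the fields listed in this section could fit together in your own code, here is a minimal sketch of a data model. The class and attribute names are hypothetical and chosen for this example only; they are not the schema of the JSON returned by the API, so check the API documentation for the real field names.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Party:
    """Customer or supplier details (hypothetical field names)."""
    name: Optional[str] = None
    address: Optional[str] = None
    taxpayer_id: Optional[str] = None  # TIN, VAT number, SIRET...

@dataclass
class Tax:
    amount: float
    rate: Optional[float] = None  # only set when a rate is applicable

@dataclass
class PaymentDetails:
    routing_number: Optional[str] = None
    account_number: Optional[str] = None
    iban: Optional[str] = None
    bic: Optional[str] = None

@dataclass
class Invoice:
    customer: Party
    supplier: Party
    total_incl: Optional[float] = None
    total_excl: Optional[float] = None
    taxes: List[Tax] = field(default_factory=list)
    invoice_number: Optional[str] = None
    currency: Optional[str] = None      # e.g. "EUR"
    language: Optional[str] = None      # e.g. "en"
    invoice_date: Optional[str] = None  # YYYY-MM-DD
    due_date: Optional[str] = None      # YYYY-MM-DD
    payment_details: List[PaymentDetails] = field(default_factory=list)

# Illustrative values only, not real extraction output.
invoice = Invoice(
    customer=Party(name="Example Customer"),
    supplier=Party(name="Example Supplier"),
    total_excl=100.0,
    total_incl=120.0,
    taxes=[Tax(amount=20.0, rate=20.0)],
    invoice_number="INV-001",
    currency="EUR",
    language="en",
    invoice_date="2024-01-15",
    due_date="2024-02-14",
    payment_details=[PaymentDetails(iban="GB82WEST12345698765432")],
)
print(invoice.supplier.name, invoice.total_incl, invoice.currency)
```

Every field is optional in this sketch because an invoice may simply not contain it, for example an invoice without a due date or without bank details.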

Frequently asked questions about Invoice OCR

Choosing the right Invoice OCR technology for your application can be a demanding task. In most use cases, criteria such as extraction accuracy, precision, response time, integration time, pricing, and scalability should be taken into account in order to maximize the added value in your software. Feel free to contact us if you don't find the answers to your questions below.

📄 How can I test the invoice OCR API?

The invoice OCR API is available to any user with an account on our platform and includes a free plan.

To test our APIs, you only have to create a free account using this link. You'll then be able to upload invoices in our user interface to see the invoice OCR in action, as well as the JSON output. A demo page is also available here.

💸 Is Mindee's invoice OCR API free to use?

A free plan is available to everyone and lets you process 250 pages per month for free. No credit card is required.

Above 250 pages per month, the price per invoice page processed starts at $0.10 and can decrease to $0.01 per page depending on the monthly volume. See the pricing page for more information.
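
For a rough budget estimate, the pricing above can be sketched as a small calculation. This is a simplified illustration under two assumptions that are not stated on this page: that only the pages beyond the 250 free pages are billed, and that a single per-page rate applies to all billable pages. The actual volume tiers are on the pricing page, so the rate is passed in as a parameter rather than guessed.

```python
from decimal import Decimal

FREE_PAGES_PER_MONTH = 250  # free plan allowance quoted above

def estimate_monthly_cost(pages: int, price_per_page: Decimal) -> Decimal:
    """Estimate a monthly bill in dollars.

    Assumption: only pages beyond the free allowance are billed, all at
    the same per-page rate. Check the pricing page for the real tiers.
    """
    billable_pages = max(0, pages - FREE_PAGES_PER_MONTH)
    return billable_pages * price_per_page

# 1,000 pages at the $0.10 starting rate: 750 billable pages.
print(estimate_monthly_cost(1000, Decimal("0.10")))  # 75.00
```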

🗺️ What are the supported countries?

Our invoice recognition API is based on our computer vision technology that doesn't rely on text to extract the invoice data, but only on the image. This removes language limitations.

The OCR was trained with invoices from over 50 countries, ensuring that you can extract data from your invoices regardless of where they were created.

🕔 How complicated is it to integrate the API?

Mindee's API follows HTTP standards in order to allow any developer to integrate the invoice OCR API into their applications easily.

We also offer a set of client libraries in all the main back-end languages, and an open-source UI toolkit that helps create front-end features. You can check out our open-source repository or our API documentation for more details.
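
As a rough idea of what a plain HTTP integration can look like without a client library, the sketch below builds a multipart file upload request using only the Python standard library. The endpoint URL, the "document" form field name, and the "Token" authorization scheme are placeholders assumed for this example; take the real values from the API documentation.

```python
import urllib.request
import uuid

# Hypothetical endpoint: replace with the URL from the API documentation.
API_URL = "https://api.example.com/invoices/predict"

def build_invoice_request(file_name: str, file_bytes: bytes,
                          api_key: str) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        # "document" is an assumed form field name.
        (f'Content-Disposition: form-data; name="document"; '
         f'filename="{file_name}"\r\n').encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        file_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "Authorization": f"Token {api_key}",  # assumed auth scheme
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers,
                                  method="POST")

request = build_invoice_request("invoice.pdf", b"%PDF-1.4 example bytes",
                                "my-api-key")
print(request.get_method(), request.full_url)
```

Sending the request is then a matter of passing it to `urllib.request.urlopen` and decoding the JSON response body. In practice the client libraries mentioned above handle these details for you.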

📈️ What is the OCR accuracy?

Our invoice OCR's accuracy is above 90%, with precision above 95% for most of the fields. These figures are computed on a dataset of invoices from more than 50 countries.

Testing our OCR API is free; all you need is an account. Feel free to drag and drop invoices into the live interface to see how the OCR performs on your data.

⚡ What's the average API response time?

The processing time is around 1.3 seconds per page for PDFs and 0.9 seconds for an invoice image.

We regularly improve this processing time, and our target is below 500 ms. Our goal is to make sure you can create real-time user experiences in your application.

🎯️ Does the OCR work on low-quality images?

Yes. We trained our Invoice OCR on invoices with a large number of different layouts, including those with the most complex formatting.

We also use data augmentation so that blur or ink stains don't prevent our system from reading the data, as long as the data remains legible.

🤓️ Do you offer technical support?

We have a Slack community where you can ask your questions and chat with our team.

We don't carry out the integration in your infrastructure ourselves, but we can set up a custom level of support on a case-by-case basis if needed.

Try out our Invoice OCR API

Try out our products for free. No commitment or credit card required. If you want a custom plan or have questions, we’d be happy to chat.
