Getting started
Get started with invoice parsing in Python.
1. Install the Mindee Python helper library
Install from PyPI using pip, a package manager for Python.
pip install mindee
Don't have pip installed? Install it by running this from the command line:
$ curl https://bootstrap.pypa.io/get-pip.py | python
Getting started with the Mindee API takes one step: create a Client and you're ready to go.
2. Instantiate your Client
The mindee.Client needs your API credentials. You can either pass them directly to the constructor (see the code below) or set them via environment variables.
Each endpoint has its own auth token, so pass the token that matches the type of document you want to parse.
from mindee import Client

mindee_client = Client(
    invoice_token="your_invoices_api_token_here",
    raise_on_error=True
)
- invoice_token: (string) API key for the invoices endpoint
- raise_on_error: (boolean, default True) Whether to raise an Exception when an HTTP error occurs
We suggest storing your credentials as environment variables. Why? You'll never have to worry about committing your credentials and accidentally posting them somewhere public.
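As a minimal sketch of that suggestion, the token can be read from the environment and passed to the constructor explicitly. The variable name MINDEE_INVOICE_TOKEN and the helpers load_invoice_token and build_client are illustrative assumptions, not names defined by the library; check the library documentation for the variables it reads on its own.

```python
import os


def load_invoice_token(var_name="MINDEE_INVOICE_TOKEN"):
    """Return the invoices API token stored in the environment.

    The default variable name is a hypothetical choice for this sketch.
    """
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return token


def build_client(var_name="MINDEE_INVOICE_TOKEN"):
    """Create a Client whose token comes from the environment."""
    # Imported here so the token helper can be used without the library.
    from mindee import Client

    return Client(invoice_token=load_invoice_token(var_name), raise_on_error=True)
```

Failing early with a clear message when the variable is missing is usually easier to debug than an authentication error returned later by the API.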
3. Parse an invoice
Using the parse_invoice method of your client object, you can pass any image or PDF file to get the invoice data (see the inputs documentation for the limit on the number of PDF pages).
from mindee import Client

mindee_client = Client(
    invoice_token="your_invoices_api_token_here",
    raise_on_error=True
)

invoice_data = mindee_client.parse_invoice("/path/to/file")
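With raise_on_error=True, an HTTP error presumably surfaces as an exception when parse_invoice is called. The sketch below wraps the call so that one failing file does not stop a batch. The helper name parse_or_none is hypothetical, and it catches the broad Exception class because the specific exception type raised by the library is not stated here.

```python
def parse_or_none(client, path):
    """Parse one invoice, returning None instead of raising on failure."""
    try:
        return client.parse_invoice(path)
    except Exception as error:  # broad on purpose: the exact type is an assumption
        print(f"Could not parse {path}: {error}")
        return None
```

Calling parse_or_none(mindee_client, "/path/to/file") for each file in a list then yields either the parsed result or None for that file.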
Input types
You can pass your input file in three ways:
From file path
invoice_data = mindee_client.parse_invoice('/path/to/file', input_type="path")
From a file object
with open('/path/to/file', 'rb') as fp:
    invoice_data = mindee_client.parse_invoice(fp, input_type="file")
From a base64 string
invoice_data = mindee_client.parse_invoice(base64_string, input_type="base64")
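The base64 call assumes that base64_string already exists. One way to build it with the standard library is sketched below; it assumes the API expects the standard base64 encoding of the raw file bytes, and the helper name file_to_base64 is hypothetical.

```python
import base64


def file_to_base64(path):
    """Read a file and return its contents as a base64-encoded ASCII string."""
    with open(path, "rb") as fp:
        return base64.b64encode(fp.read()).decode("ascii")
```

The result of file_to_base64('/path/to/file') can then be passed as base64_string together with input_type="base64".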
invoice_data structure
The invoice_data object returned by the parse_invoice method contains the following elements:
invoice_data.invoice
The invoice attribute is the Invoice object constructed by gathering all the pages into a single document.
invoice_data.invoice  # a single Invoice object
invoice_data.invoices
For multi-page PDFs, the invoices attribute is a list of Invoice objects, each constructed from a single page of the PDF.
invoice_data.invoices  # [Invoice, Invoice, ...]
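To inspect a multi-page PDF page by page, the list can be iterated directly. This sketch relies only on the invoices attribute described above and on each Invoice object being printable; the helper name describe_pages is hypothetical.

```python
def describe_pages(invoice_data):
    """Return one labelled text block per page of the parsed document."""
    return [
        f"Page {number}:\n{page}"
        for number, page in enumerate(invoice_data.invoices, start=1)
    ]
```

Joining the returned blocks with blank lines and printing them gives a per-page view that can be compared with the merged invoice_data.invoice object.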
invoice_data.http_response
Contains the full Mindee API HTTP response object.
invoice_data.http_response  # full HTTP response object
4. Display the results
Print the invoice object to display the extracted fields:
from mindee import Client

mindee_client = Client(
    invoice_token="your_invoices_api_token_here",
    raise_on_error=True
)

invoice_data = mindee_client.parse_invoice("/path/to/invoice")
print(invoice_data.invoice)