Passport OCR

Automatically extract key information from passports in your application

Leverage our unique computer vision approach to accurately parse passports in your software

Accurate

No setup or retraining required. Benefit from our battle-tested API trained on millions of passports from all around the world.

Seamless integration

The full extraction process is performed without any humans in the loop, allowing you to offer a real-time experience with a maximum level of data privacy.

Instant

Our API works synchronously, with an average processing time of 1.3 seconds per page for PDFs and 0.9 seconds for images.

What is passport OCR?

Passport Optical Character Recognition (OCR) refers to the set of software technologies capable of transforming photos or scans of passports into meaningful machine-encoded data. It allows developers to build automated passport scanning features into software applications.

Machine Readable Zone (MRZ) and passport body OCR

Meaningful information such as the name, passport number, or date of birth can be found in two different areas of the passport: the body and the MRZ. The body contains all of the data, while the MRZ, a zone made up of two lines of characters, contains only a subset: the most important passport data. Historically, technologies have relied on the MRZ alone to capture information from passport images. Our passport OCR reads both areas in order to maximize extraction performance.
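To make the MRZ layout concrete, here is a minimal Python sketch that splits the two lines into fields. It assumes the 44-character TD3 layout that ICAO Doc 9303 defines for passport booklets. The `MrzFields` class and the `parse_td3_mrz` function are hypothetical names chosen for illustration, not part of the Mindee API, and the sample lines are a widely used specimen for the fictional state of Utopia.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MrzFields:
    """Hypothetical container for the fields of a TD3 (passport) MRZ."""
    document_code: str
    issuing_country: str
    surname: str
    given_names: List[str]
    passport_number: str
    nationality: str
    birth_date: str   # raw YYMMDD, as printed in the MRZ
    sex: str
    expiry_date: str  # raw YYMMDD, as printed in the MRZ


def parse_td3_mrz(line1: str, line2: str) -> MrzFields:
    """Split the two MRZ lines by fixed character positions.

    Assumes the ICAO TD3 layout (2 lines of 44 characters, '<' as filler).
    """
    if len(line1) != 44 or len(line2) != 44:
        raise ValueError("TD3 MRZ lines must be exactly 44 characters long")
    # Name field: surname and given names are separated by '<<',
    # individual name parts by a single '<'.
    surname_part, _, given_part = line1[5:].partition("<<")
    return MrzFields(
        document_code=line1[0:2].rstrip("<"),
        issuing_country=line1[2:5].rstrip("<"),
        surname=surname_part.replace("<", " ").strip(),
        given_names=[name for name in given_part.split("<") if name],
        passport_number=line2[0:9].rstrip("<"),
        nationality=line2[10:13].rstrip("<"),
        birth_date=line2[13:19],
        sex=line2[20],
        expiry_date=line2[21:27],
    )


# Specimen MRZ; the first line is padded with '<' up to 44 characters.
SAMPLE_LINE1 = "P<UTOERIKSSON<<ANNA<MARIA".ljust(44, "<")
SAMPLE_LINE2 = "L898902C36UTO7408122F1204159ZE184226B<<<<<10"

fields = parse_td3_mrz(SAMPLE_LINE1, SAMPLE_LINE2)
print(fields.surname, fields.given_names, fields.passport_number)
```

Note that the MRZ only carries the surname, given names, document number, nationality, birth date, sex, and expiry date. Fields such as the place of birth or the issuance date appear only in the body, which is one reason to read both areas.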

Passport OCR for KYC processes

Know Your Customer (KYC) has become an important part of business processes in many industries. Optimizing these processes is one of the key challenges of the decade because of how critical the task is. OCR technologies play an important role in solving this challenge, as they help reduce manual data entry, avoid mistakes, and save time.

Passport scanning issues

Passport scanning refers to the process of creating a digital image or PDF of a physical passport and processing it. Most of the time, passport pictures are taken with a smartphone or scanned on a printer or flatbed scanner. In both cases, the processing technology has to deal with image defects such as motion blur, scanner artifacts, or the noise produced by low-quality cameras.

OCR-B font for MRZ

The need to read passports automatically didn't emerge with software, but long before. The first machine-readable passports were issued in the '80s. The font used, OCR-B, was designed a decade earlier, with the constraint of being easily readable by machines. This monospace typeface makes each character more distinct and therefore easier for machines to recognize.

Move past OCR language limitations with Mindee’s global computer vision approach

Like humans, our algorithms don’t need to read all of a document's text in its original language to extract the relevant information.

Extracted fields

Holder's names

- Surname
- List of given names

Holder's birth information

The holder's birth data, retrieved from the body of the passport

- Date of birth (ISO format)
- Gender
- Place of birth

Machine-readable zone (MRZ)

The two lines of the MRZ are retrieved in order to cross-validate the information retrieved from the passport's body
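As an illustration of what such a cross-validation can look like, the sketch below checks the passport number read from the body against the second MRZ line. It assumes the TD3 layout and the check-digit scheme from ICAO Doc 9303 (repeating weights 7, 3, 1, sum taken modulo 10). The function names `mrz_check_digit` and `number_matches_mrz` are hypothetical and do not describe how the Mindee API works internally.

```python
def mrz_check_digit(field: str) -> int:
    """Compute the ICAO 9303 check digit of an MRZ field.

    Digits keep their value, letters A-Z map to 10-35, the filler '<' is 0.
    Values are multiplied by the repeating weights 7, 3, 1 and summed mod 10.
    """
    weights = (7, 3, 1)
    total = 0
    for index, char in enumerate(field):
        if "0" <= char <= "9":
            value = int(char)
        elif "A" <= char <= "Z":
            value = ord(char) - ord("A") + 10
        elif char == "<":
            value = 0
        else:
            raise ValueError(f"invalid MRZ character: {char!r}")
        total += value * weights[index % 3]
    return total % 10


def number_matches_mrz(body_number: str, mrz_line2: str) -> bool:
    """Return True when the body passport number agrees with the MRZ.

    Two conditions must hold: the MRZ number passes its own check digit,
    and it equals the number read from the passport body.
    """
    if len(mrz_line2) < 10:
        return False
    mrz_number = mrz_line2[0:9]
    printed_digit = mrz_line2[9]
    if printed_digit not in "0123456789":
        return False
    if mrz_check_digit(mrz_number) != int(printed_digit):
        return False
    return body_number.strip().upper() == mrz_number.rstrip("<")


# Second line of a widely used specimen MRZ.
SPECIMEN_LINE2 = "L898902C36UTO7408122F1204159ZE184226B<<<<<10"
print(number_matches_mrz("L898902C3", SPECIMEN_LINE2))
```

The same check-digit function applies to the date of birth and expiry date fields of the MRZ, so a disagreement between the two areas can be narrowed down to the field that was misread.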

Passport number

Passport dates

Passport dates in ISO format

- Issuance date
- Expiry date
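For context on what ISO formatting involves, the MRZ stores dates as six digits (YYMMDD) without a century, so a conversion to ISO 8601 (YYYY-MM-DD) has to guess it. The sketch below shows one possible heuristic. The function name `mrz_date_to_iso`, its parameters, and the century rules are assumptions made for illustration, not a description of how the Mindee API normalizes dates.

```python
from datetime import date
from typing import Optional


def mrz_date_to_iso(yymmdd: str, *, is_expiry: bool,
                    today: Optional[date] = None) -> str:
    """Convert an MRZ date (YYMMDD) to an ISO 8601 string (YYYY-MM-DD).

    Century heuristic (an assumption of this sketch):
    - expiry dates are placed in the 2000s;
    - birth dates are placed in the 2000s unless that would be in the
      future, in which case they are placed in the 1900s.
    Raises ValueError for malformed or impossible dates.
    """
    if len(yymmdd) != 6 or not all("0" <= c <= "9" for c in yymmdd):
        raise ValueError(f"expected six digits, got {yymmdd!r}")
    year = int(yymmdd[0:2])
    month = int(yymmdd[2:4])
    day = int(yymmdd[4:6])
    reference = today if today is not None else date.today()

    candidate = date(2000 + year, month, day)
    if not is_expiry and candidate > reference:
        # A birth date cannot lie in the future: fall back to the 1900s.
        candidate = date(1900 + year, month, day)
    return candidate.isoformat()


reference_day = date(2024, 1, 1)
print(mrz_date_to_iso("740812", is_expiry=False, today=reference_day))
print(mrz_date_to_iso("120415", is_expiry=True, today=reference_day))
```

Dates printed in the passport body include the full year, so comparing them with the MRZ value is a straightforward way to resolve the century when both areas are available.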

Country code

Passport country code in ISO format

Next steps

Try out our products for free. No commitment or credit card required. If you want a custom plan or have questions, we’d be happy to chat.
