Passport OCR
Automatically extract key information from passports in your application
Leverage our unique computer vision approach to accurately parse passports in your software
Accurate
No setup or retraining required. Benefit from our battle-tested API trained on millions of passports from all around the world.
Seamless integration
The full extraction process is performed without any humans in the loop, allowing you to offer a real-time experience with a maximum level of data privacy.
Instant
Our API works synchronously, with an average processing time of 1.3 seconds per page for PDFs and 0.9 seconds for images.
What is passport OCR?
Passport Optical Character Recognition (OCR) refers to the set of software technologies capable of transforming photos or scans of passports into meaningful machine-encoded data. It allows developers to build automated passport scanning features into software applications.
Machine Readable Zone (MRZ) and passport body OCR
Meaningful information such as the holder's name, passport number, or date of birth can be found in two different areas of the passport: the body and the MRZ. The body contains all of the data, while the MRZ, a zone made up of two lines of characters, contains only a subset covering the most important fields. Historically, technologies have relied on the MRZ alone to capture information from passport images. Our passport OCR reads both areas in order to maximize extraction performance.
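To illustrate why the MRZ is convenient for machines, the sketch below parses the second MRZ line of a passport and verifies its check digits. It assumes the two-line, 44-character passport layout and the 7-3-1 weighted check digit defined in ICAO Doc 9303. The function names are hypothetical, the sample line is the ICAO specimen, and nothing here describes how Mindee's API works internally.

```python
def mrz_check_digit(field: str) -> int:
    """Compute an ICAO 9303 check digit (weights 7, 3, 1 repeating, modulo 10)."""
    weights = (7, 3, 1)
    total = 0
    for position, char in enumerate(field):
        if char in "0123456789":
            value = int(char)
        elif "A" <= char <= "Z":
            value = ord(char) - ord("A") + 10  # A=10 ... Z=35
        elif char == "<":
            value = 0  # the filler character counts as zero
        else:
            raise ValueError(f"invalid MRZ character: {char!r}")
        total += value * weights[position % 3]
    return total % 10


def parse_mrz_line2(line: str) -> dict:
    """Split the second MRZ line of a passport and verify three check digits."""
    if len(line) != 44:
        raise ValueError("a passport MRZ line must be 44 characters long")
    # field name -> (start, end, index of the check digit that follows it)
    checked = {
        "passport_number": (0, 9, 9),
        "birth_date": (13, 19, 19),   # YYMMDD
        "expiry_date": (21, 27, 27),  # YYMMDD
    }
    result = {"nationality": line[10:13].replace("<", ""), "sex": line[20]}
    for name, (start, end, check_index) in checked.items():
        raw = line[start:end]
        check_char = line[check_index]
        result[name] = raw.replace("<", "")
        result[name + "_valid"] = (
            check_char.isdigit() and mrz_check_digit(raw) == int(check_char)
        )
    return result


# ICAO Doc 9303 specimen, second line.
SAMPLE_LINE2 = "L898902C36UTO7408122F1204159ZE184226B<<<<<10"
fields = parse_mrz_line2(SAMPLE_LINE2)
print(fields["passport_number"], fields["passport_number_valid"])
```

A failed check digit usually means a character was misread, which is one reason the MRZ is a useful reference point even when the body is also read.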
Passport OCR for KYC processes
Know Your Customer (KYC) has become an important part of business processes in many industries. Optimizing these processes is one of the key challenges of the decade because of how critical the task is. OCR technologies play an important role in solving this challenge, as they help reduce manual data entry, avoid mistakes, and save time.
Passport scanning issues
Passport scanning refers to the process of creating a digital image or PDF of the physical passport and processing it. Most of the time, passport pictures are taken with a smartphone or scanned with a scanner or multifunction printer. In both cases, the processing technology has to deal with image degradation such as motion blur, scanner artifacts, or the noise of low-quality cameras.
OCR-B font for MRZ
The need to read passports automatically did not emerge with software, but long before. The first machine-readable passports were issued in the '80s. The font they use, OCR-B, was designed a decade earlier with the constraint of being easily readable by machines. This monospace typeface enhances the distinctness of each character, making each one easier for machines to recognize.
Move past OCR language limitations with Mindee’s global computer vision approach
Like humans, our algorithms don’t need to read all the document text in its language to extract the relevant information
Extracted fields
Holder's names
- Surname
- List of given names
Holder's birth information
Set of the holder's birth data retrieved from the body of the passport
- Date of birth (ISO format)
- Gender
- Place of birth
Machine readable zones (MRZ)
The two lines of the MRZ are retrieved in order to cross-validate the information retrieved from the passport's body
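As a rough sketch of what such cross-validation can look like, the example below compares a few body fields against the same fields decoded from the second MRZ line. The field names, the `cross_validate` helper, and the century rule for two-digit years are assumptions made for illustration (character positions follow the ICAO Doc 9303 passport layout); this is not a description of Mindee's own validation logic.

```python
from datetime import date


def mrz_date_to_iso(yymmdd: str, not_after: date) -> str:
    """Expand a YYMMDD MRZ date to YYYY-MM-DD.

    Assumption: pick the century of `not_after`, and fall back one century
    if that would place the date later than `not_after`.
    """
    yy, mm, dd = int(yymmdd[:2]), int(yymmdd[2:4]), int(yymmdd[4:6])
    century = not_after.year // 100 * 100
    candidate = date(century + yy, mm, dd)
    if candidate > not_after:
        candidate = date(century - 100 + yy, mm, dd)
    return candidate.isoformat()


def cross_validate(body: dict, mrz_line2: str, today: date) -> dict:
    """Return, for each shared field, whether the body and the MRZ agree."""
    # Assumption: a passport does not expire more than ~20 years from today.
    expiry_limit = date(today.year + 20, 12, 31)
    from_mrz = {
        "passport_number": mrz_line2[0:9].replace("<", ""),
        "birth_date": mrz_date_to_iso(mrz_line2[13:19], not_after=today),
        "gender": mrz_line2[20],
        "expiry_date": mrz_date_to_iso(mrz_line2[21:27], not_after=expiry_limit),
    }
    return {name: body.get(name) == value for name, value in from_mrz.items()}


# Body values as an OCR step might return them (made-up example data
# matching the ICAO specimen MRZ line).
body_fields = {
    "passport_number": "L898902C3",
    "birth_date": "1974-08-12",
    "gender": "F",
    "expiry_date": "2012-04-15",
}
line2 = "L898902C36UTO7408122F1204159ZE184226B<<<<<10"
report = cross_validate(body_fields, line2, today=date(2024, 1, 1))
print(report)
```

Any field reported as `False` is a candidate for manual review, since the two zones of the document disagree.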
Passport number
Passport dates
Passport dates in ISO format
- Issuance date
- Expiry date
Country code
Passport country code in ISO format
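On the consuming side, the fields listed above could be gathered into a typed record along the following lines. This is only a sketch: the `PassportFields` class, the key names of the raw dictionary, and the three-letter country code check are assumptions made for illustration, not the actual response schema of the API.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PassportFields:
    """Hypothetical container for the extracted fields."""
    surname: str
    given_names: list[str]
    birth_date: date
    gender: str
    birth_place: str
    passport_number: str
    issuance_date: date
    expiry_date: date
    country_code: str
    mrz_lines: tuple[str, str]

    def is_expired(self, on: date) -> bool:
        """True if the passport is no longer valid on the given day."""
        return self.expiry_date < on


def fields_from_raw(raw: dict) -> PassportFields:
    """Build a PassportFields record from a dict of strings, with basic checks."""
    # ISO formatted dates (YYYY-MM-DD) parse directly with the standard library.
    issuance = date.fromisoformat(raw["issuance_date"])
    expiry = date.fromisoformat(raw["expiry_date"])
    if expiry <= issuance:
        raise ValueError("expiry date must be after issuance date")
    country = raw["country_code"]
    # Assumption: a three-letter uppercase code, as printed on passports.
    if not (len(country) == 3 and country.isalpha() and country.isupper()):
        raise ValueError(f"unexpected country code: {country!r}")
    return PassportFields(
        surname=raw["surname"],
        given_names=list(raw["given_names"]),
        birth_date=date.fromisoformat(raw["birth_date"]),
        gender=raw["gender"],
        birth_place=raw["birth_place"],
        passport_number=raw["passport_number"],
        issuance_date=issuance,
        expiry_date=expiry,
        country_code=country,
        mrz_lines=(raw["mrz1"], raw["mrz2"]),
    )


# Made-up example data based on the ICAO specimen passport.
example = fields_from_raw({
    "surname": "ERIKSSON",
    "given_names": ["ANNA", "MARIA"],
    "birth_date": "1974-08-12",
    "gender": "F",
    "birth_place": "ZENITH",
    "passport_number": "L898902C3",
    "issuance_date": "2002-04-16",
    "expiry_date": "2012-04-15",
    "country_code": "UTO",
    "mrz1": "P<UTOERIKSSON<<ANNA<MARIA".ljust(44, "<"),
    "mrz2": "L898902C36UTO7408122F1204159ZE184226B<<<<<10",
})
print(example.surname, example.expiry_date.isoformat())
```

Parsing the dates into `date` objects early makes later checks, such as whether the passport has expired, a simple comparison.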
Next steps
Try out our products for free. No commitment or credit card required. If you want a custom plan or have questions, we’d be happy to chat.