Passport Machine Readable Zone: protect your identity online

Your passport. Accepted as photo identification nearly everywhere, and required for international travel. The data on your passport can be used to create accounts at banks, and can potentially lead to identity theft. So, how do you protect your passport details from theft?

Mr. Bean's passport

It should be pretty clear that you should not post your passport on social media. Of course, this does not stop people, and they are unwittingly exposing their private data.

The Human Readable Zone

The top of your passport has all of the information you expect to see. In the example of Mr. Bean (above), we learn that he was born 6 Jan 1955 in Enfield, UK. We can also read the passport number, the date the passport was issued, and when it expires.

The Machine Readable Zone

The Machine Readable Zone (MRZ for short) is the two lines of text at the bottom of the passport. If you look at your passport, the first line of the MRZ starts with the document type ("P" for passport), then a three-letter abbreviation for your country, followed by your name (with "<" used as the separator and filler character). The Mr. Bean passport is clearly fake, as it only states "GB" instead of the official three-letter code "GBR".

The second line of the MRZ appears to be more gibberish, but there are important values there, and this is where you might accidentally leak your private information. It kicks off with your passport number (plus a checksum digit), followed by your nationality, your date of birth (plus a one-digit checksum), your sex, and finally the expiration date of your passport.
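The checksum digits are easy to reproduce. Here is a minimal Python sketch of the check digit calculation as I understand it from ICAO Doc 9303: digits keep their value, letters A to Z map to 10 to 35, the "<" filler counts as 0, and the values are multiplied by the repeating weights 7, 3, 1 before taking the sum modulo 10. The function name `mrz_check_digit` is my own, and the sample fields come from the ICAO specimen passport, not a real document.

```python
def mrz_check_digit(field: str) -> int:
    """Compute the check digit for one MRZ field (weights 7, 3, 1 repeating)."""
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10  # A=10 ... Z=35
        elif ch == "<":
            value = 0  # the filler character counts as zero
        else:
            raise ValueError(f"invalid MRZ character: {ch!r}")
        total += value * weights[i % 3]
    return total % 10


# Fields from the ICAO Doc 9303 specimen passport (assumed sample data).
print(mrz_check_digit("L898902C3"))  # passport number -> 6
print(mrz_check_digit("740812"))     # date of birth (YYMMDD) -> 2
```

A check digit only proves that a field is internally consistent. It says nothing about whether the MRZ agrees with the printed part of the passport.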

The Mr. Bean passport above correctly shows his passport number and nationality. It begins to break down on the date of birth: the MRZ reports 6 November 1981 (different from the human readable zone by several decades). It also shows the expiration date to be 42 April 2000 (hmmm).

Publicly posting your passport

Even if you carefully obfuscate your passport, you can leak information from the MRZ section of your passport. Here is a photo of my passport, where I've drawn a giant yellow box over the entire human readable region:

Trust me, this is my passport.

If I run this passport (with the MRZ exposed) through the Mindee Passport Extraction API, it correctly identifies all of my information, because it can extract it from the MRZ. The curl command looks like:

curl -X POST \
  https://api.mindee.net/products/passport/v1/predict \
  -H 'X-Inferuser-Token: {myAPItoken}' \
  -H 'content-type: multipart/form-data' \
  -F file=@/path/to/my/passport.jpg

 

I receive a JSON result (here are a few snippets of 'sharable' information that were correctly extracted):

"country": {
    "probability": 1,
    "segmentation": { "bounding_box": [] },
    "value": "USA"
},
"given_names": [
    {
        "probability": 0.2,
        "segmentation": {
            "bounding_box": [
                [ 0.056, 0.915 ],
                [ 0.479, 0.915 ],
                [ 0.479, 0.939 ],
                [ 0.056, 0.939 ]
            ]
        },
        "value": "DOUGLASS"
    }
],

 

The API can read the MRZ and extract all of the private data that I so carefully hid in the human readable zone.
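Once the two lines have been read, no clever processing is needed to recover the data, because every field sits at a fixed position. The sketch below is my own illustration (it is not how the Mindee API works internally) and assumes the standard 44-character passport layout. The function name `parse_mrz_line2` and the sample line, taken from the ICAO specimen passport, are assumptions for the example.

```python
def parse_mrz_line2(line: str) -> dict:
    """Slice the second line of a two-line passport MRZ into its fields."""
    if len(line) != 44:
        raise ValueError("expected a 44-character passport MRZ line")
    return {
        "passport_number": line[0:9].rstrip("<"),  # position 10 is its check digit
        "nationality": line[10:13],
        "birth_date": line[13:19],    # YYMMDD; the century is not encoded
        "sex": line[20],              # position 20 is the birth date check digit
        "expiry_date": line[21:27],   # YYMMDD
    }


# Line 2 of the ICAO specimen passport (assumed sample data, not a real person).
specimen = "L898902C36UTO7408122F1204159ZE184226B<<<<<10"
fields = parse_mrz_line2(specimen)
for name, value in fields.items():
    print(f"{name}: {value}")
```

Dates are kept as raw YYMMDD strings here, since the MRZ only stores a two-digit year and guessing the century is a separate decision.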

Exposing the MRZ

So, what are people exposing on social media?

The young man knew not to post the entire photo of his passport, but the MRZ gave away his passport number (I've contacted him to let him know).

Finding Fake Passports

Poorly forged documents (like Mr. Bean's passport above) will have discrepancies between the Human Readable Zone and the MRZ. In a recent news article from Angola, a woman claims that her passport was forged and used in crimes. The article includes an image of "her" passport:

If we read the bottom line of the MRZ, nothing matches the top of the passport. This can be tested with the Mindee API:

The MRZ lines are predicted as:
 

			"mrz1": {
				"probability": 0.27,
				"segmentation": {
					"bounding_box": [
						[ 0.147, 0.803 ],
						[ 0.861, 0.803 ],
						[ 0.861, 0.882 ],
						[ 0.147, 0.882 ]
					]
				},
				"value": "PNAGOISABEL<<DOS<SANTOS<<<<<<<<<<<<<<<<<<<<<"
			},
			"mrz2": {
				"probability": 1,
				"segmentation": {
					"bounding_box": [
						[ 0.148, 0.867 ],
						[ 0.865, 0.867 ],
						[ 0.865, 0.951 ],
						[ 0.148, 0.951 ]
					]
				},
				"value": "N1473613<8AGO8909176M18090600444874<N01<1356"
			},

The passport number extracted from the MRZ is "N1473613", which clearly does not match the number printed on the passport, "N1471383".

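The MRZ carries no issuance date. The six digits after the nationality (890917) are the date of birth: 17 Sept 1989.

The "M" following the date of birth and its check digit gives the incorrect sex.

The expiration date in the MRZ (180906) is 6 Sept 2018, while the passport states 6 Sept 2021. In fact, 6 Sept 2018 is the date the passport claims to have been issued.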

This passport is clearly fake based on these details. The icing on the forgery cake is the signature: this woman is clearly not "Bruce Lee".
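The comparison between the two zones can also be automated. This is a rough sketch, assuming the standard 44-character layout for the second MRZ line. The MRZ string is the one returned by the API above, the printed values are the ones reported from the image, and the helper name `find_mismatches` is hypothetical.

```python
# MRZ line 2 as returned by the API in the example above.
MRZ2 = "N1473613<8AGO8909176M18090600444874<N01<1356"

# Fixed positions in a two-line (44-character) passport MRZ.
mrz = {
    "passport_number": MRZ2[0:9].rstrip("<"),
    "expiry_date": MRZ2[21:27],  # YYMMDD
}

# Values read from the printed (human readable) zone of the same passport,
# written in the same formats so they can be compared directly.
printed = {
    "passport_number": "N1471383",
    "expiry_date": "210906",  # 6 Sept 2021
}


def find_mismatches(mrz_fields: dict, printed_fields: dict) -> dict:
    """Return {field: (mrz_value, printed_value)} for every field that differs."""
    return {
        key: (mrz_fields.get(key), printed_value)
        for key, printed_value in printed_fields.items()
        if mrz_fields.get(key) != printed_value
    }


mismatches = find_mismatches(mrz, printed)
for field, (mrz_value, printed_value) in mismatches.items():
    print(f"{field}: MRZ={mrz_value} printed={printed_value}")
```

Any field that shows up in the result is a reason to look more closely at the document.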

Conclusion

The MRZ is often overlooked by passport holders, who are unaware that private data is encoded in these two lines. These two lines should be kept as secure as the rest of your passport, as they repeat the key details printed above them: your name, passport number, nationality, date of birth, sex, and expiration date. To learn more about information extraction from passports, look no further than the Mindee API.