US Pay Stubs OCR

 

 

This article lays out the recommended process for building an OCR API that extracts data from US pay stubs using Mindee's deep learning engine.

 

 

Prerequisites

  1. You’ll need a free beta account. Sign up and confirm your email to log in.
  2. You’ll need at least 20 US pay stubs (images or PDFs) to train your OCR.
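
Before starting, it can help to confirm that your sample folder really holds enough documents. The snippet below is a minimal sketch, not part of the platform: the list of file extensions and the helper names `list_samples` and `has_enough_samples` are assumptions made for illustration.

```python
from pathlib import Path

# Assumption: extensions treated as valid training samples (images or PDFs).
SAMPLE_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg"}
MIN_SAMPLES = 20  # minimum number of pay stubs stated in the prerequisites


def list_samples(folder):
    """Return the candidate pay stub files found directly inside `folder`."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in SAMPLE_EXTENSIONS
    )


def has_enough_samples(folder, minimum=MIN_SAMPLES):
    """True if `folder` holds at least `minimum` candidate training files."""
    return len(list_samples(folder)) >= minimum
```

Calling `has_enough_samples("my_pay_stubs")` on a hypothetical local folder returns `False` until at least 20 matching files are present.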

 

 

Define your Pay Stub use case

 

You might need to automatically extract data from pay stubs to improve your user experience in payroll or loan eligibility workflows. This article will walk you through the few steps required to deploy your Pay Stubs data extraction API.

 

 

First, we’re going to define the fields we want to extract from your pay stubs.

 

Pay stub OCR API

 

Here is the list of fields we are going to extract using our OCR API:

 

  • Employer: The full name of the employer issuing the pay stub
  • Net pay: Total net amount paid to the employee
  • Pay date: Date of wage payment
  • Period beginning: Start date of the pay period
  • Period ending: End date of the pay period
  • Gross pay: Total gross pay before taxes and deductions
  • Total tax: Total tax deducted

 

 

You can add as many relevant fields as you need to better fit your requirements.
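
On the application side, the seven fields listed above could be held in a small typed record. This is only a sketch under assumptions: the class name `PayStub`, the attribute names and the `problems` checks are hypothetical and are not produced by the API itself.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass
class PayStub:
    """Hypothetical container for the seven fields extracted from a pay stub."""
    employer: str
    net_pay: Decimal
    pay_date: date
    period_beginning: date
    period_ending: date
    gross_pay: Decimal
    total_tax: Decimal

    def problems(self):
        """Return a list of human-readable inconsistencies (empty if none)."""
        issues = []
        if self.period_beginning > self.period_ending:
            issues.append("period beginning is after period ending")
        if self.net_pay > self.gross_pay:
            issues.append("net pay exceeds gross pay")
        if self.total_tax > self.gross_pay:
            issues.append("total tax exceeds gross pay")
        return issues


# Made-up values, for illustration only.
stub = PayStub(
    employer="Acme Corp",
    net_pay=Decimal("1520.40"),
    pay_date=date(2024, 3, 15),
    period_beginning=date(2024, 3, 1),
    period_ending=date(2024, 3, 14),
    gross_pay=Decimal("2000.00"),
    total_tax=Decimal("380.00"),
)
```

Simple cross-field checks like these can flag an extraction that is worth reviewing by hand before it reaches a payroll or loan workflow.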

 

 

 

Deploy your API

 

Once you have defined what fields you want to extract, head over to the platform and press the ‘build a new endpoint’ button.

 

You now land on the setup page. Here is an image you can use to set up the API. For instance, my setup is as follows:

 

 

Setup your Pay stub OCR API

 

 

Once you’re ready, click on the “next step” button. Next, we are going to specify the data type of each field we want our API to extract.


 

To move forward, you can download this JSON config to set up your data model, or you can set it up manually.

 

 

Employer: type String 

 

Employer field for Pay Stub OCR

 

 

Net pay: type Amount

 

Net pay field for Pay Stub OCR

 

Pay date: type Date

 

Pay date field for Pay Stub OCR

 

 

Period beginning: type Date 

 

Pay stub beginning of period field for Pay stub OCR

 

 

Period ending: type Date

Pay stub end of period field for Pay stub OCR

 

 

Gross pay: type Amount

 

Gross pay field for Pay Stub OCR

 

 

Total tax: type Amount

 

Total tax field for Pay stub OCR
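
The data model from this section can be summarized as a mapping from field to type, which is also handy for post-processing raw text values in your own code. The sketch below is illustrative only: it is not the format of the downloadable config, the keys are hypothetical identifiers, total tax is assumed to be an Amount like the other monetary fields, and dates are assumed to be printed as MM/DD/YYYY.

```python
from datetime import datetime
from decimal import Decimal

# Field name -> data type, mirroring the data model described in this section.
# The keys are illustrative identifiers, not names required by the platform.
FIELD_TYPES = {
    "employer": "String",
    "net_pay": "Amount",
    "pay_date": "Date",
    "period_beginning": "Date",
    "period_ending": "Date",
    "gross_pay": "Amount",
    "total_tax": "Amount",  # assumed: a monetary field like net and gross pay
}


def parse_value(field_type, raw):
    """Convert raw text into a Python value according to its declared type."""
    text = raw.strip()
    if field_type == "String":
        return text
    if field_type == "Amount":
        # Drop the currency symbol and thousands separators, e.g. "$1,234.50".
        return Decimal(text.replace("$", "").replace(",", ""))
    if field_type == "Date":
        # Assumption: dates are printed as MM/DD/YYYY.
        return datetime.strptime(text, "%m/%d/%Y").date()
    raise ValueError(f"unknown field type: {field_type}")


def parse_record(raw_fields):
    """Parse a dict of raw strings keyed by the names in FIELD_TYPES."""
    return {
        name: parse_value(FIELD_TYPES[name], raw)
        for name, raw in raw_fields.items()
    }
```

Keeping the type mapping in one place means that adding a field later only requires one new entry in `FIELD_TYPES`.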

 

 

 

Train your Pay Stub OCR

 

 

You’re all set! 

 

Now it’s time to train your US Pay Stub deep learning model in the Training section of your API.

 

To get more information about the training phase, please refer to the Getting Started tutorial.

If you have any questions regarding your use case, feel free to reach out on the Mindee Community on Slack!