1040 Forms OCR

 

This article describes how to build an OCR API that extracts data from 1040 Forms using our deep learning engine. If you want to automate your workflow, this article is for you. 

 

 

Prerequisites

  1. You’ll need a free account. Sign up and confirm your email to log in.
  2. You’ll need at least 20 images or PDFs of 1040 forms to train your OCR.
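
Before uploading anything, it can help to confirm that you actually have enough samples on disk. The sketch below is only an illustration: the set of accepted extensions is an assumption (the platform's supported formats are not listed here), and the folder is whatever local directory holds your files.

```python
from pathlib import Path

# Extensions treated as usable samples. This set is an assumption for the sketch,
# not a list of formats confirmed by the platform.
SAMPLE_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg"}
MIN_SAMPLES = 20  # minimum number of files stated in the prerequisites


def count_samples(folder: Path) -> int:
    """Count files directly inside `folder` whose extension looks like an image or PDF."""
    return sum(
        1
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in SAMPLE_EXTENSIONS
    )


def has_enough_samples(folder: Path, minimum: int = MIN_SAMPLES) -> bool:
    """Return True when the folder holds at least `minimum` candidate samples."""
    return count_samples(folder) >= minimum
```

Calling `has_enough_samples(Path("my_1040_forms"))` (a hypothetical folder name) tells you whether you meet the 20-file minimum.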

 

 

Define your 1040 Forms use case

 

First, we’re going to define what fields we want to extract from your 1040 Forms. 

 

 


 

 

  • First Name: The first name and middle initial of the taxpayer.
  • Last Name: The last name of the taxpayer.
  • Spouse First Name: The first name and middle initial of the taxpayer's spouse.
  • Spouse Last Name: The last name of the taxpayer's spouse.
  • SSN: The Social Security Number of the taxpayer.
  • Spouse SSN: The Social Security Number of the taxpayer's spouse.
  • Salary: The taxpayer's wages, salaries, tips, etc.
  • Ordinary Dividends: The taxpayer's ordinary dividends (line 3b).
  • Occupation: The taxpayer's occupation.
  • Spouse Occupation: The occupation of the taxpayer's spouse.
  • Identity Protection PIN: The taxpayer's identity protection PIN provided by the IRS.

 

 

That’s it for our use case. Feel free to add any other relevant data to fit your requirements. 
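
If you want to track this use case in your own code, the field list can be written down as a simple mapping. This is a minimal sketch: the snake_case keys and the `missing_fields` helper are hypothetical names chosen for illustration, not identifiers used by the platform.

```python
from typing import Any, Dict, List

# Field names (hypothetical snake_case keys) mapped to the descriptions in the use case.
FIELDS: Dict[str, str] = {
    "first_name": "First name and middle initial of the taxpayer",
    "last_name": "Last name of the taxpayer",
    "spouse_first_name": "First name and middle initial of the taxpayer's spouse",
    "spouse_last_name": "Last name of the taxpayer's spouse",
    "ssn": "Social Security Number of the taxpayer",
    "spouse_ssn": "Social Security Number of the taxpayer's spouse",
    "salary": "Wages, salaries, tips, etc.",
    "ordinary_dividends": "Ordinary dividends (line 3b)",
    "occupation": "Occupation of the taxpayer",
    "spouse_occupation": "Occupation of the taxpayer's spouse",
    "identity_protection_pin": "Identity protection PIN provided by the IRS",
}


def missing_fields(extracted: Dict[str, Any]) -> List[str]:
    """Return the use-case fields that are absent or empty in an extraction result."""
    return [name for name in FIELDS if extracted.get(name) in (None, "")]
```

Adding a field to your use case then means adding one entry to `FIELDS`. The helper treats only `None` and empty strings as missing, so a salary of `0` still counts as present.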

 

 

Deploy your API

 

Once you have defined the list of fields you want to extract from your 1040 forms, head over to the platform and press the ‘Create a new API’ button.

 

You will now land on the setup page, where you can add an image for your API. My setup looks like this:

Set up your 1040 forms OCR API

 

Once you’re ready, click the “Next” button. We will then specify the data type for each field we want our API to extract.

Define your 1040 forms OCR API

 

 

To go further, you can download this JSON config to set up your data model, or set it up manually.


 

  • First Name: String that never contains numeric characters.
  • Last Name: String that never contains numeric characters.
  • Spouse First Name: String that never contains numeric characters.
  • Spouse Last Name: String that never contains numeric characters.
  • SSN: Number, with no additional specifications.
  • Spouse SSN: Number, with no additional specifications.
  • Salary: Number, with no additional specifications.
  • Ordinary Dividends: Number, with no additional specifications.
  • Occupation: String that never contains numeric characters.
  • Spouse Occupation: String that never contains numeric characters.
  • Identity Protection PIN: Number, with no additional specifications.
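
The two data types used in this model can also be mirrored in your own code, for example to sanity-check values before or after training. The sketch below is an assumption-laden illustration: the validator functions, the `DATA_MODEL` mapping, and the rule that a "Number" may contain commas, spaces, or hyphens are all choices made here, not the platform's actual validation logic.

```python
import re
from typing import Callable, Dict, List

# Assumed shape of a "Number": starts with a digit, may contain digits, commas,
# hyphens or spaces, and may end with a decimal part (e.g. "85,000.00").
NUMBER_PATTERN = re.compile(r"^\d[\d,\- ]*(\.\d+)?$")


def is_text_without_digits(value: str) -> bool:
    """String type: non-empty text that never contains numeric characters."""
    return bool(value.strip()) and not any(ch.isdigit() for ch in value)


def is_number(value: str) -> bool:
    """Number type with no further specification (see NUMBER_PATTERN)."""
    return bool(NUMBER_PATTERN.match(value.strip()))


# Field names as used in the data model, mapped to the check for their type.
DATA_MODEL: Dict[str, Callable[[str], bool]] = {
    "First Name": is_text_without_digits,
    "Last Name": is_text_without_digits,
    "Spouse First Name": is_text_without_digits,
    "Spouse Last Name": is_text_without_digits,
    "SSN": is_number,
    "Spouse SSN": is_number,
    "Salary": is_number,
    "Ordinary Dividends": is_number,
    "Occupation": is_text_without_digits,
    "Spouse Occupation": is_text_without_digits,
    "Identity Protection PIN": is_number,
}


def invalid_fields(record: Dict[str, str]) -> List[str]:
    """Return the names of known fields whose value does not match its type."""
    return [
        name
        for name, value in record.items()
        if name in DATA_MODEL and not DATA_MODEL[name](value)
    ]
```

For instance, a record with an Occupation of "Engineer 2" would be flagged, because the String type here never contains numeric characters.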

 

 

Once you’re done setting up your data model, press the “Start training your model” button at the bottom of the screen.

 


 

 

Train your 1040 forms OCR

 

 


 

 

You’re all set! 

 

Now is the time to train your 1040 form deep learning model in the Training section of our API. 

 

In a few hours (minutes if you're fast), you’ll get your first model trained and will be able to use your custom OCR API for parsing 1040 forms in your application.
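
Because the SSN and Identity Protection PIN are modeled as numbers, your application may want to turn them back into fixed-width strings: an SSN has nine digits and an IRS identity protection PIN has six, and a numeric value can lose leading zeros. The following is a minimal sketch under the assumption that your API returns these fields as an integer or a string; the function names are hypothetical.

```python
import re
from typing import Union


def digits_only(value: Union[int, str]) -> str:
    """Keep only the digit characters of an int or str value."""
    return re.sub(r"\D", "", str(value))


def format_ssn(value: Union[int, str]) -> str:
    """Format an SSN as XXX-XX-XXXX, restoring leading zeros lost by numeric parsing."""
    digits = digits_only(value).zfill(9)
    if len(digits) != 9:
        raise ValueError(f"expected at most 9 digits for an SSN, got {len(digits)}")
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"


def format_ip_pin(value: Union[int, str]) -> str:
    """Format an identity protection PIN as a six-digit string."""
    digits = digits_only(value).zfill(6)
    if len(digits) != 6:
        raise ValueError(f"expected at most 6 digits for an IP PIN, got {len(digits)}")
    return digits
```

Note that the zero-padding assumes any missing digits were leading zeros. If the OCR may have skipped a digit instead, you would want to reject short values rather than pad them.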

 

To get more information about the training phase, please refer to the Getting Started tutorial. If you have any questions regarding your use case, feel free to reach out using our chat!