French Payslip OCR
Bulletins de salaire



This article explains how to build an OCR API that automatically extracts data from French payslips (bulletins de salaire).




  1. You’ll need a free account. Sign up and confirm your email to log in.
  2. You’ll need at least 20 French payslip images or PDFs to train your OCR.
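Before uploading anything, it can help to confirm that your sample folder really holds at least 20 usable files. The snippet below is a minimal sketch: the list of accepted extensions and the helper names are assumptions made for illustration, not something the platform prescribes.

```python
from pathlib import Path
from typing import List

# Assumption: these are the file types you plan to upload.
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png"}
MIN_SAMPLES = 20  # minimum number of payslips mentioned in this article


def list_training_files(folder: str) -> List[Path]:
    """Return the payslip files in `folder` that have a supported extension."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def has_enough_samples(folder: str, minimum: int = MIN_SAMPLES) -> bool:
    """True when the folder holds at least `minimum` usable payslip files."""
    return len(list_training_files(folder)) >= minimum
```

Calling `has_enough_samples("payslips/")` on your own folder tells you whether you can start training or need to collect more documents.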




Define your French payslip use case


First, we need to specify the fields we want to extract from our payslips.






For our example, we are going to extract the following list of fields from our French payslips:


  • Employee full name: First and last names of the employee
  • Employee SSN: Employee social security number
  • Employer SIRET: The employer's 14-digit SIRET number
  • Payslip period: Month and year covered by the payslip
  • Net paid: Total net amount paid to the employee
  • Gross salary: Total gross salary before taxes


Feel free to add any data you'd like the OCR to extract.
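If you plan to consume the extracted data in your own code, the field list above can be mirrored as a small data structure. This is only a sketch: the class name `PayslipFields` and the attribute names are hypothetical and are not the platform's own schema.

```python
from dataclasses import dataclass, fields
from datetime import date
from decimal import Decimal
from typing import List, Optional


@dataclass
class PayslipFields:
    """Hypothetical container for the six fields listed in this article."""
    employee_full_name: Optional[str] = None
    employee_ssn: Optional[str] = None
    employer_siret: Optional[str] = None
    payslip_period: Optional[date] = None   # e.g. first day of the pay month
    net_paid: Optional[Decimal] = None
    gross_salary: Optional[Decimal] = None

    def missing_fields(self) -> List[str]:
        """Names of the fields that were not extracted (still None)."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


# Made-up values, for illustration only.
partial = PayslipFields(employee_full_name="Jeanne Dupont",
                        net_paid=Decimal("1850.00"))
print(partial.missing_fields())
```

Keeping every field optional reflects the fact that an OCR may not find every value on every document; `missing_fields()` makes those gaps easy to spot.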



Deploy your API


Once you have defined the fields you want to extract, head over to the platform and press the ‘create a new API’ button.


You are now on the setup page. Here is the image you can use to set up the API; my setup looks like this:



[Screenshot: French payslip OCR setup]



We're ready! Press the “next” button. We are going to build our data model in the next section.


At this point, you can either add each field manually as described below, or download this JSON config and upload it in the left section of the screen.





Employee full name: type String with no numeric characters





Employee SSN: type String. Note that we haven't checked the "It never contains alpha characters" box, because the social security numbers of people born in Corsica contain the department code 2A or 2B.
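Once the OCR returns a social security number, you may want to sanity-check it in your own code. The sketch below assumes the usual 15-character layout (13-character body followed by a 2-digit control key) and the commonly documented rule that the key equals 97 minus the body modulo 97, with 2A replaced by 19 and 2B by 18 for the computation. The function names are hypothetical, and the check is purely structural: it does not validate the birth month or the department code.

```python
import re

# 5 digits (sex, year, month), department (2 digits or 2A/2B),
# 6 digits (commune + order number), 2-digit control key.
NIR_PATTERN = re.compile(r"^[0-9]{5}([0-9]{2}|2[AB])[0-9]{6}[0-9]{2}$")


def nir_key(body: str) -> int:
    """Control key for a 13-character NIR body (Corsican codes mapped to digits)."""
    numeric = body.upper().replace("2A", "19").replace("2B", "18")
    return 97 - int(numeric) % 97


def is_valid_nir(raw: str) -> bool:
    """Structural check plus control-key check; spaces and letter case are ignored."""
    value = re.sub(r"\s", "", raw).upper()
    if not NIR_PATTERN.match(value):
        return False
    return nir_key(value[:13]) == int(value[13:])
```

A failed check does not necessarily mean the payslip is wrong; it often points to a misread character, which makes this a useful signal for flagging documents for manual review.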





Employer SIRET: type String that never contains alpha characters.
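A SIRET is a 14-digit number that is normally expected to pass the Luhn checksum, so an extracted value can be verified after the fact. The function below is a minimal sketch with a hypothetical name; it does not handle any special-case numbering that might deviate from the standard Luhn rule.

```python
import re


def is_valid_siret(raw: str) -> bool:
    """Check that `raw` is 14 digits (spaces allowed) and passes the Luhn checksum."""
    digits = re.sub(r"\s", "", raw)
    if not re.fullmatch(r"[0-9]{14}", digits):
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9  # same as summing the two digits of the product
        total += value
    return total % 10 == 0
```

For example, `is_valid_siret("732 829 320 00074")` passes the checksum, while changing the last digit makes it fail.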





Payslip period: type Date 
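Payslip periods are usually printed as a month and a year, often with the month spelled out in French. If you need to normalize such strings yourself, the sketch below shows one way to do it; the helper name `parse_period` and the two accepted layouts ("Février 2023" and "02/2023") are assumptions, and real payslips may use other formats.

```python
import re
import unicodedata
from typing import Optional, Tuple

# Month names without accents, matched after normalization.
FRENCH_MONTHS = {
    "janvier": 1, "fevrier": 2, "mars": 3, "avril": 4,
    "mai": 5, "juin": 6, "juillet": 7, "aout": 8,
    "septembre": 9, "octobre": 10, "novembre": 11, "decembre": 12,
}


def _strip_accents(text: str) -> str:
    """Remove combining accents so that 'février' becomes 'fevrier'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")


def parse_period(text: str) -> Optional[Tuple[int, int]]:
    """Return (year, month) for 'Février 2023' or '02/2023', else None."""
    cleaned = _strip_accents(text.strip().lower())
    numeric = re.fullmatch(r"(\d{1,2})[/\-.](\d{4})", cleaned)
    if numeric:
        month, year = int(numeric.group(1)), int(numeric.group(2))
        return (year, month) if 1 <= month <= 12 else None
    named = re.fullmatch(r"([a-z]+)\s+(\d{4})", cleaned)
    if named and named.group(1) in FRENCH_MONTHS:
        return (int(named.group(2)), FRENCH_MONTHS[named.group(1)])
    return None
```

Returning `None` for anything unrecognized keeps the caller in charge of deciding whether to reject the document or send it for review.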





Net paid: type Amount





Gross salary: type Amount
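Amounts on French payslips are typically written with a comma as the decimal separator and spaces between thousands (for example "2 345,67 €"). If you ever need to convert such raw strings yourself, the following sketch shows one approach; the helper name is hypothetical, and strings that use a dot as thousands separator are deliberately rejected rather than guessed.

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_french_amount(text: str) -> Optional[Decimal]:
    """Convert a string such as '2 345,67 €' to a Decimal, or None if it is not an amount."""
    # Drop the euro sign and every kind of whitespace (including no-break spaces).
    cleaned = re.sub(r"[\s€]", "", text)
    # French decimal comma becomes a decimal point.
    cleaned = cleaned.replace(",", ".")
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None
```

Using `Decimal` rather than `float` avoids rounding surprises when the net paid and gross salary values are later compared or summed.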





You are now ready to train your model!





Train your Payslip OCR




You’re all set! 


Now it's time to train your custom payslip deep learning model. For more information about the training phase, please refer to the Getting Started tutorial. If you have any questions about your use case, feel free to reach out to us on our chat!