docTR

Open-source Python document understanding library for developers and data scientists

Trainable deep learning OCR enabling the most advanced document understanding use cases

State-of-the-art

Benefit from the latest computer vision breakthroughs to solve the most complex document processing use cases.

Open source

Build a tailor-made OCR capability that can be hosted in your environment to comply with your data privacy policy.

Trainable

Achieve high extraction performance at scale on receipts from the US, Europe, or any region that uses the Latin alphabet, across industries and sectors.

A fully packaged document understanding library for developers and data scientists

Pretrained OCR

Plug-and-play Python OCR trained on millions of Latin-alphabet documents
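
As a minimal sketch of the plug-and-play usage, assuming docTR and one of its deep learning backends are installed, the pretrained predictor can be called roughly as follows. The helper name `read_text` and the file path are hypothetical; `ocr_predictor`, `DocumentFile.from_pdf`, and `render` belong to docTR's public API, though defaults may differ between releases.

```python
def read_text(pdf_path: str) -> str:
    """Run docTR's pretrained OCR on a PDF and return the recognized text.

    `read_text` is a hypothetical helper written for this example.
    """
    # Imported lazily so this module can be loaded without docTR installed.
    from doctr.io import DocumentFile
    from doctr.models import ocr_predictor

    # Load the default pretrained detection and recognition models.
    predictor = ocr_predictor(pretrained=True)
    # Decode every page of the PDF into an image array.
    pages = DocumentFile.from_pdf(pdf_path)
    # Run detection, then recognition; the result is a structured document.
    result = predictor(pages)
    # Flatten the page/block/line/word hierarchy into plain text.
    return result.render()


# Example call (placeholder path, requires docTR to be installed):
# print(read_text("path/to/document.pdf"))
```

Wrapping the call in a function keeps the model loading explicit; in a real service the predictor would more likely be created once and reused across requests.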

End-to-end Pipeline

Two-stage OCR pipeline using text detection and text recognition algorithms
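
The two-stage idea can be illustrated with a toy sketch: a detector proposes word boxes, and a recognizer reads each cropped box. Everything here is an assumption made for illustration (the `Word` type, the character-grid "image", and the `toy_detect` / `toy_recognize` stand-ins); it is not docTR's implementation, where both stages are neural networks.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max exclusive
Image = Sequence[str]            # toy "image": one string per pixel row


@dataclass
class Word:
    box: Box
    text: str


def run_two_stage_ocr(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[List[str]], str],
) -> List[Word]:
    """Stage 1 localizes words; stage 2 reads each cropped region."""
    words = []
    for box in detect(image):
        x0, y0, x1, y1 = box
        crop = [row[x0:x1] for row in image[y0:y1]]
        words.append(Word(box=box, text=recognize(crop)))
    return words


def toy_detect(image: Image) -> List[Box]:
    """Hypothetical detector: each run of non-space characters is a word box."""
    boxes = []
    for y, row in enumerate(image):
        start = None
        # A trailing space acts as a sentinel that closes the last run.
        for x, ch in enumerate(row + " "):
            if ch != " " and start is None:
                start = x
            elif ch == " " and start is not None:
                boxes.append((start, y, x, y + 1))
                start = None
    return boxes


def toy_recognize(crop: List[str]) -> str:
    """Hypothetical recognizer: the crop already contains its characters."""
    return "".join(crop)


words = run_two_stage_ocr(["total  42", "thanks"], toy_detect, toy_recognize)
print([w.text for w in words])
```

Because the stages only communicate through boxes and crops, either one can be swapped or retrained without touching the other.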

Training

Text detection and recognition training scripts for PyTorch and TensorFlow

Public Datasets

Built-in support for the public datasets of the best-known OCR challenges

Artefact Detection

Detection algorithms for QR codes, barcodes, signatures, faces, and more

Benchmark

Recall, precision, and FPS (frames per second) benchmarks across models
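
For readers unfamiliar with these metrics, here is a minimal sketch of how precision, recall, and FPS could be computed for a text detector. The greedy matching rule and the IoU threshold of 0.5 are assumptions chosen for illustration; the published benchmarks may use a different matching protocol.

```python
import time
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Box], targets: List[Box], iou_thresh: float = 0.5
) -> Tuple[float, float]:
    """Greedily match each prediction to its best unmatched ground-truth box."""
    unmatched = list(targets)
    true_positives = 0
    for pred in preds:
        best = max(unmatched, key=lambda t: iou(pred, t), default=None)
        if best is not None and iou(pred, best) >= iou_thresh:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(targets) if targets else 0.0
    return precision, recall


def frames_per_second(fn: Callable, inputs: Sequence) -> float:
    """Average number of inputs processed per second by `fn`."""
    start = time.perf_counter()
    for item in inputs:
        fn(item)
    elapsed = time.perf_counter() - start
    return len(inputs) / elapsed if elapsed > 0 else float("inf")


targets = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(0, 0, 10, 10), (50, 50, 60, 60), (21, 20, 30, 30)]
precision, recall = precision_recall(preds, targets)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision penalizes the spurious box at (50, 50), while recall stays at 1.0 because both ground-truth boxes were found.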

TensorFlow.js

OCR inference in the web browser, powered by TensorFlow.js (TFJS)

Local Demo App

Local demo UI generator powered by Streamlit

Model Compression

Half-precision and quantization support for model optimization
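
To make the two techniques concrete, the sketch below shows the numeric idea behind each using only the Python standard library: rounding weights to IEEE 754 half precision, and symmetric 8-bit quantization with a single scale factor. The function names and the symmetric per-tensor scheme are assumptions for illustration; they are not docTR's API, and the frameworks it builds on may use other schemes.

```python
import struct
from typing import List, Tuple


def to_half_precision(values: List[float]) -> List[float]:
    """Round each value to the nearest IEEE 754 binary16 number."""
    # The struct format "e" is a 16-bit half-precision float.
    return [struct.unpack("<e", struct.pack("<e", v))[0] for v in values]


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric quantization: map floats to integers in [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the quantized integers."""
    return [q * scale for q in quantized]


weights = [1.27, -0.64, 0.0]
half = to_half_precision([0.5, 0.1])
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
print(half, quantized, restored)
```

Half precision halves the storage per weight at the cost of rounding error (0.1 is no longer stored exactly), while 8-bit quantization stores one byte per weight plus a shared scale, with a reconstruction error of at most half a quantization step.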

A fully trainable two-stage approach to OCR

Improve accuracy by training both the text detection and text recognition models on data from your specific problem

Detection models

Recognition models

See live demo on Hugging Face
