
Digit Recognition in Python: From OpenCV Fundamentals to Mindee's OCR Tools

The Mindee Team

April 8, 2025


When you want to extract digits from images—think ZIP codes, meter readings, or totals on invoices—there are two main paths in Python: the low-level route (using OpenCV) and the high-level route (using prebuilt OCR tools like Mindee).

This post is for developers who want to understand both: we'll walk through building a basic digit recognizer using OpenCV, and then compare it with modern OCR solutions using Mindee’s Python SDK and the docTR library. We'll also cover practical image preprocessing, contour sorting, and real-world use cases.

Why Digit Recognition?

Digit recognition is a useful subtask in OCR. It shows up in:

Bank checks
Utility meter readings
Forms and invoices
Parking tickets
Product barcodes
Sudoku solvers (yes, really)

The challenge? Digits are often distorted, handwritten, scanned, or photographed in less-than-ideal conditions. That’s why robust recognition is essential.

Part 1 – Building a Digit Recognizer from Scratch with OpenCV

Let’s begin with the classic computer vision route using OpenCV and k-Nearest Neighbors.

Step 1 – Training on OpenCV’s `digits.png`

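OpenCV's sample data includes an image called digits.png: a sheet of 50 rows by 100 columns of 20x20 digit images, totaling 5,000 handwritten digits. There is no separate label file. The labels come from the layout: each digit from 0 to 9 occupies five consecutive rows, or 500 samples per digit. If cv.samples.findFile cannot locate the file (pip installs may not bundle the sample data), download it from the samples/data directory of the OpenCV repository and pass its path to cv.imread directly.

We’ll use the left 50 columns for training and the right 50 for testing, which gives 250 samples of each digit in both sets.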

import numpy as np
import cv2 as cv

img = cv.imread(cv.samples.findFile('digits.png'))
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)

# Split into individual digit cells (20x20 pixels)
cells = [np.hsplit(row, 100) for row in np.vsplit(gray, 50)]
x = np.array(cells)

train = x[:, :50].reshape(-1, 400).astype(np.float32)
test = x[:, 50:100].reshape(-1, 400).astype(np.float32)

# Labels: 250 of each digit 0–9
labels = np.repeat(np.arange(10), 250)[:, np.newaxis]
train_labels = labels.copy()
test_labels = labels.copy()
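
To see why the labels line up with the reshaped cells, here is a minimal NumPy-only sketch on a made-up miniature sheet. The sizes (3x3 cells, 2 rows per digit, 4 columns) are assumptions chosen for illustration, not the real dimensions of digits.png, and every cell is simply filled with its own class value so the alignment can be checked.

```python
import numpy as np

# Hypothetical miniature of digits.png: 10 classes, 2 rows per class,
# 4 cells per row, each cell 3x3 pixels filled with its class value.
CELL, ROWS_PER_DIGIT, COLS = 3, 2, 4
n_rows = 10 * ROWS_PER_DIGIT

column = np.repeat(np.arange(10), ROWS_PER_DIGIT * CELL)[:, np.newaxis]
sheet = np.tile(column, (1, COLS * CELL)).astype(np.uint8)  # shape (60, 12)

# Same splitting idiom as the tutorial code, on the small sheet
cells = [np.hsplit(row, COLS) for row in np.vsplit(sheet, n_rows)]
x = np.array(cells)  # shape (20, 4, 3, 3)

half = COLS // 2
train = x[:, :half].reshape(-1, CELL * CELL).astype(np.float32)
test = x[:, half:].reshape(-1, CELL * CELL).astype(np.float32)

# reshape walks the grid row by row, so samples arrive grouped by digit
labels = np.repeat(np.arange(10), ROWS_PER_DIGIT * half)[:, np.newaxis]

labels_match = bool((train[:, 0] == labels[:, 0]).all())
print(x.shape, train.shape, labels_match)
```

Because the reshape keeps row-major order, repeating each digit once per training sample in its rows is enough to label the whole set.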

Train and test a basic kNN model:

knn = cv.ml.KNearest_create()
knn.train(train, cv.ml.ROW_SAMPLE, train_labels)

ret, result, neighbours, dist = knn.findNearest(test, k=5)

accuracy = (result == test_labels).mean() * 100
print(f"Test accuracy: {accuracy:.2f}%")

✅ Output: Test accuracy: ~91.76%

Pretty good—especially given the simplicity of the model.
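
If you're curious what a k-nearest-neighbors classifier does under the hood, the idea can be sketched in a few lines of NumPy. This is an illustrative re-implementation on toy 2D points, not OpenCV's actual code; `knn_predict` and the sample data are made up for the example.

```python
import numpy as np

def knn_predict(train_x, train_y, query, k=3):
    """Label each query row by majority vote among its k nearest training rows."""
    # Squared Euclidean distance between every query row and every training row
    diffs = query[:, np.newaxis, :] - train_x[np.newaxis, :, :]
    dists = (diffs ** 2).sum(axis=2)
    # Indices of the k closest training rows for each query
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]
    # The most common label among the neighbours wins
    return np.array([np.bincount(row).argmax() for row in votes])

# Two well-separated toy clusters: class 0 near the origin, class 1 near (9, 9)
train_x = np.array([[0, 0], [0, 1], [1, 0], [9, 9], [9, 8], [8, 9]], dtype=np.float32)
train_y = np.array([0, 0, 0, 1, 1, 1])
query = np.array([[0.5, 0.5], [8.5, 8.5]], dtype=np.float32)

pred = knn_predict(train_x, train_y, query, k=3)
print(pred.tolist())  # [0, 1]
```

With digit images, each row is simply the 400 flattened pixel values instead of two coordinates.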

Step 2 – Recognizing Digits in a New Image

Now let’s simulate a real use case: you have a photo with a sequence of digits and want to recognize them.

A) Preprocessing and Contour Detection

We’ll convert the image to grayscale, threshold it, and extract individual digits using contours.

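def extract_digits_from_image(img_path):
    img = cv.imread(img_path)
    if img is None:  # cv.imread returns None instead of raising on a bad path
        raise FileNotFoundError(f"Could not read image: {img_path}")
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    blur = cv.GaussianBlur(gray, (5, 5), 0)
    # Otsu picks the threshold automatically; BINARY_INV turns dark ink on a
    # light background into white digits on black
    _, thresh = cv.threshold(blur, 0, 255, cv.THRESH_BINARY_INV + cv.THRESH_OTSU)

    # RETR_EXTERNAL keeps only outer contours, so the holes inside 0, 6, 8, and 9 are ignored
    contours, _ = cv.findContours(thresh, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)

    bounding_boxes = []
    for cnt in contours:
        x, y, w, h = cv.boundingRect(cnt)
        if h > 10 and w > 5:  # drop specks too small to be a digit
            bounding_boxes.append((x, y, w, h))

    # Sort digits left to right (assumes a single line of digits)
    bounding_boxes.sort(key=lambda box: box[0])

    # Crop each digit and resize it to the 20x20 size the kNN model was trained on
    return [cv.resize(thresh[y:y+h, x:x+w], (20, 20)) for x, y, w, h in bounding_boxes]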
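
One caveat: resizing each bounding box straight to 20x20 stretches narrow digits such as "1" to fill the whole cell. A common workaround is to center the crop on a square canvas before resizing. The sketch below shows only the padding step in NumPy; `pad_to_square` and its `margin` parameter are hypothetical names for this example, not part of OpenCV.

```python
import numpy as np

def pad_to_square(roi, margin=2):
    """Center a binary digit crop on a black square canvas (hypothetical helper)."""
    h, w = roi.shape
    side = max(h, w) + 2 * margin
    top = (side - h) // 2
    left = (side - w) // 2
    canvas = np.zeros((side, side), dtype=roi.dtype)
    canvas[top:top + h, left:left + w] = roi
    return canvas

# A tall, thin crop such as the digit "1": 12 pixels high, 3 wide
narrow_one = np.full((12, 3), 255, dtype=np.uint8)
squared = pad_to_square(narrow_one)
print(squared.shape)  # (16, 16)
```

The squared crop can then be passed to cv.resize(..., (20, 20)) without changing the digit's aspect ratio.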

B) Recognizing the Digits

Once the digits are extracted and sorted, we can classify them:

def recognize_digits(digit_images, knn_model):
    results = []
    for digit_img in digit_images:
        sample = digit_img.reshape((1, 400)).astype(np.float32)
        ret, result, _, _ = knn_model.findNearest(sample, k=5)
        results.append(int(result[0][0]))
    return results
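
The third value returned by findNearest holds the labels of the k nearest neighbours, which the function above discards. As a rough sketch, the share of neighbours that agree with the winning label can serve as a simple confidence signal. `vote_share` is a hypothetical helper, and the array below is hand-written in the shape findNearest returns (one row per sample, k columns).

```python
import numpy as np

def vote_share(neighbours):
    """Fraction of the k neighbours that agree with the winning label, per row."""
    neighbours = np.asarray(neighbours).astype(int)
    shares = []
    for row in neighbours:
        counts = np.bincount(row, minlength=10)  # votes per digit 0-9
        shares.append(counts.max() / row.size)
    return np.array(shares)

# Hand-written stand-in for the neighbours array with k=5:
# the first sample is unanimous, the second is split 3 to 2
neighbours = np.array([[3, 3, 3, 3, 3], [1, 7, 1, 1, 7]], dtype=np.float32)
confidence = vote_share(neighbours)
print(confidence.tolist())  # [1.0, 0.6]
```

A low share is a reasonable cue to flag a digit for manual review rather than trust the prediction.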

Usage:

digits = extract_digits_from_image("sample_digits.png")
predictions = recognize_digits(digits, knn)
print("Detected digits:", predictions)

✅ Example Output: Detected digits: [3, 1, 4, 1, 5]
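
Sorting by x alone works for a single line of digits. For images with several lines, one possible approach is to group bounding boxes into rows by their vertical centers first, then sort each row left to right. The sketch below is pure Python; `sort_reading_order` and `row_tolerance` are made-up names, and the grouping rule is a simple heuristic rather than a robust layout algorithm.

```python
def sort_reading_order(boxes, row_tolerance=0.5):
    """Sort (x, y, w, h) boxes top to bottom, then left to right within each row."""
    rows = []
    # Visit boxes from top to bottom by vertical center
    for box in sorted(boxes, key=lambda b: b[1] + b[3] / 2):
        center_y = box[1] + box[3] / 2
        for row in rows:
            ref = row[0]
            ref_center_y = ref[1] + ref[3] / 2
            # Same row if the centers differ by less than a fraction of the box height
            if abs(center_y - ref_center_y) <= row_tolerance * ref[3]:
                row.append(box)
                break
        else:
            rows.append([box])
    return [b for row in rows for b in sorted(row, key=lambda b: b[0])]

# Two lines of digits, listed in the arbitrary order contours may come back in
boxes = [(60, 12, 10, 20), (10, 10, 10, 20), (35, 50, 10, 20),
         (10, 52, 10, 20), (35, 11, 10, 20)]
ordered = sort_reading_order(boxes)
print(ordered)
```

The first three boxes in the result belong to the top line and the last two to the bottom line, each in left-to-right order.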

Part 2 – The Limits of Classic OCR

Building from scratch gives you control and a better understanding of the pipeline, but:

The model only recognizes digits
You must manually segment the characters
It’s sensitive to noise, skew, and font changes
Accuracy is decent but not production-grade

If you want to recognize real-world text in different fonts, layouts, or languages—you need better tools.

Part 3 – Using Mindee’s OCR Tools

Mindee provides modern OCR APIs and open-source tools like docTR that work out of the box for printed and scanned documents, receipts, invoices, and more.

You don’t need to build your own classifier or segmentation logic—just feed in an image.

Option 1 – Using the Mindee Python SDK

Install the SDK:

pip install mindee

Recognize structured fields like totals, dates, and tax amounts:

import mindee

client = mindee.Client(api_key="your_api_key")
doc = client.source_from_path("path/to/receipt.jpg")
result = client.parse(mindee.product.ReceiptV4, doc)

print(result.document)

✅ Example Output (abridged, illustrative):

total_amount: 42.80
date: 2025-04-08
tax: 4.28

This is great for developers who want structured, field-level extraction with no ML work.

Option 2 – Using docTR for General Text & Digit OCR

Install with PyTorch backend:

pip install "python-doctr[torch]"

Run OCR on an image (photo or scan). For PDFs, load the file with DocumentFile.from_pdf instead of from_images:

from doctr.io import DocumentFile
from doctr.models import ocr_predictor

model = ocr_predictor(pretrained=True)
doc = DocumentFile.from_images(["path/to/image.jpg"])
result = model(doc)

# View predictions
for page in result.pages:
    for block in page.blocks:
        for line in block.lines:
            print(" ".join([word.value for word in line.words]))

✅ Output:

ORDER NUMBER: 123456
TOTAL: $89.99

docTR handles layout detection, line grouping, and character recognition—all in a few lines of code.
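
Since docTR returns general text, you still need a small post-processing step if all you want is the digits. A minimal sketch using Python's built-in re module on the recognized lines follows; `extract_numbers` and the pattern are illustrative choices, and the pattern only covers plain integers and simple decimals.

```python
import re

# Integers, optionally followed by a decimal part such as ".99" or ",99"
NUMBER = re.compile(r"\d+(?:[.,]\d+)?")

def extract_numbers(lines):
    """Return every numeric token found in a list of recognized text lines."""
    return [match for line in lines for match in NUMBER.findall(line)]

# The lines printed by the OCR example above
lines = ["ORDER NUMBER: 123456", "TOTAL: $89.99"]
numbers = extract_numbers(lines)
print(numbers)  # ['123456', '89.99']
```

Formats with thousands separators or signs would need a stricter pattern than this one.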

Comparison Table: OpenCV kNN vs. Mindee SDK / docTR

Feature	OpenCV kNN	Mindee SDK / docTR
Setup	Manual	One-line install
Preprocessing required	Yes	Optional
Layout handling	No	Yes
Recognition targets	Only digits	Any text
Language support	None	English, French, etc.
Confidence scores	No	Yes
Real-world accuracy	Moderate	High
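
To make the "Confidence scores" row concrete, here is a sketch of filtering OCR words by confidence. It assumes the nested dictionary layout that docTR's result.export() is expected to produce (pages, blocks, lines, words, each word with "value" and "confidence"); the dictionary below is hand-written for illustration, and `confident_words` is a hypothetical helper.

```python
# Hand-written stand-in for an exported OCR result (assumed layout, made-up values)
exported = {
    "pages": [{"blocks": [{"lines": [{"words": [
        {"value": "TOTAL:", "confidence": 0.99},
        {"value": "$89.99", "confidence": 0.97},
        {"value": "l23", "confidence": 0.41},
    ]}]}]}]
}

def confident_words(exported, min_confidence=0.8):
    """Keep only the words whose confidence meets the threshold."""
    kept = []
    for page in exported["pages"]:
        for block in page["blocks"]:
            for line in block["lines"]:
                for word in line["words"]:
                    if word["confidence"] >= min_confidence:
                        kept.append(word["value"])
    return kept

kept = confident_words(exported)
print(kept)  # ['TOTAL:', '$89.99']
```

The right threshold depends on your documents, so it is worth tuning on a sample of real scans.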

Conclusion

OpenCV is a great educational tool and works well for tightly controlled use cases, like digit recognition from scanned forms. But if you're building anything customer-facing—or just want to skip the hassle of preprocessing and model tuning—Mindee's SDK and docTR are ready-to-use solutions.

You still get full control as a developer, just with better accuracy and fewer headaches.

Frequently Asked Questions

Can I build a digit recognition system using only OpenCV?

Yes, you can build a digit recognizer using OpenCV with techniques like contour detection and k-Nearest Neighbors. However, accuracy and flexibility are limited for real-world applications.

What's the advantage of using Mindee over OpenCV for digit recognition?

Mindee's OCR tools handle layout analysis, text detection, and recognition out of the box—no manual preprocessing or training required. It's faster to implement and more accurate on real documents.

Can Mindee's OCR work with handwritten digits?

Mindee's docTR is primarily designed for printed text. While it can sometimes recognize clean handwriting, it's best suited for typed or scanned text in documents like invoices, receipts, and forms.

