Open Source SDKs & Libraries New Release
We are pleased to announce the release of our Node.js SDK v1.1.0, Python SDK v1.3.0, and docTR v0.5.0. The Node.js SDK supports invoice, receipt, and finance documents, while the Python SDK supports invoice, passport, and receipt documents, as well as European license plates. docTR parses textual information from your documents by localizing and identifying each word. These updates improve the efficiency and performance of document parsing.
For the Node.js SDK, we added support for the Mindee API V2 and for OS information in User-Agent headers.
- A big part of this release was adding support for the Mindee API V2. The invoice API for this SDK now supports native PDF textual content for currency, amount/tax, invoice number, payment information, business registration number, and 17 other currencies.
- OS information has also been added to the User-Agent headers in this release. This lets us determine which operating systems our clients run the SDK on, and hence optimize for performance and compatibility.
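To illustrate the idea of reporting the operating system in a User-Agent header, here is a minimal sketch. It is written in Python, so it is closer to what the Python SDK would do than the Node.js one. The `build_user_agent` helper, the `example-sdk` name, and the exact header layout are assumptions made for illustration, not the format the Mindee SDKs actually send.

```python
import platform


def build_user_agent(sdk_name: str, sdk_version: str) -> str:
    """Return a User-Agent string that includes the client's OS.

    The "name/version (details)" layout is an assumption for illustration,
    not the header format used by any real SDK.
    """
    os_name = platform.system() or "unknown"      # e.g. "Linux", "Windows", "Darwin"
    os_release = platform.release() or "unknown"  # OS release string
    python_version = platform.python_version()
    return f"{sdk_name}/{sdk_version} ({os_name} {os_release}; python {python_version})"


# Hypothetical SDK name and version, attached to outgoing request headers.
headers = {"User-Agent": build_user_agent("example-sdk", "1.3.0")}
```

A server receiving this header can group requests by OS, which is the kind of information the release notes describe using to prioritize compatibility work.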
The Python SDK now fully supports URLs on Windows and sending files using base64.
- An important part of this update was fixing URLs that were not built properly on Windows.
- We also eliminated errors that occurred while sending files using base64.
- There is also support for the User-Agent header. This lets us determine which operating systems our clients run the SDK on, and hence optimize for performance and compatibility.
- Finally, we replaced the PDF manipulation library we were using with a more performant one and moved from a GPL2 to an MIT license, which is less restrictive.
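The URL and base64 fixes above can be pictured with a short sketch. One common way URLs end up malformed on Windows is joining segments with `os.path.join`, which inserts backslashes there. Whether that was the actual cause in the SDK is an assumption. The `BASE_URL` value, the helper names, and the `document` payload key are all hypothetical.

```python
import base64
import ntpath

BASE_URL = "https://api.example.com/v1"  # hypothetical endpoint root


def build_url(*segments: str) -> str:
    """Join URL segments with forward slashes, regardless of the host OS."""
    cleaned = [s.strip("/") for s in segments if s.strip("/")]
    return "/".join([BASE_URL.rstrip("/"), *cleaned])


def encode_file(content: bytes) -> str:
    """Return file bytes as a base64 ASCII string that fits in a JSON body."""
    return base64.b64encode(content).decode("ascii")


# ntpath is the module os.path resolves to on Windows, so this line
# reproduces the backslash problem on any platform.
broken = ntpath.join(BASE_URL, "products", "invoices")

# Plain string joining keeps forward slashes everywhere.
fixed = build_url("products", "invoices")

# Hypothetical request payload carrying a base64-encoded document.
payload = {"document": encode_file(b"%PDF-1.4 sample")}
```

Decoding to `str` after `b64encode` matters because the encoder returns `bytes`, which JSON serializers reject.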
The new docTR release includes support for text detection on rotated and skewed documents, along with updates to the classification models.
- Rotated and skewed documents are now fully supported in this release. The goal is to bring the level of performance we already have to rotated documents as well.
- We updated all of the checkpoints in the classification model zoo, for both PyTorch and TensorFlow. These models were trained using our synthetic character classification dataset. For more information, see Character classification training.
- The number of supported datasets has increased significantly. These include widely used datasets for benchmarking OCR-related tasks; you can find the whole list of these datasets here.
For the time being, the new versions will be actively maintained, and critical bug fixes will be applied to them. Development and maintenance will cease on previous releases. If you have any problem with the SDKs or libraries, we invite you to create an issue in the respective GitHub repository (Node.js SDK issues, Python SDK issues, docTR issues) or reach out to us on our Slack community.
If you are an existing customer using the Node.js SDK, the Python SDK, or docTR, we strongly encourage you to upgrade to the newest versions as soon as possible. This will ensure you have access to all the latest features and bug fixes, now and in the future.