iOS OCR Module

The Scanbot SDK's OCR Engine can transform written text into machine-readable data – both from still images and a live camera stream. It is the backbone of the SDK's Data Capture Modules, enabling fast and accurate data extraction from various document formats.

The Scanbot OCR feature comes with two OCR engines: legacy and ML. The legacy engine is based on the Tesseract OCR engine with some modifications and enhancements. The ML (machine learning-based) engine was introduced later. It is much faster and more accurate, but it only supports languages written in Latin script.

We recommend using the ML engine whenever possible and falling back to the legacy engine only if you need to recognize text in languages that use non-Latin scripts, such as Arabic, Japanese, Chinese, Russian, Greek, or Korean.
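The recommendation above can be sketched as a small decision helper. This is an illustration only, not part of the Scanbot SDK API: the type OCREngineChoice, the set nonLatinLanguageCodes, and the function recommendedEngine(for:) are hypothetical names, and the set only covers the example languages named above, using their Tesseract language codes.

```swift
import Foundation

/// Hypothetical type for illustration; not a Scanbot SDK type.
enum OCREngineChoice {
    case ml
    case legacy
}

/// Tesseract language codes for the non-Latin examples named in the text
/// (Arabic, Japanese, Chinese, Russian, Greek, Korean). Not exhaustive.
let nonLatinLanguageCodes: Set<String> = [
    "ara", "jpn", "chi_sim", "chi_tra", "rus", "ell", "kor"
]

/// Picks the legacy engine as soon as one requested language uses a
/// non-Latin script; otherwise prefers the ML engine.
func recommendedEngine(for languages: [String]) -> OCREngineChoice {
    let needsLegacy = languages.contains { nonLatinLanguageCodes.contains($0) }
    return needsLegacy ? .legacy : .ml
}
```

For example, recommendedEngine(for: ["eng", "deu"]) yields .ml, while adding "rus" to the list yields .legacy.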

When using the legacy OCR engine, a corresponding .traineddata file (aka tessdata) must be installed for each desired OCR language in the optional resource bundle named ScanbotSDKOCRData.bundle. In addition, the special data file osd.traineddata, which is used for orientation and script detection, must be installed.

The newer ML engine does not require any language training data!

To keep the framework small, ScanbotSDK.framework itself does not contain any OCR language files. The optional bundle ScanbotSDKOCRData.bundle, provided in the ZIP archive of the Scanbot SDK, contains the language files for English and German as well as osd.traineddata as examples. You can replace or supplement these language files as needed. Add this bundle to your project and make sure that it is copied along with your resources into your app.
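A quick way to catch a missing bundle or language file early is to check for the files at app start. The following is a minimal sketch that uses only Foundation, not the Scanbot SDK API. The function names are hypothetical, and it assumes that the .traineddata files sit at the top level of ScanbotSDKOCRData.bundle and are named after their Tesseract language code (for example eng.traineddata).

```swift
import Foundation

/// Returns the expected .traineddata file names that are NOT present in `directory`.
/// Assumption: files are named "<language code>.traineddata" and live at the top level.
func missingTrainedDataFiles(in directory: URL, languages: [String]) -> [String] {
    // osd.traineddata is required in addition to the per-language files.
    let expected = languages.map { "\($0).traineddata" } + ["osd.traineddata"]
    return expected.filter { name in
        let path = directory.appendingPathComponent(name).path
        return !FileManager.default.fileExists(atPath: path)
    }
}

/// Locates ScanbotSDKOCRData.bundle among the app's resources and reports missing files.
/// Returns nil if the bundle itself was not copied into the app.
func missingOCRData(languages: [String], in appBundle: Bundle = .main) -> [String]? {
    guard let bundleURL = appBundle.url(forResource: "ScanbotSDKOCRData",
                                        withExtension: "bundle") else {
        return nil
    }
    return missingTrainedDataFiles(in: bundleURL, languages: languages)
}
```

Calling missingOCRData(languages: ["eng", "deu"]) during development and logging the result makes it obvious when a file was not added to the target's resources.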

Preconditions to achieve good OCR results

A perfect document for OCR is flat, straight, in the highest possible resolution and does not contain large shadows, folds, or any other objects that could distract the recognizer. The SDK's UI and algorithms do their best to help you meet these requirements. But as in photography, you can never fully get the image information back that was lost during the shot.

Size and position

Put the document on a flat surface. Take the photo from straight above and hold the device parallel to the document to minimize the need for perspective correction. The document should fill as much of the camera frame as possible while still showing all of the text that needs to be recognized. This results in more pixels, and hence more detail, for each character that needs to be detected. Skewed pages decrease the recognition quality.

Light and shadows

More ambient light is always better. With more light, the camera can take the shot at a lower ISO value, which results in less grainy photos. Try to make sure there are no visible shadows. If you encounter large shadows, take the shot at an angle instead.

We do not recommend using the flashlight: at close range, it creates a bright spot at the center of the document that decreases the recognition quality.

Focus

The document needs to be properly focused so that the characters are sharp and clear. The auto-focus of the camera works well if you meet the minimum required distance for the lens to be able to focus, usually around 5–10 centimeters (approx. 2–4 inches).

Typefaces

The Scanbot OCR Engine is optimized for common serif and sans-serif font types. Decorative or script fonts drastically decrease the recognition quality.

Languages (applies to legacy OCR engine only)

You can use multiple languages for OCR. However, since the recognition of characters and words is a very complicated process, increasing the number of languages lowers the overall precision: with more languages, there are more candidate words that a detected word could match. We suggest using as few languages as possible. Make sure that the language you're trying to detect is supported by the SDK and added to the project.

Downloading the OCR language files (applies to legacy OCR engine only)

You can find a list of all supported OCR languages and corresponding download links in the Tesseract documentation.

⚠️ Please choose and download the proper version of the language data files:

Scanbot SDK 1.9.0 or newer: LSTM Data Files for Version 4.00

Scanbot SDK 1.8.6 or older: Data Files for Version 3.04/3.05
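The version rule above can be expressed as a small lookup. This is a sketch under stated assumptions, not SDK functionality: TessdataVersion and requiredTessdataVersion(forSDKVersion:) are hypothetical names, the version is assumed to be a full three-part string such as "1.9.0", and versions between 1.8.6 and 1.9.0 return nil because the text does not cover them.

```swift
import Foundation

/// Hypothetical enum naming the two data file sets listed in the text.
enum TessdataVersion: String {
    case lstm4 = "LSTM Data Files for Version 4.00"
    case legacy3 = "Data Files for Version 3.04/3.05"
}

/// Maps a Scanbot SDK version string to the matching data file set.
/// Uses numeric string comparison so that "1.10.0" sorts after "1.9.0".
func requiredTessdataVersion(forSDKVersion version: String) -> TessdataVersion? {
    if version.compare("1.9.0", options: .numeric) != .orderedAscending {
        return .lstm4      // 1.9.0 or newer
    }
    if version.compare("1.8.6", options: .numeric) != .orderedDescending {
        return .legacy3    // 1.8.6 or older
    }
    return nil             // not covered by the documented rule
}
```

The raw value of the returned case matches the name of the data file set to look for in the Tesseract documentation.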
