
Web Optical Character Recognition - SDK Features

The Scanbot SDK plugin provides a simple and convenient API to run Optical Character Recognition (OCR) on images. The OCR feature is part of the Scanbot SDK Package II.

Preconditions to achieve a good OCR result

Conditions while scanning

A perfect document for OCR is flat, straight, captured in the highest possible resolution, and free of large shadows, folds, or any other objects that could distract the recognizer. Our UI and algorithms do their best to help you meet these requirements. But as in photography, image information that was lost during the shot can never be fully recovered.

Languages

You can use multiple languages for OCR. However, because recognizing characters and words is a complex process, each additional language lowers the overall precision: with more languages, there are more candidate words that a detected word could match. We suggest using as few languages as possible. Make sure that the language you are trying to detect is supported by the SDK and added to the project.

Size and position

Put the document on a flat surface. Take the photo from straight above, with the camera parallel to the document, so that as little perspective correction as possible is needed. The document should fill most of the camera frame while still showing all of the text that needs to be recognized. This gives each character more pixels and hence more detail. Skewed pages decrease the recognition quality.
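To make the "more pixels per character" point concrete, here is a minimal sketch of the proportional arithmetic. The function and parameter names are hypothetical and not part of the Scanbot SDK, and the numbers are illustrative only.

```typescript
// Estimate how tall a character appears in the captured image, in pixels.
// Hypothetical helper: plain proportional arithmetic, not an SDK function.
function charHeightPx(
  imageHeightPx: number,    // height of the captured image in pixels
  capturedHeightMm: number, // real-world height covered by the camera frame
  charHeightMm: number      // real-world height of one printed character
): number {
  if (capturedHeightMm <= 0) {
    throw new RangeError("capturedHeightMm must be positive");
  }
  return imageHeightPx * (charHeightMm / capturedHeightMm);
}

// An A4 page (297 mm tall) filling a 3000 px tall frame, with 3 mm tall characters.
const filled = charHeightPx(3000, 297, 3);
// The same page shot from farther away, so the frame covers 600 mm of height.
const distant = charHeightPx(3000, 600, 3);

console.log(filled.toFixed(1), distant.toFixed(1)); // prints "30.3 15.0"
```

In this illustration, moving the camera back so that the frame covers about twice the height halves the pixels available for each character.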

Light and shadows

More ambient light is always better: with more light, the camera can take the shot at a lower ISO value, which results in less grainy photos. Make sure that there are no visible shadows. If large shadows are unavoidable, it is better to take the shot at an angle instead. We also do not recommend using the flashlight: at such a short distance it creates a bright spot at the center of the document, which decreases the recognition quality.

Focus

The document needs to be properly focused so that the characters are sharp and clear. The autofocus of the camera works well as long as you keep the minimum distance the lens needs in order to focus. This usually starts at 5-10 cm.

Typefaces

The OCR trained data is optimized for common serif and sans-serif font types. Decorative or script fonts drastically decrease the quality of the recognition.

OCR API

async createOcrEngine(options?: { mode: string }): Promise<OcrEngine>

Use this function to create a Scanbot SDK OCR Engine instance. If your input text is a single line, use the SINGLE_LINE mode.

Perform OCR

async recognizeURL(imageURL: string ): Promise<OcrData[]>

This function takes the URL of an image (for example, a data URL) and performs Optical Character Recognition on it. An array of OcrData is returned as a result:

interface OcrData {
    text: string;
    confidence: number;
    boundingBox: Rect;
}
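As a rough sketch of how the returned OcrData array might be post-processed, the snippet below keeps only entries above a confidence threshold and joins their text. The Rect shape, the helper name confidentText, and the sample values are assumptions for illustration. The confidence scale is not specified in this section, so choose a threshold that matches the values you actually observe.

```typescript
// Rect is not defined in this section; this shape is an assumption for illustration.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Same fields as the OcrData interface shown above.
interface OcrData {
  text: string;
  confidence: number;
  boundingBox: Rect;
}

// Hypothetical helper: keep entries at or above minConfidence and join their text.
function confidentText(results: OcrData[], minConfidence: number): string {
  return results
    .filter((r) => r.confidence >= minConfidence)
    .map((r) => r.text.trim())
    .filter((t) => t.length > 0)
    .join("\n");
}

// Made-up sample data, not real SDK output.
const box: Rect = { x: 0, y: 0, width: 100, height: 20 };
const sample: OcrData[] = [
  { text: "Invoice 2024", confidence: 92, boundingBox: box },
  { text: "~~#", confidence: 12, boundingBox: box },
  { text: " Total: 42.00 ", confidence: 88, boundingBox: box },
];

console.log(confidentText(sample, 50));
```

With the sample above and a threshold of 50, the low-confidence noise entry is dropped and the two remaining lines are joined with a newline.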

Example:

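...
reader.onload = async () => {
    // reader.result now holds the image as a data URL
    const ocr = await scanbotSDK.createOcrEngine({ mode: "SINGLE_LINE" });
    try {
        const result = await ocr.recognizeURL(reader.result);
        console.log("ocr result: ", result);
    } finally {
        // Release the engine even if recognition fails
        await ocr.release();
    }
};

// Register the handler first, then start reading the file
reader.readAsDataURL(file);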
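The create, recognize, and release steps from the example can also be wrapped in a small reusable helper. The following is a minimal sketch, not part of the SDK: the interfaces OcrSdkLike and OcrEngineLike and the function recognizeOnce are hypothetical names. They mirror only the signatures shown in this section, and the real SDK types may contain more.

```typescript
// Minimal shapes based on the signatures in this section (assumed, possibly incomplete).
interface OcrEngineLike {
  recognizeURL(imageURL: string): Promise<unknown[]>;
  release(): Promise<void> | void;
}

interface OcrSdkLike {
  createOcrEngine(options?: { mode: string }): Promise<OcrEngineLike>;
}

// Hypothetical helper: create an engine, run one recognition, and always release.
async function recognizeOnce(
  sdk: OcrSdkLike,
  imageURL: string,
  mode?: string
): Promise<unknown[]> {
  const engine = await sdk.createOcrEngine(mode ? { mode } : undefined);
  try {
    return await engine.recognizeURL(imageURL);
  } finally {
    // Runs on success and on failure, so the engine is never left allocated.
    await engine.release();
  }
}
```

Assuming scanbotSDK matches these shapes, a call could look like recognizeOnce(scanbotSDK, dataUrl, "SINGLE_LINE"), where dataUrl is the string produced by the FileReader.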
