Web OCR Module

The Scanbot SDK's OCR Engine can transform written text into machine-readable data – both from still images and a live camera stream. It is the backbone of the SDK's Data Capture Modules, enabling fast and accurate data extraction from various document formats.

The Scanbot SDK for Web provides a simple and convenient API to run Optical Character Recognition (OCR) on images. As a result, you get:

  • recognized text as plain text,
  • bounding boxes of all recognized paragraphs, lines and words,
  • text results and confidence values for each bounding box.
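
The three kinds of output listed above can be pictured as a nested structure. The following is a minimal sketch of how such a result might be traversed to flag low-confidence words for manual review. All type names (OcrElement, OcrLine, OcrParagraph, OcrPageSketch), the field names, and the 0 to 100 confidence scale are assumptions made for illustration; they are not the SDK's actual Page type.

```typescript
// Hypothetical result shape, for illustration only.
// The SDK's actual Page type may use different names and fields.
interface Box { x: number; y: number; width: number; height: number; }

interface OcrElement {
  text: string;
  confidence: number; // assumed to lie in the range 0..100
  box: Box;
}

interface OcrLine extends OcrElement { words: OcrElement[]; }
interface OcrParagraph extends OcrElement { lines: OcrLine[]; }
interface OcrPageSketch { text: string; paragraphs: OcrParagraph[]; }

// Walk paragraphs -> lines -> words and collect every word
// whose confidence is below the given threshold.
function lowConfidenceWords(page: OcrPageSketch, threshold: number): OcrElement[] {
  const flagged: OcrElement[] = [];
  for (const paragraph of page.paragraphs) {
    for (const line of paragraph.lines) {
      for (const word of line.words) {
        if (word.confidence < threshold) {
          flagged.push(word);
        }
      }
    }
  }
  return flagged;
}

// Hand-written sample data standing in for a real recognition result.
const sampleBox: Box = { x: 0, y: 0, width: 10, height: 10 };
const samplePage: OcrPageSketch = {
  text: "Hello W0rld",
  paragraphs: [{
    text: "Hello W0rld", confidence: 78, box: sampleBox,
    lines: [{
      text: "Hello W0rld", confidence: 78, box: sampleBox,
      words: [
        { text: "Hello", confidence: 97, box: sampleBox },
        { text: "W0rld", confidence: 59, box: sampleBox },
      ],
    }],
  }],
};

const flaggedWords = lowConfidenceWords(samplePage, 80);
console.log(flaggedWords.map((w) => w.text));
```

Filtering at the word level keeps the check independent of layout; the same loop could instead collect whole lines if that is the unit a user would correct.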

The OCR feature is based on the Scanbot OCR Engine created and polished by the Scanbot SDK team to provide the best text recognition speed and quality for our users.

Preconditions to achieve a good OCR result

Conditions while scanning

A perfect document for OCR is flat, straight, in the highest possible resolution and does not contain large shadows, folds, or any other objects that could distract the recognizer. Our UI and algorithms do their best to help you meet these requirements. However, as in photography, image information that was lost when the shot was taken can never be fully recovered.

Size and position

Put the document on a flat surface. Take the photo from directly above, with the camera parallel to the document, so that only minimal perspective correction is needed. The document should fill most of the camera frame while still showing all of the text that needs to be recognized. This yields more pixels, and hence more detail, for each character that needs to be detected. Skewed pages decrease the recognition quality.

Light and shadows

More ambient light is always better. With more light, the camera can take the shot at a lower ISO value, which results in less grainy photos. Make sure that there are no visible shadows. If you cannot avoid large shadows, it is better to take the shot at an angle instead. We also do not recommend using the flashlight: at this distance, it creates a light spot at the center of the document, which decreases the recognition quality.

Focus

The document needs to be properly focused so that the characters are sharp and clear. The autofocus of the camera works well if you meet the minimum required distance for the lens to be able to focus. This usually starts at 5–10 centimeters (approx. 2–4 inches).

Typefaces

The OCR trained data is optimized for common serif and sans-serif font types. Decorative or script fonts drastically decrease the quality of the recognition.

OCR API

Use async createOcrEngine(): Promise<OcrEngine> to create the Scanbot SDK OCR Engine.

Performing OCR

With async performOcr(image: Image): Promise<Page> and async recognizeURL(imageURL: string): Promise<Page>, you can perform Optical Character Recognition on an image, passed either as an Image object or as an image URL.

Both functions return a Promise that resolves to a Page object.

Example:

...
// Register the handler before starting the read.
reader.onload = async (e) => {
    // Create the engine, run OCR on the file contents, then release the engine.
    const engine = await scanbotSDK.createOcrEngine();
    const buffer = reader.result as ArrayBuffer;
    const image = ScanbotSDK.Config.Image.fromEncodedBinaryData(buffer);
    const result = await engine.run(image);
    engine.destroy();
    console.log("ocr result: ", result);
};

reader.readAsArrayBuffer(file);
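
In the example above, the engine is destroyed only if recognition succeeds. The following is a minimal sketch of one way to release the engine even when recognition throws, using try/finally. The DisposableEngine interface and the runAndDestroy helper are hypothetical names introduced here for illustration; they only mirror the run and destroy calls from the example and are not part of the SDK.

```typescript
// Structural stand-in for the engine used in the example above.
// This is an illustrative interface, not the SDK's own type.
interface DisposableEngine<TImage, TResult> {
  run(image: TImage): Promise<TResult>;
  destroy(): void | Promise<void>;
}

// Create an engine, run it once, and always destroy it afterwards,
// whether run() resolves or rejects.
async function runAndDestroy<TImage, TResult>(
  createEngine: () => Promise<DisposableEngine<TImage, TResult>>,
  image: TImage,
): Promise<TResult> {
  const engine = await createEngine();
  try {
    return await engine.run(image);
  } finally {
    // Runs on both success and failure, so the engine is never leaked.
    await engine.destroy();
  }
}
```

Assuming the engine returned by createOcrEngine() matches this shape, the body of the onload handler could be reduced to a single call such as runAndDestroy(() => scanbotSDK.createOcrEngine(), image). Creating and destroying an engine per image is the simplest pattern; when many images are processed in a row, keeping one engine alive and destroying it at the end may be preferable.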

Want to scan longer than one minute?

Generate a free trial license to test the Scanbot SDK thoroughly.
