Web Optical Character Recognition - SDK Features
The Scanbot SDK plugin provides a simple and convenient API to run Optical Character Recognition (OCR) on images. The OCR feature is a part of the Scanbot SDK Data Capture Modules.
Preconditions to achieve a good OCR result
Conditions while scanning
A perfect document for OCR is flat, straight, captured at the highest possible resolution, and free of large shadows, folds, or other objects that could distract the recognizer. Our UI and algorithms do their best to help you meet these requirements. But as in photography, image information that was lost when the photo was taken can never be fully recovered.
Languages
You can use multiple languages for OCR. But since the recognition of characters and words is a very complicated process, increasing the number of languages lowers the overall precision: with more languages, each detected word has more candidate matches, so a wrong match becomes more likely. We suggest using as few languages as possible. Make sure that the language you are trying to detect is supported by the SDK and added to the project.
Size and position
Put the document on a flat surface. Take the photo from directly above, with the camera parallel to the document, so that as little perspective correction as possible is needed. The document should fill most of the camera frame while still showing all of the text that needs to be recognized. This results in more pixels, and hence more detail, for each character that needs to be detected. Skewed pages decrease the recognition quality.
Light and shadows
More ambient light is always better. With more light, the camera takes the shot at a lower ISO value, which results in less grainy photos. Make sure that there are no visible shadows. If you cannot avoid large shadows, it is better to take the shot at an angle instead. We also do not recommend using the flashlight: at such a short distance it creates a light spot at the center of the document, which decreases the recognition quality.
Focus
The document needs to be properly focused so that the characters are sharp and clear. The autofocus of the camera works well as long as you keep the minimum distance the lens needs to be able to focus. This minimum is usually 5-10 cm.
Typefaces
The OCR trained data is optimized for common serif and sans-serif font types. Decorative or script fonts drastically decrease the quality of the recognition.
OCR API
async createOcrEngine(options?: { mode: string }): Promise<OcrEngine>
Use this function to create a Scanbot SDK OCR engine. If your input text is a single line, use the SINGLE_LINE mode.
Perform OCR
async recognizeURL(imageURL: string): Promise<OcrData[]>
This function takes the URL of an image (for example, a data URL) and performs Optical Character Recognition on that image. An array of OcrData is returned as a result:
interface OcrData {
    text: string;
    confidence: number;
    boundingBox: Rect;
}
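As a rough sketch of how the returned array might be post-processed, the helper below keeps only results whose confidence reaches a caller-chosen threshold and joins their text. The names OcrTextResult and joinConfidentText are hypothetical and not part of the SDK. The scale of confidence is not stated here, so the threshold is left to the caller and is assumed to use the same scale as the SDK's values.

```typescript
// Hypothetical minimal shape: only the OcrData fields this helper needs.
interface OcrTextResult {
    text: string;
    confidence: number;
}

// Keep results at or above minConfidence and join their text with newlines.
// Assumption: minConfidence uses the same scale as the SDK's confidence values.
function joinConfidentText(results: OcrTextResult[], minConfidence: number): string {
    return results
        .filter((r) => r.confidence >= minConfidence)
        .map((r) => r.text.trim())
        .filter((t) => t.length > 0)
        .join("\n");
}
```

Because OcrData includes text and confidence, an array returned by recognizeURL can be passed to this helper directly.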
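Example:
...
// Register the handler before starting the read.
reader.onload = async (e) => {
    // The input is expected to be a single line of text.
    const ocr = await scanbotSDK.createOcrEngine({ mode: "SINGLE_LINE" });
    // After readAsDataURL(), reader.result holds the image as a data URL.
    const result = await ocr.recognizeURL(reader.result);
    console.log("ocr result: ", result);
    // Release the engine once it is no longer needed.
    await ocr.release();
};
reader.readAsDataURL(file);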
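If recognition throws, the release() call in the example is never reached. One possible way to guard against that is a small wrapper with try/finally, sketched below. OcrEngineLike, OcrSdkLike and recognizeOnce are hypothetical names introduced for illustration; the interfaces mirror only the three calls shown in this section and are not the SDK's own type definitions.

```typescript
// Hypothetical minimal interfaces mirroring only the calls used in this section.
interface OcrEngineLike<T> {
    recognizeURL(imageURL: string): Promise<T[]>;
    release(): Promise<void>;
}

interface OcrSdkLike<T> {
    createOcrEngine(options?: { mode: string }): Promise<OcrEngineLike<T>>;
}

// Create an engine, run OCR on one image URL, and always release the engine.
async function recognizeOnce<T>(
    sdk: OcrSdkLike<T>,
    imageURL: string,
    options?: { mode: string }
): Promise<T[]> {
    const ocr = await sdk.createOcrEngine(options);
    try {
        return await ocr.recognizeURL(imageURL);
    } finally {
        // Runs on success and on failure, so the engine is not leaked.
        await ocr.release();
    }
}
```

Creating a new engine for every image keeps the sketch simple. When many images are processed in a row, it may be preferable to create one engine, reuse it, and release it at the end.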