Data Capture
MRZ Scanner
- MAUI
- .NET Android
- .NET iOS
Launches the MRZ scanner. The scanned data will be returned asynchronously.
Detect MRZ on a still image
The Scanbot SDK can also detect MRZ data on a still image. Please refer to the MAUI code snippet below.
EHIC Scanner
The Scanbot SDK detects and extracts data from European Health Insurance Cards.
- MAUI
- .NET Android
- .NET iOS
Launches the EHIC Scanner. The scanned EHIC card will be returned asynchronously.
The European Health Insurance Card Scanner is based on the OCR feature and thus requires the proper installation of the OCR language files deu.traineddata and eng.traineddata (also known as blob files).
For more details on how to set up OCR language files please refer to the OCR section.
Detect EHIC on a still image
The Scanbot SDK can also detect EHIC data on a still image. Please refer to the MAUI code snippet below.
Check Scanner
You can use the Check Recognizer UI to conveniently scan and extract data from checks.
- MAUI
- .NET Android
- .NET iOS
Launches the Check Recognizer UI. The scanned check will be returned asynchronously.
Detect Check on a still image
The Scanbot SDK can also detect check data on a still image. Please refer to the MAUI code snippet below.
Optical Character Recognition
The Scanbot SDK provides simple and convenient APIs to run Optical Character Recognition (OCR) on images.
As a result, you can get:
- a searchable PDF document with the recognized text layer (also known as a sandwiched PDF document)
- recognized text as plain text
- bounding boxes of all recognized paragraphs, lines and words
- text results and confidence values for each bounding box
The Scanbot OCR feature comes with two OCR engines: Legacy and ML. The Legacy engine is based on the Tesseract OCR engine with some modifications and enhancements. The ML (machine learning based) engine was added later. It is much faster and more accurate, but it only supports languages written with Latin letters. We recommend using the ML engine whenever possible and falling back to the Legacy engine only if you need to recognize text in languages with non-Latin scripts, such as Arabic, Japanese, Chinese, Russian, Greek, or Korean.
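The recommendation above amounts to a simple selection rule. The following Python sketch is language-neutral and does not call the Scanbot SDK; the function name `choose_ocr_engine` and the sample set of Latin-script language codes are assumptions made for illustration (the codes follow the Tesseract naming used by the .traineddata files).

```python
# Illustrative only: choose_ocr_engine and LATIN_SCRIPT_LANGUAGES are
# hypothetical names, not part of the Scanbot SDK API.

# A small, non-exhaustive sample of Tesseract language codes for
# languages written with Latin letters.
LATIN_SCRIPT_LANGUAGES = {"eng", "deu", "fra", "spa", "ita"}


def choose_ocr_engine(languages):
    """Return "ML" if every requested language uses Latin letters, else "Legacy"."""
    if not languages:
        raise ValueError("at least one OCR language is required")
    if all(lang in LATIN_SCRIPT_LANGUAGES for lang in languages):
        return "ML"
    return "Legacy"


if __name__ == "__main__":
    print(choose_ocr_engine(["eng", "deu"]))  # ML
    print(choose_ocr_engine(["eng", "jpn"]))  # Legacy
```

A single non-Latin language in the request is enough to select the Legacy engine in this sketch, since the ML engine would not be able to recognize it.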
When using the Legacy OCR engine, a corresponding OCR training data file (.traineddata) must be provided for each desired OCR language. Furthermore, the special data file osd.traineddata is required (used for orientation and script detection). To keep the SDK small, the Scanbot SDK package contains no language data files. You have to download the desired language files and include them in your app.
The newer ML engine does not require any language training data!
Preconditions to achieve a good OCR result
Conditions while scanning
A perfect document for OCR is flat, straight, in the highest possible resolution and does not contain large shadows, folds, or any other objects that could distract the recognizer. Our UI and algorithms do their best to help you meet these requirements. But as in photography, you can never fully get the image information back that was lost during the shot.
Languages
You can use multiple languages for OCR. But since the recognition of characters and words is a very complicated process, increasing the number of languages lowers the overall precision. With more languages, there are more results where the detected word could match. We suggest using as few languages as possible. Make sure that the language you are trying to detect is supported by the SDK and added to the project.
Size and position
Put the document on a flat surface. Take the photo from straight above, with the camera parallel to the document, so that only minimal perspective correction is needed. The document should fill most of the camera frame while still showing all of the text that needs to be recognized. This results in more pixels for each character that needs to be detected and hence, more detail. Skewed pages decrease the recognition quality.
Light and shadows
More ambient light is always better: with more light, the camera can take the shot at a lower ISO value, which results in less grainy photos. Make sure that there are no visible shadows. If you have large shadows, it is better to take the shot at an angle instead. We also do not recommend using the flashlight: at such a short distance, it creates a light spot at the center of the document, which decreases the recognition quality.
Focus
The document needs to be properly focused so that the characters are sharp and clear. The autofocus of the camera works well if you keep at least the minimum distance the lens needs to focus. This usually starts at 5-10 cm.
Typefaces
The OCR trained data is optimized for common serif and sans-serif font types. Decorative or script fonts drastically decrease the quality of recognition.
Download and Provide OCR Language Files
You can find a list of all supported OCR languages and download links on this Tesseract page.
⚠️️️ Please choose and download the proper version of the language data files:
- For the latest version of ScanbotSDK.MAUI OR ScanbotSDK.NET package -
Download the desired language files as well as the osd.traineddata file and place them in the Assets sub-folder SBSDKLanguageData/ of your Android app or in the Resources sub-folder ScanbotSDKOCRData.bundle/ of your iOS app.
- .NET Android
Assets/SBSDKLanguageData/eng.traineddata // English language file
Assets/SBSDKLanguageData/deu.traineddata // German language file
Assets/SBSDKLanguageData/osd.traineddata // required special data file
- .NET iOS
Resources/ScanbotSDKOCRData.bundle/eng.traineddata // English language file
Resources/ScanbotSDKOCRData.bundle/deu.traineddata // German language file
Resources/ScanbotSDKOCRData.bundle/osd.traineddata // required special data file
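A missing data file is an easy mistake to make when setting up the folders listed above. As a rough sketch, a small check like the following could be run on a development machine before building the app. It is plain Python, independent of the Scanbot SDK; the function name `missing_language_files` is hypothetical, and the folder path in the usage example assumes the Android layout shown above.

```python
from pathlib import Path


def missing_language_files(folder, languages):
    """Return the sorted names of required .traineddata files absent from folder."""
    required = {f"{lang}.traineddata" for lang in languages}
    # osd.traineddata is always needed (orientation and script detection).
    required.add("osd.traineddata")
    present = {path.name for path in Path(folder).glob("*.traineddata")}
    return sorted(required - present)


if __name__ == "__main__":
    # Assumed project layout; adjust the path to your own project.
    missing = missing_language_files("Assets/SBSDKLanguageData", ["eng", "deu"])
    if missing:
        print("Missing OCR data files:", ", ".join(missing))
    else:
        print("All required OCR data files are present.")
```

For an iOS project, the same check would point at Resources/ScanbotSDKOCRData.bundle instead.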
OCR API
- MAUI
- .NET Android
- .NET iOS
VIN Scanner
You can use the VIN Scanner UI to conveniently scan and extract vehicle identification numbers.
- MAUI
- .NET Android
- .NET iOS
Text Data Scanner
The Text Data Scanner recognizes text (OCR) within a user-defined rectangular area of interest in consecutive video frames. A customizable block lets you clean up the raw string by filtering out unwanted characters and OCR noise. Additionally, you can validate the result using pattern matching or another block.
- MAUI
- .NET Android
- .NET iOS
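The clean-up and validation steps described for the Text Data Scanner can be pictured as two small functions applied to each raw OCR string. The sketch below is plain Python and does not use the Scanbot SDK API; the names `clean`, `validate` and `process`, as well as the example pattern (two uppercase letters followed by six digits), are assumptions chosen purely for illustration.

```python
import re

# Hypothetical target format: two uppercase letters followed by six digits.
EXPECTED_PATTERN = re.compile(r"[A-Z]{2}\d{6}")


def clean(raw):
    """Uppercase the raw OCR string and drop everything except A-Z and 0-9."""
    return re.sub(r"[^A-Z0-9]", "", raw.upper())


def validate(text):
    """Accept the text only if it matches the expected pattern in full."""
    return EXPECTED_PATTERN.fullmatch(text) is not None


def process(raw):
    """Return the cleaned text if it validates, otherwise None."""
    cleaned = clean(raw)
    return cleaned if validate(cleaned) else None


if __name__ == "__main__":
    print(process(" ab-123 456\n"))  # AB123456
    print(process("AB12"))           # None
```

Rejecting a frame's result (returning None here) simply means waiting for a later video frame that yields a string passing validation.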
License Plate Scanner
The Scanbot SDK provides the ability to scan car license plates and parse data fields. Scanning is currently limited to common EU license plates (country code on blue background on the left side).
- MAUI
- .NET Android
- .NET iOS
Generic Document Recognizer
The Scanbot SDK provides the ability to detect various types of documents on the image, crop them, and recognize the fields' data via the Generic Document Recognizer.
Currently, the Generic Document Recognizer supports the following types of documents:
- German ID Card
- German Passport
- German Driver's License
- German Residence Permit
The Generic Document Recognizer is based on the OCR feature and thus requires the proper installation of the corresponding OCR language files (e.g. for English please add the file eng.traineddata). For more details on how to set up the OCR language files please refer to the OCR section.
- MAUI
- .NET Android
- .NET iOS
For API references please check:
GenericDocument: Document object from the scanned result.
GenericDocumentRootType: Supported Generic document types. Please also see GenericDocumentFormat.
Detect Generic Document on a still image
The Scanbot SDK can also detect Generic Document data on a still image. Please refer to the MAUI code snippet below.
For API references please check:
GenericDocument: Document object from the scanned result.
RootDocumentType: Supported Generic document types.
For API references please check:
SBSDKGenericDocument: Document object from the scanned result.
SBSDKUIDocumentType: Supported Generic document types. Please also see SBSDKGenericDocumentRootType.
Medical Certificate Scanner
The Scanbot SDK provides the ability to find and extract content from German Medical Certificates (MC / AU-Bescheinigung forms).
- MAUI
- .NET Android
- .NET iOS
Detect Medical Certificate on a still image
The Scanbot SDK can also detect Medical Certificate data on a still image. Please refer to the MAUI code snippet below.