Using Generic Document Recognizer | Android Document Scanner

The Scanbot SDK can detect various types of documents in an image, crop them, and recognize the data in their fields using the Generic Document Recognizer.

Currently, Generic Document Recognizer supports the following types of documents:

  • German ID Card
  • German Passport
  • German Driver's License

Scanning returns a GenericDocumentRecognitionResult object. It contains the cropped document image and, if the scan was successful, a GenericDocument object. Each field of the document is represented by the Field class, which holds the field's type, its cropped visual source, the recognized text and a confidence level value.
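To make the shape of such a result more concrete, here is a minimal, SDK-independent sketch of filtering recognized fields by confidence. SketchField, SketchDocument and reliableFields are hypothetical stand-ins invented for this illustration, not the SDK's classes, and the 0.0 to 1.0 confidence range is an assumption.

```kotlin
// Simplified stand-ins for illustration only; these are NOT the SDK's classes.
data class SketchField(
    val typeName: String,   // e.g. "Surname"
    val text: String?,      // recognized text, null if nothing was read
    val confidence: Double  // assumed range 0.0..1.0
)

data class SketchDocument(val typeName: String, val fields: List<SketchField>)

/** Returns the fields that have text recognized with at least [minConfidence]. */
fun reliableFields(document: SketchDocument?, minConfidence: Double): List<SketchField> =
    document?.fields
        ?.filter { it.text != null && it.confidence >= minConfidence }
        ?: emptyList()

fun main() {
    val document = SketchDocument(
        typeName = "DeIdCardFront",
        fields = listOf(
            SketchField("Surname", "MUSTERMANN", 0.97),
            SketchField("GivenNames", "ERIKA", 0.42),
            SketchField("Photo", null, 0.0)
        )
    )
    // Only "Surname" passes the 0.8 threshold.
    reliableFields(document, minConfidence = 0.8)
        .forEach { println("${it.typeName}: ${it.text}") }
    // An unsuccessful scan carries no document, so the list is empty.
    println(reliableFields(null, minConfidence = 0.8).size)
}
```

In a real integration the same idea would be applied to the fields of the GenericDocument returned by the recognizer, for example to decide which values to ask the user to double-check.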

There are two ways to integrate the component into the application: the classical component and the Ready-To-Use UI (RTU UI) component.

Generic Document Recognizer classical component

Try our Generic Document Recognizer Example For Live Detection or Generic Document Recognizer Example For Auto Snapping, or follow the step-by-step integration instructions below.

Step 1 - Add Generic Document Recognizer Feature as a Dependency

GenericDocumentRecognizer is available with the SDK Package 3. You have to add the following dependencies for it:

api "io.scanbot:sdk-package-3:$latestSdkVersion"
api "io.scanbot:sdk-genericdocument-assets:$latestSdkVersion"

It can be used both in conjunction with ScanbotCameraView (e.g. live detection for preview) and by itself for detection on a Bitmap or JPEG byte array.

Step 2 - Add desired blobs prefetching to SDK config

The Generic Document Recognizer is based on the OCR feature of Scanbot SDK. Please check the Optical Character Recognition docs for more details.

In order to use the Generic Document Recognizer you need to prepare the German and English OCR language files. Place the deu.traineddata and eng.traineddata files in the assets sub-folder assets/ocr_blobs/ of your app.

Add a call to the .prepareOCRLanguagesBlobs(true) method of ScanbotSDKInitializer in your Application class:

override fun onCreate() {
    super.onCreate()

    ScanbotSDKInitializer()
            .license(this, licenseKey)
            // TODO: other configuration calls
            .prepareOCRLanguagesBlobs(true)
            .initialize(this)
}

Step 3 - Add ScanbotCameraView to layout

<io.scanbot.sdk.camera.ScanbotCameraView
    android:id="@+id/camera_view"
    android:layout_width="match_parent"
    android:layout_height="match_parent" />

Step 4 - Get a GenericDocumentRecognizer instance from ScanbotSDK, set the required document types and the blurriness acceptance score, and attach it to ScanbotCameraView

val scanbotSdk = ScanbotSDK(this)

// Please note that each call to this method will create a new instance of GenericDocumentRecognizer
// It should be used on a "single instance per screen" basis
val genericDocumentRecognizer = scanbotSdk.createGenericDocumentRecognizer()
genericDocumentRecognizer.acceptedSharpnessScore = 80f

// Uncomment to scan only ID cards and passports
// genericDocumentRecognizer.acceptedDocumentTypes = listOf(
//     RootDocumentType.DePassport,
//     RootDocumentType.DeIdCardFront,
//     RootDocumentType.DeIdCardBack
// )

// Uncomment to scan only Driver's licenses
// genericDocumentRecognizer.acceptedDocumentTypes = listOf(
//     RootDocumentType.DeDriverLicenseFront,
//     RootDocumentType.DeDriverLicenseBack
// )

// To scan all the supported document types (default value)
genericDocumentRecognizer.acceptedDocumentTypes = RootDocumentType.ALL_TYPES

val frameHandler = GenericDocumentRecognizerFrameHandler.attach(cameraView, genericDocumentRecognizer)

Step 5 - Add result handler for GenericDocumentRecognizerFrameHandler

Add a result handler to the frame handler. The example below counts consecutive successful recognition results and shows a toast on each successful frame once at least two consecutive successes have been counted.

frameHandler.addResultHandler(object : GenericDocumentRecognizerFrameHandler.ResultHandler {
    private var successCounter = 0

    override fun handle(result: FrameHandlerResult<GenericDocumentRecognitionResult, SdkLicenseError>): Boolean {
        val isSuccess = result is FrameHandlerResult.Success
        when {
            isSuccess && successCounter >= 2 -> {
                // NOTE: 'handle' method runs in background thread
                //   - don't forget to switch to main before touching any Views
                runOnUiThread {
                    Toast.makeText(
                        this@MainActivity,
                        "Document found!\nYou can now snap picture.",
                        Toast.LENGTH_SHORT
                    ).show()
                }
            }
            isSuccess -> successCounter++
            else -> successCounter = 0
        }
        return false
    }
})
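The counting logic of the handler can be separated from Android and the SDK, which makes it easier to reason about and to unit test. The following is a minimal sketch of that idea; ConsecutiveSuccessTracker is a hypothetical helper introduced here for illustration and is not part of the Scanbot SDK.

```kotlin
/**
 * Mirrors the counter logic of the result handler above without any
 * Android or SDK dependency. Hypothetical helper, for illustration only.
 */
class ConsecutiveSuccessTracker(private val requiredSuccesses: Int = 2) {
    private var successCounter = 0

    /** Returns true when the user should be notified for this frame. */
    fun onFrame(isSuccess: Boolean): Boolean = when {
        // Enough successes were already counted: notify on this frame.
        isSuccess && successCounter >= requiredSuccesses -> true
        // A success that only advances the counter.
        isSuccess -> { successCounter++; false }
        // Any failure breaks the streak.
        else -> { successCounter = 0; false }
    }
}

fun main() {
    val tracker = ConsecutiveSuccessTracker()
    val frames = listOf(true, true, true, false, true)
    println(frames.map { tracker.onFrame(it) }) // [false, false, true, false, false]
}
```

With the default of two required successes, the notification first fires on the third consecutive successful frame and then on every following successful frame until a failure resets the counter. Inside the real handle() method, one would call onFrame(result is FrameHandlerResult.Success) and show the toast when it returns true.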

Step 6 - Pass snapped picture to GenericDocumentRecognizer, process results

First, decode the image ByteArray obtained from the camera's callback, taking the image orientation into account. Our ImageProcessor component can be used for this:

val imageProcessor = ScanbotSDK(this).imageProcessor()
val resultBitmap = imageProcessor.processJpeg(image, listOf(RotateOperation(imageOrientation)))

Next, we perform a recognition:

val recognitionResult = genericDocumentRecognizer.scanBitmap(resultBitmap)

As an example of further application, set obtained scan parameters to a TextView:

val myTextView = findViewById<TextView>(R.id.my_text_view)

// document is null if the recognition was not successful, so access it null-safely
val resultsMessage = "Recognition results:\n" +
    "Recognition status: ${recognitionResult.status}\n" +
    "Card type: ${recognitionResult.document?.type}\n" +
    "Number of fields scanned: ${recognitionResult.document?.fields?.size ?: 0}"

myTextView.text = resultsMessage

It is also possible to use the GenericDocumentWrapper subclasses, which provide strongly typed objects with convenient access to the fields of the corresponding document.

To receive an instance of the scanned document wrapper, use GenericDocumentLibrary or the wrap() extension function as follows:

val recognitionResult = genericDocumentRecognizer.scanBitmap(resultBitmap)

val genericDocument = recognitionResult.document
if (genericDocument != null) {
    // Alternatively use GenericDocumentLibrary.wrapperFromGenericDocument(genericDocument)
    when (val wrapper = genericDocument.wrap()) {
        is DeIdCardFront -> {
            val id = wrapper.id
            val name = wrapper.givenNames
            val surname = wrapper.surname
            val pin = wrapper.pin
        }
        is DeDriverLicenseFront -> {
            val id = wrapper.id
            val name = wrapper.givenNames
            val surname = wrapper.surname
            val categories = wrapper.licenseCategories
        }
    }
}
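For readers unfamiliar with the typed-wrapper pattern, the sketch below shows in plain Kotlin how a wrap()-style extension function can turn an untyped document into strongly typed objects that are then matched with when. All types here (RawDocument, DocumentWrapper, IdCardFront, DriverLicenseFront) and the field names are hypothetical stand-ins; the SDK's real GenericDocument and wrapper classes are implemented differently.

```kotlin
// Stand-in types for illustration only; not the SDK's classes.
data class RawDocument(val typeName: String, val fields: Map<String, String>)

sealed class DocumentWrapper(protected val document: RawDocument) {
    protected fun field(name: String): String? = document.fields[name]
}

class IdCardFront(document: RawDocument) : DocumentWrapper(document) {
    val surname: String? get() = field("Surname")
    val givenNames: String? get() = field("GivenNames")
}

class DriverLicenseFront(document: RawDocument) : DocumentWrapper(document) {
    val surname: String? get() = field("Surname")
    val licenseCategories: String? get() = field("LicenseCategories")
}

/** Picks the wrapper matching the document type, or null for unknown types. */
fun RawDocument.wrap(): DocumentWrapper? = when (typeName) {
    "DeIdCardFront" -> IdCardFront(this)
    "DeDriverLicenseFront" -> DriverLicenseFront(this)
    else -> null
}

/** The sealed hierarchy lets the compiler check that every wrapper is handled. */
fun describe(document: RawDocument): String = when (val wrapper = document.wrap()) {
    is IdCardFront -> "ID card of ${wrapper.givenNames} ${wrapper.surname}"
    is DriverLicenseFront -> "Driver's license, categories ${wrapper.licenseCategories}"
    null -> "Unsupported document type: ${document.typeName}"
}

fun main() {
    val scanned = RawDocument(
        "DeIdCardFront",
        mapOf("Surname" to "MUSTERMANN", "GivenNames" to "ERIKA")
    )
    println(describe(scanned)) // ID card of ERIKA MUSTERMANN
}
```

The benefit of the pattern is that field access becomes a property lookup checked at compile time instead of a string-keyed search through a generic field list.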

Generic Document Recognizer Ready-To-Use UI component

To integrate the RTU UI component for the Generic Document Recognizer, check our RTU UI example project or follow the steps below.

Please note: The RTU UI is designed to provide Activity components that are simple to integrate and simple to customize. As a consequence, the customization options are limited. For more flexibility, implement custom Activities using our "Classical SDK UI Components".

Step 1 - Add Generic Document Recognizer Feature as a Dependency

GenericDocumentRecognizer is available with the SDK Package 3. You have to add the following dependencies for it:

api "io.scanbot:sdk-package-3:$latestSdkVersion"
api "io.scanbot:sdk-genericdocument-assets:$latestSdkVersion"

Step 2 - Add desired blobs prefetching to SDK config

Add the OCR training data file (.traineddata) for the German language to the assets. See the Optical Character Recognition docs for details.

Add a call to the .prepareOCRLanguagesBlobs(true) method of ScanbotSDKInitializer in your Application class:

override fun onCreate() {
    super.onCreate()

    ScanbotSDKInitializer()
            .license(this, licenseKey)
            // TODO: other configuration calls
            .prepareOCRLanguagesBlobs(true)
            .initialize(this)
}

Step 3 - Create and customize the configuration object in your activity

The Ready-To-Use UI component is an activity that is started via an intent carrying a corresponding configuration object.

val genericDocumentConfiguration = GenericDocumentRecognizerConfiguration()

Next, optionally apply the parameters you want to customize.

// Apply the color configuration
// ContextCompat.getColor(context, id) resolves a color resource on all API levels
genericDocumentConfiguration.setTopBarButtonsInactiveColor(ContextCompat.getColor(this, R.color.white))
genericDocumentConfiguration.setTopBarBackgroundColor(ContextCompat.getColor(this, R.color.colorPrimaryDark))
// ...

// Apply the text configuration
genericDocumentConfiguration.setClearButtonTitle(context.getString(R.string.clear_button))
genericDocumentConfiguration.setSubmitButtonTitle(context.getString(R.string.submit_button))
// ...

// Apply the parameters for fields
genericDocumentConfiguration.setFieldsDisplayConfiguration(
    hashMapOf(
        // Use constants from NormalizedFieldNames objects from the corresponding document type
        DePassport.NormalizedFieldNames.PHOTO to FieldProperties(
            "My passport photo",
            FieldProperties.DisplayState.AlwaysVisible
        ),
        MRZ.NormalizedFieldNames.CHECK_DIGIT to FieldProperties(
            "Check digit",
            FieldProperties.DisplayState.AlwaysVisible
        )
        // ...
    )
)

Step 4 - Create an intent out of the configuration object and start the activity

val intent = GenericDocumentRecognizerActivity.newIntent(this, genericDocumentConfiguration)
startActivityForResult(intent, GENERIC_DOCUMENT_RECOGNIZER_DEFAULT_UI)

Step 5 - Process the result in onActivityResult

override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
    super.onActivityResult(requestCode, resultCode, data)

    if (resultCode == Activity.RESULT_OK && requestCode == GENERIC_DOCUMENT_RECOGNIZER_DEFAULT_UI) {
        // Get the list of ResultWrapper objects from the intent
        // (data is nullable, so return early if the intent or the extra is missing)
        val resultWrappers = data?.getParcelableArrayListExtra<ResultWrapper<GenericDocument>>(EXTRACTED_FIELDS_EXTRA)
            ?: return

        // For simplicity we will take only the first document
        val firstResultWrapper = resultWrappers.first()

        // Get the ResultRepository from the ScanbotSDK instance
        // scanbotSDK was created in onCreate via ScanbotSDK(context)
        val resultRepository = scanbotSDK.resultRepositoryForClass(firstResultWrapper.clazz)

        // Receive an instance of GenericDocument class from the repository
        // This call will also remove the result from the repository (to reduce memory usage)
        val genericDocument = resultRepository.getResultAndErase(firstResultWrapper.resultId)

        // If you do not need to remove the result, use resultRepository.getResult(firstResultWrapper.resultId)
        // Please note that this repository does not use persistent storage and is based on an LRU Cache
        // The repository will be cleared if the app process is terminated or the memory consumption
        // is too high.

        ...
    }
}
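The comments above describe a repository that is in-memory only, LRU-based, and offers both a plain read and a read-and-erase. The following sketch illustrates these semantics with a small bounded store built on LinkedHashMap. LruResultStore is a hypothetical class written for this explanation; it is not the SDK's ResultRepository, and the size-based eviction is an assumption made to keep the example short.

```kotlin
/** In-memory, size-bounded store; a stand-in for illustration, not the SDK's ResultRepository. */
class LruResultStore<T : Any>(private val maxEntries: Int) {
    // accessOrder = true keeps iteration order from least to most recently used.
    private val cache = object : LinkedHashMap<String, T>(16, 0.75f, true) {
        override fun removeEldestEntry(eldest: MutableMap.MutableEntry<String, T>?): Boolean =
            size > maxEntries
    }

    fun put(resultId: String, result: T) {
        cache[resultId] = result
    }

    /** Reads a result and keeps it in the store. */
    fun getResult(resultId: String): T? = cache[resultId]

    /** Reads a result and removes it, so a second call returns null. */
    fun getResultAndErase(resultId: String): T? = cache.remove(resultId)
}

fun main() {
    val store = LruResultStore<String>(maxEntries = 2)
    store.put("a", "first document")
    store.put("b", "second document")
    store.getResult("a")                  // touching "a" makes "b" the least recently used
    store.put("c", "third document")      // exceeds the limit, so "b" is evicted
    println(store.getResult("b"))         // null
    println(store.getResultAndErase("a")) // first document
    println(store.getResultAndErase("a")) // null
}
```

The practical consequence for the integration is the same as in the sketch: a result may already be gone when it is requested, so code that reads from the repository should be prepared for a missing result and should copy anything it needs to keep into its own storage.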

It is also possible to use the GenericDocumentWrapper subclasses, which provide strongly typed objects with convenient access to the fields of the corresponding document.

To receive an instance of the scanned document wrapper, use GenericDocumentLibrary or the wrap() extension function as follows:

// Alternatively use GenericDocumentLibrary.wrapperFromGenericDocument(genericDocument)
when (val wrapper = genericDocument.wrap()) {
    is DeIdCardFront -> {
        val id = wrapper.id
        val name = wrapper.givenNames
        val surname = wrapper.surname
        val pin = wrapper.pin
    }
    is DeDriverLicenseFront -> {
        val id = wrapper.id
        val name = wrapper.givenNames
        val surname = wrapper.surname
        val categories = wrapper.licenseCategories
    }
}